
"Minimalist" Industrial-Grade Recommender Systems--Embedding--GPT

2023-09-02 22:48:32 Category: Internet+ Author: axdmin


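In "Minimalist" Industrial-Grade Recommender Systems--Embedding--ELMo, we saw that the ELMo architecture from Deep contextualized word representations models deep context directly in its architecture design, and that it works well for both syntactic and semantic context.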

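GPT (Generative Pre-Training) was proposed by OpenAI in the 2018 paper Improving Language Understanding by Generative Pre-Training. It uses pre-training + fine-tuning to improve performance on a range of NLP tasks.

Background

NLP is widely used in tasks such as textual entailment, question answering, semantic similarity assessment, and document classification, but it faces the following situation:

- Unlabeled data is abundant while labeled data is scarce, and this scarcity has long held back NLP-related fields.
- Word embeddings obtained with unsupervised methods can significantly improve task performance.

Supervised NLP tasks still struggle to use more than word-level information from unlabeled text, for two reasons (I racked my brains over this translation, so please bear with me):

- It is not yet clear which optimization objectives produce word representations that transfer well.
- There is not yet a settled mechanism for transferring word representations to the target task.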

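Paper work

- GPT first pre-trains a language model on unlabeled text, then fine-tunes the model with supervised learning on the target NLP task (textual entailment, question answering, semantic similarity assessment, and document classification).
- The feature extractor is a Transformer rather than an LSTM (see the post Transform模型原理_chen_yiwei的博客-CSDN博客_transform模型).
- GPT predicts the target word from the words before it only (context-before), whereas ELMo is trained to predict it from the words both before and after it (context-before + context-after).
- The Transformer can process long-range dependencies, which is where ELMo falls short.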
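To make the context-before versus context-before + context-after distinction concrete, here is a minimal sketch. The helper names `visible_context` and `causal_mask` are hypothetical and come from neither paper; they only illustrate which positions each style of model may look at.

```python
def visible_context(tokens, i, bidirectional=False):
    """Return (before, after): the tokens usable when predicting tokens[i]."""
    before = tokens[:i]
    # A left-to-right model (GPT-style) sees nothing after position i.
    after = tokens[i + 1:] if bidirectional else []
    return before, after


def causal_mask(n):
    """n x n mask: entry [i][j] is 1 when position i may attend to position j."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


tokens = ["the", "cat", "sat", "on", "the", "mat"]
gpt_view = visible_context(tokens, 2)                       # context-before only
elmo_view = visible_context(tokens, 2, bidirectional=True)  # before + after
mask = causal_mask(3)
```

The lower-triangular mask is the usual way a Transformer decoder enforces the context-before restriction inside self-attention.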

As usual, let's first list some of the variables the authors define in the paper:

- Given an unsupervised corpus of tokens $U = \{u_1, \ldots, u_n\}$
- $k$ is the size of the context window
- the conditional probability $P$ is modeled using a neural network with parameters $\Theta$
- $U = (u_{-k}, \ldots, u_{-1})$ is the context vector of tokens
- $n$ is the number of layers of transformer
- $W_e$ is the token embedding matrix
- $W_p$ is the position embedding matrix
- a labeled dataset $C$, where each instance consists of a sequence of input tokens, $x_1, \ldots, x_m$, along with a label $y$
- the final transformer block's activation $h_l^m$ is fed into an added linear output layer with parameters $W_y$

Unsupervised pre-training

A standard language model estimates its parameters by maximum likelihood (context-before):

$$L_1(U) = \sum_i \log P(u_i \mid u_{i-k}, \ldots, u_{i-1}; \Theta)$$
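The maximum-likelihood language-modeling objective can be illustrated with a toy example. This is only a sketch under loud assumptions: the count-based model `fit_counts` is a hypothetical stand-in for the neural network with parameters Θ, and the sum starts at the first position that has a full window of k preceding tokens.

```python
import math
from collections import Counter, defaultdict


def fit_counts(corpus, k):
    """Count how often each token follows each length-k context (toy model)."""
    counts = defaultdict(Counter)
    for i in range(k, len(corpus)):
        counts[tuple(corpus[i - k:i])][corpus[i]] += 1
    return counts


def log_likelihood(corpus, k, counts):
    """Sum of log P(u_i | u_{i-k}, ..., u_{i-1}) over positions with a full context."""
    total = 0.0
    for i in range(k, len(corpus)):
        ctx = counts[tuple(corpus[i - k:i])]
        total += math.log(ctx[corpus[i]] / sum(ctx.values()))
    return total


corpus = "a b a b a c".split()
counts = fit_counts(corpus, k=1)
l1 = log_likelihood(corpus, k=1, counts=counts)
```

In GPT the conditional probability comes from the Transformer instead of counts, but the quantity being maximized has the same shape.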

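GPT is a stack of multiple transformer layers. The input is the context tokens plus a position encoding, and the output of the last transformer_block goes through a softmax to fit the probability distribution over words:

$$h_0 = U W_e + W_p$$

$$h_l = \mathrm{transformer\_block}(h_{l-1}) \quad \forall l \in [1, n]$$

$$P(u) = \mathrm{softmax}(h_n W_e^T)$$

Supervised fine-tuning

The conditional probability of the label $y$:

$$P(y \mid x_1, \ldots, x_m) = \mathrm{softmax}(h_l^m W_y)$$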
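The embedding step, the stacked blocks, the tied output softmax, and the added linear layer used for fine-tuning can be sketched in plain Python. This is a shape-level illustration, not the paper's implementation: `identity_block` is a hypothetical placeholder for a real transformer block (masked self-attention plus feed-forward), and all matrix values are made up.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax(row):
    peak = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def gpt_forward(token_ids, W_e, W_p, blocks):
    # h_0 = U W_e + W_p: looking up row t of W_e equals multiplying a one-hot row by W_e.
    h = [[e + p for e, p in zip(W_e[t], W_p[pos])] for pos, t in enumerate(token_ids)]
    for block in blocks:  # h_l = transformer_block(h_{l-1})
        h = block(h)
    logits = matmul(h, transpose(W_e))  # h_n W_e^T, reusing the embedding matrix
    return h, [softmax(r) for r in logits]


def classify(h_last, W_y):
    # softmax(h_l^m W_y): final activation of the last token through the added linear layer
    return softmax(matmul([h_last], W_y)[0])


identity_block = lambda h: h  # hypothetical stub, NOT a real transformer block
W_e = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3-token vocabulary, dimension 2
W_p = [[0.0, 0.0], [0.1, 0.1]]              # 2 positions
W_y = [[1.0, -1.0], [0.5, -0.5]]            # 2 output labels
h, probs = gpt_forward([0, 2], W_e, W_p, [identity_block, identity_block])
p_y = classify(h[-1], W_y)
```

Each row of `probs` is a distribution over the vocabulary for one position, and `p_y` is a distribution over labels.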

objective to maximize:

$$L_2(C) = \sum_{(x, y)} \log P(y \mid x_1, \ldots, x_m)$$

We additionally found that including language modeling as an auxiliary objective to the fine-tuning helped learning by (a) improving generalization of the supervised model, and (b) accelerating convergence.

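optimize the following objective (with weight λ):

$$L_3(C) = L_2(C) + \lambda \cdot L_1(C)$$

Here $L_2$ is the supervised objective and $L_1$ is the language-modeling objective evaluated on the labeled corpus $C$.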
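The supervised objective and its combination with the auxiliary language-modeling term come down to a few lines of arithmetic. The sketch below uses made-up numbers: `label_probs`, the value of `l1`, and `lam=0.5` are illustrative assumptions, not results from the paper.

```python
import math


def supervised_objective(label_probs, labels):
    """Sum of log P(y | x_1, ..., x_m) over the labeled examples."""
    return sum(math.log(p[y]) for p, y in zip(label_probs, labels))


def combined_objective(l2, l1, lam):
    """Supervised objective plus lam times the language-modeling objective."""
    return l2 + lam * l1


label_probs = [[0.9, 0.1], [0.2, 0.8]]  # hypothetical model outputs for two examples
labels = [0, 1]                          # their true labels
l2 = supervised_objective(label_probs, labels)
l1 = -12.0                               # hypothetical LM log-likelihood on the same data
l3 = combined_objective(l2, l1, lam=0.5)
```

Both terms are log-likelihoods to be maximized; a training loop would typically minimize the negative of `l3`.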

Task-specific input transformations

For this part, just read the original paper; looking at the figure is enough too.

Result

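Embedding series:

- "Minimalist" Industrial-Grade Recommender Systems--All your need is Embedding
- "Minimalist" Industrial-Grade Recommender Systems--Embedding--word2vec
- "Minimalist" Industrial-Grade Recommender Systems--Embedding--Glove
- "Minimalist" Industrial-Grade Recommender Systems--Embedding--fastText
- "Minimalist" Industrial-Grade Recommender Systems--Embedding--fastText--subword
- "Minimalist" Industrial-Grade Recommender Systems--Embedding--ELMo

Introduction to industrial-grade recommender systems series:

- "Minimalist" Industrial-Grade Recommender Systems (1): What is a recommender system?
- "Minimalist" Industrial-Grade Recommender Systems (2): A pipeline system built from 0 to 1
- "Minimalist" Industrial-Grade Recommender Systems (3): Collaborative filtering (UserCF, ItemCF)
- "Minimalist" Industrial-Grade Recommender Systems (4): Collaborative filtering (Matrix Factorization)
- "Minimalist" Industrial-Grade Recommender Systems (5): Logistic regression (LR)
- "Minimalist" Industrial-Grade Recommender Systems (6): XGBoost
- "Minimalist" Industrial-Grade Recommender Systems (7): Features (the lifeblood of a recommender system)
- "Minimalist" Industrial-Grade Recommender Systems (8): Machine learning platform (many people just call it an AI platform, balabala)

Multi-task learning series:

- "Minimalist" Industrial-Grade Recommender Systems--Multi-task learning (MTL, Multitask)
- "Minimalist" Industrial-Grade Recommender Systems--Multi-task learning (MTL, Multitask)--MOE
- "Minimalist" Industrial-Grade Recommender Systems--Multi-task learning (MTL, Multitask)--ESSM
- "Minimalist" Industrial-Grade Recommender Systems--Multi-task learning (MTL, Multitask)--SNR
- "Minimalist" Industrial-Grade Recommender Systems--Multi-task learning (MTL, Multitask)--PLE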

https://www.wanxiangsucai.com/read/cv180604
