
Paper link: https://arxiv.org/pdf/1705.00108.pdf

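Word embeddings learned from unlabeled text have become a standard component of NLP systems. In most cases, however, the recurrent network that turns word-level representations into context-sensitive ones is trained on only a small amount of labeled data. This paper proposes a semi-supervised approach that adds pre-trained context embeddings from bidirectional language models to sequence labeling systems. The model is evaluated on two datasets, for NER and for chunking, and achieves strong results on both.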

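In NLP tasks, knowing the meaning of a word matters, and knowing its context matters just as much.

The bidirectional RNN is normally trained on labeled data only. This paper proposes a semi-supervised approach that needs no additional labeled data.

A neural language model (LM) is pre-trained on a large unlabeled corpus, used to compute an encoding of the context at each position in the sequence, and that encoding is then fed into the supervised sequence tagging model.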

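The main contributions of the paper are:

1. Context-sensitive LM embeddings are used in supervised sequence tagging, which improves the F1 score on the tasks.

2. Using both forward and backward LM embeddings improves performance over a forward-only LM. Domain-specific pre-training is not necessary, as shown by applying an LM trained in the news domain to scientific papers.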

2. Language model augmented sequence taggers (TagLM)

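Baseline sequence tagging model:

Given a sentence of tokens (t1, t2, ..., tN), the model first builds a representation xk for each token by concatenating a character-based representation ck with a token embedding wk.

[Figure: baseline token representation]

ck captures morphological information and is obtained with a CNN or an RNN.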

The token embeddings, wk, are obtained as a lookup E(., .), initialized using pre-trained word embeddings, and fine-tuned during training.
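The concatenation of ck and wk can be sketched in a few lines. This is only an illustration: the embedding table `EMBEDDINGS`, the `<unk>` fallback, and the hand-made features in `char_representation` are assumptions standing in for the pre-trained lookup E(., .) and the CNN/RNN character encoder, which the notes do not spell out.

```python
# Hypothetical toy table standing in for the lookup E(., .); the values are made up.
EMBEDDINGS = {
    "the": [0.1, 0.2],
    "cat": [0.3, -0.4],
    "<unk>": [0.0, 0.0],  # assumed fallback for out-of-vocabulary tokens
}


def token_embedding(token):
    """w_k: look the token up, falling back to the <unk> vector."""
    return EMBEDDINGS.get(token.lower(), EMBEDDINGS["<unk>"])


def char_representation(token):
    """c_k: stand-in for the character CNN/RNN, using two hand-made
    features (starts with an uppercase letter, token length)."""
    return [float(token[:1].isupper()), float(len(token))]


def token_representation(token):
    """x_k = [c_k; w_k]: concatenate the two vectors."""
    return char_representation(token) + token_embedding(token)


x_k = token_representation("Cat")
print(x_k)
```

The character features keep the capitalization that the lowercased lookup throws away, which is the kind of morphological signal the character encoder is meant to supply.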

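First layer, a multi-layer bidirectional RNN:

[Figure: multi-layer bidirectional RNN]

Second layer, an RNN: it takes hk,1 as input and outputs hk,2.

Finally, the output of the last (here the second) RNN layer, hk,L, is used to predict a score for each possible tag, using a single dense layer.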
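As a rough sketch of that last step, a single dense layer maps hk,L to one score per tag. The tag names, weights, and bias below are made-up values for illustration, not taken from the paper.

```python
TAGS = ["O", "B-PER", "I-PER"]  # hypothetical tag set


def tag_scores(h, W, b):
    """Dense layer written with plain lists: one score per tag, W @ h + b."""
    return [
        sum(w_i * h_i for w_i, h_i in zip(row, h)) + bias
        for row, bias in zip(W, b)
    ]


h_kL = [1.0, -2.0, 0.5]  # assumed top-layer RNN output for one token
W = [
    [0.5, 0.0, 1.0],  # weights for "O"
    [0.0, 0.5, 0.0],  # weights for "B-PER"
    [1.0, 1.0, 1.0],  # weights for "I-PER"
]
b = [0.0, 1.0, 0.5]

scores = tag_scores(h_kL, W, b)
print(dict(zip(TAGS, scores)))
```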

It is beneficial to model and decode each sentence jointly instead of independently predicting the label for each token

Final layer: a conditional random field (CRF) loss is computed over the sentence, with parameters for each label bigram. Training uses the forward-backward algorithm, and the Viterbi algorithm is used to find the most likely tag sequence.
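To make the decoding step concrete, here is a minimal Viterbi sketch over per-token tag scores and label-bigram transition scores. The score values and the label indices (0 = O, 1 = B, 2 = I) are assumptions chosen for the example; the forward-backward computation of the training loss is not shown.

```python
def viterbi_decode(emissions, transitions):
    """Find the highest-scoring label sequence.

    emissions[k][y]   : score of label y at position k
    transitions[i][j] : score of the label bigram (i followed by j)
    Returns (best_path, best_score).
    """
    n_labels = len(emissions[0])
    scores = list(emissions[0])  # best score of any path ending in each label
    backpointers = []
    for emit in emissions[1:]:
        new_scores, pointers = [], []
        for j in range(n_labels):
            # Best previous label i for a path that ends in label j here.
            best_i = max(range(n_labels),
                         key=lambda i: scores[i] + transitions[i][j])
            new_scores.append(scores[best_i] + transitions[best_i][j] + emit[j])
            pointers.append(best_i)
        scores = new_scores
        backpointers.append(pointers)
    # Follow the back-pointers from the best final label.
    last = max(range(n_labels), key=lambda y: scores[y])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, scores[last]


emissions = [[2.0, 1.0, 0.0], [0.5, 0.4, 1.0], [1.0, 0.2, 0.3]]
transitions = [
    [0.0, 0.0, -10.0],  # O -> I is heavily penalised
    [0.0, -1.0, 1.0],   # B -> I is encouraged
    [0.0, 0.0, 0.5],
]
path, score = viterbi_decode(emissions, transitions)
print(path, score)  # [1, 2, 0] 4.0
```

Picking the best label independently at each position would give O, I, O here, which contains the penalised O to I bigram. Joint decoding returns B, I, O instead.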

Bidirectional LM:

A language model computes the probability of a token sequence (t1, t2, ..., tN).

[Figure: forward LM]

[Figure: backward LM]
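For reference, the forward and backward language models correspond to the two standard chain-rule factorizations of the sequence probability. The forward LM predicts each token from the tokens before it:

$$p(t_1, t_2, \ldots, t_N) = \prod_{k=1}^{N} p(t_k \mid t_1, t_2, \ldots, t_{k-1})$$

The backward LM predicts each token from the tokens after it:

$$p(t_1, t_2, \ldots, t_N) = \prod_{k=1}^{N} p(t_k \mid t_{k+1}, t_{k+2}, \ldots, t_N)$$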

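LM embedding: concatenate the forward and backward embeddings, hkLM = [forward hkLM; backward hkLM].

Finally, the LM is combined with the sequence model:

Besides plain concatenation, another option is to add a nonlinear mapping,

where f is a nonlinear function.

Another option is to add an attention-like mechanism.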
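The first two options can be sketched as follows. All vectors, the weight matrix `W`, the bias `b`, and the choice of tanh for f are assumptions made for illustration. The attention-like variant is not shown.

```python
import math


def concat(*vectors):
    """Concatenate any number of vectors into one list."""
    out = []
    for v in vectors:
        out.extend(v)
    return out


def dense(W, b, x, activation=math.tanh):
    """f(W x + b), with tanh as an assumed choice for the nonlinearity f."""
    return [
        activation(sum(w * xi for w, xi in zip(row, x)) + bias)
        for row, bias in zip(W, b)
    ]


# LM embedding: forward and backward LM states for token k (made-up values).
h_lm = concat([0.2, -0.1], [0.4, 0.3])

# First-layer RNN states of the tagger for the same token (made-up values).
h_rnn_fwd = [0.5, 0.5]
h_rnn_bwd = [-0.5, 1.0]

# Option 1: plain concatenation.
h_k1_concat = concat(h_rnn_fwd, h_rnn_bwd, h_lm)

# Option 2: nonlinear mapping of the concatenated vector.
W = [[0.1] * 8, [0.0] * 8]  # hypothetical 2 x 8 weight matrix
b = [0.0, 0.0]
h_k1_nonlinear = dense(W, b, h_k1_concat)

print(len(h_k1_concat), len(h_k1_nonlinear))
```

Plain concatenation adds no parameters at the combination step, while the mapping lets the model mix the tagger and LM states and choose the output size.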

Experiments: TagLM is compared with systems that use no additional labeled data, and with systems that do use additional labeled data.

Where to add the LM embedding: 1. the input to the first RNN layer; 2. the output of the first RNN layer; 3. the output of the second RNN layer.

Conclusion: augmenting the token representations of a sequence tagging model with a pre-trained neural language model outperforms other models for NER.


Reading notes on "Semi-supervised sequence tagging with bidirectional language models"
