Paper Abstract
Purpose
This paper proposes the dynamic chunk reader (DCR), an end-to-end neural reading comprehension (RC) model that extracts and ranks a set of answer candidates from a given document to answer questions.
Model Overview
Dataset: experiments on the Stanford Question Answering Dataset (SQuAD), which contains a variety of human-generated factoid and non-factoid questions, show the effectiveness of the paper's three contributions.
DCR encodes a document and an input question with recurrent neural networks, then applies a word-by-word attention mechanism to obtain question-aware representations of the document. It then generates chunk representations, and a ranking module proposes the top-ranked chunk as the answer.
Results
DCR achieves state-of-the-art exact match (EM) and F1 scores on the SQuAD dataset.
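To make the two metrics concrete, here is a minimal sketch of how EM and token-level F1 are commonly computed for SQuAD-style answers. The normalization steps (lowercasing, stripping punctuation and English articles) and the function names are assumptions for illustration, not code taken from the paper.

```python
import re
import string
from collections import Counter

PUNCTUATION = set(string.punctuation)


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace (assumed normalization)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1_score(prediction: str, truth: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    true_tokens = normalize(truth).split()
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)
```

EM only rewards a prediction that matches the reference exactly, while F1 gives partial credit for overlapping tokens, which matters more for the longer non-factoid answers.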
Background
**Reading comprehension-based question answering (RCQA)**
- The task is to answer a question with a chunk of text taken from the related document(s).
- In previous models, the answer boundary is either easy to determine or already given.
- In real-world QA scenarios, people may ask questions about both entities (factoid) and non-entities such as explanations and reasons (non-factoid).
Question types: factoid and non-factoid
Q1 and Q2 are factoid questions; Q3 is a non-factoid question.

**Dynamic chunk reader**
- First, it uses deep networks to learn better representations for candidate answer chunks, instead of using fixed feature representations.
- Second, it represents answer candidates as chunks, instead of word-level representations.
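To illustrate what treating candidates as chunks means, the sketch below enumerates every contiguous span of a tokenized passage up to a maximum length. The `max_len` cap and the function name are hypothetical; these notes do not say how DCR limits its candidate set.

```python
from typing import List, Tuple


def enumerate_chunks(tokens: List[str], max_len: int) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end inclusive, for all chunks of 1..max_len tokens."""
    spans = []
    for start in range(len(tokens)):
        # The end index is capped by both max_len and the passage length.
        for end in range(start, min(start + max_len, len(tokens))):
            spans.append((start, end))
    return spans


passage = ["DCR", "ranks", "answer", "chunks"]
candidates = enumerate_chunks(passage, max_len=2)
```

Each candidate is a pair of indices into the passage, so overlapping chunks such as (0, 1) and (1, 2) are both kept and left for the ranker to compare.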
**Contributions**
The contributions are three-fold:
- A novel neural network model for joint candidate answer chunking and ranking. In this model the candidate answer chunks are dynamically constructed and ranked in an end-to-end manner.
- A new **question-attention mechanism** that enhances the passage word representations used to construct chunk representations.
- Several simple but effective features that strengthen the attention mechanism, which fundamentally improves candidate ranking.
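Key Points
Problem Definition
Given a passage P, the task is to answer a factoid or non-factoid question Q by selecting a piece of text A from P.
Q, P, and A are all word sequences that share a single vocabulary V.
The training set consists of triples (P, Q, A).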
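RC task types:
Quiz-style (e.g., MovieQA): each question comes with multiple answer options.
Cloze-style: questions are usually generated automatically by replacing part of a sentence with a blank to be filled in.
Answer selection: a portion of the text is selected as the answer.
TREC-QA: factoid answers are extracted from multiple given passages.
bAbI: the task is to infer intent.
SQuAD: covers extraction of both factoid and non-factoid answers, which is closer to real-world use.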
Baseline: Chunk-and-Rank Pipeline with Neural RC
The baseline adapts a state-of-the-art model for cloze-style tasks to the answer extraction task of this paper.
It has two main components:
- Answer Chunking: a standalone answer chunker, which is trained to produce overlapping candidate chunks.
- Feature Extraction and Ranking: a neural RC model, which scores each word in a given passage; the word scores are then used to generate chunk scores.
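The second stage of the baseline can be pictured as turning per-word scores into per-chunk scores. The sketch below averages the word scores inside each candidate chunk. The averaging rule and the names `score_chunks` and `word_scores` are assumptions made for illustration, since these notes do not state how the baseline aggregates word scores.

```python
from typing import List, Tuple

Chunk = Tuple[int, int]  # (start, end) token indices, end inclusive


def score_chunks(word_scores: List[float], chunks: List[Chunk]) -> List[Tuple[Chunk, float]]:
    """Score each chunk as the mean of its word scores (assumed rule), best first."""
    scored = []
    for start, end in chunks:
        span = word_scores[start:end + 1]
        scored.append(((start, end), sum(span) / len(span)))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy word scores, standing in for the output of the neural RC model.
word_scores = [0.1, 0.8, 0.9, 0.2]
ranked = score_chunks(word_scores, [(0, 1), (1, 2), (2, 3)])
best_chunk = ranked[0][0]
```

Because chunking and scoring are separate stages here, an error made by the chunker cannot be corrected by the ranker, which is the limitation that the end-to-end DCR design targets.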
DCR

DCR works in four steps:
- First, the encoder layer encodes the passage and the question separately with bidirectional recurrent neural networks (RNNs), producing a hidden state for every word.
- Second, the attention layer calculates the relevance of each passage word to the question, using a word-by-word style attention method.
- Third, the chunk representation layer dynamically extracts candidate chunks from the given passage and creates, for each chunk, a representation that encodes its contextual information. This involves first determining the boundaries of the candidate chunks and then choosing a pooling method.
- Fourth, the ranker layer scores the relevance between the representation of each chunk and the given question (using cosine similarity), and ranks all candidate chunks using a softmax layer.
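The last three steps can be sketched on toy vectors that stand in for the encoder's hidden states. Several choices below are assumptions for illustration rather than the paper's formulation: dot-product attention scores, adding the attended question vector to each passage vector, and mean pooling for both chunks and the question. Cosine similarity and the softmax over candidates follow the description above.

```python
import math
from typing import List, Tuple

Vector = List[float]
Chunk = Tuple[int, int]  # (start, end) token indices, end inclusive


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def softmax(scores: List[float]) -> List[float]:
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(passage: List[Vector], question: List[Vector]) -> List[Vector]:
    """Word-by-word attention: one softmax-weighted question summary per passage word."""
    summaries = []
    for p in passage:
        weights = softmax([dot(p, q) for q in question])
        summaries.append([sum(w * q[d] for w, q in zip(weights, question))
                          for d in range(len(question[0]))])
    return summaries


def mean_pool(vectors: List[Vector]) -> Vector:
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u: Vector, v: Vector) -> float:
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm > 0 else 0.0


def rank_chunks(passage: List[Vector], question: List[Vector],
                chunks: List[Chunk]) -> List[Tuple[Chunk, float]]:
    """Return (chunk, probability) pairs, most probable first."""
    # Question-aware word vectors: passage vector plus attended summary (assumed).
    aware = [[a + b for a, b in zip(p, s)]
             for p, s in zip(passage, attend(passage, question))]
    question_vec = mean_pool(question)
    sims = [cosine(mean_pool(aware[s:e + 1]), question_vec) for s, e in chunks]
    return sorted(zip(chunks, softmax(sims)), key=lambda item: item[1], reverse=True)


passage = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
question = [[0.0, 1.0]]
ranked = rank_chunks(passage, question, [(0, 0), (1, 2)])
```

In this toy input the chunk (1, 2) points in the same direction as the question vector, so it receives the higher probability. A real model would replace the toy vectors with learned RNN states.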
Experiments
Stanford Question Answering Dataset (SQuAD)
Characteristics: it contains both factoid and non-factoid questions.
It has 100k question-passage pairs drawn from 536 Wikipedia articles.
Input word vector: 5 parts
- a pre-trained 300-dimensional GloVe embedding;
- a one-hot encoding (46 dimensions) for the part-of-speech (POS) tag of w;
- a one-hot encoding (14 dimensions) for the named entity (NE) tag of w;
- a binary value indicating whether w's surface form is the same as any word in the question;
- a binary value indicating whether the lemma form of w is the same as any word in the question.
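A minimal sketch of assembling the five parts into one input vector follows. It assumes the parts are simply concatenated, which gives 300 + 46 + 14 + 1 + 1 = 362 dimensions, and that the lemma feature compares against the lemmas of the question words. The function and argument names are hypothetical, and the tag indices would come from the POS and NE tagger.

```python
from typing import List, Sequence, Set

GLOVE_DIM, POS_DIM, NE_DIM = 300, 46, 14


def one_hot(index: int, size: int) -> List[float]:
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def build_input_vector(glove: Sequence[float], pos_index: int, ne_index: int,
                       word: str, lemma: str,
                       question_words: Set[str], question_lemmas: Set[str]) -> List[float]:
    """Concatenate embedding, POS one-hot, NE one-hot and two match flags (assumed layout)."""
    if len(glove) != GLOVE_DIM:
        raise ValueError(f"expected a {GLOVE_DIM}-dimensional embedding")
    surface_match = 1.0 if word in question_words else 0.0
    lemma_match = 1.0 if lemma in question_lemmas else 0.0
    return (list(glove) + one_hot(pos_index, POS_DIM) + one_hot(ne_index, NE_DIM)
            + [surface_match, lemma_match])


# Toy call: a zero embedding and made-up tag indices for the word "ran".
vec = build_input_vector([0.0] * GLOVE_DIM, 3, 1, "ran", "run",
                         {"who", "ran"}, {"who", "run"})
```

The two match flags give the model a direct signal about word overlap with the question, separately from the learned embedding.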
Training
We pre-processed the SQuAD dataset using the Stanford CoreNLP tool (Manning et al. 2014) with its default settings to tokenize the text and obtain the POS and NE annotations.
To train our model, we used stochastic gradient descent with the ADAM optimizer.
Experimental Results

We also studied how each component in our model contributes to the overall performance.

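Summary
Previous models for QA handled only factoid questions: they either predicted a single named entity as the answer or selected an answer from a predefined candidate list.
This paper proposes a new neural reading comprehension model for QA. Its novelty is as follows:
It proposes a joint neural network model, strengthened by a new attention model and 5 features, that handles both factoid and non-factoid questions.
Limitation: predicting long answers still needs improvement.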