Quick notes
A Comprehensive Survey on Schema-based Event Extraction with Deep Learning
Some concepts
- Entity: An entity is an object or group of objects in a semantic category. Entities mainly include people, organizations, places, times, things, etc.
- Event mention: A phrase or sentence that describes an event; it contains a trigger and the corresponding arguments.
- Event type: The event type describes the nature of the event and refers to the category to which the event corresponds, usually represented by the type of the event trigger.
- Event trigger: The event trigger is the core unit in event extraction, a verb or a noun. Trigger identification is a key step in pipeline-based event extraction.
- Event argument: Event arguments are the main attributes of an event. They include entities, non-entity participants, times, and so on.
- Argument role: An argument role is the role played by an argument in an event, that is, a representation of the relationship between the event argument and the event trigger.
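The concepts above map naturally onto a small data structure. The following is only a minimal sketch: the class names, field names, the example sentence, and the labels (`Attack`, `Place`, `Time`, `Instrument`) are made up for illustration and are not taken from the survey or from any dataset.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Argument:
    text: str   # the entity, non-entity participant, or time expression
    role: str   # role the argument plays relative to the trigger


@dataclass
class EventMention:
    sentence: str     # phrase or sentence that describes the event
    trigger: str      # core unit of the event, a verb or a noun
    event_type: str   # category of the event, usually implied by the trigger
    arguments: List[Argument] = field(default_factory=list)


def roles_of(mention: EventMention) -> Dict[str, str]:
    """Map each argument role to the text that fills it."""
    return {arg.role: arg.text for arg in mention.arguments}


# Hypothetical example; the labels are illustrative only.
mention = EventMention(
    sentence="A bomb exploded in Baghdad on Tuesday.",
    trigger="exploded",
    event_type="Attack",
    arguments=[
        Argument(text="bomb", role="Instrument"),
        Argument(text="Baghdad", role="Place"),
        Argument(text="Tuesday", role="Time"),
    ],
)

print(roles_of(mention))
```

Keeping the role on the argument rather than on the entity reflects the definition above: a role describes the relationship between an argument and a trigger, so the same entity could fill different roles in different events.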
Schema-based EE can include the following subtasks
- Event classification: Event classification is to determine whether each sentence describes an event. Furthermore, if it does, we need to determine the one or more event types the sentence belongs to. Therefore, the event classification subtask can be seen as a multi-label text classification task.
- Trigger identification: It is generally considered that the trigger is the core unit in event extraction that can clearly express an event's occurrence. The trigger identification subtask is to find the trigger in the text.
- Argument identification: Argument identification is to identify, from the text, all the arguments contained in an event type. It usually depends on the results of event classification and trigger identification.
- Argument role classification: Argument role classification assigns each identified argument a role category drawn from the event extraction schema. Thus, it can also be seen as a multi-label text classification task.
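To make the dependency between the four subtasks concrete, here is a toy pipeline sketch. It swaps the learned models for plain dictionary lookups, so it is not a deep learning method and not the survey's approach. The schema, the lexicon, and all function names are hypothetical, and for simplicity event classification is derived from the triggers that were found.

```python
# Hypothetical toy schema: event type -> trigger words and allowed roles.
SCHEMA = {
    "Attack": {"triggers": {"exploded", "attacked"},
               "roles": {"Place", "Time", "Instrument"}},
    "Transport": {"triggers": {"arrived", "flew"},
                  "roles": {"Destination", "Time", "Agent"}},
}

# Hypothetical lexicon standing in for a learned argument/role classifier.
ROLE_LEXICON = {"Baghdad": "Place", "Tuesday": "Time", "bomb": "Instrument"}


def tokenize(sentence):
    return sentence.rstrip(".").split()


def identify_triggers(tokens):
    """Trigger identification: return (trigger, event_type) pairs."""
    return [(tok, etype)
            for tok in tokens
            for etype, spec in SCHEMA.items()
            if tok.lower() in spec["triggers"]]


def classify_events(triggers):
    """Event classification: multi-label, so return a set (empty = no event)."""
    return {etype for _, etype in triggers}


def identify_arguments(tokens, event_types):
    """Argument identification: depends on an event having been detected."""
    if not event_types:
        return []
    return [tok for tok in tokens if tok in ROLE_LEXICON]


def classify_roles(arguments, event_types):
    """Argument role classification: keep roles the schema allows."""
    allowed = set().union(*(SCHEMA[t]["roles"] for t in event_types))
    return {arg: ROLE_LEXICON[arg]
            for arg in arguments if ROLE_LEXICON[arg] in allowed}


def extract(sentence):
    tokens = tokenize(sentence)
    triggers = identify_triggers(tokens)
    event_types = classify_events(triggers)
    arguments = identify_arguments(tokens, event_types)
    return {
        "event_types": event_types,
        "triggers": triggers,
        "roles": classify_roles(arguments, event_types),
    }


result = extract("A bomb exploded in Baghdad on Tuesday.")
print(result)
```

Because each stage consumes the output of the previous one, a mistake early on (for example a missed trigger) leaves the later stages with nothing to work on, which is the usual concern with pipeline-style extraction.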
EE can be framed as the following tasks
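- Classification task: predefine n event types and their corresponding roles; for example, each event comprises a set of roles. Given an event mention m, the model outputs a vector T giving the probability that m belongs to each event type. This is generally handled as a multi-label classification task.
- Sequence tagging: essentially NER over role + text.
- MRC: define a question schema -> extract argument roles -> determine the event type -> run MRC based on the question.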
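The three framings above differ mainly in what the model outputs. The sketch below illustrates each output format with hand-written inputs rather than a trained model. The event type list, the 0.5 threshold, the per-type sigmoid, the BIO tag scheme, and the question template are all assumptions made for illustration; the notes do not specify them.

```python
import math

EVENT_TYPES = ["Attack", "Transport", "Meet"]  # hypothetical n = 3 predefined types


def multilabel_predict(logits, threshold=0.5):
    """Classification view: one sigmoid per event type yields the vector T;
    every type whose probability reaches the threshold is predicted."""
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    return [t for t, p in zip(EVENT_TYPES, probs) if p >= threshold]


def decode_bio(tokens, tags):
    """Sequence tagging view: turn BIO role tags into (role, text) spans."""
    spans, role, words = [], None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if role is not None:
                spans.append((role, " ".join(words)))
            role, words = tag[2:], [token]
        elif tag.startswith("I-") and role == tag[2:]:
            words.append(token)
        else:
            # "O" tag or an inconsistent "I-" tag closes the open span.
            if role is not None:
                spans.append((role, " ".join(words)))
            role, words = None, []
    if role is not None:
        spans.append((role, " ".join(words)))
    return spans


def build_questions(event_type, roles):
    """MRC view: one question per role, built from a hypothetical template."""
    return {role: f"What is the {role.lower()} of the {event_type.lower()} event?"
            for role in roles}


predicted = multilabel_predict([2.0, -1.0, 0.3])
spans = decode_bio(
    ["A", "bomb", "exploded", "in", "New", "York"],
    ["O", "B-Instrument", "O", "O", "B-Place", "I-Place"],
)
questions = build_questions("Attack", ["Place", "Time"])
print(predicted, spans, questions)
```

In the MRC framing, each generated question would be passed to a reading-comprehension model together with the sentence, and the answer span becomes the argument for that role.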
EE models
- Traditional extraction models
- DL
  - CNN-based
  - RNN-based
  - Attention-based
  - GCN-based
  - Transformer-based
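Research challenges
- Extracting the dependencies between event arguments
- Few large-scale datasets
- Few-shot learning
- General-purpose event schemas are scarce; currently they are all built by hand
- Specialized domains are harder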
