关系抽取调研——学术界（更新中。。。

1. 任务
2. 方法总结
- 2.1. 早期方法
  - 2.1.1. 基于字典的方法
  - 2.1.2. 基于规则的方法
  - 2.1.3.基于本体的方法
- 2.2. 机器学习
  - 2.2.1. 监督学习
    - 2.2.1.1.基于特征的方法
    - 2.2.1.2.基于内核的方法
  - 2.2.2. 半监督学习
    - 2.2.2.1.Bootstrap
    - 2.2.2.2.联合训练
    - 2.2.2.3.标签传播
  - 2.2.3.无监督学习
    - 关系实例聚类和关系类型词选择
- 2.3. 深度学习
  - 2.3.1. Pipeline（管道式）
  - 2.3.2. Joint（联合抽取式）
- 2.4.远监督学习
4. Paper List
- 4.1. 论文列表
  - 4.1.1. 监督类方法
  - 4.1.2. 远监督方法
- 4.2. 论文解读
- 5.相关链接
- 6.参考资源

1.1. 任务定义

自动识别句子中实体之间具有的某种语义关系。根据参与实体的多少可以分为二元关系抽取（两个实体）和多元关系抽取（三个及以上实体）。

通过关注两个实体间的语义关系，可以得到（arg1, relation, arg2）三元组，其中arg1和arg2表示两个实体，relation表示实体间的语义关系。

根据处理数据源的不同，关系抽取可以分为以下三种：

面向结构化文本的关系抽取：包括表格文档、XML文档、数据库数据等
面向非结构化文本的关系抽取：纯文本
面向半结构化文本的关系抽取：介于结构化和非结构化之间

根据抽取文本的范围不同，关系抽取可以分为以下两种：

句子级关系抽取：从一个句子中判别两个实体间是何种语义关系
语料（篇章）级关系抽取：不限定两个目标实体所出现的上下文

根据所抽取领域的划分，关系抽取又可以分为以下两种：

限定域关系抽取：在一个或者多个限定的领域内对实体间的语义关系进行抽取，限定关系的类别，可看成是一个文本分类任务
开放域关系抽取：不限定关系的类别

限定域关系抽取方法：

基于模板的关系抽取方法：通过人工编辑或者学习得到的模板对文本中的实体关系进行抽取和判别，受限于模板的质量和覆盖度，可扩张性不强
基于机器学习的关系抽取方法：将关系抽取看成是一个分类问题

1.2. 常见数据集

ACE 2005

数据集简介：ACE2005语料库是语言数据联盟(LDC)发布的由实体，关系和事件注释组成的各种类型的数据，包括英语，阿拉伯语和中文培训数据，目标是开发自动内容提取技术，支持以文本形式自动处理人类语言。ACE语料解决了五个子任务的识别：entities、values、temporal expressions、relations and events。这些任务要求系统处理文档中的语言数据，然后为每个文档输出有关其中提到或讨论的实体，值，时间表达式，关系和事件的信息。

获取方式：数据集收费，需在LDC联盟的官网上注册再购买，LDC账号注册地址 ACE 2005 下载地址
TACRED

数据集简介：TACRED(TAC Relation Extraction Dataset)是一个拥有106264条实例的大规模关系抽取数据集，这些数据来自于每年的TAC KBP（TAC Knowledge Base Population）比赛使用的语料库中的新闻专线和网络文本。包含了41关系类型，此外若句子无定义关系，被标注成no_relation类型。数据集的详细介绍可以访问TACRED文档

获取方式：数据集收费，需在LDC联盟官网注册会员再购买 LDC账号注册地址 TACRED 下载地址
SemEval2010_task8

数据集简介:对于给定了的句子和两个做了标注的名词，从给定的关系清单中选出最合适的关系。数据集一共9种关系类别数，此外包含一类Other关系，含有6674实例数量。

获取方式: 原始数据
FewRel

数据集简介:FewRel是目前最大规模的精标注关系抽取数据集，由孙茂松教授领导的清华大学自然语言处理实验室发布。一共100种关系类别数，含有70000实例数量。

获取方式：FewRel 网站地址论文地址
NYT10

NYT-10数据集文本来源于纽约时报，命名实体是通过 Stanford NER 工具并结合 Freebase 知识库进行标注的。实体对之间的关系是链接Freebase知识库中的关系，结合远监督方法所得到。该数据集共含有53种关系类型，包括特殊关系类型NA，即头尾实体无关系。

获取方式:原始数据

获取更多关系抽取数据集，可访问此处Annotated-Semantic-Relationships-Datasets

1.3. 评测标准

二分类：

Accuracy = (预测正确的样本数)/(总样本数)=(TP+TN)/(TP+TN+FP+FN)

Precision = (预测为正例且正确预测的样本数)/(所有预测为正例的样本数) = TP/(TP+FP)

Recall = (预测为正例且正确预测的样本数)/(所有真实情况为正例的样本数) = TP/(TP+FN)

F1 = 2 * (Precision * Recall) / (Precision + Recall )

多分类：

Macro Average

多类别（N类） F1/P/R的计算，即计算N个类别的F1/P/R，每次计算以当前类别为正例，其他所有类别为负例，最终将各类别结果求和并除以类别数取平均。

Micro Average

统计当前类别的TP、TN、FP、FN数量，再将该四类样本数各自求和作为新的TP、TN、FP、FN，计算F1/P/R公式同二分类。

P@N（最高置信度预测精度）:

通常在远监督关系抽取中使用到，由于知识库所含关系实例的不完善，会出现高置信度包含关系实例的实体对被叛为负例，从而低估了系统正确率。此时可以采用人工评价，将预测结果中知识库已包含的三元组移除，然后人工判断抽取关系实例是否正确，按照top N的准确率对抽取效果进行评价。

1.4. SOTA

Relation Extraction on TACRED：

模型	average F1	论文题目	年份	论文链接	code
BERTEM+MTB	71.5	Matching the Blanks: Distributional Similarity for Relation Learning	2019	https://arxiv.org/pdf/1906.03158v1.pdf	https://github.com/plkmo/BERT-Relation-Extraction
KnowBert-W+W	71.5	Knowledge Enhanced Contextual Word Representations	2019	https://arxiv.org/pdf/1909.04164v2.pdf
DG-SpanBERT	71.5	Efficient long-distance relation extraction with DG-SpanBERT	2020	https://arxiv.org/pdf/2004.03636v1.pdf
SpanBERT	70.8	SpanBERT: Improving Pre-training by Representing and Predicting Spans	2019	https://arxiv.org/pdf/1907.10529v3.pdf	https://github.com/facebookresearch/SpanBERT
R-BERT	69.4	Enriching Pre-trained Language Model with Entity Information for Relation Classification	2020	https://arxiv.org/pdf/1905.08284v1.pdf	https://github.com/wang-h/bert-relation-classification
C-GCN + PA-LSTM	68.2	Graph Convolution over Pruned Dependency Trees Improves Relation Extraction	2018	https://arxiv.org/pdf/1809.10185v1.pdf	https://github.com/qipeng/gcn-over-pruned-trees

Relation Extraction on SemEval-2010 Task 8：

模型	average F1	论文题目	年份	论文链接	code
Skeleton-Aware BERT	90.36	Enhancing Relation Extraction Using Syntactic Indicators and Sentential Contexts	2019	https://arxiv.org/pdf/1912.01858v1.pdf	https://github.com/wang-h/bert-relation-classification
EPGNN	90.2	Improving Relation Classification by Entity Pair Graph	2019	http://proceedings.mlr.press/v101/zhao19a/zhao19a.pdf
BERTEM+MTB	89.5	Matching the Blanks: Distributional Similarity for Relation Learning	2019	https://arxiv.org/pdf/1906.03158v1.pdf	https://github.com/plkmo/BERT-Relation-Extraction
R-BERT	89.25	Enriching Pre-trained Language Model with Entity Information for Relation Classification	2020	https://arxiv.org/pdf/1905.08284v1.pdf	https://github.com/wang-h/bert-relation-classification
KnowBert-W+W	89.1	Knowledge Enhanced Contextual Word Representations	2019	https://arxiv.org/pdf/1909.04164v2.pdf
Entity-Aware BERT	89	Extracting Multiple-Relations in One-Pass with Pre-Trained Transformers	2019	https://arxiv.org/pdf/1902.01030v2.pdf	https://github.com/helloeve/mre-in-one-pass

Relation Extraction on ACE 2005：

模型	RELATION F1	ENTITY F1	SENTENCE ENCODER	论文题目	年份	论文链接	code
MRC4ERE++	62.1	85.5	BERT base	Asking Effective and Diverse Questions: A Machine Reading Comprehension based Framework for Joint Entity-Relation Extraction	2020	https://www.ijcai.org/Proceedings/2020/0546.pdf	https://github.com/TanyaZhao/MRC4ERE
Multi-turn QA	60.2	84.8	BERT base	Entity-Relation Extraction as Multi-Turn Question Answering	2019	https://arxiv.org/pdf/1905.05529v4.pdf
MRT	59.6	83.6	biLSTM	Extracting Entities and Relations with Joint Minimum Risk Training	2018	https://www.aclweb.org/anthology/D18-1249
GCN	59.1	84.2	biLSTM	Joint Type Inference on Entities and Relations via Graph Convolutional Networks	2019	https://www.aclweb.org/anthology/P19-1131
Global	57.5	83.6	biLSTM	End-to-End Neural Relation Extraction with Global Optimization	2017	https://www.aclweb.org/anthology/D17-1182
SPTree	55.6	83.4	biLSTM	End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures	2016	https://arxiv.org/pdf/1601.00770v3.pdf	https://github.com/tticoin/LSTM-ER

Relation Extraction on ACE 2004：

模型	RELATION F1	ENTITY F1	论文题目	年份	论文链接	code
DYGIE	59.7	87.4	A General Framework for Information Extraction using Dynamic Span Graphs	2019	https://arxiv.org/pdf/1904.03296v1.pdf	https://github.com/luanyi/DyGIE
Multi-turn QA	49.4	83.6	Entity-Relation Extraction as Multi-Turn Question Answering	2019	https://arxiv.org/pdf/1905.05529v4.pdf
SPTree	48.4	81.8	End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures	2016	https://arxiv.org/pdf/1601.00770v3.pdf	https://github.com/tticoin/LSTM-ER
multi-head + AT	47.45	81.64	Adversarial training for multi-context joint entity and relation extraction	2018	https://arxiv.org/pdf/1808.06876v3.pdf	https://github.com/bekou/multihead_joint_entity_relation_extraction
multi-head	47.14	81.16	Joint entity recognition and relation extraction as a multi-head selection problem	2018	https://arxiv.org/pdf/1804.07847v3.pdf	https://github.com/bekou/multihead_joint_entity_relation_extraction
Attention	45.7	79.6	Going out on a limb: Joint Extraction of Entity Mentions and Relations without Dependency Trees	2017	https://www.aclweb.org/anthology/P17-1085

Relation Extraction on NYT：

模型	average F1	论文题目	年份	论文链接	code
REDN	89.8	Downstream Model Design of Pre-trained Language Model for Relation Extraction Task	2020	https://arxiv.org/pdf/2004.03786v1.pdf	https://github.com/slczgwh/REDN
CASREL	89.6	A Novel Cascade Binary Tagging Framework	2019	https://arxiv.org/pdf/1909.03227v4.pdf	https://github.com/weizhepei/CasRel
HBT	89.5	A Novel Cascade Binary Tagging Framework for Relational Triple Extraction	2019	https://arxiv.org/pdf/1909.03227v4.pdf	https://github.com/weizhepei/CasRel
WDec	84.4	Effective Modeling of Encoder-Decoder Architecture for Joint Entity and Relation Extraction	2019	https://arxiv.org/pdf/1911.09886v1.pdf	https://github.com/nusnlp/PtrNetDecoding4JERE
ETL-Span	78.0	Joint Extraction of Entities and Relations Based on a Novel Decomposition Strategy	2019	https://arxiv.org/pdf/1909.04273v3.pdf	https://github.com/yubowen-ph/JointER
CopyRE' OneDecoder	72.2	CopyMTL: Copy Mechanism for Joint Extraction of Entities and Relations with Multi-Task Learning	2019	https://arxiv.org/pdf/1911.10438v1.pdf	https://github.com/WindChimeRan/CopyMTL

Relation Extraction on CoNLL04：

模型	RELATION F1	ENTITY F1	论文题目	年份	论文链接	code
SpERT	71.47	88.94	Span-based Joint Entity and Relation Extraction with Transformer Pre-training	2019	https://arxiv.org/pdf/1909.07755v3.pdf	https://github.com/markus-eberts/spert
Multi-turn QA	68.9	87.8	Entity-Relation Extraction as Multi-Turn Question Answering	2019	https://arxiv.org/pdf/1905.05529v4.pdf
Global	67.8	85.6	End-to-End Neural Relation Extraction with Global Optimization	2017	https://www.aclweb.org/anthology/D17-1182
Biaffine attention	64.40	86.20	End-to-end neural relation extraction using deep biaffine attention	2018	https://arxiv.org/pdf/1812.11275v1.pdf	https://github.com/datquocnguyen/jointRE
Relation-Metric with AT	62.29	84.15	Neural Metric Learning for Fast End-to-End Relation Extraction	2019	https://arxiv.org/pdf/1905.07458v4.pdf
multi-head	62.04	83.9	Joint entity recognition and relation extraction as a multi-head selection problem	2018	https://arxiv.org/pdf/1804.07847v3.pdf	https://github.com/bekou/multihead_joint_entity_relation_extraction

Relation Extraction on FewRel：

模型	average F1	论文题目	年份	论文链接	code
ERNIE	88.32	ERNIE: Enhanced Language Representation with Informative Entities	2019	https://arxiv.org/pdf/1905.07129v3.pdf	https://github.com/thunlp/ERNIE

6. Paper List

6.1. 论文列表

6.1.1. 监督类方法

6.1.1.1. 利用语法信息的方法

论文题目	抽取任务	关键词	论文链接	会议及年份	code
Attention Guided Graph Convolutional Networks for Relation Extraction	关系提取	注意力导向图卷积网络（AGGCN）；语义依赖树；软修剪；自动学习子结构；	https://www.aclweb.org/anthology/P19-1024.pdf	ACL2019
A Richer-but-Smarter Shortest Dependency Path with Attentive Augmentation for Relation Extraction	关系提取	最短依赖路径SDP；注意力模型；深度神经模型；LSTM网络；CNN	https://www.aclweb.org/anthology/N19-1298	NAACL 2019	https://github.com/catcd/RbSP

1.1.1.2. 不利用语法信息的方法

论文题目	抽取任务	关键词	论文链接	会议及年份
Joint Type Inference on Entities and Relations via Graph Convolutional Networks	抽取三元组的joint任务	实体关系联合推断；图卷积模型（GCN）；二元关系分类	https://pdfs.semanticscholar.org/7ce8/ce2768907421fb1a6cbfe13a8a36992721a7.pdf	ACL2019
GraphRel: Modeling Text as Relational Graphs for Joint Entity and Relation Extraction	抽取三元组的joint任务	端到端关系抽取；图卷积网络；	https://tsujuifu.github.io/pubs/acl19_graph-rel.pdf	ACL2019
Exploiting Entity BIO Tag Embeddings and Multi-task Learning for Relation Extraction with Imbalanced Data	关系抽取	BIO字符/词嵌入；多任务体系结构；关系分类	https://arxiv.org/pdf/1906.08931.pdf	ACL2019
Entity-Relation Extraction as Multi-turn Question Answering	关系抽取	多回合QA；上下文识别答案范围任务	https://arxiv.org/pdf/1905.05529.pdf	ACL2019
Graph Neural Networks with Generated Parameters for Relation	关系抽取	图神经网络（GNN）；多跳关系推理	https://arxiv.org/pdf/1902.00756.pdf	ACL2019
Kernelized Hashcode Representations for Biomedical Relation Extraction	关系分类	核化的局部敏感哈希（KLSH）；降低计算成本	https://arxiv.org/pdf/1711.04044.pdf	ACL2019
Connecting the Dots: Document-level Neural Relation Extraction with Edge-oriented Graphs	关系抽取	图神经网络模型；文档级关系提取	https://arxiv.org/pdf/1909.00228v1.pdf	EMNLP2019

1.1.2. 远监督方法

论文题目	抽取任务	关键词	论文链接	会议及年份	code
Hybrid Attention-based Prototypical Networks for Noisy Few-Shot Relation Classification	关系分类	远监督；噪声；混合注意力圆形网络	https://gaotianyu1350.github.io/assets/aaai2019_hatt_paper.pdf	AAAI2019	https://github.com/thunlp/HATT-Proto
A Hierarchical Framework for Relation Extraction with Reinforcement Learning	关系提取	增强关系类型系和实体交互；分层强化学习（HRL）框架；远监督数据集	https://arxiv.org/pdf/1811.03925.pdf	AAAI2019
Cross-relation Cross-bag Attention for Distantly-supervised Relation Extraction	关系提取	远监督抗噪；Cross-relation Cross-bag Selective Attention；多实例学习；句子级别；注意力机制；关注高质量实体对	https://arxiv.org/pdf/1812.10604.pdf	AAAI2019
Structured Minimally Supervised Learning for Neural Relation Extraction	关系提取	最小监督；学习的表示形式；结构化学习	https://arxiv.org/pdf/1904.00118.pdf	NAACL2019
Combining Distant and Direct Supervision for Neural Relation Extraction	关系提取	降噪；监督学习+远监督模型	https://arxiv.org/pdf/1810.12956.pdf	NAACL2019	https://github.com/allenai/comb_dist_direct_relex/
Distant Supervision Relation Extraction with Intra-Bag and Inter-Bag Attentions	关系提取	句子级别的Attention；	https://www.aclweb.org/anthology/N19-1288.pdf	NAACL2019
GAN Driven Semi-distant Supervision for Relation Extraction	关系提取	半远监督；生成对抗网络（GAN）	https://www.aclweb.org/anthology/N19-1307	NAACL 2019
Improving Distantly-Supervised Relation Extraction with Joint Label Embedding	关系提取	多层注意力模型；联合标签嵌入	https://www.aclweb.org/anthology/D19-1395.pdf	NAACL 2019
Self-Attention Enhanced CNNs and Collaborative Curriculum Learning for Distantly Supervised Relation Extraction	关系提取	协作式学习；卷积神经网（CNN)；卷积运算内部自注意机制	https://www.aclweb.org/anthology/D19-1037.pdf	NAACL 2019

雨落辟湖 / IE-Survey

关系抽取调研——学术界（更新中。。。

目录

1.1. 任务定义

1.2. 常见数据集

1.3. 评测标准

1.4. SOTA

6. Paper List

6.1. 论文列表

6.1.1. 监督类方法

6.1.1.1. 利用语法信息的方法

1.1.1.2. 不利用语法信息的方法

1.1.2. 远监督方法

简介

发行版

贡献者

近期动态

雨落辟湖 / IE-Survey .gitee-modal { width: 500px !important; }

关系抽取调研——学术界（更新中。。。

目录

1.1. 任务定义

1.2. 常见数据集

1.3. 评测标准

1.4. SOTA

6. Paper List

6.1. 论文列表

6.1.1. 监督类方法

6.1.1.1. 利用语法信息的方法

1.1.1.2. 不利用语法信息的方法

1.1.2. 远监督方法

简介

发行版

贡献者

近期动态

搜索帮助

雨落辟湖 / IE-Survey