4.2. 论文解读
自动识别句子中实体之间具有的某种语义关系。根据参与实体的多少可以分为二元关系抽取(两个实体)和多元关系抽取(三个及以上实体)。
通过关注两个实体间的语义关系,可以得到(arg1, relation, arg2)三元组,其中arg1和arg2表示两个实体,relation表示实体间的语义关系。
根据处理数据源的不同,关系抽取可以分为以下三种:
根据抽取文本的范围不同,关系抽取可以分为以下两种:
根据所抽取领域的划分,关系抽取又可以分为以下两种:
限定域关系抽取方法:
ACE 2005
数据集简介:ACE2005语料库是语言数据联盟(LDC)发布的由实体,关系和事件注释组成的各种类型的数据,包括英语,阿拉伯语和中文培训数据,目标是开发自动内容提取技术,支持以文本形式自动处理人类语言。ACE语料解决了五个子任务的识别:entities、values、temporal expressions、relations and events。这些任务要求系统处理文档中的语言数据,然后为每个文档输出有关其中提到或讨论的实体,值,时间表达式,关系和事件的信息。
获取方式:数据集收费,需在LDC联盟的官网上注册再购买,LDC账号注册地址 ACE 2005 下载地址
TACRED
数据集简介:TACRED(TAC Relation Extraction Dataset)是一个拥有106264条实例的大规模关系抽取数据集,这些数据来自于每年的TAC KBP(TAC Knowledge Base Population)比赛使用的语料库中的新闻专线和网络文本。包含了41关系类型,此外若句子无定义关系,被标注成no_relation类型。数据集的详细介绍可以访问TACRED文档
获取方式:数据集收费,需在LDC联盟官网注册会员再购买 LDC账号注册地址 TACRED 下载地址
SemEval2010_task8
数据集简介:对于给定了的句子和两个做了标注的名词,从给定的关系清单中选出最合适的关系。数据集一共9种关系类别数,此外包含一类Other关系,含有6674实例数量。
获取方式: 原始数据
FewRel
数据集简介:FewRel是目前最大规模的精标注关系抽取数据集,由孙茂松教授领导的清华大学自然语言处理实验室发布。一共100种关系类别数,含有70000实例数量。
获取方式:FewRel 网站地址 论文地址
NYT10
NYT-10数据集文本来源于纽约时报,命名实体是通过 Stanford NER 工具并结合 Freebase 知识库进行标注的。实体对之间的关系是链接Freebase知识库中的关系,结合远监督方法所得到。该数据集共含有53种关系类型,包括特殊关系类型NA,即头尾实体无关系。
获取方式:原始数据
获取更多关系抽取数据集,可访问此处Annotated-Semantic-Relationships-Datasets
二分类:
Accuracy = (预测正确的样本数)/(总样本数)=(TP+TN)/(TP+TN+FP+FN)
Precision = (预测为正例且正确预测的样本数)/(所有预测为正例的样本数) = TP/(TP+FP)
Recall = (预测为正例且正确预测的样本数)/(所有真实情况为正例的样本数) = TP/(TP+FN)
F1 = 2 * (Precision * Recall) / (Precision + Recall )
多分类:
Macro Average
多类别(N类) F1/P/R的计算,即计算N个类别的F1/P/R,每次计算以当前类别为正例,其他所有类别为负例,最终将各类别结果求和并除以类别数取平均。
Micro Average
统计当前类别的TP、TN、FP、FN数量,再将该四类样本数各自求和作为新的TP、TN、FP、FN,计算F1/P/R公式同二分类。
P@N(最高置信度预测精度):
通常在远监督关系抽取中使用到,由于知识库所含关系实例的不完善,会出现高置信度包含关系实例的实体对被叛为负例,从而低估了系统正确率。此时可以采用人工评价,将预测结果中知识库已包含的三元组移除,然后人工判断抽取关系实例是否正确,按照top N的准确率对抽取效果进行评价。
Relation Extraction on TACRED:
模型 | average F1 | 论文题目 | 年份 | 论文链接 | code |
---|---|---|---|---|---|
BERTEM+MTB | 71.5 | Matching the Blanks: Distributional Similarity for Relation Learning | 2019 | https://arxiv.org/pdf/1906.03158v1.pdf | https://github.com/plkmo/BERT-Relation-Extraction |
KnowBert-W+W | 71.5 | Knowledge Enhanced Contextual Word Representations | 2019 | https://arxiv.org/pdf/1909.04164v2.pdf | |
DG-SpanBERT | 71.5 | Efficient long-distance relation extraction with DG-SpanBERT | 2020 | https://arxiv.org/pdf/2004.03636v1.pdf | |
SpanBERT | 70.8 | SpanBERT: Improving Pre-training by Representing and Predicting Spans | 2019 | https://arxiv.org/pdf/1907.10529v3.pdf | https://github.com/facebookresearch/SpanBERT |
R-BERT | 69.4 | Enriching Pre-trained Language Model with Entity Information for Relation Classification | 2020 | https://arxiv.org/pdf/1905.08284v1.pdf | https://github.com/wang-h/bert-relation-classification |
C-GCN + PA-LSTM | 68.2 | Graph Convolution over Pruned Dependency Trees Improves Relation Extraction | 2018 | https://arxiv.org/pdf/1809.10185v1.pdf | https://github.com/qipeng/gcn-over-pruned-trees |
Relation Extraction on SemEval-2010 Task 8:
模型 | average F1 | 论文题目 | 年份 | 论文链接 | code |
---|---|---|---|---|---|
Skeleton-Aware BERT | 90.36 | Enhancing Relation Extraction Using Syntactic Indicators and Sentential Contexts | 2019 | https://arxiv.org/pdf/1912.01858v1.pdf | https://github.com/wang-h/bert-relation-classification |
EPGNN | 90.2 | Improving Relation Classification by Entity Pair Graph | 2019 | http://proceedings.mlr.press/v101/zhao19a/zhao19a.pdf | |
BERTEM+MTB | 89.5 | Matching the Blanks: Distributional Similarity for Relation Learning | 2019 | https://arxiv.org/pdf/1906.03158v1.pdf | https://github.com/plkmo/BERT-Relation-Extraction |
R-BERT | 89.25 | Enriching Pre-trained Language Model with Entity Information for Relation Classification | 2020 | https://arxiv.org/pdf/1905.08284v1.pdf | https://github.com/wang-h/bert-relation-classification |
KnowBert-W+W | 89.1 | Knowledge Enhanced Contextual Word Representations | 2019 | https://arxiv.org/pdf/1909.04164v2.pdf | |
Entity-Aware BERT | 89 | Extracting Multiple-Relations in One-Pass with Pre-Trained Transformers | 2019 | https://arxiv.org/pdf/1902.01030v2.pdf | https://github.com/helloeve/mre-in-one-pass |
Relation Extraction on ACE 2005:
模型 | RELATION F1 | ENTITY F1 | SENTENCE ENCODER | 论文题目 | 年份 | 论文链接 | code |
---|---|---|---|---|---|---|---|
MRC4ERE++ | 62.1 | 85.5 | BERT base | Asking Effective and Diverse Questions: A Machine Reading Comprehension based Framework for Joint Entity-Relation Extraction | 2020 | https://www.ijcai.org/Proceedings/2020/0546.pdf | https://github.com/TanyaZhao/MRC4ERE |
Multi-turn QA | 60.2 | 84.8 | BERT base | Entity-Relation Extraction as Multi-Turn Question Answering | 2019 | https://arxiv.org/pdf/1905.05529v4.pdf | |
MRT | 59.6 | 83.6 | biLSTM | Extracting Entities and Relations with Joint Minimum Risk Training | 2018 | https://www.aclweb.org/anthology/D18-1249 | |
GCN | 59.1 | 84.2 | biLSTM | Joint Type Inference on Entities and Relations via Graph Convolutional Networks | 2019 | https://www.aclweb.org/anthology/P19-1131 | |
Global | 57.5 | 83.6 | biLSTM | End-to-End Neural Relation Extraction with Global Optimization | 2017 | https://www.aclweb.org/anthology/D17-1182 | |
SPTree | 55.6 | 83.4 | biLSTM | End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures | 2016 | https://arxiv.org/pdf/1601.00770v3.pdf | https://github.com/tticoin/LSTM-ER |
Relation Extraction on ACE 2004:
模型 | RELATION F1 | ENTITY F1 | 论文题目 | 年份 | 论文链接 | code |
---|---|---|---|---|---|---|
DYGIE | 59.7 | 87.4 | A General Framework for Information Extraction using Dynamic Span Graphs | 2019 | https://arxiv.org/pdf/1904.03296v1.pdf | https://github.com/luanyi/DyGIE |
Multi-turn QA | 49.4 | 83.6 | Entity-Relation Extraction as Multi-Turn Question Answering | 2019 | https://arxiv.org/pdf/1905.05529v4.pdf | |
SPTree | 48.4 | 81.8 | End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures | 2016 | https://arxiv.org/pdf/1601.00770v3.pdf | https://github.com/tticoin/LSTM-ER |
multi-head + AT | 47.45 | 81.64 | Adversarial training for multi-context joint entity and relation extraction | 2018 | https://arxiv.org/pdf/1808.06876v3.pdf | https://github.com/bekou/multihead_joint_entity_relation_extraction |
multi-head | 47.14 | 81.16 | Joint entity recognition and relation extraction as a multi-head selection problem | 2018 | https://arxiv.org/pdf/1804.07847v3.pdf | https://github.com/bekou/multihead_joint_entity_relation_extraction |
Attention | 45.7 | 79.6 | Going out on a limb: Joint Extraction of Entity Mentions and Relations without Dependency Trees | 2017 | https://www.aclweb.org/anthology/P17-1085 |
Relation Extraction on NYT:
模型 | average F1 | 论文题目 | 年份 | 论文链接 | code |
---|---|---|---|---|---|
REDN | 89.8 | Downstream Model Design of Pre-trained Language Model for Relation Extraction Task | 2020 | https://arxiv.org/pdf/2004.03786v1.pdf | https://github.com/slczgwh/REDN |
CASREL | 89.6 | A Novel Cascade Binary Tagging Framework | 2019 | https://arxiv.org/pdf/1909.03227v4.pdf | https://github.com/weizhepei/CasRel |
HBT | 89.5 | A Novel Cascade Binary Tagging Framework for Relational Triple Extraction | 2019 | https://arxiv.org/pdf/1909.03227v4.pdf | https://github.com/weizhepei/CasRel |
WDec | 84.4 | Effective Modeling of Encoder-Decoder Architecture for Joint Entity and Relation Extraction | 2019 | https://arxiv.org/pdf/1911.09886v1.pdf | https://github.com/nusnlp/PtrNetDecoding4JERE |
ETL-Span | 78.0 | Joint Extraction of Entities and Relations Based on a Novel Decomposition Strategy | 2019 | https://arxiv.org/pdf/1909.04273v3.pdf | https://github.com/yubowen-ph/JointER |
CopyRE' OneDecoder | 72.2 | CopyMTL: Copy Mechanism for Joint Extraction of Entities and Relations with Multi-Task Learning | 2019 | https://arxiv.org/pdf/1911.10438v1.pdf | https://github.com/WindChimeRan/CopyMTL |
Relation Extraction on CoNLL04:
模型 | RELATION F1 | ENTITY F1 | 论文题目 | 年份 | 论文链接 | code |
---|---|---|---|---|---|---|
SpERT | 71.47 | 88.94 | Span-based Joint Entity and Relation Extraction with Transformer Pre-training | 2019 | https://arxiv.org/pdf/1909.07755v3.pdf | https://github.com/markus-eberts/spert |
Multi-turn QA | 68.9 | 87.8 | Entity-Relation Extraction as Multi-Turn Question Answering | 2019 | https://arxiv.org/pdf/1905.05529v4.pdf | |
Global | 67.8 | 85.6 | End-to-End Neural Relation Extraction with Global Optimization | 2017 | https://www.aclweb.org/anthology/D17-1182 | |
Biaffine attention | 64.40 | 86.20 | End-to-end neural relation extraction using deep biaffine attention | 2018 | https://arxiv.org/pdf/1812.11275v1.pdf | https://github.com/datquocnguyen/jointRE |
Relation-Metric with AT | 62.29 | 84.15 | Neural Metric Learning for Fast End-to-End Relation Extraction | 2019 | https://arxiv.org/pdf/1905.07458v4.pdf | |
multi-head | 62.04 | 83.9 | Joint entity recognition and relation extraction as a multi-head selection problem | 2018 | https://arxiv.org/pdf/1804.07847v3.pdf | https://github.com/bekou/multihead_joint_entity_relation_extraction |
Relation Extraction on FewRel:
模型 | average F1 | 论文题目 | 年份 | 论文链接 | code |
---|---|---|---|---|---|
ERNIE | 88.32 | ERNIE: Enhanced Language Representation with Informative Entities | 2019 | https://arxiv.org/pdf/1905.07129v3.pdf | https://github.com/thunlp/ERNIE |
论文题目 | 抽取任务 | 关键词 | 论文链接 | 会议及年份 | code |
---|---|---|---|---|---|
Attention Guided Graph Convolutional Networks for Relation Extraction | 关系提取 | 注意力导向图卷积网络(AGGCN);语义依赖树;软修剪;自动学习子结构; | https://www.aclweb.org/anthology/P19-1024.pdf | ACL2019 | |
A Richer-but-Smarter Shortest Dependency Path with Attentive Augmentation for Relation Extraction | 关系提取 | 最短依赖路径SDP;注意力模型;深度神经模型;LSTM网络;CNN | https://www.aclweb.org/anthology/N19-1298 | NAACL 2019 | https://github.com/catcd/RbSP |
论文题目 | 抽取任务 | 关键词 | 论文链接 | 会议及年份 | code |
---|---|---|---|---|---|
Joint Type Inference on Entities and Relations via Graph Convolutional Networks | 抽取三元组的joint任务 | 实体关系联合推断;图卷积模型(GCN);二元关系分类 | https://pdfs.semanticscholar.org/7ce8/ce2768907421fb1a6cbfe13a8a36992721a7.pdf | ACL2019 | |
GraphRel: Modeling Text as Relational Graphs for Joint Entity and Relation Extraction | 抽取三元组的joint任务 | 端到端关系抽取;图卷积网络; | https://tsujuifu.github.io/pubs/acl19_graph-rel.pdf | ACL2019 | |
Exploiting Entity BIO Tag Embeddings and Multi-task Learning for Relation Extraction with Imbalanced Data | 关系抽取 | BIO字符/词嵌入;多任务体系结构;关系分类 | https://arxiv.org/pdf/1906.08931.pdf | ACL2019 | |
Entity-Relation Extraction as Multi-turn Question Answering | 关系抽取 | 多回合QA;上下文识别答案范围任务 | https://arxiv.org/pdf/1905.05529.pdf | ACL2019 | |
Graph Neural Networks with Generated Parameters for Relation | 关系抽取 | 图神经网络(GNN);多跳关系推理 | https://arxiv.org/pdf/1902.00756.pdf | ACL2019 | |
Kernelized Hashcode Representations for Biomedical Relation Extraction | 关系分类 | 核化的局部敏感哈希(KLSH);降低计算成本 | https://arxiv.org/pdf/1711.04044.pdf | ACL2019 | |
Connecting the Dots: Document-level Neural Relation Extraction with Edge-oriented Graphs | 关系抽取 | 图神经网络模型;文档级关系提取 | https://arxiv.org/pdf/1909.00228v1.pdf | EMNLP2019 |
论文题目 | 抽取任务 | 关键词 | 论文链接 | 会议及年份 | code |
---|---|---|---|---|---|
Hybrid Attention-based Prototypical Networks for Noisy Few-Shot Relation Classification | 关系分类 | 远监督;噪声;混合注意力圆形网络 | https://gaotianyu1350.github.io/assets/aaai2019_hatt_paper.pdf | AAAI2019 | https://github.com/thunlp/HATT-Proto |
A Hierarchical Framework for Relation Extraction with Reinforcement Learning | 关系提取 | 增强关系类型系和实体交互;分层强化学习(HRL)框架;远监督数据集 | https://arxiv.org/pdf/1811.03925.pdf | AAAI2019 | |
Cross-relation Cross-bag Attention for Distantly-supervised Relation Extraction | 关系提取 | 远监督抗噪;Cross-relation Cross-bag Selective Attention;多实例学习;句子级别;注意力机制;关注高质量实体对 | https://arxiv.org/pdf/1812.10604.pdf | AAAI2019 | |
Structured Minimally Supervised Learning for Neural Relation Extraction | 关系提取 | 最小监督;学习的表示形式;结构化学习 | https://arxiv.org/pdf/1904.00118.pdf | NAACL2019 | |
Combining Distant and Direct Supervision for Neural Relation Extraction | 关系提取 | 降噪;监督学习+远监督模型 | https://arxiv.org/pdf/1810.12956.pdf | NAACL2019 | https://github.com/allenai/comb_dist_direct_relex/ |
Distant Supervision Relation Extraction with Intra-Bag and Inter-Bag Attentions | 关系提取 | 句子级别的Attention; | https://www.aclweb.org/anthology/N19-1288.pdf | NAACL2019 | |
GAN Driven Semi-distant Supervision for Relation Extraction | 关系提取 | 半远监督;生成对抗网络(GAN) | https://www.aclweb.org/anthology/N19-1307 | NAACL 2019 | |
Improving Distantly-Supervised Relation Extraction with Joint Label Embedding | 关系提取 | 多层注意力模型;联合标签嵌入 | https://www.aclweb.org/anthology/D19-1395.pdf | NAACL 2019 | |
Self-Attention Enhanced CNNs and Collaborative Curriculum Learning for Distantly Supervised Relation Extraction | 关系提取 | 协作式学习;卷积神经网(CNN);卷积运算内部自注意机制 | https://www.aclweb.org/anthology/D19-1037.pdf | NAACL 2019 |
此处可能存在不合适展示的内容,页面不予展示。您可通过相关编辑功能自查并修改。
如您确认内容无涉及 不当用语 / 纯广告导流 / 暴力 / 低俗色情 / 侵权 / 盗版 / 虚假 / 无价值内容或违法国家有关法律法规的内容,可点击提交进行申诉,我们将尽快为您处理。