NRE

Neural Relation Extraction, including CNN, PCNN, CNN+ATT, PCNN+ATT

  • 所有者: google-deepmind/mathematics_dataset
  • 平台:
  • 许可证: Apache License 2.0
  • 分类:
  • 主题:
  • 喜欢:
    0
      比较:

Github星跟踪图

New Code in Tensorflow is available at https://github.com/thunlp/OpenNRE!

Neural Relation Extraction (NRE)

Neural relation extraction aims to extract relations from plain text with neural models, which has been the state-of-the-art methods for relation extraction. In this project, we provide our implementations of CNN [Zeng et al., 2014] and PCNN [Zeng et al.,2015] and their extended version with sentence-level attention scheme [Lin et al., 2016] .

Evaluation Results

Precion/recall curves of CNN, CNN+ONE, CNN+AVE, CNN+ATT

image

Precion/recall curves of PCNN, PCNN+ONE, PCNN+AVE, PCNN+ATT

image

Data

We provide NYT10 dataset we used for the task relation extraction in data/ directory. We preprocess the original data to make it satisfy the input format of our codes. The original data of NYT10 can be downloaded from:

Relation Extraction: NYT10 is originally released by the paper "Sebastian Riedel, Limin Yao, and Andrew McCallum. Modeling relations and their mentions without labeled text." ( http://iesl.cs.umass.edu/riedel/ecml/)

Pre-Trained Word Vectors are learned from New York Times Annotated Corpus (LDC Data LDC2008T19), which should be obtained from LDC (https://catalog.ldc.upenn.edu/LDC2008T19).

Our train set is generated by merging all training data of manual and held-out datasets, deleted those data that have overlap with the test set, and used the remain one as our training data.

To run our code, the dataset should be put in the folder data/ using the following format, containing six files

  • train.txt: training file, format (fb_mid_e1, fb_mid_e2, e1_name, e2_name, relation, sentence).

  • test.txt: test file, same format as train.txt.

  • entity2id.txt: all entities and corresponding ids, one per line.

  • relation2id.txt: all relations and corresponding ids, one per line.

  • vec.bin: the pre-train word embedding file

Codes

The source codes of various methods are put in the folders CNN+ONE/, CNN+ATT/, PCNN+ONE/, PCNN+ATT/.

Compile

Just type "make" in the corresponding folders.

Train

For training, you need to type the following command in each model folder:

./train

The training model file will be saved in folder out/ .

Test

For testing, you need to type the following command in each model folder:

./test

The testing result which reports the precision/recall curve will be shown in pr.txt.

Cite

If you use the code, please cite the following paper:

[Lin et al., 2016] Yankai Lin, Shiqi Shen, Zhiyuan Liu, Huanbo Luan, and Maosong Sun. Neural Relation Extraction with Selective Attention over Instances. In Proceedings of ACL.(http://thunlp.org/~lyk/publications/acl2016_nre.pdf)

Reference

[Zeng et al., 2014] Daojian Zeng, Kang Liu, Siwei Lai, Guangyou Zhou, and Jun Zhao. Relation classification via convolutional deep neural network. In Proceedings of COLING.

[Zeng et al.,2015] Daojian Zeng,Kang Liu,Yubo Chen,and Jun Zhao. Distant supervision for relation extraction via piecewise convolutional neural networks. In Proceedings of EMNLP.

主要指标

概览
名称与所有者google-deepmind/mathematics_dataset
主编程语言Python
编程语言C++ (语言数: 1)
平台
许可证Apache License 2.0
所有者活动
创建于2019-03-27 10:23:40
推送于2024-12-23 14:21:10
最后一次提交2024-12-23 14:21:10
发布数0
用户参与
星数1.9k
关注者数64
派生数257
提交数16
已启用问题?
问题数13
打开的问题数2
拉请求数5
打开的拉请求数0
关闭的拉请求数2
项目设置
已启用Wiki?
已存档?
是复刻?
已锁定?
是镜像?
是私有?