Abstract
In distantly supervised relation extraction, insufficient representation of textual semantics and inadequate information transmission limit a model's ability to recognize noise and to learn long-tail relations. To address these problems, this paper proposes a two-stage framework that integrates a pre-trained language model (BERT) into multi-instance learning. First, the pre-trained language model is used to learn text semantics so that noisy instances can be identified and mitigated. Then, a dual-modal encoder is designed within the framework to automatically learn the propagation patterns between entity types and relations, tackling the long-tail problem. Experimental results on two widely used datasets, NYT-10 and GDS, demonstrate that the proposed method achieves significant improvements in both noise reduction and long-tail relation extraction.
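The abstract does not detail the framework's internals, so the sketch below is only a generic illustration of the first stage it describes: bag-level multi-instance relation extraction with a BERT sentence encoder and relation-query attention over the sentences in a bag, a common way to down-weight noisy instances. The class name, attention scheme, dimensions, and checkpoint (`bert-base-chinese`) are assumptions for illustration, not the authors' architecture.

```python
# Hypothetical sketch: bag-level multi-instance RE with a BERT encoder and
# selective attention over instances (an assumption; not the paper's exact model).
import torch
import torch.nn as nn
from transformers import BertModel


class BagLevelRE(nn.Module):
    def __init__(self, num_relations: int, bert_name: str = "bert-base-chinese"):
        super().__init__()
        self.encoder = BertModel.from_pretrained(bert_name)
        hidden = self.encoder.config.hidden_size
        # One query vector per relation, used to attend over the sentences in a bag.
        self.relation_queries = nn.Parameter(torch.randn(num_relations, hidden))
        self.classifier = nn.Linear(hidden, num_relations)

    def forward(self, input_ids, attention_mask):
        # input_ids: (bag_size, seq_len) -- all sentences sharing one entity pair.
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        sent_reps = out.last_hidden_state[:, 0]            # (bag_size, hidden), [CLS]
        # Score each sentence against each relation query and normalize over the
        # bag, so likely-noisy sentences receive low attention weight.
        scores = sent_reps @ self.relation_queries.t()     # (bag_size, num_relations)
        weights = torch.softmax(scores, dim=0)
        bag_reps = weights.t() @ sent_reps                 # (num_relations, hidden)
        logits = self.classifier(bag_reps)                 # (num_relations, num_relations)
        return logits.diagonal()                           # one score per relation
```

The second stage described in the abstract (a dual-modal encoder over entity types and relations for long-tail relations) is not specified in enough detail here to sketch.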
| Translated title of the contribution | Distantly Supervised Relation Extraction Based on Pre-trained Language Models and Dual-Modal Encoders |
| --- | --- |
| Original language | Chinese (Traditional) |
| Pages (from-to) | 308-320 |
| Number of pages | 13 |
| Journal | Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology |
| Volume | 45 |
| Issue number | 3 |
| DOIs | |
| Publication status | Published - Mar 2025 |