Abstract
In distantly supervised relation extraction, insufficient semantic representation of text and inadequate information transmission limit a model's ability to recognize noise and to learn long-tail relations. To address these problems, this paper proposes a two-stage framework that integrates a pre-trained model (BERT) into multi-instance learning. First, a pre-trained language model is used to learn text semantics and thereby identify and mitigate noise. Then, a dual-modal encoder within the framework automatically learns the propagation patterns of entity types and relations, tackling the long-tail problem. Experimental results on two widely used datasets, NYT-10 and GDS, show that the proposed method achieves significant improvements in both noise reduction and long-tail relation extraction.
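The noise-mitigation stage described above follows the standard multi-instance learning setup, in which all sentences mentioning an entity pair form a "bag" and likely-noisy sentences are down-weighted. A minimal sketch of one common realization, bag-level selective attention over sentence embeddings, is shown below; the toy vectors stand in for BERT sentence encodings, and the function names and the relation query vector are illustrative assumptions, not the paper's exact method.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of raw scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    # Plain dot product between two equal-length vectors.
    return sum(a * b for a, b in zip(u, v))

def bag_attention(sentence_vecs, relation_query):
    """Selective attention over one bag of sentence embeddings.

    Sentences whose embedding aligns with the relation query receive
    higher weight; likely-noisy sentences are down-weighted rather
    than discarded, so the bag representation stays robust to noise.
    """
    weights = softmax([dot(v, relation_query) for v in sentence_vecs])
    dim = len(sentence_vecs[0])
    bag_vec = [sum(w * v[i] for w, v in zip(weights, sentence_vecs))
               for i in range(dim)]
    return bag_vec, weights

# Toy bag: two sentences support the relation, one is noise.
bag = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.2]]
query = [1.0, 0.0]  # illustrative relation query vector
bag_vec, weights = bag_attention(bag, query)
```

In a full system the sentence vectors would come from a BERT encoder and the attention weights would be trained jointly with the relation classifier; the sketch only illustrates how attention suppresses the noisy third sentence.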
| Translated title of the contribution | Distantly Supervised Relation Extraction Based on Pre-trained Language Models and Dual-Modal Encoders |
|---|---|
| Original language | Chinese (Traditional) |
| Pages (from-to) | 308-320 |
| Number of pages | 13 |
| Journal | Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology |
| Volume | 45 |
| Issue number | 3 |
| DOIs | |
| Publication status | Published - Mar 2025 |