TY - JOUR
T1 - GFN
T2 - A novel joint entity and relation extraction model with redundancy and denoising strategies
AU - Sun, Xin
AU - Guo, Qiyi
AU - Ge, Shi Qi
N1 - Publisher Copyright: © 2024 Elsevier B.V.
PY - 2024/9/27
Y1 - 2024/9/27
N2 - Joint entity and relation extraction, the task of extracting entities and the relations between them from a given sentence, has gained increasing attention in recent years. Some joint extraction models use a shared encoder to model the interactions between the named entity recognition (NER) and relation extraction (RE) subtasks. Despite achieving decent performance, they inevitably suffer from error propagation. One-step exhaustive methods mitigate error propagation to some extent, but they suffer from high computational complexity and a proliferation of negative samples. To address these problems, we propose the Greedy Filter Network (GFN), which combines Greedy-NER and Filter-RE. GFN employs Greedy-NER with a redundancy strategy that prioritizes recall, thereby reducing error propagation between subtasks. To reduce computational complexity, we design an innovative approach for representing and storing spans in Greedy-NER. In Filter-RE, we traverse all pairwise combinations of candidate entities. To address the abundance of negative samples, we design a denoising strategy with two filters that remove entity pairs without relations, eliminating noise and alleviating negative-sample proliferation. Finally, to enable flexible control over the redundancy strategy, we design two misclassification penalty parameters for each module. Experimental results indicate that GFN achieves state-of-the-art F1 scores on the CoNLL04 and NYT datasets, with a notable 2.0% improvement on CoNLL04.
KW - Error propagation
KW - Information extraction
KW - Joint entity and relation extraction
KW - Overlapping relations
UR - http://www.scopus.com/inward/record.url?scp=85197072026&partnerID=8YFLogxK
U2 - 10.1016/j.knosys.2024.112137
DO - 10.1016/j.knosys.2024.112137
M3 - Article
AN - SCOPUS:85197072026
SN - 0950-7051
VL - 300
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
M1 - 112137
ER -