TY - GEN
T1 - MG-CTG
T2 - 29th International Conference on Database Systems for Advanced Applications, DASFAA 2024
AU - Gu, Xiao
AU - Luo, Zhaojing
AU - Zhang, Meihui
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.
PY - 2024
Y1 - 2024
N2 - Managing text data is crucial given the abundance of unstructured textual data in real-world applications. Text generation not only assists in managing massive amounts of text through tasks such as summarization and report generation but can also generate the content needed to enrich a textual database. However, the generated text is often open-ended and may not meet specific target requirements, which fall into three categories: semantic, structural, and lexical. Fine-tuning pre-trained language models can meet each specific control requirement, but there is no simultaneous integration of controls from all three categories. On the other hand, post-processing methods are limited to semantic control or lexical control only. In this paper, we propose MG-CTG, a Multi-Granularity Controllable Text Generation framework to generate text satisfying controls across multiple granularities. Specifically, we design distinct controllers that employ different strategies based on post-processing methods to achieve control. Further, our proposed framework is able to attain fine-grained control at the structural granularity, as well as enhance the incorporation of keywords into the generated text via a designed keyword-guided weighted decoding method. We conduct experiments by combining control information from different granularities and evaluate the results on standard benchmark datasets for controllable text generation. The experimental results demonstrate that our method outperforms other post-processing methods on two real-world datasets.
AB - Managing text data is crucial given the abundance of unstructured textual data in real-world applications. Text generation not only assists in managing massive amounts of text through tasks such as summarization and report generation but can also generate the content needed to enrich a textual database. However, the generated text is often open-ended and may not meet specific target requirements, which fall into three categories: semantic, structural, and lexical. Fine-tuning pre-trained language models can meet each specific control requirement, but there is no simultaneous integration of controls from all three categories. On the other hand, post-processing methods are limited to semantic control or lexical control only. In this paper, we propose MG-CTG, a Multi-Granularity Controllable Text Generation framework to generate text satisfying controls across multiple granularities. Specifically, we design distinct controllers that employ different strategies based on post-processing methods to achieve control. Further, our proposed framework is able to attain fine-grained control at the structural granularity, as well as enhance the incorporation of keywords into the generated text via a designed keyword-guided weighted decoding method. We conduct experiments by combining control information from different granularities and evaluate the results on standard benchmark datasets for controllable text generation. The experimental results demonstrate that our method outperforms other post-processing methods on two real-world datasets.
KW - Controllable text generation
KW - Multiple granularities
KW - Natural language processing
UR - http://www.scopus.com/inward/record.url?scp=85213339815&partnerID=8YFLogxK
U2 - 10.1007/978-981-97-5569-1_9
DO - 10.1007/978-981-97-5569-1_9
M3 - Conference contribution
AN - SCOPUS:85213339815
SN - 9789819755684
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 138
EP - 154
BT - Database Systems for Advanced Applications - 29th International Conference, DASFAA 2024, Proceedings
A2 - Onizuka, Makoto
A2 - Xiao, Chuan
A2 - Lee, Jae-Gil
A2 - Tong, Yongxin
A2 - Ishikawa, Yoshiharu
A2 - Lu, Kejing
A2 - Amer-Yahia, Sihem
A2 - Jagadish, H.V.
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 2 July 2024 through 5 July 2024
ER -