Who is the Writer? Identifying the Generative Model by Writing Style

Jiawen Yan, Baohua Zhang, Wenyao Cui, Huaping Zhang*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In the current digital landscape, distinguishing between human-written texts and those generated by large language models (LLMs) is essential for information security and the prevention of academic fraud. Texts generated by LLMs often closely resemble high-quality human-written content, which makes accurate identification difficult. To tackle this, we introduce the Identify the Writer by Writing Style (IWWS) model, which integrates perplexity scores with text embeddings through feature fusion. The approach employs a similarity matrix and contrastive learning to improve the model’s ability to detect unique writing styles. Additionally, we present the HumanGenTextify dataset, which reflects real-world text generation scenarios and serves as a foundation for distinguishing between human and model-generated texts. Experimental results show that the IWWS model outperforms existing methods, achieving high accuracy in text source detection and offering insights into distinctive writing styles. Our research also paves the way for future advances in automated detection of LLM-generated text and in authenticity verification.
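As an illustration only, the three ingredients the abstract names (feature fusion of perplexity with embeddings, a similarity matrix, and contrastive learning) can be sketched in a few lines of plain Python. This is not the authors' IWWS implementation: the function names `fuse`, `similarity_matrix`, and `contrastive_loss` are hypothetical, fusion by concatenating a log-perplexity value is an assumption, the loss is a generic supervised contrastive loss, and the toy numbers are invented for the example.

```python
import math

def fuse(embedding, perplexity):
    """Assumed fusion: append log-perplexity to the embedding vector."""
    return list(embedding) + [math.log(perplexity)]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similarity_matrix(features):
    """Pairwise cosine similarities between all fused feature vectors."""
    return [[cosine(u, v) for v in features] for u in features]

def contrastive_loss(sim, labels, temperature=0.1):
    """Generic supervised contrastive loss over a similarity matrix.

    For each anchor i, every other text j with the same source label is a
    positive; the loss is low when positives are more similar to the anchor
    than texts from other sources.
    """
    n = len(labels)
    total, count = 0.0, 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        denom = sum(math.exp(sim[i][j] / temperature) for j in others)
        for j in others:
            if labels[j] == labels[i]:
                total -= math.log(math.exp(sim[i][j] / temperature) / denom)
                count += 1
    return total / count if count else 0.0

# Toy data (invented): two "human" texts and two "model" texts.
embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
perplexities = [12.0, 11.0, 35.0, 40.0]
labels = ["human", "human", "model", "model"]

features = [fuse(e, p) for e, p in zip(embeddings, perplexities)]
sim = similarity_matrix(features)
loss = contrastive_loss(sim, labels)
```

In a trained system the embeddings would come from an encoder and the loss would drive gradient updates; here the loss is only evaluated, to show that it is smaller when texts from the same source sit close together in the fused feature space.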

Original language: English
Title of host publication: Neural Information Processing - 31st International Conference, ICONIP 2024, Proceedings
Editors: Mufti Mahmud, Maryam Doborjeh, Kevin Wong, Andrew Chi Sing Leung, Zohreh Doborjeh, M. Tanveer
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 319-331
Number of pages: 13
ISBN (Print): 9789819665907
Publication status: Published - 2025
Event: 31st International Conference on Neural Information Processing, ICONIP 2024 - Auckland, New Zealand
Duration: 2 Dec 2024 to 6 Dec 2024

Publication series

Name: Lecture Notes in Computer Science
Volume: 15291 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 31st International Conference on Neural Information Processing, ICONIP 2024
Country/Territory: New Zealand
City: Auckland
Period: 2/12/24 to 6/12/24

Keywords

  • Contrastive Learning
  • Feature Fusion
  • Large Language Model
  • Text Detection
  • Text Generation
