TY - GEN
T1 - Evaluating AI-Generated Questionnaires Using LDA Topic Modeling and KMeans Clustering
T2 - 2nd International Conference on Digital Systems and Design Innovation, ICDSDI 2025
AU - Cheng, Menghan
AU - Lu, Zhaolin
N1 - Publisher Copyright:
© 2025 Copyright held by the owner/author(s).
PY - 2025/9/10
Y1 - 2025/9/10
N2 - The rapid development of large language models (LLMs) has opened new opportunities for automated questionnaire design in academic research. However, questions remain about the validity and quality of AI-generated instruments, especially when applied to theory-driven frameworks. This study assesses the performance of AI-generated and human-designed questionnaires under single and integrated theoretical models. To this end, we employ a hybrid approach that combines expert evaluation with unsupervised machine learning: Latent Dirichlet Allocation (LDA) topic modeling to assess semantic coverage, and KMeans clustering to detect redundancy and assess semantic consistency. We created four questionnaires: two based on validated, manually written instruments and two generated by GPT-4, covering the Unified Theory of Acceptance and Use of Technology (UTAUT) and its extended models. In total, 310 assessments rated questionnaire quality on multiple dimensions, and these judgments were validated against the machine learning outputs. Results show that AI-generated questionnaires are fluent and objective but perform poorly in accuracy, clarity, and comprehensiveness, especially under complex modeling conditions. Redundancy and semantic drift increased with theory integration. However, the AI performed well in areas requiring standardization and neutrality.
AB - The rapid development of large language models (LLMs) has opened new opportunities for automated questionnaire design in academic research. However, questions remain about the validity and quality of AI-generated instruments, especially when applied to theory-driven frameworks. This study assesses the performance of AI-generated and human-designed questionnaires under single and integrated theoretical models. To this end, we employ a hybrid approach that combines expert evaluation with unsupervised machine learning: Latent Dirichlet Allocation (LDA) topic modeling to assess semantic coverage, and KMeans clustering to detect redundancy and assess semantic consistency. We created four questionnaires: two based on validated, manually written instruments and two generated by GPT-4, covering the Unified Theory of Acceptance and Use of Technology (UTAUT) and its extended models. In total, 310 assessments rated questionnaire quality on multiple dimensions, and these judgments were validated against the machine learning outputs. Results show that AI-generated questionnaires are fluent and objective but perform poorly in accuracy, clarity, and comprehensiveness, especially under complex modeling conditions. Redundancy and semantic drift increased with theory integration. However, the AI performed well in areas requiring standardization and neutrality.
KW - KMeans Clustering
KW - Large Language Models
KW - Latent Dirichlet Allocation
KW - Questionnaire Design
UR - https://www.scopus.com/pages/publications/105019949162
U2 - 10.1145/3759275.3759297
DO - 10.1145/3759275.3759297
M3 - Conference contribution
AN - SCOPUS:105019949162
T3 - Proceedings of 2025 2nd International Conference on Digital Systems and Design Innovation, ICDSDI 2025
SP - 150
EP - 155
BT - Proceedings of 2025 2nd International Conference on Digital Systems and Design Innovation, ICDSDI 2025
PB - Association for Computing Machinery, Inc
Y2 - 13 June 2025 through 15 June 2025
ER -