TY - GEN
T1 - Prompt-based and Fine-tuned GPT Models for Context-Dependent and -Independent Deductive Coding in Social Annotation
AU - Hou, Chenyu
AU - Zhu, Gaoxia
AU - Zheng, Juan
AU - Zhang, Lishan
AU - Huang, Xiaoshan
AU - Zhong, Tianlong
AU - Li, Shan
AU - Du, Hanxiang
AU - Ker, Chin Lee
N1 - Publisher Copyright:
© 2024 Owner/Author.
PY - 2024/3/18
Y1 - 2024/3/18
AB - GPT has demonstrated impressive capabilities in executing various natural language processing (NLP) and reasoning tasks, showcasing its potential for deductive coding in social annotation. This research explored the effectiveness of prompt engineering and fine-tuning approaches with GPT for deductive coding of context-dependent and context-independent dimensions. Coding context-dependent dimensions (i.e., Theorizing, Integration, Reflection) requires a contextualized understanding that connects the target comment with reading materials and previous comments, whereas coding context-independent dimensions (i.e., Appraisal, Questioning, Social, Curiosity, Surprise) relies more on the comment itself. Using strategies such as prompt decomposition, multi-prompt learning, and a codebook-centered approach, we found that prompt engineering can achieve fair to substantial agreement with expert-labeled data across the coding dimensions. These results affirm GPT's potential for effective application in real-world coding tasks. Context-dependent dimensions, however, showed lower agreement with expert-labeled data than context-independent ones. To improve accuracy, GPT models were fine-tuned on 102 expert-labeled cases, with an additional 102 cases used for validation. The fine-tuned models demonstrated substantial agreement with the ground truth on context-independent dimensions and raised the inter-rater reliability of context-dependent categories to moderate levels. This approach offers a promising path for substantially reducing human labor and time, especially with large unstructured datasets, without sacrificing the accuracy and reliability of deductive coding in social annotation. The study marks a step toward optimizing and streamlining coding processes in social annotation, and our findings suggest the promise of using GPT to analyze qualitative data and provide students with detailed, immediate feedback that deepens their inquiries.
KW - Context-Dependent
KW - Fine-tuning
KW - GPT
KW - Prompt Engineering
KW - Social Annotation
KW - Deductive Coding
UR - https://www.scopus.com/pages/publications/85187550642
U2 - 10.1145/3636555.3636910
DO - 10.1145/3636555.3636910
M3 - Conference contribution
AN - SCOPUS:85187550642
T3 - ACM International Conference Proceeding Series
SP - 518
EP - 528
BT - LAK 2024 Conference Proceedings - 14th International Conference on Learning Analytics and Knowledge
PB - Association for Computing Machinery
T2 - 14th International Conference on Learning Analytics and Knowledge, LAK 2024
Y2 - 18 March 2024 through 22 March 2024
ER -