面向心理健康咨询的藏语数据集及大语言模型构建

Translated title of the contribution: Construction of Tibetan Datasets and Large Language Models for Psychological Health Counseling

Mengxiao Zhu, Shajiu, Chong Feng*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Anxiety and depression have become prevalent psychological disorders, and appropriate counseling plays a critical role in alleviating mental and psychological stress. However, for reasons such as a sense of shame, many individuals do not receive timely counseling and treatment. With the advancement of artificial intelligence, large language models (LLMs), with their strong abilities in knowledge integration and chain-of-thought reasoning, have become effective tools for psychological counseling. Nevertheless, existing psychological health LLMs focus primarily on resource-rich languages such as English and Chinese, and research on low-resource languages remains limited. This paper focuses on Tibetan, a representative low-resource language, and explores the construction of Tibetan psychological counseling datasets and Tibetan psychological health LLMs. First, we collect high-quality Chinese psychological counseling dialogue data and process it into a multi-turn psychological health dialogue dataset. Next, we develop a Chinese-Tibetan translation tool to translate this dataset into Tibetan, and apply multiple filtering mechanisms to produce high-quality Tibetan multi-turn psychological health dialogue data. Using the constructed data, we fine-tune two existing general-purpose LLMs, Baichuan2 and LLaMA2, to obtain a Tibetan psychological health LLM, which will be open-sourced for scientific research. Finally, experiments validate the effectiveness of the released Tibetan multi-turn psychological health dialogue dataset and the Tibetan psychological health counseling LLM.
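
The abstract does not say which filtering mechanisms are applied to the translated dialogues. As an illustration only, the sketch below shows two simple heuristics that such a filter might plausibly include: the share of characters in the Tibetan Unicode block (U+0F00 to U+0FFF) and the Tibetan-to-Chinese length ratio. The field names (`zh`, `bo`), the function names, and all thresholds are assumptions made for this example, not the authors' method.

```python
from typing import Dict, List

# Hypothetical record layout: one turn holds the Chinese source ("zh")
# and its machine-translated Tibetan counterpart ("bo").
Turn = Dict[str, str]
Dialogue = List[Turn]


def tibetan_ratio(text: str) -> float:
    """Fraction of non-whitespace characters in the Tibetan block (U+0F00-U+0FFF)."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    tibetan = sum(1 for c in chars if "\u0f00" <= c <= "\u0fff")
    return tibetan / len(chars)


def keep_turn(
    turn: Turn,
    min_ratio: float = 0.8,      # assumed threshold, not from the paper
    min_len_ratio: float = 0.5,  # assumed threshold, not from the paper
    max_len_ratio: float = 8.0,  # assumed threshold, not from the paper
) -> bool:
    """Return True if a translated turn passes both heuristic checks."""
    zh, bo = turn["zh"], turn["bo"]
    if not zh.strip() or not bo.strip():
        return False
    # Reject output that is mostly untranslated or in the wrong script.
    if tibetan_ratio(bo) < min_ratio:
        return False
    # Reject translations that are implausibly short or long for the source.
    length_ratio = len(bo) / len(zh)
    return min_len_ratio <= length_ratio <= max_len_ratio


def filter_dialogues(dialogues: List[Dialogue]) -> List[Dialogue]:
    """Keep a dialogue only if every turn passes, so multi-turn context stays intact."""
    return [d for d in dialogues if d and all(keep_turn(t) for t in d)]
```

Dropping a whole dialogue when any single turn fails is a deliberate choice in this sketch: removing individual turns would break the conversational context that multi-turn fine-tuning depends on.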

Original language: Chinese (Traditional)
Title of host publication: Main Conference
Editors: Maosong Sun, Jiye Liang, Xianpei Han, Zhiyuan Liu, Yulan He
Publisher: Chinese National Conference on Computational Linguistics (CCL)
Pages: 326-339
Number of pages: 14
ISBN (Electronic): 9780000000002
Publication status: Published - 2024
Externally published: Yes
Event: 23rd Chinese National Conference on Computational Linguistics, CCL 2024 - Taiyuan, China
Duration: 24 Jul 2024 to 28 Jul 2024

Publication series

Name: CCL 2024 - 23rd Chinese National Conference on Computational Linguistics
Volume: 1

Conference

Conference: 23rd Chinese National Conference on Computational Linguistics, CCL 2024
Country/Territory: China
City: Taiyuan
Period: 24/07/24 to 28/07/24
