Luo L, Ning J, Zhao Y, Wang Z, Ding Z, Chen P, Fu W, Han Q, Xu G, Qiu Y, Pan D, Li J, Li H, Feng W, Tu S, Liu Y, Yang Z, Wang J, Sun Y, Lin H. Taiyi: a bilingual fine-tuned large language model for diverse biomedical tasks. J Am Med Inform Assoc 2024:ocae037. [PMID: 38422367 DOI: 10.1093/jamia/ocae037] [Received: 12/12/2023] [Revised: 01/08/2024] [Accepted: 02/16/2024] [Indexed: 03/02/2024]
Abstract
OBJECTIVE: Most existing fine-tuned biomedical large language models (LLMs) focus on enhancing performance in monolingual biomedical question answering and conversation tasks. To investigate the effectiveness of the fine-tuned LLMs on diverse biomedical natural language processing (NLP) tasks in different languages, we present Taiyi, a bilingual fine-tuned LLM for diverse biomedical NLP tasks.

MATERIALS AND METHODS: We first curated a comprehensive collection of 140 existing biomedical text mining datasets (102 English and 38 Chinese datasets) across over 10 task types. Subsequently, these corpora were converted to the instruction data used to fine-tune the general LLM. During the supervised fine-tuning phase, a 2-stage strategy is proposed to optimize the model performance across various tasks.

RESULTS: Experimental results on 13 test sets, which include named entity recognition, relation extraction, text classification, and question answering tasks, demonstrate that Taiyi achieves superior performance compared to general LLMs. The case study involving additional biomedical NLP tasks further shows Taiyi's considerable potential for bilingual biomedical multitasking.

CONCLUSION: Leveraging rich high-quality biomedical corpora and developing effective fine-tuning strategies can significantly improve the performance of LLMs within the biomedical domain. Taiyi shows the bilingual multitasking capability through supervised fine-tuning. However, those tasks such as information extraction that are not generation tasks in nature remain challenging for LLM-based generative approaches, and they still underperform the conventional discriminative approaches using smaller language models.
Affiliation(s)
- Ling Luo
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Jinzhong Ning
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Yingwen Zhao
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Zhijun Wang
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Zeyuan Ding
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Peng Chen
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Weiru Fu
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Qinyu Han
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Guangtao Xu
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Yunzhi Qiu
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Dinghao Pan
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Jiru Li
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Hao Li
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Wenduo Feng
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Senbo Tu
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Yuqi Liu
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Zhihao Yang
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Jian Wang
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Yuanyuan Sun
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Hongfei Lin
- School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China