Medical Specialty Classification Using Large Language Models (LLMs)
DOI: https://doi.org/10.32473/flairs.38.1.138953

Abstract
This study evaluates the performance of Large Language Model (LLM)-based classifiers, including BERT, Bio-BERT, and Distil-BERT, against traditional machine learning algorithms for classifying medical transcription reports into specialties. Although LLMs are increasingly used in healthcare applications, Naive Bayes achieved the highest performance, with an accuracy of 86.16% and an F1-score of 84.52%, outperforming all other models. Other machine learning approaches, such as Random Forest and Multi-Layer Perceptron, also performed strongly, whereas BERT-based models reached lower accuracy, around 63%. However, LLM-based models performed better in certain specialties, such as Surgery, while machine learning models were superior in areas such as Allergy/Immunology, Cardiovascular/Pulmonary, and General Medicine. Our results suggest that LLM-based models may offer advantages in classifying certain medical specialties (e.g., Surgery), owing to pre-training on domain-specific knowledge bases and/or better semantic understanding with fewer data points.
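To illustrate the kind of traditional baseline the abstract credits with the best results, the sketch below builds a TF-IDF + Multinomial Naive Bayes pipeline for specialty classification. This is a minimal, hypothetical example assuming a scikit-learn setup; the toy transcription snippets and labels are invented for illustration and are not the paper's dataset or its exact feature configuration.

```python
# Hypothetical Naive Bayes baseline for medical-specialty classification:
# TF-IDF features fed into Multinomial Naive Bayes (scikit-learn).
# The texts/labels below are illustrative toy data, not the study's corpus.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = [
    "patient underwent laparoscopic appendectomy under general anesthesia",
    "incision closed with sutures after tumor resection",
    "echocardiogram shows reduced ejection fraction and a systolic murmur",
    "patient reports chest pain and shortness of breath on exertion",
    "allergy testing positive for pollen; antihistamine prescribed",
    "history of anaphylaxis; epinephrine auto-injector renewed",
]
labels = [
    "Surgery", "Surgery",
    "Cardiovascular/Pulmonary", "Cardiovascular/Pulmonary",
    "Allergy/Immunology", "Allergy/Immunology",
]

# Pipeline: vectorize free text into TF-IDF weights, then fit the classifier.
clf = make_pipeline(TfidfVectorizer(), MultinomialNB())
clf.fit(texts, labels)

# Classify an unseen transcription snippet into one of the specialties.
prediction = clf.predict(["surgical wound closed with sutures"])[0]
print(prediction)
```

In practice the study's pipeline would also involve train/test splitting and per-specialty evaluation (e.g., F1-score via `sklearn.metrics.classification_report`), which this sketch omits for brevity.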
Copyright (c) 2025 Surya Kathirvel, Lenin

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.