TaxTajweez: A Large Language Model-based Chatbot for Income Tax Information In Pakistan Using Retrieval Augmented Generation (RAG)

Authors

  • Mohammad Affan Habib, Habib University
  • Shehryar Amin, Habib University
  • Muhammad Oqba, Habib University
  • Sameer Jaipal, Habib University
  • Muhammad Junaid Khan, University of Central Florida
  • Abdul Samad, Habib University

DOI:

https://doi.org/10.32473/flairs.37.1.135648

Abstract

The advent of Large Language Models (LLMs) has heralded a transformative era in natural language processing across diverse fields, igniting considerable interest in domain-specific applications. However, while proprietary models have made significant strides in sectors such as medicine, education, and law through tailored data accumulations, similar advancements have yet to emerge in the Pakistani taxation domain, hindering its digital transformation.

In this paper, we introduce TaxTajweez, a specialized Retrieval Augmented Generation (RAG) system powered by the OpenAI GPT-3.5-turbo LLM, designed specifically for income taxation. Complemented by a meticulously curated dataset tailored to the intricacies of income taxation, TaxTajweez leverages the RAG pipeline to mitigate model hallucinations, enhancing the reliability of generated responses. Through a blend of qualitative and quantitative evaluation methodologies, we rigorously assess the accuracy and usability of TaxTajweez, establishing its efficacy as an income tax advisory tool.
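
The general retrieve-then-generate pattern that a RAG system follows can be illustrated with a minimal sketch. This is not the authors' implementation: the passages are hypothetical placeholders rather than the curated dataset, retrieval uses a simple bag-of-words cosine similarity as a stand-in for whatever retriever TaxTajweez uses, and the final call to the LLM is left out, so the sketch stops at the grounded prompt that would be sent to the model. All function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`) are assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical placeholder passages. They are NOT real tax guidance and
# NOT the dataset described in the paper.
PASSAGES = [
    "Placeholder A: salary income is taxed under rules described in document X.",
    "Placeholder B: rental property is covered in document Y.",
    "Placeholder C: an unrelated note about office opening hours.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, k=2):
    """Return the k passages most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        passages,
        key=lambda p: cosine(q, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, passages, k=2):
    """Build a prompt that restricts the model to the retrieved context."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("How is salary income taxed?", PASSAGES))
```

Constraining the model to retrieved passages, and asking it to admit when the context is insufficient, is the usual way such a pipeline aims to reduce hallucinated answers.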

Published

2024-05-12

Section

Special Track: Applied Natural Language Processing