TaxTajweez: A Large Language Model-based Chatbot for Income Tax Information In Pakistan Using Retrieval Augmented Generation (RAG)

Authors

  • Mohammad Affan Habib, Habib University
  • Shehryar Amin, Habib University
  • Muhammad Oqba, Habib University
  • Sameer Jaipal, Habib University
  • Muhammad Junaid Khan, University of Central Florida
  • Abdul Samad, Habib University

DOI:

https://doi.org/10.32473/flairs.37.1.135648

Abstract

The advent of Large Language Models (LLMs) has heralded a transformative era in natural language processing across diverse fields, igniting considerable interest in domain-specific applications. However, while proprietary models have made significant strides in sectors such as medicine, education, and law through tailored data accumulations, similar advancements have yet to emerge in the Pakistani taxation domain, hindering its digital transformation.

In this paper, we introduce TaxTajweez, a specialized Retrieval Augmented Generation (RAG) system powered by the OpenAI GPT-3.5-turbo LLM, designed specifically for income taxation. Complemented by a meticulously curated dataset tailored to the intricacies of income taxation, TaxTajweez leverages the RAG pipeline to mitigate model hallucinations, enhancing the reliability of generated responses. Through a blend of qualitative and quantitative evaluation methodologies, we rigorously assess the accuracy and usability of TaxTajweez, establishing its efficacy as an income tax advisory tool.

Published

12-05-2024

How to Cite

Mohammad Affan Habib, Shehryar Amin, Muhammad Oqba, Sameer Jaipal, Muhammad Junaid Khan, & Abdul Samad. (2024). TaxTajweez: A Large Language Model-based Chatbot for Income Tax Information In Pakistan Using Retrieval Augmented Generation (RAG). The International FLAIRS Conference Proceedings, 37(1). https://doi.org/10.32473/flairs.37.1.135648

Section

Special Track: Applied Natural Language Processing