From Tax Compliance in Natural Language to Executable Calculations: Combining Lexical-grammar-based Parsing and Machine Learning
DOI:
https://doi.org/10.32473/flairs.v34i1.128351
Keywords:
tax domain, compliance, hybrid approach, executable calculations, raw texts, unannotated, lexical chart parsing, BERT
Abstract
Regulatory agencies publish tax-compliance content in natural language, written for human readers. Very little work has addressed automated methods for interpreting this content and generating executable calculations from it. In this paper, we describe an approach that combines lexical-grammar-based parsing with encoder-decoder architectures to automatically bootstrap executable calculations from natural language. The combination is particularly suitable for domains such as compliance, where training data is scarce and accurate interpretation is critical. We provide an overview of the implementation for North American income-tax forms.