Evaluating Synthetic Sentence Coherence Using a Large Language Model

Authors

  • Richard Thompson, Naval Postgraduate School
  • Angelos Toutsios, Naval Postgraduate School
  • Adam Pease, Naval Postgraduate School
  • Mathias Kölsch
  • Christian Darken

DOI:

https://doi.org/10.32473/flairs.39.1.141844

Keywords:

Ontology, Synthetic Training Data, English Coherence, Language to Logic, High Precision Filtering, Large Language Models

Abstract

Fine-tuning a Large Language Model (LLM) to translate imprecise, ambiguous natural language into a formal logic language that supports automated reasoning requires a significant amount of training data. With the assistance of a large ontology, millions of synthetic natural-language sentences can be generated, each with a corresponding formal representation. However, the generated sentences are often nonsensical. Detecting and omitting incoherent sentences improves the quality of the training dataset and provides useful feedback to the ontologist for adding "common sense" rules to the ontology. Using approximately 6,000 human-labeled sentences, this research analyzes three methods for detecting linguistic coherence and conducting high-precision filtering. The first method uses expected next-token statistics from an LLM. The second method submits a prompt to an LLM asking it to make a coherence determination. The third method is a composite of the first two. Our results have dramatically improved synthetic training data quality and are expected to contribute to significantly better language reasoning skills.
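
The three methods described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which next-token statistic is used, so perplexity is swapped in here as one plausible choice; the threshold value is arbitrary; and combining the two signals with a logical AND is only one assumed reading of "composite". The function names and the idea that per-token log-probabilities and a judge verdict are already available are likewise hypothetical.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity of a sentence from its per-token natural-log probabilities.

    Assumption: the log-probabilities were obtained from some LLM beforehand.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_logprob = sum(token_logprobs) / len(token_logprobs)
    return math.exp(-mean_logprob)


def coherent_by_statistics(token_logprobs: Sequence[float],
                           max_perplexity: float = 50.0) -> bool:
    """Method 1 (sketch): accept a sentence if its perplexity is low enough.

    The default threshold of 50.0 is an arbitrary placeholder.
    """
    return perplexity(token_logprobs) <= max_perplexity


def coherent_composite(token_logprobs: Sequence[float],
                       judge_says_coherent: bool,
                       max_perplexity: float = 50.0) -> bool:
    """Method 3 (sketch): keep a sentence only if both signals agree.

    judge_says_coherent stands in for Method 2, the verdict returned by
    prompting an LLM. Requiring agreement (AND) favors precision over recall.
    """
    return judge_says_coherent and coherent_by_statistics(
        token_logprobs, max_perplexity)


def precision(predicted: Sequence[bool], labels: Sequence[bool]) -> float:
    """Fraction of sentences kept by the filter that humans labeled coherent."""
    kept = [label for pred, label in zip(predicted, labels) if pred]
    return sum(kept) / len(kept) if kept else 0.0
```

Under these assumptions, a filter would be tuned by sweeping `max_perplexity` and measuring `precision` against the human labels, keeping the setting that admits the fewest incoherent sentences into the training set.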

Published

06-05-2026

How to Cite

Thompson, R., Toutsios, A., Pease, A., Kölsch, M., & Darken, C. (2026). Evaluating Synthetic Sentence Coherence Using a Large Language Model. The International FLAIRS Conference Proceedings, 39(1). https://doi.org/10.32473/flairs.39.1.141844

Section

Special Track: Applied Natural Language Processing