Feasibility of Tiny Recursion Models for the Traveling Salesman Problem

Learned Insertion and 2-opt Refinement

Authors

  • Alexander V Mantzaris UCF

DOI:

https://doi.org/10.32473/flairs.39.1.141582

Abstract

Tiny Recursion Models (TRMs) have recently shown strong performance on hard, highly structured reasoning tasks by repeatedly refining an internal solution representation with a small shared network. In this paper we investigate whether the same recursive refinement principle can be transferred from puzzle-style prediction to combinatorial optimization. We focus on the Euclidean Traveling Salesman Problem (TSP) and identify a key obstacle for naive TRM formulations: unconstrained iterative rewrites easily violate feasibility (duplicate or missing cities) and can collapse into short subtours that are difficult to escape during refinement. To address this, we cast TSP solving as iterative constraint satisfaction in which every intermediate state is a valid single-cycle tour. We train two supervised TRM policies from synthetic mid-trajectory data generated by classical teachers: (i) a constructive insertion policy that selects a city and an insertion edge in the current partial tour, and (ii) a learned 2-opt policy that proposes improving swaps and predicts a STOP action. Experiments on small random Euclidean instances show stable, non-degenerate refinement dynamics: the learned 2-opt policy both improves tours and learns to terminate, and the full insertion + 2-opt pipeline consistently outperforms an insertion-only ablation. While optimality gaps remain nontrivial in this proof-of-concept regime, performance matches the general range of lightweight baselines.

Downloads

Published

06-05-2026

How to Cite

Mantzaris, A. V. (2026). Feasibility of Tiny Recursion Models for the Traveling Salesman Problem: Learned Insertion and 2-opt Refinement. The International FLAIRS Conference Proceedings, 39(1). https://doi.org/10.32473/flairs.39.1.141582