Text-to-SQL based on Large Language Models and Database Keyword Search

Authors :
Nascimento, Eduardo R.
Avila, Caio Viktor S.
Izquierdo, Yenier T.
García, Grettel M.
Andrade, Lucas Feijó L.
Facina, Michelle S. P.
Lemos, Melissa
Casanova, Marco A.
Publication Year :
2025

Abstract

Text-to-SQL prompt strategies based on Large Language Models (LLMs) achieve remarkable performance on well-known benchmarks. When applied to real-world databases, however, their performance is significantly lower than on these benchmarks, especially for Natural Language (NL) questions that require complex filters and joins. This paper proposes a strategy for compiling NL questions into SQL queries that incorporates a dynamic few-shot examples strategy and leverages the services provided by a database keyword search (KwS) platform. The paper details how the examples provided and the keyword-matching service of the KwS platform improve the precision and recall of the schema-linking process. It then shows how the KwS platform can be used to synthesize a view that captures the joins required to process an input NL question, thereby simplifying the SQL query compilation step. The paper includes experiments with a real-world relational database to assess the performance of the proposed strategy. The experiments suggest that, on this database, the strategy achieves an accuracy that surpasses state-of-the-art approaches. The paper concludes by discussing the results obtained.

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2501.13594
Document Type :
Working Paper