Text To Sql

Use when (1) user describes what data they want in plain English and asks for the corresponding SQL query. (2) user says "write SQL for this", "convert to qu...

installs

stars

karma

SkillRank score ↗

7.9/ 10

evaluated by implexa, claude-haiku-4-5 · 2026-05-26

skill-factory-text-sql bridges natural language data requests to syntactically correct sql queries. covers schema validation, intent mapping, multi-table joins, and validation rules. explicitly excludes execution, design, and analysis scope.

structure

9.0

trigger phrases

8.0

procedure

8.0

edge cases

6.0

documentation

8.0

strengths

view original SKILL.md from clawhubclick to expand

---
name: text-to-sql
description: >
  Use when (1) user describes what data they want in plain English and asks for the corresponding SQL query. 
  (2) user says "write SQL for this", "convert to query", "how do I select", or "give me the SQL". 
  (3) user provides a database schema or table descriptions and asks a question answerable by SQL. 
license: MIT
metadata:
  version: "1.0"
  category: data
  author: wangjipeng
  sources:
    - https://github.com/MiniMax-AI/skills
---

# Text to SQL

Use when (1) user describes what data they want in plain English and asks for the corresponding SQL query. (2) user says "write SQL for this", "convert to query", "how do I select", or "give me the SQL". (3) user provides a database schema or table descriptions and asks a question answerable by SQL.

## Core Position

This skill solves the specific problem of: *non-technical users who know what data they want cannot translate their intent into SQL — they need a bridge from natural language to query.*

This skill IS NOT:
- A SQL execution environment — it writes queries, does not run them
- A schema design tool — it works with existing schema the user provides
- A data analysis tool — it produces SQL, not results or insights

This skill IS activated ONLY when: natural language description + database schema + SQL request are all present.

## Modes

### `/text-to-sql`

**Default mode.** Converts natural language into a syntactically correct SQL query.

When to use: User describes data needs and provides schema — wants the query.

### `/text-to-sql/explain`

Outputs the SQL query with inline comments explaining each clause.

When to use: User wants to understand the query while seeing it, for learning purposes.

### `/text-to-sql/alternatives`

Provides 2-3 alternative query approaches (different JOINs, subqueries vs CTEs, etc.).

When to use: User is learning SQL or wants to compare query strategies.

## Execution Steps

### Step 1 — Confirm Schema

1. Receive natural language request and detect if schema is present
2. Schema may be provided as:
   - Table/column names explicitly in the request
   - A CREATE TABLE statement
   - A DESCRIBE output
   - Column names from a previous query
3. If schema is NOT provided, ask the user for it before proceeding — do not guess table or column names
4. Build a schema map: `table_name → {column: type}`

### Step 2 — Translate Intent to SQL Clauses

Map natural language intent to SQL components:

| Natural Language | SQL Clause |
|---|---|
| "all", "every", "complete list" | `SELECT *` or `SELECT all columns` |
| "only", "just", "specifically" | `SELECT [specific columns]` |
| "where [condition]" | `WHERE` clause |
| "sorted by", "in order of" | `ORDER BY` |
| "grouped by", "each [X]" | `GROUP BY` |
| "top N", "first N", "N most" | `LIMIT N` + `ORDER BY` |
| "not", "exclude", "without" | `WHERE NOT` or `!=` / `<>` |
| "both X and Y", "along with" | `AND` in WHERE, or JOIN |
| "either X or Y", "or" | `OR` in WHERE |
| "between X and Y" | `BETWEEN` |
| "like", "containing", "includes" | `LIKE '%value%'` |
| "before", "after", "earlier than" | `WHERE date_column < 'date'` |
| "latest", "most recent", "newest" | `ORDER BY date DESC LIMIT 1` |
| "count of", "how many" | `COUNT(*)` aggregate |
| "total of", "sum of" | `SUM(column)` |
| "average of" | `AVG(column)` |

### Step 3 — Handle Joins and Relationships

If the request involves multiple tables:
1. Identify which tables contain the needed columns
2. Determine the join key (foreign key relationship)
3. Select join type: `INNER JOIN` (default), `LEFT JOIN` (if some side may be empty), `RIGHT JOIN` (rare)
4. Write join on the correct key pair

If schema doesn't include relationship info, ask user to clarify which column links the tables.

### Step 4 — Generate and Validate

```sql
SELECT
  o.order_id,
  o.created_at,
  c.customer_name,
  SUM(o.total_amount) AS total_revenue
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id
WHERE o.created_at >= '2024-01-01'
GROUP BY o.order_id, o.created_at, c.customer_name
ORDER BY total_revenue DESC
LIMIT 10;
```

Check:
- All columns referenced exist in the schema
- All table aliases are defined
- JOIN conditions are valid (same type, correct keys)
- No ambiguous column references (all tables have aliases)
- Aggregate queries have appropriate GROUP BY

## Mandatory Rules

### Do not

- Do not invent table names or column names not in the provided schema
- Do not use SQL keywords as column names without backtick quoting where needed
- Do not write `SELECT *` in production queries — list specific columns
- Do not assume which table a column belongs to — qualify all column references

### Do

- Ask for schema information before writing the query if it wasn't provided
- Qualify all column references with table aliases (e.g., `o.order_id`)
- Use backtick or quoted identifiers if column names are SQL reserved words
- Provide both the query and a one-line plain English translation of what it does

## Quality Bar

**A good output:**
- Query is syntactically correct for the stated dialect (PostgreSQL, MySQL, SQLite, etc.)
- All column and table names match the provided schema exactly
- JOIN conditions are valid and use the correct key types
- The query actually answers the stated question

**A bad output:**
- References a column not in the schema
- `SELECT *` without justification in a query that should return specific columns
- Missing `GROUP BY` for an aggregate query
- JOIN on mismatched types (string ID to integer ID)

## Good vs. Bad Examples

| Scenario | Bad Output | Good Output |
|---|---|---|
| Schema: `users(id, name)`, `orders(user_id, total)` | `SELECT * FROM orders` | `SELECT u.name, SUM(o.total) FROM users u JOIN orders o ON u.id = o.user_id GROUP BY u.name` |
| "top 10 customers" | No `LIMIT 10` or `ORDER BY` | `ORDER BY total DESC LIMIT 10` |
| No schema provided | Writes a query with invented columns | "Could you share the table schema (column names and types)?" |
| "show me revenue by month" | `SELECT revenue` without `GROUP BY` | `SELECT DATE_TRUNC('month', date), SUM(revenue) FROM orders GROUP BY 1` |

## References

- `references/` — SQL dialect cheat sheet (PostgreSQL, MySQL, SQLite), JOIN types and when to use each, common intent-to-clause mapping

don't have the plugin yet? install it then click "run inline in claude" again.

added explicit inputs section documenting schema formats and SQL dialect parameter, restructured execution steps with clear input/output for each step, formalized decision points covering schema validation, mode selection, ambiguity handling, reserved keywords, and join type inference, added specific output contract requirements for column qualification and alias consistency, defined outcome signals tied to actual execution success

Text to SQL

intent

activate this skill when a user describes what data they want in plain English and asks for the corresponding SQL query. use cases: user says "write SQL for this", "convert to query", "how do I select", or "give me the SQL". user provides a database schema or table descriptions and asks a question answerable by SQL. the skill bridges non-technical users from natural language intent to executable queries. this is not a query execution environment, not a schema design tool, and not a data analysis tool. it writes queries only.

inputs

natural language request: user's plain English description of the data they want
database schema: table names, column names, column types, and primary/foreign key relationships. can be provided as: CREATE TABLE statements, DESCRIBE output, column lists, or explicit relationship documentation. if not provided, skill must ask for it before proceeding
SQL dialect (optional): PostgreSQL, MySQL, SQLite, T-SQL, BigQuery, etc. defaults to PostgreSQL if unspecified
mode selection (optional): default mode generates query; /explain mode adds inline comments; /alternatives mode provides 2-3 query approaches

no external API connections required. no authentication. no rate limits.

procedure

step 1: confirm schema availability

input: natural language request

detect if schema is present in the user's message
if schema is present, parse it into a map: table_name → {column: type, key_type: pk|fk|none}
if schema is missing, explicitly ask user for it (table names, column names, column types, and any foreign key relationships)
do not guess or invent table or column names

output: confirmed schema map

step 2: translate natural language intent to SQL components

input: natural language request + schema map

map each phrase to a SQL clause using the intent table below:

natural language phrase	SQL clause
"all", "every", "complete list"	SELECT all columns (list explicitly, no *)
"only", "just", "specifically"	SELECT [specific columns only]
"where [condition]"	WHERE clause
"sorted by", "in order of", "ascending", "descending"	ORDER BY
"grouped by", "each [X]", "per [X]"	GROUP BY
"top N", "first N", "N most"	LIMIT N with ORDER BY DESC
"not", "exclude", "without"	WHERE NOT or != or <>
"both X and Y", "along with"	AND in WHERE or JOIN
"either X or Y", "or"	OR in WHERE
"between X and Y"	BETWEEN operator
"like", "containing", "includes", "starts with"	LIKE '%value%' or LIKE 'value%'
"before", "after", "earlier than", "later than"	WHERE date_column < or > 'date'
"latest", "most recent", "newest"	ORDER BY date_column DESC LIMIT 1
"count of", "how many"	COUNT(*) aggregate
"total of", "sum of"	SUM(column) aggregate
"average of", "mean"	AVG(column) aggregate
"distinct", "unique", "different"	SELECT DISTINCT

output: list of SQL clauses and their components

step 3: handle joins and multi-table queries

input: SQL clauses + schema map

if request involves multiple tables:
- identify which columns come from which table
- locate the join key (foreign key relationship in schema)
- select join type: INNER JOIN (default, rows must exist in both tables), LEFT JOIN (keep all rows from left table, nulls on right), RIGHT JOIN (rare, keep all rows from right table)
- construct join condition: ON table1.fk_col = table2.pk_col
if schema lacks explicit relationship info, ask user to specify which column links the tables
qualify all column references with table aliases (e.g., o.order_id, c.customer_name)

output: validated JOIN clauses with aliases

step 4: generate and validate query

input: all SQL clauses + schema map

assemble the query in standard order: SELECT, FROM, JOIN, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT
validate:
- all referenced columns exist in the provided schema
- all table aliases are defined and used consistently
- JOIN conditions reference existing foreign keys with matching types
- no ambiguous column names (all qualified with alias)
- aggregate queries have GROUP BY for all non-aggregated columns
- parentheses balance and syntax is correct for the target SQL dialect
output the query and a one-line plain English translation of what it does

output: syntactically correct SQL query + translation

decision points

if schema is not provided: ask user for it. do not invent table or column names. wait for explicit schema information before generating any query.

if user asks for /explain mode: add inline comments above each SQL clause explaining its purpose and what data it filters or returns.

if user asks for /alternatives mode: generate 2-3 alternative queries using different approaches (e.g., JOIN vs subquery, CTE vs nested query, GROUP BY with HAVING vs filtering in WHERE). explain trade-offs of each.

if the request is ambiguous (e.g., "show me sales" without specifying which columns or which tables): ask clarifying questions before writing the query. do not assume.

if a column name is a SQL reserved keyword (e.g., "date", "user", "order"): quote it with backticks or double quotes depending on SQL dialect. PostgreSQL and SQLite use double quotes; MySQL uses backticks.

if multiple join types are valid (e.g., user asks for "orders with customers"): default to INNER JOIN and note in the query why. if user intends LEFT JOIN, they must specify "all orders, even those without a customer".

if the request includes aggregates but no explicit GROUP BY (e.g., "total revenue by region"): infer GROUP BY region and flag this assumption in the output.

if the request references a column not in the schema: ask user to clarify. do not reference invented columns.

output contract

success looks like:

a SQL query that is syntactically correct for the specified dialect (or PostgreSQL if unspecified)
all table and column names exactly match the provided schema, with no invented names
all columns referenced in SELECT, WHERE, JOIN, GROUP BY, and ORDER BY are defined in the schema
all table aliases are defined in FROM or JOIN and used consistently throughout
JOIN conditions reference valid foreign key relationships with matching data types
for aggregate queries, GROUP BY includes all non-aggregated columns in the SELECT clause
the query actually answers the user's original question in plain English

file location / delivery format:

query is output as a single markdown code block with sql language tag (triple backtick + sql)
query is preceded by a one-line plain English summary
if /explain mode: comments are inline in the query explaining each clause
if /alternatives mode: multiple queries are provided, each in its own code block, with brief explanations of when to use each

outcome signal

user knows the skill worked when:

they copy-paste the generated SQL into their database tool and it executes without syntax errors
the query returns the rows and columns they expected based on their original request
if they run the query against their actual schema and data, the results match their intent (e.g., if they asked for "top 10 customers by revenue", the output shows 10 rows sorted by revenue descending)
for /explain mode: user reads the comments and understands what each clause does
for /alternatives mode: user can compare the three approaches and choose one based on performance or readability

Text To Sql

related skills

Text to SQL

intent

inputs

procedure

decision points

output contract

outcome signal