Generate CREATE TABLE from JSON: Schema-First Migration in Seconds

The Manual Schema Mapping Problem

When you ingest a new JSON API into your database, the workflow looks like this:

Get a sample JSON response
Open a spreadsheet
Map each field to a SQL column
Guess the type (is price an integer or a float?)
Write the CREATE TABLE by hand
Run it
Realize created_at is a string, not a timestamp
Alter the table
Repeat for the next 12 API endpoints

This is slow, error-prone, and doesn't scale. Every new API endpoint means another manual schema mapping session. And if the API changes — a field goes from integer to string, a new field appears — you're back to step 2.

json2sql's --schema-only flag eliminates the manual mapping:

json2sql convert api-response.json --schema-only --dialect postgres

Output:

CREATE TABLE "data" ( "id" INTEGER, "name" TEXT, "email" TEXT, "age" INTEGER, "active" BOOLEAN, "balance" DOUBLE PRECISION, "created_at" TEXT );

No spreadsheet. No guessing. The types are inferred from the actual data values.

Quick Start: Generate Your First Schema

Step 1: Install json2sql

# pip
pip install json2sql

# Homebrew (macOS / Linux)
brew tap Coding-Dev-Tools/tap
brew install json2sql

# Scoop (Windows)
scoop bucket add Coding-Dev-Tools https://github.com/Coding-Dev-Tools/scoop-bucket
scoop install json2sql

Step 2: Save a Sample JSON Response

# Fetch a sample from your API
curl -s https://api.example.com/users | python3 -m json.tool > users.json

The JSON file:

[
  {
    "id": 1,
    "name": "Alice Chen",
    "email": "alice@example.com",
    "age": 30,
    "active": true,
    "balance": 1250.50,
    "created_at": "2024-01-15T10:30:00Z"
  },
  {
    "id": 2,
    "name": "Bob Martinez",
    "email": "bob@example.com",
    "age": 25,
    "active": false,
    "balance": 890.25,
    "created_at": "2024-02-20T14:00:00Z"
  }
]

Step 3: Generate the Schema

json2sql convert users.json --schema-only --dialect postgres

Output:

CREATE TABLE "data" ( "id" INTEGER, "name" TEXT, "email" TEXT, "age" INTEGER, "active" BOOLEAN, "balance" DOUBLE PRECISION, "created_at" TEXT );

Step 4: Write to a Migration File

json2sql convert users.json --schema-only --dialect postgres --output migrations/001_create_users.sql

Type Inference: How It Works

json2sql inspects every value in your JSON data and maps it to the correct SQL type for your dialect:

JSON Value	Example	PostgreSQL	MySQL	SQLite
String	`"Alice"`	`TEXT`	`VARCHAR(255)`	`TEXT`
Integer	`42`	`INTEGER`	`INT`	`INTEGER`
Float	`3.14`	`DOUBLE PRECISION`	`DOUBLE`	`REAL`
Boolean	`true`	`BOOLEAN`	`TINYINT(1)`	`INTEGER`
Null	`null`	`TEXT`	`TEXT`	`TEXT`

Type upgrading: If the first record has "count": null and a later record has "count": 42, json2sql upgrades the column from TEXT to INTEGER. It scans all records and uses the most specific type found — so you get accurate types even when the first record has nulls.

Schema-Only vs. Full Conversion

The --schema-only flag generates DDL without data — perfect for migration workflows. The default (without the flag) generates both DDL and INSERT statements:

	`json2sql convert`	`json2sql convert --schema-only`
Output	CREATE TABLE + INSERT	CREATE TABLE only
Best for	One-shot data loading	Schema-first migration, incremental loads
Review workflow	Hard to review DDL mixed with data	Clean DDL for human review
CI/CD	Too much output for PR reviews	Compact DDL diffs in PR comments
Schema changes	Re-run with new data	Compare old vs. new DDL

Five Real-World Schema Generation Workflows

Workflow 1: New API Onboarding

You're integrating a third-party payments API. You have the docs but no schema. Fetch a sample response, generate the DDL, review it, and run it. Five minutes instead of an hour of manual mapping.

# Fetch sample and generate schema in one pipeline
curl -s https://api.payment.com/transactions | \
  json2sql convert --schema-only --dialect postgres --table transactions \
  --output migrations/001_create_transactions.sql

Workflow 2: Multi-API Schema Bootstrap

You're building a data warehouse that ingests from five different APIs. Each has its own JSON shape. Generate all five schemas in one script:

#!/bin/bash
# bootstrap_warehouse.sh
APIS=("users" "orders" "products" "payments" "analytics")

for api in "${APIS[@]}"; do
  curl -s "https://api.example.com/$api" > "/tmp/${api}.json"
  json2sql convert "/tmp/${api}.json" \
    --schema-only \
    --dialect postgres \
    --table "$api" \
    --output "migrations/001_create_${api}.sql"
  echo "Generated schema for $api"
done

Workflow 3: Schema Drift Detection in CI

Your API partner changed their response format — a new field appeared, a type changed. Detect it before it breaks your ETL:

# .github/workflows/schema-drift.yml
name: API Schema Drift Detection

on:
  schedule:
    - cron: '0 8 * * 1'  # Monday 8 AM

jobs:
  drift-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install json2sql
        run: pip install json2sql

      - name: Fetch current API schema
        run: |
          curl -s $API_URL > /tmp/current.json
          json2sql convert /tmp/current.json \
            --schema-only --dialect postgres > /tmp/current_schema.sql

      - name: Compare with baseline
        run: |
          if ! diff -q schemas/baseline_schema.sql /tmp/current_schema.sql > /dev/null 2>&1; then
            echo "::warning::API schema has changed!"
            diff schemas/baseline_schema.sql /tmp/current_schema.sql
            # Create an issue or notify
          else
            echo "No schema drift detected."
          fi

Workflow 4: Nested JSON → Multiple Tables

Your JSON has nested objects and arrays. --schema-only --flatten generates DDL for all the relational tables at once:

json2sql convert orders.json --schema-only --flatten --dialect postgres

Output:

CREATE TABLE "orders" ( "id" INTEGER, "customer_name" TEXT, "customer_email" TEXT, "total" DOUBLE PRECISION ); CREATE TABLE "orders_items" ( "orders_id" INTEGER, "product" TEXT, "quantity" INTEGER, "price" DOUBLE PRECISION );

Workflow 5: Multi-Dialect Schema Generation

Your dev team uses SQLite locally, staging runs MySQL, and production is PostgreSQL. Generate schemas for all three from the same JSON:

# Generate for all three dialects
for dialect in postgres mysql sqlite; do
  json2sql convert api-data.json \
    --schema-only \
    --dialect "$dialect" \
    --table "users" \
    --output "schemas/users_${dialect}.sql"
done

PostgreSQL output:

CREATE TABLE "users" ( "id" INTEGER, "name" TEXT, "balance" DOUBLE PRECISION, "active" BOOLEAN );

MySQL output:

CREATE TABLE `users` ( `id` INT, `name` VARCHAR(255), `balance` DOUBLE, `active` TINYINT(1) );

SQLite output:

CREATE TABLE "users" ( "id" INTEGER, "name" TEXT, "balance" REAL, "active" INTEGER );

Type Inference Edge Cases

Null Values Get Upgraded

When the first record has a null value but later records have a real value, json2sql upgrades the type:

[
  {"id": 1, "nickname": null},
  {"id": 2, "nickname": "Ace"}
]

Result: nickname is inferred as TEXT, not left as a generic nullable column. json2sql scans all records before deciding the final type.

Booleans Are Not Integers

In JSON, true/false are distinct from 1/0. json2sql handles this correctly — even though Python's bool is a subclass of int, the type inference checks for booleans first:

[
  {"id": 1, "active": true},
  {"id": 2, "active": false}
]

Result: active → BOOLEAN (PostgreSQL), TINYINT(1) (MySQL), INTEGER (SQLite).

Mixed Types Default to TEXT

If a field contains both strings and numbers across records (which shouldn't happen in a well-designed API but does in practice), json2sql falls back to TEXT for safety:

[
  {"id": 1, "code": "ABC"},
  {"id": 2, "code": 42}
]

Result: code → TEXT. The integer value is preserved — it'll be inserted as '42' — and your pipeline doesn't crash on a type mismatch.

Tip: If you see unexpected TEXT types in your schema, check your source data for mixed types. This is usually a sign that the upstream API is inconsistent — worth flagging with the API owner.

CI/CD Integration: Schema-as-Code

Pattern 1: PR-Driven Schema Changes

When someone changes the JSON schema sample in your repository, automatically generate the DDL and post it as a PR comment:

# .github/workflows/schema-review.yml
name: Schema Review

on:
  pull_request:
    paths: ['schemas/*.json']

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install json2sql
        run: pip install json2sql

      - name: Generate DDL for changed schemas
        run: |
          for file in $(git diff --name-only origin/main -- 'schemas/*.json'); do
            table=$(basename "$file" .json)
            echo "### Schema for \`$table\`" >> /tmp/ddl_review.md
            echo "" >> /tmp/ddl_review.md
            echo '```sql' >> /tmp/ddl_review.md
            json2sql convert "$file" --schema-only --dialect postgres --table "$table" >> /tmp/ddl_review.md
            echo '```' >> /tmp/ddl_review.md
            echo "" >> /tmp/ddl_review.md
          done

      - name: Post DDL as PR comment
        if: hashFiles('/tmp/ddl_review.md') != ''
        run: |
          gh pr comment ${{ github.event.pull_request.number }} \
            --body-file /tmp/ddl_review.md

Pattern 2: Automated Migration Generation

Generate migration files from JSON samples as part of your build process:

# .github/workflows/generate-migrations.yml
name: Generate Migrations

on:
  push:
    paths: ['schemas/*.json']
    branches: [main]

jobs:
  generate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install json2sql
        run: pip install json2sql

      - name: Generate migration for each schema
        run: |
          VERSION=$(date +%Y%m%d%H%M%S)
          for file in schemas/*.json; do
            table=$(basename "$file" .json)
            json2sql convert "$file" \
              --schema-only \
              --dialect postgres \
              --table "$table" \
              --output "migrations/${VERSION}_create_${table}.sql"
          done

      - name: Commit migrations
        run: |
          git add migrations/
          git commit -m "Auto-generate migrations from schema samples" || true
          git push

Combining Schema-Only with Flatten

The --schema-only and --flatten flags work together to generate DDL for nested JSON — producing multiple CREATE TABLE statements from a single nested JSON document:

json2sql convert orders.json --schema-only --flatten --dialect postgres

For a JSON document with nested customer info and line items, this produces three tables:

orders — the parent table with flattened customer fields
orders_items — a child table for the line items array

Each child table automatically gets a foreign key reference to its parent (using the parent's id, name, or {table}_id field if present).

Schema-first flattening: Use --schema-only --flatten to design your relational schema before loading any data. Review the DDL, adjust column names or types if needed, then run the full conversion with INSERT statements to load the data.

Install json2sql

# pip
pip install json2sql

# Homebrew (macOS / Linux)
brew tap Coding-Dev-Tools/tap
brew install json2sql

# Scoop (Windows)
scoop bucket add Coding-Dev-Tools https://github.com/Coding-Dev-Tools/scoop-bucket
scoop install json2sql

Star json2sql on GitHub

Generate CREATE TABLE from JSON: Schema-First Migration in Seconds

The Manual Schema Mapping Problem

Quick Start: Generate Your First Schema

Step 1: Install json2sql

Step 2: Save a Sample JSON Response

Step 3: Generate the Schema

Step 4: Write to a Migration File

Type Inference: How It Works

Schema-Only vs. Full Conversion

Five Real-World Schema Generation Workflows

Workflow 1: New API Onboarding

Workflow 2: Multi-API Schema Bootstrap

Workflow 3: Schema Drift Detection in CI

Workflow 4: Nested JSON → Multiple Tables

Workflow 5: Multi-Dialect Schema Generation

Type Inference Edge Cases

Null Values Get Upgraded

Booleans Are Not Integers

Mixed Types Default to TEXT

CI/CD Integration: Schema-as-Code

Pattern 1: PR-Driven Schema Changes

Pattern 2: Automated Migration Generation

Combining Schema-Only with Flatten

Install json2sql

Related Reading