The JSON-to-SQL Gap
Modern data is JSON. Databases speak SQL. Between them sits a tedious manual process: parse the JSON, infer column types, generate CREATE TABLE statements, flatten nested objects into relational tables, and produce INSERT statements in the right dialect. Every developer writes this script once, then rewrites it when the schema changes or the target database switches from PostgreSQL to MySQL.
The gap gets worse with nested JSON. A single API response might contain an array of objects, each with nested fields that need their own table and foreign key relationships. Writing the flattening logic by hand is error-prone and doesn't generalize.
Four approaches try to bridge this gap, at very different scales:
| Approach | json2sql | Papa Parse + manual SQL | AWS DMS | Airbyte |
|---|---|---|---|---|
| What it does | CLI: JSON to SQL in one command | Parse CSV/JSON, write INSERT strings | Cloud migration service | ELT platform with 350+ connectors |
| Scale | Developer laptop | Developer laptop | Enterprise cloud | Team/org data pipelines |
| Setup time | 30 seconds | 30 minutes | 1-2 hours | 30-60 minutes |
Tool 1: json2sql -- One-Command JSON to SQL
json2sql -- Convert JSON Files to SQL INSERT Statements
convert-- JSON to INSERT statements in one command- Nested JSON auto-flattens into relational tables with
--flatten - Multi-dialect: PostgreSQL, MySQL, SQLite
- Stdin pipe support for data pipelines
json2sql is a CLI tool that does one thing: convert JSON data into valid SQL. You give it a JSON file (or pipe JSON to stdin), and it outputs CREATE TABLE and INSERT statements. It handles nested objects, type inference, and multi-dialect output -- all from the command line.
Core workflow
# Install
pip install json2sql-cli
# Basic conversion -- JSON to SQL INSERTs
json2sql convert data.json
# CREATE TABLE data (id INTEGER, name TEXT, email TEXT, active BOOLEAN);
# INSERT INTO data (id, name, email, active) VALUES (1, 'Alice', 'alice@example.com', TRUE);
# INSERT INTO data (id, name, email, active) VALUES (2, 'Bob', 'bob@example.com', FALSE);
# Target a specific database dialect
json2sql convert data.json --dialect postgres
json2sql convert data.json --dialect mysql
json2sql convert data.json --dialect sqlite
# Specify table name
json2sql convert users.json --table users
# Flatten nested JSON into relational tables
json2sql convert nested_data.json --flatten
# Creates parent table + child tables with foreign keys
# Pipe from stdin
cat api_response.json | json2sql convert --dialect postgres --table events
# Output to file
json2sql convert data.json -o seed.sql
# CI/CD: seed a test database
json2sql convert fixtures.json --dialect sqlite -o seed.sql
sqlite3 test.db < seed.sql
What json2sql gets right
- One command. No scripts to write, no transformations to define.
json2sql convert data.jsonproduces valid SQL. If you need it again with a different dialect, change one flag. - Nested JSON flattening.
--flattenautomatically breaks nested objects into separate relational tables with foreign key relationships. This is the feature that saves the most time -- writing flattening logic by hand takes 50-100 lines per nested schema. - Multi-dialect output. PostgreSQL, MySQL, and SQLite have different INSERT syntaxes (BOOLEAN vs TINYINT, quoted identifiers, auto-increment). json2sql handles the dialect differences automatically.
- Type inference. Auto-detects strings, numbers, booleans, and nulls. No schema definition needed for simple data.
- Pipe support. Read from stdin, write to stdout. Fits into any Unix pipeline or CI/CD step.
- Schema generation. Generates CREATE TABLE statements alongside INSERT statements. You get a complete, runnable SQL file.
- Zero-config. No YAML, no mapping files, no UI. Works out of the box for 90% of JSON-to-SQL use cases.
Where json2sql is limited
- File-based input. Designed for JSON files and stdin. Doesn't connect to APIs, message queues, or databases directly.
- Not a migration service. Doesn't continuously sync data. It's a one-shot conversion tool, not a CDC (change data capture) pipeline.
- No UI. CLI only. Teams that prefer visual mapping tools need something else.
- Limited to INSERT. Generates INSERT statements, not UPDATE or UPSERT. For idempotent seeding, you'd need to add ON CONFLICT clauses manually.
Tool 2: Papa Parse + Hand-Rolled SQL -- The Script Approach
Papa Parse + Manual INSERT Generation -- Parse JSON, Write SQL Strings
- Papa Parse: fast CSV/JSON parser for Node.js
- You write the INSERT statement templates
- Full control over mapping and transformation
This is what most developers do first: parse the JSON (with Papa Parse, jq, or a built-in parser), then loop through the results and build INSERT strings. Papa Parse is an excellent parser -- fast, well-tested, and handles edge cases like quoted fields and mixed encodings. The problem isn't the parsing; it's everything after.
Core workflow
// Parse JSON with Papa Parse
const results = Papa.parse(jsonString, { header: true, dynamicTyping: true });
// Hand-roll INSERT statements
const columns = Object.keys(results.data[0]);
const colList = columns.join(', ');
for (const row of results.data) {
const values = columns.map(col => {
const val = row[col];
if (val === null || val === undefined) return 'NULL';
if (typeof val === 'number') return val;
if (typeof val === 'boolean') return val ? 'TRUE' : 'FALSE';
return `'${val.replace(/'/g, "''")}'`; // SQL injection risk
});
console.log(`INSERT INTO my_table (${colList}) VALUES (${values.join(', ')});`);
}
// Problems you still need to solve:
// 1. CREATE TABLE generation
// 2. Nested object flattening
// 3. Dialect differences (PostgreSQL vs MySQL vs SQLite)
// 4. SQL injection escaping
// 5. Type inference for columns
// 6. Foreign key relationships from nested data
// 7. NULL handling per dialect
// 8. Date/timestamp formatting
What Papa Parse + manual SQL gets right
- Full control. You decide exactly how each field maps to a column, how types are inferred, and how INSERT statements are formatted.
- No new dependency. If you already use Papa Parse (or Node.js's built-in JSON parser), you're just writing more code in an existing project.
- Free. No license cost. Papa Parse is MIT-licensed.
- Custom transformation. Any data transformation, no matter how unusual, can be implemented in code.
Where hand-rolling SQL falls apart
- Nested JSON flattening. This is the killer. Nested objects require relational table design, foreign key generation, and separate INSERT statements per table. A 3-level nested JSON object needs 3 tables and careful ordering of INSERTs. Writing this by hand for every new schema takes hours.
- SQL injection risk. Hand-rolled string interpolation is the #1 cause of SQL injection. Even with basic escaping, edge cases (unicode, newlines, backslashes) can break your INSERT statements.
- Dialect differences. PostgreSQL uses
TRUE/FALSEfor booleans. MySQL uses1/0. SQLite uses1/0but also acceptsTRUE/FALSEin newer versions. Date formats differ. Identifier quoting differs. If you ever switch databases, you rewrite your script. - Schema generation. You still need CREATE TABLE statements with the right column types. Type inference from JSON values is non-trivial -- a column that looks like a number might need to be TEXT if some rows have string values.
- Maintenance burden. Every schema change requires updating the script. Every new JSON source requires a new script. The approach doesn't scale beyond 2-3 data sources.
Best for: One-off conversions where you have a simple, flat JSON structure and need full control over the output. If your JSON is nested or you need to support multiple database dialects, use json2sql instead.
Tool 3: AWS DMS -- Cloud Database Migration Service
AWS Database Migration Service -- Migrate Data Between Databases
- Full database migration: schema, data, indexes, constraints
- Continuous data replication (CDC)
- Source to Target for 20+ database engines
AWS DMS is a cloud service for migrating databases -- from on-premises to AWS, between AWS databases, or from one database engine to another. It handles schema conversion, data migration, and ongoing change replication. It's enterprise-grade infrastructure for moving entire databases, not for converting a JSON file into INSERT statements.
Core workflow
# Via AWS Console or CLI:
# 1. Create a replication instance (t3.medium ~$0.036/hr)
aws dms create-replication-instance ...
# 2. Create source endpoint (e.g., S3 with JSON files)
aws dms create-endpoint --endpoint-type source --engine-name s3 ...
# 3. Create target endpoint (e.g., PostgreSQL RDS)
aws dms create-endpoint --endpoint-type target --engine-name postgres ...
# 4. Create migration task
aws dms create-task --migration-type full-load ...
# 5. Start task and wait
aws dms start-task ...
# Total setup time: 1-2 hours
# Cost: ~$50/month minimum for a small replication instance
What AWS DMS gets right
- Full database migration. Migrates schemas, data, indexes, views, stored procedures, and constraints. Not just INSERT statements -- the entire database.
- Continuous replication. CDC (change data capture) keeps the target in sync with the source. Changes to the source are replicated to the target in near-real-time.
- 20+ database engines. Source and target can be different engines (Oracle to PostgreSQL, MySQL to Aurora, etc.). Schema conversion handles most type differences.
- Enterprise reliability. Monitoring, logging, failure recovery, and AWS support. Built for production workloads.
Where AWS DMS is overkill for JSON-to-SQL
- Setup overhead. Creating replication instances, endpoints, and migration tasks takes 1-2 hours. For converting a 5KB JSON file into INSERT statements, this is 100x more setup than the task requires.
- Cost. A small replication instance costs ~$50/month. You pay for compute even when you're not migrating data.
- Not designed for JSON files. DMS migrates between databases, not from JSON files to SQL. The S3 source endpoint can read JSON, but it expects a specific format and partitioning scheme.
- No CLI quick-start. Requires AWS Console or CLI configuration, IAM roles, VPC networking, and security groups. Not a developer tool -- it's infrastructure.
- No schema generation from JSON. DMS assumes the target schema already exists. It doesn't infer column types from JSON data.
Best for: Migrating entire production databases between engines or replicating data continuously between systems. Not appropriate for converting JSON files into SQL -- use json2sql for that.
Tool 4: Airbyte -- Open-Source ELT Platform
Airbyte -- ELT Platform with 350+ Connectors
- 350+ source connectors (APIs, databases, files, SaaS)
- Transformations via dbt integration
- Scheduled syncs and incremental extraction
Airbyte is an ELT (Extract, Load, Transform) platform that connects data sources (APIs, databases, SaaS platforms) to destinations (data warehouses, databases, lakes). It has 350+ pre-built connectors and handles scheduling, error recovery, and incremental extraction. It's the right tool when you have many data sources that need to flow into a central database or warehouse on a regular schedule.
Core workflow
# Self-hosted (Docker)
docker run -p 8000:8000 airbyte/airbyte
# Or use Airbyte Cloud (no infrastructure)
# 1. Create a source (e.g., REST API, S3, Stripe, etc.)
# 2. Create a destination (PostgreSQL, Snowflake, BigQuery, etc.)
# 3. Create a connection (source to destination)
# 4. Configure sync frequency (hourly, daily, etc.)
# 5. Airbyte extracts, normalizes, and loads data
# CLI alternative:
airbyte-cli configure ...
airbyte-cli sync ...
What Airbyte gets right
- 350+ connectors. Connects to almost any data source -- REST APIs, databases, SaaS platforms, file stores, message queues. No custom integration code needed.
- Scheduled syncs. Set it up once and Airbyte keeps data flowing on a schedule. No cron jobs or manual triggers.
- Incremental extraction. Only pulls new or changed records. Efficient for large, continuously updated data sources.
- dbt integration. Transform loaded data with dbt models. Full ELT pipeline from extraction to analytics-ready tables.
- Open source. Self-host for free. Full control over your data pipeline.
Where Airbyte is overkill for JSON-to-SQL
- Platform, not a tool. Airbyte is a data platform with a UI, scheduler, database, and container runtime. It's not a CLI command you run in a script. Setup takes 30-60 minutes even with Docker.
- Not designed for one-shot conversions. Airbyte shines at recurring data syncs. For "convert this JSON file to SQL once," it's the wrong abstraction level.
- No JSON file source. Airbyte's file connector expects CSV or structured file formats in S3/GCS. A local JSON file on your laptop isn't a supported source.
- Schema normalization is opinionated. Airbyte normalizes nested JSON into a specific schema format that may not match your desired table structure. json2sql's
--flattengives you more control over the relational model. - Cost at scale. Cloud pricing is credit-based. Self-hosted requires Docker, a database, and compute resources. Overkill for a developer who needs INSERT statements from a JSON file.
Best for: Teams building data pipelines with many sources flowing into a warehouse. Not appropriate for one-shot JSON-to-SQL conversion -- use json2sql for that.
Feature Comparison
| Capability | json2sql | Papa Parse + SQL | AWS DMS | Airbyte |
|---|---|---|---|---|
| JSON to SQL in one command | Yes | No (write script) | No (configure service) | No (configure pipeline) |
| Nested JSON flattening | Yes (--flatten) | No (manual) | No | ~ (opinionated) |
| CREATE TABLE generation | Yes (auto) | No (manual) | Yes (schema conversion) | ~ (normalization) |
| Multi-dialect output | Yes (Postgres/MySQL/SQLite) | No (write per dialect) | Yes (20+ engines) | Yes (many destinations) |
| Type inference from JSON | Yes (auto) | No (manual) | No | ~ (basic) |
| Stdin pipe support | Yes | Yes | No | No |
| CI/CD friendly | Yes (30 seconds) | ~ (custom script) | No (infrastructure) | No (platform) |
| Continuous data sync | No | No | Yes (CDC) | Yes (scheduled syncs) |
| Multi-source connectors | No | No | Yes (20+) | Yes (350+) |
| Setup time | 30 seconds | 30 minutes | 1-2 hours | 30-60 minutes |
| Works offline | Yes | Yes | No | ~ (self-host only) |
| Open source | Yes (MIT) | Yes (MIT) | No | Yes (core) |
Use Case Comparison
| Use Case | json2sql | Papa Parse + SQL | AWS DMS | Airbyte |
|---|---|---|---|---|
| Seed a test database from JSON fixtures | Ideal | Works | No | No |
| Convert API response to INSERT statements | Ideal | Works | No | No |
| Flatten nested JSON into relational tables | Ideal | No (manual) | No | ~ (opinionated) |
| Generate SQL for multiple database dialects | Ideal | No (manual) | Yes | Yes |
| CI/CD data pipeline step | Ideal | ~ (custom script) | No | No |
| Migrate entire production database | No | No | Ideal | ~ (possible) |
| Continuous data sync (CDC) | No | No | Ideal | Ideal |
| Multi-source data warehouse loading | No | No | ~ (possible) | Ideal |
Cost Comparison
| Cost Factor | json2sql | Papa Parse + SQL | AWS DMS | Airbyte |
|---|---|---|---|---|
| License/tool | MIT (free tier) | MIT (free) | Pay per use | Open source / Cloud paid |
| Dev time per conversion | 5 minutes | 30-60 minutes | 1-2 hours setup | 30-60 minutes setup |
| Ongoing cost | $9/mo or $49/mo Suite | $0 | ~$50/mo per instance | Free (self-host) / $2.50/credit |
| Full suite (11 tools) | $49/mo | N/A | N/A | N/A |
When to Use Which
Use json2sql when:
You have JSON data (files, API responses, fixtures) and need SQL INSERT statements. You want one command, not a script. You need nested JSON flattened into relational tables. You need output for PostgreSQL, MySQL, or SQLite. This covers 90% of developer JSON-to-SQL needs.
Use Papa Parse + manual SQL when:
You have a simple, flat JSON structure and need full control over the output format. You're already using Papa Parse and don't want another dependency. Your schema is stable and unlikely to change. Be honest: this approach saves 5 minutes of install time but costs 30+ minutes of scripting.
Use AWS DMS when:
You're migrating an entire production database from one engine to another or need continuous data replication between databases. This is infrastructure, not a developer tool. Don't use it for JSON-to-SQL conversion.
Use Airbyte when:
You're building a data pipeline with multiple sources flowing into a data warehouse on a regular schedule. Airbyte is a platform for recurring ELT workflows, not a CLI for one-shot conversions. Don't use it when you need INSERT statements from a JSON file.
The Complementary Stack
These four tools aren't competing -- they solve different problems at different scales. Here's how they fit together:
| Layer | Tool | Purpose |
|---|---|---|
| 1. Dev laptop | json2sql | Convert JSON fixtures and API responses to SQL instantly. Seed test databases in CI. One command, no scripts. |
| 2. One-off script | Papa Parse + SQL | Custom transformations that don't fit json2sql's conventions. Rare edge cases with unusual mapping requirements. |
| 3. Database migration | AWS DMS | Migrate production databases between engines. Keep replicas in sync with CDC. Enterprise compliance and reliability. |
| 4. Data platform | Airbyte | Multi-source data pipelines into a warehouse. Scheduled syncs, incremental extraction, dbt transformations. |
The key insight: most developers spend 90% of their JSON-to-SQL time on Layer 1 problems -- converting JSON files to INSERT statements for test databases, data migrations, and CI pipelines. That's the layer json2sql owns. The other three tools solve different problems at different scales.
Install json2sql
# Install via pip
pip install json2sql-cli
# Or via Homebrew (macOS/Linux)
brew tap Coding-Dev-Tools/tap
brew install json2sql
# Or via Scoop (Windows)
scoop bucket add Coding-Dev-Tools https://github.com/Coding-Dev-Tools/scoop-bucket
scoop install json2sql
# Convert your JSON
json2sql convert data.json --dialect postgres
Star json2sql on GitHub
Related Reading
- Convert JSON to SQL in One Command -- json2sql Tutorial -- getting started
- Generate CREATE TABLE from JSON: Schema-First Migration -- schema generation
- Nested JSON to Relational SQL: One Flag That Replaces Your ETL Script -- flattening nested objects
- Dead Code Detection Compared -- DeadCode vs knip vs ts-prune vs ESLint
- Config Drift Detection Compared -- ConfigDrift vs driftctl vs Terraform Plan vs Checkov