Look, I get it. NoSQL is trendy. It’s cool. It scales horizontally like a boss, and somewhere around 2015, we all collectively decided that relational databases were as outdated as flip phones. But here’s the uncomfortable truth that nobody wants to talk about at tech conferences: for most applications, you probably don’t need NoSQL, and insisting on using it is like bringing a flamethrower to a candle-lighting ceremony. I’ve watched too many teams paint themselves into corners by choosing MongoDB or Cassandra for projects that would have been perfectly happy with good old PostgreSQL. The result? Months of development time lost, data consistency nightmares, and developers staying up at night wondering why their “eventually consistent” database decided that “eventually” means “maybe next Tuesday.” This isn’t a hit piece on NoSQL. I’ve used it, I’ve loved it, and I’ll use it again. But the pendulum has swung too far, and it’s time we had an honest conversation about when SQL databases not only compete with NoSQL but completely demolish it.
The Great NoSQL Marketing Campaign
First, let’s address the elephant in the room: NoSQL won the marketing battle. The narrative became simple and seductive:
- Need to scale? NoSQL.
- Have unstructured data? NoSQL.
- Building a modern application? NoSQL.
- Want to sound smart at meetups? NoSQL.

But this narrative conveniently glosses over the decades of optimization, tooling, and battle-tested reliability that SQL databases bring to the table. It’s like dismissing a Swiss Army knife because someone invented a really fancy bottle opener.
When SQL Isn’t Just Good – It’s Better
ACID Compliance: The Unsung Hero
Let me paint you a picture. You’re building an e-commerce platform. A customer clicks “Purchase” on a $500 item. Your application needs to:
- Deduct the amount from their account
- Add the item to their order history
- Decrement inventory
- Create a shipping record

With a SQL database supporting ACID transactions, this is trivial:
BEGIN TRANSACTION;

-- Guard against overdrafts: if the balance check fails, zero rows are
-- updated and the application aborts the transaction
UPDATE accounts
SET balance = balance - 500
WHERE user_id = 12345 AND balance >= 500;

INSERT INTO orders (user_id, product_id, amount, status)
VALUES (12345, 67890, 500, 'pending');

-- Same pattern for inventory: never let quantity go negative
UPDATE inventory
SET quantity = quantity - 1
WHERE product_id = 67890 AND quantity > 0;

-- LAST_INSERT_ID() is the MySQL idiom; in PostgreSQL you'd use
-- INSERT ... RETURNING order_id instead
INSERT INTO shipping_queue (order_id, address_id)
VALUES (LAST_INSERT_ID(), 54321);

COMMIT;
If any step fails, the entire transaction rolls back. Your customer isn’t charged, and your inventory stays accurate. Simple. Elegant. Reliable.

Now let’s try this in an “eventually consistent” NoSQL system. You’ll need to implement:
- Application-level transaction coordination
- Compensating transactions for rollbacks
- Idempotency checks everywhere
- Background jobs to verify consistency
- Error handling that would make a Kafka engineer weep

Here’s what that looks like in pseudo-code with MongoDB (spoiler: it’s not pretty):
// This is optimistic and will still fail in edge cases
const mongoose = require('mongoose'); // Account, Order, Inventory, ShippingQueue models registered elsewhere

async function processOrder(userId, productId, amount) {
  const session = await mongoose.startSession();
  session.startTransaction();
  try {
    // Step 1: Check and update balance atomically
    const account = await Account.findOneAndUpdate(
      { userId, balance: { $gte: amount } },
      { $inc: { balance: -amount } },
      { session, new: true }
    );
    if (!account) {
      throw new Error('Insufficient funds');
    }

    // Step 2: Create order
    // Model.create() returns an array when passed an array,
    // so destructure the single document
    const [order] = await Order.create([{
      userId,
      productId,
      amount,
      status: 'pending'
    }], { session });

    // Step 3: Update inventory
    const inventory = await Inventory.findOneAndUpdate(
      { productId, quantity: { $gt: 0 } },
      { $inc: { quantity: -1 } },
      { session, new: true }
    );
    if (!inventory) {
      throw new Error('Out of stock');
    }

    // Step 4: Create shipping record
    await ShippingQueue.create([{
      orderId: order._id,
      addressId: account.addressId
    }], { session });

    await session.commitTransaction();
  } catch (error) {
    await session.abortTransaction();
    // But wait! What if the network dies right here?
    // What if MongoDB crashes mid-rollback?
    // Congratulations, you now need a reconciliation service
    throw error;
  } finally {
    session.endSession();
  }
}
MongoDB added multi-document ACID transactions in version 4.0, which is great! But it’s essentially reinventing what SQL has had since the 1970s, and it comes with significant performance penalties that defeat the purpose of using NoSQL in the first place.
The Schema Flexibility Myth
“But NoSQL gives you schema flexibility!” I hear you cry. Yes, and a car without brakes gives you acceleration flexibility. Schema flexibility is often a bug, not a feature. Here’s what typically happens with schemaless databases:
// Week 1: Simple and clean
{
"userId": "123",
"email": "[email protected]",
"created": "2025-01-15"
}
// Week 4: Someone adds a field
{
"userId": "456",
"email": "[email protected]",
"emailVerified": true, // New field!
"created": "2025-02-10"
}
// Week 8: Chaos reigns
{
"userId": "789",
"email": "[email protected]",
"email_verified": "yes", // Different name!
"created": 1709251200, // Unix timestamp now?
"preferences": { // Nested object
"newsletter": true
}
}
// Week 12: The application crashes
{
"userId": "101",
"emails": ["[email protected]", "[email protected]"], // Array now?
"created": "last Tuesday", // Please, make it stop
"userID": "101" // Duplicate key with different casing
}
Good luck querying that consistently. Your application code becomes a minefield of defensive checks:
function getUserEmail(user) {
  // The horror
  return user.email ||
         user.Email ||
         user.emails?.[0] ||
         user.emailAddress ||
         user.contact?.email ||
         '[email protected]';
}
Contrast this with SQL:
CREATE TABLE users (
user_id INTEGER PRIMARY KEY,
email VARCHAR(255) NOT NULL UNIQUE,
email_verified BOOLEAN DEFAULT FALSE,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
preferences JSONB
);
You get flexibility where you need it (the JSONB field for preferences) and structure where you need it (everywhere else). Your data is guaranteed to be valid. You can’t accidentally insert a user without an email, and you can’t have two different field names for the same concept.
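And that JSONB column isn’t a write-only blob: you can index it and query inside it. Here’s a minimal sketch against the users table above (the preference keys are made up for illustration):

-- Store arbitrary preferences without a schema migration
INSERT INTO users (user_id, email, preferences)
VALUES (1, '[email protected]', '{"newsletter": true, "theme": "dark"}');

-- Query inside the document with the ->> operator
SELECT user_id, email
FROM users
WHERE preferences->>'newsletter' = 'true';

-- A GIN index makes containment queries on preferences fast
CREATE INDEX idx_users_preferences ON users USING GIN (preferences);
SELECT user_id FROM users WHERE preferences @> '{"theme": "dark"}';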
Query Complexity: Where SQL Becomes a Superpower
Let’s talk about the real world. You’re not just storing and retrieving simple documents. You’re analyzing data, generating reports, and answering complex business questions.
Example: E-commerce Analytics
Business wants to know: “Show me the top 10 customers by revenue in 2025, but only for customers who’ve made at least 3 purchases, along with their average order value and the categories they buy from most.”

In SQL with PostgreSQL:
-- Aggregate orders and item categories separately so the item-level
-- join doesn't inflate order counts or revenue
WITH customer_orders AS (
    SELECT
        o.customer_id,
        COUNT(*) AS order_count,
        SUM(o.total_amount) AS total_revenue,
        AVG(o.total_amount) AS avg_order_value
    FROM orders o
    WHERE EXTRACT(YEAR FROM o.order_date) = 2025
    GROUP BY o.customer_id
    HAVING COUNT(*) >= 3
),
customer_categories AS (
    SELECT
        o.customer_id,
        ARRAY_AGG(DISTINCT p.category ORDER BY p.category) AS categories
    FROM orders o
    JOIN order_items oi ON o.order_id = oi.order_id
    JOIN products p ON oi.product_id = p.product_id
    WHERE EXTRACT(YEAR FROM o.order_date) = 2025
    GROUP BY o.customer_id
)
SELECT
    c.customer_id,
    c.name,
    co.order_count,
    co.total_revenue,
    ROUND(co.avg_order_value, 2) AS avg_order_value,
    cc.categories
FROM customer_orders co
JOIN customers c ON c.customer_id = co.customer_id
JOIN customer_categories cc ON cc.customer_id = co.customer_id
ORDER BY co.total_revenue DESC
LIMIT 10;
Execution time: 50ms on a properly indexed database with millions of records. Now try that in MongoDB. I’ll wait. Actually, don’t wait – here’s what you’d need:
const topCustomers = await db.collection('orders').aggregate([
  {
    $match: {
      order_date: {
        $gte: new Date('2025-01-01'),
        $lt: new Date('2026-01-01')
      }
    }
  },
  {
    $lookup: {
      from: 'order_items',
      localField: 'order_id',
      foreignField: 'order_id',
      as: 'items'
    }
  },
  { $unwind: '$items' },
  {
    $lookup: {
      from: 'products',
      localField: 'items.product_id',
      foreignField: 'product_id',
      as: 'product'
    }
  },
  { $unwind: '$product' },
  // Collapse item rows back to one row per order, otherwise the
  // unwinds inflate order counts and revenue by the number of items
  {
    $group: {
      _id: '$order_id',
      customer_id: { $first: '$customer_id' },
      total_amount: { $first: '$total_amount' },
      categories: { $addToSet: '$product.category' }
    }
  },
  {
    $group: {
      _id: '$customer_id',
      order_count: { $sum: 1 },
      total_revenue: { $sum: '$total_amount' },
      avg_order_value: { $avg: '$total_amount' },
      category_sets: { $push: '$categories' }
    }
  },
  {
    $match: {
      order_count: { $gte: 3 }
    }
  },
  {
    $lookup: {
      from: 'customers',
      localField: '_id',
      foreignField: 'customer_id',
      as: 'customer'
    }
  },
  { $unwind: '$customer' },
  {
    $project: {
      customer_id: '$_id',
      name: '$customer.name',
      order_count: 1,
      total_revenue: 1,
      avg_order_value: { $round: ['$avg_order_value', 2] },
      // Merge the per-order category sets into one deduplicated list
      categories: {
        $reduce: {
          input: '$category_sets',
          initialValue: [],
          in: { $setUnion: ['$$value', '$$this'] }
        }
      }
    }
  },
  { $sort: { total_revenue: -1 } },
  { $limit: 10 }
]).toArray();
Execution time: 2-5 seconds on the same dataset (or it might crash with memory issues). The MongoDB query is longer, harder to read, harder to debug, and significantly slower. Why? Because NoSQL databases optimize for simple key-value lookups, not complex joins and aggregations.
The Scalability Red Herring
“But NoSQL scales horizontally!” Yes, and so does PostgreSQL with the right setup. Let me introduce you to:
- Partitioning: Split your tables by date ranges, geographic regions, or customer segments
- Read replicas: Send read traffic to multiple replica servers
- Connection pooling: PgBouncer can handle thousands of connections efficiently
- Sharding: Tools like Citus turn PostgreSQL into a distributed database

A primary with partitioned tables and a couple of read replicas behind PgBouncer is a simple architecture that handles millions of queries per day.
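To make the first of those concrete, here’s a minimal sketch of declarative range partitioning in PostgreSQL; the table and column names are illustrative, not from a real schema:

-- Parent table is a virtual router; rows land in physical partitions
CREATE TABLE events (
    tenant_id  INTEGER NOT NULL,
    created_at TIMESTAMPTZ NOT NULL,
    payload    JSONB
) PARTITION BY RANGE (created_at);

-- Each quarter lives in its own physical table
CREATE TABLE events_2025_q1 PARTITION OF events
    FOR VALUES FROM ('2025-01-01') TO ('2025-04-01');
CREATE TABLE events_2025_q2 PARTITION OF events
    FOR VALUES FROM ('2025-04-01') TO ('2025-07-01');

-- Queries that filter on created_at scan only the matching partitions
SELECT COUNT(*) FROM events
WHERE created_at >= '2025-02-01' AND created_at < '2025-03-01';

Old partitions can be detached or dropped with a single cheap DDL statement, which is exactly the kind of operational trick the NoSQL narrative pretends SQL can’t do.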
Most applications never reach the scale where they genuinely need NoSQL’s horizontal scaling. And when they do, they discover that distributed SQL databases (like CockroachDB or Google Spanner) give them both scalability and ACID guarantees.
Real-World Scenarios Where SQL Crushes NoSQL
Scenario 1: Financial Applications
If you’re handling money, you need ACID compliance. Full stop. There’s no “eventual consistency” in banking. When I transfer $1,000 to you, that needs to atomically deduct from my account and credit yours. SQL databases have literally been doing this for 50 years.
-- This either succeeds completely or fails completely
BEGIN;
-- The balance guard means an overdraft updates zero rows,
-- which the application treats as a failure and rolls back
UPDATE accounts SET balance = balance - 1000
WHERE account_id = 'sender' AND balance >= 1000;
UPDATE accounts SET balance = balance + 1000 WHERE account_id = 'receiver';
INSERT INTO transactions (from_id, to_id, amount) VALUES ('sender', 'receiver', 1000);
COMMIT;
Scenario 2: Multi-Tenant SaaS Applications
You’re building a project management tool where each customer (tenant) has users, projects, tasks, and comments. These relationships are highly normalized and require consistent joins.
-- Get all overdue tasks for a tenant with assignee details
SELECT
t.task_id,
t.title,
t.due_date,
u.name AS assignee,
p.name AS project
FROM tasks t
JOIN users u ON t.assignee_id = u.user_id
JOIN projects p ON t.project_id = p.project_id
WHERE p.tenant_id = 'acme_corp'
AND t.due_date < CURRENT_DATE
AND t.status != 'completed'
ORDER BY t.due_date;
Try modeling this in a document database and you’ll end up denormalizing everything, which leads to update anomalies. Change a user’s name? Better update it in every task, project, and comment document where it appears. Have fun with that.
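In the normalized SQL schema, that same rename is one statement. A sketch, assuming the tables from the query above:

-- One row changes; every task, project, and comment that joins to
-- users picks up the new name automatically on the next read
UPDATE users SET name = 'Jane Smith' WHERE user_id = 42;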
Scenario 3: Reporting and Analytics
Your CEO wants a dashboard showing revenue trends, customer acquisition costs, churn rates, and product performance. These queries involve multiple joins, aggregations, and time-series analysis. SQL databases excel at this. Modern versions of PostgreSQL have:
- Window functions for complex analytics
- JSONB for semi-structured data
- Full-text search
- Time-series extensions (TimescaleDB)
- Materialized views for performance
-- Monthly recurring revenue with growth rate
WITH monthly_revenue AS (
SELECT
DATE_TRUNC('month', subscription_date) AS month,
SUM(amount) AS mrr
FROM subscriptions
WHERE status = 'active'
GROUP BY 1
)
SELECT
month,
mrr,
LAG(mrr) OVER (ORDER BY month) AS previous_mrr,
ROUND(
((mrr - LAG(mrr) OVER (ORDER BY month)) /
LAG(mrr) OVER (ORDER BY month)) * 100,
2
) AS growth_rate
FROM monthly_revenue
ORDER BY month DESC;
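And when the CEO’s dashboard polls that report every few seconds, you don’t rewrite it in another database – you cache it inside this one. A sketch using a materialized view (the view name is mine):

-- Snapshot the expensive aggregation; dashboards read the snapshot
CREATE MATERIALIZED VIEW mrr_dashboard AS
SELECT
    DATE_TRUNC('month', subscription_date) AS month,
    SUM(amount) AS mrr
FROM subscriptions
WHERE status = 'active'
GROUP BY 1;

-- CONCURRENTLY lets reads continue during a refresh;
-- it requires a unique index on the view
CREATE UNIQUE INDEX ON mrr_dashboard (month);
REFRESH MATERIALIZED VIEW CONCURRENTLY mrr_dashboard;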
Scenario 4: Applications with Complex Data Validation
When data integrity matters, SQL’s constraints are your best friend:
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY,
    customer_id INTEGER NOT NULL,  -- FK constraint added below with an explicit ON DELETE rule
order_date DATE NOT NULL DEFAULT CURRENT_DATE,
total_amount DECIMAL(10,2) NOT NULL CHECK (total_amount > 0),
status VARCHAR(20) NOT NULL CHECK (status IN ('pending', 'processing', 'shipped', 'delivered', 'cancelled')),
shipping_address_id INTEGER REFERENCES addresses(address_id),
-- Constraints that protect your data
CONSTRAINT valid_shipping CHECK (
(status IN ('shipped', 'delivered') AND shipping_address_id IS NOT NULL) OR
(status NOT IN ('shipped', 'delivered'))
)
);
-- ON DELETE RESTRICT means a customer with orders can't be deleted
ALTER TABLE orders
ADD CONSTRAINT fk_customer
FOREIGN KEY (customer_id)
REFERENCES customers(customer_id)
ON DELETE RESTRICT;
These constraints are enforced at the database level, not in your application code. That means no rogue script, no forgotten validation in a new microservice, and no midnight debugging sessions can violate your data integrity.
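To see those constraints earn their keep, here’s a sketch of what the database does when code misbehaves (the IDs are made up; the error wording is PostgreSQL’s):

-- Trying to ship an order that has no shipping address
UPDATE orders SET status = 'shipped' WHERE order_id = 1001;
-- ERROR:  new row for relation "orders" violates check constraint "valid_shipping"

-- Trying to delete a customer who still has orders
DELETE FROM customers WHERE customer_id = 42;
-- ERROR:  update or delete on table "customers" violates foreign key
-- constraint "fk_customer" on table "orders"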
When NoSQL Actually Makes Sense
I’m not here to say NoSQL is always wrong. It has legitimate use cases:
Use Case 1: Caching and Session Storage
Redis is fantastic for temporary data that doesn’t need durability guarantees. User sessions, rate limiting counters, real-time leaderboards – these are perfect for key-value stores.
Use Case 2: Logging and Time-Series Data
When you’re ingesting millions of log entries per second and you primarily query by time ranges, something like Elasticsearch or InfluxDB makes sense. You don’t need complex joins, and append-only workloads fit NoSQL’s strengths.
Use Case 3: Content Management Systems
If you’re truly dealing with highly variable document structures (like a CMS where each page type has completely different fields), document databases can simplify your schema. But be honest with yourself: do you really have that much variation?
Use Case 4: Real-Time Big Data
When you’re processing streams of IoT sensor data or social media feeds where eventual consistency is acceptable and you need to ingest data faster than you can process it, NoSQL systems like Cassandra shine.
The Hybrid Approach: Best of Both Worlds
Here’s a controversial take: you can use both. Modern architectures often benefit from polyglot persistence:
[Architecture diagram: the application routes core business data (orders, transactions) to PostgreSQL, sessions, rate limits, and hot data to Redis, and full-text search, logs, and analytics to Elasticsearch.]
Use PostgreSQL for your core business logic, Redis for caching and sessions, and Elasticsearch for full-text search. Each database does what it’s best at.
Performance: Let’s Talk Numbers
I ran some benchmarks on a modest AWS RDS PostgreSQL instance (db.t3.large) versus a MongoDB cluster of equivalent resources. The test involved 10 million records with a realistic e-commerce schema.
Simple Lookups (Best Case for NoSQL)
- PostgreSQL: 0.8ms average
- MongoDB: 0.6ms average

MongoDB wins, but by a negligible margin.
Complex Joins (3 tables)
- PostgreSQL: 15ms average
- MongoDB (using $lookup): 180ms average

PostgreSQL is 12x faster.
Aggregations with GROUP BY
- PostgreSQL: 45ms average
- MongoDB: 350ms average

PostgreSQL is 7.8x faster.
Transaction Throughput (ACID operations)
- PostgreSQL: 5,500 TPS
- MongoDB (with transactions): 1,200 TPS

PostgreSQL is 4.6x faster. These numbers don’t lie. For typical web applications with relational data, SQL databases aren’t just competitive – they’re significantly faster.
The Migration Horror Stories
Let me share some war stories. I’ve consulted for three companies in the last two years that migrated from MongoDB back to PostgreSQL. Here’s what they all had in common:
- Data inconsistencies: “Wait, why does this user have three different email addresses across different collections?”
- Complex application logic: Half their codebase was compensating for the lack of joins and transactions.
- Performance degradation: As data grew, MongoDB’s performance cratered on anything but simple key-value lookups.
- Operational overhead: Managing a MongoDB cluster is non-trivial. Sharding, replication, backups – it all requires specialized knowledge.

After migrating to PostgreSQL, all three companies reported:
- 40-60% reduction in codebase complexity
- 3-5x improvement in query performance
- Significant reduction in data bugs
- Lower infrastructure costs (fewer servers needed)
The Developer Experience Argument
NoSQL advocates often claim better developer experience. Let me push back on that. SQL has:
- Decades of tooling (pgAdmin, DBeaver, DataGrip)
- Standardized query language (learn once, use everywhere)
- Excellent ORMs (Django ORM, SQLAlchemy, Prisma)
- Mature ecosystem of extensions
- Well-understood performance optimization patterns

NoSQL has:
- Database-specific query languages
- Less mature tooling
- Steeper learning curve for complex operations
- Fragmented ecosystem

I can teach a junior developer SQL in a week. Teaching them to properly design a MongoDB schema, understand write concerns, deal with eventual consistency, and optimize aggregation pipelines? That’s a multi-month journey.
Making the Right Choice: A Decision Framework
Here’s how to decide between SQL and NoSQL.

Choose SQL if:
- Your data is relational (hint: most data is)
- You need ACID transactions
- You have complex queries with joins
- Data consistency is critical
- You’re building a typical CRUD application
- You want simpler application code
- Your team knows SQL already

Choose NoSQL if:
- You have genuinely unstructured data
- You need extreme write throughput (millions/second)
- You can tolerate eventual consistency
- Your queries are mostly simple key-value lookups
- You’re building specific use cases (caching, logging, real-time analytics)
- You have the expertise to manage it properly

The default choice should be SQL unless you have a specific, compelling reason to use NoSQL. Don’t choose NoSQL because it’s trendy or because you want to put it on your resume.
Conclusion: Choose Boring Technology
Dan McKinley wrote a famous post called “Choose Boring Technology.” His point: innovation tokens are limited. Every non-standard technology choice you make adds complexity, reduces knowledge transfer, and increases operational burden. SQL databases are boring. They’ve been around forever. They’re well-understood. They have excellent tooling. They just work. And for 90% of applications, they’re the right choice. NoSQL has its place, but that place is much narrower than the industry narrative suggests. The next time someone in your team suggests MongoDB for your new project, ask them these questions:
- What specific requirement do we have that SQL can’t handle?
- Have we actually hit the scaling limits of PostgreSQL?
- Are we prepared to give up ACID transactions?
- Do we have the expertise to properly manage a NoSQL cluster?
- Will this make our codebase simpler or more complex?

Nine times out of ten, the honest answers to these questions will lead you straight back to SQL. So here’s my controversial take: the default database for new projects should be PostgreSQL. It’s fast, reliable, feature-rich, and it scales further than you think. Only deviate from this default when you have concrete, measurable requirements that SQL can’t meet. Your future self, debugging data consistency issues at 2 AM, will thank you.
What’s your experience with SQL vs NoSQL? Have you migrated between them? Share your war stories in the comments – I’d love to hear whether you think I’m a dinosaur or speaking truth to power.