Engineering Note
Architecture

Database Choices That Hurt at Scale

Schema and Query Decisions with Long Shadows

9 min read
AdvancedArchitecture

Introduction

Database decisions made early in a project often seem harmless. They work well with small datasets and low traffic. But as the system grows, these decisions can become major bottlenecks.

Scaling problems are rarely caused by the database itself. They are usually caused by how the database is used.

The Problem

Many systems start with simple database setups that do not account for future scale.

  • No indexes on frequently queried columns
  • Large tables scanned for simple queries
  • Mixing transactional and analytical workloads
  • Repeated queries without caching

These issues are not visible early, but become critical under load.

System Design / Approach

Database design should be driven by how data is accessed, not just how it is stored.

  • Design indexes based on query patterns
  • Separate read-heavy and write-heavy workloads
  • Use caching for repeated queries
  • Avoid unnecessary joins in hot paths

The goal is to make common queries fast and predictable.

Implementation

Step 1: Add Indexes

Create indexes that match query patterns.


CREATE INDEX idx_users_email ON users(email);

Indexes reduce query time significantly.

Step 2: Optimize Queries

Avoid fetching unnecessary data.


SELECT id, name FROM users;

Smaller queries improve performance.

Step 3: Introduce Caching

Cache frequently accessed data.


const cached = await redis.get(key);

Caching reduces database load.

Step 4: Separate Workloads

Use replicas or separate databases for different workloads.


const readDb = replicaConnection;

This improves scalability and performance.

Trade-offs

Approach Benefit Cost
Indexing Faster queries Storage overhead
Caching Reduced load Cache invalidation
Replication Better scalability Operational complexity

Real-World Impact

  • Reduced query latency
  • Improved system performance under load
  • Better scalability for growing datasets
  • More predictable database behavior

Key Takeaways

Choosing a database based only on ease of use often leads to scaling issues later

Unindexed queries become a major bottleneck as data grows

Using a single database for all workloads creates contention and performance degradation

Ignoring read vs write patterns leads to inefficient system design

Schema design decisions are hard to change and impact long-term scalability

Future Improvements

Introduce proper indexing strategies based on query patterns

Separate read and write workloads using replicas

Use caching layers to reduce database load

Adopt data partitioning or sharding for large datasets

Continuously monitor query performance and optimize slow queries

Database Choices That Hurt at Scale | Tushar Kanti Dey