Databases Interview Questions

Foundations

SQL vs NoSQL

SQL: structured tables, ACID transactions, strong consistency (Postgres, MySQL). NoSQL: flexible schemas, horizontal scaling (MongoDB, DynamoDB).

Modeling

Normalization

Reduce redundancy via normal forms (1NF, 2NF, 3NF). Use denormalization selectively for read performance.

Performance

Indexes

B-tree indexes speed lookups and sorts. Trade-offs: extra storage and slower writes.

-- Postgres
CREATE INDEX users_email_idx ON users(email);
Reliability

Transactions & Isolation

ACID: Atomicity, Consistency, Isolation, Durability. Isolation levels: Read Committed, Repeatable Read, Serializable.

Querying

Joins

INNER, LEFT, RIGHT, FULL. Use indexes on join keys to improve performance.

SELECT u.name, o.total
FROM users u
JOIN orders o ON o.user_id = u.id;
Operations

Replication & Failover

Master-replica for read scaling and HA. Use health checks and automatic failover (e.g., Patroni).

Operations

Sharding

Split data across shards by key (user_id). Beware hot shards and cross-shard queries.

Development

ORMs vs Raw SQL

ORMs speed development; raw SQL offers control and performance. Mix both pragmatically.

Foundations

ACID vs BASE

ACID ensures reliable transactions; BASE favors availability and eventual consistency in distributed systems. Choose per workload and consistency needs.

Performance

Index types

B-tree (default), Hash (equality), GIN/GiST (arrays/full-text/geospatial). Pick indexes based on query patterns.

Operations

Connection pooling

Limit concurrent DB connections via pools; reuse connections to reduce overhead. Tune pool size per workload and DB limits.

Reliability

Backups and point-in-time restore

Use snapshots and WAL archiving; verify backups, test restores regularly, and define RPO/RTO.

Foundations

NoSQL models

Document (MongoDB), key-value (Redis), column-family (Cassandra), graph (Neo4j). Design schemas around access patterns.

SQL

Isolation levels

Read Uncommitted, Read Committed, Repeatable Read, Serializable; anomalies and trade-offs per level.

Distributed

CAP theorem

In network partitions, systems choose between consistency and availability. Many DBs prefer AP with eventual consistency.

Storage

Write-Ahead Logging (WAL)

Ensure durability by recording changes to a log before applying to pages; enables crash recovery and replication.

Performance

EXPLAIN and query optimization

Use EXPLAIN to inspect access paths and costs; add indexes, rewrite queries, and tune statistics.

Distributed

Two-phase commit and distributed transactions

Coordinators ensure atomicity across participants; can block under failures. Modern systems prefer idempotent operations and Sagas.