PostgreSQL Performance Optimization: From Slow to Fast
Poor database performance is one of the most common issues we encounter. This guide covers practical optimization techniques that deliver measurable results.
Understanding the Performance Problem
Before optimizing, you need to measure:
Key Metrics to Track
- Query execution time: Target <100ms for user-facing queries
- Connection pool usage: Should stay below 80% capacity
- Database CPU/Memory: Consistent high usage indicates problems
- Disk I/O: High IOPS can bottleneck performance
- Cache hit ratio: Should be >99% for most workloads
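To check the cache hit ratio, you can query PostgreSQL's own statistics views; here is a minimal check against pg_stat_database for the current database:
-- Buffer cache hit ratio for the current database
SELECT
round(blks_hit * 100.0 / NULLIF(blks_hit + blks_read, 0), 2) AS cache_hit_ratio
FROM pg_stat_database
WHERE datname = current_database();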
Enable Query Logging
-- Log slow queries (>500ms)
ALTER SYSTEM SET log_min_duration_statement = 500;
-- Log the duration of every completed statement (verbose; use carefully in production)
ALTER SYSTEM SET log_duration = on;
-- Apply changes
SELECT pg_reload_conf();
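To confirm the new values are active after the reload, a quick check against pg_settings works:
-- Verify the logging settings took effect
SELECT name, setting
FROM pg_settings
WHERE name IN ('log_min_duration_statement', 'log_duration');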
Finding Slow Queries
Using pg_stat_statements
-- Enable pg_stat_statements extension
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- Find top 10 slowest queries (column names below are for PostgreSQL 13+;
-- older versions use total_time/mean_time/max_time)
SELECT
query,
calls,
total_exec_time,
mean_exec_time,
max_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
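After deploying a fix, it can help to reset the counters so the next measurement reflects only the new behavior:
-- Clear accumulated statistics to measure the effect of a change
SELECT pg_stat_statements_reset();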
Real-World Example
Before optimization:
SELECT users.*, COUNT(orders.id) as order_count
FROM users
LEFT JOIN orders ON users.id = orders.user_id
WHERE users.created_at > '2024-01-01'
GROUP BY users.id
ORDER BY order_count DESC;
-- Execution time: 8,500ms
-- Rows scanned: 1,200,000
After optimization (see techniques below): 850ms (10x faster)
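Where does that time go? Running the query under EXPLAIN (ANALYZE, BUFFERS) is the quickest way to find out; sequential scans over large tables and row estimates far from the actual counts are the usual culprits:
-- Inspect the plan of the slow query above
EXPLAIN (ANALYZE, BUFFERS)
SELECT users.*, COUNT(orders.id) as order_count
FROM users
LEFT JOIN orders ON users.id = orders.user_id
WHERE users.created_at > '2024-01-01'
GROUP BY users.id
ORDER BY order_count DESC;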
Optimization Technique #1: Indexing
Index Strategy
Not all indexes are created equal. Here's our systematic approach:
-- Check which indexes are actually being used
SELECT
schemaname,
relname,
indexrelname,
idx_scan,
idx_tup_read,
idx_tup_fetch
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
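The inverse check is just as useful: tables that are sequentially scanned far more often than they are index-scanned are candidates for new indexes (a rough heuristic, not a guarantee):
-- Find tables with many sequential scans (potentially missing indexes)
SELECT
schemaname,
relname,
seq_scan,
idx_scan,
seq_tup_read
FROM pg_stat_user_tables
WHERE seq_scan > COALESCE(idx_scan, 0)
ORDER BY seq_tup_read DESC
LIMIT 10;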
Common Indexing Patterns
1. Composite Indexes (Order Matters):
-- Bad: two separate single-column indexes
CREATE INDEX idx_user_created ON users(created_at);
CREATE INDEX idx_user_status ON users(status);
-- Good: one composite index (equality column first, range column second)
CREATE INDEX idx_user_status_created ON users(status, created_at);
-- Query that benefits:
SELECT * FROM users
WHERE status = 'active'
AND created_at > '2024-01-01';
2. Partial Indexes (For Common Filters):
-- Only index active users (90% of data is inactive)
CREATE INDEX idx_active_users ON users(created_at)
WHERE status = 'active';
-- Index size: 100MB instead of 1GB
3. Expression Indexes (For Computed Values):
-- For case-insensitive email lookups
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
-- Now this query uses the index:
SELECT * FROM users WHERE LOWER(email) = '[email protected]';
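One operational note when adding any of these indexes on a busy table: a plain CREATE INDEX blocks writes for the duration of the build. CREATE INDEX CONCURRENTLY avoids that lock at the cost of a slower build (and it cannot run inside a transaction block):
-- Same composite index as above, built without blocking writes
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_user_status_created ON users(status, created_at);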
Optimization Technique #2: Query Rewriting
Problem: N+1 Queries
Bad (100 queries):
// Fetches users (1 query); the pg driver returns a result object, rows live on .rows
const { rows: users } = await db.query('SELECT * FROM users LIMIT 100');
// Makes 100 additional queries, one per user
for (const user of users) {
  const res = await db.query(
    'SELECT * FROM orders WHERE user_id = $1',
    [user.id]
  );
  user.orders = res.rows;
}
Good (1 query):
const result = await db.query(`
  SELECT
    users.*,
    COALESCE(
      json_agg(orders.*) FILTER (WHERE orders.id IS NOT NULL),
      '[]'
    ) as orders
  FROM users
  LEFT JOIN orders ON orders.user_id = users.id
  GROUP BY users.id
  LIMIT 100
`);
Problem: SELECT *
Bad:
SELECT * FROM users WHERE id = 123;
-- Returns 50 columns, including large JSONB fields
Good:
SELECT id, email, first_name, last_name FROM users WHERE id = 123;
-- Returns only needed columns
Impact: 10KB → 1KB response size
Problem: Inefficient JOINs
Bad:
SELECT users.*, orders.*
FROM users
JOIN orders ON users.id = orders.user_id
WHERE users.email = '[email protected]';
-- Without an index on orders.user_id, the join scans the entire orders table
Good:
-- Add an index on the join column first
CREATE INDEX idx_orders_user_id ON orders(user_id);
SELECT users.*, orders.*
FROM users
JOIN orders ON users.id = orders.user_id
WHERE users.email = '[email protected]';
-- The same query now uses an index scan on orders.user_id
Optimization Technique #3: Connection Pooling
The Problem
Creating new database connections is expensive:
- Connection time: 50-100ms each
- Memory overhead: 10MB per connection
- Max connections: Usually 100-500
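Before adding a pooler, check how close you actually are to the limit; pg_stat_activity shows current usage:
-- Current connections vs. the configured maximum
SELECT count(*) AS current_connections,
current_setting('max_connections') AS max_connections
FROM pg_stat_activity;
-- Breakdown by state (many idle connections usually means pooling will help)
SELECT state, count(*) FROM pg_stat_activity GROUP BY state ORDER BY count(*) DESC;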
The Solution: pgBouncer
# pgbouncer.ini
[databases]
myapp = host=localhost dbname=production
[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25
reserve_pool_size = 5
reserve_pool_timeout = 3
Configuration in Application
// Without pooling (Bad)
const { Client } = require('pg');
const client = new Client({ connectionString: process.env.DATABASE_URL });
// With pooling (Good)
const { Pool } = require('pg');
const pool = new Pool({
connectionString: process.env.DATABASE_URL,
max: 20,
idleTimeoutMillis: 30000,
connectionTimeoutMillis: 2000,
});
Optimization Technique #4: Caching
Query Result Caching
const redis = require('redis');
const client = redis.createClient();
client.connect(); // node-redis v4+ requires an explicit connect before issuing commands
async function getUser(userId) {
  const cacheKey = `user:${userId}`;
  // Check cache first
  const cached = await client.get(cacheKey);
  if (cached) return JSON.parse(cached);
  // Query database (pg returns a result object; the row is on .rows)
  const result = await db.query('SELECT * FROM users WHERE id = $1', [userId]);
  const user = result.rows[0];
  // Cache for 5 minutes
  await client.setEx(cacheKey, 300, JSON.stringify(user));
  return user;
}
Materialized Views
For complex aggregations that don't need real-time data:
-- Create materialized view
CREATE MATERIALIZED VIEW user_statistics AS
SELECT
users.id,
users.email,
COUNT(DISTINCT orders.id) as total_orders,
SUM(orders.amount) as total_spent,
MAX(orders.created_at) as last_order_date
FROM users
LEFT JOIN orders ON orders.user_id = users.id
GROUP BY users.id, users.email;
-- Create a UNIQUE index on the materialized view (required for CONCURRENTLY refresh)
CREATE UNIQUE INDEX idx_user_stats_id ON user_statistics(id);
-- Refresh periodically (run via cron)
REFRESH MATERIALIZED VIEW CONCURRENTLY user_statistics;
Performance: Complex query from 5s to 10ms
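If the pg_cron extension is available (an assumption; a system cron job calling psql works just as well), the refresh can be scheduled inside the database itself:
-- Assumes pg_cron is installed; job name and interval are illustrative
SELECT cron.schedule(
'refresh-user-statistics',
'*/15 * * * *',
'REFRESH MATERIALIZED VIEW CONCURRENTLY user_statistics'
);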
Optimization Technique #5: Partitioning
For large tables (>10M rows), partitioning can dramatically improve performance:
-- Create partitioned table
CREATE TABLE orders (
id BIGSERIAL,
user_id INTEGER,
created_at TIMESTAMP,
amount DECIMAL
) PARTITION BY RANGE (created_at);
-- Create partitions
CREATE TABLE orders_2024_q1 PARTITION OF orders
FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');
CREATE TABLE orders_2024_q2 PARTITION OF orders
FOR VALUES FROM ('2024-04-01') TO ('2024-07-01');
-- Queries automatically use correct partition
SELECT * FROM orders
WHERE created_at BETWEEN '2024-02-01' AND '2024-02-28';
-- Only scans orders_2024_q1 partition
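Two small additions worth making: a default partition so rows outside the defined ranges are not rejected, and an EXPLAIN check to confirm partition pruning is actually happening:
-- Catch-all partition for rows outside the defined ranges
CREATE TABLE orders_default PARTITION OF orders DEFAULT;
-- Verify pruning: the plan should list only orders_2024_q1
EXPLAIN SELECT * FROM orders
WHERE created_at BETWEEN '2024-02-01' AND '2024-02-28';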
Optimization Technique #6: VACUUM and ANALYZE
PostgreSQL requires regular maintenance:
-- Manual vacuum (reclaim space)
VACUUM ANALYZE users;
-- Check table bloat
SELECT
schemaname,
relname,
pg_size_pretty(pg_total_relation_size(relid)) as size,
n_dead_tup,
n_live_tup,
round(n_dead_tup * 100.0 / NULLIF(n_live_tup + n_dead_tup, 0), 2) as dead_ratio
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;
Autovacuum should already be on by default; tune its per-table thresholds for frequently updated tables:
ALTER TABLE users SET (autovacuum_vacuum_scale_factor = 0.1);
ALTER TABLE users SET (autovacuum_analyze_scale_factor = 0.05);
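To confirm autovacuum is actually keeping up, pg_stat_user_tables records when each table was last vacuumed and analyzed:
-- When were tables last vacuumed/analyzed (manually or by autovacuum)?
SELECT
relname,
last_vacuum,
last_autovacuum,
last_analyze,
last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;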
Real-World Case Study
Client: E-commerce platform with 5M users
Problems:
- Homepage load time: 8 seconds
- Checkout timeout errors: 15% of transactions
- Database CPU: 95% constantly
Solutions Applied:
- Added 12 strategic indexes (composite + partial)
- Implemented Redis caching for product catalog
- Partitioned orders table by month
- Set up pgBouncer with transaction pooling
- Created materialized views for analytics dashboard
Results:
- Homepage load time: 1.2 seconds (85% improvement)
- Checkout errors: <1% (93% reduction)
- Database CPU: 35% average
- Cost savings: $2,400/month (smaller RDS instance)
Monitoring Setup
Use pg_stat_statements + Grafana dashboard:
-- Query for Prometheus/Grafana
SELECT
query,
calls,
mean_exec_time,
max_exec_time,
rows
FROM pg_stat_statements
WHERE mean_exec_time > 100
ORDER BY mean_exec_time DESC;
Quick Wins Checklist
✅ Enable pg_stat_statements
✅ Log queries >500ms
✅ Add indexes for WHERE/JOIN columns
✅ Use connection pooling
✅ Cache frequently accessed data
✅ Avoid SELECT *
✅ Fix N+1 queries
✅ Monitor cache hit ratio
✅ Set up autovacuum
✅ Review EXPLAIN ANALYZE for slow queries
Conclusion
Database optimization is iterative:
- Measure current performance
- Identify bottlenecks with pg_stat_statements
- Optimize queries and indexes
- Monitor results
- Repeat
Most performance issues can be resolved with proper indexing and query optimization before needing to scale hardware.
Need help optimizing your database? Schedule a performance audit or download our PostgreSQL optimization checklist.