Chapter 5: Performance Optimization in DynamoDB

Introduction

Performance optimization in DynamoDB requires understanding how data distribution, access patterns, and throughput settings interact with the underlying distributed architecture. Unlike traditional databases where performance tuning often means adding indexes or rewriting queries, DynamoDB performance is fundamentally determined by how well your data model distributes load across the partition infrastructure.

The single most common performance problem in DynamoDB is the “hot partition”: a situation where traffic concentrates on a small number of partitions while others sit idle. This happens because DynamoDB distributes capacity evenly across partitions, so a table provisioned for 10,000 RCUs across 10 partitions gives each partition only 1,000 RCUs. If 90% of your traffic hits one partition, you will be throttled even though the table has plenty of aggregate capacity. Understanding and avoiding this pattern is the difference between a DynamoDB deployment that hums along at single-digit millisecond latency and one that drowns in ProvisionedThroughputExceededException errors.

DynamoDB’s performance model has evolved significantly since launch. The original 2012 service had rigid partition-level throughput limits that punished even slight imbalances. Over time, AWS introduced burst capacity (2016), adaptive capacity (2018), on-demand mode (2018), and instant adaptive capacity (2019) to soften these sharp edges. Understanding this evolution matters because many older blog posts and Stack Overflow answers describe limitations that no longer exist, and many teams over-engineer their partition strategies for problems that adaptive capacity now handles automatically.

This chapter explores techniques for maximizing throughput, minimizing latency, and efficiently utilizing DynamoDB’s capacity, covering both the timeless fundamentals and the modern features that have changed best practices.
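
The hot-partition arithmetic from the introduction can be sketched as a small calculation. This is a deliberately simplified model: it assumes capacity is split evenly across partitions and ignores burst and adaptive capacity, which soften the limit in practice. The function names are illustrative, not part of any AWS API.

```javascript
// Simplified model (assumption): provisioned capacity is split evenly
// across partitions, with no burst or adaptive capacity applied.
function perPartitionCapacity(tableCapacity, partitionCount) {
  return tableCapacity / partitionCount;
}

// Returns true when the hottest partition receives more traffic than its
// even share of the table's capacity. hotShare is a fraction (0.9 = 90%).
function isPartitionThrottled(tableCapacity, partitionCount, totalTraffic, hotShare) {
  const partitionLimit = perPartitionCapacity(tableCapacity, partitionCount);
  const hotTraffic = totalTraffic * hotShare;
  return hotTraffic > partitionLimit;
}

console.log(perPartitionCapacity(10000, 10));            // 1000
console.log(isPartitionThrottled(10000, 10, 5000, 0.9)); // true
```

In the second call the table uses only half of its 10,000 RCUs, yet the hot partition sees 4,500 reads per second against a 1,000 RCU share.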

Understanding DynamoDB Performance Fundamentals

Read and Write Capacity Units

DynamoDB’s performance is measured in capacity units, a concept borrowed from the idea of “request units” that Azure Cosmos DB also adopted. The capacity unit abstraction is DynamoDB’s way of hiding the underlying hardware complexity (SSD IOPS, network bandwidth, replication overhead) behind a simple, predictable billing model. One important nuance that trips up many engineers: capacity units are calculated based on item size rounded up to the nearest boundary (4KB for reads, 1KB for writes), so a 4.1KB item costs the same as an 8KB item for reads. This rounding behavior makes item size optimization one of the highest-leverage performance improvements you can make.

Read Capacity Units (RCUs):
  • 1 RCU = one strongly consistent read per second for items up to 4KB
  • 1 RCU = two eventually consistent reads per second for items up to 4KB
  • Transactional reads = 2 RCUs per item per second
Write Capacity Units (WCUs):
  • 1 WCU = one write per second for items up to 1KB
  • Transactional writes = 2 WCUs per item per second
<svg viewBox="0 0 900 700" xmlns="http://www.w3.org/2000/svg">
  <!-- Title -->
  <text x="450" y="30" font-size="18" font-weight="bold" text-anchor="middle" fill="#333">
    Capacity Units Calculation
  </text>

  <!-- Read Capacity Section -->
  <rect x="50" y="60" width="800" height="300" fill="#e3f2fd" stroke="#1976d2" stroke-width="2" rx="5"/>
  <text x="450" y="90" font-size="16" font-weight="bold" text-anchor="middle" fill="#1976d2">
    Read Capacity Units (RCUs)
  </text>

  <!-- Strong Consistent Reads -->
  <rect x="80" y="110" width="360" height="220" fill="#fff" stroke="#1976d2" stroke-width="2" rx="3"/>
  <text x="260" y="140" font-size="14" font-weight="bold" text-anchor="middle" fill="#333">
    Strong Consistency
  </text>

  <text x="100" y="170" font-size="12" fill="#333">1 RCU =</text>
  <text x="120" y="195" font-size="11" fill="#666">• 1 strongly consistent read/sec</text>
  <text x="120" y="215" font-size="11" fill="#666">• Up to 4 KB per item</text>

  <rect x="100" y="235" width="320" height="80" fill="#f5f5f5" stroke="#1976d2" stroke-width="1" rx="3"/>
  <text x="260" y="260" font-size="12" font-weight="bold" text-anchor="middle" fill="#333">
    Example
  </text>
  <text x="120" y="280" font-size="11" fill="#666">Item size: 3.5 KB</text>
  <text x="120" y="297" font-size="11" fill="#666">Reads/sec: 10 (strong)</text>
  <text x="120" y="314" font-size="11" fill="#4caf50" font-weight="bold">Required: 10 RCUs</text>

  <!-- Eventually Consistent Reads -->
  <rect x="460" y="110" width="360" height="220" fill="#fff" stroke="#4caf50" stroke-width="2" rx="3"/>
  <text x="640" y="140" font-size="14" font-weight="bold" text-anchor="middle" fill="#333">
    Eventual Consistency
  </text>

  <text x="480" y="170" font-size="12" fill="#333">1 RCU =</text>
  <text x="500" y="195" font-size="11" fill="#666">• 2 eventually consistent reads/sec</text>
  <text x="500" y="215" font-size="11" fill="#666">• Up to 4 KB per item</text>

  <rect x="480" y="235" width="320" height="80" fill="#f5f5f5" stroke="#4caf50" stroke-width="1" rx="3"/>
  <text x="640" y="260" font-size="12" font-weight="bold" text-anchor="middle" fill="#333">
    Example
  </text>
  <text x="500" y="280" font-size="11" fill="#666">Item size: 3.5 KB</text>
  <text x="500" y="297" font-size="11" fill="#666">Reads/sec: 10 (eventual)</text>
  <text x="500" y="314" font-size="11" fill="#4caf50" font-weight="bold">Required: 5 RCUs</text>

  <!-- Write Capacity Section -->
  <rect x="50" y="380" width="800" height="280" fill="#fff3e0" stroke="#ff9800" stroke-width="2" rx="5"/>
  <text x="450" y="410" font-size="16" font-weight="bold" text-anchor="middle" fill="#ff9800">
    Write Capacity Units (WCUs)
  </text>

  <!-- Standard Writes -->
  <rect x="80" y="430" width="360" height="220" fill="#fff" stroke="#ff9800" stroke-width="2" rx="3"/>
  <text x="260" y="460" font-size="14" font-weight="bold" text-anchor="middle" fill="#333">
    Standard Writes
  </text>

  <text x="100" y="490" font-size="12" fill="#333">1 WCU =</text>
  <text x="120" y="515" font-size="11" fill="#666">• 1 write per second</text>
  <text x="120" y="535" font-size="11" fill="#666">• Up to 1 KB per item</text>

  <rect x="100" y="555" width="320" height="80" fill="#f5f5f5" stroke="#ff9800" stroke-width="1" rx="3"/>
  <text x="260" y="580" font-size="12" font-weight="bold" text-anchor="middle" fill="#333">
    Example
  </text>
  <text x="120" y="600" font-size="11" fill="#666">Item size: 2.5 KB</text>
  <text x="120" y="617" font-size="11" fill="#666">Writes/sec: 15</text>
  <text x="120" y="634" font-size="11" fill="#4caf50" font-weight="bold">Required: 45 WCUs (3 × 15)</text>

  <!-- Transactional Writes -->
  <rect x="460" y="430" width="360" height="220" fill="#fff" stroke="#f44336" stroke-width="2" rx="3"/>
  <text x="640" y="460" font-size="14" font-weight="bold" text-anchor="middle" fill="#333">
    Transactional Writes
  </text>

  <text x="480" y="490" font-size="12" fill="#333">1 Transactional WCU =</text>
  <text x="500" y="515" font-size="11" fill="#666">• 1 write per second</text>
  <text x="500" y="535" font-size="11" fill="#666">• Up to 1 KB per item</text>
  <text x="500" y="555" font-size="11" fill="#f44336">• Costs 2× standard writes</text>

  <rect x="480" y="575" width="320" height="60" fill="#f5f5f5" stroke="#f44336" stroke-width="1" rx="3"/>
  <text x="640" y="600" font-size="12" font-weight="bold" text-anchor="middle" fill="#333">
    Example
  </text>
  <text x="500" y="620" font-size="11" fill="#666">Item size: 1 KB, 10 transactional writes/sec</text>
  <text x="500" y="637" font-size="11" fill="#f44336" font-weight="bold">Required: 20 WCUs</text>
</svg>

Calculating Required Capacity

// RCU calculation helper
function calculateRCUs(itemSizeKB, readsPerSecond, stronglyConsistent = false) {
  const unitsPerRead = Math.ceil(itemSizeKB / 4);
  const multiplier = stronglyConsistent ? 1 : 0.5;
  return Math.ceil(unitsPerRead * readsPerSecond * multiplier);
}

// WCU calculation helper
function calculateWCUs(itemSizeKB, writesPerSecond, transactional = false) {
  const unitsPerWrite = Math.ceil(itemSizeKB / 1);
  const multiplier = transactional ? 2 : 1;
  return Math.ceil(unitsPerWrite * writesPerSecond * multiplier);
}

// Examples
console.log('RCUs for 10 eventual reads/sec of 3KB items:',
  calculateRCUs(3, 10, false)); // Output: 5

console.log('WCUs for 15 writes/sec of 2.5KB items:',
  calculateWCUs(2.5, 15)); // Output: 45

console.log('WCUs for 10 transactional writes/sec of 1KB items:',
  calculateWCUs(1, 10, true)); // Output: 20

Partition Key Design for Performance

Hot Partition Problem

<svg viewBox="0 0 900 600" xmlns="http://www.w3.org/2000/svg">
  <!-- Title -->
  <text x="450" y="30" font-size="18" font-weight="bold" text-anchor="middle" fill="#333">
    Hot Partition vs Distributed Load
  </text>

  <!-- Bad Design (Hot Partition) -->
  <rect x="50" y="60" width="380" height="480" fill="#ffebee" stroke="#f44336" stroke-width="2" rx="5"/>
  <text x="240" y="90" font-size="15" font-weight="bold" text-anchor="middle" fill="#f44336">
    BAD: Hot Partition
  </text>

  <!-- Partition 1 (Overloaded) -->
  <rect x="80" y="110" width="320" height="140" fill="#fff" stroke="#f44336" stroke-width="3" rx="3"/>
  <text x="240" y="135" font-size="13" font-weight="bold" text-anchor="middle" fill="#333">
    Partition 1: STATUS#ACTIVE
  </text>
  <text x="240" y="160" font-size="11" text-anchor="middle" fill="#f44336">
    90% of all requests
  </text>
  <rect x="100" y="175" width="280" height="30" fill="#f44336" stroke="#d32f2f" stroke-width="1" rx="2"/>
  <text x="240" y="195" font-size="11" font-weight="bold" text-anchor="middle" fill="#fff">
    THROTTLED! 900 RPS
  </text>
  <text x="100" y="225" font-size="10" fill="#666">Load: ████████████████████</text>
  <text x="100" y="242" font-size="10" fill="#666">Capacity: ██████</text>

  <!-- Partition 2 (Underutilized) -->
  <rect x="80" y="270" width="320" height="110" fill="#fff" stroke="#4caf50" stroke-width="1" rx="3"/>
  <text x="240" y="295" font-size="13" font-weight="bold" text-anchor="middle" fill="#333">
    Partition 2: STATUS#PENDING
  </text>
  <text x="240" y="320" font-size="11" text-anchor="middle" fill="#666">
    5% of requests
  </text>
  <text x="100" y="345" font-size="10" fill="#666">Load: ██</text>
  <text x="100" y="362" font-size="10" fill="#666">Capacity: ██████</text>

  <!-- Partition 3 (Underutilized) -->
  <rect x="80" y="400" width="320" height="110" fill="#fff" stroke="#4caf50" stroke-width="1" rx="3"/>
  <text x="240" y="425" font-size="13" font-weight="bold" text-anchor="middle" fill="#333">
    Partition 3: STATUS#INACTIVE
  </text>
  <text x="240" y="450" font-size="11" text-anchor="middle" fill="#666">
    5% of requests
  </text>
  <text x="100" y="475" font-size="10" fill="#666">Load: ██</text>
  <text x="100" y="492" font-size="10" fill="#666">Capacity: ██████</text>

  <!-- Good Design (Distributed) -->
  <rect x="470" y="60" width="380" height="480" fill="#e8f5e9" stroke="#4caf50" stroke-width="2" rx="5"/>
  <text x="660" y="90" font-size="15" font-weight="bold" text-anchor="middle" fill="#4caf50">
    GOOD: Distributed Load
  </text>

  <!-- Partition 1 (Balanced) -->
  <rect x="500" y="110" width="320" height="110" fill="#fff" stroke="#4caf50" stroke-width="2" rx="3"/>
  <text x="660" y="135" font-size="13" font-weight="bold" text-anchor="middle" fill="#333">
    Partition 1: USER#001-333
  </text>
  <text x="660" y="160" font-size="11" text-anchor="middle" fill="#4caf50">
    33% of requests
  </text>
  <text x="520" y="185" font-size="10" fill="#666">Load: ██████</text>
  <text x="520" y="202" font-size="10" fill="#666">Capacity: ██████</text>

  <!-- Partition 2 (Balanced) -->
  <rect x="500" y="240" width="320" height="110" fill="#fff" stroke="#4caf50" stroke-width="2" rx="3"/>
  <text x="660" y="265" font-size="13" font-weight="bold" text-anchor="middle" fill="#333">
    Partition 2: USER#334-666
  </text>
  <text x="660" y="290" font-size="11" text-anchor="middle" fill="#4caf50">
    33% of requests
  </text>
  <text x="520" y="315" font-size="10" fill="#666">Load: ██████</text>
  <text x="520" y="332" font-size="10" fill="#666">Capacity: ██████</text>

  <!-- Partition 3 (Balanced) -->
  <rect x="500" y="370" width="320" height="110" fill="#fff" stroke="#4caf50" stroke-width="2" rx="3"/>
  <text x="660" y="395" font-size="13" font-weight="bold" text-anchor="middle" fill="#333">
    Partition 3: USER#667-999
  </text>
  <text x="660" y="420" font-size="11" text-anchor="middle" fill="#4caf50">
    34% of requests
  </text>
  <text x="520" y="445" font-size="10" fill="#666">Load: ██████</text>
  <text x="520" y="462" font-size="10" fill="#666">Capacity: ██████</text>

  <!-- Note -->
  <rect x="150" y="555" width="600" height="30" fill="#fff9c4" stroke="#fbc02d" stroke-width="1" rx="3"/>
  <text x="450" y="575" font-size="11" text-anchor="middle" fill="#333">
    Use high-cardinality partition keys to distribute load evenly
  </text>
</svg>

Strategies for Avoiding Hot Partitions

The goal of every partition key strategy is to spread requests as uniformly as possible across partitions. This is the same fundamental challenge that consistent hashing (used in Dynamo, Cassandra, and memcached) was designed to solve at the infrastructure level. At the application level, you control the distribution by choosing partition key values that map roughly uniformly across the hash space. The three strategies below are listed in order of preference: start with high-cardinality keys (which solve most cases), escalate to write sharding only when you have a genuinely hot aggregate (like a global counter), and use time-based partitioning for time-series data where queries are naturally scoped to time windows.

Strategy 1: Use High-Cardinality Keys
// BAD: Low cardinality (few unique values)
{
  PK: `STATUS#${status}`,  // Only 3-4 possible values
  SK: `ORDER#${orderId}`
}

// GOOD: High cardinality (many unique values)
{
  PK: `USER#${userId}`,    // Millions of possible values
  SK: `ORDER#${orderId}`
}

// GOOD: Composite high-cardinality key
{
  PK: `CUSTOMER#${customerId}#DATE#${date}`,
  SK: `TRANSACTION#${transactionId}`
}
Strategy 2: Write Sharding
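// Shared client used by the examples in this chapter (AWS SDK for JavaScript v2)
const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB.DocumentClient();

// Add random shard suffix to distribute writes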
const writeWithSharding = async (item) => {
  const shardCount = 10;
  const shardId = Math.floor(Math.random() * shardCount);

  await dynamodb.put({
    TableName: 'Metrics',
    Item: {
      PK: `COUNTER#${item.name}#SHARD#${shardId}`,
      SK: `TIMESTAMP#${Date.now()}`,
      value: item.value,
      shardId: shardId
    }
  }).promise();
};

// Read from all shards and aggregate
const readShardedCounter = async (counterName) => {
  const shardCount = 10;
  const results = await Promise.all(
    Array.from({ length: shardCount }, (_, i) =>
      dynamodb.query({
        TableName: 'Metrics',
        KeyConditionExpression: 'PK = :pk',
        ExpressionAttributeValues: {
          ':pk': `COUNTER#${counterName}#SHARD#${i}`
        }
      }).promise()
    )
  );

  // Aggregate results
  return results.reduce((total, result) => {
    return total + result.Items.reduce((sum, item) => sum + item.value, 0);
  }, 0);
};
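
A single Query call returns at most 1 MB of data, so a shard holding many counter items may need several pages before the sum is complete. The sketch below follows LastEvaluatedKey until the result set is exhausted. The helper name `queryAllPages` is hypothetical, and `client` is assumed to be an SDK v2 DocumentClient (or anything with the same `query(...).promise()` shape).

```javascript
// Collects every page of a Query by following LastEvaluatedKey.
// `client` is assumed to expose query(params).promise() like the
// AWS SDK v2 DocumentClient; `queryAllPages` is a hypothetical helper.
async function queryAllPages(client, params) {
  const items = [];
  let exclusiveStartKey;

  do {
    // Only attach ExclusiveStartKey once a previous page returned one
    const pageParams = exclusiveStartKey
      ? { ...params, ExclusiveStartKey: exclusiveStartKey }
      : params;

    const page = await client.query(pageParams).promise();
    items.push(...(page.Items || []));
    exclusiveStartKey = page.LastEvaluatedKey;
  } while (exclusiveStartKey);

  return items;
}
```

Inside a shard reader, each per-shard `dynamodb.query(...)` call could be swapped for `queryAllPages(dynamodb, {...})` so that large shards are summed in full.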
Strategy 3: Time-Based Partitioning
// Distribute time-series data by time windows
const writeTimeSeriesData = async (sensorId, reading) => {
  const timestamp = new Date();
  const hour = timestamp.toISOString().substring(0, 13); // YYYY-MM-DDTHH

  await dynamodb.put({
    TableName: 'SensorData',
    Item: {
      PK: `SENSOR#${sensorId}#HOUR#${hour}`,
      SK: `TIMESTAMP#${timestamp.toISOString()}`,
      temperature: reading.temperature,
      humidity: reading.humidity
    }
  }).promise();
};

// Query specific time window
const querySensorData = async (sensorId, startHour, endHour) => {
  // Generate list of hour partitions to query
  const hours = generateHourRange(startHour, endHour);

  const results = await Promise.all(
    hours.map(hour =>
      dynamodb.query({
        TableName: 'SensorData',
        KeyConditionExpression: 'PK = :pk',
        ExpressionAttributeValues: {
          ':pk': `SENSOR#${sensorId}#HOUR#${hour}`
        }
      }).promise()
    )
  );

  return results.flatMap(r => r.Items);
};
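
The query example calls `generateHourRange`, which is not defined in the chapter. One possible implementation is sketched below, assuming both arguments use the same `YYYY-MM-DDTHH` UTC format that the write path produces and that the range is inclusive at both ends.

```javascript
// Hypothetical helper: lists every hour bucket from startHour to endHour
// (inclusive). Inputs and outputs use the 'YYYY-MM-DDTHH' UTC format,
// matching the partition keys written by writeTimeSeriesData.
function generateHourRange(startHour, endHour) {
  const HOUR_MS = 60 * 60 * 1000;
  const start = Date.parse(`${startHour}:00:00Z`);
  const end = Date.parse(`${endHour}:00:00Z`);

  if (Number.isNaN(start) || Number.isNaN(end)) {
    throw new RangeError('Hours must be formatted as YYYY-MM-DDTHH');
  }

  const hours = [];
  for (let t = start; t <= end; t += HOUR_MS) {
    hours.push(new Date(t).toISOString().substring(0, 13));
  }
  return hours;
}

console.log(generateHourRange('2024-01-01T22', '2024-01-02T01'));
// [ '2024-01-01T22', '2024-01-01T23', '2024-01-02T00', '2024-01-02T01' ]
```

Each returned hour becomes one partition to query, so wide time windows fan out into many requests; a day-long window already means 24 queries per sensor.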

Deep Dive: Burst and Adaptive Capacity

DynamoDB provides built-in mechanisms to handle temporary spikes and sustained imbalances in traffic. These features — burst capacity and adaptive capacity — were added over time as AWS learned from real customer workloads. They represent the DynamoDB team’s recognition that perfectly uniform partition key distributions are rare in practice, and the system needed to be more forgiving. Understanding these mechanisms is critical for production performance tuning, because they can mask underlying design problems during development and testing, only for those problems to surface under sustained production load when burst buckets are exhausted.

1. Burst Capacity

DynamoDB allows you to “burst” above your provisioned throughput for short periods. This is achieved by retaining unused capacity for up to 5 minutes (300 seconds).
  • How it works: If you don’t use your full throughput, DynamoDB stores the remainder in a “burst bucket.”
  • Benefit: Handles sudden micro-spikes in traffic without throttling.
  • Limit: Once the 5-minute bucket is exhausted, requests are throttled back to the provisioned level.
<svg viewBox="0 0 900 400" xmlns="http://www.w3.org/2000/svg">
  <!-- Title -->
  <text x="450" y="30" font-size="18" font-weight="bold" text-anchor="middle" fill="#333">
    Burst Capacity Mechanics (5-Minute Window)
  </text>

  <!-- Capacity Line -->
  <line x1="100" y1="250" x2="800" y2="250" stroke="#666" stroke-width="2" stroke-dasharray="5,5"/>
  <text x="810" y="255" font-size="12" fill="#666">Provisioned Throughput</text>

  <!-- Traffic Wave -->
  <path d="M 100 250 Q 200 250 250 100 T 400 250 Q 500 250 600 250" fill="none" stroke="#1976d2" stroke-width="3"/>
  <path d="M 250 100 Q 300 50 350 250" fill="#bbdefb" fill-opacity="0.5" stroke="#1976d2" stroke-width="2"/>

  <!-- Annotations -->
  <text x="300" y="150" font-size="12" font-weight="bold" text-anchor="middle" fill="#1976d2">
    BURST CONSUMPTION
  </text>
  <text x="300" y="170" font-size="10" text-anchor="middle" fill="#333">
    Using accumulated capacity
  </text>

  <rect x="100" y="260" width="150" height="40" fill="#fff9c4" stroke="#fbc02d" rx="3"/>
  <text x="175" y="285" font-size="11" text-anchor="middle" fill="#333">Normal Usage</text>

  <rect x="400" y="260" width="150" height="40" fill="#ffebee" stroke="#f44336" rx="3"/>
  <text x="475" y="285" font-size="11" text-anchor="middle" fill="#333">Bucket Depleted</text>
</svg>
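
The burst bucket can be pictured as a token bucket that fills with unused capacity and drains during spikes. The simulation below is a simplified model for intuition only, not DynamoDB's actual implementation: it assumes one bucket for the whole table, a cap of 300 seconds of provisioned throughput, and demand given as a per-second array.

```javascript
// Simplified token-bucket model of burst capacity (an assumption for
// illustration; DynamoDB's internal accounting is per partition and
// not publicly specified in this detail).
function simulateBurst(provisioned, demandPerSecond, burstWindowSeconds = 300) {
  const maxBucket = provisioned * burstWindowSeconds;
  let bucket = 0;

  return demandPerSecond.map(demand => {
    if (demand <= provisioned) {
      // Unused capacity is saved, up to the bucket's maximum size
      bucket = Math.min(maxBucket, bucket + (provisioned - demand));
      return { served: demand, throttled: 0 };
    }
    // Demand above the provisioned rate is served from saved capacity
    const extra = Math.min(demand - provisioned, bucket);
    bucket -= extra;
    const served = provisioned + extra;
    return { served, throttled: demand - served };
  });
}

console.log(simulateBurst(100, [50, 50, 250, 100, 150]));
// [ { served: 50, throttled: 0 }, { served: 50, throttled: 0 },
//   { served: 200, throttled: 50 }, { served: 100, throttled: 0 },
//   { served: 100, throttled: 50 } ]
```

The third second is partly absorbed by the 100 units saved earlier; by the fifth second the bucket is empty and everything above the provisioned rate is throttled.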

2. Adaptive Capacity

Adaptive capacity handles sustained imbalances where one partition receives significantly more traffic than others.
  • Dynamic Boosting: DynamoDB automatically increases the throughput for a hot partition if the total table-level throughput is not exceeded.
  • Isolation: It helps prevent “noisy neighbor” problems within your own table’s partitions.
  • Instant vs. Delayed: Modern DynamoDB (since 2019) applies adaptive capacity almost instantly for most workloads.
| Feature | Burst Capacity | Adaptive Capacity |
| --- | --- | --- |
| Duration | Short-term (up to 5 mins) | Long-term / Sustained |
| Trigger | Temporal spikes | Spatial imbalance (hot keys) |
| Limit | Accumulated bucket size | Total Table Throughput |
| Automation | Always on | Always on |

3. Throttling and Error Handling

When capacity (including burst and adaptive) is exhausted, DynamoDB returns a ProvisionedThroughputExceededException (HTTP 400).
// Recommended Retry Strategy with Exponential Backoff
const putWithRetry = async (item, retryCount = 0) => {
  const MAX_RETRIES = 5;
  try {
    await dynamodb.put({ TableName: 'Orders', Item: item }).promise();
  } catch (error) {
    if (error.code === 'ProvisionedThroughputExceededException' && retryCount < MAX_RETRIES) {
      const delay = Math.pow(2, retryCount) * 100 + Math.random() * 100;
      await new Promise(resolve => setTimeout(resolve, delay));
      return putWithRetry(item, retryCount + 1);
    }
    throw error;
  }
};

Optimizing Read Performance

Read optimization in DynamoDB operates on three levers: reduce the data read per operation (projection expressions, smaller items), reduce the number of operations (batch reads, queries instead of scans), and avoid the database entirely (caching). The first lever comes with a DynamoDB-specific catch: read capacity is billed on the full size of the stored item, not on the bytes returned, so a ProjectionExpression reduces network transfer and deserialization cost but still consumes RCUs based on the stored item size. This is a critical distinction that many engineers miss: projections optimize bandwidth and client-side processing, not RCU consumption; only smaller stored items reduce RCUs. The second lever, reducing operation count, is where batch operations and query design have the biggest impact. The third lever, caching, is covered in its own section below (Caching Strategies).

Using Projection Expressions

Reduce data transfer by reading only needed attributes (note that this reduces network bandwidth and client processing time, but RCU consumption is still based on the full item size on disk):
// BAD: Reading entire item (wasteful)
const result = await dynamodb.get({
  TableName: 'Users',
  Key: { userId: '12345' }
}).promise();
// Returns: { userId, email, name, address, preferences, history, ... }

// GOOD: Reading only needed attributes
const result = await dynamodb.get({
  TableName: 'Users',
  Key: { userId: '12345' },
  ProjectionExpression: '#name, email',
  ExpressionAttributeNames: {
    '#name': 'name'
  }
}).promise();
// Returns: { name, email }
// Benefits: less network transfer, faster response (RCU consumption is unchanged)
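
One way to check the RCU behaviour of projections on your own table is to ask DynamoDB to report consumed capacity for both variants. The sketch below uses the `ReturnConsumedCapacity` request parameter and reads `ConsumedCapacity.CapacityUnits` from the response. The function name is hypothetical, and `client` is assumed to be an SDK v2 DocumentClient.

```javascript
// Reads the same item with and without a projection and reports the
// capacity DynamoDB says each read consumed.
// `compareConsumedCapacity` is a hypothetical helper; `client` is assumed
// to expose get(params).promise() like the AWS SDK v2 DocumentClient.
async function compareConsumedCapacity(client, tableName, key, projection) {
  const base = {
    TableName: tableName,
    Key: key,
    ReturnConsumedCapacity: 'TOTAL'
  };

  const [full, projected] = await Promise.all([
    client.get(base).promise(),
    client.get({ ...base, ...projection }).promise()
  ]);

  return {
    fullItemRCUs: full.ConsumedCapacity?.CapacityUnits,
    projectedRCUs: projected.ConsumedCapacity?.CapacityUnits
  };
}
```

A call such as `compareConsumedCapacity(dynamodb, 'Users', { userId: '12345' }, { ProjectionExpression: 'email' })` should report the same value for both reads, since billing follows the stored item size.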

Batch Operations

// INEFFICIENT: Sequential GetItem calls
const getUsersSequential = async (userIds) => {
  const users = [];
  for (const userId of userIds) {
    const result = await dynamodb.get({
      TableName: 'Users',
      Key: { userId: userId }
    }).promise();
    users.push(result.Item);
  }
  return users;
};
// Time: O(n) network round trips

// EFFICIENT: BatchGetItem
const getUsersBatch = async (userIds) => {
  const result = await dynamodb.batchGet({
    RequestItems: {
      'Users': {
        Keys: userIds.map(id => ({ userId: id })),
        ProjectionExpression: '#name, email, createdAt',
        ExpressionAttributeNames: { '#name': 'name' }
      }
    }
  }).promise();

  return result.Responses.Users;
};
// Time: one network round trip per batch (BatchGetItem accepts up to 100 keys per call)
// Benefits: far fewer round trips than sequential GetItem calls; RCU cost per item is unchanged

// Handle unprocessed keys
const getUsersBatchWithRetry = async (userIds) => {
  let unprocessedKeys = {
    'Users': {
      Keys: userIds.map(id => ({ userId: id }))
    }
  };

  const allItems = [];
  const maxRetries = 5;
  let attempt = 0;

  while (Object.keys(unprocessedKeys).length > 0) {
    const result = await dynamodb.batchGet({
      RequestItems: unprocessedKeys
    }).promise();

    allItems.push(...(result.Responses?.Users || []));
    unprocessedKeys = result.UnprocessedKeys || {};

    if (Object.keys(unprocessedKeys).length > 0) {
      if (++attempt > maxRetries) {
        throw new Error('Max retries exceeded for batch get');
      }
      // Exponential backoff: 100ms, 200ms, 400ms, ...
      await new Promise(resolve =>
        setTimeout(resolve, Math.pow(2, attempt - 1) * 100)
      );
    }
  }

  return allItems;
};
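
BatchGetItem accepts at most 100 keys per request, so longer ID lists have to be split before they are sent. The sketch below shows one way to do that. `chunk` and `batchGetInChunks` are hypothetical helpers, and `fetchBatch` stands for any function that takes an array of IDs and resolves to the matching items (for example, `getUsersBatchWithRetry` above).

```javascript
// Splits an array into consecutive slices of at most `size` elements.
function chunk(items, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Hypothetical wrapper: feeds the IDs to `fetchBatch` in groups that
// respect the 100-key BatchGetItem limit and concatenates the results.
async function batchGetInChunks(userIds, fetchBatch, maxKeys = 100) {
  const results = [];
  for (const ids of chunk(userIds, maxKeys)) {
    results.push(...(await fetchBatch(ids)));
  }
  return results;
}
```

The chunks are processed one after another here to keep the example simple; running them concurrently trades lower latency for a higher chance of throttling.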

Query Optimization

Query is the most important operation in DynamoDB because it is the only way to efficiently retrieve multiple items from a single partition. Unlike Scan (which reads every item in the table), Query operates within a single partition key value and uses the sort key for range filtering — making it O(log n + k) where k is the number of items returned, not O(n) over the entire table. One critical subtlety: FilterExpression is applied after the query reads data from storage, so it reduces the items returned to your application but does not reduce the RCUs consumed. If you find yourself relying heavily on FilterExpression, it usually means your key schema does not match your access pattern and should be redesigned.
// Optimize Query operations

// 1. Use ScanIndexForward for ordering
const getRecentOrders = async (userId, limit = 10) => {
  return await dynamodb.query({
    TableName: 'Orders',
    KeyConditionExpression: 'userId = :uid',
    ExpressionAttributeValues: { ':uid': userId },
    ScanIndexForward: false,  // Descending order (most recent first)
    Limit: limit  // Stop after `limit` items (10 by default)
  }).promise();
};

// 2. Use BETWEEN for range queries
const getOrdersInDateRange = async (userId, startDate, endDate) => {
  return await dynamodb.query({
    TableName: 'Orders',
    KeyConditionExpression: 'userId = :uid AND orderDate BETWEEN :start AND :end',
    ExpressionAttributeValues: {
      ':uid': userId,
      ':start': startDate,
      ':end': endDate
    }
  }).promise();
};

// 3. Use begins_with for hierarchical data
const getDepartmentEmployees = async (orgId, deptName) => {
  return await dynamodb.query({
    TableName: 'Organization',
    KeyConditionExpression: 'PK = :pk AND begins_with(SK, :sk)',
    ExpressionAttributeValues: {
      ':pk': `ORG#${orgId}`,
      ':sk': `DEPT#${deptName}#EMP#`
    }
  }).promise();
};

// 4. Use FilterExpression sparingly (applied after read)
const getHighValueOrders = async (userId, minValue) => {
  return await dynamodb.query({
    TableName: 'Orders',
    KeyConditionExpression: 'userId = :uid',
    FilterExpression: 'totalAmount > :minValue',  // Applied after Query
    ExpressionAttributeValues: {
      ':uid': userId,
      ':minValue': minValue
    }
  }).promise();
  // Note: Still consumes RCUs for all items before filtering!
};
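
To see why FilterExpression does not save capacity, it helps to estimate what a Query costs. Query sums the sizes of all items it reads (before filtering), rounds the total up to the next 4KB, and halves the result for eventually consistent reads. The estimator below is a rough sketch under those rules; it ignores pagination and any per-page rounding, and the function name is illustrative.

```javascript
// Rough estimate of the RCUs a single Query page consumes.
// itemSizesKB lists the sizes of every item READ, not just those that
// survive a FilterExpression. Hypothetical helper for illustration.
function estimateQueryRCUs(itemSizesKB, stronglyConsistent = false) {
  const totalKB = itemSizesKB.reduce((sum, size) => sum + size, 0);
  const units = Math.ceil(totalKB / 4);
  return stronglyConsistent ? units : units / 2;
}

// 100 orders of 2KB each are read; a filter keeps only 5 of them.
const readByQuery = new Array(100).fill(2);
console.log(estimateQueryRCUs(readByQuery)); // 25

// If the key schema let the query touch only those 5 orders:
console.log(estimateQueryRCUs([2, 2, 2, 2, 2])); // 1.5
```

The filtered query pays for all 100 items, while a key design that targets the 5 matching items directly costs a small fraction of that.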

Parallel Query Pattern

// Query multiple partitions in parallel
const queryMultiplePartitions = async (partitionKeys) => {
  const promises = partitionKeys.map(pk =>
    dynamodb.query({
      TableName: 'Data',
      KeyConditionExpression: 'PK = :pk',
      ExpressionAttributeValues: { ':pk': pk }
    }).promise()
  );

  const results = await Promise.all(promises);
  return results.flatMap(r => r.Items);
};

// Example: Query all user orders across time partitions
const getAllUserOrders = async (userId) => {
  const months = [
    '2024-01', '2024-02', '2024-03', '2024-04',
    '2024-05', '2024-06', '2024-07', '2024-08'
  ];

  const partitionKeys = months.map(month => `USER#${userId}#MONTH#${month}`);

  return await queryMultiplePartitions(partitionKeys);
};
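// Benefits: the 8 queries run concurrently, so wall-clock time approaches that of the
// slowest single query (up to roughly 8x faster than running them sequentially)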
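
Promise.all starts every query at once, which is fine for eight partitions but can trigger throttling or exhaust connections when the fan-out is in the hundreds. A common mitigation is to cap the number of requests in flight. The helper below is a generic sketch of that idea; `mapWithConcurrency` is a hypothetical name, and `worker` would wrap the per-partition query.

```javascript
// Runs `worker` over `inputs` with at most `limit` calls in flight.
// Results keep the order of `inputs`. Hypothetical helper for illustration.
async function mapWithConcurrency(inputs, limit, worker) {
  if (!Number.isInteger(limit) || limit <= 0) {
    throw new RangeError('limit must be a positive integer');
  }

  const results = new Array(inputs.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed index
  async function run() {
    while (next < inputs.length) {
      const index = next++;
      results[index] = await worker(inputs[index], index);
    }
  }

  const runnerCount = Math.min(limit, inputs.length);
  await Promise.all(Array.from({ length: runnerCount }, () => run()));
  return results;
}
```

For example, `mapWithConcurrency(partitionKeys, 4, pk => dynamodb.query({...}).promise())` would keep at most four queries running at any moment.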

Optimizing Write Performance

Write optimization in DynamoDB is fundamentally different from read optimization because writes are inherently more expensive (1 WCU covers only 1KB vs 4KB for reads) and cannot be cached away. Every write must be durably committed to at least two of the three Availability Zone replicas before DynamoDB acknowledges success. This replication overhead is the price of DynamoDB’s durability guarantee, and it means that write latency has a harder floor than read latency. The three main levers for write optimization are: reduce the number of round trips (batch writes), reduce the data written per operation (smaller items, targeted updates instead of full-item replacements), and avoid unnecessary writes (conditional writes, atomic counters). One pattern that trips up many teams: using PutItem to update a single attribute on a large item. This replaces the entire item, consuming WCUs proportional to the full item size. An UpdateItem call with an UpdateExpression sends only the changed attributes in the request, but it does not shrink the bill: WCUs are charged on the larger of the item’s size before and after the update, so keeping items small remains the main way to cut write cost.
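
The billing rule for updates can be written down in a few lines. For UpdateItem, DynamoDB considers the item's size before and after the update and charges for the larger of the two, rounded up to the next 1KB. The function below is a sketch of that rule for standard (non-transactional) writes; the name is illustrative.

```javascript
// Estimates WCUs for a standard UpdateItem call: the larger of the
// item's size before and after the update, rounded up to 1KB,
// with a minimum of one unit. Hypothetical helper for illustration.
function writeUnitsForUpdate(sizeBeforeKB, sizeAfterKB) {
  return Math.max(1, Math.ceil(Math.max(sizeBeforeKB, sizeAfterKB)));
}

// Changing one short attribute on a 10KB item still costs 10 WCUs
console.log(writeUnitsForUpdate(10, 10));   // 10

// Growing the item past a 1KB boundary costs one more unit
console.log(writeUnitsForUpdate(10, 10.1)); // 11
```

This is why splitting a large, frequently updated item into a small "hot" item and a larger "cold" one can reduce write cost more than any change to the update call itself.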

BatchWriteItem

// INEFFICIENT: Sequential PutItem
const createItemsSequential = async (items) => {
  for (const item of items) {
    await dynamodb.put({
      TableName: 'Products',
      Item: item
    }).promise();
  }
};
// Time: O(n) sequential writes

// EFFICIENT: BatchWriteItem
const createItemsBatch = async (items) => {
  // Batch size limit: 25 items
  const batchSize = 25;
  const batches = [];

  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    batches.push(batch);
  }

  for (const batch of batches) {
    await dynamodb.batchWrite({
      RequestItems: {
        'Products': batch.map(item => ({
          PutRequest: { Item: item }
        }))
      }
    }).promise();
  }
};

// Handle unprocessed items
const batchWriteWithRetry = async (tableName, items) => {
  let unprocessedItems = {
    [tableName]: items.map(item => ({
      PutRequest: { Item: item }
    }))
  };

  let retryCount = 0;
  const maxRetries = 5;

  while (unprocessedItems[tableName]?.length > 0 && retryCount < maxRetries) {
    const result = await dynamodb.batchWrite({
      RequestItems: unprocessedItems
    }).promise();

    unprocessedItems = result.UnprocessedItems || {};

    if (unprocessedItems[tableName]?.length > 0) {
      retryCount++;
      // Exponential backoff
      await new Promise(resolve =>
        setTimeout(resolve, Math.pow(2, retryCount) * 100)
      );
    }
  }

  if (retryCount >= maxRetries) {
    throw new Error('Max retries exceeded for batch write');
  }
};

Conditional Writes for Efficiency

// Only update if the value changed (skips redundant writes).
// Note: a failed condition check still consumes write capacity.
const updateIfChanged = async (userId, newEmail) => {
  try {
    await dynamodb.update({
      TableName: 'Users',
      Key: { userId: userId },
      UpdateExpression: 'SET email = :newEmail',
      ConditionExpression: 'email <> :newEmail',  // Only if different
      ExpressionAttributeValues: {
        ':newEmail': newEmail
      }
    }).promise();

    console.log('Updated');
  } catch (error) {
    if (error.code === 'ConditionalCheckFailedException') {
      console.log('No update needed - value unchanged');
      // The item is left untouched, but DynamoDB still charges WCUs for the failed conditional write
    } else {
      throw error;
    }
  }
};

// Atomic increment (more efficient than read-modify-write)
const incrementCounter = async (counterId) => {
  await dynamodb.update({
    TableName: 'Counters',
    Key: { counterId: counterId },
    UpdateExpression: 'ADD #count :inc',
    ExpressionAttributeNames: { '#count': 'count' },
    ExpressionAttributeValues: { ':inc': 1 }
  }).promise();
};
// vs
const incrementCounterInefficient = async (counterId) => {
  // BAD: Read-modify-write (uses 1 RCU + 1 WCU, and is not atomic:
  // concurrent callers can overwrite each other's increments)
  const result = await dynamodb.get({
    TableName: 'Counters',
    Key: { counterId: counterId },
    ConsistentRead: true
  }).promise();

  await dynamodb.put({
    TableName: 'Counters',
    Item: {
      counterId: counterId,
      count: (result.Item?.count || 0) + 1
    }
  }).promise();
};

Update Expression Optimization

// INEFFICIENT: Multiple updates
const updateUserInefficient = async (userId, updates) => {
  await dynamodb.update({
    TableName: 'Users',
    Key: { userId: userId },
    UpdateExpression: 'SET #name = :name',
    ExpressionAttributeNames: { '#name': 'name' },
    ExpressionAttributeValues: { ':name': updates.name }
  }).promise();

  await dynamodb.update({
    TableName: 'Users',
    Key: { userId: userId },
    UpdateExpression: 'SET email = :email',
    ExpressionAttributeValues: { ':email': updates.email }
  }).promise();
};
// Cost: 2 round trips and 2 writes, each billed for the full item size

// EFFICIENT: Single update expression
const updateUserEfficient = async (userId, updates) => {
  const updateParts = [];
  const attributeNames = {};
  const attributeValues = {};

  if (updates.name) {
    updateParts.push('#name = :name');
    attributeNames['#name'] = 'name';
    attributeValues[':name'] = updates.name;
  }

  if (updates.email) {
    updateParts.push('email = :email');
    attributeValues[':email'] = updates.email;
  }

  if (updates.lastLogin) {
    updateParts.push('lastLogin = :lastLogin');
    attributeValues[':lastLogin'] = updates.lastLogin;
  }

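  if (updateParts.length === 0) {
    return; // Nothing to update; an empty SET expression is rejected
  }

  const params = {
    TableName: 'Users',
    Key: { userId: userId },
    UpdateExpression: `SET ${updateParts.join(', ')}`,
    ExpressionAttributeValues: attributeValues
  };
  // DynamoDB rejects an empty ExpressionAttributeNames map, so only set it when used
  if (Object.keys(attributeNames).length > 0) {
    params.ExpressionAttributeNames = attributeNames;
  }

  await dynamodb.update(params).promise();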
};
// Cost: 1 WCU × item size
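
The single-expression approach can be generalized. The following is a minimal sketch, not part of the original examples: `buildSetExpression` is a hypothetical helper that aliases every attribute name with a placeholder, which sidesteps DynamoDB reserved words such as `name` without special-casing them.

```javascript
// Hypothetical helper: builds one SET expression from a plain object of updates.
// Every attribute name is aliased (#a0, #a1, ...) so reserved words are safe.
function buildSetExpression(updates) {
  const names = {};
  const values = {};
  const parts = [];

  Object.entries(updates).forEach(([attr, value], i) => {
    if (value === undefined) return; // skip attributes the caller did not supply
    const nameKey = `#a${i}`;
    const valueKey = `:v${i}`;
    names[nameKey] = attr;
    values[valueKey] = value;
    parts.push(`${nameKey} = ${valueKey}`);
  });

  if (parts.length === 0) return null; // nothing to update

  return {
    UpdateExpression: `SET ${parts.join(', ')}`,
    ExpressionAttributeNames: names,
    ExpressionAttributeValues: values
  };
}

const expr = buildSetExpression({ name: 'Ada', email: 'ada@example.com' });
console.log(expr.UpdateExpression); // SET #a0 = :v0, #a1 = :v1
```

The returned object can be spread into the parameters of a single `update` call alongside `TableName` and `Key`. Callers should skip the call entirely when the helper returns `null`.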

Caching Strategies

Caching is arguably the single most impactful performance optimization for DynamoDB-backed applications, yet it is the one most teams implement last. Caching matters more for DynamoDB than for traditional databases for an economic reason: DynamoDB charges per request, so every cache hit directly reduces your bill in addition to reducing latency. A well-implemented cache with an 80% hit rate does not just cut average read latency by up to 5x (when cache hits are much faster than DynamoDB reads); it also reduces your DynamoDB read costs by 80%. AWS recognized this pattern and built DynamoDB Accelerator (DAX) as a first-party solution, but many production systems use external caches (Redis, Memcached) for greater control over eviction policies, TTLs, and cross-service sharing. The choice between DAX and an external cache is one of the most consequential architectural decisions you will make with DynamoDB. DAX offers near-zero-code-change integration and write-through semantics, but it caches only eventually consistent reads (strongly consistent reads are passed through to DynamoDB uncached) and ties you to a VPC-deployed cluster. An external cache like Redis gives you fine-grained control, sorted sets for leaderboards, pub/sub for invalidation, and cross-service reuse, but you own the consistency model and invalidation logic.
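
To make the hit-rate arithmetic concrete, here is a rough model of average latency and read cost. It is a sketch under stated assumptions: `cacheImpact` is a hypothetical function, and the latencies and the per-million read price are illustrative inputs rather than measurements.

```javascript
// Illustrative model only: all inputs below are assumptions, not measurements.
function cacheImpact({ hitRate, cacheLatencyMs, dbLatencyMs, readsPerMonth, pricePerMillionReads }) {
  // Hits are served at cache latency, misses at DynamoDB latency
  const avgLatencyMs = hitRate * cacheLatencyMs + (1 - hitRate) * dbLatencyMs;
  // Only misses reach DynamoDB and are billed
  const dbReads = readsPerMonth * (1 - hitRate);
  return {
    avgLatencyMs,
    speedup: dbLatencyMs / avgLatencyMs,
    monthlyReadCost: (dbReads / 1_000_000) * pricePerMillionReads
  };
}

const impact = cacheImpact({
  hitRate: 0.8,
  cacheLatencyMs: 0.5,
  dbLatencyMs: 5,
  readsPerMonth: 100_000_000,
  pricePerMillionReads: 0.25
});

console.log(impact.avgLatencyMs.toFixed(2));    // 1.40
console.log(impact.speedup.toFixed(2));         // 3.57
console.log(impact.monthlyReadCost.toFixed(2)); // 5.00 (vs 25.00 with no cache)
```

With these inputs the cost drops by exactly the hit rate, while the latency speedup approaches 5x only as cache latency approaches zero.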

Application-Level Caching

// Redis/Memcached caching layer
const Redis = require('ioredis');
const redis = new Redis();

class CachedDynamoDB {
  constructor(dynamodb) {
    this.dynamodb = dynamodb;
    this.defaultTTL = 300; // 5 minutes
  }

  async get(params, ttl = this.defaultTTL) {
    const cacheKey = this.getCacheKey(params);

    // Try cache first
    const cached = await redis.get(cacheKey);
    if (cached) {
      return { Item: JSON.parse(cached), fromCache: true };
    }

    // Cache miss - read from DynamoDB
    const result = await this.dynamodb.get(params).promise();

    if (result.Item) {
      // Store in cache
      await redis.setex(cacheKey, ttl, JSON.stringify(result.Item));
    }

    return { ...result, fromCache: false };
  }

  async put(params) {
    // Write to DynamoDB
    await this.dynamodb.put(params).promise();

    // Invalidate cache
    const cacheKey = this.getCacheKey({
      TableName: params.TableName,
      Key: this.extractKey(params.Item)
    });
    await redis.del(cacheKey);
  }

  async update(params) {
    // Update DynamoDB
    const result = await this.dynamodb.update({
      ...params,
      ReturnValues: 'ALL_NEW'
    }).promise();

    // Update cache with new value
    const cacheKey = this.getCacheKey({
      TableName: params.TableName,
      Key: params.Key
    });
    await redis.setex(
      cacheKey,
      this.defaultTTL,
      JSON.stringify(result.Attributes)
    );

    return result;
  }

  getCacheKey(params) {
    return `dynamodb:${params.TableName}:${JSON.stringify(params.Key)}`;
  }

  extractKey(item) {
    // Extract primary key from item
    // Implement based on your schema
    return { userId: item.userId };
  }
}

// Usage
const cachedDb = new CachedDynamoDB(dynamodb);

// Read (uses cache)
const user = await cachedDb.get({
  TableName: 'Users',
  Key: { userId: '12345' }
});
console.log('From cache:', user.fromCache);

// Write (invalidates cache)
await cachedDb.put({
  TableName: 'Users',
  Item: { userId: '12345', name: 'Updated Name' }
});

DynamoDB Accelerator (DAX)

DAX (launched in 2017) is a fully managed, in-memory cache purpose-built for DynamoDB. It sits between your application and DynamoDB, intercepting API calls and serving cached results with microsecond latency. Architecturally, DAX is a write-through, read-through cache cluster deployed within your VPC. The “write-through” behavior means that when you write through DAX, it updates both the cache and DynamoDB synchronously, keeping the item cache consistent without manual invalidation. The “read-through” behavior means cache misses are automatically populated from DynamoDB. DAX maintains two internal caches: an item cache (for GetItem/PutItem results) and a query cache (for Query/Scan results). The item cache is invalidated on writes, but the query cache uses TTL-based expiration only — a distinction that catches many teams off guard. DAX is best suited for read-heavy workloads that repeatedly access the same items. It is not a good fit for write-heavy workloads, workloads requiring strongly consistent reads, or applications that primarily use Scan operations.
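// DAX client setup
const AWS = require('aws-sdk');
const AmazonDaxClient = require('amazon-dax-client');

const daxEndpoint = 'mycluster.dax-clusters.us-east-1.amazonaws.com:8111';
const daxService = new AmazonDaxClient({ endpoints: [daxEndpoint], region: 'us-east-1' });
// Wrap the low-level DAX client so the DocumentClient API (get, query, ...) is available
const dax = new AWS.DynamoDB.DocumentClient({ service: daxService });

// Use the DAX-backed client exactly like the DynamoDB DocumentClient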
const getUserWithDAX = async (userId) => {
  return await dax.get({
    TableName: 'Users',
    Key: { userId: userId }
  }).promise();
};

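// DAX provides:
// - Microsecond read latency on cache hits (vs milliseconds)
// - Transparent read-through/write-through caching
// - Minimal code changes (swap the client)
// - Item cache kept in sync on writes made through DAX
// - Caching for eventually consistent reads only
//   (strongly consistent reads pass through to DynamoDB)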

// Performance comparison
const compareDynamoDBvsDAX = async (userId) => {
  // Direct DynamoDB
  const start1 = Date.now();
  await dynamodb.get({
    TableName: 'Users',
    Key: { userId: userId }
  }).promise();
  const dynamoDBLatency = Date.now() - start1;

  // Through DAX
  const start2 = Date.now();
  await dax.get({
    TableName: 'Users',
    Key: { userId: userId }
  }).promise();
  const daxLatency = Date.now() - start2;

  console.log('DynamoDB latency:', dynamoDBLatency, 'ms'); // ~10-20ms
  console.log('DAX latency:', daxLatency, 'ms');          // ~1-2ms (cached)
};

Cache-Aside Pattern

// Implement cache-aside pattern
class CacheAsideService {
  constructor(cache, database) {
    this.cache = cache;
    this.database = database;
  }

  async getUser(userId) {
    // 1. Check cache
    const cacheKey = `user:${userId}`;
    let user = await this.cache.get(cacheKey);

    if (user) {
      return JSON.parse(user);
    }

    // 2. Cache miss - read from database
    const result = await this.database.get({
      TableName: 'Users',
      Key: { userId: userId }
    }).promise();

    user = result.Item;

    if (user) {
      // 3. Populate cache
      await this.cache.setex(cacheKey, 300, JSON.stringify(user));
    }

    return user;
  }

  async updateUser(userId, updates) {
    // 1. Update database
    const result = await this.database.update({
      TableName: 'Users',
      Key: { userId: userId },
      UpdateExpression: 'SET #name = :name',
      ExpressionAttributeNames: { '#name': 'name' },
      ExpressionAttributeValues: { ':name': updates.name },
      ReturnValues: 'ALL_NEW'
    }).promise();

    // 2. Update cache (write-through)
    const cacheKey = `user:${userId}`;
    await this.cache.setex(
      cacheKey,
      300,
      JSON.stringify(result.Attributes)
    );

    return result.Attributes;
  }

  async deleteUser(userId) {
    // 1. Delete from database
    await this.database.delete({
      TableName: 'Users',
      Key: { userId: userId }
    }).promise();

    // 2. Invalidate cache
    const cacheKey = `user:${userId}`;
    await this.cache.del(cacheKey);
  }
}

On-Demand vs Provisioned Capacity

Before November 2018, DynamoDB only offered provisioned capacity mode, which meant you had to predict your traffic and pre-allocate throughput. Under-provision and you get throttled; over-provision and you waste money. This capacity planning burden was one of the most common complaints about DynamoDB and drove many teams to alternative databases. The introduction of on-demand mode at re:Invent 2018 was a watershed moment: it eliminated capacity planning entirely by charging per-request instead of per-hour. The trade-off is cost — on-demand mode costs roughly 6-7x more per request than well-utilized provisioned capacity. For most production workloads with predictable traffic, provisioned mode with auto-scaling remains the cost-optimal choice. On-demand mode shines for truly unpredictable workloads, new applications where you do not yet know your traffic patterns, and development/test environments where simplicity outweighs cost. One practical pattern many teams use: start with on-demand to observe traffic patterns, then switch to provisioned with auto-scaling once the workload stabilizes. DynamoDB allows one mode switch per table every 24 hours.
<svg viewBox="0 0 900 600" xmlns="http://www.w3.org/2000/svg">
  <!-- Title -->
  <text x="450" y="30" font-size="18" font-weight="bold" text-anchor="middle" fill="#333">
    On-Demand vs Provisioned Capacity
  </text>

  <!-- On-Demand -->
  <rect x="50" y="60" width="380" height="500" fill="#e3f2fd" stroke="#1976d2" stroke-width="2" rx="5"/>
  <text x="240" y="95" font-size="16" font-weight="bold" text-anchor="middle" fill="#1976d2">
    On-Demand Mode
  </text>

  <text x="80" y="130" font-size="13" font-weight="bold" fill="#333">Characteristics:</text>
  <text x="90" y="155" font-size="11" fill="#666">• Pay per request</text>
  <text x="90" y="175" font-size="11" fill="#666">• No capacity planning needed</text>
  <text x="90" y="195" font-size="11" fill="#666">• Scales automatically</text>
  <text x="90" y="215" font-size="11" fill="#666">• Higher cost per request</text>

  <text x="80" y="250" font-size="13" font-weight="bold" fill="#333">Best For:</text>
  <text x="90" y="275" font-size="11" fill="#666">✓ Unknown workloads</text>
  <text x="90" y="295" font-size="11" fill="#666">✓ Unpredictable traffic</text>
  <text x="90" y="315" font-size="11" fill="#666">✓ New applications</text>
  <text x="90" y="335" font-size="11" fill="#666">✓ Spiky workloads</text>
  <text x="90" y="355" font-size="11" fill="#666">✓ Dev/test environments</text>

  <text x="80" y="390" font-size="13" font-weight="bold" fill="#333">Pricing:</text>
  <rect x="80" y="400" width="320" height="60" fill="#fff" stroke="#1976d2" stroke-width="1" rx="3"/>
  <text x="240" y="425" font-size="11" fill="#333">Write: $1.25 per million requests</text>
  <text x="240" y="445" font-size="11" fill="#333">Read: $0.25 per million requests</text>

  <text x="80" y="490" font-size="13" font-weight="bold" fill="#333">Limits:</text>
  <text x="90" y="515" font-size="11" fill="#666">• 40K RCUs / 40K WCUs per table</text>
  <text x="90" y="535" font-size="11" fill="#666">• Can handle 2x previous peak</text>

  <!-- Provisioned -->
  <rect x="470" y="60" width="380" height="500" fill="#fff3e0" stroke="#ff9800" stroke-width="2" rx="5"/>
  <text x="660" y="95" font-size="16" font-weight="bold" text-anchor="middle" fill="#ff9800">
    Provisioned Mode
  </text>

  <text x="500" y="130" font-size="13" font-weight="bold" fill="#333">Characteristics:</text>
  <text x="510" y="155" font-size="11" fill="#666">• Pre-defined capacity</text>
  <text x="510" y="175" font-size="11" fill="#666">• Requires capacity planning</text>
  <text x="510" y="195" font-size="11" fill="#666">• Auto-scaling available</text>
  <text x="510" y="215" font-size="11" fill="#666">• Lower cost per request</text>

  <text x="500" y="250" font-size="13" font-weight="bold" fill="#333">Best For:</text>
  <text x="510" y="275" font-size="11" fill="#666">✓ Predictable workloads</text>
  <text x="510" y="295" font-size="11" fill="#666">✓ Steady-state traffic</text>
  <text x="510" y="315" font-size="11" fill="#666">✓ Cost optimization</text>
  <text x="510" y="335" font-size="11" fill="#666">✓ High throughput apps</text>
  <text x="510" y="355" font-size="11" fill="#666">✓ Production workloads</text>

  <text x="500" y="390" font-size="13" font-weight="bold" fill="#333">Pricing:</text>
  <rect x="500" y="400" width="320" height="60" fill="#fff" stroke="#ff9800" stroke-width="1" rx="3"/>
  <text x="660" y="425" font-size="11" fill="#333">Write: $0.00065 per WCU-hour</text>
  <text x="660" y="445" font-size="11" fill="#333">Read: $0.00013 per RCU-hour</text>

  <text x="500" y="490" font-size="13" font-weight="bold" fill="#333">Features:</text>
  <text x="510" y="515" font-size="11" fill="#666">• Reserved capacity discounts</text>
  <text x="510" y="535" font-size="11" fill="#666">• Auto-scaling policies</text>
</svg>

Choosing the Right Mode

// Cost comparison calculator
function calculateMonthlyCost(readsPerMonth, writesPerMonth, mode = 'on-demand') {
  if (mode === 'on-demand') {
    const readCost = (readsPerMonth / 1_000_000) * 0.25;
    const writeCost = (writesPerMonth / 1_000_000) * 1.25;
    return readCost + writeCost;
  } else {
    // Provisioned mode
    const avgReadsPerSecond = readsPerMonth / (30 * 24 * 60 * 60);
    const avgWritesPerSecond = writesPerMonth / (30 * 24 * 60 * 60);

    const rcuCost = avgReadsPerSecond * 0.00013 * 24 * 30;
    const wcuCost = avgWritesPerSecond * 0.00065 * 24 * 30;

    return rcuCost + wcuCost;
  }
}

// Example
const readsPerMonth = 100_000_000;  // 100M reads
const writesPerMonth = 10_000_000;  // 10M writes

console.log('On-Demand Cost:', calculateMonthlyCost(readsPerMonth, writesPerMonth, 'on-demand'));
// Output: ~$37.50

console.log('Provisioned Cost:', calculateMonthlyCost(readsPerMonth, writesPerMonth, 'provisioned'));
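// Output: ~$5.42 (assumes 100% utilization of provisioned capacity)

// Decision: Use provisioned if traffic is steady. At the 70% utilization that
// auto-scaling typically targets, the cost is ~$7.74, still roughly 80% cheaper.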
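
A related way to frame the decision is the break-even utilization: the average fraction of provisioned capacity you must actually use before provisioned mode becomes cheaper than on-demand. The sketch below uses the prices quoted in this chapter and assumes each request consumes exactly one capacity unit. `breakEvenUtilization` is a hypothetical helper.

```javascript
// Break-even utilization between provisioned and on-demand pricing.
// Assumes one request consumes one capacity unit; prices are the ones quoted above.
function breakEvenUtilization(pricePerUnitHour, onDemandPricePerMillion) {
  const requestsPerUnitHour = 3600; // one capacity unit sustains one request per second
  const onDemandCostForFullHour =
    (requestsPerUnitHour / 1_000_000) * onDemandPricePerMillion;
  return pricePerUnitHour / onDemandCostForFullHour;
}

console.log(breakEvenUtilization(0.00065, 1.25).toFixed(3)); // writes: 0.144
console.log(breakEvenUtilization(0.00013, 0.25).toFixed(3)); // reads:  0.144
```

Under these assumptions, provisioned capacity is cheaper whenever average utilization stays above roughly 14%, which matches the "6-7x" per-request premium for on-demand mentioned earlier (1 / 0.144 ≈ 6.9).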

Auto-Scaling Configuration

const AWS = require('aws-sdk');
const applicationAutoScaling = new AWS.ApplicationAutoScaling();

// Configure auto-scaling for provisioned capacity
const configureAutoScaling = async (tableName) => {
  // Register scalable target
  await applicationAutoScaling.registerScalableTarget({
    ServiceNamespace: 'dynamodb',
    ResourceId: `table/${tableName}`,
    ScalableDimension: 'dynamodb:table:ReadCapacityUnits',
    MinCapacity: 5,
    MaxCapacity: 1000
  }).promise();

  // Define scaling policy
  await applicationAutoScaling.putScalingPolicy({
    PolicyName: `${tableName}-read-scaling-policy`,
    ServiceNamespace: 'dynamodb',
    ResourceId: `table/${tableName}`,
    ScalableDimension: 'dynamodb:table:ReadCapacityUnits',
    PolicyType: 'TargetTrackingScaling',
    TargetTrackingScalingPolicyConfiguration: {
      TargetValue: 70.0,  // Target 70% utilization
      PredefinedMetricSpecification: {
        PredefinedMetricType: 'DynamoDBReadCapacityUtilization'
      },
      ScaleInCooldown: 60,  // Wait 60s before scaling in
      ScaleOutCooldown: 60  // Wait 60s before scaling out
    }
  }).promise();

  console.log('Auto-scaling configured');
};

Latency Optimization Techniques

DynamoDB advertises single-digit millisecond latency, and for well-designed tables with simple GetItem operations, it delivers: p50 latency is typically 3-5ms, and p99 is under 10ms. But these numbers apply to the DynamoDB service itself — your application’s observed latency includes TLS handshake time, SDK overhead, serialization/deserialization, and network round-trip time between your compute and the DynamoDB endpoint. In Lambda functions with cold starts, the first DynamoDB call can take 200-500ms due to TCP connection establishment and TLS negotiation. In long-running services, connection pooling and keep-alive eliminate this overhead for subsequent calls. The techniques below address these infrastructure-level latency sources, which are often larger than DynamoDB’s own processing time.

Connection Pooling

// Reuse DynamoDB client connections
const AWS = require('aws-sdk');

// BAD: Creating new client for each request
const getUser = async (userId) => {
  const dynamodb = new AWS.DynamoDB.DocumentClient();  // New connection!
  return await dynamodb.get({
    TableName: 'Users',
    Key: { userId: userId }
  }).promise();
};

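// GOOD: Reuse client instance (and keep TCP/TLS connections alive)
const https = require('https');

const dynamodb = new AWS.DynamoDB.DocumentClient({
  maxRetries: 3,
  httpOptions: {
    agent: new https.Agent({ keepAlive: true }),  // SDK v2 does not enable keep-alive by default
    timeout: 5000,
    connectTimeout: 2000
  }
});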

const getUserOptimized = async (userId) => {
  return await dynamodb.get({
    TableName: 'Users',
    Key: { userId: userId }
  }).promise();
};

Parallel Requests

// Sequential requests (slow)
const getDataSequential = async () => {
  const user = await dynamodb.get({
    TableName: 'Users',
    Key: { userId: '123' }
  }).promise();

  const orders = await dynamodb.query({
    TableName: 'Orders',
    KeyConditionExpression: 'userId = :uid',
    ExpressionAttributeValues: { ':uid': '123' }
  }).promise();

  const products = await dynamodb.scan({
    TableName: 'Products',
    FilterExpression: 'category = :cat',
    ExpressionAttributeValues: { ':cat': 'Electronics' }
  }).promise();

  return { user: user.Item, orders: orders.Items, products: products.Items };
};
// Total time: ~60ms (3 × 20ms)

// Parallel requests (fast)
const getDataParallel = async () => {
  const [user, orders, products] = await Promise.all([
    dynamodb.get({
      TableName: 'Users',
      Key: { userId: '123' }
    }).promise(),

    dynamodb.query({
      TableName: 'Orders',
      KeyConditionExpression: 'userId = :uid',
      ExpressionAttributeValues: { ':uid': '123' }
    }).promise(),

    dynamodb.scan({
      TableName: 'Products',
      FilterExpression: 'category = :cat',
      ExpressionAttributeValues: { ':cat': 'Electronics' }
    }).promise()
  ]);

  return { user: user.Item, orders: orders.Items, products: products.Items };
};
// Total time: ~20ms (max of the three)

Regional Endpoints

// Use region-local DynamoDB endpoint
const dynamodb = new AWS.DynamoDB.DocumentClient({
  region: 'us-east-1',  // Same region as application
  endpoint: 'https://dynamodb.us-east-1.amazonaws.com'
});

// For global applications, use DynamoDB global tables and have each
// deployment talk to the replica in its own region
const dynamodbEU = new AWS.DynamoDB.DocumentClient({
  region: 'eu-west-1'  // Replica region closest to this deployment (example)
});

Monitoring and Performance Metrics

You cannot optimize what you cannot measure, and DynamoDB provides unusually rich observability through CloudWatch. The most important metrics to watch are not the obvious ones (consumed capacity) but the diagnostic ones: ThrottledRequests tells you when you are hitting limits, SuccessfulRequestLatency tells you when the service is performing normally, and — most valuably — Contributor Insights (launched in 2019) tells you which specific partition keys are receiving the most traffic. Before Contributor Insights existed, diagnosing hot partitions required guesswork and instrumentation in application code. Now it is a one-click enablement that reveals your top partition keys by traffic volume, making it the single most important monitoring tool for DynamoDB performance debugging.
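
Contributor Insights can also be enabled programmatically. The sketch below assumes a low-level `AWS.DynamoDB` client from AWS SDK v2 is passed in (not a `DocumentClient`). The wrapper name `enableContributorInsights` is hypothetical; the underlying call is the `UpdateContributorInsights` API.

```javascript
// Enables Contributor Insights for a table, or for one of its GSIs when
// indexName is given. dynamodbClient is assumed to be an AWS.DynamoDB instance.
async function enableContributorInsights(dynamodbClient, tableName, indexName) {
  const params = {
    TableName: tableName,
    ContributorInsightsAction: 'ENABLE'
  };
  if (indexName) {
    params.IndexName = indexName; // optional: target a global secondary index
  }
  return dynamodbClient.updateContributorInsights(params).promise();
}
```

A call such as `await enableContributorInsights(new AWS.DynamoDB(), 'Users')` would turn it on for the table. The feature is billed separately, so it is often enabled only on tables under investigation.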

CloudWatch Metrics

const AWS = require('aws-sdk');
const cloudwatch = new AWS.CloudWatch();

// Monitor key performance metrics
const monitorTablePerformance = async (tableName) => {
  const metrics = await cloudwatch.getMetricStatistics({
    Namespace: 'AWS/DynamoDB',
    MetricName: 'ConsumedReadCapacityUnits',
    Dimensions: [
      {
        Name: 'TableName',
        Value: tableName
      }
    ],
    StartTime: new Date(Date.now() - 3600000),  // Last hour
    EndTime: new Date(),
    Period: 300,  // 5-minute periods
    Statistics: ['Sum', 'Average', 'Maximum']
  }).promise();

  return metrics.Datapoints;
};

// Track throttling events
const monitorThrottling = async (tableName) => {
  const throttles = await cloudwatch.getMetricStatistics({
    Namespace: 'AWS/DynamoDB',
    MetricName: 'ReadThrottleEvents',  // Use 'WriteThrottleEvents' for writes
    Dimensions: [
      {
        Name: 'TableName',
        Value: tableName
      }
    ],
    StartTime: new Date(Date.now() - 3600000),
    EndTime: new Date(),
    Period: 300,
    Statistics: ['Sum']
  }).promise();

  return throttles.Datapoints;
};

Custom Performance Tracking

class PerformanceTracker {
  async trackOperation(operationName, operation) {
    const startTime = Date.now();
    const startMemory = process.memoryUsage().heapUsed;

    try {
      const result = await operation();
      const duration = Date.now() - startTime;
      const memoryUsed = process.memoryUsage().heapUsed - startMemory;

      // Log metrics
      console.log({
        operation: operationName,
        duration: `${duration}ms`,
        memory: `${(memoryUsed / 1024 / 1024).toFixed(2)}MB`,
        success: true
      });

      // Send to CloudWatch
      await this.sendMetric(operationName, duration);

      return result;
    } catch (error) {
      const duration = Date.now() - startTime;

      console.error({
        operation: operationName,
        duration: `${duration}ms`,
        success: false,
        error: error.message
      });

      throw error;
    }
  }

  async sendMetric(operationName, duration) {
    await cloudwatch.putMetricData({
      Namespace: 'MyApp/DynamoDB',
      MetricData: [
        {
          MetricName: 'OperationLatency',
          Value: duration,
          Unit: 'Milliseconds',
          Dimensions: [
            {
              Name: 'Operation',
              Value: operationName
            }
          ]
        }
      ]
    }).promise();
  }
}

// Usage
const tracker = new PerformanceTracker();

const user = await tracker.trackOperation('GetUser', async () => {
  return await dynamodb.get({
    TableName: 'Users',
    Key: { userId: '12345' }
  }).promise();
});

GSI Performance and Write Amplification

Global Secondary Indexes (GSIs) are one of DynamoDB’s most powerful features and one of its most dangerous performance traps. They enable alternative query patterns on your data by maintaining a separate, automatically-synchronized copy of selected attributes with a different key schema. The key word is “copy” — a GSI is not a pointer or a reference; it is a physically separate partition structure that DynamoDB maintains by replicating writes from the base table. This replication is the source of write amplification, and understanding it is essential for cost control and performance planning in production. The write amplification problem in DynamoDB GSIs is conceptually identical to the write amplification in LSM-tree databases (RocksDB, LevelDB, Cassandra) and in secondary indexes of any distributed database — you are paying for the convenience of additional access patterns with additional write overhead. The critical practical implication: a table with 5 GSIs where every write touches all indexes effectively costs 6x the WCUs of the same table with no indexes. Teams that add GSIs casually during development often face sticker shock when production traffic arrives.

1. The Write Amplification Problem

Every write to a table with GSIs triggers one or more “shadow” writes to the index partitions. Under the hood, DynamoDB’s replication subsystem detects which attributes changed, determines which GSIs are affected (only indexes whose key or projected attributes were modified), and propagates the changes asynchronously. This is similar to how MySQL’s InnoDB engine maintains secondary indexes, except that in DynamoDB the propagation happens across distributed storage nodes rather than within a single database instance.
  • Write Cost: A single PutItem that updates a GSI-indexed attribute costs WCUs on the base table plus WCUs on every affected GSI. If the old and new GSI key values differ, the operation incurs both a delete on the old GSI partition and an insert on the new one — effectively double the GSI write cost for that index.
  • Latency: GSI updates are asynchronous but highly optimized (typically propagated within milliseconds). However, if the GSI is throttled due to insufficient provisioned capacity, it can create backpressure that throttles writes to the base table itself — one of the most confusing failure modes in DynamoDB, because the throttling error appears on the base table even though the root cause is GSI capacity.
<svg viewBox="0 0 900 450" xmlns="http://www.w3.org/2000/svg">
  <!-- Title -->
  <text x="450" y="30" font-size="18" font-weight="bold" text-anchor="middle" fill="#333">
    GSI Write Amplification Flow
  </text>

  <!-- Base Table -->
  <rect x="50" y="80" width="250" height="150" fill="#e3f2fd" stroke="#1976d2" stroke-width="2" rx="5"/>
  <text x="175" y="110" font-size="14" font-weight="bold" text-anchor="middle" fill="#1976d2">Base Table (Orders)</text>
  <text x="175" y="140" font-size="11" text-anchor="middle" fill="#333">PK: OrderID</text>
  <text x="175" y="160" font-size="11" text-anchor="middle" fill="#333">Attr: Status, Date</text>

  <!-- GSI 1 -->
  <rect x="550" y="60" width="250" height="120" fill="#fff3e0" stroke="#ff9800" stroke-width="2" rx="5"/>
  <text x="675" y="90" font-size="14" font-weight="bold" text-anchor="middle" fill="#ff9800">GSI 1 (StatusIndex)</text>
  <text x="675" y="115" font-size="11" text-anchor="middle" fill="#333">PK: Status</text>

  <!-- GSI 2 -->
  <rect x="550" y="210" width="250" height="120" fill="#f1f8e9" stroke="#8bc34a" stroke-width="2" rx="5"/>
  <text x="675" y="240" font-size="14" font-weight="bold" text-anchor="middle" fill="#8bc34a">GSI 2 (DateIndex)</text>
  <text x="675" y="265" font-size="11" text-anchor="middle" fill="#333">PK: Date</text>

  <!-- Arrows -->
  <path d="M 300 130 L 540 100" stroke="#ff9800" stroke-width="2" fill="none" marker-end="url(#arroworange)"/>
  <path d="M 300 180 L 540 250" stroke="#8bc34a" stroke-width="2" fill="none" marker-end="url(#arrowgreen)"/>

  <text x="420" y="110" font-size="10" fill="#ff9800" font-weight="bold">1 WCU</text>
  <text x="420" y="230" font-size="10" fill="#8bc34a" font-weight="bold">1 WCU</text>
  <text x="175" y="260" font-size="12" font-weight="bold" text-anchor="middle" fill="#f44336">TOTAL COST: 3 WCUs</text>

  <!-- Definitions -->
  <defs>
    <marker id="arroworange" markerWidth="10" markerHeight="10" refX="9" refY="3" orient="auto">
      <polygon points="0 0, 10 3, 0 6" fill="#ff9800"/>
    </marker>
    <marker id="arrowgreen" markerWidth="10" markerHeight="10" refX="9" refY="3" orient="auto">
      <polygon points="0 0, 10 3, 0 6" fill="#8bc34a"/>
    </marker>
  </defs>
</svg>
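
The cost rules above can be turned into a rough estimator. This is a simplified sketch: it assumes standard (non-transactional) writes billed at 1 WCU per 1 KB rounded up, and it takes the projected size of each index entry as an input rather than deriving it. The names `wcusFor` and `estimateWriteWcus` are hypothetical.

```javascript
// 1 WCU per 1 KB (rounded up), minimum 1, for a standard write
const wcusFor = (bytes) => Math.max(1, Math.ceil(bytes / 1024));

// gsis: [{ affected, keyChanged, projectedBytes }]
function estimateWriteWcus(itemBytes, gsis) {
  const base = wcusFor(itemBytes);
  const index = gsis.reduce((sum, gsi) => {
    if (!gsi.affected) return sum; // this write did not touch the index
    const perWrite = wcusFor(gsi.projectedBytes);
    // A changed index key means a delete of the old entry plus a put of the new one
    return sum + (gsi.keyChanged ? 2 * perWrite : perWrite);
  }, 0);
  return { base, index, total: base + index };
}

// The scenario from the diagram: a small item and two affected indexes
console.log(estimateWriteWcus(800, [
  { affected: true, keyChanged: false, projectedBytes: 200 },
  { affected: true, keyChanged: false, projectedBytes: 200 }
]).total); // 3

// Same write, but the StatusIndex key value changes
console.log(estimateWriteWcus(800, [
  { affected: true, keyChanged: true, projectedBytes: 200 },
  { affected: true, keyChanged: false, projectedBytes: 200 }
]).total); // 4
```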

2. Index Projection Strategy

To minimize performance impact, project only the attributes necessary for the index’s specific query.
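Projection Type | Description | Performance Impact
KEYS_ONLY | Index and table keys only; smallest index size. | Lowest write and storage cost; highest read latency when non-key attributes are needed (requires a follow-up fetch from the base table).
INCLUDE | Keys plus selected attributes. | Medium (balances write cost against read latency).
ALL | Full item duplication. | Lowest read latency; highest WCU and storage cost.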
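
As an illustration of the INCLUDE option, the sketch below builds `UpdateTable` parameters that add a GSI projecting only a few non-key attributes. The table, index, and attribute names and the capacity values are placeholders, and `buildCreateIndexParams` is a hypothetical helper. It assumes a provisioned-mode table and a string-typed index key.

```javascript
// Builds UpdateTable parameters that create a GSI with an INCLUDE projection.
function buildCreateIndexParams(tableName, indexName, keyAttribute, includedAttributes) {
  return {
    TableName: tableName,
    // The new index key must be declared; 'S' (string) is assumed here
    AttributeDefinitions: [{ AttributeName: keyAttribute, AttributeType: 'S' }],
    GlobalSecondaryIndexUpdates: [{
      Create: {
        IndexName: indexName,
        KeySchema: [{ AttributeName: keyAttribute, KeyType: 'HASH' }],
        Projection: {
          ProjectionType: 'INCLUDE',
          NonKeyAttributes: includedAttributes // copied into the index alongside the keys
        },
        // Placeholder capacity; omit for on-demand tables
        ProvisionedThroughput: { ReadCapacityUnits: 50, WriteCapacityUnits: 50 }
      }
    }]
  };
}

const params = buildCreateIndexParams('Orders', 'StatusIndex', 'status', ['orderDate', 'total']);
console.log(params.GlobalSecondaryIndexUpdates[0].Create.Projection.ProjectionType); // INCLUDE
```

These parameters would be passed to `updateTable` on a low-level `AWS.DynamoDB` client. Queries against the index can then return `orderDate` and `total` without a second read from the base table.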

3. GSI Backpressure and Index Creation

When creating a new GSI on an existing table:
  1. Scanning Phase: DynamoDB scans the base table to populate the index.
  2. Backpressure: If the GSI’s provisioned write capacity is too low during creation, the scan slows down to avoid overwhelming the index.
  3. Impact on Base Table: GSI creation does not consume base table RCUs (it uses background capacity).

Interview Questions and Answers

DynamoDB performance questions in system design interviews are testing whether you understand the distributed systems mechanics beneath the API surface. The strongest candidates do not just recite optimization techniques — they explain why each technique works by connecting it to partitioning, replication, or caching fundamentals. Interviewers are particularly interested in hearing you reason about trade-offs: when is on-demand better than provisioned? When should you add a GSI vs. denormalize? When is DAX the right call vs. an external cache? Frame your answers around trade-offs rather than prescriptions.

Question 1: How do you prevent hot partitions in DynamoDB?

Answer: Hot partitions occur when traffic is unevenly distributed across partition keys. This matters because DynamoDB distributes provisioned capacity evenly across physical partitions, so a table with 10,000 RCUs across 10 partitions gives each partition only 1,000 RCUs. Adaptive capacity (introduced in 2018) mitigates this to some extent by dynamically reallocating unused capacity from cold partitions to hot ones, but it cannot exceed the table’s total provisioned throughput. Prevention strategies include:
  1. Use high-cardinality partition keys:
// BAD: Low cardinality
PK: `STATUS#${status}`  // Only a few values

// GOOD: High cardinality
PK: `USER#${userId}`  // Millions of values
  2. Write sharding:
const shardId = Math.floor(Math.random() * 10);
PK: `METRIC#${name}#SHARD#${shardId}`
  3. Time-based partitioning:
PK: `SENSOR#${sensorId}#HOUR#${hour}`
  4. Composite keys:
PK: `REGION#${region}#CUSTOMER#${customerId}`
Key principle: Distribute writes across as many partitions as possible.
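
Write sharding has a read-side cost that is worth showing: the writer picks one shard at random, so a reader has to query every shard and merge the results. A minimal sketch follows, assuming a table whose partition key attribute is named `PK` and a fixed shard count of 10. All function names are hypothetical, and pagination via `LastEvaluatedKey` is omitted for brevity.

```javascript
const SHARD_COUNT = 10; // assumption: fixed number of shards per logical key

// Writer side: spread writes for one hot logical key across SHARD_COUNT partitions
function shardedKey(name, shardCount = SHARD_COUNT) {
  const shardId = Math.floor(Math.random() * shardCount);
  return `METRIC#${name}#SHARD#${shardId}`;
}

// Reader side: enumerate every shard key for the logical key
function allShardKeys(name, shardCount = SHARD_COUNT) {
  return Array.from({ length: shardCount }, (_, i) => `METRIC#${name}#SHARD#${i}`);
}

// Scatter-gather read: query all shards in parallel and merge the items.
// docClient is assumed to be an AWS.DynamoDB.DocumentClient.
async function readAllShards(docClient, tableName, name) {
  const pages = await Promise.all(
    allShardKeys(name).map((pk) =>
      docClient.query({
        TableName: tableName,
        KeyConditionExpression: 'PK = :pk',
        ExpressionAttributeValues: { ':pk': pk }
      }).promise()
    )
  );
  return pages.flatMap((page) => page.Items);
}
```

The trade-off is that each logical read becomes `SHARD_COUNT` queries, so sharding suits write-heavy keys that are read comparatively rarely.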

Question 2: Explain the difference between on-demand and provisioned capacity modes.

Answer: On-Demand Mode:
  • Pay per request (no capacity planning)
  • Automatically scales to handle traffic
  • Higher cost per request ($1.25/M writes, $0.25/M reads)
  • Best for unpredictable or spiky workloads
  • Throttling is rare, but possible if traffic exceeds double the previous peak within 30 minutes or the per-table quota (40K RCU/WCU by default)
Provisioned Mode:
  • Pre-define RCU/WCU capacity
  • Lower cost per request ($0.00065/WCU-hour, $0.00013/RCU-hour)
  • Requires capacity planning or auto-scaling
  • Best for steady, predictable workloads
  • Can throttle if capacity exceeded
Decision criteria:
  • Use on-demand for: new apps, dev/test, unpredictable traffic
  • Use provisioned for: production, predictable traffic, cost optimization (60%+ cheaper at scale)

Question 3: How would you optimize a query that retrieves 1,000 items frequently?

Answer: Multi-layered approach:
  1. Caching (most important):
// Add Redis cache
const cached = await redis.get(cacheKey);
if (cached) return JSON.parse(cached);

const result = await dynamodb.query({...}).promise();
await redis.setex(cacheKey, 300, JSON.stringify(result.Items));
  2. DAX (DynamoDB Accelerator):
// Transparent caching with microsecond latency
const dax = new AmazonDaxClient({ endpoints: [daxEndpoint] });
const result = await dax.query({...}).promise();
  3. Projection expressions:
// Only fetch needed attributes (reduces payload size, not consumed RCUs).
// 'name' and 'status' are reserved words, so alias them.
ProjectionExpression: 'id, #name, #status',
ExpressionAttributeNames: { '#name': 'name', '#status': 'status' }
  4. Pagination:
// Don't fetch all 1000 at once
Limit: 100,  // Fetch in batches
ExclusiveStartKey: lastEvaluatedKey
  5. Parallel queries (if sharded):
const results = await Promise.all(
  shards.map(shard => dynamodb.query({...}).promise())
);
Impact: 10-100x latency reduction with caching.

Question 4: What causes throttling in DynamoDB and how do you handle it?

Answer: Causes:
  1. Exceeding provisioned capacity
  2. Hot partitions (uneven distribution)
  3. Burst capacity exhausted
  4. GSI throttling
Solutions:
  1. Exponential backoff with jitter:
async function retryWithBackoff(operation, maxRetries = 5) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await operation();
    } catch (error) {
      if (error.code === 'ProvisionedThroughputExceededException') {
        const backoff = Math.min(1000 * Math.pow(2, i), 10000);
        const jitter = Math.random() * 100;
        await new Promise(resolve => setTimeout(resolve, backoff + jitter));
      } else {
        throw error;
      }
    }
  }
  throw new Error('Max retries exceeded');
}
  2. Increase capacity:
// Switch to on-demand (updateTable lives on the low-level AWS.DynamoDB client,
// not on DocumentClient)
await dynamodb.updateTable({
  TableName: 'MyTable',
  BillingMode: 'PAY_PER_REQUEST'
}).promise();

// Or increase provisioned
await dynamodb.updateTable({
  TableName: 'MyTable',
  ProvisionedThroughput: {
    ReadCapacityUnits: 1000,
    WriteCapacityUnits: 500
  }
}).promise();
  1. Fix hot partitions:
// Add sharding
const shardId = itemId % 10;
PK: `SHARD#${shardId}#ITEM#${itemId}`
  4. Enable auto-scaling:
// Configure target tracking at 70% utilization
TargetTrackingScalingPolicyConfiguration: {
  TargetValue: 70.0,
  PredefinedMetricSpecification: {
    PredefinedMetricType: 'DynamoDBReadCapacityUtilization'
  }
}
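The hot-partition fix listed above relies on write sharding: a hot key is split across several partition key values, and readers fan out across all of them. The sketch below shows one way to generate those keys. The function names, the `#SHARD#` key format, and the shard count of 10 are illustrative assumptions, not a prescribed layout.

```javascript
// Hypothetical helpers for write sharding; names and key format are illustrative.
const SHARD_COUNT = 10;

// Pick a random shard for each write so traffic spreads across SHARD_COUNT key values.
// `random` is injectable so the behaviour can be tested deterministically.
function shardedWriteKey(baseKey, random = Math.random) {
  const shardId = Math.floor(random() * SHARD_COUNT);
  return `${baseKey}#SHARD#${shardId}`;
}

// List every shard key so a reader can query each one and merge the results.
function allShardKeys(baseKey) {
  return Array.from(
    { length: SHARD_COUNT },
    (_, shardId) => `${baseKey}#SHARD#${shardId}`
  );
}
```

The trade-off is that each logical read becomes `SHARD_COUNT` queries, so this is worth doing only for keys that are demonstrably hot. A random suffix suits append-style data such as events or counters; items that must be read back individually need a suffix derived from a stable attribute instead.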

Question 5: How do you optimize write performance for bulk data loads?

Answer: Strategies:
  1. Use BatchWriteItem:
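// Up to 25 put/delete requests per call
const result = await dynamodb.batchWrite({
  RequestItems: {
    'MyTable': items.slice(0, 25).map(item => ({
      PutRequest: { Item: item }
    }))
  }
}).promise();
// result.UnprocessedItems lists writes that were not applied and must be retried
// Saves network round trips; each item still consumes the same WCUs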
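  2. Parallel batch writes:
const batches = chunk(items, 25);  // chunk: an array-splitting helper, e.g. from lodash
await Promise.all(
  batches.map(batch => dynamodb.batchWrite({...}).promise())
);
// Much faster for thousands of items, but cap the number of in-flight
// batches so the load does not throttle itself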
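  3. Temporarily increase capacity:
// Before load (updateTable is asynchronous; wait until the table is ACTIVE again)
await updateTableCapacity(5000, 5000);

// Perform load
await bulkLoad(items);

// After load (provisioned decreases are limited per day, so scale down once)
await updateTableCapacity(100, 100);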
  4. Optimize item size:
// Remove unnecessary attributes
// Use shorter attribute names
// Compress large text fields
  5. Disable streams/triggers temporarily:
// Disable during bulk load to reduce overhead
// Re-enable after completion
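Performance: with proper parallelization and enough write capacity, 100K+ writes/sec is achievable, although rates at that level may require raising the table-level throughput quota first.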
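A bulk load is only complete once every `UnprocessedItems` entry has been resubmitted, because `BatchWriteItem` can succeed as a call while leaving some writes unapplied. The following is a minimal sketch of that loop. It assumes `docClient` exposes `batchWrite(params).promise()` like the SDK v2 `DocumentClient`; the function names, `maxAttempts`, and the delay values are illustrative assumptions.

```javascript
// Split an array into chunks of at most `size` elements.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Write all items, resubmitting anything DynamoDB reports as unprocessed.
// Batches are sent one after another; parallelism is left to the caller.
async function batchWriteAll(docClient, tableName, items, maxAttempts = 5) {
  for (const batch of chunk(items, 25)) {
    let requests = batch.map(item => ({ PutRequest: { Item: item } }));
    for (let attempt = 0; requests.length > 0; attempt++) {
      if (attempt === maxAttempts) {
        throw new Error(`${requests.length} items still unprocessed`);
      }
      if (attempt > 0) {
        // Back off with jitter before resubmitting the leftover writes
        const delay = Math.random() * Math.min(100 * 2 ** attempt, 5000);
        await new Promise(resolve => setTimeout(resolve, delay));
      }
      const result = await docClient
        .batchWrite({ RequestItems: { [tableName]: requests } })
        .promise();
      const unprocessed = result.UnprocessedItems || {};
      requests = unprocessed[tableName] || [];
    }
  }
}
```

To combine this with parallel loading, split `items` into a small fixed number of slices and run one `batchWriteAll` call per slice, which keeps the number of in-flight requests bounded.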

Question 6: Explain how DynamoDB Accelerator (DAX) improves performance.

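Answer: DAX is an in-memory cache that sits in front of DynamoDB. Benefits:
  1. Low latency: cache hits are served from memory in microseconds (around 1-2ms as seen by the client once the network hop is included), versus single-digit to low double-digit milliseconds for a direct DynamoDB read
  2. Transparent: API-compatible replacement for the DynamoDB client
  3. Automatic cache management: entries expire by TTL, so no manual invalidation
  4. Read-through: Automatically populates cache on misses
  5. Write-through: Updates the cache on writes made through DAX; writes that bypass DAX are not reflected until the entry expires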
How it works:
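// Wrap the DAX client in a DocumentClient; the calling code stays the same
const AmazonDaxClient = require('amazon-dax-client');
const dax = new AmazonDaxClient({ endpoints: [daxEndpoint] });
const docClient = new AWS.DynamoDB.DocumentClient({ service: dax });

const result = await docClient.get({
  TableName: 'Users',
  Key: { userId: '123' }
}).promise();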

// Flow:
// 1. Check DAX cache
// 2. If miss, read from DynamoDB
// 3. Populate cache
// 4. Return result
Limitations:
  • Only eventually consistent reads are cached; strongly consistent reads pass through to DynamoDB uncached
  • Requires DAX cluster deployment
  • Additional cost ($0.12/hour for t2.small)
Best for:
  • Read-heavy workloads
  • Low-latency requirements
  • Repeated reads of same items
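The four-step read-through flow described above can be illustrated with a small in-process cache. This is a sketch of the pattern only, not how DAX is implemented: the class name, the `loader` callback, the 5-minute default TTL, and the injectable clock are all assumptions made for the example.

```javascript
// Illustrative read-through cache; it mimics the flow only and is not DAX itself.
class ReadThroughCache {
  // `loader(key)` reads from the backing store (for example a DynamoDB get).
  // `ttlMs` and `now` are illustrative; `now` is injectable for testing.
  constructor(loader, ttlMs = 300000, now = Date.now) {
    this.loader = loader;
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }

  async get(key) {
    const entry = this.entries.get(key);
    if (entry && entry.expiresAt > this.now()) {
      return entry.value; // 1. Cache hit: the backing store is not touched
    }
    const value = await this.loader(key); // 2. Miss or expired: read from the store
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs }); // 3. Populate
    return value; // 4. Return result
  }
}
```

Because entries are refreshed only on expiry, a value changed directly in the backing store stays stale in the cache for up to `ttlMs`, which is the same reason cached reads are eventually consistent.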

Question 7: How do you calculate RCUs and WCUs for your table?

Answer: RCU Calculation:
// Formula: ceil(Item size / 4KB) × Reads/sec × Consistency factor

// Example: 3KB items, 100 reads/sec, eventual consistency
const itemSizeUnits = Math.ceil(3 / 4);  // 1 unit
const consistencyFactor = 0.5;  // Eventual = 0.5, Strong = 1, Transactional = 2
const rcus = itemSizeUnits * 100 * consistencyFactor;  // 50 RCUs

// Example: 6KB items, 50 reads/sec, strong consistency
const rcus2 = Math.ceil(6 / 4) * 50 * 1;  // 100 RCUs
WCU Calculation:
// Formula: ceil(Item size / 1KB) × Writes/sec × Transaction factor (Standard = 1, Transactional = 2)

// Example: 2KB items, 30 writes/sec, standard
const itemSizeUnits = Math.ceil(2 / 1);  // 2 units
const wcus = itemSizeUnits * 30 * 1;  // 60 WCUs

// Example: 1KB items, 20 writes/sec, transactional
const wcus2 = Math.ceil(1 / 1) * 20 * 2;  // 40 WCUs
Monitoring actual usage:
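// Check CloudWatch metrics for actual consumption
const consumed = await cloudwatch.getMetricStatistics({
  Namespace: 'AWS/DynamoDB',
  MetricName: 'ConsumedReadCapacityUnits',
  Dimensions: [{ Name: 'TableName', Value: 'MyTable' }],
  StartTime: new Date(Date.now() - 3600 * 1000),  // Last hour
  EndTime: new Date(),
  Period: 60,
  Statistics: ['Sum']  // Sum per period / 60 = average RCUs per second
}).promise();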
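The two formulas above can be wrapped in small functions for capacity planning. This is a minimal sketch: the function names and the `consistency` labels are assumptions for illustration, while the factors are the ones given in the formulas.

```javascript
// Read capacity units needed for a steady read rate.
// consistency: 'eventual' (0.5), 'strong' (1), or 'transactional' (2)
function requiredRcus(itemSizeKb, readsPerSecond, consistency = 'eventual') {
  const factors = { eventual: 0.5, strong: 1, transactional: 2 };
  if (!Object.prototype.hasOwnProperty.call(factors, consistency)) {
    throw new Error(`Unknown consistency: ${consistency}`);
  }
  const sizeUnits = Math.ceil(itemSizeKb / 4); // Reads are billed in 4 KB steps
  return Math.ceil(sizeUnits * readsPerSecond * factors[consistency]);
}

// Write capacity units needed for a steady write rate.
function requiredWcus(itemSizeKb, writesPerSecond, transactional = false) {
  const sizeUnits = Math.ceil(itemSizeKb); // Writes are billed in 1 KB steps
  return sizeUnits * writesPerSecond * (transactional ? 2 : 1);
}
```

With the worked examples from this answer, `requiredRcus(3, 100, 'eventual')` gives 50, `requiredRcus(6, 50, 'strong')` gives 100, `requiredWcus(2, 30)` gives 60, and `requiredWcus(1, 20, true)` gives 40. These are steady-state figures, so provisioned tables usually need headroom on top for traffic peaks.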

Question 8: What are the best practices for pagination in DynamoDB?

Answer: Standard pagination:
const paginateQuery = async (params, allItems = []) => {
  const result = await dynamodb.query(params).promise();
  const items = [...allItems, ...result.Items];

  if (result.LastEvaluatedKey) {
    // More results available
    return paginateQuery({
      ...params,
      ExclusiveStartKey: result.LastEvaluatedKey
    }, items);
  }

  return items;
};

// Usage
const allOrders = await paginateQuery({
  TableName: 'Orders',
  KeyConditionExpression: 'userId = :uid',
  ExpressionAttributeValues: { ':uid': '123' }
});
Efficient pagination with limits:
// Fetch one page at a time
const getPage = async (lastKey = null) => {
  const params = {
    TableName: 'Orders',
    KeyConditionExpression: 'userId = :uid',
    ExpressionAttributeValues: { ':uid': '123' },
    Limit: 20
  };

  if (lastKey) {
    params.ExclusiveStartKey = lastKey;
  }

  const result = await dynamodb.query(params).promise();

  return {
    items: result.Items,
    nextToken: result.LastEvaluatedKey  // Encode as an opaque token before sending to the client
  };
};

// Client-driven pagination
// Page 1
const page1 = await getPage();

// Page 2
const page2 = await getPage(page1.nextToken);
Best practices:
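  1. Use Limit to control page size; Limit caps the items evaluated before any FilterExpression is applied, so a page can contain fewer items than the limit
  2. Return LastEvaluatedKey as an opaque token, and treat a missing LastEvaluatedKey (not an empty page) as the end of the results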
  3. Don’t expose internal key structure
  4. Implement timeout handling for large scans
  5. Consider caching for expensive queries
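Practices 2 and 3 are easiest to follow when the `LastEvaluatedKey` never leaves the server in raw form. The sketch below encodes it as a signed, base64 token and verifies the signature on the way back in. The function names, the `payload.signature` token layout, and the shared `secret` are assumptions for illustration. Note that base64 only obscures the key; if key values are sensitive, the payload would need to be encrypted as well.

```javascript
const crypto = require('crypto');

// HMAC-SHA256 signature over the encoded payload, as a hex string.
function sign(payload, secret) {
  return crypto.createHmac('sha256', secret).update(payload).digest('hex');
}

// Turn a LastEvaluatedKey into a token that clients pass back unchanged.
function encodePageToken(lastEvaluatedKey, secret) {
  if (!lastEvaluatedKey) return null; // No key means there are no more pages
  const payload = Buffer.from(JSON.stringify(lastEvaluatedKey)).toString('base64');
  return `${payload}.${sign(payload, secret)}`;
}

// Verify a token and recover the ExclusiveStartKey; rejects tampered tokens.
function decodePageToken(token, secret) {
  if (!token) return undefined; // First page: no ExclusiveStartKey
  const [payload = '', signature = ''] = String(token).split('.');
  const actual = Buffer.from(signature);
  const expected = Buffer.from(sign(payload, secret));
  const valid = actual.length === expected.length &&
    crypto.timingSafeEqual(actual, expected);
  if (!valid) throw new Error('Invalid page token');
  return JSON.parse(Buffer.from(payload, 'base64').toString('utf8'));
}
```

In a paginated handler, the response would carry `encodePageToken(result.LastEvaluatedKey, secret)` and the next request would set `ExclusiveStartKey` to `decodePageToken(token, secret)`.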

Question 9: How do you monitor and optimize table performance?

Answer: Key metrics to monitor:
  1. Consumed capacity:
// CloudWatch metrics
- ConsumedReadCapacityUnits
- ConsumedWriteCapacityUnits
- ProvisionedReadCapacityUnits
- ProvisionedWriteCapacityUnits

// Calculate utilization (Consumed* is a Sum over the period; Provisioned* is per second)
const utilization = (consumedSum / periodSeconds) / provisioned * 100;
  2. Throttling:
// Monitor throttles
- ReadThrottleEvents
- WriteThrottleEvents
- ThrottledRequests
// Note: UserErrors counts other HTTP 400 errors and does not include throttling
  3. Latency:
// Track operation latency
- SuccessfulRequestLatency (broken down by the Operation dimension)
- GetItem latency
- Query latency
Optimization actions:
  1. Identify hot partitions:
# Enable CloudWatch Contributor Insights
aws dynamodb update-contributor-insights \
  --table-name MyTable \
  --contributor-insights-action ENABLE

# Review top partition keys by traffic
  2. Analyze access patterns:
// Use AWS X-Ray for request tracing
const AWSXRay = require('aws-xray-sdk');
const AWS = AWSXRay.captureAWS(require('aws-sdk'));

// View query patterns and latency
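  3. Set up alarms:
await cloudwatch.putMetricAlarm({
  AlarmName: 'HighThrottling',
  MetricName: 'ReadThrottleEvents',  // UserErrors does not include throttling
  Namespace: 'AWS/DynamoDB',
  Dimensions: [{ Name: 'TableName', Value: 'MyTable' }],
  Statistic: 'Sum',
  Threshold: 10,
  ComparisonOperator: 'GreaterThanThreshold',
  EvaluationPeriods: 2,
  Period: 300
}).promise();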
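Utilization is easy to miscalculate because consumed capacity is reported as a sum over each period, while provisioned capacity is a per-second figure. The sketch below converts `Sum` datapoints into peak utilization. The function name is an assumption, and `datapoints` is assumed to follow the `Datapoints` array shape returned by `getMetricStatistics` when `Statistics: ['Sum']` is requested.

```javascript
// Convert CloudWatch Sum datapoints for Consumed*CapacityUnits into peak utilization.
// datapoints: [{ Sum: number }, ...]; periodSeconds: the Period used in the request.
function peakUtilizationPercent(datapoints, periodSeconds, provisionedUnits) {
  if (datapoints.length === 0 || provisionedUnits <= 0) return 0;
  const peakSum = Math.max(...datapoints.map(point => point.Sum));
  const peakPerSecond = peakSum / periodSeconds; // Each Sum covers a whole period
  return (peakPerSecond / provisionedUnits) * 100;
}
```

Using the busiest period rather than the average is a deliberate choice here: throttling is driven by peaks, and an hourly average can look healthy while individual minutes exceed the provisioned rate.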

Question 10: How would you design a high-performance leaderboard system?

Answer: Requirements:
  • Millions of users
  • Real-time score updates
  • Top 100 leaderboard queries
  • User rank queries
Design:
  1. Main table (for score updates):
{
  PK: 'USER#userId',
  SK: 'SCORE',
  score: 1000,
  timestamp: '2024-01-15T10:00:00Z'
}

// Update scores efficiently
await dynamodb.update({
  TableName: 'Leaderboard',
  Key: { PK: 'USER#123', SK: 'SCORE' },
  UpdateExpression: 'SET score = score + :inc',
  ExpressionAttributeValues: { ':inc': 10 }
}).promise();
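  2. GSI for ranking (limited use):
// GSI: GSI1PK (fixed) + GSI1SK (score)
// Limited use: the fixed GSI1PK sends every score update to a single index
// partition, and GSI1SK must be rewritten whenever score changes
{
  PK: 'USER#userId',
  SK: 'SCORE',
  score: 1000,
  GSI1PK: 'LEADERBOARD',  // Fixed value
  GSI1SK: String(9999999 - score).padStart(7, '0')  // Inverted so ascending order lists top scores first
}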

// Query top 100
await dynamodb.query({
  TableName: 'Leaderboard',
  IndexName: 'GSI1',
  KeyConditionExpression: 'GSI1PK = :pk',
  ExpressionAttributeValues: { ':pk': 'LEADERBOARD' },
  Limit: 100
}).promise();
  3. ElastiCache for rankings:
// Redis Sorted Set for real-time rankings
await redis.zadd('leaderboard', score, userId);

// Get top 100
const top100 = await redis.zrevrange('leaderboard', 0, 99, 'WITHSCORES');

// Get user rank (zero-based: the top player has rank 0, and null means not ranked)
const rank = await redis.zrevrank('leaderboard', userId);
  4. Hybrid approach:
// DynamoDB: Authoritative score storage
// Redis: Real-time rankings (rebuilt periodically)
// DAX: Cache frequently accessed user scores

class LeaderboardService {
  async updateScore(userId, points) {
    // Update DynamoDB
    await dynamodb.update({...}).promise();

    // Update Redis
    await redis.zincrby('leaderboard', points, userId);
  }

  async getTop100() {
    // Read from Redis (fast)
    return await redis.zrevrange('leaderboard', 0, 99, 'WITHSCORES');
  }

  async getUserRank(userId) {
    // Read from Redis
    return await redis.zrevrank('leaderboard', userId);
  }
}
Performance: sub-millisecond ranking reads from Redis; sustained update throughput is bounded by the Redis deployment and by DynamoDB write capacity.
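The inverted sort key used by the ranking GSI is worth isolating in a helper, since an out-of-range score would silently break the ordering. This is a minimal sketch: the function names are assumptions, and `MAX_SCORE` simply mirrors the 7-digit bound used in the GSI example.

```javascript
const MAX_SCORE = 9999999; // Assumed upper bound, matching the 7-digit key in the GSI example

// Encode a score so that ascending string order equals descending score order.
function invertedScoreKey(score) {
  if (!Number.isInteger(score) || score < 0 || score > MAX_SCORE) {
    throw new RangeError(`score must be an integer between 0 and ${MAX_SCORE}`);
  }
  return String(MAX_SCORE - score).padStart(7, '0');
}

// Recover the original score from a key read back from the index.
function scoreFromKey(key) {
  return MAX_SCORE - Number(key);
}
```

Zero-padding matters because the sort key is compared as a string: without it, `'999'` would sort after `'1000'`. An alternative with the same effect is to store the score as a numeric sort key and query with `ScanIndexForward: false`.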

Summary

The overarching principle behind DynamoDB performance is that you are tuning a distributed system, not a database engine. Every optimization in this chapter ultimately comes back to three ideas: distribute load evenly across partitions, minimize the amount of data read or written per operation, and cache aggressively to avoid hitting the database at all. These principles are not unique to DynamoDB; they apply to any distributed storage system, from Cassandra to Bigtable to CockroachDB. What makes DynamoDB distinctive is that the performance model is explicit and quantified through capacity units, which means you can predict costs and performance characteristics before deploying to production, provided you understand the mechanics.
Key performance optimization strategies:
  1. Partition key design: Use high-cardinality keys, avoid hot partitions. This is the single most important decision and the hardest to change later
  2. Batch operations: Use BatchGetItem/BatchWriteItem for multiple items to amortize network round-trip overhead
  3. Caching: Implement Redis/DAX for read-heavy workloads. An 80% cache hit rate sends only one read in five to DynamoDB, which cuts read cost roughly 5x and lowers average latency
  4. Capacity planning: Choose on-demand vs provisioned based on workload predictability. Start with on-demand, migrate to provisioned once patterns stabilize
  5. Query optimization: Use projection expressions, parallel queries, and avoid FilterExpression as a substitute for proper key design
  6. GSI discipline: Treat every GSI as a write amplifier. Limit to 2-3 indexes per table in production, and use sparse indexes where possible
  7. Monitoring: Track CloudWatch metrics (especially ThrottledRequests and per-partition metrics via Contributor Insights), set up alarms before you need them
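The effect of a cache hit rate on read volume and average latency can be estimated with a few lines of arithmetic. The sketch below is a back-of-the-envelope model: the function name is an assumption, and the latencies are inputs supplied by the caller rather than measurements.

```javascript
// Estimate the effect of a cache in front of DynamoDB.
// hitRate is between 0 and 1; latencies are in milliseconds.
function cacheImpact(hitRate, cacheLatencyMs, dbLatencyMs) {
  if (hitRate < 0 || hitRate > 1) {
    throw new RangeError('hitRate must be between 0 and 1');
  }
  const dbReadFraction = 1 - hitRate; // Share of reads that still reach DynamoDB
  return {
    dbReadFraction,
    // Weighted average of hits served by the cache and misses served by DynamoDB
    averageLatencyMs: hitRate * cacheLatencyMs + dbReadFraction * dbLatencyMs
  };
}
```

For example, with an 80% hit rate, a 1 ms cache, and a 10 ms database read, about one fifth of reads still reach DynamoDB and the average latency comes to roughly 2.8 ms. Read volume therefore drops fivefold, while the latency gain is smaller because every miss still pays the full database cost.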
Performance hierarchy (fastest to slowest):
  1. Application cache (Redis): less than 1ms
  2. DAX: 1-2ms
  3. DynamoDB eventual read: 5-10ms
  4. DynamoDB strong read: 10-20ms
  5. DynamoDB query: 10-50ms
  6. DynamoDB scan: 100ms+
The practical takeaway: design your data model to serve most requests from the top of this hierarchy. If your critical path involves a scan, you have a data modeling problem, not a performance tuning problem. Effective performance optimization requires understanding your access patterns and choosing the right combination of these techniques — and having the discipline to revisit your capacity settings and caching strategy as your workload evolves.