
Troubleshooting High CPU Usage in Redis: Commands, Examples, and Analysis

High CPU usage in Redis can be caused by various factors, ranging from configuration issues to inefficient client commands. Here’s a structured approach to troubleshooting and identifying the reasons for high CPU usage in Redis:

1. Check Redis Statistics

Command:

redis-cli INFO

Example Output:

# Server
redis_version:6.2.6
...
# Clients
connected_clients:52
...
# Memory
used_memory:10485760
...
# CPU
used_cpu_sys:120.44
used_cpu_user:80.23
used_cpu_sys_children:0.00
used_cpu_user_children:0.00
...
# Stats
total_connections_received:1500
total_commands_processed:500000
instantaneous_ops_per_sec:1000
total_net_input_bytes:2048000
total_net_output_bytes:4096000
rejected_connections:0
expired_keys:500
evicted_keys:10
keyspace_hits:40000
keyspace_misses:5000
...
# Commandstats
cmdstat_get:calls=27324,usec=1034856,usec_per_call=37.87
cmdstat_set:calls=18236,usec=742339,usec_per_call=40.71
...
# Persistence
loading:0
rdb_changes_since_last_save:100
rdb_bgsave_in_progress:0
rdb_last_save_time:1627884000
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:2
aof_enabled:0
aof_rewrite_in_progress:0
aof_last_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok

Analysis:

  • connected_clients: Number of clients currently connected to the server. A high count may indicate many simultaneous operations.
  • blocked_clients: Number of clients waiting on blocking commands like BLPOP.
  • used_memory: Total memory used by Redis. High memory usage can affect performance.
  • used_memory_rss: Memory allocated by the OS for Redis.
  • used_memory_peak: Peak memory usage.
  • maxmemory: Maximum memory configuration for Redis.
  • rdb_changes_since_last_save: Changes since the last RDB save.
  • rdb_last_save_time: Timestamp of the last successful RDB save.
  • aof_enabled: Whether Append-Only File (AOF) persistence is enabled.
  • total_commands_processed: Total number of commands processed by the server.
  • instantaneous_ops_per_sec: Current operations per second.
  • expired_keys: Total number of expired keys.
  • evicted_keys: Total number of keys evicted due to memory limits.
  • used_cpu_sys / used_cpu_user: CPU time consumed by Redis in system and user modes. High values may indicate intensive operations.
  • used_cpu_sys_children / used_cpu_user_children: CPU time consumed by background (forked) processes.
  • Commandstats: Per-command statistics. Look for commands with high calls or high usec_per_call values to identify potential bottlenecks.
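The INFO output above is plain text with `field:value` lines grouped under `# Section` headers, so it is straightforward to consume programmatically. A minimal parsing sketch (the helper and the sample text below are illustrative, not part of Redis):

```python
def parse_info(text):
    """Parse `redis-cli INFO` output into a flat dict of field -> value."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip section headers ("# Clients") and blank lines.
        if not line or line.startswith("#"):
            continue
        field, _, value = line.partition(":")
        info[field] = value
    return info

info_text = """\
# Clients
connected_clients:52
# CPU
used_cpu_sys:120.44
used_cpu_user:80.23
"""

metrics = parse_info(info_text)
print(int(metrics["connected_clients"]))   # 52
print(float(metrics["used_cpu_sys"]))      # 120.44
```

Note that the used_cpu_* fields are cumulative counters since server start; to see a usage rate, sample them twice a few seconds apart and take the difference.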

2. Monitor Slow Log

Configuring the Slow Log

Before using the Slow Log, you need to configure its parameters:

  1. slowlog-log-slower-than: This setting defines the execution time threshold (in microseconds) for commands to be logged. Commands that take longer than this threshold will be recorded in the Slow Log.

Command to set the threshold:

redis-cli CONFIG SET slowlog-log-slower-than 10000

  2. slowlog-max-len: This setting defines how many entries the Slow Log retains; once the limit is reached, the oldest entries are discarded.

Command to set the length:

redis-cli CONFIG SET slowlog-max-len 128

Command:

redis-cli SLOWLOG GET

Example Output:

1) 1) (integer) 12345                  # Unique entry ID
   2) (integer) 1627884230             # Timestamp (Unix time) of when the command was executed
   3) (integer) 15000                  # Execution time in microseconds
   4) 1) "HGETALL"                     # Command and its arguments
      2) "large_hash_key"
2) 1) (integer) 12344
   2) (integer) 1627884220
   3) (integer) 25000
   4) 1) "LRANGE"
      2) "large_list_key"
      3) "0"
      4) "100"

Analyzing the Slow Log Entries

Each Slow Log entry provides detailed information about a slow command:

  1. Entry ID:
    • A unique identifier for the Slow Log entry.
  2. Timestamp:
    • The Unix timestamp indicating when the command was executed. This helps correlate slow commands with specific times and events.
  3. Execution Time:
    • The time taken by the command to execute, in microseconds. This indicates how much time was spent processing the command.
  4. Command and Arguments:
    • The actual command and its arguments that were executed. This helps in understanding which operations are slow and potentially why they are slow.
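Client libraries return Slow Log entries in the same shape shown above: an ID, a timestamp, an execution time in microseconds, and the command with its arguments. The sketch below (using hypothetical entries mirroring the example output, not a live server) ranks entries slowest-first and reports anything over the configured threshold:

```python
# Each entry mirrors a Slow Log record: (entry_id, unix_ts, usec, command_args).
entries = [
    (12345, 1627884230, 15000, ["HGETALL", "large_hash_key"]),
    (12344, 1627884220, 25000, ["LRANGE", "large_list_key", "0", "100"]),
]

THRESHOLD_USEC = 10000  # matches the slowlog-log-slower-than setting above

# Sort slowest first and report anything over the threshold.
for entry_id, ts, usec, cmd in sorted(entries, key=lambda e: e[2], reverse=True):
    if usec > THRESHOLD_USEC:
        print(f"#{entry_id}: {' '.join(cmd)} took {usec / 1000:.1f} ms")
```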

3. Analyze Command Usage

Command:

redis-cli INFO commandstats

Example Output:

cmdstat_get:calls=27324,usec=1034856,usec_per_call=37.87
cmdstat_set:calls=18236,usec=742339,usec_per_call=40.71
cmdstat_lrange:calls=5000,usec=320000,usec_per_call=64.00
cmdstat_hgetall:calls=1000,usec=150000,usec_per_call=150.00
cmdstat_zadd:calls=2500,usec=250000,usec_per_call=100.00

Detailed Breakdown

  1. cmdstat_get:
    • Calls: 27,324
    • Total Time (usec): 1,034,856 microseconds
    • Average Time Per Call: 37.87 microseconds
    • Analysis: The GET command is frequently used and performs well, with an average execution time of 37.87 microseconds per call.
  2. cmdstat_set:
    • Calls: 18,236
    • Total Time (usec): 742,339 microseconds
    • Average Time Per Call: 40.71 microseconds
    • Analysis: The SET command is also frequently used with a slightly higher average execution time than GET. This is expected as SET involves writing data.
  3. cmdstat_lrange:
    • Calls: 5,000
    • Total Time (usec): 320,000 microseconds
    • Average Time Per Call: 64.00 microseconds
    • Analysis: The LRANGE command has a higher average execution time, likely because it needs to retrieve multiple elements from a list. If large ranges are frequently requested, this can be optimized.
  4. cmdstat_hgetall:
    • Calls: 1,000
    • Total Time (usec): 150,000 microseconds
    • Average Time Per Call: 150.00 microseconds
    • Analysis: The HGETALL command retrieves all fields and values of a hash, which can be resource-intensive for large hashes. This high average time indicates potential inefficiency.
  5. cmdstat_zadd:
    • Calls: 2,500
    • Total Time (usec): 250,000 microseconds
    • Average Time Per Call: 100.00 microseconds
    • Analysis: The ZADD command adds members to a sorted set. This command has a moderate execution time, reflecting the additional work of maintaining the sorted order.
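When hunting for CPU load, total time (usec) matters as much as usec_per_call: a cheap command called many times can cost more CPU overall than an expensive one called rarely. This sketch parses the commandstats lines above and ranks commands by aggregate time:

```python
import re

stats_text = """\
cmdstat_get:calls=27324,usec=1034856,usec_per_call=37.87
cmdstat_set:calls=18236,usec=742339,usec_per_call=40.71
cmdstat_lrange:calls=5000,usec=320000,usec_per_call=64.00
cmdstat_hgetall:calls=1000,usec=150000,usec_per_call=150.00
cmdstat_zadd:calls=2500,usec=250000,usec_per_call=100.00
"""

def parse_commandstats(text):
    """Return {command: {'calls': int, 'usec': int, 'usec_per_call': float}}."""
    stats = {}
    for line in text.splitlines():
        m = re.match(r"cmdstat_(\w+):calls=(\d+),usec=(\d+),usec_per_call=([\d.]+)", line)
        if m:
            cmd, calls, usec, upc = m.groups()
            stats[cmd] = {"calls": int(calls), "usec": int(usec),
                          "usec_per_call": float(upc)}
    return stats

stats = parse_commandstats(stats_text)
# Rank by total time: GET dominates overall despite being cheapest per call.
ranked = sorted(stats, key=lambda c: stats[c]["usec"], reverse=True)
print(ranked[0])  # get
```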

4. Check for Large Keys

Command:

redis-cli --bigkeys

Example Output:

# Scanning the entire keyspace to find biggest keys as well as average sizes per type.
# Press ctrl+C to abort.

[00.00%] Biggest string found so far 'large_string_key' with 1048576 bytes
[00.00%] Biggest list found so far 'large_list_key' with 10000 items
[00.00%] Biggest set found so far 'large_set_key' with 2000 members
[00.00%] Biggest hash found so far 'large_hash_key' with 5000 fields
[00.00%] Biggest zset found so far 'large_zset_key' with 1000 members

-------- summary -------

Sampled 100000 keys in the keyspace!
Total key length in bytes is 800000 (avg len 8.00)

Biggest string found 'large_string_key' has 1048576 bytes
Biggest list found 'large_list_key' has 10000 items
Biggest set found 'large_set_key' has 2000 members
Biggest hash found 'large_hash_key' has 5000 fields
Biggest zset found 'large_zset_key' has 1000 members

Detailed Analysis

  1. Biggest String Key:
    • Key: large_string_key
    • Size: 1,048,576 bytes (1 MB)
    • Analysis: Large strings can consume significant memory. Operations like GET, SET, or APPEND on such large strings can be CPU-intensive.
  2. Biggest List Key:
    • Key: large_list_key
    • Items: 10,000 items
    • Analysis: Large lists can slow down operations like LRANGE, LPOP, or LPUSH, especially if they require iterating over many elements or accessing elements near the end.
  3. Biggest Set Key:
    • Key: large_set_key
    • Members: 2,000 members
    • Analysis: Sets with many members can be inefficient for operations like SMEMBERS or SINTER. Large sets also impact memory usage.
  4. Biggest Hash Key:
    • Key: large_hash_key
    • Fields: 5,000 fields
    • Analysis: Hashes with many fields can be resource-intensive for operations like HGETALL, HKEYS, or HVALS. Large hashes can also lead to memory fragmentation.
  5. Biggest Sorted Set Key:
    • Key: large_zset_key
    • Members: 1,000 members
    • Analysis: Sorted sets (zsets) with many members can slow down operations like ZRANGE, ZREVRANGE, or ZREM. They are also more complex due to maintaining order, which can increase CPU usage.
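A common mitigation for an oversized hash like large_hash_key is to shard it into several smaller hashes, so commands like HGETALL touch only a fraction of the fields. A minimal sketch of the bucketing idea (the key names and bucket count are illustrative, not from the source):

```python
import zlib

NUM_BUCKETS = 16  # illustrative; choose so each sub-hash stays small

def bucket_key(base_key, field, num_buckets=NUM_BUCKETS):
    """Map a hash field to one of num_buckets sub-hashes deterministically."""
    bucket = zlib.crc32(field.encode()) % num_buckets
    return f"{base_key}:{bucket}"

# The same field always lands in the same sub-hash, so single-field
# reads and writes (HGET/HSET) keep working with no extra lookups.
k1 = bucket_key("large_hash_key", "user:42")
k2 = bucket_key("large_hash_key", "user:42")
print(k1 == k2)  # True
```

The trade-off is that whole-hash operations now require iterating all buckets, which is usually acceptable when the goal is to avoid single large HGETALL calls.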

5. Evaluate Client Connections

Command:

redis-cli INFO clients

Example Output:

# Clients
connected_clients:52
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:1

Analysis:

  • connected_clients: Number of currently connected clients.
  • blocked_clients: Clients waiting for blocking commands (like BLPOP).
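To see which specific connections are busy, CLIENT LIST returns one line per client made up of space-separated key=value fields (id, addr, idle, last cmd, and so on). A small parsing sketch (the sample line is illustrative, not from a live server):

```python
# CLIENT LIST emits one line per connection of space-separated key=value
# fields. The sample line below is illustrative.
sample_line = "id=3 addr=127.0.0.1:51234 name= age=120 idle=0 cmd=hgetall"

def parse_client_line(line):
    """Turn one CLIENT LIST line into a dict of field -> value."""
    return dict(field.split("=", 1) for field in line.split())

client = parse_client_line(sample_line)
print(client["cmd"])   # hgetall
print(client["addr"])  # 127.0.0.1:51234
```

Clients with a low idle value and expensive last commands (such as HGETALL on large keys) are good candidates for investigation.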

6. Review Configuration

Command:

redis-cli CONFIG GET *

Example Output:

1) "maxmemory"
2) "0"
3) "maxclients"
4) "10000"
5) "save"
6) "900 1 300 10 60 10000"
...

Analysis:

  • maxmemory: Maximum memory Redis can use. A value of 0 means no limit.
  • maxclients: Maximum number of client connections.
  • save: Snapshotting configuration. Frequent snapshots can increase CPU usage.
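The save value above packs snapshot rules as alternating "seconds changes" pairs. Decoding them makes it easier to judge how often a background save (a fork, and therefore a CPU and memory spike) can trigger; a small sketch:

```python
def parse_save_rules(save_value):
    """Decode the 'save' config string into (seconds, changes) pairs."""
    parts = [int(p) for p in save_value.split()]
    return list(zip(parts[::2], parts[1::2]))

rules = parse_save_rules("900 1 300 10 60 10000")
for seconds, changes in rules:
    print(f"snapshot if >= {changes} changes within {seconds}s")
# Under heavy write load the tightest rule here fires every 60 seconds.
```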

7. Check for Expiry and Eviction

Command:

redis-cli INFO keyspace

Example Output:

# Keyspace
db0:keys=1000,expires=500,avg_ttl=600000
db1:keys=2000,expires=0,avg_ttl=0

Explanation and Analysis

  1. Key Count and Expiration Statistics:
    • db0:keys=1000,expires=500,avg_ttl=600000
      • keys=1000: There are 1000 keys in db0.
      • expires=500: Out of these 1000 keys, 500 keys have an expiration time set.
      • avg_ttl=600000: The average TTL (time-to-live) for the expiring keys is 600,000 milliseconds (or 10 minutes).
    • db1:keys=2000,expires=0,avg_ttl=0
      • keys=2000: There are 2000 keys in db1.
      • expires=0: None of these keys have an expiration time set.
      • avg_ttl=0: Since there are no expiring keys, the average TTL is 0.

How to Analyze

  1. High Number of Expiring Keys:
    • If a significant portion of keys have an expiration (expires value is high), it means Redis has to frequently check and remove expired keys. This can lead to high CPU usage, especially if many keys expire simultaneously or if the average TTL is short, causing frequent expirations.
  2. Average TTL:
    • A low avg_ttl means keys are expiring quickly. This could result in more frequent expiration checks and deletions, increasing CPU load.
    • A high avg_ttl means keys are retained longer, potentially reducing the immediate load from expirations but increasing the memory usage over time.
  3. Eviction Policy:
    • If you have configured Redis with a maxmemory limit and an eviction policy (e.g., allkeys-lru, volatile-lru, etc.), the system will evict keys to maintain memory usage within the specified limit. The eviction process can also contribute to CPU usage, especially under heavy load or when memory usage frequently hits the limit.
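The Keyspace lines follow a dbN:keys=...,expires=...,avg_ttl=... format, so it is easy to compute per database what fraction of keys carry a TTL, which is a rough proxy for expiration workload. An illustrative sketch using the example output above:

```python
def parse_keyspace(text):
    """Parse INFO keyspace lines into {db: {'keys', 'expires', 'avg_ttl'}}."""
    dbs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        db, _, fields = line.partition(":")
        dbs[db] = {k: int(v) for k, v in
                   (f.split("=") for f in fields.split(","))}
    return dbs

keyspace_text = """\
# Keyspace
db0:keys=1000,expires=500,avg_ttl=600000
db1:keys=2000,expires=0,avg_ttl=0
"""

for db, s in parse_keyspace(keyspace_text).items():
    ratio = s["expires"] / s["keys"]
    print(f"{db}: {ratio:.0%} of keys expire, avg TTL {s['avg_ttl'] / 1000:.0f}s")
# db0: 50% of keys expire, avg TTL 600s
# db1: 0% of keys expire, avg TTL 0s
```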
