Performance Profiling Tools

Category: DevOps and System Tools
Type: Linux Commands
Generated on: 2025-07-10 03:19:49
For: System Administration, Development & Technical Interviews


Performance Profiling Tools Cheatsheet (Linux Commands)

This cheatsheet provides a practical guide to performance profiling tools in Linux, focusing on commands essential for DevOps engineers, system administrators, and developers.

1. top - Real-Time Process Monitoring


Command Overview: Displays real-time system resource usage (CPU, memory, swap, processes). Useful for identifying processes consuming excessive resources. Excellent for a quick overview of system health.

Basic Syntax:

Terminal window
top [options]

Practical Examples:

  • Basic Usage:

    Terminal window
    top

    (Output: A continuously updating table of processes and their resource usage)

  • Sort by CPU Usage:

    Terminal window
    top -o %CPU

    (Output: Processes sorted by CPU usage, highest at the top)

  • Sort by Memory Usage:

    Terminal window
    top -o %MEM

    (Output: Processes sorted by memory usage, highest at the top)

  • Display User’s Processes:

    Terminal window
    top -u <username>

    (Output: Only processes owned by the specified user are displayed)

Common Options:

  • -d <seconds>: Set delay between screen updates.
  • -u <username>: Monitor only processes owned by a specific user.
  • -p <pid>: Monitor only a specific process ID.
  • -o <field>: Sort by a specific field (e.g., %CPU, %MEM, PID).

Interactive keys (press while top is running):

  • q: Quit top.
  • h: Display help.
  • k: Kill a process (prompts for a PID and a signal).
  • M: Sort by memory usage.
  • P: Sort by CPU usage.
  • 1: Toggle the per-CPU core usage display.

Advanced Usage:

  • Batch Mode (for scripting):

    Terminal window
    top -b -n 1 > top_output.txt

    (Output: Saves a single top snapshot to a file in batch mode. Useful for automated monitoring.)

  • Interactive commands within top: Press < or > to move the sort field one column to the left or right. Press c to toggle between the command name and the full command line.
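
A batch-mode snapshot mixes a summary block with the process table, which is awkward to post-process in scripts. The sketch below shows one way to keep only the process rows. It assumes the procps-ng layout (a blank line, then a header row whose first field is PID), and top_process_rows is a hypothetical helper name, not part of top.

```shell
# Hypothetical helper: keep only the per-process rows of `top -b` output read from stdin.
# Assumption: procps-ng layout, where each process table follows a header row
# whose first field is "PID" and ends at the next blank line.
top_process_rows() {
    awk '
        NF == 0     { in_table = 0; next }   # a blank line ends the current table
        in_table    { print }                # rows below the header
        $1 == "PID" { in_table = 1 }         # header row starts the table
    '
}

# Example on a live system (not run here):
#   top -b -n 1 -o %CPU | top_process_rows | head -n 5
```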

Tips & Tricks:

  • Use top as a first line of defense when diagnosing performance issues.
  • Pay attention to %CPU, %MEM, and RES (resident memory) columns.
  • Use -d to increase the update interval if top itself is consuming too much CPU.
  • Press k and enter a signal like 15 (SIGTERM) to gracefully stop a process. Use 9 (SIGKILL) as a last resort.

Troubleshooting:

  • High CPU usage by top itself: Lengthen the delay between updates with -d (for example, top -d 5).
  • Unresponsive top: Try killing the top process (as a last resort).

Related Commands: ps, htop, vmstat, free


2. ps - Process Status


Command Overview: Displays information about active processes. More static than top, but offers powerful filtering and formatting options. Crucial for identifying specific processes and their attributes.

Basic Syntax:

Terminal window
ps [options]

Practical Examples:

  • List all processes:

    Terminal window
    ps aux

    (Output: A list of all processes with user, PID, CPU, memory usage, and command.)

  • List processes for a specific user:

    Terminal window
    ps -u <username>

    (Output: Processes owned by the specified user.)

  • Find a process by name:

    Terminal window
    ps aux | grep <process_name>

    (Output: Processes whose command line matches the specified name.)

  • Display process tree:

    Terminal window
    ps axjf

    (Output: Processes displayed in a hierarchical tree structure, showing parent-child relationships.)

Common Options:

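  • a: Include processes of all users (BSD syntax; combined with x it selects every process).
  • u: Use the user-oriented output format (adds USER, %CPU, %MEM, and related columns).
  • x: Include processes without controlling terminals.
  • f: Draw the process hierarchy as an ASCII-art tree (forest). The UNIX-style -f is a different option that requests a full-format listing.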
  • -u <username>: Display processes owned by a specific user.
  • -p <pid>: Display information about a specific process ID.
  • -C <command>: Select by command name.
  • -o <format>: Customize the output format (e.g., pid,ppid,user,cmd).

Advanced Usage:

  • Custom Output Formatting:

    Terminal window
    ps -eo pid,ppid,user,%cpu,%mem,cmd --sort=-%cpu | head -n 11

    (Output: The header line plus the top 10 processes sorted by CPU usage, displaying PID, parent PID, user, CPU usage, memory usage, and command.)

  • Killing a process based on its name:

    Terminal window
    kill $(ps -ef | grep <process_name> | grep -v grep | awk '{print $2}')

    (Output: None on success. Sends SIGTERM to every process whose ps -ef line contains the specified name, which can match more than intended. USE WITH CAUTION!)

Tips & Tricks:

  • Use ps aux for a comprehensive process listing.
  • Use grep to filter the output of ps for specific processes.
  • Customize the output format with -o to display only the information you need.
  • The -ef options select every process (-e) and print a full-format listing (-f) with UID, PID, PPID, start time, and the full command line.
  • Use pgrep and pkill for simpler process finding and killing.
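
Filtering ps output with grep matches anywhere in the line, including the grep process itself. As a rough sketch of a stricter alternative, the helper below compares the command-name column exactly. The name pids_by_name is hypothetical, and the input is assumed to be PID and command-name pairs such as those printed by ps -eo pid=,comm=. On most systems pgrep -x covers the same need.

```shell
# Hypothetical helper: read "PID COMMAND" lines on stdin and print the PIDs
# whose command name equals $1 exactly (no substring matches).
# Assumption: command names contain no spaces.
pids_by_name() {
    awk -v name="$1" '$2 == name { print $1 }'
}

# Example on a live system (not run here):
#   ps -eo pid=,comm= | pids_by_name sshd
```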

Troubleshooting:

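  • ps showing unexpected %CPU values: ps reports CPU time used divided by the time the process has been running, so the figure is a lifetime average rather than current load. Use top for current usage, and ensure the system time is correct if start times look wrong.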
  • Process not found: Double-check the process name or PID.

Related Commands: top, htop, kill, pgrep, pkill


3. vmstat - Virtual Memory Statistics


Command Overview: Reports statistics about virtual memory, disk I/O, CPU activity, and system processes. Useful for identifying memory bottlenecks, disk I/O issues, and CPU saturation.

Basic Syntax:

Terminal window
vmstat [options] [delay] [count]

Practical Examples:

  • Basic Usage (report every second):

    Terminal window
    vmstat 1

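    (Output: Continuously updating statistics about processes, memory, swap, I/O, and CPU. The first line reports averages since boot; each later line covers the preceding one-second interval.)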

  • Report five times with a two-second delay:

    Terminal window
    vmstat 2 5

    (Output: Five reports, each two seconds apart.)

  • Show active and inactive memory:

    Terminal window
    vmstat -a

    (Output: Similar to basic usage, but with active/inactive memory breakdown.)

Common Options:

  • [delay]: Interval between reports (in seconds).
  • [count]: Number of reports to generate.
  • -a: Display active and inactive memory.
  • -d: Report disk statistics.
  • -n: Display header only once.
  • -S <unit>: Specify units (k, K, m, M).
  • -s: Display event counters and memory statistics.

Advanced Usage:

  • Disk I/O Statistics:

    Terminal window
    vmstat -d 1

    (Output: Continuously updating disk I/O statistics, showing reads and writes per device.)

  • Memory Statistics Summary:

    Terminal window
    vmstat -s

    (Output: A comprehensive summary of memory statistics, including total memory, used memory, free memory, buffers, cache, swap usage, and more.)

Tips & Tricks:

  • Pay attention to si (swap in) and so (swap out) columns. High values indicate memory pressure.
  • Monitor bi (blocks received from a block device) and bo (blocks sent to a block device) columns to identify disk I/O bottlenecks.
  • Look at the us (user CPU), sy (system CPU), and id (idle CPU) columns to understand CPU utilization.
  • Combine vmstat with other tools like iostat and mpstat for a more complete picture.
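
To make the si/so tip concrete, here is a minimal sketch that prints only the vmstat samples showing swap activity. The helper name vmstat_swap_rows is hypothetical. It looks up the si and so columns from the header row instead of hard-coding positions, since the column set differs between vmstat versions and options.

```shell
# Hypothetical helper: read vmstat output on stdin and print the data rows
# that show swap-in (si) or swap-out (so) activity.
vmstat_swap_rows() {
    awk '
        # Header row ("r b swpd free ..."): remember each column position by name.
        $1 == "r" { for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Data rows start with a number; print those with non-zero si or so.
        ("si" in col) && $1 ~ /^[0-9]+$/ && ($(col["si"]) + $(col["so"]) > 0) { print }
    '
}

# Example on a live system (not run here). The first row is the since-boot average:
#   vmstat 1 5 | vmstat_swap_rows
```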

Troubleshooting:

  • High swap activity: Indicates insufficient RAM. Consider adding more memory or optimizing memory usage.
  • High disk I/O: May indicate slow storage. Consider upgrading to faster storage or optimizing I/O patterns.

Related Commands: iostat, mpstat, free, top


4. iostat - CPU and Disk I/O Statistics


Command Overview: Reports CPU utilization and disk I/O statistics. Essential for identifying disk I/O bottlenecks and assessing the performance of storage devices.

Basic Syntax:

Terminal window
iostat [options] [disk...] [delay [count]]

Practical Examples:

  • Basic Usage (report every second):

    Terminal window
    iostat 1

    (Output: Continuously updating CPU and disk I/O statistics.)

  • Report five times with a two-second delay:

    Terminal window
    iostat 2 5

    (Output: Five reports, each two seconds apart.)

  • Report statistics for a specific disk (e.g., sda):

    Terminal window
    iostat sda 1

    (Output: Continuously updating statistics for the specified disk.)

  • Extended Statistics:

    Terminal window
    iostat -x 1

    (Output: More detailed disk I/O statistics, including queue length, service time, and utilization.)

Common Options:

  • [delay]: Interval between reports (in seconds).
  • [count]: Number of reports to generate.
  • [disk...]: Specific disks to monitor (e.g., sda, sdb). Defaults to all disks.
  • -x: Display extended statistics.
  • -p [device|ALL]: Display statistics for block devices and all their partitions.
  • -c: Show only CPU utilization.
  • -d: Show only disk utilization.
  • -N: Show the device mapper names.

Advanced Usage:

  • Detailed Disk I/O Analysis:

    Terminal window
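    iostat -x -p ALL 1

    (Output: Extended statistics for all disks and partitions, updated every second. Analyze columns like r_await, w_await, aqu-sz, and %util to pinpoint I/O bottlenecks. Column names vary by sysstat version; older releases show await, avgqu-sz, and svctm instead.)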

  • CPU Utilization Only:

    Terminal window
    iostat -c 1

    (Output: Continuously updating CPU utilization statistics.)

Tips & Tricks:

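  • Focus on the %util (disk utilization) column. Values close to 100% indicate saturation on devices that serve one request at a time. RAID arrays and SSDs handle requests in parallel, so a high %util alone does not prove they are at their limit.
  • Analyze await (average time for I/O requests to be served, including time spent queued; split into r_await and w_await in newer sysstat versions) together with the queue length to understand the nature of the I/O bottleneck. Do not rely on svctm: the sysstat documentation marks it as unreliable, and recent versions no longer print it.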
  • Use -p ALL to monitor individual partitions and identify which partitions are experiencing high I/O.
  • Combine iostat with vmstat to get a holistic view of system performance.
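
The %util check can be automated. The following is a sketch only: busy_devices is a hypothetical helper, the default threshold of 80 is an arbitrary assumption, and the parser expects the header row of iostat -dx output to start with Device and to contain a %util column.

```shell
# Hypothetical helper: read `iostat -dx` output on stdin and print devices
# whose %util is at or above a limit (default 80, an arbitrary choice).
busy_devices() {
    awk -v limit="${1:-80}" '
        # Header row: locate the %util column by name.
        $1 ~ /^Device/ {
            util = 0
            for (i = 1; i <= NF; i++) if ($i == "%util") util = i
            next
        }
        # Device rows: print the name and %util when at or above the limit.
        util && NF >= util && $util + 0 >= limit { print $1, $util }
    '
}

# Example on a live system (not run here). LC_ALL=C keeps decimal points as dots:
#   LC_ALL=C iostat -dx 1 2 | busy_devices 80
```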

Troubleshooting:

  • High %util: Indicates a disk bottleneck. Consider upgrading to faster storage or optimizing I/O patterns.
  • High await: Indicates that I/O requests are waiting for a long time. This could be due to a slow disk, a busy disk controller, or a high number of concurrent I/O requests.
  • command not found: Ensure the sysstat package is installed (e.g., apt-get install sysstat or yum install sysstat).

Related Commands: vmstat, mpstat, iotop, hdparm


5. mpstat - Per-Processor CPU Statistics


Command Overview: Reports CPU utilization statistics, globally and per processor. Useful for identifying CPU bottlenecks and understanding how load is spread across CPU cores.

Basic Syntax:

Terminal window
mpstat [options] [delay] [count]

Practical Examples:

  • Basic Usage (report every second):

    Terminal window
    mpstat 1

    (Output: Continuously updating CPU utilization statistics averaged across all CPUs, shown as the “all” row.)

  • Report five times with a two-second delay:

    Terminal window
    mpstat 2 5

    (Output: Five reports, each two seconds apart.)

  • Report statistics for a specific CPU (e.g., CPU 0):

    Terminal window
    mpstat -P 0 1

    (Output: Continuously updating statistics for CPU 0.)

  • All CPUs:

    Terminal window
    mpstat -P ALL 1

Common Options:

  • [delay]: Interval between reports (in seconds).
  • [count]: Number of reports to generate.
  • -P [cpu|ALL]: Specific CPU(s) to monitor (e.g., 0, 1, ALL). Without -P, only the global average across all CPUs is reported.
  • -u: Report CPU utilization (the default report).
  • -I {SUM|CPU|SCPU|ALL}: Report interrupt statistics.

Advanced Usage:

  • Detailed CPU Utilization Analysis:

    Terminal window
    mpstat -P ALL 1

    (Output: Statistics for each CPU core, updated every second. Analyze columns like %usr, %sys, %iowait, %idle to understand CPU utilization patterns.)

Tips & Tricks:

  • Pay attention to %usr (CPU time spent in user mode), %sys (CPU time spent in system mode), %iowait (CPU time idle while waiting for I/O), and %idle (CPU time idle).
  • High %usr indicates CPU-bound applications.
  • High %sys indicates kernel activity.
  • High %iowait indicates that the CPU is waiting for I/O operations to complete.
  • Combine mpstat with top and vmstat to get a comprehensive view of system performance.
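
As an illustration of reading the per-CPU columns, the sketch below lists CPUs whose %iowait is at or above a limit. The helper name hot_iowait_cpus and the default limit of 20 are assumptions made for this example. Column positions are taken from the header row, because the timestamp occupies one or two fields depending on the locale.

```shell
# Hypothetical helper: read `mpstat -P ALL` output on stdin and report CPUs
# whose %iowait is at or above a limit (default 20, an arbitrary choice).
hot_iowait_cpus() {
    awk -v limit="${1:-20}" '
        # Header row: locate the CPU and %iowait columns by name.
        /%iowait/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "CPU") c = i
                if ($i == "%iowait") w = i
            }
            next
        }
        # Per-CPU rows (skipping the "all" aggregate) at or above the limit.
        c && NF >= w && $c != "all" && $w + 0 >= limit {
            print "CPU " $c ": " $w "% iowait"
        }
    '
}

# Example on a live system (not run here):
#   LC_ALL=C mpstat -P ALL 1 3 | hot_iowait_cpus 20
```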

Troubleshooting:

  • High %usr: Identify the CPU-intensive processes using top and optimize their code.
  • High %sys: Investigate kernel activity using tools like perf or ftrace.
  • High %iowait: Investigate disk I/O bottlenecks using iostat and optimize I/O patterns.
  • command not found: Ensure the sysstat package is installed.

Related Commands: top, vmstat, iostat, perf


6. free - Memory Usage


Command Overview: Displays the amount of free and used memory in the system. Provides a quick overview of memory usage and swap activity.

Basic Syntax:

Terminal window
free [options]

Practical Examples:

  • Basic Usage (in kilobytes):

    Terminal window
    free

    (Output: Memory usage in kilobytes.)

  • In megabytes:

    Terminal window
    free -m

    (Output: Memory usage in megabytes.)

  • Continuously update every second:

    Terminal window
    free -s 1

    (Output: Memory usage updated every second.)

  • Display all values in bytes:

    Terminal window
    free -b

Common Options:

  • -b: Display output in bytes.
  • -k: Display output in kibibytes (default).
  • -m: Display output in mebibytes.
  • -g: Display output in gibibytes.
  • -h: Display output in human-readable format (e.g., “1.2G”).
  • -s <seconds>: Continuously update the display every <seconds> seconds.
  • -t: Display a total line.

Advanced Usage:

  • Human-Readable Output with Total Line:

    Terminal window
    free -ht

    (Output: Memory usage in human-readable format, with a total line showing the sum of RAM and swap.)

Tips & Tricks:

  • Use free -h for easy-to-understand output.
  • Pay attention to the available column. This represents the amount of memory available for new applications without swapping.
  • High swap usage indicates memory pressure.
  • The buff/cache column shows the amount of memory used for disk buffers and cache. Most of this memory can be reclaimed by the system if needed.
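
For monitoring scripts, the available column is usually more useful as a percentage of total memory. A minimal sketch, assuming the procps-ng layout in which the header has no label over the first column; mem_available_pct is a hypothetical helper name.

```shell
# Hypothetical helper: read `free` output on stdin and print available memory
# as a whole-number percentage of total memory.
mem_available_pct() {
    awk '
        # Header row has no label for the first column, so data fields are shifted by one.
        $1 == "total" { for (i = 1; i <= NF; i++) if ($i == "available") a = i + 1; next }
        $1 == "Mem:" && a { printf "%d\n", $a * 100 / $2 }
    '
}

# Example on a live system (not run here). Use a numeric unit, not -h:
#   LC_ALL=C free -k | mem_available_pct
```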

Troubleshooting:

  • Low available memory: Consider adding more RAM or optimizing memory usage.
  • High swap usage: Indicates insufficient RAM.
  • Incorrect memory values: Ensure the system is correctly detecting the installed memory.

Related Commands: top, vmstat, ps


7. iotop - I/O Monitoring (Requires python and pip install iotop)

Command Overview: A top-like utility for monitoring disk I/O usage by processes. Helps identify which processes are generating the most disk I/O. Requires installation.

Basic Syntax:

Terminal window
iotop [options]

Practical Examples:

  • Basic Usage:

    Terminal window
    iotop

    (Output: A real-time display of processes and their disk I/O usage.)

  • Display accumulated I/O:

    Terminal window
    iotop -a

    (Output: Shows the total disk I/O usage since iotop started.)

  • Only show processes doing I/O:

    Terminal window
    iotop -o

    (Output: Only displays processes actively performing I/O.)

Common Options:

  • -o: Only show processes doing I/O.
  • -a: Accumulated I/O instead of bandwidth.
  • -k: Use kilobytes instead of human-readable units.
  • -q: Quiet mode; suppresses header lines and implies batch mode. Repeat it (-qq, -qqq) to suppress more of the header and summary lines.
  • -b: Batch mode. Useful for logging.

Advanced Usage:

  • Batch Mode Logging:

    Terminal window
    iotop -b -n 5 > iotop_output.txt

    (Output: Saves five snapshots of iotop output to a file in batch mode.)

  • Accumulated I/O with Quiet Mode:

    Terminal window
    iotop -aq

Tips & Tricks:

  • iotop is invaluable for identifying processes that are causing disk I/O bottlenecks.
  • Pay attention to the DISK READ and DISK WRITE columns.
  • Use -o to focus on processes that are actively performing I/O.
  • Use -a to see the total I/O usage since iotop started.

Troubleshooting:

  • command not found: Install iotop using your distribution’s package manager (e.g., apt-get install iotop or yum install iotop) or pip install iotop.
  • Permissions issues: Run iotop as root or with sudo.
  • No I/O activity shown: Ensure that processes are actually performing disk I/O.

Related Commands: iostat, vmstat, top, lsof


8. perf - Performance Analysis


Command Overview: A powerful performance analysis tool that ships with the Linux kernel source and uses the kernel’s perf_events subsystem. Allows you to profile CPU usage, memory access patterns, and other system events. Requires root privileges for many operations.

Basic Syntax:

Terminal window
perf [command] [options]

Practical Examples:

  • List available events:

    Terminal window
    perf list

    (Output: A list of hardware and software events that can be profiled.)

  • Record CPU cycles for a command:

    Terminal window
    perf record -e cycles <command>

    (Output: Records CPU cycles while the specified command is running.)

  • Report the recorded data:

    Terminal window
    perf report

    (Output: Displays a report of the recorded data, showing the functions that consumed the most CPU cycles.)

  • Record all system calls:

    Terminal window
    perf record -e raw_syscalls:sys_enter,raw_syscalls:sys_exit -a sleep 1
    perf report

Common Options:

  • record: Record performance data.
  • report: Generate a report from recorded data.
  • top: Display a real-time profile of CPU usage.
  • list: List available events.
  • -e <event>: Specify the event to profile (e.g., cycles, cache-misses, raw_syscalls:sys_enter).
  • -a: Profile all CPUs.
  • -p <pid>: Profile a specific process ID.
  • -g: Enable call graph recording (for function-level profiling).

Advanced Usage:

  • Function-Level Profiling with Call Graph:

    Terminal window
    perf record -g -e cycles <command>
    perf report

    (Output: Records CPU cycles and call graph information, allowing you to identify the functions that are consuming the most CPU time.)

  • Profiling a Specific Process:

    Terminal window
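    perf record -p <pid> -e cycles -- sleep 10
    perf report

    (Output: Profiles CPU cycles for the specified process ID for 10 seconds. Without the trailing sleep command, recording continues until you press Ctrl+C.)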

  • Real-time profiling with perf top:

    Terminal window
    perf top

    (Output: A dynamic view of the most active functions, similar to top but at a function level.)

  • Flame Graph Generation: (Requires additional tools like FlameGraph from GitHub.)

    Terminal window
    perf record -F 99 -p <pid> --call-graph dwarf -- sleep 30
    perf script | ./FlameGraph/stackcollapse-perf.pl | ./FlameGraph/flamegraph.pl > flamegraph.svg

    (Output: Generates a flame graph (SVG image) showing the call stack and CPU time spent in each function. Extremely useful for visualizing performance bottlenecks.)

Tips & Tricks:

  • perf is a powerful but complex tool. Start with simple examples and gradually explore more advanced features.
  • Use perf list to discover available events.
  • The -g option (call graph recording) is essential for function-level profiling.
  • Flame graphs provide an excellent way to visualize performance bottlenecks.
  • Consider using perf top for real-time performance analysis.
  • Ensure debug symbols are installed for the target application to get meaningful function names in the reports.

Troubleshooting:

  • Permission denied: Run perf as root or with sudo, or have an administrator lower the kernel.perf_event_paranoid sysctl.
  • Missing debug symbols: Install debug symbols for the target application using your distribution’s package manager (e.g., apt-get install <package>-dbg or yum install <package>-debuginfo).
  • perf: not found: Install the perf package (e.g., apt-get install linux-tools-common linux-tools-$(uname -r) or yum install perf).
  • No output: Verify that the target application is running and that perf is correctly configured.
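
Permission errors from perf for non-root users are often governed by the kernel.perf_event_paranoid sysctl. The sketch below maps a level to a short description, following the levels in the kernel's sysctl documentation. The helper name describe_perf_paranoid is hypothetical, and some distributions define additional, stricter levels that are treated here the same as level 2.

```shell
# Hypothetical helper: describe what a kernel.perf_event_paranoid level
# restricts for users without extra privileges. Restrictions are cumulative.
describe_perf_paranoid() {
    level=$1
    if [ "$level" -le -1 ]; then
        echo "almost all events allowed"
    elif [ "$level" -eq 0 ]; then
        echo "raw and ftrace function tracepoints restricted"
    elif [ "$level" -eq 1 ]; then
        echo "also CPU event access restricted"
    else
        echo "also kernel profiling restricted"
    fi
}

# Example on a live system (not run here):
#   describe_perf_paranoid "$(cat /proc/sys/kernel/perf_event_paranoid)"
```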

Related Commands: oprofile, gprof, valgrind (more specialized profiling tools)


9. tcpdump - Network Packet Capture


Command Overview: A powerful command-line packet analyzer that captures and displays network traffic. Essential for diagnosing network issues, analyzing network protocols, and identifying security vulnerabilities.

Basic Syntax:

Terminal window
tcpdump [options] [expression]

Practical Examples:

  • Capture all traffic on the default interface:

    Terminal window
    tcpdump

    (Output: Continuously displays captured network packets.)

  • Capture traffic on a specific interface (e.g., eth0):

    Terminal window
    tcpdump -i eth0

    (Output: Continuously displays captured network packets on the specified interface.)

  • Capture traffic to or from a specific host (e.g., 192.168.1.100):

    Terminal window
    tcpdump host 192.168.1.100

    (Output: Displays packets to or from the specified host.)

  • Capture traffic on a specific port (e.g., port 80):

    Terminal window
    tcpdump port 80

    (Output: Displays packets on port 80.)

  • Capture traffic using a specific protocol (e.g., TCP):

    Terminal window
    tcpdump tcp

    (Output: Displays TCP packets.)

  • Save captured packets to a file:

    Terminal window
    tcpdump -w capture.pcap

    (Output: Saves captured packets to the capture.pcap file in binary format.)

  • Read packets from a file:

    Terminal window
    tcpdump -r capture.pcap

    (Output: Reads and displays packets from the capture.pcap file.)

Common Options:

  • -i <interface>: Specify the network interface to capture traffic on.
  • -w <file>: Save captured packets to a file.
  • -r <file>: Read packets from a file.
  • -n: Do not resolve hostnames.
  • -nn: Do not resolve hostnames or port names.
  • -v: Verbose output.
  • -vv: More verbose output.
  • -vvv: Most verbose output.
  • -X: Print the packet’s contents in both hexadecimal and ASCII.
  • -XX: Print the packet’s contents in hexadecimal and ASCII, including the Ethernet header.
  • -s <snaplen>: Set the snapshot length (the number of bytes to capture from each packet). Use -s 0 to capture the entire packet.
  • expression: A filter expression to specify which packets to capture (e.g., host 192.168.1.100, port 80, tcp, udp).

Advanced Usage:

  • Capture traffic to or from a specific host on a specific port:

    Terminal window
    tcpdump host 192.168.1.100 and port 80

    (Output: Displays packets to or from the specified host on port 80.)

  • Capture traffic to or from a specific network:

    Terminal window
    tcpdump net 192.168.1.0/24

    (Output: Displays packets whose source or destination is in the specified network. Use src net or dst net to match one direction only.)

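  • Capture packets with the SYN flag set (TCP connection initiation):

    Terminal window
    tcpdump 'tcp[tcpflags] & tcp-syn != 0'

    (Output: Displays every TCP packet with the SYN flag set, which includes SYN-ACK replies. To match only the initial SYN, use 'tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn'.)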

  • Capture packets of at least a specific size:

    Terminal window
    tcpdump 'greater 1000'

    (Output: Displays packets whose length is 1000 bytes or more.)
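
Filter primitives such as host, port, and tcp combine with and, or, and not. When a script assembles the expression from several conditions, wrapping each one in parentheses keeps the precedence unambiguous. A small sketch under that assumption; pcap_filter is a hypothetical helper name, and it only builds the string without running tcpdump.

```shell
# Hypothetical helper: join filter primitives with "and", wrapping each in
# parentheses so that a primitive containing "or" keeps its own grouping.
pcap_filter() {
    filter=""
    for part in "$@"; do
        if [ -z "$filter" ]; then
            filter="($part)"
        else
            filter="$filter and ($part)"
        fi
    done
    printf '%s\n' "$filter"
}

# Example on a live system (not run here; quoting stops the shell from
# interpreting the parentheses):
#   sudo tcpdump -nn -c 10 "$(pcap_filter 'host 192.168.1.100' 'port 80 or port 443')"
```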

Tips & Tricks:

  • Use -n to avoid DNS lookups, which can slow down packet capture.
  • Use -w to save captured packets to a file for later analysis with tools like Wireshark.
  • Use filter expressions to narrow down the captured traffic and focus on the packets of interest.
  • Be mindful of the amount of traffic you are capturing, as it can consume significant disk space.
  • Use -s 0 to capture the entire packet, but be aware that this can increase the size of the captured data.

Troubleshooting:

  • Permission denied: Run tcpdump as root or with sudo.
  • tcpdump: listening on <interface>, link-type EN10MB (Ethernet), capture size 262144 bytes but no output: Verify that there is network traffic on the specified interface and that the filter expression is correct.
  • tcpdump: pcap_loop: The interface is in promiscuous mode, but the filter is not set.: This is a warning message, not an error. It means that the interface is in promiscuous mode (capturing all traffic), but no filter expression is specified.
  • Capture not working: Ensure your firewall is not blocking the traffic you are trying to capture.

Related Commands: Wireshark, tshark, netstat, ss


This cheatsheet provides a solid foundation for performance profiling in Linux. Remember to consult the man pages for each command for a complete list of options and features: man <command>. Experiment with these tools in a safe environment before using them in production. Good luck!