Log File Analysis and Management

Category: Intermediate Linux Commands
Type: Linux Commands
Generated on: 2025-07-10 03:10:16
For: System Administration, Development & Technical Interviews


Log File Analysis and Management Cheatsheet (Linux)

This cheatsheet provides a practical guide to log file analysis and management using intermediate Linux commands. It’s designed for both system administrators and developers.

| Command | Description | When to Use |
| --- | --- | --- |
| tail | Displays the last part of a file. | Monitor log files in real time, view recent events. |
| head | Displays the beginning of a file. | Quickly inspect the initial lines of a log file. |
| cat | Concatenates and displays files. | View entire log files, append log files. Use with caution on very large files. |
| less | Displays file content page by page. | Browse large log files efficiently. |
| grep | Searches for patterns in files. | Extract specific information from log files. |
| awk | Pattern scanning and processing language. | Extract, transform, and report data from log files. |
| sed | Stream editor for text manipulation. | Replace, delete, or insert text in log files. Use with caution! |
| sort | Sorts lines of text files. | Order log entries based on timestamp or other fields. |
| uniq | Reports or omits repeated lines. | Identify frequently occurring log entries. |
| wc | Counts lines, words, and bytes. | Get a quick overview of log file size. |
| find | Searches for files in a directory hierarchy. | Locate specific log files based on name, size, or modification time. |
| rsyslogd | Rocket-fast system for log processing. | Centralized logging, filtering, and forwarding of log messages. |
| logrotate | Rotates, compresses, and manages log files. | Automate log file maintenance. |
| journalctl | Queries the systemd journal. | View and analyze systemd logs. |

tail [OPTIONS] [FILE]
head [OPTIONS] [FILE]
cat [OPTIONS] [FILE...]
less [OPTIONS] FILE
grep [OPTIONS] PATTERN [FILE...]
awk '[CONDITION] { ACTION }' [FILE...]
sed [OPTIONS] 'COMMAND' [FILE...]
sort [OPTIONS] [FILE...]
uniq [OPTIONS] [INPUT [OUTPUT]]
wc [OPTIONS] [FILE...]
find [PATH] [OPTIONS] [EXPRESSION]

rsyslogd configuration file: /etc/rsyslog.conf

logrotate configuration files: /etc/logrotate.conf and files in /etc/logrotate.d/

journalctl [OPTIONS]

# Display the last 10 lines of syslog
tail /var/log/syslog
# Display the last 50 lines of auth.log
tail -n 50 /var/log/auth.log
# Follow the log file for real-time updates
tail -f /var/log/apache2/access.log

Sample Output:

... (last 10 lines of /var/log/syslog) ...

# Display the first 20 lines of error.log
head -n 20 /var/log/apache2/error.log

Sample Output:

... (first 20 lines of /var/log/apache2/error.log) ...
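
head and tail can also be combined to pull a range of lines out of the middle of a file. The following is a minimal sketch that uses a throwaway file of numbered lines; the file and its contents are made up for illustration and stand in for a real log.

```shell
# Create a throwaway 10-line file (hypothetical data, not a real log).
log=$(mktemp)
trap 'rm -f "$log"' EXIT
seq 1 10 | sed 's/^/line /' > "$log"

# Lines 3-5: take the first 5 lines, then keep the last 3 of those.
middle=$(head -n 5 "$log" | tail -n 3)
echo "$middle"

# tail -n +N starts output at line N instead of counting from the end.
from_eighth=$(tail -n +8 "$log")
echo "$from_eighth"
```

The same pattern works on a real log by swapping in its path, for example to look at the lines around a known line number.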

# Display the entire contents of a small log file
cat /var/log/dmesg
# Append one log file to another (USE WITH CAUTION - can be large)
cat /var/log/syslog >> /tmp/combined_log.txt

Sample Output:

... (entire content of /var/log/dmesg) ...

# Browse a large log file
less /var/log/nginx/access.log
# Inside less, type /error and press Enter to search forward;
# press n for the next match and q to quit

Sample Output: (interactive page-by-page display)

# Find lines containing "error" in error.log
grep "error" /var/log/apache2/error.log
# Find lines containing "failed password" in auth.log, case-insensitive
grep -i "failed password" /var/log/auth.log
# Find lines that DO NOT contain "success"
grep -v "success" /var/log/auth.log
# Count the lines that contain "error" (matching lines, not individual occurrences)
grep -c "error" /var/log/apache2/error.log

Sample Output:

[Fri Oct 27 10:00:00 2023] [error] ...
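
Several grep options are often combined when triaging a log. The sketch below builds a small sample log (its contents are invented for illustration) and shows extended-regex alternation, line numbers, counting, and context lines.

```shell
# Build a small sample log (hypothetical contents) so the commands are safe to try.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
10:00:01 info  service started
10:00:05 error disk quota exceeded
10:00:09 info  retrying write
10:00:12 ERROR write failed
10:00:15 warn  failed password for admin
EOF

# -E enables extended regex so | means "or"; -i ignores case; -n prefixes line numbers.
matches=$(grep -E -i -n 'error|failed' "$log")
echo "$matches"

# -c counts matching lines, so a line containing both words is counted once.
match_count=$(grep -E -i -c 'error|failed' "$log")
echo "matching lines: $match_count"

# -A 1 prints one line of trailing context after each match.
grep -A 1 'quota' "$log"
```

Note that the fourth sample line contains both "ERROR" and "failed" but contributes only one to the count.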

# Print the first field (e.g., IP address) from access.log
awk '{print $1}' /var/log/apache2/access.log
# Print the timestamp and request path from access.log (assuming the common log format;
# the timestamp in $4 keeps its leading "[")
awk '{print $4, $7}' /var/log/apache2/access.log
# Print only the entries with status code 500 (field 9 in the common log format)
awk '$9 == 500 {print $0}' /var/log/apache2/access.log
# Calculate the total number of bytes transferred (sum of the 10th field)
awk '{sum += $10} END {print "Total bytes: " sum}' /var/log/apache2/access.log

Sample Output:

192.168.1.100
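
awk's associative arrays make it easy to group log entries by a field. As a rough sketch, the block below counts requests per status code on a small sample access log; the log lines are invented and assume the common log format, where the status code is field 9.

```shell
# Sample access log in common log format (hypothetical entries).
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
192.168.1.100 - - [27/Oct/2023:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1024
192.168.1.101 - - [27/Oct/2023:10:00:02 +0000] "GET /missing HTTP/1.1" 404 512
192.168.1.100 - - [27/Oct/2023:10:00:03 +0000] "POST /api HTTP/1.1" 500 256
192.168.1.100 - - [27/Oct/2023:10:00:04 +0000] "GET /index.html HTTP/1.1" 200 1024
EOF

# count[] is keyed by status code; the END block prints one line per code.
# awk does not guarantee iteration order, so the result is piped through sort.
status_counts=$(awk '{count[$9]++} END {for (code in count) print code, count[code]}' "$log" | sort)
echo "$status_counts"
```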

# Replace "error" with "warning" (prints the result to stdout; the file itself is unchanged)
sed 's/error/warning/g' /var/log/my_app.log
# Delete lines containing "debug" (again, only in the output)
sed '/debug/d' /var/log/my_app.log
# To modify the file itself, add -i. Create a backup first!
cp /var/log/my_app.log /var/log/my_app.log.bak
sed -i 's/error/warning/g' /var/log/my_app.log

WARNING: With -i, sed modifies the file in place. Always create a backup before using sed -i for replacements or deletions.
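
The difference between previewing an edit and applying it in place can be checked safely on a scratch file. This is a minimal sketch under the assumption that sed accepts a backup suffix attached to -i (as GNU and BSD sed do); the file name and contents are made up.

```shell
# Work on a scratch copy (hypothetical file), never on a live log.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
log="$work/my_app.log"
printf 'error: disk full\ndebug: cache miss\nerror: timeout\n' > "$log"

# Without -i, sed writes the edited text to stdout and leaves the file untouched.
preview=$(sed 's/error/warning/g' "$log")
echo "$preview"
still_has_error=$(grep -c 'error' "$log")

# With -i.bak, sed edits the file in place and keeps the original as my_app.log.bak.
sed -i.bak '/debug/d' "$log"
remaining_lines=$(wc -l < "$log")
echo "lines after in-place delete: $remaining_lines"
```

Using a suffix with -i gives an automatic backup, which is a reasonable default whenever a log has to be edited at all.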

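# Sort log entries by timestamp (a plain text sort is chronological only when each line
# starts with a sortable timestamp such as ISO 8601)
sort /var/log/my_app.log > sorted_log.txt
# Sort numerically by the leading number on each line
# (add -k to sort on another field, e.g. -k3,3n for a PID in field 3)
sort -n /var/log/my_app.log
# Sort in reverse order
sort -r /var/log/my_app.log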

Sample Output: (sorted log entries)
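
Sorting on a specific field uses -k. The sketch below assumes a made-up log layout with an ISO 8601 timestamp in field 1 and a process ID in field 3; real logs will need different field numbers.

```shell
# Hypothetical log: ISO 8601 timestamp, process name, PID, message.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
2023-10-27T10:00:09 worker 412 job finished
2023-10-27T10:00:01 worker 97 job started
2023-10-27T10:00:05 worker 1032 job queued
EOF

# ISO 8601 timestamps sort chronologically as plain text.
by_time=$(sort "$log")
echo "$by_time"

# -k3,3n restricts the sort key to field 3 and compares it numerically,
# so 97 comes before 412 and 1032 (a text sort would put 1032 first).
by_pid=$(sort -k3,3n "$log")
echo "$by_pid"
```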

# Count requests per IP address in access.log
# (uniq only collapses adjacent duplicates, so sort first)
awk '{print $1}' /var/log/apache2/access.log | sort | uniq -c
# List each IP address once
awk '{print $1}' /var/log/apache2/access.log | sort -u
# Count the number of unique IP addresses
awk '{print $1}' /var/log/apache2/access.log | sort -u | wc -l

Sample Output:

123 192.168.1.100
45 192.168.1.101
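
Because uniq compares only neighbouring lines, skipping the sort step silently produces wrong counts. A small sketch with invented IP addresses shows the difference.

```shell
# Hypothetical list of client IPs in arrival order (duplicates are not adjacent).
ips=$(mktemp)
trap 'rm -f "$ips"' EXIT
printf '192.168.1.100\n192.168.1.101\n192.168.1.100\n192.168.1.100\n' > "$ips"

# Without sorting, the first .100 entry is separated from the other two,
# so uniq reports three groups.
unsorted_groups=$(uniq -c "$ips" | wc -l)

# After sorting, identical lines are adjacent and uniq reports two groups.
sorted_groups=$(sort "$ips" | uniq -c | wc -l)

# Most frequent address: sort by the count column in descending order.
top_ip=$(sort "$ips" | uniq -c | sort -nr | head -n 1 | awk '{print $2}')
echo "unsorted: $unsorted_groups, sorted: $sorted_groups, top: $top_ip"
```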

# Count the number of lines in a log file
wc -l /var/log/syslog
# Count the number of words in a log file
wc -w /var/log/syslog
# Count the number of bytes in a log file
wc -c /var/log/syslog

Sample Output:

12345 /var/log/syslog

# Find log files modified in the last 7 days
find /var/log -name "*.log" -mtime -7
# Find log files larger than 10MB
find /var/log -name "*.log" -size +10M
# Find and delete log files older than 30 days (USE WITH CAUTION!)
find /var/log -name "*.log" -mtime +30 -delete

WARNING: The -delete option of find is destructive. Use with extreme caution. Test with -print first to see what files will be deleted.
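
The dry-run advice can be rehearsed in a scratch directory before touching /var/log. This sketch assumes GNU touch (for the relative date passed to -d); the directory and file names are invented.

```shell
# Scratch directory with one fresh log and two files backdated by 40 days.
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
touch "$dir/recent.log"
touch -d '40 days ago' "$dir/old.log"   # assumes GNU touch
touch -d '40 days ago' "$dir/old.txt"

# Dry run: -print lists what would be removed; -type f skips directories.
candidates=$(find "$dir" -type f -name '*.log' -mtime +30 -print)
echo "would delete: $candidates"

# Only after reviewing the list, swap -print for -delete.
find "$dir" -type f -name '*.log' -mtime +30 -delete
remaining=$(find "$dir" -type f | sort)
echo "$remaining"
```

Only old.log matches both the name pattern and the age test, so recent.log and old.txt survive.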

Example configuration in /etc/rsyslog.conf

# Send all messages to a remote server over UDP (use @@ instead of @ for TCP)
*.* @192.168.1.200:514
# Write kernel messages of severity err or higher to a dedicated file
kern.err /var/log/kernel_errors.log
# Log everything from a specific application to a dedicated file
if $programname == 'my_app' then /var/log/my_app.log

Restart rsyslog after making changes:

sudo systemctl restart rsyslog

Example configuration in /etc/logrotate.d/my_app

/var/log/my_app.log {
daily
rotate 7
compress
delaycompress
missingok
notifempty
create 640 root adm
}

Explanation:

  • daily: Rotate logs daily.
  • rotate 7: Keep 7 rotated logs.
  • compress: Compress rotated logs.
  • delaycompress: Postpone compressing a rotated log until the following rotation, so the most recent rotated file stays uncompressed.
  • missingok: Do not report an error if the log file is missing.
  • notifempty: Do not rotate the log if it’s empty.
  • create 640 root adm: Create a new log file after rotation with permissions 640, owned by root:adm.

Force log rotation:

sudo logrotate -f /etc/logrotate.d/my_app

# View all systemd logs
journalctl
# View logs for a specific service
journalctl -u apache2.service
# View logs from the current boot
journalctl -b
# View logs from the previous boot
journalctl -b -1
# View logs from a specific time
journalctl --since "yesterday"
journalctl --until "now"
journalctl --since "2023-10-26 10:00:00"
# Follow logs in real-time
journalctl -f
# Show warnings and anything more severe (err, crit, alert, emerg)
journalctl -p warning
# Show only messages in the err-to-warning range
journalctl -p err..warning
# Save logs to a file
journalctl > systemd_logs.txt

| Command | Option | Description |
| --- | --- | --- |
| tail, head | -n <number> | Specify the number of lines to display. |
| tail | -f | Follow the file for real-time updates. |
| grep | -i | Case-insensitive search. |
| grep | -v | Invert the search (show lines that don’t match). |
| grep | -c | Count the number of matching lines. |
| grep | -r | Recursive search in directories. |
| grep | -l | List only the names of files containing matches. |
| awk | -F <delimiter> | Specify a field delimiter. |
| sed | -i | Edit the file in place (USE WITH CAUTION!). |
| sort | -n | Numeric sort. |
| sort | -r | Reverse sort. |
| sort | -k <field> | Sort by a specific field. |
| uniq | -c | Count the number of occurrences of each unique line. |
| find | -name <pattern> | Find files matching a name pattern. |
| find | -mtime <n> | Find files modified n days ago. -mtime +n for older than n days, -mtime -n for newer than n days. |
| find | -size <+-><size> | Find files larger (+) or smaller (-) than a specified size (e.g., +10M). |
| journalctl | -u <unit> | Filter logs by systemd unit (service). |
| journalctl | -b | Filter logs by boot. |
| journalctl | --since <date> | Filter logs since a specific date/time. |
| journalctl | --until <date> | Filter logs until a specific date/time. |
| journalctl | -f | Follow logs in real-time. |
| journalctl | -p <priority> | Filter logs by priority (e.g., err, warning, info). A single level also includes everything more severe. |

# Find error messages in error.log, strip the leading [timestamp] so identical messages
# group together, and display the 10 most frequent
grep "error" /var/log/apache2/error.log | sed 's/^\[[^]]*] //' | sort | uniq -c | sort -nr | head -n 10
# Find all IP addresses that have generated more than 100 requests
awk '{print $1}' /var/log/apache2/access.log | sort | uniq -c | awk '$1 > 100 {print $2, $1}'
# Extract all unique URLs accessed on a given day from the access log
grep "27/Oct/2023" /var/log/apache2/access.log | awk '{print $7}' | sort -u
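
The same sort | uniq -c pattern can bucket requests by time. The sketch below counts requests per hour and finds the most requested URL on a small invented access log; it assumes the common log format, where field 4 looks like [27/Oct/2023:10:00:00.

```shell
# Sample access log in common log format (hypothetical entries).
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
192.168.1.100 - - [27/Oct/2023:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1024
192.168.1.101 - - [27/Oct/2023:10:15:02 +0000] "GET /about HTTP/1.1" 200 512
192.168.1.100 - - [27/Oct/2023:11:00:03 +0000] "GET /index.html HTTP/1.1" 200 1024
192.168.1.102 - - [27/Oct/2023:11:30:00 +0000] "GET /index.html HTTP/1.1" 404 0
192.168.1.100 - - [27/Oct/2023:11:45:00 +0000] "GET /contact HTTP/1.1" 200 256
EOF

# substr($4, 2, 14) skips the leading "[" and keeps "27/Oct/2023:10" (date plus hour).
per_hour=$(awk '{print substr($4, 2, 14)}' "$log" | sort | uniq -c | awk '{print $2, $1}')
echo "$per_hour"

# Most requested URL: count field 7, sort by count descending, keep the first row.
top_url=$(awk '{print $7}' "$log" | sort | uniq -c | sort -nr | head -n 1 | awk '{print $2}')
echo "top URL: $top_url"
```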

# Extract IP addresses and the number of bytes transferred from access.log
awk '{print "IP: " $1 ", Bytes: " $10}' /var/log/apache2/access.log
# Calculate the average response size (field 10 is the response size in bytes; "-" entries count as 0)
awk '{total += $10; count++} END {if (count > 0) print "Average response size: " total/count " bytes"; else print "No requests found"}' /var/log/apache2/access.log
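
Entries whose size field is "-" pull the average down because awk treats them as 0. One possible refinement, sketched here on invented data, is to skip non-numeric sizes and to total the bytes per client IP.

```shell
# Sample access log (hypothetical entries); the second line has no response size.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
192.168.1.100 - - [27/Oct/2023:10:00:00 +0000] "GET /a HTTP/1.1" 200 1000
192.168.1.101 - - [27/Oct/2023:10:00:01 +0000] "GET /a HTTP/1.1" 304 -
192.168.1.100 - - [27/Oct/2023:10:00:02 +0000] "GET /b HTTP/1.1" 200 500
192.168.1.102 - - [27/Oct/2023:10:00:03 +0000] "GET /c HTTP/1.1" 200 300
EOF

# Only lines whose 10th field is all digits contribute to the average.
avg=$(awk '$10 ~ /^[0-9]+$/ {total += $10; count++}
           END {if (count > 0) printf "%.1f\n", total / count; else print "n/a"}' "$log")
echo "average response size: $avg bytes"

# Total bytes per client IP, using an array keyed by field 1.
per_ip=$(awk '$10 ~ /^[0-9]+$/ {bytes[$1] += $10}
              END {for (ip in bytes) print ip, bytes[ip]}' "$log" | sort)
echo "$per_ip"
```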

Using sed for Log Sanitization (Example - Remove PII)

# WARNING: Be EXTREMELY careful when using sed to sanitize logs.
# Ensure you understand the regular expressions and test thoroughly.
# Create a backup first!
cp /var/log/my_app.log /var/log/my_app.log.bak
# Example: Replace email addresses with "REDACTED"
# (-E enables extended regular expressions, which the + and {2,} quantifiers require)
sed -E 's/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/REDACTED/g' /var/log/my_app.log
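
Before running a redaction pattern against a real log, it helps to try it on a few representative lines. This is a sketch with invented log lines and addresses; it uses sed -E because the pattern relies on extended-regex quantifiers.

```shell
# Two hypothetical log lines containing email addresses.
sample='2023-10-27 login ok user=alice@example.com ip=10.0.0.5
2023-10-27 reset sent to bob.smith+test@mail.example.org'

# Pipe the sample through the redaction pattern and capture the result.
redacted=$(printf '%s\n' "$sample" |
  sed -E 's/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/REDACTED/g')
echo "$redacted"

# Quick check that no "@" survived the substitution.
if printf '%s\n' "$redacted" | grep -q '@'; then
  echo "some addresses were not redacted"
else
  echo "no addresses left"
fi
```

The surrounding text (such as user= and ip=10.0.0.5) is left intact, which is worth confirming for each pattern before applying it in place.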

Analyzing Systemd Logs with journalctl and grep

# Find all errors related to a specific process ID (PID)
journalctl _PID=12345 | grep "error"
# Analyze boot time performance (get time between systemd startup and target reached)
journalctl -b | grep "Reached target multi-user.target"
journalctl -b | grep "Startup finished"
  • Use aliases: Create aliases for frequently used commands to save time (e.g., alias tl='tail -f /var/log/syslog').
  • Use wildcards: Use wildcards to analyze multiple log files at once (e.g., grep "error" /var/log/apache2/*.log).
  • Redirect output to a file: Save the output of a command to a file for later analysis (e.g., grep "error" /var/log/syslog > errors.txt).
  • Use watch command to periodically execute commands: watch -n 5 'tail /var/log/syslog' will execute the tail command every 5 seconds.
  • Use tee command to both display and save output: tail -f /var/log/syslog | tee log_output.txt will display the syslog in the terminal and save it to log_output.txt.
  • Be mindful of performance: Avoid dumping extremely large log files to the terminal with cat. Use less or tail to view only the part you need.
  • Backup before modifying: Always back up log files before using sed or other commands that can modify them.
  • Learn regular expressions: Regular expressions are extremely powerful for pattern matching with grep and sed.
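
Several of these tips combine naturally. The following sketch searches multiple files with a wildcard, shows the matches on screen while saving them with tee, and gets per-file counts; the directory, file names, and contents are all made up.

```shell
# Scratch directory with two hypothetical application logs.
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
printf 'ok\nerror: a\n' > "$dir/app1.log"
printf 'error: b\nerror: c\n' > "$dir/app2.log"

# The wildcard expands to both logs; -H prefixes each match with its file name.
# tee prints the matches and writes them to errors.txt at the same time.
grep -H 'error' "$dir"/*.log | tee "$dir/errors.txt"
saved=$(wc -l < "$dir/errors.txt")
echo "saved $saved matching lines"

# With several files, -c prints one count per file.
per_file=$(grep -c 'error' "$dir"/*.log)
echo "$per_file"
```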

| Error | Solution |
| --- | --- |
| grep: /var/log/mylog.txt: No such file or directory | Verify the file path is correct. |
| Permission denied | Use sudo to run the command with elevated privileges. |
| sed: -e expression #1, char 1: unknown command: 's' | Check the sed command syntax and ensure the command is properly quoted. |
| logrotate: error: /etc/logrotate.conf:22 lines must begin with a keyword or a filename | Check the logrotate configuration file for syntax errors. Use logrotate -d /etc/logrotate.conf to debug. |
| journalctl: Failed to issue method call: Unit dbus-org.freedesktop.journal1.service not found. | Ensure the systemd-journald service is running: sudo systemctl start systemd-journald |

  • dmesg: Display kernel messages.
  • strace: Trace system calls and signals.
  • lsof: List open files.
  • netstat, ss: Network statistics.
  • tcpdump: Network packet analyzer.
  • systemctl: Control the systemd system and service manager.

This cheatsheet provides a solid foundation for log file analysis and management in Linux. Remember to practice and experiment with these commands to become proficient in their use. Always prioritize safety and data integrity when working with log files.