
Text Processing (grep, sed, awk)

Category: Linux Command Basics
Type: Linux Commands
Generated on: 2025-07-10 03:05:57
For: System Administration, Development & Technical Interviews


Text Processing Cheatsheet: grep, sed, awk (Linux)


This cheatsheet provides a comprehensive guide to using grep, sed, and awk for text processing in Linux. It’s designed for both beginners and experienced users, covering basic syntax, practical examples, advanced techniques, and troubleshooting tips.

  • grep (Global Regular Expression Print): Searches for patterns within files and prints lines that match. Primarily used for finding text within files.

  • sed (Stream EDitor): Edits text streams. Used for find and replace, deleting lines, inserting text, and other text transformations. Operates on a line-by-line basis.

  • awk: A powerful text processing language that can be used for data extraction, report generation, and more complex text manipulations. Works by dividing each line into fields.

Basic Syntax

    grep [OPTIONS] PATTERN [FILE...]
    sed [OPTIONS] 'COMMAND' [FILE...]
    awk [OPTIONS] 'CONDITION { ACTION }' [FILE...]
grep: Practical Examples

  • Find all lines containing “error” in logfile.txt:

    grep "error" logfile.txt
    # Sample Output:
    # 2023-10-27 10:00:00 ERROR: Something went wrong
    # 2023-10-27 10:05:00 ERROR: Another error occurred
  • Find all lines that do not contain “success” in logfile.txt:

    grep -v "success" logfile.txt
  • Find all lines starting with “DEBUG” (case-insensitive):

    grep -i "^debug" logfile.txt
    # Sample Output:
    # DEBUG: Starting process
    # Debug: Another debug message
sed: Practical Examples

  • Replace the first occurrence of “old” with “new” in file.txt and print to stdout:

    sed 's/old/new/' file.txt
    # Sample Input (file.txt):
    # This is an old file with an old problem.
    # Sample Output:
    # This is an new file with an old problem.
  • Replace all occurrences of “old” with “new” in file.txt and print to stdout:

    sed 's/old/new/g' file.txt
    # Sample Input (file.txt):
    # This is an old file with an old problem.
    # Sample Output:
    # This is an new file with an new problem.
  • Replace all occurrences of “old” with “new” in file.txt and save the changes in place (use with caution!):

    sed -i 's/old/new/g' file.txt
  • Delete all lines containing “error” in file.txt and print to stdout:

    sed '/error/d' file.txt
    # Sample Input (file.txt):
    # This is a normal line.
    # This is an error line.
    # This is another normal line.
    # Sample Output:
    # This is a normal line.
    # This is another normal line.
  • Insert the line “New line” before the line containing “problem” in file.txt:

    sed '/problem/i New line' file.txt
    # Sample Input (file.txt):
    # This is a line with a problem.
    # Sample Output:
    # New line
    # This is a line with a problem.
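The i command has a counterpart, a, which appends text after the matching line instead of before it. The one-line form below is GNU sed syntax; POSIX sed requires a backslash and newline after a:

```shell
# Append "New line" after the line containing "problem"
printf 'This is a line with a problem.\n' > file.txt
sed '/problem/a New line' file.txt
# Sample Output:
# This is a line with a problem.
# New line
```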
awk: Practical Examples

  • Print the first field of each line in data.txt:

    awk '{print $1}' data.txt
    # Sample Input (data.txt):
    # John Doe 25
    # Jane Smith 30
    # Sample Output:
    # John
    # Jane
  • Print the first and third fields, separated by a comma, of each line in data.txt:

    awk '{print $1 "," $3}' data.txt
    # Sample Input (data.txt):
    # John Doe 25
    # Jane Smith 30
    # Sample Output:
    # John,25
    # Jane,30
  • Print lines where the third field is greater than 25:

    awk '$3 > 25 {print $0}' data.txt
    # Sample Input (data.txt):
    # John Doe 25
    # Jane Smith 30
    # Sample Output:
    # Jane Smith 30
  • Calculate the sum of the third field (assuming it’s a number):

    awk '{sum += $3} END {print "Sum:", sum}' data.txt
    # Sample Input (data.txt):
    # John Doe 25
    # Jane Smith 30
    # Sample Output:
    # Sum: 55
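awk also provides built-in variables that need no setup, such as NR (the current line number) and NF (the number of fields on the line). A quick sketch using the same data.txt:

```shell
printf 'John Doe 25\nJane Smith 30\n' > data.txt
awk '{print NR": "NF" fields"}' data.txt
# Sample Output:
# 1: 3 fields
# 2: 3 fields
```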
Common grep Options

  • -i: Case-insensitive search.
  • -v: Invert match (select non-matching lines).
  • -n: Show line numbers.
  • -c: Count matching lines.
  • -r or -R: Recursive search (through directories). -R follows symbolic links; -r does not (except those named on the command line).
  • -l: List only file names containing matches.
  • -w: Match whole words only.
  • -x: Match whole lines only.
  • -E: Use extended regular expressions (ERE).
  • -P: Use Perl-compatible regular expressions (PCRE).
  • -o: Print only the matching part of the line.
  • -A NUM: Print NUM lines after the matching line.
  • -B NUM: Print NUM lines before the matching line.
  • -C NUM: Print NUM lines around the matching line (context).
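Several of these options can be combined. Here is a quick sketch against a small, made-up log file (sample.log and its contents are hypothetical):

```shell
# Build a tiny sample log to demonstrate against
printf 'INFO: start\nERROR: disk full\nWARN: low memory\nERROR: timeout\n' > sample.log

grep -n "ERROR" sample.log   # prefix each match with its line number: 2:ERROR: disk full ...
grep -c "ERROR" sample.log   # count matching lines: 2
grep -o "ERROR" sample.log   # print only the matched text, one match per line
grep -C 1 "ERROR" sample.log # each match plus one line of context before and after
```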
Common sed Options

  • -i: Edit the file in-place. USE WITH CAUTION! Consider creating a backup first.
  • -n: Suppress default output (useful with p command).
  • -e: Execute multiple sed commands.
  • -f: Read sed commands from a file.
  • -r: Use extended regular expressions (ERE). Modern versions also accept -E, which is the more portable spelling.
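Paired with the p command, -n turns sed into a line selector (nums.txt below is a hypothetical file):

```shell
printf 'one\ntwo\nthree\nfour\n' > nums.txt

# Print only lines 2 through 3; without -n every line would also be echoed
sed -n '2,3p' nums.txt
# Output:
# two
# three
```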
Common awk Options

  • -F: Specify the field separator. Defaults to whitespace.
  • -v: Assign a variable value.
  • -f: Read awk program from a file.
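A quick sketch combining -F and -v (users.txt and the label variable are made up for illustration):

```shell
printf 'alice:x:1000\nbob:x:1001\n' > users.txt

# -F':' splits each line on colons; -v injects a shell-side value into the awk program
awk -F':' -v label="uid" '{print $1, label, $3}' users.txt
# Output:
# alice uid 1000
# bob uid 1001
```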
Advanced Examples

  • Using grep with a regular expression to find IP addresses in a file:

    grep -E "([0-9]{1,3}\.){3}[0-9]{1,3}" logfile.txt
    # Sample Output:
    # Connection from 192.168.1.100
    # Connection from 10.0.0.5
  • Combining grep and wc to count the number of lines containing a specific pattern:

    grep "error" logfile.txt | wc -l
    # Sample Output:
    # 23
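grep's own -c option gives the same count without the extra wc process. Note that both count matching lines, not total occurrences; a line containing "error" twice still counts once:

```shell
printf 'error one\nok\nerror two error\n' > logfile.txt

grep -c "error" logfile.txt          # matching lines: 2
grep -o "error" logfile.txt | wc -l  # every occurrence: 3
```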
  • Using sed to replace multiple patterns in a single command:

    sed -e 's/pattern1/replacement1/g' -e 's/pattern2/replacement2/g' file.txt
  • Using sed to extract a specific part of a line (e.g., a date) using capture groups:

    sed -n 's/.*\(20[0-9][0-9]-[0-1][0-9]-[0-3][0-9]\).*/\1/p' logfile.txt
    # This extracts the date from lines like:
    # 2023-10-27 10:00:00 Message
    # Sample Output:
    # 2023-10-27
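When you only need the matched text itself, grep -o with an extended regex is often simpler than a sed capture group:

```shell
printf '2023-10-27 10:00:00 Message\n' > logfile.txt

# -o prints only the part of each line that matches the pattern
grep -oE '20[0-9]{2}-[0-1][0-9]-[0-3][0-9]' logfile.txt
# Sample Output:
# 2023-10-27
```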
  • Backup before in-place edit:

    sed -i.bak 's/old/new/g' file.txt # Creates file.txt.bak
  • Custom Field Separator:

    awk -F':' '{print $1}' /etc/passwd # Prints usernames from /etc/passwd
  • Using awk to generate a CSV file from a space-separated file:

    awk 'BEGIN {OFS=","} {print $1, $2, $3}' data.txt > output.csv
  • Using awk to format output:

    awk '{printf "%-20s %5d\n", $1, $3}' data.txt
    # This prints the first field left-aligned in a 20-character field,
    # and the third field right-aligned in a 5-character field.
  • Using awk to process log files and generate reports:

    awk '/error/ {count++} END {print "Total errors:", count}' logfile.txt
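awk's associative arrays extend this pattern to per-category counts. This sketch assumes the log level is the first field (app.log and its contents are made up):

```shell
printf 'ERROR disk full\nWARN low memory\nERROR timeout\n' > app.log

# Tally each level as lines stream by, then report in END
# (for-in iteration order over the array is unspecified)
awk '{count[$1]++} END {for (level in count) print level, count[level]}' app.log
# Typical output:
# WARN 1
# ERROR 2
```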
Tips and Best Practices

  • Piping: Combine commands for powerful text processing: cat file.txt | grep "pattern" | sed 's/old/new/g'
  • Regular Expressions: Master regular expressions for precise pattern matching. Use online regex testers to experiment.
  • Shell Variables: Use shell variables to store patterns or replacements: pattern="error"; grep "$pattern" logfile.txt
  • Testing with sed and awk: Always test your sed and awk commands without the -i option first to preview the changes.
  • Readability: For complex awk scripts, consider putting the script in a separate file and using the -f option.
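For example, the sum script from earlier could live in its own file (sum.awk is a hypothetical name):

```shell
# Write the awk program to a file, with room for comments
cat > sum.awk <<'EOF'
# Sum the third field of every line, then print the total
{ sum += $3 }
END { print "Sum:", sum }
EOF

printf 'John Doe 25\nJane Smith 30\n' > data.txt
awk -f sum.awk data.txt
# Sample Output:
# Sum: 55
```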
Common Errors and Troubleshooting

  • grep: “Binary file (standard input) matches”: This means grep found a match in a binary file. Use -a to treat all files as text.
  • sed: “unterminated `s' command”: This usually means you forgot the closing / in your s/old/new/ command.
  • awk: Incorrect field separation: Double-check your -F option or the default whitespace separation. Consider using FS variable in BEGIN block for more complex separators.
  • sed -i overwrites files unexpectedly: Double-check your command thoroughly before using -i. Always back up important files.
  • Performance: For large files, awk is often faster than sed for complex operations. grep is generally the fastest for simple pattern matching.
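Setting FS in a BEGIN block, as suggested above, also handles multi-character separators (sep.txt is made up for illustration):

```shell
printf 'a::b::c\n' > sep.txt

# FS assigned in BEGIN takes effect before the first line is split;
# a multi-character FS is treated as a regular expression
awk 'BEGIN {FS="::"} {print $2}' sep.txt
# Output:
# b
```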
Related Commands

  • cut: Extract specific columns (fields) from a file.
  • tr: Translate or delete characters.
  • sort: Sort lines of text files.
  • uniq: Remove duplicate lines.
  • wc: Word, line, and character count.
  • head: Display the first few lines of a file.
  • tail: Display the last few lines of a file. Useful with -f for monitoring log files.
  • find: Find files based on various criteria, often used in conjunction with grep.
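These commands combine naturally in pipelines. A classic example, ranking the most frequent values in a column (access.log and its contents are hypothetical):

```shell
printf '10.0.0.1 GET /\n10.0.0.2 GET /a\n10.0.0.1 POST /b\n' > access.log

# Extract field 1, group duplicates together, count them, rank by count
cut -d' ' -f1 access.log | sort | uniq -c | sort -rn
# Typical output (uniq -c pads the counts with spaces):
#   2 10.0.0.1
#   1 10.0.0.2
```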

This cheatsheet provides a strong foundation for using grep, sed, and awk. Experiment with the examples and explore the advanced features to become proficient in text processing in Linux. Remember to always test your commands before applying them to critical data.