
Archive and Compression (tar, gzip, zip)

Category: Linux Command Basics
Type: Linux Commands
Generated on: 2025-07-10 03:06:32
For: System Administration, Development & Technical Interviews


Archive and Compression Cheatsheet (tar, gzip, zip)

This cheatsheet provides a comprehensive guide to using tar, gzip, and zip commands in Linux for archiving and compressing files. It’s designed for both beginners and experienced users, covering basic syntax, practical examples, advanced usage, and troubleshooting tips.

1. Command Overview

  • tar (Tape Archive): Creates archives (collections of files and directories) without compression. Primarily used for packaging and backing up files.
  • gzip (GNU zip): Compresses single files, typically using the .gz extension. Good for individual file compression.
  • zip: Creates compressed archives (collections of files) in the .zip format. Widely compatible with other operating systems (Windows, macOS).
  • unzip: Extracts files from .zip archives.

When to Use:

  • tar: Archiving multiple files/directories for backup, distribution, or organization.
  • tar.gz (tar + gzip): Archiving and compressing files together for efficient storage and transfer (common on Linux).
  • gzip: Compressing single large files to save space.
  • zip: Creating archives compatible with Windows and macOS, or when you need specific features like password protection.

2. Basic Syntax

  • tar [options] [archive-file] [files/directories]
  • gzip [options] [file]
  • zip [options] [archive-file] [files/directories]
  • unzip [options] [archive-file]

3. Practical Examples

tar Examples:

  • Create an archive:
tar -cvf myarchive.tar file1.txt file2.txt directory1/

Output: (verbose listing of files added)

file1.txt
file2.txt
directory1/
directory1/file3.txt
  • Extract an archive:
tar -xvf myarchive.tar

Output: (verbose listing of files extracted)

file1.txt
file2.txt
directory1/
directory1/file3.txt
  • List the contents of an archive:
tar -tvf myarchive.tar

Output: (detailed listing of files within the archive)

-rw-r--r-- user/group 1024 2023-10-27 10:00 file1.txt
-rw-r--r-- user/group 2048 2023-10-27 10:00 file2.txt
drwxr-xr-x user/group 0 2023-10-27 10:00 directory1/
-rw-r--r-- user/group 512 2023-10-27 10:00 directory1/file3.txt
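
The three tar operations above can be tried end to end in a scratch directory. The following is a minimal sketch, assuming a POSIX shell; the restore directory and the file contents are made up for illustration, and -v is left out to keep the output short.

```shell
set -eu

workdir=$(mktemp -d)             # scratch directory (illustrative)
trap 'rm -rf "$workdir"' EXIT    # remove it when the script exits
cd "$workdir"

# Sample files matching the names used in the examples above
mkdir directory1 restore
echo "one"   > file1.txt
echo "two"   > file2.txt
echo "three" > directory1/file3.txt

# -c creates the archive, -f names it
tar -cf myarchive.tar file1.txt file2.txt directory1/

# -t lists member names without extracting anything
listing=$(tar -tf myarchive.tar)
echo "$listing"

# -x extracts; -C sends the files to restore/ instead of the current directory
tar -xf myarchive.tar -C restore

# cmp exits non-zero if the restored copy differs from the original
cmp file1.txt restore/file1.txt
cmp directory1/file3.txt restore/directory1/file3.txt
```

Extracting into a separate directory with -C keeps the test from overwriting the files that were just archived.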

gzip Examples:

  • Compress a file:
gzip myfile.txt

Output: (none, replaces myfile.txt with myfile.txt.gz)

  • Decompress a file:
gzip -d myfile.txt.gz

Output: (none, replaces myfile.txt.gz with myfile.txt)

  • Compress a file and keep the original:
gzip -c myfile.txt > myfile.txt.gz

Output: (creates myfile.txt.gz while keeping myfile.txt)
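
To see that compression and decompression are lossless, the gzip commands above can be chained into a round trip. This is a sketch under assumptions: the sample text is invented, and cksum is used only as a convenient way to compare the file before and after.

```shell
set -eu

workdir=$(mktemp -d)             # scratch directory (illustrative)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Invented sample data: 1000 identical lines
awk 'BEGIN { for (i = 0; i < 1000; i++) print "archive and compression" }' > myfile.txt
original_sum=$(cksum < myfile.txt)

# -c writes to standard output, so the original file stays in place
gzip -c myfile.txt > myfile.txt.gz
ls -l myfile.txt myfile.txt.gz

# Remove the original, then restore it from the compressed copy
rm myfile.txt
gzip -d myfile.txt.gz            # replaces myfile.txt.gz with myfile.txt
restored_sum=$(cksum < myfile.txt)

echo "before: $original_sum"
echo "after:  $restored_sum"
```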

zip Examples:

  • Create a zip archive:
zip -r myarchive.zip file1.txt file2.txt directory1/

Output: (listing of files added)

adding: file1.txt (deflated 50%)
adding: file2.txt (deflated 60%)
adding: directory1/ (stored 0%)
adding: directory1/file3.txt (deflated 70%)
  • Extract a zip archive:
unzip myarchive.zip

Output: (listing of files extracted)

Archive: myarchive.zip
inflating: file1.txt
inflating: file2.txt
creating: directory1/
inflating: directory1/file3.txt
  • List the contents of a zip archive:
unzip -l myarchive.zip

Output: (detailed listing of files within the archive)

Archive: myarchive.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
     1024  2023-10-27 10:00   file1.txt
     2048  2023-10-27 10:00   file2.txt
        0  2023-10-27 10:00   directory1/
      512  2023-10-27 10:00   directory1/file3.txt
---------                     -------
     3584                     4 files
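
The zip and unzip examples can be combined into a similar round trip. The sketch below assumes the Info-ZIP zip and unzip tools and skips itself when they are not installed; the have_zip variable, the restore directory, and the sample files are illustrative names only.

```shell
set -eu

workdir=$(mktemp -d)             # scratch directory (illustrative)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# zip and unzip are separate packages on many systems
have_zip=no
if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
    have_zip=yes
fi

if [ "$have_zip" = yes ]; then
    mkdir directory1
    echo "one"   > file1.txt
    echo "three" > directory1/file3.txt

    # -r descends into directory1/; -q suppresses the "adding:" lines
    zip -q -r myarchive.zip file1.txt directory1/

    # List the archive without extracting it
    unzip -l myarchive.zip

    # -d chooses the target directory for extraction
    unzip -q myarchive.zip -d restore
    cmp directory1/file3.txt restore/directory1/file3.txt
else
    echo "zip/unzip not installed; skipping"
fi
```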

4. Common Options

tar Options:

  • -c (create): Create an archive.
  • -x (extract): Extract an archive.
  • -v (verbose): List files processed verbosely.
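  • -f (file): Specify the archive file name. The name must follow -f directly (so -cvf archive.tar, not -cfv archive.tar). Needed in practice, since without it tar falls back to a built-in default such as a tape device or standard input/output.
  • -z (gzip): Compress/decompress using gzip. By convention the archive is named .tar.gz or .tgz; tar does not add the extension itself.
  • -j (bzip2): Compress/decompress using bzip2. By convention the archive is named .tar.bz2.
  • -J (xz): Compress/decompress using xz. By convention the archive is named .tar.xz.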
  • -t (list): List the contents of an archive.
  • -C (directory): Change to the specified directory before extracting/creating.
  • --exclude=pattern: Exclude files matching the pattern.
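  • --delete: Delete members from the archive (use with caution!). GNU tar only, and it does not work on compressed archives.
  • -p: Preserve file permissions on extraction (the default when extracting as root).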
  • --numeric-owner: Use numeric user and group IDs instead of names. Useful for transferring archives between systems with different user/group mappings.
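
As an illustration of how -C and --exclude interact when creating an archive, here is a small sketch. The project layout (src and cache directories) is hypothetical; the point is that -C makes tar store relative member names, and that --exclude is placed before the path it applies to.

```shell
set -eu

workdir=$(mktemp -d)             # scratch directory (illustrative)
trap 'rm -rf "$workdir"' EXIT

# Hypothetical project tree with a cache directory we do not want archived
mkdir -p "$workdir/project/src" "$workdir/project/cache"
echo "code" > "$workdir/project/src/main.c"
echo "junk" > "$workdir/project/cache/tmp.bin"

# -C switches to $workdir first, so members are stored as project/...
# rather than under the absolute scratch path; --exclude skips the cache.
tar -czf "$workdir/project.tar.gz" -C "$workdir" --exclude='project/cache' project

# List what actually went into the archive
members=$(tar -tzf "$workdir/project.tar.gz")
echo "$members"
```

Because the member names are relative, extracting this archive elsewhere recreates project/ under the current directory (or under the directory given with -C).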

gzip Options:

  • -d (decompress): Decompress.
  • -c (stdout): Write output to standard output; keep original file.
  • -r (recursive): Recursively compress files in directories.
  • -v (verbose): Verbose output.
  • -l (list): For each compressed file, show the compressed size, uncompressed size, compression ratio, and uncompressed name.
  • -k (keep): Keep (don’t delete) input files during compression or decompression.
  • -N (name): Save original file name and timestamp.
  • -1 to -9: Compression level (1 = fastest, least compression; 9 = slowest, best compression). Default is 6.

zip Options:

  • -r (recursive): Recursively include directories.
  • -e (encrypt): Encrypt the archive with a password.
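  • -l (LF to CR LF): Convert Unix line endings to CR LF in text files as they are added. This option does not list anything; to list an archive, use unzip -l.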
  • -d (delete): Delete entries from a zip archive.
  • -u (update): Replace entries whose source files are newer than the archived copy, and add files that are not yet in the archive.
  • -j (junk-paths): Store just the name of the files; don’t record directory names.
  • -9: Best compression (slowest). -0 is no compression (store only).

unzip Options:

  • -l (list): List archive contents.
  • -d (directory): Extract files into specified directory.
  • -o (overwrite): Overwrite existing files without prompting.
  • -n (never): Never overwrite existing files.
  • -j (junk-paths): Do not recreate directory structure.
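  • -P password: Use password to decrypt the archive. The password is visible in the process list and shell history, so prefer the interactive prompt where possible.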

5. Advanced Usage

tar Examples:

  • Create a compressed archive (.tar.gz) excluding a directory:
tar -czvf myarchive.tar.gz --exclude=./excluded_directory ./
  • Extract a .tar.gz archive to a specific directory:
tar -xzvf myarchive.tar.gz -C /path/to/extract/to
  • Back up a directory using incremental backups (requires GNU tar):

    • First, create a full (level 0) backup. The -g option creates the snapshot file if it does not exist yet:
    tar -g snapshot.snar -czvf full_backup.tar.gz /path/to/backup
    • Then, create an incremental backup with the same snapshot file (assuming you have GNU tar version > 1.15):
    tar -g snapshot.snar -czvf incremental_backup.tar.gz /path/to/backup
    • The snapshot.snar file keeps track of which files have been backed up. Each incremental run stores only files that are new or changed since the previous run, then updates the snapshot file. To restore, extract the full backup first and then each incremental archive in order.
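
The effect of the snapshot file can be observed on a tiny directory. The sketch below assumes GNU tar (the -g option is not available in other tar implementations); the data directory and file contents are invented, and the sleep calls are only there so that file timestamps clearly fall before or after the first run.

```shell
set -eu

workdir=$(mktemp -d)             # scratch directory (illustrative)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

mkdir data
echo "v1" > data/a.txt
echo "v1" > data/b.txt
sleep 1    # keep file timestamps clearly older than the first backup

# Level 0: snapshot.snar does not exist yet, so everything is archived
tar -g snapshot.snar -czf full_backup.tar.gz data

# Change one file, then run again with the same snapshot file
sleep 1
echo "v2" > data/b.txt
tar -g snapshot.snar -czf incremental_backup.tar.gz data

# The incremental archive should list the directory and the changed file only
incremental_members=$(tar -tzf incremental_backup.tar.gz)
echo "$incremental_members"
```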
  • Using --transform for renaming files during extraction:

    tar -xvf archive.tar --transform='s/old_prefix/new_prefix/'

    This replaces old_prefix with new_prefix in the extracted file paths. Useful for renaming files while extracting.
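
A quick way to check a --transform expression before using it on a real archive is to try it on a throwaway one. This sketch assumes GNU tar; the out directory and notes.txt are illustrative. The pattern is anchored with ^ here so that only the leading path component is rewritten, which is a slight variation on the command above.

```shell
set -eu

workdir=$(mktemp -d)             # scratch directory (illustrative)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Build a small archive whose members start with old_prefix/
mkdir old_prefix out
echo "hello" > old_prefix/notes.txt
tar -cf archive.tar old_prefix

# Rewrite the leading path component while extracting into out/
tar -xf archive.tar -C out --transform='s/^old_prefix/new_prefix/'

# Show the resulting tree
ls -R out
```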

gzip Examples:

  • Compress every file under a directory tree (use with caution: each file is compressed individually and replaced by its .gz version; gzip -r . has the same effect):
find . -type f -print0 | xargs -0 gzip
  • Compress to a specific compression level:
gzip -9 myfile.txt # Maximum compression, slowest
gzip -1 myfile.txt # Fastest compression, least compression
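
The trade-off between levels is easiest to judge on your own data. The following sketch compares -1 and -9 on an invented, repetitive log file (sample.log is a hypothetical name); real files will give different ratios.

```shell
set -eu

workdir=$(mktemp -d)             # scratch directory (illustrative)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Invented sample: 20000 numbered lines, repetitive enough to compress well
awk 'BEGIN { for (i = 0; i < 20000; i++) print "log line", i, "status=ok" }' > sample.log

# -c keeps sample.log in place so both levels compress the same input
gzip -1 -c sample.log > fast.gz
gzip -9 -c sample.log > best.gz

size_orig=$(wc -c < sample.log)
size_fast=$(wc -c < fast.gz)
size_best=$(wc -c < best.gz)
echo "original: $size_orig bytes"
echo "gzip -1:  $size_fast bytes"
echo "gzip -9:  $size_best bytes"

# Both outputs decompress to the same content as the original
gzip -dc fast.gz | cmp - sample.log
gzip -dc best.gz | cmp - sample.log
```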

zip Examples:

  • Create a password-protected zip archive:
zip -e myarchive.zip file1.txt file2.txt

The command will prompt you to enter and verify the password.

  • Add files to an existing zip archive:
zip myarchive.zip newfile.txt
  • Create a zip archive without directory structure (just the files):
zip -j myarchive.zip /path/to/files/*

6. Tips & Tricks

  • Combine tar and gzip in a single command: Use the -z option with tar (e.g., tar -czvf myarchive.tar.gz ...).
  • Use tab completion: Type the beginning of a filename or directory and press Tab to auto-complete.
  • Check available disk space: Before creating large archives, use df -h to ensure you have enough space.
  • Progress bar: Use pv (Pipe Viewer) to show a progress bar for tar operations (install pv if needed: sudo apt-get install pv or sudo yum install pv). Leave out -v so the file listing does not interleave with the progress bar. When extracting, let pv read the archive and pipe it into tar:
tar -czf - /path/to/backup | pv > backup.tar.gz
pv backup.tar.gz | tar -xzf - -C /path/to/restore
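  • Verify archive integrity: tar does not checksum file contents, so record a checksum of the archive file (sha256sum, or md5sum if you only need to catch accidental corruption) after creation and check it before extraction. gzip -t and unzip -t test a compressed file or zip archive without extracting it.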
  • Standard input/output redirection: Use - as the filename for tar to read from standard input or write to standard output. Useful for piping data.
  • Using find with tar:
    find /path/to/search -name "*.log" -print0 | tar -czvf logs.tar.gz --null -T -
    This finds all .log files under /path/to/search and adds them to logs.tar.gz. The --null and -T - options handle filenames with spaces correctly.
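
The integrity tip above can be turned into a short routine. This is a sketch, assuming GNU coreutils (for sha256sum) and gzip; the backup.tar.gz.sha256 file name and the verified variable are conventions chosen here for illustration.

```shell
set -eu

workdir=$(mktemp -d)             # scratch directory (illustrative)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

mkdir data
echo "payload" > data/file.txt
tar -czf backup.tar.gz data

# Record the checksum right after creating the archive
sha256sum backup.tar.gz > backup.tar.gz.sha256

# Later, before extracting: each command exits non-zero on failure
sha256sum -c backup.tar.gz.sha256     # prints "backup.tar.gz: OK" on a match
gzip -t backup.tar.gz                 # tests the gzip stream, silent on success
tar -tzf backup.tar.gz > /dev/null    # walks every member header

verified=yes
echo "archive verified: $verified"
```

Storing the .sha256 file next to the archive only protects against corruption; to detect tampering, the checksum has to be kept somewhere an attacker cannot modify.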

7. Troubleshooting

  • “gzip: stdin: unexpected end of file”: This often means the compressed file is corrupted or incomplete. Try downloading it again or checking the source.
  • “tar: Skipping to next header”: Indicates a corrupted archive. Try recreating the archive. Check the media for errors if the archive is on a physical drive.
  • “Cannot open: No such file or directory”: Double-check the file paths you are using. Ensure the files or directories exist.
  • Permission denied: You might not have the necessary permissions to read or write to the files or directories involved. Use sudo if necessary (but be cautious!).
  • “tar: Removing leading `/' from member names”: This is a warning, not an error. It means that tar is removing the leading / from the file paths in the archive. It’s generally safe to ignore, but if you need to preserve the absolute paths, use the --absolute-names option (but be very careful, as this can overwrite files in unexpected locations during extraction). Consider using relative paths instead.
  • unzip: cannot find or open archive.zip, archive.zip.zip or archive.zip.ZIP.: This means unzip could not find the specified archive file. Double-check the filename and path.
  • unzip: bad zipfile offset (local header sig): 0: This usually indicates a corrupted zip file.
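
Several of the errors above come down to a damaged or incomplete file, which can be detected before extraction. The sketch below simulates an interrupted download by truncating a valid .gz file; the check function and the file names are hypothetical helpers, not part of gzip.

```shell
set -eu

workdir=$(mktemp -d)             # scratch directory (illustrative)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Invented sample data, large enough that the .gz exceeds 100 bytes
awk 'BEGIN { for (i = 0; i < 5000; i++) print "record", i }' > data.txt
gzip -c data.txt > good.gz

# Simulate an interrupted download by keeping only the first 100 bytes
head -c 100 good.gz > truncated.gz

# gzip -t exits non-zero when the stream is damaged or incomplete
check() {
    if gzip -t "$1" 2>/dev/null; then
        echo ok
    else
        echo corrupt
    fi
}

good_status=$(check good.gz)
bad_status=$(check truncated.gz)
echo "good.gz:      $good_status"
echo "truncated.gz: $bad_status"
```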

8. Related Commands

  • bzip2: Another compression utility; often compresses better than gzip but is slower. Uses .bz2 extension.
  • xz: A more modern compression utility; usually compresses better than bzip2 but is slower to compress. Uses .xz extension.
  • zstd: Fast real-time compression algorithm, offering a good balance between compression ratio and speed.
  • cpio: Another archiving utility, similar to tar.
  • dd: For creating disk images.
  • file: Determines file type.
  • md5sum, sha256sum: Calculate checksums for file integrity verification.
  • rsync: For synchronizing files and directories between locations.
  • lzip: Another lossless data compressor with a user interface similar to gzip or bzip2.

This cheatsheet provides a solid foundation for working with archiving and compression tools in Linux. Remember to always practice on test data before applying these commands to critical systems. Always read the man pages (man tar, man gzip, man zip, man unzip) for the most up-to-date information and options.