Archive and Compression (tar, gzip, zip)
Category: Linux Command Basics
Type: Linux Commands
Generated on: 2025-07-10 03:06:32
For: System Administration, Development & Technical Interviews
Archive and Compression Cheatsheet (tar, gzip, zip)
This cheatsheet is a guide to using the tar, gzip, and zip commands in Linux for archiving and compressing files. It’s designed for both beginners and experienced users, covering basic syntax, practical examples, advanced usage, and troubleshooting tips.
1. Command Overview
- tar (Tape Archive): Creates archives (collections of files and directories) without compression. Primarily used for packaging and backing up files.
- gzip (GNU zip): Compresses single files, typically using the .gz extension. Good for individual file compression.
- zip: Creates compressed archives (collections of files) in the .zip format. Widely compatible with other operating systems (Windows, macOS).
- unzip: Extracts files from .zip archives.
When to Use:
- tar: Archiving multiple files/directories for backup, distribution, or organization.
- tar.gz (tar + gzip): Archiving and compressing files together for efficient storage and transfer (common on Linux).
- gzip: Compressing single large files to save space.
- zip: Creating archives compatible with Windows and macOS, or when you need specific features like password protection.
2. Basic Syntax
    tar [options] [archive-file] [files/directories]
    gzip [options] [file]
    zip [options] [archive-file] [files/directories]
    unzip [options] [archive-file]
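The tar form above can be tried end to end in a scratch directory. The following is a minimal sketch, assuming a POSIX shell with tar and diff available; the directory layout and the names demo.tar, src and restore are made up for illustration.

```shell
# Scratch workspace; every path below is hypothetical.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/src/directory1" "$work/restore"
printf 'one\n' > "$work/src/file1.txt"
printf 'three\n' > "$work/src/directory1/file3.txt"

# tar [options] [archive-file] [files/directories]
# -C switches into src first, so member names are stored as relative paths.
tar -cf "$work/demo.tar" -C "$work/src" file1.txt directory1

# List the members without extracting anything.
listing=$(tar -tf "$work/demo.tar")
printf '%s\n' "$listing"

# Extract into a separate directory and compare it with the source tree.
tar -xf "$work/demo.tar" -C "$work/restore"
if diff -r "$work/src" "$work/restore" >/dev/null; then
    roundtrip=ok
else
    roundtrip=mismatch
fi
echo "round trip: $roundtrip"
```

Working in a temporary directory keeps the experiment away from real data, and comparing the restored tree with diff -r is a quick way to confirm that an archive really contains what you expect.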
3. Practical Examples
tar Examples:
- Create an archive:
    tar -cvf myarchive.tar file1.txt file2.txt directory1/
Output: (verbose listing of files added)
    file1.txt
    file2.txt
    directory1/
    directory1/file3.txt
- Extract an archive:
    tar -xvf myarchive.tar
Output: (verbose listing of files extracted)
    file1.txt
    file2.txt
    directory1/
    directory1/file3.txt
- List the contents of an archive:
    tar -tvf myarchive.tar
Output: (detailed listing of files within the archive)
    -rw-r--r-- user group 1024 2023-10-27 10:00 file1.txt
    -rw-r--r-- user group 2048 2023-10-27 10:00 file2.txt
    drwxr-xr-x user group 0 2023-10-27 10:00 directory1/
    -rw-r--r-- user group 512 2023-10-27 10:00 directory1/file3.txt
gzip Examples:
- Compress a file:
    gzip myfile.txt
Output: (none, replaces myfile.txt with myfile.txt.gz)
- Decompress a file:
    gzip -d myfile.txt.gz
Output: (none, replaces myfile.txt.gz with myfile.txt)
- Compress a file and keep the original:
    gzip -c myfile.txt > myfile.txt.gz
Output: (creates myfile.txt.gz while keeping myfile.txt)
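To see that compressing and decompressing returns the same bytes, the commands above can be combined into a small round trip. This is a sketch under the assumption that gzip, seq and cmp are installed; the sample file is generated on the spot and is not part of the original examples.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
seq 1 1000 > "$work/myfile.txt"          # generated sample data

# Compress to stdout so the original stays in place.
gzip -c "$work/myfile.txt" > "$work/myfile.txt.gz"

# Test the compressed stream without writing any output.
gzip -t "$work/myfile.txt.gz" && gz_status=intact

# Decompress to stdout and compare byte for byte with the original.
if gzip -dc "$work/myfile.txt.gz" | cmp -s - "$work/myfile.txt"; then
    gz_roundtrip=identical
else
    gz_roundtrip=different
fi
echo "stream: ${gz_status:-damaged}, contents: $gz_roundtrip"
```

Because -c and -dc only write to standard output, neither step deletes a file, which makes this pattern safer to experiment with than plain gzip and gzip -d.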
zip Examples:
- Create a zip archive:
    zip -r myarchive.zip file1.txt file2.txt directory1/
Output: (listing of files added; -r is needed so that the contents of directory1/ are included)
      adding: file1.txt (deflated 50%)
      adding: file2.txt (deflated 60%)
      adding: directory1/ (stored 0%)
      adding: directory1/file3.txt (deflated 70%)
- Extract a zip archive:
    unzip myarchive.zip
Output: (listing of files extracted)
    Archive:  myarchive.zip
      inflating: file1.txt
      inflating: file2.txt
       creating: directory1/
      inflating: directory1/file3.txt
- List the contents of a zip archive:
    unzip -l myarchive.zip
Output: (detailed listing of files within the archive)
    Archive:  myarchive.zip
      Length      Date    Time    Name
    ---------  ---------- -----   ----
         1024  2023-10-27 10:00   file1.txt
         2048  2023-10-27 10:00   file2.txt
            0  2023-10-27 10:00   directory1/
          512  2023-10-27 10:00   directory1/file3.txt
    ---------                     -------
         3584                     4 files
4. Common Options
tar Options:
- -c (create): Create an archive.
- -x (extract): Extract an archive.
- -v (verbose): List files processed verbosely.
- -f (file): Specify the archive file name. Needed in practice: without it, GNU tar uses the TAPE environment variable or a built-in default instead of a file you named.
- -z (gzip): Compress/decompress using gzip. Conventionally paired with the .tar.gz extension (tar does not add the extension for you).
- -j (bzip2): Compress/decompress using bzip2. Conventionally paired with the .tar.bz2 extension.
- -J (xz): Compress/decompress using xz. Conventionally paired with the .tar.xz extension.
- -t (list): List the contents of an archive.
- -C (directory): Change to the specified directory before extracting/creating.
- --exclude=pattern: Exclude files matching the pattern.
- --delete: Delete from the archive (use with caution!). Not always supported, and it does not work on compressed archives.
- -p: Preserve permissions when extracting.
- --numeric-owner: Use numeric user and group IDs instead of names. Useful for transferring archives between systems with different user/group mappings.
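Several of these options are usually combined in one command. The sketch below, which assumes GNU tar and uses a made-up project/ directory with a cache/ subdirectory, shows -c, -z, -f, -C, --exclude, -t, -x and -p working together.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/project/cache" "$work/out"
echo 'keep me' > "$work/project/notes.txt"
echo 'skip me' > "$work/project/cache/tmp.bin"

# -c create, -z gzip, -f archive name; -C enters project/ before adding ".".
# --exclude drops the cache directory (the pattern is matched against member names).
tar -czf "$work/project.tar.gz" --exclude='./cache' -C "$work/project" .

# -t lists the members of the compressed archive.
members=$(tar -tzf "$work/project.tar.gz")
printf '%s\n' "$members"

# -x extract, -p keep the stored permission bits, -C choose the target directory.
tar -xzpf "$work/project.tar.gz" -C "$work/out"
```

Placing --exclude before the paths to archive is the safer habit, since recent GNU tar versions treat some options as position-sensitive.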
gzip Options:
- -d (decompress): Decompress.
- -c (stdout): Write output to standard output; keep original file.
- -r (recursive): Recursively compress files in directories.
- -v (verbose): Verbose output.
- -l (list): List the compressed size, uncompressed size, ratio, and name for each compressed file.
- -k (keep): Keep (don’t delete) input files during compression or decompression.
- -N (name): Save original file name and timestamp.
- -1 to -9: Compression level (1 = fastest, least compression; 9 = slowest, best compression). Default is 6.
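The effect of the compression level is easiest to judge by measuring it. Here is a small sketch, assuming gzip, seq and wc are available; the sample data is generated and highly repetitive, so real files will show different ratios.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
seq 1 20000 > "$work/sample.txt"         # generated sample data

# -c writes to stdout, so sample.txt is left untouched for each run.
gzip -1 -c "$work/sample.txt" > "$work/fast.gz"
gzip -9 -c "$work/sample.txt" > "$work/best.gz"

size_fast=$(wc -c < "$work/fast.gz")
size_best=$(wc -c < "$work/best.gz")
echo "level 1: $size_fast bytes, level 9: $size_best bytes"

# -l reports compressed size, uncompressed size, ratio and stored name.
gzip -l "$work/fast.gz" "$work/best.gz"
```

On large inputs the time difference between -1 and -9 is often more noticeable than the size difference, so it can be worth timing both on representative data before choosing a level.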
zip Options:
- -r (recursive): Recursively include directories.
- -e (encrypt): Encrypt the archive with a password.
- -l: Convert LF line endings to CR LF when adding text files. It does not list contents; use unzip -l for that.
- -d (delete): Delete entries from a zip archive.
- -u (update): Update changed entries in a zip archive and add new files.
- -j (junk-paths): Store just the name of the files; don’t record directory names.
- -9: Best compression (slowest). -0 is no compression (store only).
unzip Options:
- -l (list): List archive contents.
- -d (directory): Extract files into specified directory.
- -o (overwrite): Overwrite existing files without prompting.
- -n (never): Never overwrite existing files.
- -j (junk-paths): Do not recreate directory structure.
- -P password: Use password to decrypt the archive. The password is visible in the process list and shell history, so prefer the interactive prompt where possible.
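The zip and unzip options above can be exercised with a short round trip. This sketch assumes the Info-ZIP zip and unzip tools; since they are not installed everywhere, it checks for them first and skips the demo otherwise. The project/ tree and demo.zip are hypothetical names.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/project/docs" "$work/out"
echo 'alpha' > "$work/project/a.txt"
echo 'beta' > "$work/project/docs/b.txt"

zip_demo=skipped
if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
    # -r descends into directories, -q keeps zip quiet; cd keeps paths relative.
    (cd "$work" && zip -rq demo.zip project)

    # unzip -l lists the entries without extracting them.
    unzip -l "$work/demo.zip"

    # -d chooses the extraction directory, -o overwrites without prompting.
    unzip -oq "$work/demo.zip" -d "$work/out"
    [ -f "$work/out/project/docs/b.txt" ] && zip_demo=ok
fi
echo "zip demo: $zip_demo"
```

Running zip from inside the parent directory keeps absolute paths out of the archive, which makes the result easier to extract on another machine.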
5. Advanced Usage
tar Examples:
- Create a compressed archive (.tar.gz) excluding a directory:
    tar -czvf myarchive.tar.gz --exclude=./excluded_directory ./
- Extract a .tar.gz archive to a specific directory:
    tar -xzvf myarchive.tar.gz -C /path/to/extract/to
- Back up a directory using incremental backups (requires GNU tar):
  - First, create a full (level 0) backup. Passing -g on this first run creates snapshot.snar:
        tar -g snapshot.snar -czvf full_backup.tar.gz /path/to/backup
  - Then, create an incremental backup with the same snapshot file (assuming you have GNU tar version > 1.15):
        tar -g snapshot.snar -czvf incremental_backup.tar.gz /path/to/backup
  - The snapshot.snar file keeps track of which files have been backed up. Subsequent incremental backups will only contain files that have changed since the last backup.
- Using --transform for renaming files during extraction:
    tar -xvf archive.tar --transform='s/old_prefix/new_prefix/'
  This replaces old_prefix with new_prefix in the extracted file paths. Useful for renaming files while extracting.
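The incremental workflow is easier to trust after watching it on throwaway data. The following is a minimal sketch, assuming GNU tar and assuming the -g snapshot file is passed on every run, including the first one; the data/ directory and its files are invented for the demonstration.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/data"
echo 'first' > "$work/data/a.txt"

# Level 0: the snapshot file does not exist yet, so everything is archived.
tar -g "$work/snapshot.snar" -czf "$work/full_backup.tar.gz" -C "$work" data

sleep 1   # keep the new file's timestamp clearly after the first run
echo 'second' > "$work/data/b.txt"

# Level 1: the same snapshot file tells tar what has already been saved.
tar -g "$work/snapshot.snar" -czf "$work/incremental_backup.tar.gz" -C "$work" data

# The incremental archive should list data/b.txt but not data/a.txt.
incr_members=$(tar -tzf "$work/incremental_backup.tar.gz")
printf '%s\n' "$incr_members"
```

Restoring means extracting the full backup first and then each incremental archive in order. Keeping a copy of the snapshot file from the level 0 run lets you create further level 1 backups against the same baseline.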
gzip Examples:
- Recursively compress all files in a directory (use with caution: each file is compressed individually and replaced by its own .gz file):
    find . -type f -print0 | xargs -0 gzip
- Compress to a specific compression level:
    gzip -9 myfile.txt # Maximum compression, slowest
    gzip -1 myfile.txt # Fastest compression, least compression
zip Examples:
- Create a password-protected zip archive:
    zip -e myarchive.zip file1.txt file2.txt
The command will prompt you to enter and verify the password.
- Add files to an existing zip archive:
    zip myarchive.zip newfile.txt
- Create a zip archive without directory structure (just the files):
    zip -j myarchive.zip /path/to/files/*
6. Tips & Tricks
- Combine tar and gzip in a single command: Use the -z option with tar (e.g., tar -czvf myarchive.tar.gz ...).
- Use tab completion: Type the beginning of a filename or directory and press Tab to auto-complete.
- Check available disk space: Before creating large archives, use df -h to ensure you have enough space.
- Progress bar: Use pv (Pipe Viewer) to show a progress bar for tar operations (install pv if needed: sudo apt-get install pv or sudo yum install pv):
    tar -czf - /path/to/backup | pv > backup.tar.gz
    pv backup.tar.gz | tar -xzf - -C /path/to/restore
- Verify archive integrity: tar does not store a checksum of the whole archive, so consider using checksums (like md5sum or sha256sum) on the archive file after creation and before extraction. For gzip-compressed archives, gzip -t also tests the file without extracting it.
- Standard input/output redirection: Use - as the filename for tar to read from standard input or write to standard output. Useful for piping data.
- Using find with tar:
    find /path/to/search -name "*.log" -print0 | tar -czvf logs.tar.gz --null -T -
  This finds all .log files under /path/to/search and adds them to logs.tar.gz. The --null and -T - options handle filenames with spaces correctly.
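Two of these tips, feeding tar from find and checking an archive with a checksum, fit naturally into one script. This is a sketch that assumes GNU tar, find and sha256sum; the search/ tree and the file names, including one with a space, are made up.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/search/sub"
echo 'x' > "$work/search/app log.log"    # name with a space
echo 'y' > "$work/search/sub/db.log"
echo 'z' > "$work/search/readme.txt"

# -print0 and --null pass NUL-separated names; -T - reads the list from stdin.
(cd "$work/search" &&
    find . -name '*.log' -print0 | tar -czf "$work/logs.tar.gz" --null -T -)

# Record a checksum next to the archive, then verify it before extraction.
(cd "$work" && sha256sum logs.tar.gz > logs.tar.gz.sha256)
(cd "$work" && sha256sum -c logs.tar.gz.sha256) && checksum=ok

log_members=$(tar -tzf "$work/logs.tar.gz")
printf '%s\n' "$log_members"
```

Running find from inside the search directory stores relative member names, which avoids the leading-slash warning that absolute paths would trigger.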
7. Troubleshooting
- “gzip: stdin: unexpected end of file”: This often means the compressed file is corrupted or incomplete. Try downloading it again or checking the source.
- “tar: Skipping to next header”: Indicates a corrupted archive. Try recreating the archive. Check the media for errors if the archive is on a physical drive.
- “Cannot open: No such file or directory”: Double-check the file paths you are using. Ensure the files or directories exist.
- Permission denied: You might not have the necessary permissions to read or write to the files or directories involved. Use sudo if necessary (but be cautious!).
- “tar: Removing leading `/' from member names”: This is a warning, not an error. It means that tar is removing the leading / from the file paths in the archive. It's generally safe to ignore, but if you need to preserve the absolute paths, use the --absolute-names option (but be very careful, as this can overwrite files in unexpected locations during extraction). Consider using relative paths instead.
- “unzip: cannot find or open archive.zip, archive.zip.zip or archive.zip.ZIP.”: This means unzip could not find the specified archive file. Double-check the filename and path.
- “unzip: bad zipfile offset (local header sig): 0”: This usually indicates a corrupted zip file.
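A corrupted or incomplete gzip file can be detected before you try to extract it. The sketch below simulates an interrupted download by truncating a good file, then uses gzip -t to tell the two apart; it assumes gzip, seq and head are available, and the file names are hypothetical.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
seq 1 5000 > "$work/data.txt"            # generated sample data
gzip -c "$work/data.txt" > "$work/good.gz"

# Simulate an interrupted download by keeping only the first 200 bytes.
head -c 200 "$work/good.gz" > "$work/truncated.gz"

# gzip -t exits non-zero when the stream is damaged or incomplete.
if gzip -t "$work/good.gz" 2>/dev/null; then good=pass; else good=fail; fi
if gzip -t "$work/truncated.gz" 2>/dev/null; then bad=pass; else bad=fail; fi
echo "good.gz: $good, truncated.gz: $bad"
```

The same check works on a .tar.gz file, since it is an ordinary gzip stream wrapped around a tar archive.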
8. Related Commands
- bzip2: Another compression utility, often provides better compression than gzip but is slower. Uses .bz2 extension.
- xz: A more modern compression utility that usually compresses better than bzip2, at the cost of slower compression. Uses .xz extension.
- zstd: Fast real-time compression algorithm, offering a good balance between compression ratio and speed.
- cpio: Another archiving utility, similar to tar.
- dd: For creating disk images.
- file: Determines file type.
- md5sum, sha256sum: Calculate checksums for file integrity verification.
- rsync: For synchronizing files and directories between locations.
- lzip: Another lossless data compressor with a user interface similar to gzip or bzip2.
This cheatsheet provides a solid foundation for working with archiving and compression tools in Linux. Remember to always practice on test data before applying these commands to critical systems. Always read the man pages (man tar, man gzip, man zip, man unzip) for the most up-to-date information and options.