Linux For Embedded Systems
For Arabs
Ahmed ElArabawy
Course 102:
Understanding Linux
Lecture 24:
Archiving and Compression of Files
Archiving Files
• Archiving combines a group of files organized in a tree
structure into one file
• This enables easier handling, such as for backup or transfer
purposes
• This is a separate procedure from compression
• The tool used for archiving files is “tar”
Archiving Files
(tar Command)
$ tar <options> <destination archive file> <directories/files to archive>
• This command archives a group of files/directories
into a single tar file
• It is also used to un-archive a tar file back into its original directory
tree structure
• To archive (tar) a group of files and directories,
$ tar cvf <archiveFile.tar> <a set of files and directories>
• ‘c’ for create
• ‘v’ for verbose
• ‘f’ for file
• Examples:
$ tar cvf my-docs.tar ~/Documents/pdfs/
$ tar cvf selected-files.tar ~/file-1.txt ~/Documents/file-2.txt ~/*.pdf
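The create step can be tried safely in a scratch directory. The following is a minimal sketch, not part of the original slides; the directory layout and file names (`docs/pdfs/a.pdf`, `docs/notes.txt`) are made up for illustration, and the ‘v’ option is left out only to keep the output short.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory (hypothetical layout, for illustration only).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Build a small directory tree to archive.
mkdir -p docs/pdfs
printf 'first\n'  > docs/pdfs/a.pdf
printf 'second\n' > docs/notes.txt

# c = create, f = write to the named archive file.
tar cf my-docs.tar docs

# The result is one regular file; the original tree is left untouched.
ls -l my-docs.tar
```

Unlike the compression tools covered later, tar does not remove the files it archives.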
Working with a tar file
(tar Command)
• Now we have the tar file, and we can do the following with it,
• To un-archive (un-tar) a file into all of its original components
$ tar xvf <archiveFile.tar>
• ‘x’ for extract
• To extract some of the files/dirs from a tar file
$ tar xvf <archiveFile.tar> <files/dir to untar>
• To show the contents of a tar file
$ tar tvf <archiveFile.tar>
• Examples:
$ tar cvf tar-file.tar *.pdf *.txt
$ tar tvf tar-file.tar
$ tar xvf tar-file.tar --wildcards '*.pdf' (GNU tar; quote the pattern so tar, not the shell, matches it)
$ tar xvf tar-file.tar
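A short round trip makes the list and extract operations concrete. This is a sketch under assumed names (`src/`, `out/`, `report.pdf`, `notes.txt` are hypothetical); it extracts a single member by its exact name, which avoids any dependence on wildcard handling.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

mkdir src out
printf 'pdf data\n'  > src/report.pdf
printf 'text data\n' > src/notes.txt

# Create the archive from inside src/ so member names carry no directory prefix.
(cd src && tar cf ../tar-file.tar report.pdf notes.txt)

# t = list the table of contents.
tar tf tar-file.tar

# x = extract; naming a member extracts only that member.
(cd out && tar xf ../tar-file.tar report.pdf)
ls out
```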
Using tar inside a find Command
• Suppose we need to archive all pdf files under our home
directory.
• We will need the find command to search for those files
• The output of the find command is then passed to the tar
command
• This can be achieved by using a pipe; tar reads the list of file names from standard input when given ‘-T -’,
$ find ~ -name '*.pdf' | tar cvf file.tar -T -
• Another way to achieve the same target is to use the tar
command as the user defined command to be executed
inside the find command
$ find ~ -name '*.pdf' -exec tar cvf file.tar '{}' '+'
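The pipe form can be sketched as follows, assuming GNU tar. The tree under `tree/` is hypothetical. The `-print0` / `--null` pair is an addition not found in the slides: it separates names with NUL bytes so that file names containing spaces or newlines are handled correctly.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Hypothetical tree with two pdf files and one file that must be skipped.
mkdir -p tree/a tree/b
printf 'x\n' > tree/a/one.pdf
printf 'y\n' > tree/b/two.pdf
printf 'z\n' > tree/b/skip.txt

# find emits NUL-separated names; tar reads them from stdin via -T - (GNU tar).
find tree -type f -name '*.pdf' -print0 | tar --null -T - -cf pdfs.tar

tar tf pdfs.tar
```

One caveat with the `-exec ... '+'` form: if the list of matches is too long for one command line, find runs tar more than once, and each run with ‘c’ overwrites the archive written by the previous one. The pipe form does not have this problem.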
Compressing Files
• Compression reduces the size of a file using a
tool that removes redundant data from the file
• There are multiple formats for compressed files,
• ‘.gz’: use the tools gzip and gunzip
$ gzip <file> (file is compressed into file.gz)
$ gunzip <file.gz> (file.gz is decompressed back into file)
• Usually better compression, ‘.bz2’: use the tools bzip2 and bunzip2
$ bzip2 <file>
$ bunzip2 <file.bz2>
• Usually even better compression, ‘.lzma’: use the tools lzma and unlzma
$ lzma <file>
$ unlzma <file.lzma>
• Note that these tools,
• Deal with a single file and not a group of files; to compress a
group of files in a directory tree, you need to archive them first
• Replace the file with its compressed/decompressed version; the original file is
deleted
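The replace-in-place behaviour is easy to observe with gzip. A minimal sketch, using a hypothetical `sample.txt` filled with the numbers 1 to 5000:

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# A text file with plenty of redundancy.
seq 1 5000 > sample.txt
orig_size=$(wc -c < sample.txt)

# gzip replaces sample.txt with sample.txt.gz.
gzip sample.txt
gz_size=$(wc -c < sample.txt.gz)
echo "original: $orig_size bytes, compressed: $gz_size bytes"

# gunzip restores sample.txt and removes sample.txt.gz.
gunzip sample.txt.gz
ls
```

Recent versions of gzip also accept `-k` to keep the original file, and `gzip -c <file> > <file>.gz` writes the compressed data to standard output without touching the input.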
Accessing Compressed Text Files
• If a text file is compressed to the .gz format, we can still access it
without decompressing it first, using a set of tools (Z-tools)
$ gzip my-file.txt
• To view the file contents, we use the zcat command (similar to
the cat command for uncompressed text files)
$ zcat my-file.txt.gz
• If the file is long, and we need to view it page by page, we can
use the commands zmore and zless (similar to the more and less
commands for uncompressed text files)
$ zless my-file.txt.gz
$ zmore my-file.txt.gz
• We can also search inside the compressed text document via
the zgrep and zegrep commands (similar to grep & egrep)
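A small sketch of the Z-tools on a Linux system; the three-line `my-file.txt` is made up for the example. Note that gzip appends `.gz` to the full name, so the compressed file is `my-file.txt.gz`.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf 'alpha\nbeta\ngamma\n' > my-file.txt
gzip my-file.txt                     # produces my-file.txt.gz

# Print the decompressed contents without creating my-file.txt again.
zcat my-file.txt.gz

# Search inside the compressed file; -c counts matching lines, as with grep.
matches=$(zgrep -c 'beta' my-file.txt.gz)
echo "lines matching beta: $matches"
```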
More on Compressing Files
• Note that compression preserves the permissions and
timestamps of the file. After decompressing the compressed
file, we end up with the same permissions and timestamps as
the original file
• It is not useful to compress an already compressed file;
it will probably increase its size because
• There is little or no further reduction in size, since the redundancy was
already removed the first time
• More overhead data (metadata) is added by the second
compression
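The effect of compressing twice can be measured directly. This sketch uses base64-encoded random bytes as a hypothetical input, chosen because it compresses somewhat the first time and leaves almost nothing for a second pass. Exact sizes vary from run to run, so none are quoted here.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Text with limited redundancy: random bytes encoded as base64.
head -c 100000 /dev/urandom | base64 > data.txt

# -c writes to standard output, so the inputs are kept for comparison.
gzip -c data.txt > once.gz
gzip -c once.gz  > twice.gz

size_plain=$(wc -c < data.txt)
size_once=$(wc -c < once.gz)
size_twice=$(wc -c < twice.gz)
echo "plain: $size_plain  once: $size_once  twice: $size_twice"
```

On typical inputs the second pass is no smaller than the first, because it only adds its own header and trailer around data that already looks random.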
Mixing Archive and Compression
• Archiving means combining multiple files organized in a tree
structure into a single file
• Compression reduces the size of a single file
• So if we want to perform both, we can do this in two steps,
$ tar cvf pdf-files.tar *.pdf (this will generate pdf-files.tar)
$ gzip pdf-files.tar (this will generate pdf-files.tar.gz)
• We can replace gzip with any other compression tool
(depending on the desired compression format)
• We can also perform both tasks (archiving and compression) in a
single step
Using tar for Archive + Compress
$ tar <options> <compressed file> <files or folders>
$ tar <options> <compressed file>
• With the right set of options to the tar command, we can perform
both archiving and compression (to the desired compression format)
1. Add the option ‘z’ to perform gzip or gunzip
2. Add the option ‘j’ to perform bzip2 or bunzip2
3. Add the option ‘--lzma’ to perform lzma or unlzma
• Examples:
$ tar cvzf my-file.tar.gz ~/Documents/*.doc
$ tar xvzf my-file.tar.gz
$ tar xvjf file.tar.bz2
$ tar --lzma -cvf my-file.tar.lzma ~/my-project/ (the archive name must come right after ‘f’)
$ tar tzvf my-file.tar.gz
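The single-step form with ‘z’ can be checked end to end as below. The `project/` tree and the `restore/` directory are hypothetical names; `-C` tells tar to change into the given directory before extracting.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

mkdir project restore
seq 1 2000 > project/numbers.txt
printf 'readme\n' > project/README

# Archive and gzip-compress in one step.
tar czf project.tar.gz project

# List the contents without extracting.
tar tzf project.tar.gz

# Extract into a separate directory and compare with the original.
tar xzf project.tar.gz -C restore
cmp project/numbers.txt restore/project/numbers.txt && echo "contents match"
```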
Other Archive + Compression Tools
(the rar Command)
• The rar tool can be used to archive + compress or extract +
decompress files to/from the .rar format archive file
• The tool needs to be installed first
$ sudo apt-get install rar
• The rar tool is a very powerful tool; we will only cover its
basics
• To add a file (group of files) to a rar archive
$ rar a my-archive.rar ~/project/*.pdf
• This will add the pdf files to the my-archive.rar archive
• If the archive does not exist, it will be created
• If the archive exists, these files will be added to it
• To lock the archive to stop more file additions
$ rar k my-archive.rar
Accessing the RAR Archive file
(the rar Command)
• To list the contents of the archive,
$ rar l my-archive.rar
• To delete a file from the archive
$ rar d my-archive.rar my-file.pdf
• To extract the files into the current directory without
maintaining the original hierarchy (a flat set of files, without
creating subdirectories)
$ rar e my-archive.rar
$ rar e my-archive.rar file-1.pdf
• To extract the files with their full paths (to maintain the
directory structure). This creates subdirectories inside the
current directory
$ rar x my-archive.rar
$ rar x my-archive.rar file-2.pdf
Refreshing the Archive
(The rar Command)
• Let us assume we have archived all our project files
$ rar a my-archive.rar ./my-project
• Then we modified some of the files that have been archived
• Now we need to update the archive with the modified
files
$ rar f my-archive.rar (only refreshes the local files)
$ rar f my-archive.rar * (refreshes all subdirectories as well)
Other Archive + Compression Tools
(the zip/unzip tool)
• The tools ‘zip’ and ‘unzip’ create and extract archives in the ‘.zip’ format, which is common on Windows
• Note that those tools perform both archiving and compression
$ zip -r file.zip ~/my-documents/ (r for recursive)
$ unzip file.zip
• Note:
• If we run ‘zip’ on a directory with file.zip as the target and that file already exists,
then file.zip will be updated,
• New files added
• Modified files updated
Checking File Integrity
(The md5sum Command)
$ md5sum <file to be protected> > <checksum file>
$ md5sum -c <checksum file>
• If we have a file and we need to make sure it is not modified or
corrupted over time or after transferring it, we can protect its
integrity by calculating a checksum and storing it
$ md5sum my-file.txt > checksum-file
• Then at a later time, or after some procedure that may affect it, we
can recalculate the checksum and compare it to the original
$ md5sum -c checksum-file
• We get a status telling us whether the checksum matches the current file or
not
• Note that the checksum file will contain a 128-bit message digest of
the file content
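The store-then-verify cycle can be sketched as follows, assuming the GNU coreutils `md5sum`. The file contents are made up, and the `status` variable is introduced only for this example. `md5sum -c` exits with a non-zero status when a file no longer matches, which is what the `if` relies on.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf 'important data\n' > my-file.txt

# Store the digest: one line of 32 hex digits (128 bits) followed by the file name.
md5sum my-file.txt > checksum-file
cat checksum-file

# Verify while the file is still unchanged.
md5sum -c checksum-file

# Modify the file, then verify again.
printf 'tampered\n' >> my-file.txt
if md5sum -c checksum-file > /dev/null 2>&1; then
    status=unchanged
else
    status=modified
fi
echo "my-file.txt is $status"
```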
Checking Integrity for Multiple Files
• In case we need to check integrity for multiple files, we can use one
of the following methods
• We can archive the directory structure to be protected, and perform
an md5sum on the archive file
$ rar a my-archive.rar ~/my-project/
$ md5sum my-archive.rar > csum
• We can use shell expansion wildcards to select the files to be
protected
$ md5sum * > csum
$ md5sum *.java > csum
• We can use the find command to select the files to be protected
$ find ~ -type f -exec md5sum {} + > csum
$ find ~ -type f -name '*.pdf' -print0 | xargs -0 md5sum > csum
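The find-based method can be tried on a small tree. In this sketch the `proj/` layout is hypothetical, and the checksum file is written outside the tree being scanned so that it does not end up listed in itself.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

mkdir -p proj/src proj/docs
printf 'class A {}\n' > proj/src/A.java
printf 'guide\n'      > proj/docs/guide.txt

# One checksum line per regular file under proj/.
find proj -type f -exec md5sum {} + > csum
lines=$(wc -l < csum)
echo "files recorded: $lines"

# Verify every recorded file in one call.
md5sum -c csum
```

Piping the output of find straight into `md5sum` would instead checksum the list of file names, not the files themselves, which is why `-exec` or `xargs` is needed.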
http://Linux4EmbeddedSystems.com
