SlideShare a Scribd company logo
1 of 21
Download to read offline
RedHat Enterprise Linux Essential
     Unit 7: Text Processing Tools
Objectives
Upon completion of this unit, you should be able to:

 Use tools for extracting, analyzing and manipulating
  text data
Tools for Extracting Text

 File Contents: less and cat

 File Excerpts: head and tail

 Extract by Column: cut

 Extract by Keyword: grep
Viewing File Contents
                             less and cat

 cat: dump one or more files to STDOUT

    Multiple files are concatenated together


 less: view file or STDIN one page at a time

    Useful commands while viewing:

       • /text searches for text

       • n/N jumps to the next/previous match

       • v opens the file in a text editor


 less is the pager used by man
Viewing File Excerpts
                               head and tail

 head: Display the first 10 lines of a file

    Use -n to change number of lines displayed


 tail: Display the last 10 lines of a file

    Use -n to change number of lines displayed


    Use -f to "follow" subsequent additions to the file
       • Very useful for monitoring log files!
Extracting Text by Keyword
                             grep
 Prints lines of files or STDIN where a pattern is matched
       $ grep 'john' /etc/passwd

       $ date --help | grep year

 Use -i to search case-insensitively

 Use -n to print line numbers of matches

 Use -v to print lines not containing pattern

 Use -AX to include the X lines after each match

 Use -BX to include the X lines before each match
Extracting Text by Column
                                 cut

 Display specific columns of file or STDIN data

  $ cut -d: -f1 /etc/passwd

  $ grep root /etc/passwd | cut -d: -f7


 Use -d to specify the column delimiter (default is TAB)

 Use -f to specify the column to print

 Use -c to cut by characters

  $ cut -c2-5 /usr/share/dict/words
Tools for Analyzing Text

 Text Stats: wc

 Sorting Text: sort

 Comparing Files: diff and patch

 Spell Check: aspell
Gathering Text Statistics
                       wc (word count)
 Counts words, lines, bytes and characters

 Can act upon a file or STDIN

       $ wc story.txt

       39   237   1901 story.txt

 Use -l for only line count

 Use -w for only word count

 Use -c for only byte count

 Use -m for character count (not displayed)
Sorting Text sort

 Sorts text to STDOUT - original file unchanged

       $ sort [options] file(s)
 Common options
    -r performs a reverse (descending) sort

    -n performs a numeric sort

    -f ignores (folds) case of characters in strings

    -u (unique) removes duplicate lines in output

    -t c uses c as a field separator

    -k X sorts by c-delimited field X
       • Can be used multiple times
Eliminating Duplicate Lines
                        sort and uniq
 sort -u: removes duplicate lines from input

 uniq: removes duplicate adjacent lines from input
    Use -c to count number of occurrences

    Use with sort for best effect:

      $ sort userlist.txt | uniq -c
Comparing Files
                              diff
 Compares two files for differences
      $ diff foo.conf-broken foo.conf-works
      5c5
      < use_widgets = no
      ---
      > use_widgets = yes
    Denotes a difference (change) on line 5

 Use gvimdiff for graphical diff
    Provided by vim-X11 package
Duplicating File Changes
                               patch
 diff output stored in a file is called a "patchfile"
    Use -u for "unified" diff, best in patchfiles

 patch duplicates changes in other files (use with care!)

 • Use -b to automatically back up changed files

  $ diff -u foo.conf-broken foo.conf-works > foo.patch

  $ patch -b foo.conf-broken foo.patch
Spell Checking with aspell

 Interactively spell-check files:
       $ aspell check letter.txt

 Non-interactively list mis-spelled words in STDIN

       $ aspell list < letter.txt

       $ aspell list < letter.txt | wc -l
Tools for Manipulating Text
                           tr and sed
 Alter (translate) Characters: tr
    Converts characters in one set to corresponding characters in another
     set
    Only reads data from STDIN

       $ tr 'a-z' 'A-Z' < lowercase.txt

 Alter Strings: sed
    stream editor

    Performs search/replace operations on a stream of text

    Normally does not alter source file

    Use -i.bak to back-up and alter source file
sed
                              Examples
 Quote search and replace instructions!

 sed addresses
    sed 's/dog/cat/g' pets

    sed '1,50s/dog/cat/g' pets

    sed '/digby/,/duncan/s/dog/cat/g' pets

 Multiple sed instructions
    sed -e 's/dog/cat/' -e 's/hi/lo/' pets

    sed -f myedits pets
Introduction awk

   Field/Column processor
   Supports egrep-compatible (POSIX) RegExes
   Can return full lines like grep
   Awk runs 3 steps:
     BEGIN - optional
     Body, where the main action(s) take place
     END - optional
 Multiple body actions can be executed by separating them using
  semicolons. e.g. '{ print $1; print $2 }'
 awk, auto-loops through input stream, regardless of the source of the
  stream. e.g. STDIN, Pipe, File
 Usage:
       awk '/optional_match/ { action }' file_name | Pipe
Example awk

 Print a text file
    awk '{print }' /etc/passwd

    awk '{print $0}' /etc/passwd

 Print specific field
    awk -F':' '{print $1}' /etc/passwd

 Pattern matching
    awk '$9 == 500 { print $0}' /var/log/httpd/access.log

 Print lines containing vmintam,student and khanh
    awk '/vmintam|student|khanh/' /etc/passwd
Example awk (con’t)

 print 1st lines from file
   awk "NR==1{print;exit}" /etc/resolv.conf

 Simply Arithmetic
   awk '{total += $1} END {print total}' earnings.txt

 Shell cannot calculate with floating point numberes, but awk can:
   awk 'BEGIN {printf "%.3fn", 2005.50 / 3}‘

 history | awk '{print $2}' | sort | uniq -c | sort -rn | head
Special Characters for Complex Searches
                 Regular Expressions
 ^ represents beginning of line

 $ represents end of line

 Character classes as in bash:
    [abc], [^abc]

    [[:upper:]], [^[:upper:]]

 Used by:
    grep, sed, less, others
Unit 8 text processing tools

More Related Content

What's hot (18)

Grep - A powerful search utility
Grep - A powerful search utilityGrep - A powerful search utility
Grep - A powerful search utility
 
101 3.7 search text files using regular expressions
101 3.7 search text files using regular expressions101 3.7 search text files using regular expressions
101 3.7 search text files using regular expressions
 
101 3.7 search text files using regular expressions
101 3.7 search text files using regular expressions101 3.7 search text files using regular expressions
101 3.7 search text files using regular expressions
 
15 practical grep command examples in linux
15 practical grep command examples in linux15 practical grep command examples in linux
15 practical grep command examples in linux
 
Grep
GrepGrep
Grep
 
Hex file and regex cheat sheet
Hex file and regex cheat sheetHex file and regex cheat sheet
Hex file and regex cheat sheet
 
PHP 5.3
PHP 5.3PHP 5.3
PHP 5.3
 
Unix - Filters
Unix - FiltersUnix - Filters
Unix - Filters
 
intro unix/linux 05
intro unix/linux 05intro unix/linux 05
intro unix/linux 05
 
Mysql
MysqlMysql
Mysql
 
Unix Basics
Unix BasicsUnix Basics
Unix Basics
 
Introduction to Python , Overview
Introduction to Python , OverviewIntroduction to Python , Overview
Introduction to Python , Overview
 
Using Unix
Using UnixUsing Unix
Using Unix
 
Programming in C
Programming in CProgramming in C
Programming in C
 
Unix Commands
Unix CommandsUnix Commands
Unix Commands
 
Linux com
Linux comLinux com
Linux com
 
Learning Grep
Learning GrepLearning Grep
Learning Grep
 
Chunked, dplyr for large text files
Chunked, dplyr for large text filesChunked, dplyr for large text files
Chunked, dplyr for large text files
 

Viewers also liked (7)

Speed protocol processor
Speed protocol processorSpeed protocol processor
Speed protocol processor
 
Word processor in the classroom
Word processor in the classroomWord processor in the classroom
Word processor in the classroom
 
Ictlessonepp4aralin10angcomputerfilesystem 150622081942-lva1-app6892 -
Ictlessonepp4aralin10angcomputerfilesystem 150622081942-lva1-app6892 -Ictlessonepp4aralin10angcomputerfilesystem 150622081942-lva1-app6892 -
Ictlessonepp4aralin10angcomputerfilesystem 150622081942-lva1-app6892 -
 
Ictlessonepp4 aralin11pananaliksikgamitanginternet-150622045536-lva1-app6891 -
Ictlessonepp4 aralin11pananaliksikgamitanginternet-150622045536-lva1-app6891 -Ictlessonepp4 aralin11pananaliksikgamitanginternet-150622045536-lva1-app6891 -
Ictlessonepp4 aralin11pananaliksikgamitanginternet-150622045536-lva1-app6891 -
 
Ict lesson epp 4 aralin 9 pangangalap ng impormasyon gamit ang ict
Ict lesson epp 4 aralin 9 pangangalap ng impormasyon gamit ang ictIct lesson epp 4 aralin 9 pangangalap ng impormasyon gamit ang ict
Ict lesson epp 4 aralin 9 pangangalap ng impormasyon gamit ang ict
 
K TO 12 GRADE 4 UNANG MARKAHANG PAGSUSULIT
K TO 12 GRADE 4 UNANG MARKAHANG PAGSUSULITK TO 12 GRADE 4 UNANG MARKAHANG PAGSUSULIT
K TO 12 GRADE 4 UNANG MARKAHANG PAGSUSULIT
 
Ppt for tranmission media
Ppt for tranmission mediaPpt for tranmission media
Ppt for tranmission media
 

Similar to Unit 8 text processing tools

Shell Scripts
Shell ScriptsShell Scripts
Shell Scripts
Dr.Ravi
 
1) List currently running jobsANS) see currently runningcommand.pdf
1) List currently running jobsANS) see currently runningcommand.pdf1) List currently running jobsANS) see currently runningcommand.pdf
1) List currently running jobsANS) see currently runningcommand.pdf
amaresh6333
 

Similar to Unit 8 text processing tools (20)

Handling Files Under Unix.pptx
Handling Files Under Unix.pptxHandling Files Under Unix.pptx
Handling Files Under Unix.pptx
 
Handling Files Under Unix.pptx
Handling Files Under Unix.pptxHandling Files Under Unix.pptx
Handling Files Under Unix.pptx
 
Cheatsheet: Hex file headers and regex
Cheatsheet: Hex file headers and regexCheatsheet: Hex file headers and regex
Cheatsheet: Hex file headers and regex
 
Ch05
Ch05Ch05
Ch05
 
Unix Trainning Doc.pptx
Unix Trainning Doc.pptxUnix Trainning Doc.pptx
Unix Trainning Doc.pptx
 
Linux
LinuxLinux
Linux
 
Linux
LinuxLinux
Linux
 
Linux
LinuxLinux
Linux
 
Linux
LinuxLinux
Linux
 
Spsl II unit
Spsl   II unitSpsl   II unit
Spsl II unit
 
Unix lab manual
Unix lab manualUnix lab manual
Unix lab manual
 
Vim and Python
Vim and PythonVim and Python
Vim and Python
 
intro unix/linux 06
intro unix/linux 06intro unix/linux 06
intro unix/linux 06
 
Scripting and the shell in LINUX
Scripting and the shell in LINUXScripting and the shell in LINUX
Scripting and the shell in LINUX
 
Linux Command Line - By Ranjan Raja
Linux Command Line - By Ranjan Raja Linux Command Line - By Ranjan Raja
Linux Command Line - By Ranjan Raja
 
101 3.7 search text files using regular expressions
101 3.7 search text files using regular expressions101 3.7 search text files using regular expressions
101 3.7 search text files using regular expressions
 
Shell Scripts
Shell ScriptsShell Scripts
Shell Scripts
 
1) List currently running jobsANS) see currently runningcommand.pdf
1) List currently running jobsANS) see currently runningcommand.pdf1) List currently running jobsANS) see currently runningcommand.pdf
1) List currently running jobsANS) see currently runningcommand.pdf
 
Unix
UnixUnix
Unix
 
Python_Unit_III.pptx
Python_Unit_III.pptxPython_Unit_III.pptx
Python_Unit_III.pptx
 

More from root_fibo

Unit 13 network client
Unit 13 network clientUnit 13 network client
Unit 13 network client
root_fibo
 
Unit 12 finding and processing files
Unit 12 finding and processing filesUnit 12 finding and processing files
Unit 12 finding and processing files
root_fibo
 
Unit 11 configuring the bash shell – shell script
Unit 11 configuring the bash shell – shell scriptUnit 11 configuring the bash shell – shell script
Unit 11 configuring the bash shell – shell script
root_fibo
 
Unit3 browsing the filesystem
Unit3 browsing the filesystemUnit3 browsing the filesystem
Unit3 browsing the filesystem
root_fibo
 
Unit 10 investigating and managing
Unit 10 investigating and managingUnit 10 investigating and managing
Unit 10 investigating and managing
root_fibo
 
Unit 9 basic system configuration tools
Unit 9 basic system configuration toolsUnit 9 basic system configuration tools
Unit 9 basic system configuration tools
root_fibo
 
Unit 7 standard i o
Unit 7 standard i oUnit 7 standard i o
Unit 7 standard i o
root_fibo
 
Unit 6 bash shell
Unit 6 bash shellUnit 6 bash shell
Unit 6 bash shell
root_fibo
 
Unit 5 vim an advanced text editor
Unit 5 vim an advanced text editorUnit 5 vim an advanced text editor
Unit 5 vim an advanced text editor
root_fibo
 
Unit 4 user and group
Unit 4 user and groupUnit 4 user and group
Unit 4 user and group
root_fibo
 

More from root_fibo (11)

Unit 13 network client
Unit 13 network clientUnit 13 network client
Unit 13 network client
 
Unit 12 finding and processing files
Unit 12 finding and processing filesUnit 12 finding and processing files
Unit 12 finding and processing files
 
Unit 11 configuring the bash shell – shell script
Unit 11 configuring the bash shell – shell scriptUnit 11 configuring the bash shell – shell script
Unit 11 configuring the bash shell – shell script
 
Unit3 browsing the filesystem
Unit3 browsing the filesystemUnit3 browsing the filesystem
Unit3 browsing the filesystem
 
Unit 10 investigating and managing
Unit 10 investigating and managingUnit 10 investigating and managing
Unit 10 investigating and managing
 
Unit 9 basic system configuration tools
Unit 9 basic system configuration toolsUnit 9 basic system configuration tools
Unit 9 basic system configuration tools
 
Unit 7 standard i o
Unit 7 standard i oUnit 7 standard i o
Unit 7 standard i o
 
Unit 6 bash shell
Unit 6 bash shellUnit 6 bash shell
Unit 6 bash shell
 
Unit 5 vim an advanced text editor
Unit 5 vim an advanced text editorUnit 5 vim an advanced text editor
Unit 5 vim an advanced text editor
 
Unit 4 user and group
Unit 4 user and groupUnit 4 user and group
Unit 4 user and group
 
Unit2 help
Unit2 helpUnit2 help
Unit2 help
 

Recently uploaded

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 

Recently uploaded (20)

Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 

Unit 8 text processing tools

  • 1. RedHat Enterprise Linux Essential Unit 7: Text Processing Tools
  • 2. Objectives Upon completion of this unit, you should be able to:  Use tools for extracting, analyzing and manipulating text data
  • 3. Tools for Extracting Text  File Contents: less and cat  File Excerpts: head and tail  Extract by Column: cut  Extract by Keyword: grep
  • 4. Viewing File Contents less and cat  cat: dump one or more files to STDOUT  Multiple files are concatenated together  less: view file or STDIN one page at a time  Useful commands while viewing: • /text searches for text • n/N jumps to the next/previous match • v opens the file in a text editor  less is the pager used by man
  • 5. Viewing File Excerpts head and tail  head: Display the first 10 lines of a file  Use -n to change number of lines displayed  tail: Display the last 10 lines of a file  Use -n to change number of lines displayed  Use -f to "follow" subsequent additions to the file • Very useful for monitoring log files!
  • 6. Extracting Text by Keyword grep  Prints lines of files or STDIN where a pattern is matched $ grep 'john' /etc/passwd $ date --help | grep year  Use -i to search case-insensitively  Use -n to print line numbers of matches  Use -v to print lines not containing pattern  Use -AX to include the X lines after each match  Use -BX to include the X lines before each match
  • 7. Extracting Text by Column cut  Display specific columns of file or STDIN data $ cut -d: -f1 /etc/passwd $ grep root /etc/passwd | cut -d: -f7  Use -d to specify the column delimiter (default is TAB)  Use -f to specify the column to print  Use -c to cut by characters $ cut -c2-5 /usr/share/dict/words
  • 8. Tools for Analyzing Text  Text Stats: wc  Sorting Text: sort  Comparing Files: diff and patch  Spell Check: aspell
  • 9. Gathering Text Statistics wc (word count)  Counts words, lines, bytes and characters  Can act upon a file or STDIN $ wc story.txt 39 237 1901 story.txt  Use -l for only line count  Use -w for only word count  Use -c for only byte count  Use -m for character count (not displayed)
  • 10. Sorting Text sort  Sorts text to STDOUT - original file unchanged $ sort [options] file(s)  Common options  -r performs a reverse (descending) sort  -n performs a numeric sort  -f ignores (folds) case of characters in strings  -u (unique) removes duplicate lines in output  -t c uses c as a field separator  -k X sorts by c-delimited field X • Can be used multiple times
  • 11. Eliminating Duplicate Lines sort and uniq  sort -u: removes duplicate lines from input  uniq: removes duplicate adjacent lines from input  Use -c to count number of occurrences  Use with sort for best effect: $ sort userlist.txt | uniq -c
  • 12. Comparing Files diff  Compares two files for differences $ diff foo.conf-broken foo.conf-works 5c5 < use_widgets = no --- > use_widgets = yes  Denotes a difference (change) on line 5  Use gvimdiff for graphical diff  Provided by vim-X11 package
  • 13. Duplicating File Changes patch  diff output stored in a file is called a "patchfile"  Use -u for "unified" diff, best in patchfiles  patch duplicates changes in other files (use with care!)  • Use -b to automatically back up changed files $ diff -u foo.conf-broken foo.conf-works > foo.patch $ patch -b foo.conf-broken foo.patch
  • 14. Spell Checking with aspell  Interactively spell-check files: $ aspell check letter.txt  Non-interactively list mis-spelled words in STDIN $ aspell list < letter.txt $ aspell list < letter.txt | wc -l
  • 15. Tools for Manipulating Text tr and sed  Alter (translate) Characters: tr  Converts characters in one set to corresponding characters in another set  Only reads data from STDIN $ tr 'a-z' 'A-Z' < lowercase.txt  Alter Strings: sed  stream editor  Performs search/replace operations on a stream of text  Normally does not alter source file  Use -i.bak to back-up and alter source file
  • 16. sed Examples  Quote search and replace instructions!  sed addresses  sed 's/dog/cat/g' pets  sed '1,50s/dog/cat/g' pets  sed '/digby/,/duncan/s/dog/cat/g' pets  Multiple sed instructions  sed -e 's/dog/cat/' -e 's/hi/lo/' pets  sed -f myedits pets
  • 17. Introduction awk  Field/Column processor  Supports egrep-compatible (POSIX) RegExes  Can return full lines like grep  Awk runs 3 steps:  BEGIN - optional  Body, where the main action(s) take place  END - optional  Multiple body actions can be executed by separating them using semicolons. e.g. '{ print $1; print $2 }'  awk, auto-loops through input stream, regardless of the source of the stream. e.g. STDIN, Pipe, File  Usage: awk '/optional_match/ { action }' file_name | Pipe
  • 18. Example awk  Print a text file awk '{print }' /etc/passwd awk '{print $0}' /etc/passwd  Print specific field awk -F':' '{print $1}' /etc/passwd  Pattern matching awk '$9 == 500 { print $0}' /var/log/httpd/access.log  Print lines containing vmintam,student and khanh awk '/vmintam|student|khanh/' /etc/passwd
  • 19. Example awk (con’t)  print 1st lines from file awk "NR==1{print;exit}" /etc/resolv.conf  Simply Arithmetic awk '{total += $1} END {print total}' earnings.txt  Shell cannot calculate with floating point numberes, but awk can: awk 'BEGIN {printf "%.3fn", 2005.50 / 3}‘  history | awk '{print $2}' | sort | uniq -c | sort -rn | head
  • 20. Special Characters for Complex Searches Regular Expressions  ^ represents beginning of line  $ represents end of line  Character classes as in bash:  [abc], [^abc]  [[:upper:]], [^[:upper:]]  Used by:  grep, sed, less, others