record-oriented grep
mlr-grep
ryo1kato
@github
@gmail
@twitter
@facebook
motivation
Want to "grep" multi-line entries in a file
✦ multi-line log files, *.ini files, etc.
✦ semi-structured text such as ifconfig output
for example...
$ cat data.txt
[one]
two
three

[foo]
bar
baz

[hoge]
piyo
huga
Goal: extract every line of each record that contains a pattern, where a record is a multi-line block such as each one above.
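The goal stated above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code of mlr-grep itself; the names `split_records` and `grep_records` are hypothetical, and the sketch assumes records are separated by blank lines, as in data.txt.

```python
import re

# Sample input mirroring data.txt from the slide.
DATA = """[one]
two
three

[foo]
bar
baz

[hoge]
piyo
huga
"""


def split_records(text):
    """Split text into records separated by one or more blank lines (assumed format)."""
    return [r for r in re.split(r"\n\s*\n", text.strip("\n")) if r]


def grep_records(pattern, text):
    """Return every whole record in which the pattern matches anywhere."""
    regex = re.compile(pattern)
    return [r for r in split_records(text) if regex.search(r)]


if __name__ == "__main__":
    # Print each matching record in full, followed by a blank line.
    for record in grep_records("ba", DATA):
        print(record)
        print()
```

The pattern is tested against the whole record rather than line by line, so a single matching line is enough to select every line of its record.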
Typical way
✦ grep -A 12 -B 34 -C 56
✦ pcregrep --multiline
✦ awk -v RS='\n\n' "/$re/"
✦ perl -e …
But
✦ pcregrep: you often need a very long regex.
✦ Note that this is NOT about finding a multi-line pattern (a pattern containing '\n'), but about extracting a multi-line record that contains a pattern.
✦ AWK: possible using RS (needs gawk)
✦ In practice, it is difficult to get right with pcregrep or awk.
✦ perl, python: well, if you go that far ...
But, do you want to write a one-liner / X script for these?
✦ zgrep
✦ grep -c (--count)
✦ grep -i (--ignore-case)
✦ grep -v (--invert-match)
✦ grep --color
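To illustrate why these options are awkward to re-implement in every one-liner, here is a rough Python sketch of record-level equivalents of -i, -v, and -c. It is an assumption-laden illustration, not how mlr-grep implements them: the function name `grep_records` and its keyword arguments are hypothetical, and the records are assumed to be already split.

```python
import re

# Records assumed to be already split (same content as data.txt).
RECORDS = ["[one]\ntwo\nthree", "[foo]\nbar\nbaz", "[hoge]\npiyo\nhuga"]


def grep_records(pattern, records, ignore_case=False, invert=False):
    """Select records by pattern, with record-level analogues of grep -i and -v."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    # A record is kept when "it matches" differs from "invert was requested".
    return [r for r in records if bool(regex.search(r)) != invert]


if __name__ == "__main__":
    # Analogue of -i: case-insensitive match.
    print(grep_records("BA", RECORDS, ignore_case=True))
    # Analogue of -v: records that do NOT contain the pattern.
    print(grep_records("ba", RECORDS, invert=True))
    # Analogue of -c: count matching records instead of printing them.
    print(len(grep_records("ba", RECORDS)))
```

Note that counting and inverting operate on records here, not on lines, which is the behavior a plain grep cannot give.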
So I wrote it for you!
✦ mlr-grep
✦ Multi-Line Record Grep
✦ Implemented in AWK, Haskell, and Python
✦ named amlgrep, hmlgrep, and pymlgrep
✦ They have almost identical features.
e.g.
$ amlgrep 'ba' …
[foo]
bar
baz
(a whole record containing the pattern)
✦ amlgrep - AWK implementation
✦ Needs gawk.
✦ Fastest
✦ --rs regex is slightly broken on RHEL5.
✦ Automatically decompresses *.gz, *.bz2, and *.xz files
✦ --color, --count, --invert-match
✦ AND, OR of multiple keywords.
✦ hmlgrep - Haskell implementation
✦ Has almost the same feature set as the AWK version.
✦ Sometimes 1.5-2x slower, on files with short lines and many matches.
✦ pymlgrep - Python implementation
✦ Slowest (4x the run time of the AWK version)
✦ Doesn't support multiple keywords
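The automatic handling of *.gz, *.bz2, and *.xz files could look roughly like the following in Python. This is a sketch under the assumption that the compression format is chosen from the file extension alone; the helper name `open_maybe_compressed` is hypothetical and is not taken from the mlr-grep sources.

```python
import bz2
import gzip
import lzma

# Map file suffixes to the standard-library opener for that format.
OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}


def open_maybe_compressed(path):
    """Open a file as text, decompressing on the fly when the suffix is known."""
    for suffix, opener in OPENERS.items():
        if path.endswith(suffix):
            return opener(path, "rt", encoding="utf-8")
    # Anything else is treated as a plain text file.
    return open(path, "r", encoding="utf-8")
```

A caller can then read records from `open_maybe_compressed(name)` without caring whether the file was compressed, which is the convenience zgrep provides for line-oriented search.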
Multiple Keywords
$ amlgrep [--or] h t [FILE]
[one]
two
three

[hoge]
piyo
huga
≒ egrep 'h|t', but with fewer keystrokes.
$ amlgrep --and h t [FILE]
[one]
two
three
≒ egrep 'h.*t|t.*h', but with fewer keystrokes.
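The --or and --and behavior shown on the two slides above can be sketched as follows. This is an illustrative Python approximation only; `match_keywords` and `grep_keywords` are hypothetical names, and each keyword is assumed to be a regular expression tested against the whole record.

```python
import re

# Records assumed to be already split (same content as data.txt).
RECORDS = ["[one]\ntwo\nthree", "[foo]\nbar\nbaz", "[hoge]\npiyo\nhuga"]


def match_keywords(record, keywords, mode="or"):
    """True if any (mode='or') or all (mode='and') keywords match the record."""
    hits = (re.search(k, record) is not None for k in keywords)
    return all(hits) if mode == "and" else any(hits)


def grep_keywords(records, keywords, mode="or"):
    """Return the records selected by the keyword combination."""
    return [r for r in records if match_keywords(r, keywords, mode)]


if __name__ == "__main__":
    print(grep_keywords(RECORDS, ["h", "t"], mode="or"))
    print(grep_keywords(RECORDS, ["h", "t"], mode="and"))
```

Unlike the egrep equivalents, the keywords in --and mode may appear on different lines of the same record, because the test is applied per record rather than per line.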
--timestamp
For multi-line log files in which each entry begins with a timestamp
$ cat datetime.log
2014-01-23 12:34:56 log 1
foo
bar
2014-01-24 12:34:57 log 2
one
two
2014-01-25 12:34:58 log 3
hoge
piyo
$ amlgrep -t 'one' …
2014-01-24 12:34:57 log 2
one
two
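The idea behind --timestamp, where a new record starts at every line that begins with a timestamp, might be sketched like this in Python. The `TIMESTAMP` pattern below is a deliberately simplified assumption that only recognizes the `YYYY-MM-DD hh:mm:ss` form used in datetime.log; it is not the pattern the tool uses, and the function names are hypothetical.

```python
import re

# Sample input mirroring datetime.log from the slide.
LOG = """2014-01-23 12:34:56 log 1
foo
bar
2014-01-24 12:34:57 log 2
one
two
2014-01-25 12:34:58 log 3
hoge
piyo
"""

# Simplified assumption: only the ISO-like form seen in the sample log.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def split_by_header(lines, header):
    """Group lines into records; each line matching `header` starts a new record."""
    records, current = [], []
    for line in lines:
        if header.search(line) and current:
            records.append(current)
            current = []
        current.append(line)
    if current:
        records.append(current)
    return records


def grep_log(pattern, text):
    """Return the log entries (header line plus body) that contain the pattern."""
    regex = re.compile(pattern)
    entries = ["\n".join(rec) for rec in split_by_header(text.splitlines(), TIMESTAMP)]
    return [e for e in entries if regex.search(e)]


if __name__ == "__main__":
    for entry in grep_log("one", LOG):
        print(entry)
```

Here the separator is a header line that belongs to the record it opens, which differs from the blank-line case where the separator itself carries no content.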
$ amlgrep -t --dump foo
gawk -W re-interval -F \n -v RS='\n(((Mon|
Tue|Wed|Thu|Fri|Sat),?[ \t]+)?(Jan|Feb|
Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Dec),?
[ \t]*[0-9]{1,2},?[ \t][0-2][0-9]:[0-5]
[0-9](:[0-5][0-9])?(,?[ \t]20[0-9][0-9])?|
20[0-9][0-9]-(0[0-9]|11|12)-(0[1-9]|[12]
[0-9]|3[01]))' '-v' 'ORS=' 'oldRT $0 ~ /
foo/ {i++;if(substr(oldRT,1,1)=="\n")
{h=substr(oldRT,2)}else{h=oldRT};;gsub(/
foo/,"&",h);print h;gsub(/foo/,
"&");print;if(RT != "")printf "\n"}
{oldRT=RT} END{if (i>0){exit 0}else{exit
1}}'
Change the record separator
✦ --rs '^$'
✦ Empty lines
✦ --rs '^----'
✦ Four or more dashes
✦ --rs '^[[:alnum:]]'
✦ An alphanumeric character in the first column (for ifconfig-like output)
✦ --rs '^\['
✦ A line beginning with '[' (for *.ini files)
✦ --timestamp
≒ --rs '^(((Mon|Tue|Wed|Thu|Fri|Sat),?[ \t]+)?(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Dec),?[ \t]*[0-9]{1,2},?[ \t][0-2][0-9]:[0-5][0-9](:[0-5][0-9])?(,?[ \t]20[0-9][0-9])?|20[0-9][0-9]-(0[0-9]|11|12)-(0[1-9]|[12][0-9]|3[01]))'
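A configurable record separator like --rs can be approximated in Python by cutting the text wherever a user-supplied regular expression matches at the start of a line. This is a sketch under several assumptions: `split_at` and `grep_rs` are hypothetical names, the sample inputs are made up for illustration, and because Python's `re` module has no POSIX classes such as `[[:alnum:]]`, the ifconfig-style separator is written as `^[A-Za-z0-9]`.

```python
import re

# Hypothetical inputs for illustration only.
INI = "[one]\ntwo\nthree\n[foo]\nbar\nbaz\n[hoge]\npiyo\nhuga\n"
IFCONFIG = (
    "eth0  Link encap:Ethernet\n"
    "      inet addr:192.0.2.1\n"
    "lo    Link encap:Local Loopback\n"
    "      inet addr:127.0.0.1\n"
)


def split_at(rs, text):
    """Split text so that a new record starts wherever `rs` matches (per line)."""
    starts = [m.start() for m in re.finditer(rs, text, re.MULTILINE)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # keep any text before the first separator
    bounds = starts + [len(text)]
    # Drop slices that are empty or whitespace-only.
    return [text[a:b] for a, b in zip(bounds, bounds[1:]) if text[a:b].strip()]


def grep_rs(rs, pattern, text):
    """Return the records, as delimited by `rs`, that contain the pattern."""
    return [r for r in split_at(rs, text) if re.search(pattern, r)]


if __name__ == "__main__":
    print(grep_rs(r"^\[", "ba", INI), end="")
    print(grep_rs(r"^[A-Za-z0-9]", "127", IFCONFIG))
```

Treating the separator as the start of a record, rather than as a gap between records, is what lets the same mechanism cover *.ini sections, ifconfig-style blocks, and timestamped log entries.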
http://github.com/ryo1kato/mlr-grep
