Introduction to
          HTML::Untemplate
          or “Transmuting CPAN modules”

          (March 27, 2013)

Wednesday, March 27, 2013

The quicktalk was originally going to be about the module I wrote, HTML::Untemplate.
However, the technique that made it easier to develop applies to any other
module. By the way, HTML::Untemplate is already on CPAN, and documented.
xpathify
          /html/@lang
          pt-br

          //meta[@http-equiv='Content-Type'][1]/@content
          text/html; charset=UTF-8

          /html/head[1]/title[1]/text()
          Perl | 7Masters

          //link[@rel='shortcut icon'][1]/@href
          http://setemasters.imasters.com.br/wp-content/themes/setemasters/assets/images/favicon.ico

          //link[@rel='shortcut icon'][1]/@type
          image/x-icon

          //meta[@name='robots'][1]/@content
          noindex,nofollow

          //link[@rel='stylesheet'][1]/@href
          http://setemasters.imasters.com.br/wp-content/themes/setemasters/style.css?ver=1363802105

          //link[@rel='stylesheet'][1]/@media
          all

          //link[@rel='stylesheet'][1]/@type
          text/css

          /html/head[1]/script[1]/@src
          http://setemasters.imasters.com.br/wp-includes/js/jquery/jquery.js?ver=1.8.3

          /html/head[1]/script[1]/@type
          text/javascript





The module in question “reinvents” the presentation of an HTML document by transforming the
DOM tree into key/value pairs, where the keys are XPath strings that identify the location
of the content (the values).
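The flattening idea can be sketched in core Perl on a hand-built tree. This is only an illustration, not the module's code: the node shape (`tag`, `attr`, `children`) and the `flatten` helper are hypothetical, and the paths are always positional, without the attribute shortcuts such as `//meta[@name='robots']` seen in the xpathify output.

```perl
use strict;
use warnings;

# Hypothetical node shape: { tag => ..., attr => {...}, children => [...] }.
# A plain string among the children stands for a text node.
sub flatten {
    my ($node, $path, $pairs) = @_;

    # Each attribute becomes a "<path>/@name" key (sorted for stable output).
    for my $name (sort keys %{ $node->{attr} || {} }) {
        push @$pairs, [ sprintf('%s/@%s', $path, $name), $node->{attr}{$name} ];
    }

    my %seen;    # per-tag sibling counter, used for the [n] position
    for my $child (@{ $node->{children} || [] }) {
        if (ref $child) {
            my $n = ++$seen{ $child->{tag} };
            flatten($child, sprintf('%s/%s[%d]', $path, $child->{tag}, $n), $pairs);
        }
        else {
            push @$pairs, [ "$path/text()", $child ];
        }
    }
    return $pairs;
}

my $doc = {
    tag      => 'html',
    attr     => { lang => 'pt-br' },
    children => [
        {
            tag      => 'head',
            children => [ { tag => 'title', children => ['Perl | 7Masters'] } ],
        },
    ],
};

my $pairs = flatten($doc, '/html', []);

# Print each key on one line and its value on the next, as on the slide.
print "$_->[0]\n$_->[1]\n\n" for @$pairs;
```

For this tiny tree the keys are `/html/@lang` and `/html/head[1]/title[1]/text()`, the same shape as the first and third entries on the slide.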
untemplate
          /html/head[1]/title[1]/text()
          PHP | 7Masters
          Perl | 7Masters
          .Net | 7Masters
          Java | 7Masters

          //img[@class='attachment-edition-featured wp-post-image'][1]/@src
          http://setemasters.imasters.com.br/wp-content/uploads/sites/3/2013/02/php-160x1601.png
          http://setemasters.imasters.com.br/wp-content/uploads/sites/3/2013/02/perl-160x160.png
          http://setemasters.imasters.com.br/wp-content/uploads/sites/3/2013/02/dotnet-160x160.png
          http://setemasters.imasters.com.br/wp-content/uploads/sites/3/2013/02/java-160x160.png

          //img[@class='attachment-edition-featured wp-post-image'][1]/@alt
          php-160x160
          perl-160x160
          dotnet-160x160
          java-160x160

          //h2[@class='featured-edition-title'][1]/strong[1]/text()
          PHP
          Perl
          .Net
          Java

          //span[@class='release-date'][1]/text()
           30 de janeiro de 2013 
           A Partir das 19h00  27 de março de 2013 
           30 de outubro de 2012 
           27 de novembro de 2012 





This representation makes it easy to “diff” two (or more) HTML pages (the output of /usr/bin/diff
is usually a mess for HTML generated from a template).
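Once every page is a set of path/value pairs, the diff comes down to comparing hashes. The sketch below is an assumption about how that could look, not the untemplate implementation. `varying_paths` is a hypothetical helper that keeps only the paths whose values are not the same on every page.

```perl
use strict;
use warnings;

# Hypothetical helper: each argument is one page, already flattened into a
# { xpath => value } hash. Returns { xpath => [value per page] } for the
# paths whose values differ between pages.
sub varying_paths {
    my @pages = @_;

    # Collect every path that appears on at least one page.
    my %all_paths;
    for my $page (@pages) {
        $all_paths{$_} = 1 for keys %$page;
    }

    my %varying;
    for my $path (keys %all_paths) {
        # A path missing from a page counts as an empty value.
        my @values = map { $_->{$path} // '' } @pages;

        my %distinct;
        $distinct{$_} = 1 for @values;

        $varying{$path} = \@values if scalar(keys %distinct) > 1;
    }
    return \%varying;
}

my $php = {
    '/html/@lang'                   => 'pt-br',
    '/html/head[1]/title[1]/text()' => 'PHP | 7Masters',
};
my $perl = {
    '/html/@lang'                   => 'pt-br',
    '/html/head[1]/title[1]/text()' => 'Perl | 7Masters',
};

my $diff = varying_paths($php, $perl);
for my $path (sort keys %$diff) {
    print "$path\n";
    print "$_\n" for @{ $diff->{$path} };
}
```

Only the title path is reported here, because `/html/@lang` is identical on both pages. That is the template part dropping out and the content part remaining.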
The first step: an HTML parser
          http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454





Parsers look simple. In fact, 80% of a parser's functionality gets implemented
in 20% of the development time. Writing a complete HTML parser with regexps
is a recipe for failure (small, targeted regexps, however, are OK).
Reusing HTML::Tree
           <html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml"> @0
             <head> @0.0
               <title> @0.0.0
                  "The Perl Programming Language - www.perl.org"
               <meta content="text/html;charset=utf-8" http-equiv="Content-Type" /> @0.0.1
               <link href="http://st.pimg.net/perlweb/favicon.v249dfa7.ico" rel="shortcut icon" /> @0.0.2
               <link href="http://st.pimg.net/perlweb/css/leostyle.vf79cee0.css" rel="stylesheet" type="text/css" /> @0.0.3
               <link href="http://st.pimg.net/perlweb/css/www.ve1fe6bb.css" rel="stylesheet" type="text/css" /> @0.0.4
               <meta content="The Perl Programming Language at Perl.org. Links and other helpful resources for new and experienced Perl programmers." name="description" /> @0.0.5
               <script charset="utf-8" src="http://ajax.googleapis.com/ajax/libs/jquery/1.3.2/jquery.min.js" type="text/javascript"> @0.0.6
               <script charset="utf-8" src="http://st.pimg.net/perlweb/js/jquery.corner.v84b7681.js" type="text/javascript"> @0.0.7
               <script charset="utf-8" src="http://st.pimg.net/perlweb/js/leo.v9872b9c.js" type="text/javascript"> @0.0.8
             <body class="section_home"> @0.1
               <div id="header_holder"> @0.1.0
                  <div class="sub_holder"> @0.1.0.0
                    <div id="page_image"> @0.1.0.0.0
                    <h1> @0.1.0.0.1
                      " The Perl Programming Language "
                    <div id="logo_holder"> @0.1.0.0.2
                      <a href="/"> @0.1.0.0.2.0
                        <img align="right" alt="Perl, modern programming" height="65" id="logo" src="http://st.pimg.net/perlweb/images/camel_head.v25e738a.png" /> @0.1.0.0.2.0.0
                      " "
                      <span> @0.1.0.0.2.2
                        "www.perl.org"
               <div id="nav"> @0.1.1





HTML::Tree has existed and been actively maintained since 1998. It is extremely robust and
tolerates malformed input well. It implements the address() method, which gives a node's
location in the DOM tree. A good starting point!
SUPER:: power

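          package HTML::Linear;
          use base 'HTML::TreeBuilder';
          use strict;
          use warnings 'all';

          # Override eof() so the tree is post-processed once parsing ends.
          sub eof {
              my ($self, @args) = @_;
              # Let HTML::TreeBuilder finish building the tree first.
              my $retval = $self->SUPER::eof(@args);

              $self->deparse($self, []);
              ...;

              # Hand the parent's return value back to the caller.
              return $retval;
          }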





The “traditional” inheritance model, found in any language that implements the
OOP paradigm. In Perl's case, “raw” SUPER:: shifts the focus from “what to do” to “how to
do it” (too much code!).
Aspect-Oriented Programming through
          Moose Method Modifiers

          package HTML::Linear;
          use Moo;
          extends 'HTML::TreeBuilder';

          # Runs right after the inherited eof(); no explicit SUPER:: call,
          # and no return value to carry around.
          after eof => sub {
              my ($self) = @_;

              $self->deparse($self, []);
              ...;
          };





AOP exists in Java. In Perl, its biggest representative is Moose, which introduced “method
modifiers”. Less code, more abstraction. More focus on “what to do” than on “how to do it”.
See Moose::Manual::MethodModifiers for
          more details.
          package Example;

          use Moo; # Moose, Mouse?

          sub foo {
              print "    foo\n";
          }

          # Run before and after foo(); their return values are ignored.
          before foo => sub { print "about to call foo\n" };
          after foo  => sub { print "just called foo\n"   };

          # Wraps foo(): $orig is the original method, called explicitly.
          around foo => sub {
              my $orig = shift;
              my $self = shift;

              print "  I'm around foo\n";

              $orig->($self => @_);

              print "  I'm still around foo\n";
          };





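“before” runs ahead of the method and sees its arguments; “after” runs once it has returned; the
return value of either one is ignored. “around” is the equivalent of SUPER::, working through
a callback, so it can change both the arguments and the return value.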
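To make the callback idea concrete, here is a hand-rolled wrapper in core Perl. It is not how Moose or Moo implement modifiers. `wrap_around` is a hypothetical helper written for this illustration, and the `Example` class is a stand-in. It shows why an “around”-style wrapper can change both what goes into the original method and what comes out of it.

```perl
use strict;
use warnings;

package Example;

sub new { return bless {}, shift }
sub foo { my ($self, $x) = @_; return "foo($x)" }

package main;

# Hypothetical helper, not part of Moose or Moo: replace a named method with
# a wrapper that receives the original code reference as its first argument.
sub wrap_around {
    my ($class, $name, $wrapper) = @_;

    my $orig = $class->can($name)
        or die "$class has no method $name";

    # Install the wrapper in the class's symbol table under the same name.
    no strict 'refs';
    no warnings 'redefine';
    *{"${class}::$name"} = sub { $wrapper->($orig, @_) };
    return;
}

wrap_around(Example => foo => sub {
    my ($orig, $self, @args) = @_;

    my @doubled = map { $_ * 2 } @args;      # change the arguments going in
    my $result  = $self->$orig(@doubled);    # call the original method
    return uc $result;                       # change the value coming out
});

print Example->new->foo(21), "\n";
```

The wrapper decides if and how the original runs, which is the role `$self->SUPER::eof(@args)` played in the inheritance version.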
Moose, Mouse, Any::Moose, Moo...
          Note: Moo does not need ::NonMoose!




                                  CPAN: all your method are
                                         belong to us!



Methods can be altered, natively, in any CPAN module that uses Moose or Mouse.
Other modules can be assimilated via Mo[ou]seX::NonMoose.
Moo, in turn, can apply after/before/around without any “external artifact”, even to
XS modules!
Thank you!

                Stanislaw Pusep <stas@sysd.org>
                blogs.perl.org/users/stas
                coderwall.com/creaktive
                github.com/creaktive
                twitter.com/creaktive



