Wildcard Match
Mayank Gupta and Rajpal Singh
0in FE Noida


March, 2012
Agenda

         Introduction
         Motivation
         New Flow
         Class Hierarchy




                                         © 2011 Mentor Graphics Corp. Company Confidential
2   Mayank, Wildcard Match, March 2012   www.mentor.com
Motivation

         Efficiently match a regular expression in an RTL design.
         Use NELT to do matching.
             — Previous flow creates a separate data structure altogether to do
               matching.
             — Using NELT hierarchy would reduce memory usage.
         Enhance Functionality.




New Flow

         Tokenize
             - Tokenize the pattern.
             - Store it in an appropriate data structure.
         Match on NELT
             - Start matching on NELT.
         Match on UTG
             - Do matching on UTG.
             - For records/arrays.




New Flow

    STEP 1 : Tokenize wildcard


    Eg : Wildcard is "a*.b*.*.*c*"


                 a*   b*   *   *c*
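
A minimal sketch of Step 1, assuming that '.' is the only hierarchy separator and that it is never escaped. The function name tokenizeWildcard is hypothetical and is not taken from the NELT sources.

```cpp
#include <string>
#include <vector>

// Split a hierarchical wildcard such as "a*.b*.*.*c*" into one token
// per hierarchy level. Assumption: '.' always separates levels.
std::vector<std::string> tokenizeWildcard(const std::string& pattern) {
    std::vector<std::string> tokens;
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type dot = pattern.find('.', start);
        if (dot == std::string::npos) {
            tokens.push_back(pattern.substr(start));  // last level
            break;
        }
        tokens.push_back(pattern.substr(start, dot - start));
        start = dot + 1;                              // skip the separator
    }
    return tokens;
}
```

Calling tokenizeWildcard("a*.b*.*.*c*") yields the four tokens shown on this slide.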




New Flow

    STEP 2 : Start matching nodes in NELT


    -     Match current token with top's children

    Tokens:  a*   b*   *   *c*

    NELT (figure):
        top
            a1
                a
                aa
                b2
                    b
                    c1
                        c
            b1
                b
                c1
                    c



New Flow

    -     Match "b*" with children of a1

    [Figure: the NELT from the previous slide, with the children of a1 (a, aa, b2) under consideration]



New Flow

    -     Match "*" with children of b2

    [Figure: the same NELT, with the children of b2 (b, c1) under consideration]



New Flow

    -     Match "*c*" and "*" with children of c1

    [Figure: the same NELT, with the children of c1 (c) under consideration; the fully consumed token row a* b* * *c* is labelled "Final Match"]
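
The walk-through of Step 2 can be sketched as a recursive descent that consumes one token per hierarchy level. This is a simplified illustration, not the NeltRegexMgr implementation: Node is a hypothetical stand-in for a NELT node, globMatch and matchPaths are made-up names, and every token is assumed to match exactly one level, so the hierarchical star is ignored here.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a NELT node: a name plus child nodes.
struct Node {
    std::string name;
    std::vector<Node> children;
};

// Glob match within one hierarchy level: '*' matches any run of
// characters, every other character must match literally.
bool globMatch(const std::string& pat, const std::string& text) {
    std::string::size_type p = 0, t = 0, mark = 0;
    std::string::size_type star = std::string::npos;
    while (t < text.size()) {
        if (p < pat.size() && pat[p] == '*') {
            star = p++;        // remember the star, first try matching nothing
            mark = t;
        } else if (p < pat.size() && pat[p] == text[t]) {
            ++p;
            ++t;
        } else if (star != std::string::npos) {
            p = star + 1;      // backtrack: the last star absorbs one more char
            t = ++mark;
        } else {
            return false;
        }
    }
    while (p < pat.size() && pat[p] == '*') ++p;  // trailing stars match nothing
    return p == pat.size();
}

// Match tokens[index] against the children of `parent`; recurse on each hit.
void matchLevel(const Node& parent, const std::vector<std::string>& tokens,
                std::vector<std::string>::size_type index,
                const std::string& prefix, std::vector<std::string>& out) {
    for (const Node& child : parent.children) {
        if (!globMatch(tokens[index], child.name)) continue;
        const std::string path =
            prefix.empty() ? child.name : prefix + "." + child.name;
        if (index + 1 == tokens.size()) {
            out.push_back(path);   // last token consumed: full match
        } else {
            matchLevel(child, tokens, index + 1, path, out);
        }
    }
}

// Return the dotted paths below `top` that match the token list.
std::vector<std::string> matchPaths(const Node& top,
                                    const std::vector<std::string>& tokens) {
    std::vector<std::string> out;
    if (!tokens.empty()) matchLevel(top, tokens, 0, "", out);
    return out;
}
```

On a tree shaped like the one in the figure (as far as it can be read from the slide), the tokens a*, b*, *, *c* follow a1, then b2, then c1, and end at c.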
New Flow

          Step 3 : Match on UTG hierarchy

              — If we hit a record/Array/Subtype we match using UTG
                Hierarchy.




Why match using UTG?

          Because we do not create NELT for record symbols.

          [Figure: design hierarchy under top with nodes a1, b1, a, b2, b, Record1, Record2, f1, f2, f21 and f22; part of the tree is marked "No NELT for this portion"]

          Hence we use UTG for matching inside records.
Tokenizing a wildcard

          A token can be of two types :
            - String Token
            - Star Token
          Star token is simply a '*'
          String token is anything other than '*'
          Eg : "a*.b*.*.*c*"
            - String Tokens are a*, b*, *c*
            - Star token is only 1 here: *

          Class Hierarchy (figure): TokenBase is the base class of StringToken and StarToken.
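
A sketch of the token class hierarchy, assuming a simple virtual interface. Only the class names TokenBase, StringToken and StarToken come from the slide; the member functions isStar and text and the factory makeToken are hypothetical.

```cpp
#include <memory>
#include <string>
#include <utility>

// Common base for all tokens of a tokenized wildcard.
class TokenBase {
public:
    virtual ~TokenBase() = default;
    virtual bool isStar() const = 0;
    virtual std::string text() const = 0;
};

// Any token other than a lone '*', for example "a*" or "*c*".
class StringToken : public TokenBase {
public:
    explicit StringToken(std::string text) : text_(std::move(text)) {}
    bool isStar() const override { return false; }
    std::string text() const override { return text_; }
private:
    std::string text_;
};

// A token that is exactly '*'.
class StarToken : public TokenBase {
public:
    bool isStar() const override { return true; }
    std::string text() const override { return "*"; }
};

// Classify one hierarchy level of the pattern.
std::unique_ptr<TokenBase> makeToken(const std::string& level) {
    if (level == "*") return std::unique_ptr<TokenBase>(new StarToken());
    return std::unique_ptr<TokenBase>(new StringToken(level));
}
```

For the example pattern, makeToken returns a StringToken for a*, b* and *c*, and a StarToken for the lone *.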




How to do Hierarchical Match?

          How do we ensure that we match the hierarchy in case of '*'?
          There are two types of '*':
            - Local Match Star
            - Hierarchical Match Star
          Local Star matches only the nodes at the current level.
          Hierarchical Star matches all the nodes at the current and lower levels.

          Figure (Two types of '*' in regex): Star Token divides into Local Match Star and Hierarchical Match Star.
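
The difference between the two stars can be illustrated on a small tree. This is a sketch under assumptions: Node is a hypothetical stand-in for a hierarchy node, and localStar and hierarchicalStar are made-up helper names.

```cpp
#include <string>
#include <vector>

// Hypothetical hierarchy node: a name plus child nodes.
struct Node {
    std::string name;
    std::vector<Node> children;
};

// Local star: names of the nodes directly below `parent` only.
std::vector<std::string> localStar(const Node& parent) {
    std::vector<std::string> names;
    for (const Node& child : parent.children) names.push_back(child.name);
    return names;
}

// Helper: append the dotted path of every node below `parent`, at any depth.
void collectBelow(const Node& parent, const std::string& prefix,
                  std::vector<std::string>& out) {
    for (const Node& child : parent.children) {
        const std::string path =
            prefix.empty() ? child.name : prefix + "." + child.name;
        out.push_back(path);
        collectBelow(child, path, out);   // descend to the lower levels
    }
}

// Hierarchical star: nodes at the current level and at all lower levels.
std::vector<std::string> hierarchicalStar(const Node& parent) {
    std::vector<std::string> paths;
    collectBelow(parent, "", paths);
    return paths;
}
```

For a node b2 with children b and c1, where c1 has a child c, the local star gives b and c1, while the hierarchical star additionally gives c1.c.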




How to do Hierarchical Match?

     Example.
          If the pattern is "a*.b*.*.*c*", it will be converted to

          a*   H*   b*   H*   L*   H*   *c*   H*

          where H* is a Hierarchical Match Star and L* is a Local Match Star.
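
One rule that reproduces the converted sequence above is sketched below. The rule is inferred from this single example and is an assumption, not the documented algorithm: a lone '*' becomes L*, and every level that ends in '*' is followed by H*. The names StarKind, ExpandedToken, expandLevels and render are hypothetical.

```cpp
#include <string>
#include <vector>

enum class StarKind { None, Local, Hierarchical };

struct ExpandedToken {
    std::string text;   // "a*", "L*", "H*", ...
    StarKind kind;
};

// Expand per-level tokens into the sequence used for matching.
// Assumed rule (inferred from the slide's example only):
//   - a lone "*" becomes a local star L*
//   - any level ending in '*' is followed by a hierarchical star H*
std::vector<ExpandedToken> expandLevels(const std::vector<std::string>& levels) {
    std::vector<ExpandedToken> out;
    for (const std::string& level : levels) {
        if (level == "*") {
            out.push_back({"L*", StarKind::Local});
        } else {
            out.push_back({level, StarKind::None});
        }
        if (!level.empty() && level.back() == '*') {
            out.push_back({"H*", StarKind::Hierarchical});
        }
    }
    return out;
}

// Join the expanded tokens with single spaces for display.
std::string render(const std::vector<ExpandedToken>& tokens) {
    std::string line;
    for (const ExpandedToken& tok : tokens) {
        if (!line.empty()) line += ' ';
        line += tok.text;
    }
    return line;
}
```

This sketch does not cover a star in the middle of a level (such as a*b), which the slides expand into more than one pattern.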




Organizing Tokens

          Class NeltTokenArray contains
            - vector<TokenBase*>
          Class NeltTokenIndex contains
            - NeltTokenArray*
            - Index (current token)
          Class NeltRegexExpr contains
            - vector<NeltTokenIndex*>

          [Figure: a NeltTokenArray holding a*, b*, *, *c*; a NeltTokenIndex pointing into such an array with its Index; a NeltRegexExpr holding three such indexed arrays]
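
The three containers can be sketched as plain structs. The class names and the listed members follow the slide; the member names tokens, array, index and cursors, the simplified TokenBase, and the helpers currentToken and advanced are assumptions. Pointers are non-owning in this sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct TokenBase { std::string text; };   // simplified placeholder

// The tokens of one pattern, in order.
struct NeltTokenArray {
    std::vector<TokenBase*> tokens;       // non-owning in this sketch
};

// A token array plus the position of the current token.
struct NeltTokenIndex {
    NeltTokenArray* array;
    std::size_t index;
};

// A set of cursors that are being matched together.
struct NeltRegexExpr {
    std::vector<NeltTokenIndex*> cursors;
};

// Current token of a cursor, or nullptr once every token is consumed.
const TokenBase* currentToken(const NeltTokenIndex& cursor) {
    if (cursor.array == nullptr ||
        cursor.index >= cursor.array->tokens.size()) {
        return nullptr;
    }
    return cursor.array->tokens[cursor.index];
}

// A copy of the cursor moved to the next token; the array is untouched.
NeltTokenIndex advanced(const NeltTokenIndex& cursor) {
    return NeltTokenIndex{cursor.array, cursor.index + 1};
}
```

Keeping the index outside the array means several partial matches can point into the same token array at different positions, which is presumably why the slide separates the two classes.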




Manager Classes

          Class NeltRegexMgr is used to match on NELT.
          Class NeltUtgRegexMgr is used to match on UTG.
          It is the responsibility of NeltRegexMgr to invoke NeltUtgRegexMgr.




C++ classes

          NeltRegexMgr
              - NeltTraverse
          NeltUtgRegexMgr
              - NeltTypeTraverse
          NeltRegexExpr
          NeltTokenIndex
              - NeltUtgTokenIndex
          NeltTokenArray
          TokenBase
              - StringToken
              - StarToken

          [Figure: class hierarchy boxes, upper over lower: NeltTraverse over NeltRegexMgr; NeltUtgTypeTraverse over NeltUtgRegexMgr; NeltTokenIndex over NeltUtgTokenIndex; TokenBase over StringToken and StarToken]



Source Code

          Source files
              — src/commonpp/nelt
                        –   neltRegexMgr.cxx
                        –   neltRegexMgr.hxx
                        –   neltUtgRegexMgr.cxx
                        –   neltUtgRegexMgr.hxx
                        –   neltRegexUtils.cxx
                        –   neltRegexUtils.hxx




Performance data

                S.No   Test Case   Old Flow Time (s)   New Flow Time (s)
                1      Parme       161                 484
                2      Oracle      1814                1658




Future Work

          Add a new class SliceToken deriving from TokenBase to
           store tokens of the form tok[slice]
          Avoid duplicate matching
              - Eg : "a*.a*b" is expanded into two patterns :
                1) a*.H*.a*b
                2) a*.H*.a*.H*.*b

                Both patterns begin with "a*.H*", so this prefix gets matched twice.
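
A first step towards avoiding the duplicate work could be to detect how many leading tokens two expanded patterns share, so that the shared part is matched only once. The helper below is a hypothetical sketch of that detection only; it is not part of the existing flow.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Number of leading tokens that two expanded patterns have in common.
std::size_t commonPrefixLength(const std::vector<std::string>& a,
                               const std::vector<std::string>& b) {
    std::size_t n = 0;
    while (n < a.size() && n < b.size() && a[n] == b[n]) ++n;
    return n;
}
```

For the two expansions of "a*.a*b" listed above, the shared prefix is the two tokens a* and H*.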



