Wildcard expansion
- 2. Agenda
Introduction
Motivation
New Flow
Class Hierarchy
© 2011 Mentor Graphics Corp. Company Confidential
2 Mayank, Wildcard Match, March 2012 www.mentor.com
- 3. Motivation
Efficiently match a regular expression in an RTL design.
Use NELT to do the matching.
— The previous flow creates a separate data structure altogether to do matching.
— Using the NELT hierarchy would reduce memory usage.
Enhance functionality.
- 4. New Flow
Tokenize
• Tokenize the pattern.
• Store it in an appropriate data structure.
Match on NELT
• Start matching on NELT.
Match on UTG
• Do matching on UTG.
• For records/arrays.
- 5. New Flow
STEP 1 : Tokenize wildcard
E.g., the wildcard "a*.b*.*.*c*" is split into four tokens:
a* b* * *c*
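The split in Step 1 can be sketched as follows. This is a minimal illustration, assuming '.' is the only hierarchy separator and that no escaping is needed; `tokenizeWildcard` is a hypothetical helper name, not a function from the actual source.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the slides): split a hierarchical wildcard
// into one token per hierarchy level, using '.' as the level separator.
std::vector<std::string> tokenizeWildcard(const std::string& pattern) {
    std::vector<std::string> tokens;
    std::string current;
    for (char ch : pattern) {
        if (ch == '.') {
            tokens.push_back(current);  // close the token for this level
            current.clear();
        } else {
            current.push_back(ch);
        }
    }
    tokens.push_back(current);  // the last token has no trailing separator
    return tokens;
}
```

Calling `tokenizeWildcard("a*.b*.*.*c*")` yields the four tokens listed above.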
- 6. New Flow
STEP 2 : Start matching nodes in NELT
- Match the current token with top's children
[Figure: NELT tree rooted at top, with nodes a1, b1, a, aa, b2, b, c1, c, shown next to the token list a* b* * *c*]
- 7. New Flow
- Match "b*" with children of a1
[Figure: NELT tree rooted at top, with nodes a1, b1, a, aa, b2, b, c1, c, shown next to the token list a* b* * *c*]
- 8. New Flow
- Match "*" with children of b2
[Figure: NELT tree rooted at top, with nodes a1, b1, a, aa, b2, b, c1, c, shown next to the token list a* b* * *c*]
- 9. New Flow
- Match "*c*" and "*" with children of c1
[Figure: NELT tree rooted at top, with nodes a1, b1, a, aa, b2, b, c1, c, shown next to the token list a* b* * *c*]
Final Match
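The walk in Step 2 can be sketched as a recursive descent that consumes one token per tree level. This is a simplified illustration, not the NELT implementation: `Node`, `globMatch`, and `matchLevel` are hypothetical names, every token is treated as matching exactly one level, and the case where several tokens are tried against the same children (as in the last slide of the walk) is left out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a NELT node: a name plus child nodes.
struct Node {
    std::string name;
    std::vector<Node> children;
};

// Match one token against one node name; '*' matches any run of characters.
bool globMatch(const std::string& pat, const std::string& text) {
    std::size_t p = 0, t = 0;
    std::size_t starPos = std::string::npos, resume = 0;
    while (t < text.size()) {
        if (p < pat.size() && pat[p] == '*') {
            starPos = p++;   // remember the star, first try matching nothing
            resume = t;
        } else if (p < pat.size() && pat[p] == text[t]) {
            ++p;
            ++t;
        } else if (starPos != std::string::npos) {
            p = starPos + 1; // backtrack: let the last star absorb one more char
            t = ++resume;
        } else {
            return false;
        }
    }
    while (p < pat.size() && pat[p] == '*') ++p;  // trailing stars match empty
    return p == pat.size();
}

// Match tokens[index] against the children of `parent`, then recurse into
// each matching child with the next token. Full matches are appended to `out`.
void matchLevel(const Node& parent, const std::vector<std::string>& tokens,
                std::size_t index, const std::string& path,
                std::vector<std::string>& out) {
    if (index >= tokens.size()) return;
    for (const Node& child : parent.children) {
        if (!globMatch(tokens[index], child.name)) continue;
        std::string childPath =
            path.empty() ? child.name : path + "." + child.name;
        if (index + 1 == tokens.size()) {
            out.push_back(childPath);  // last token consumed: final match
        } else {
            matchLevel(child, tokens, index + 1, childPath, out);
        }
    }
}
```

With a tree loosely modeled on the figure (top has a1 and b1; a1 has a, aa, b2; b2 has b and c1; c1 has c), calling `matchLevel(top, {"a*", "b*", "*", "*c*"}, 0, "", out)` follows a1, then b2, then c1, and reports the path a1.b2.c1.c.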
- 10. New Flow
STEP 3 : Match on UTG hierarchy
— If we hit a record/array/subtype, we match using the UTG hierarchy.
- 11. Why match using UTG?
Because we do not create NELT for record symbols.
[Figure: hierarchy rooted at top, with nodes a1, b1, Record1, a, b2, b, Record2, f1, f2, f21, f22; part of the tree is labeled "No NELT for this portion"]
Hence we use UTG for matching inside records.
- 12. Tokenizing a wildcard
A token can be of two types:
— String Token
— Star Token
Star token is simply a '*'.
String token is anything other than '*'.
E.g., for "a*.b*.*.*c*":
— String tokens are a*, b*, *c*
— Star token is only one here: *
[Figure: Class Hierarchy. TokenBase is the base class of StringToken and StarToken]
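The token hierarchy above might look like the following in C++. The class names TokenBase, StringToken, and StarToken come from the slide; the members (`isStar`, `text`) and the factory `makeToken` are assumptions added for illustration, and the real classes may differ.

```cpp
#include <memory>
#include <string>
#include <utility>

// Base class for all tokens (name from the slide; interface is assumed).
struct TokenBase {
    virtual ~TokenBase() = default;
    virtual bool isStar() const = 0;
};

// Any token other than a bare "*", e.g. "a*" or "*c*".
struct StringToken : TokenBase {
    explicit StringToken(std::string t) : text(std::move(t)) {}
    bool isStar() const override { return false; }
    std::string text;
};

// A token that is exactly "*".
struct StarToken : TokenBase {
    bool isStar() const override { return true; }
};

// Hypothetical factory: classify one token string as described on the slide.
std::unique_ptr<TokenBase> makeToken(const std::string& text) {
    if (text == "*") return std::make_unique<StarToken>();
    return std::make_unique<StringToken>(text);
}
```

For "a*.b*.*.*c*", `makeToken` produces a StarToken only for the third token, in line with the example above.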
- 13. How to do Hierarchical Match?
How do we ensure that we match the hierarchy in the case of '*'?
There are two types of '*':
— Local Match Star
— Hierarchical Match Star
Local Star matches only the nodes at the current level.
Hierarchical Star matches all the nodes at the current and lower levels.
[Figure: Two types of '*' in regex. Star Token splits into Local Match Star and Hierarchical Match Star]
- 14. How to do Hierarchical Match?
Example.
If the pattern is "a*.b*.*.*c*", it will be converted to:
a* H* b* H* L* H* *c* H*
[Figure: Two types of '*' in regex. Star Token splits into Local Match Star (L*) and Hierarchical Match Star (H*)]
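One way to reproduce the conversion in this example is sketched below. The rule is inferred from the slide's examples rather than stated in the deck: a bare "*" becomes L*, and every token ending in '*' is followed by an H*. Stars in the middle of a token are not handled here, and `expandStars` is a hypothetical name.

```cpp
#include <string>
#include <vector>

// Hypothetical expansion, inferred from the example on the slide:
//  - a bare "*" is rewritten as a Local Match Star, written "L*";
//  - any token ending in '*' is followed by a Hierarchical Match Star, "H*".
std::vector<std::string> expandStars(const std::vector<std::string>& tokens) {
    std::vector<std::string> expanded;
    for (const std::string& tok : tokens) {
        expanded.push_back(tok == "*" ? std::string("L*") : tok);
        if (!tok.empty() && tok.back() == '*') {
            expanded.push_back("H*");
        }
    }
    return expanded;
}
```

Applied to the tokens a*, b*, *, *c*, this gives the eight-element sequence shown in the example.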
- 15. Organizing Tokens
Class NeltTokenArray contains
— vector<TokenBase*>
Class NeltTokenIndex contains
— NeltTokenArray*
— Index (current token)
Class NeltRegexExpr contains
— vector<NeltTokenIndex*>
[Figure: each class is drawn over the token list a* b* * *c*; NeltTokenIndex adds an Index marker, and NeltRegexExpr holds three such lists, each with its own Index]
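The relationship between the three classes can be sketched as below. The class names are from the slide; everything else is an assumption made to keep the example short: tokens are stored as plain strings instead of TokenBase*, cursors are held by value instead of by pointer, and the member functions `atEnd`, `current`, and `next` are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// The token list, stored once (the slide uses vector<TokenBase*>).
struct NeltTokenArray {
    std::vector<std::string> tokens;
};

// A cursor into a shared token array: which token is matched next.
struct NeltTokenIndex {
    const NeltTokenArray* array;  // shared, not owned
    std::size_t index;            // position of the current token

    bool atEnd() const { return index >= array->tokens.size(); }
    const std::string& current() const { return array->tokens[index]; }
    NeltTokenIndex next() const { return NeltTokenIndex{array, index + 1}; }
};

// The set of cursors that are active at one point of the tree walk
// (the slide uses vector<NeltTokenIndex*>).
struct NeltRegexExpr {
    std::vector<NeltTokenIndex> cursors;
};
```

Because every cursor only points at the array, several positions in the same pattern can be tracked at once without copying the tokens.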
- 16. Manager Classes
Class NeltRegexMgr is used to match on NELT.
Class NeltUtgRegexMgr is used to match on UTG.
It is the responsibility of NeltRegexMgr to invoke NeltUtgRegexMgr.
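The hand-off between the two managers could be sketched as follows. Only the two class names come from the slide. `Node`, `NodeKind`, the `match` methods, and the `visited` lists are hypothetical, and the sketch only records which manager handled which node instead of doing any real matching.

```cpp
#include <string>
#include <vector>

// Hypothetical node classification: records, arrays, and subtypes have no
// NELT below them, so they are handed over to the UTG matcher.
enum class NodeKind { Plain, Record, Array, Subtype };

struct Node {
    std::string name;
    NodeKind kind;
};

// Stand-in for the UTG matcher: records the nodes it was asked to handle.
struct NeltUtgRegexMgr {
    std::vector<std::string> visited;
    void match(const Node& node) { visited.push_back(node.name); }
};

// Stand-in for the NELT matcher: handles plain nodes itself and invokes
// the UTG matcher for everything else.
struct NeltRegexMgr {
    NeltUtgRegexMgr utgMgr;
    std::vector<std::string> visited;

    void match(const Node& node) {
        if (node.kind == NodeKind::Plain) {
            visited.push_back(node.name);
        } else {
            utgMgr.match(node);  // delegation is the NELT manager's job
        }
    }
};
```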
- 17. C++ classes
NeltRegexMgr
— NeltTraverse
NeltUtgRegexMgr
— NeltTypeTraverse
NeltRegexExpr
NeltTokenIndex
— NeltUtgTokenIndex
NeltTokenArray
TokenBase
— StringToken
— StarToken
[Figure: class diagram pairing NeltTraverse with NeltRegexMgr, NeltUtgTypeTraverse with NeltUtgRegexMgr, NeltTokenIndex with NeltUtgTokenIndex, and TokenBase with StringToken and StarToken]
- 18. Source Code
Source files
— src/commonpp/nelt
– neltRegexMgr.cxx
– neltRegexMgr.hxx
– neltUtgRegexMgr.cxx
– neltUtgRegexMgr.hxx
– neltRegexUtils.cxx
– neltRegexUtils.hxx
- 19. Performance data
S.No  Test Case  Old Flow Time (s)  New Flow Time (s)
1     Parme      161                484
2     Oracle     1814               1658
- 20. Future Work
Add a new class SliceToken deriving from TokenBase to store tokens of the form tok[slice].
Avoid duplicate matching.
— E.g., "a*.a*b" is expanded into two patterns:
1) a*.H*.a*b
2) a*.H*.a*.H*.*b
Both patterns begin with "a*.H*", and hence that prefix gets matched twice.
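A first step toward avoiding the duplicate work is to find how many leading tokens two expanded patterns share. The helper below is only an illustration of that idea; `commonPrefixLength` is a hypothetical name, and the deck does not say how the sharing would actually be implemented.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: number of leading tokens two expanded patterns share.
std::size_t commonPrefixLength(const std::vector<std::string>& a,
                               const std::vector<std::string>& b) {
    const std::size_t limit = std::min(a.size(), b.size());
    std::size_t i = 0;
    while (i < limit && a[i] == b[i]) {
        ++i;
    }
    return i;
}
```

For the two patterns above, the shared prefix is the two tokens a* and H*, which is the part that could be matched once and reused.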
- 21. www.mentor.com