Fast and Parallel Webpage Layout
Leo A. Meyerovich, Rastislav Bodik
University of California, Berkeley

CPSC 722: Advanced Systems Seminar
Presenter: Tian Pan
Let’s get started with a story… in June, 2012 Facebook…
NYTimes: Facebook to rewrite their iOS app
BBC: Facebook recodes iOS mobile app to address speed complaints
Guardian: Facebook doubles iPhone app speed by dumping HTML5 for native code
…
There are 85,000+ iPhone applications in the same situation:
refactoring the existing UI or rewriting the client completely
   + downloaded over 2 billion times
   - cover less than 1% of online content
So we still need:
A browser that supports the emerging and diverse class of mobile devices

However,
  - Mobile devices have limited CPU resources.
  - The power wall forces hardware architects to apply increases in
    transistor counts towards improving parallel performance, not
    sequential performance.

Hence: a fast and parallel mobile browser
Outline
1.  Problem and background
2.  Challenges
3.  Solutions
4.  Conclusion
Data flow in a browser
Where are the bottlenecks in loading a page?




Lower bounds on CPU times for loading popular pages (Laptop)
      Layout matching and rendering (34%)
Input: HTML tree, CSS, fonts
Output: absolute element positions
Layout matching and rendering steps

Categories
I.   Selector matching: step 1
II.  Box and text layout: steps 2, 4, 5, 6
III. Glyph handling: step 3
IV.  Painting or rendering: step 7
Where are the bottlenecks in layout matching and rendering?

Challenges:
1. CSS selector matching
2. Box and text layout solving
3. Glyph rendering

3 < 2 < 1
Outline
1.  Problem and background
2.  Challenges
3.  Solutions
  3.1. CSS selector matching
  3.2. Box and text layout
  3.3. Glyph rendering
4.  Conclusion
3.1 CSS Selector Matching
Match CSS rules with HTML nodes

p img { margin: 10px; }
Selector: p img
Style constraints: { margin: 10px; }

<p>
  <img blahblah>
</p>
DOM node with CSS rules
CSS: a list of selector {rules}

    Selector    {Rules}
    …id1        r1
    …id2        r2
    …class1     r3
    …tag1       r4
    …class2     r5
    …class3     r6
    …           …

The list is split into three hash tables, keyed by attribute:

    id hash table        class hash table     tag hash table
    attributes  rules    attributes  rules    attributes  rules
    id1         r1       class1      r3       tag1        r4
    id2         r2       class2      r5       …           …
    …           …        class3      r6
                         …           …
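The pre-built hash tables above can be sketched as follows. This is a minimal illustration, assuming each selector has already been reduced to its rightmost simple selector; the `(kind, key, rule)` tuple layout and the names `RULES`, `build_tables`, and `TABLES` are hypothetical and not taken from the paper.

```python
from collections import defaultdict

# Each entry is (kind, key, rule); kind and key describe the rightmost
# simple selector. This layout is an assumption made for illustration.
RULES = [
    ("id", "id1", "r1"),
    ("id", "id2", "r2"),
    ("class", "class1", "r3"),
    ("tag", "tag1", "r4"),
    ("class", "class2", "r5"),
    ("class", "class3", "r6"),
]


def build_tables(rules):
    """Read the stylesheet once and split it into id/class/tag hash tables."""
    tables = {
        "id": defaultdict(list),
        "class": defaultdict(list),
        "tag": defaultdict(list),
    }
    for kind, key, rule in rules:
        tables[kind][key].append(rule)
    return tables


TABLES = build_tables(RULES)
print(dict(TABLES["class"]))
```

Keeping one table per attribute kind means a node only probes the tables for attributes it actually has, instead of scanning the whole rule list.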
HTML nodes

    node    attributes
    n1      id2, class2, class3, tag1
    n2      id1, tag1
    n3      class1, …
    …       …

Map: look up the nodes in each hash table

    id hash table     ->  node rules:  n2 r1;  n1 r2;  …
    class hash table  ->  node rules:  n3 r3;  n1 r5;  n1 r6;  …
    tag hash table    ->  node rules:  n1 r4;  n3 r4

Reduce: merge the matches per node

    node    rules
    n1      r2, r5, r6, r4
    n2      r1, r4
    n3      r4, …
    …       …
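The map and reduce steps can be sketched like this. It is a hedged illustration, not the paper's Cilk++ or TBB code: the dictionary layouts and the names `map_table` and `reduce_pairs` are assumptions, only nodes n1 and n2 are included because the slide elides n3's attributes, and the thread pool only shows the task structure (CPython threads will not give a real speedup for this kind of work).

```python
from concurrent.futures import ThreadPoolExecutor

# Hash tables and nodes copied from the slide's example; layout is assumed.
TABLES = {
    "id": {"id1": ["r1"], "id2": ["r2"]},
    "class": {"class1": ["r3"], "class2": ["r5"], "class3": ["r6"]},
    "tag": {"tag1": ["r4"]},
}
NODES = {
    "n1": {"id": ["id2"], "class": ["class2", "class3"], "tag": ["tag1"]},
    "n2": {"id": ["id1"], "class": [], "tag": ["tag1"]},
}


def map_table(kind):
    """Map step: match every node against one hash table."""
    table = TABLES[kind]
    pairs = []
    for node, attrs in NODES.items():
        for key in attrs[kind]:
            for rule in table.get(key, []):
                pairs.append((node, rule))
    return pairs


def reduce_pairs(pair_lists):
    """Reduce step: merge per-table (node, rule) pairs into one list per node."""
    matched = {node: [] for node in NODES}
    for pairs in pair_lists:
        for node, rule in pairs:
            if rule not in matched[node]:
                matched[node].append(rule)
    return matched


with ThreadPoolExecutor() as pool:
    # One task per hash table; pool.map keeps the id/class/tag order.
    results = list(pool.map(map_table, ["id", "class", "tag"]))

MATCHED = reduce_pairs(results)
print(MATCHED)
```

The map tasks share no mutable state, which is what makes them safe to run concurrently; all merging happens in the reduce step.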
Optimizations adopted by WebKit:
•  Hashtables. [×] Check the CSS repeatedly for every node.
   [√] Read the CSS only once, build a hashmap, and check the hash.
•  Right-to-left matching. Most selectors can be matched
   by examining only a short suffix of the path from the root to the node.
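Right-to-left matching can be sketched for descendant selectors such as `p img`. This is a simplified illustration that handles only tag names and the descendant combinator; the function name `matches_descendant` and its list-based arguments are hypothetical.

```python
def matches_descendant(selector_tags, path_tags):
    """Match a descendant selector such as ['p', 'img'] against a node.

    path_tags lists tag names from the root down to the node itself.
    Matching runs right to left: the node must match the last selector
    part, then ancestors are scanned upward for the remaining parts.
    """
    if not selector_tags or not path_tags:
        return False
    if path_tags[-1] != selector_tags[-1]:
        return False  # cheap early exit on the rightmost part
    remaining = len(selector_tags) - 2  # index of the next part to find
    for tag in reversed(path_tags[:-1]):
        if remaining < 0:
            break  # every part matched; no need to walk further up
        if tag == selector_tags[remaining]:
            remaining -= 1
    return remaining < 0


print(matches_descendant(["p", "img"], ["html", "body", "p", "img"]))
print(matches_descendant(["p", "img"], ["html", "body", "div", "img"]))
```

Most nodes fail the first comparison on the rightmost part, so the walk up the ancestor path is skipped for them.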
Other optimizations:
•  Hash tiling. Partition the hashtable into idHash,
   classHash, tagHash, … to reduce cache misses.
   (This could also be parallelized.)
•  Tokenization. Store attributes as integer tokens instead
   of strings to save cache space and comparison time.
•  Random load balancing. Assign selector-matching work
   randomly instead of in the original sequential order.
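Tokenization and random load balancing can be sketched together. Both helpers are hypothetical illustrations: `Tokenizer` interns attribute strings as integers, and `random_partition` shuffles work items before dealing them out to workers. The fixed seed is an assumption that keeps the example reproducible.

```python
import random


class Tokenizer:
    """Intern attribute strings as small integers (hypothetical helper)."""

    def __init__(self):
        self._ids = {}

    def token(self, text):
        # The first time a string is seen it gets the next free integer;
        # later lookups return the same integer.
        return self._ids.setdefault(text, len(self._ids))


def random_partition(items, workers, seed=0):
    """Shuffle the work items, then deal them round-robin to the workers."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::workers] for i in range(workers)]


tokens = Tokenizer()
print(tokens.token("class1"), tokens.token("tag1"), tokens.token("class1"))
print([len(part) for part in random_partition(range(10), 3)])
```

Comparing two integer tokens is a single machine comparison, whereas comparing strings touches every character. Shuffling spreads expensive neighbouring items across workers instead of handing one worker a costly contiguous run.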
Other optimizations (continued):
•  Result pre-allocation. Pre-allocate space for popular
   sites.
•  Delayed set insertion. Preallocate a vector sized for the
   potential matches.
•  Non-STL sets. Create the vector sized for the potential
   matches, add matches one by one, and do
   linear collision checks.
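The vector-backed set from the last bullet can be sketched as below. The class name `VectorSet` and its methods are hypothetical, and a Python list stands in for the preallocated vector; the point is only the shape of the technique (allocate once, append, check duplicates linearly).

```python
class VectorSet:
    """Set backed by a preallocated list with linear duplicate checks."""

    def __init__(self, capacity):
        # Allocated once, sized for the number of potential matches.
        self._slots = [None] * capacity
        self._size = 0

    def add(self, rule):
        """Insert rule unless already present; return True if inserted."""
        for i in range(self._size):  # linear collision check
            if self._slots[i] == rule:
                return False
        if self._size == len(self._slots):
            raise OverflowError("capacity underestimated")
        self._slots[self._size] = rule
        self._size += 1
        return True

    def items(self):
        """Return the stored rules in insertion order."""
        return self._slots[:self._size]


matches = VectorSet(capacity=4)
for rule in ["r2", "r5", "r2", "r4"]:
    matches.add(rule)
print(matches.items())
```

A linear scan is only attractive when the number of matches per node stays small, which is the assumption this design relies on.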
3.1 CSS Selector Matching
Evaluation

Cilk++: overall speedup of 13x and 14.8x with and without Gmail
Intel TBB: overall speedup of 55.2x and 64.8x with and without Gmail
Workstation: 204 ms -> 3.5 ms
Handheld: 3000 ms -> 50 ms
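As a quick check, the reported timings can be turned into speedup ratios. The sketch below only divides the slide's own numbers; the helper name `speedup` is made up for illustration.

```python
def speedup(before_ms, after_ms):
    """Return how many times faster the optimized run is."""
    return before_ms / after_ms


workstation = speedup(204, 3.5)
handheld = speedup(3000, 50)
print(f"workstation: {workstation:.1f}x, handheld: {handheld:.1f}x")
# prints "workstation: 58.3x, handheld: 60.0x"
```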
3.2 Box and text layout
Input: HTML tree nodes with symbolic constraint attributes
Output: actual layout details (size, shape, position) waiting to be painted into pixels

[Figure: layout constraints input (left) and layout constraints output (right)]
Unfortunately, layout is hard to optimize, because CSS
•  Is informally written and cross-cutting, e.g. infinite loops
•  Is confusing for webpage designers
•  Needs standards-compliant engines
Berkeley Style Sheets (BSS)
A new, more orthogonal, concise, well-defined
intermediate layout language
•  Transformed from CSS
•  Specified with an attribute grammar (opportunities
   for parallelization)
•  BSS0 (vertical and horizontal boxes), BSS1
   (BSS0 + shrink-to-fit sizing), BSS2
   (BSS1 + left floats)
BSS0 (vertical and horizontal boxes)
Attribute Grammars
Potential for parallelization: O(log|tree|)

[Figure: a tree of nodes n1 to n7 carrying attributes attrA to attrG. calcInherited() passes inherited attributes (IattrA, IattrB) from the root down to the leaves; calcSynthesized() combines synthesized attributes (S1 to S9) from the leaves up to the root.]

attr: attribute; Iattr: inherited attribute; S: synthesized attribute
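The two passes can be sketched on a tiny tree of vertical boxes. This is a simplified illustration rather than the BSS grammar itself: it assumes width is the only inherited attribute and height the only synthesized one, and the names `Box`, `intrinsic_height`, `calc_inherited`, and `calc_synthesized` are hypothetical. The code runs sequentially; the comments mark where subtrees are independent.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Box:
    intrinsic_height: int = 0  # height of a leaf's own content
    children: List["Box"] = field(default_factory=list)
    width: int = 0   # inherited attribute, filled top-down
    height: int = 0  # synthesized attribute, filled bottom-up


def calc_inherited(box, available_width):
    """Top-down pass: each child inherits the width of its parent."""
    box.width = available_width
    for child in box.children:  # sibling subtrees do not depend on each other
        calc_inherited(child, box.width)


def calc_synthesized(box):
    """Bottom-up pass: a vertical box is as tall as the sum of its children."""
    if not box.children:
        box.height = box.intrinsic_height
    else:
        # Each child's height depends only on that child's own subtree.
        box.height = sum(calc_synthesized(child) for child in box.children)
    return box.height


root = Box(children=[
    Box(intrinsic_height=20),
    Box(children=[Box(intrinsic_height=10), Box(intrinsic_height=15)]),
])
calc_inherited(root, 800)
calc_synthesized(root)
print(root.width, root.height)
```

Because each pass visits sibling subtrees independently, a parallel runtime could hand those subtrees to separate workers within a pass.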
3.2 Layout Constraint Solving
Evaluation

Slashdot.org, BSS1, Cilk++: 3x~4x
3.3 Glyph Rendering
By this point, the size and position of the text have been
calculated. How should the text be rendered?

requests -> request groups -> pull and render
(parallel and locality benefits)
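The request, group, and render pipeline can be sketched as follows. How requests are grouped is an assumption here: identical (font, size, character) requests share one group. `render_glyph` is a stand-in that returns a label rather than calling a real rasterizer, and the thread pool only illustrates the task structure.

```python
from concurrent.futures import ThreadPoolExecutor


def render_glyph(key):
    """Stand-in for a real rasterizer call; returns a fake bitmap label."""
    font, size, char = key
    return f"<{font}:{size}:{char}>"


def render_text(requests):
    """Group identical requests, render each group once, then pull results."""
    groups = {}
    for index, key in enumerate(requests):
        groups.setdefault(key, []).append(index)
    unique_keys = list(groups)
    with ThreadPoolExecutor() as pool:
        # Each distinct glyph is rendered once; groups are independent tasks.
        bitmaps = dict(zip(unique_keys, pool.map(render_glyph, unique_keys)))
    # Every original request pulls the bitmap rendered for its group.
    return [bitmaps[key] for key in requests], len(unique_keys)


requests = [("Arial", 12, "a"), ("Arial", 12, "b"), ("Arial", 12, "a")]
rendered, unique_count = render_text(requests)
print(rendered, unique_count)
```

Grouping avoids rendering the same glyph twice, and the independent groups are what a parallel runtime can distribute across cores.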
3.3 Glyph Rendering
Evaluation

FreeType2 font library, TBB: 3x~4x
4 Conclusion
The work addresses three bottlenecks in loading a page:
  1. CSS selector matching
      •  Pre-built hash tables, map-reduce
  2. Box and text layout solving
      •  Specify layout as attribute grammars
  3. Glyph rendering
      •  Combine requests into groups and render
         them in parallel
A milestone toward building a parallel mobile browser
Thanks~
