Advanced Technical SEO - Index Bloat & Discovery: from Facets to Javascript Frameworks - SMX Munich 2016

Ari Nahmani covers the latest in advanced technical SEO at SMX Munich (München) 2016. Topics include the deprecated HTML snapshot, JavaScript crawlability and indexing, new frameworks, prerendering, server-side rendering, prerender.io, isomorphic JavaScript, and other technical issues related to protecting your index health.

  1. The Latest in Advanced Technical SEO: Index Bloat & Discovery, from Facets to Frameworks
  2. Hi! Good Afternoon. Ari Nahmani, CEO / Founder, Kahena Digital Marketing, ari@kahenadigital.com
  3. Team Clients
  4. index bloat
  5. index bloat
  6. crawl budget
  7. web-tech > googlebot
  8. discoverability
  9. Today’s Session • Technical SEO issues around e-commerce / large site architecture • Preventing index bloat & preserving crawl budget as a core methodology • Current solutions & upcoming threats (JS, AJAX, new frameworks, pre-rendering)
  10. Index Bloat Prevention
  11. Index Bloat Prevention: the index is bloated if indexed URLs > “unique pages”
  12. Index Bloat Prevention: on an ecommerce site, the index is bloated if indexed URLs > sum(CAT + PDP + Static)
  13. Index Bloat Prevention: on a ‘content’ site, the index is bloated if indexed URLs > sum(Articles + Static)
  14. cannibalization
  15. Index Bloat Prevention: Cannibalization
  16. Index Bloat Prevention: Sorts & Facets
  17. Index Bloat Prevention: Sorts & Filters http://www.site.com/guys/tees/?prefn1=bvAverageRating&prefn2=colorGroup&prefv3=LG&srule=sortingNewArrival&prefv1=4&prefv2=RED&prefn3=size
  18. Index Bloat Prevention: Sorts & Filters <link rel="canonical" href="http://www.site.com/guys/tees/" /> • Basic Solution: Strip out the unnecessary parameters
  19. Solution: Filtering Out All Facet Params • PROS: – Avoids diluted / duplicate URLs (the canonical tag is a request, not a directive) • CONS: – If you want or need specific parameters indexed and exposed (size, color), you need properly coded canonical tag logic; otherwise it is a recipe for major leaks and confusion. – Considerations with pagination & view-all pages
  20. Crawl Budget: Facet Parameter URLs
  21. Crawl Budget: Facet Parameter URLs
  22. JS / AJAX Indexation
  23. Index Bloat vs Discovery: JS + AJAX
  24. Index Bloat Prevention: JS + AJAX. AJAX Refinement V1 = NO URL CHANGE
  25. Index Bloat Prevention: JS + AJAX. AJAX Refinement V1 = NO URL CHANGE, but an inactive, different href= URL exists
  26. AJAX Facet Refinements V1 (NO URL CHANGE) • PROS: – Theoretically no parameters exposed to bloat the index • CONS: – Users can’t share refined / filtered content with friends, and there is no accurate bookmarking. (Terrible UX) – Googlebot will still crawl hidden href= links or other JS framework links such as Angular’s ng-href= (check canonical logic!!)
  27. Index Bloat Prevention: JS + AJAX. AJAX Refinement V2 = HTML5 history.pushState()
  28. Index Bloat Prevention: JS + AJAX. HTML5 history.pushState() http://www.site.com/guys/tees/?color=green&size=large
  29. Consistent URL Signals - Navigation. Ideal consistency: Navigation URLs = pushState() URLs = Canonical URLs = XML Sitemap URLs =
  30. Consistent URL Signals - Navigation. Ideal consistency: Navigation URLs = pushState() URLs ≠ Canonical URLs = XML Sitemap URLs =
  31. Index Bloat Prevention: JS + AJAX. Google preferred the pushState() URL version, so we had to reinforce ours (via normal inline href="", canonical, XML sitemap)
  32. AJAX Facet Refinements V2 (pushState URL Change) • PROS: – Users can now share / bookmark the correct content – Added to browser history • CONS: – Still need a consistent canonical structure, because Googlebot crawls pushState() URLs – A different hidden URL structure via AJAX facets may require further unpredictable canonicalization logic / further dev work
  33. Indexing AJAX & JS Frameworks
  34. Indexing AJAX & JS Frameworks
  35. Indexing AJAX & JS Frameworks: What method exists that we know still works?
  36. Indexing AJAX & JS Frameworks: HTML SNAPSHOT
  37. Indexing AJAX & JS: HTML Snapshot <head> <meta name="fragment" content="!"> Google / Bing crawls with: _escaped_fragment_=
  38. Indexing AJAX & JS: HTML Snapshot
  39. Indexing AJAX & JS: HTML Snapshot
  40. Indexing AJAX & JS: How To Decide? • HTML SNAPSHOT _escaped_fragment_= • Trust Googlebot VALIDATE! • Progressive Enhancement • ‘Dumbed down’ HTML Template • 3rd Party Service (prerender.io) • Server side (PhantomJS / headless browser) • Pre-Rendered (to bots) • Pre or Realtime Rendered (to users & bots)
  41. Indexing AJAX & JS: How To Decide? • HTML SNAPSHOT _escaped_fragment_= • Trust Googlebot VALIDATE! • Progressive Enhancement • ‘Dumbed down’ HTML Template • 3rd Party Service (prerender.io) • Pre-Rendered (to bots) • Server side (PhantomJS / headless browser) • Pre or Realtime Rendered (to users & bots)
  42. Indexing AJAX & JS: HTML Snapshot • Upon crawl of a URL with _escaped_fragment_=, serve a ’dumbed down’ HTML version of the page. • Not pre-rendered, rather simplified. • For example, on ecommerce → a view-all category listing with no dynamic facets. Amazing results from our clients.
  43. Indexing AJAX & JS: How To Decide? • HTML SNAPSHOT _escaped_fragment_= • Trust Googlebot VALIDATE! • Progressive Enhancement • ‘Dumbed down’ HTML Template • 3rd Party Service (prerender.io) • Pre or Realtime Rendered (to users & bots) • Pre-Rendered (to bots) • Server side (PhantomJS / headless browser)
  44. Indexing AJAX & JS: Pre-rendering. Upon crawl of a URL with _escaped_fragment_=: 1. prerender.io – middleware via reverse proxy that serves a pre-rendered, cached HTML page to bots, OR 2. Server side – the server pre-renders the JS into cached HTML pages to serve to bots, or renders it in real time (headless browser).
  45. Indexing AJAX & JS: Prerender.io
  46. Indexing AJAX & JS: Prerender.io
  47. Indexing AJAX & JS: BromBone
  48. Indexing AJAX & JS: Server Prerender
  49. Indexing AJAX & JS: How To Decide? • Server side (PhantomJS / headless browser) • Pre or Realtime Rendered (to users & bots) • HTML SNAPSHOT _escaped_fragment_= • Trust Googlebot VALIDATE! • Progressive Enhancement • ‘Dumbed down’ HTML Template • 3rd Party Service (prerender.io) • Pre-Rendered (to bots)
  50. Indexing AJAX & JS: Server Side bit.ly/javascriptseo
  51. Indexing AJAX & JS: Server Side bit.ly/javascriptseo
  52. Indexing AJAX & JS: Server Side bit.ly/javascriptseo
  53. Indexing AJAX & JS: How To Decide? • Server side (PhantomJS / headless browser) • Pre or Realtime Rendered (to users & bots) • HTML SNAPSHOT _escaped_fragment_= • Trust Googlebot VALIDATE! • Progressive Enhancement • ‘Dumbed down’ HTML Template • 3rd Party Service (prerender.io) • Pre-Rendered (to bots)
  54. Indexing AJAX & JS: Trust Googlebot. Read these first…
  55. Testing JS Indexation: Jscrawlability.com
  56. Validation & Testing: Discovery vs Bloat
  57. Testing: Fetch & Render JS / AJAX
  58. Testing: Slice and Dice the Index. Advanced Site Operators: site:yoursite.com -inurl:cat.jsp -inurl:prod.jsp -inurl:store.jsp
  59. Testing: Slice and Dice the Index. Advanced Site Operators: site:yoursite.com inurl:size inurl:cat.jsp -inurl:cid
  60. Testing: Slice and Dice the Index. Advanced Site Operators: site:yoursite.com inurl:pdp intext:"write a review"
  61. Testing: Automate Bloat + Discovery Check
  62. Testing: Automate Bloat + Discovery Check
  63. Testing: Search Analytics for Bloat / Discovery
  64. Testing: Go To The Source: Server Logs!
  65. Summing It Up • Index Bloat, Crawl Budget, & Testing: Large sites are prone to serious index bloat and wasted crawl budget. They need diligent testing and an OCD-like attention to detail with the basics. Test often & automate! • JS/AJAX: pushState(), JS frameworks and AJAX present both discovery and bloat challenges. Know the options: short-term fixes like the HTML snapshot (Google + Bing), and long-term redesigns with modern frameworks that have built-in server-side rendering.
  66. Dankeschön! Questions? Ari Nahmani, CEO / Founder, Kahena Digital Marketing, ari@kahenadigital.com, @AriNahmani
  67. References: • Can You Now Trust Google To Crawl Ajax Sites? • Search Engine Optimization Best Practices for AJAX URLs | Webmaster Blog • We Tested How Googlebot Crawls Javascript And Here's What We Learned • Prerender - AngularJS SEO, BackboneJS SEO, or EmberJS SEO • SMX Munich Advanced Technical SEO Brainstorm - Google Docs • www.simoahava.com/seo/dynamically-added-meta-data-indexed-google-crawlers/ • Speakers | Search Marketing Expo - SMX Munich • JavaScript + SEO: Better Together - Medium • SEO AJAX Crawlability in a Responsive Publisher World • SEO Strategies for JavaScript-Heavy Single Page Applications or AJAX Sites | Search Engine Watch • The Basics of JavaScript Framework SEO in AngularJS - Builtvisible • Can Search Engines Crawl Javascript? • https://www.w3.org/wiki/Graceful_degradation_versus_progressive_enhancement#Graceful_degradation_and_progressive_enhancement_in_a_nutshell • SEO and JS: New Challenges • BromBone | SEO for your AngularJS, EmberJS, or BackboneJS website. • DIY AngularJS SEO with PhantomJS (the easy way!) | Lawsonry • https://scotch.io/tutorials/angularjs-seo-with-prerender-io
  68. Image Credits: fat-american-1.jpg (1280×955) • bigbrands1.jpg (570×383) • consistencydemotivator_large.jpeg (480×338) • 04-godfather-keep-friend.jpg (518×300) • 4da1a1a23dba011a7ba6918986a6b818302b949ae694b27d559cf8e73308bf7b.jpg (604×392) • the-17-craziest-cannibal-attacks-in-history-u2.jpg (520×272) • taxonomy-types-800x450.png (800×450) • wireframes-homecat.png (1000×460) • Check-yoself.jpg (800×1025) • Dangerous-Curve-Ahead-Sign-K-6513.gif (400×400) • crawlerserver2.png (884×445) • beach.png (1196×838)
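
The canonical-tag logic from the Sorts & Filters slides (strip the unnecessary facet and sort parameters, but keep the few you do want indexed, such as size and color) can be sketched roughly as follows. This is a minimal illustration, not the speaker's implementation: the function name `canonical_url` and the `keep_params` allow-list are hypothetical, and a real site would also need to handle pagination and view-all pages.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def canonical_url(url, keep_params=frozenset()):
    """Return `url` with every query parameter removed except those in
    `keep_params` (a hypothetical allow-list of indexable facets).

    Kept parameters are sorted so that the same facet combination always
    yields one canonical URL, whatever order the user clicked them in.
    """
    parts = urlsplit(url)
    kept = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in keep_params
    )
    # Drop the fragment as well; it is never sent to the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


if __name__ == "__main__":
    faceted = (
        "http://www.site.com/guys/tees/"
        "?prefn1=bvAverageRating&prefv1=4&srule=sortingNewArrival"
    )
    # Basic solution: strip everything.
    print(canonical_url(faceted))
    # Variant: expose only size and color.
    print(canonical_url(
        "http://www.site.com/guys/tees/?size=large&sort=price&color=green",
        keep_params={"size", "color"},
    ))
```

The value returned would go into the `href` of the `<link rel="canonical">` tag, and ideally the same URL would also be used in navigation links, pushState() calls and the XML sitemap so that all four signals agree.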

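
As a rough illustration of the "Go To The Source: Server Logs!" check, the sketch below counts how many Googlebot requests hit parameterized URLs versus clean ones, which gives a first impression of crawl budget spent on facet URLs. It assumes the common "combined" access log format and identifies Googlebot only by a user-agent substring (no reverse-DNS verification); the function name and the sample lines are made up for the example.

```python
import re
from collections import Counter

# Assumes the "combined" access log format: request line, status, size,
# referrer and user agent, each in the usual position.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_crawl_split(lines):
    """Count Googlebot hits on parameterized vs. clean URLs."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        # Naive bot check: user-agent substring only (an assumption).
        if not match or "Googlebot" not in match.group("agent"):
            continue
        kind = "parameterized" if "?" in match.group("path") else "clean"
        counts[kind] += 1
    return counts


if __name__ == "__main__":
    bot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    sample = [
        f'66.249.66.1 - - [17/Mar/2016:10:00:00 +0000] "GET /guys/tees/?color=green HTTP/1.1" 200 512 "-" "{bot}"',
        f'66.249.66.1 - - [17/Mar/2016:10:00:05 +0000] "GET /guys/tees/ HTTP/1.1" 200 512 "-" "{bot}"',
        '203.0.113.7 - - [17/Mar/2016:10:00:09 +0000] "GET /guys/tees/?size=large HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    ]
    print(dict(googlebot_crawl_split(sample)))
```

A high share of parameterized hits would suggest that facet and sort URLs are consuming crawl budget and that the canonical, navigation and sitemap signals deserve a closer look.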