2. #SMX #15A @PatrickStox
• Technical SEO for IBM - Opinions expressed are my own and not
those of IBM.
• I write, mainly for Search Engine Land
• I speak at some conferences like this one, Pubcon, TechSEO Boost
• Organizer for the Raleigh SEO Meetup (most successful in US)
• We also run a conference, the Raleigh SEO Conference
• Also the Beer & SEO Meetup (because beer)
• 2017 US Search Awards Judge, 2017 UK Search Awards Judge, 2018
Interactive Marketing Awards Judge
Who is Patrick Stox?
4. #SMX #15A @PatrickStox
The content blocks make it into more of a page builder
https://testgutenberg.com/
WordPress is replacing their TinyMCE editor with
Gutenberg, built in React
5. #SMX #15A @PatrickStox
Many of these Developers haven’t worked with SEOs before and
many of the SEOs don’t understand how JavaScript works or how
search engines handle it.
What’s an SEO to do?
6. #SMX #15A @PatrickStox
Their focus is usually on functionality. Things like SEO and
Accessibility often aren’t their priority.
What about the Developers?
9. #SMX #15A @PatrickStox
Search Engines don’t interact with the Page
Content should be loaded by default, not based on a user
interaction like click, mouseover, or scroll
11. #SMX #15A @PatrickStox
AJAX Crawling Scheme – Deprecated in 2015
Support for this is ENDING – Google will render #! URLs
Change URL structure: example.com/#url >> example.com/#!url
Google and Bing would request: example.com/?_escaped_fragment_=url
Server would return an HTML snapshot
Google and Bing index example.com/#!url with content from
example.com/?_escaped_fragment_=url
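The URL mapping above can be sketched as a small function. This only illustrates how the deprecated scheme worked and is not something to build on. The name `toEscapedFragmentUrl` is made up for this example, and `encodeURIComponent` escapes more characters than the original scheme required.

```typescript
// Hypothetical helper: maps a #! URL to the URL a crawler would have requested
// under the old AJAX crawling scheme. Returns null when there is no #! fragment.
function toEscapedFragmentUrl(prettyUrl: string): string | null {
  const marker = "#!";
  const index = prettyUrl.indexOf(marker);
  if (index === -1) {
    return null;
  }
  const base = prettyUrl.slice(0, index);
  const fragment = prettyUrl.slice(index + marker.length);
  // Append to an existing query string if the base URL already has one.
  const separator = base.indexOf("?") === -1 ? "?" : "&";
  return base + separator + "_escaped_fragment_=" + encodeURIComponent(fragment);
}
```

For example, `toEscapedFragmentUrl("https://example.com/#!url")` gives `https://example.com/?_escaped_fragment_=url`.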
12. #SMX #15A @PatrickStox
You really want Clean URLs
To change URLs for different content, the History API and its
pushState() method are usually used.
Most of the frameworks have a router allowing you to customize.
/en/us?Topics%5B0%5D%5B0%5D=cat.topic%3Ainfrastructure
Create patterns to match
/{language}/{country}/{category}/{slug}
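As a rough sketch of what "create patterns to match" means, the function below matches a path against a `{placeholder}` pattern and pulls out the named parts. Real framework routers have their own syntax and APIs. `matchRoute`, `RouteParams`, and the sample path are assumptions made for this example.

```typescript
// Hypothetical types and names, for illustration only.
type RouteParams = { [name: string]: string };

function splitPath(path: string): string[] {
  // Drop any query string, then split into non-empty segments.
  return path.split("?")[0].split("/").filter((part) => part.length > 0);
}

// Returns the named parameters if the path fits the pattern, otherwise null.
function matchRoute(pattern: string, path: string): RouteParams | null {
  const patternParts = splitPath(pattern);
  const pathParts = splitPath(path);
  if (patternParts.length !== pathParts.length) {
    return null;
  }
  const params: RouteParams = {};
  for (let i = 0; i < patternParts.length; i++) {
    const placeholder = /^\{(\w+)\}$/.exec(patternParts[i]);
    if (placeholder) {
      // A {name} segment captures whatever is in that position.
      params[placeholder[1]] = decodeURIComponent(pathParts[i]);
    } else if (patternParts[i] !== pathParts[i]) {
      // A literal segment must match exactly.
      return null;
    }
  }
  return params;
}
```

With the pattern `/{language}/{country}/{category}/{slug}`, a made-up path like `/en/us/infrastructure/my-article` yields `language: "en"`, `country: "us"`, `category: "infrastructure"`, and `slug: "my-article"`.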
13. #SMX #15A @PatrickStox
It’s hard to 404
You can add a noindex to any error pages along with a message.
Will be treated as a soft-404.
JS redirect to an actual 404 page that returns the status code.
Create a 404 Route.
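One way to picture the options above is a catch-all route that decides what to do with an unknown path. This is a minimal sketch, not any framework's API. `resolveRoute`, `RouteResult`, and the `hard404Url` parameter are hypothetical names.

```typescript
// Hypothetical result shape for this sketch.
interface RouteResult {
  view: string;              // which view the app should show
  metaRobots: string | null; // value for a robots meta tag, if any
  redirectTo: string | null; // URL to redirect to with JS, if any
}

function resolveRoute(
  path: string,
  knownPaths: string[],
  hard404Url: string | null
): RouteResult {
  if (knownPaths.indexOf(path) !== -1) {
    return { view: path, metaRobots: null, redirectTo: null };
  }
  if (hard404Url !== null) {
    // Option 2: JS redirect to a real page that returns a 404 status code.
    return { view: "not-found", metaRobots: null, redirectTo: hard404Url };
  }
  // Option 1: show an error message and add noindex (treated as a soft 404).
  return { view: "not-found", metaRobots: "noindex", redirectTo: null };
}
```

The app would then apply the result, for instance by adding the meta tag or by setting the location to `redirectTo`.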
15. #SMX #15A @PatrickStox
Links
You still *should* use <a href=
ng-click, onclick, href="javascript:void(0);" – these won’t be seen as
links unless you also include the <a href=.
In my experience, if it remotely looks like a link or has a / in it,
Google will probably crawl it.
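A quick audit of link markup can be sketched as a check on the `href` value. The function name `isCrawlableHref` is made up for this example, and treating fragment-only values such as `#` as non-links is an assumption of the sketch, not something stated above.

```typescript
// Hypothetical helper: does this href value point somewhere a crawler can follow?
function isCrawlableHref(href: string | null | undefined): boolean {
  if (href === null || href === undefined) {
    return false; // no href at all, e.g. an element with only onclick or ng-click
  }
  const value = href.trim();
  if (value === "") {
    return false;
  }
  if (/^javascript:/i.test(value)) {
    return false; // e.g. javascript:void(0);
  }
  if (value.charAt(0) === "#") {
    return false; // assumption: fragment-only links are not separate pages
  }
  return true;
}
```

So `isCrawlableHref("/products/widgets")` is true, while `isCrawlableHref("javascript:void(0);")` and `isCrawlableHref(null)` are false.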
17. #SMX #15A @PatrickStox
This one is the troublemaker.
Rendered in the user’s browser or by the search engine.
Client-Side Rendering
18. #SMX #15A @PatrickStox
Longer to load and process but everything is available and can be
changed quickly. A loading image is typically used but you may see
a blank page.
Why Client-Side Rendering?
20. #SMX #15A @PatrickStox
The server processes JS and sends the processed HTML to users or
search engines.
Slower TTFB unless you cache. Will work for the ~2% of users with
JS disabled.
Server-Side Rendering (SSR)
21. #SMX #15A @PatrickStox
A headless browser records the DOM (Document Object Model)
and creates an HTML snapshot. Like SSR, but done pre-deployment.
Prerender.io, BromBone, PhantomJS
May not serve the latest version, doesn’t allow for personalization,
will work for the ~2% of users with JS disabled.
Pre-Rendering
22. #SMX #15A @PatrickStox
Serves a rendered version on load but then replaces it with JS for
subsequent loads.
This is probably the best setup, but it can be a lot of resources to
load.
Hybrid Rendering – Isomorphic or Universal
23. #SMX #15A @PatrickStox
Send normal client-side rendered content to users and SSR
content to search engines.
Dynamic Rendering – A Policy Change
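In practice, dynamic rendering comes down to a user-agent check on the server. The sketch below only shows that decision. `chooseRenderMode` and `BOT_PATTERN` are hypothetical names, and the pattern covers just two crawlers for illustration; a real setup would keep a fuller list.

```typescript
type RenderMode = "server-side" | "client-side";

// Illustrative only: matches the Googlebot and bingbot user-agent tokens.
const BOT_PATTERN = /googlebot|bingbot/i;

// Decide which version of the page to send based on the request's user agent.
function chooseRenderMode(userAgent: string | undefined): RenderMode {
  if (userAgent !== undefined && BOT_PATTERN.test(userAgent)) {
    return "server-side";
  }
  return "client-side";
}
```

A request whose user agent contains `Googlebot/2.1` would get the server-side rendered HTML, while a regular browser would get the client-side rendered app.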
27. #SMX #15A @PatrickStox
Bing
Bing confirmed support for JSON-LD
https://searchengineland.com/bing-confirmed-support-json-ld-formatted-schema-org-markup-293508
Fabrice Canel of Bing said at Pubcon Vegas 2017 that Bing
processes JS.
32. #SMX #15A @PatrickStox
So don’t use client-side rendered JS.
I’ve got maybe 5 examples of JS rendering on other
search engines and hundreds where it doesn’t
33. #SMX #15A @PatrickStox
Share these #SMXInsights on your social channels!
#SMXInsights
Try to avoid Client-side rendered JS. Other
options include:
– Server-side render
– Pre-render
– Hybrid render
– Dynamic render
37. #SMX #15A @PatrickStox
There is a second wave of indexing for
client side rendered JS websites. The 1st is
an HTML snapshot and the second one
comes later, after the Web Rendering
Service (WRS) processes the code.
42. #SMX #15A @PatrickStox
What does Second Indexing mean for you?
• You may not find your content yet using site:domain/page “phrase”
• You’ll see errors in the GSC HTML Improvements Report
• You might need to check the source code vs the DOM to see what
will change once the Web Rendering Service (WRS) renders the
page
• Nofollow added via JS is a bad idea because it may show as follow
before rendered
• Internal links may not be picked up and added to crawl before the
render happens
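To see which internal links exist only after rendering, you can compare the raw source with the rendered DOM. The sketch below does this with a regular expression, which is a simplification and not a real HTML parser. `extractHrefs` and `linksOnlyInRendered` are hypothetical names.

```typescript
// Collect unique href values from <a> tags (regex-based, illustration only).
function extractHrefs(html: string): string[] {
  const pattern = /<a\b[^>]*\bhref\s*=\s*["']([^"']*)["']/gi;
  const hrefs: string[] = [];
  let match: RegExpExecArray | null = pattern.exec(html);
  while (match !== null) {
    if (hrefs.indexOf(match[1]) === -1) {
      hrefs.push(match[1]);
    }
    match = pattern.exec(html);
  }
  return hrefs;
}

// Links present in the rendered DOM but missing from the raw source.
// These may not be crawled until the render happens.
function linksOnlyInRendered(rawHtml: string, renderedHtml: string): string[] {
  const rawLinks = extractHrefs(rawHtml);
  return extractHrefs(renderedHtml).filter((href) => rawLinks.indexOf(href) === -1);
}
```

You would feed it the view-source HTML and the HTML copied from the rendered DOM; any link it returns depends on JavaScript to exist.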
44. #SMX #15A @PatrickStox
Canonical tags
Were all those people who injected canonical tags with JS and
systems like Google Tag Manager wasting their time?
Let’s find out! I set up a test on a website that didn’t have a
canonical set and have injected a canonical that points to another
website. Is https://www.stoxseo.com/ indexed as
https://patrickstox.com/? Search Google for info:stoxseo.com
45. #SMX #15A @PatrickStox
Canonical tags
Stoxseo.com is being treated as patrickstox.com, so Google is
respecting my canonical inserted with JavaScript even though they
said at Google I/O 2018 that they don’t.
46. #SMX #15A @PatrickStox
Does Google respect canonical tags
injected with JavaScript? They said at
Google I/O 2018 that they don’t, but my
test is live and shows they do. Search
Google for info:stoxseo.com cc: @johnmu
48. #SMX #15A @PatrickStox
Lots of things supported by modern
browsers aren’t supported by Chrome 41.
Graceful degradation and polyfills are
important for now, although Google hopes
to update to the latest Chrome this year.
https://caniuse.com/#compare=chrome+41,chrome+69
49. #SMX #15A @PatrickStox
Googlebot and WRS only speak HTTP/1.x and FTP, with and
without TLS.
WRS and Googlebot don't support the WebSocket protocol.
50. #SMX #15A @PatrickStox
Use feature detection to identify supported APIs and capabilities of
the WRS, and polyfills where applicable — just as you would for any
other browser — as the capabilities of WRS may update at any
time:
• IndexedDB and WebSQL interfaces are disabled.
• Interfaces defined by the Service Worker specification are
disabled.
• WebGL interface is disabled; 3D and VR content is not currently
indexed.
WRS disables some interfaces and capabilities
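Feature detection can be sketched as a check for whether each interface is actually present before using it. The function below takes the global object as a parameter so it can be tried outside a browser. `detectFeatures`, `FeatureReport`, and `Scope` are made-up names, and the three checks shown are only a sample.

```typescript
// Hypothetical types for this sketch.
type Scope = { [name: string]: unknown };

interface FeatureReport {
  indexedDB: boolean;
  webSql: boolean;
  serviceWorker: boolean;
}

function isPresent(value: unknown): boolean {
  return value !== undefined && value !== null;
}

// Report which interfaces exist on the given global object.
function detectFeatures(scope: Scope): FeatureReport {
  const nav = scope["navigator"] as Scope | null | undefined;
  return {
    indexedDB: isPresent(scope["indexedDB"]),
    webSql: isPresent(scope["openDatabase"]),
    serviceWorker: nav !== undefined && nav !== null && isPresent(nav["serviceWorker"]),
  };
}
```

In a browser you would pass the global `window` object and fall back (or load a polyfill) for anything reported as false, so the page still renders its essential content.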
51. #SMX #15A @PatrickStox
WRS loads each URL, following server and client redirects, same as
a regular browser. However, WRS does not retain state across page
loads:
• Local Storage and Session Storage data are cleared across page
loads.
• HTTP Cookies are cleared across page loads.
Googlebot and WRS are stateless across page loads
52. #SMX #15A @PatrickStox
Any features that require user consent are auto-declined by
Googlebot. For a full list of affected features, refer to
the Permission Registry. For example, Camera API, Geolocation
API, and Notifications API.
WRS declines permission requests
53. #SMX #15A @PatrickStox
Googlebot is designed to be a good citizen of the web. Crawling is
its main priority, while making sure it doesn't degrade the
experience of users visiting the site. Googlebot and WRS
continuously analyze and identify resources that don’t contribute
to essential page content and may not fetch such resources. For
example, reporting and error requests, and other similar types of
requests, are unused or unnecessary for extracting essential page
content.
Googlebot and WRS prioritize essential page content
54. #SMX #15A @PatrickStox
Mostly crawls from West Coast US (Mountain View).
According to Gary Illyes at SMX Advanced 2017 they don’t throttle
speeds when checking mobile sites.
They are very aggressive with caching everything (you may want to
use file versioning). This can lead to some impossible states being
indexed if parts of old files are cached.
Other things to know about Googlebot and the WRS
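File versioning usually means putting a fingerprint of the file's content into its name, so a changed file gets a new URL and a stale cached copy is never mixed with new files. Below is a minimal sketch of the idea. `versionedFileName` and `simpleHash` are hypothetical, and the small non-cryptographic hash is a stand-in for the content hash a build tool would normally produce.

```typescript
// Small non-cryptographic hash (djb2-style), used here only as a stand-in.
function simpleHash(content: string): string {
  let hash = 5381;
  for (let i = 0; i < content.length; i++) {
    hash = ((hash * 33) ^ content.charCodeAt(i)) >>> 0;
  }
  return hash.toString(16);
}

// Insert the content fingerprint before the file extension: app.js -> app.<hash>.js
function versionedFileName(fileName: string, content: string): string {
  const version = simpleHash(content);
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) {
    return fileName + "." + version; // no extension to preserve
  }
  return fileName.slice(0, dot) + "." + version + fileName.slice(dot);
}
```

The same content always produces the same name, and changed content is expected to produce a different one, so every deploy references a consistent set of files.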
55. #SMX #15A @PatrickStox
In fact, because of the caching and you know being a bot and all,
they actually run things as fast as they can with a sped up clock.
Check out this post from Tom Anthony
http://www.tomanthony.co.uk/blog/googlebot-javascript-random/
Googlebot runs fast
56. #SMX #15A @PatrickStox
Because of this experiment from Max Prin, most people believe
that Googlebot will only wait 5 seconds for a page to load.
https://maxxeight.com/tests/js-timer/
5 second rule
58. #SMX #15A @PatrickStox
There is no fixed timeout of 5 seconds for
Googlebot as is commonly believed for
JS. Between aggressive caching and
running things as fast as possible, there is
no known or fixed limit.
59. #SMX #15A @PatrickStox
Googlebot renders with a long viewport.
Mobile screen size is 431 X 731 and
Google resizes to 12,140 pixels high, while
the desktop version is 768 X 1024 but
Google only resizes to 9,307 pixels high.
Credit to JR Oakes @jroakes
https://codeseo.io/console-log-hacking-for-googlebot/
61. #SMX #15A @PatrickStox
Both of these are raw HTML, before JS has been processed and
they’re unreliable.
Cache can process the JS as it was but it’s being processed in your
browser, making it deceptive. This isn’t what Google saw.
View Source and Google’s Cache
62. #SMX #15A @PatrickStox
Viewing the DOM shows you the HTML after the JS has been
processed.
If you want to make sure Google sees it, load it into the DOM by
default without needing an action like click or scroll.
Inspect or Inspect Element
63. #SMX #15A @PatrickStox
Search Google: site:domain.com “snippet of text from your page”
If the second indexing hasn’t happened, this may not show a result.
To check indexing
64. #SMX #15A @PatrickStox
Google Search Console / Fetch and Render - renders a page with
Googlebot as either desktop or mobile and lists blocked /
unreachable resources. It does not show the processed DOM or
have a debug mode.
Lets you submit rendered pages to Google search for indexing. This
is stricter than the system normally used. Just because you submit
to index after fetch and render doesn’t mean the WRS has
processed the page for second indexing!
Render a page as Googlebot
65. #SMX #15A @PatrickStox
Google Mobile Friendly Test - renders a page with smartphone
Googlebot. It does have the processed DOM (source code) and a
debug mode (see page loading issues and JavaScript Console).
Render a page as Googlebot
66. #SMX #15A @PatrickStox
Google’s Mobile Friendly Test
https://search.google.com/test/mobile-friendly
is currently the best way to see
what Google sees for a client-side rendered
JS website. It shows the processed DOM
and has a JavaScript Console.
68. #SMX #15A @PatrickStox
Sometimes you may see another website that has the same basic
HTML as your website indexed instead of your own. Check with
info:domain/page as this will usually indicate the second indexing
hasn’t happened yet.
What about using the same raw HTML?