It is often hard and expensive to make major changes to your website, and many businesses demand forecasts, predictions, and business cases before prioritising them. Will presents tools and approaches for figuring out whether a change is worthwhile before you make it, including ways of thinking about on-page changes, content quality, usage-data impacts, and what happens when you change your internal linking structure.
11. And let’s make some really
EFFECTIVE recommendations
I think it’s a fairly straightforward pitch
12. Ranking factors, plotted on two axes:
● Control → Influence
● We have the same data as Google → Google has data we don’t have
The factors plotted: keyword targeting, external links, internal links, usage data, website “quality”
21. External links
The less direct control you have over a factor, the harder testing and modelling becomes.
(Same axes as before: Control → Influence; we have the same data as Google → Google has data we don’t have)
27. 1. Data we have
(or can get)
only for our own site
Like usage data - see, for example, this
post by @SimoAhava explaining how to
capture bounce rate back to the SERP.
Also interesting: Rand’s video about a
possible organic quality score.
28. 2. Cases where the
real ranking factor is a
machine-learned
proxy for the real thing
e.g.
● Content quality (Panda ML)
● Link quality
○ Ignored links (ML on disavow)
What we want to measure: QUALITY
What Google can actually measure: ML PROXY FOR QUALITY
29. For usage data: it is impossible to guess what
people prefer
See whichtestwon
30. So tools like SERP Turkey can be useful
(by our very own @TomAnthonySEO)
31. When it comes to “quality”
● How do you define it?
● How do you communicate it to clients / bosses?
● How do you benchmark it against competitors?
● How do you figure out if a change improves it?
33-35. The “quality” timeline:
● Gather human rater information: Google employs thousands of human quality raters to answer questionnaires about many kinds of website
● Train ML models: Google uses the human questionnaires as training data for ML models of “quality”
● 2011, Release Panda: the Panda quality algorithm starts being used as a batch process modifying the regular core algorithm
● 2016, Make Panda real-time: “quality” becomes a first-class ranking factor in the core algorithm
36. Back in 2011, I was suggesting we run our own
Panda-like quality surveys (WBF here, instructions here)
37. Probably the only
thing that’s really
changed since then is
that you should run it
mobile-first now
Hat-tip Tom Capper
38. More executives are aware of
quality as a ranking factor these
days
Since Panda went real-time, quality issues don’t necessarily cause
obvious drops correlated with algorithm history dates
39-49. Quality survey results (percentage answering “yes”):

Question                                             Client site 1   Client site 2   Key competitor
Would you trust information from this website?       72%             64%             81%
Is this website written by experts?                  50%             46%             65%
Would you give this site your credit card details?   29%             21%             43%
Are there any noticeable errors on this page?        6%              4%              1%
Does this page provide original content or info?     76%             72%             85%
Would you recognize this site as an authority?       44%             33%             58%
Does this website contain insightful analysis?       72%             62%             81%
Would you consider bookmarking pages on this site?   44%             38%             56%
Are there excessive adverts on this website?         2%              2%              8%
Could pages from this site appear in print?          54%             54%             59%
50. We also asked for free-text feedback and
found some surprising priorities from
non-SEOs
52. “There's not enough information
about the company and why I
should use their products”
On a micro-site that doesn’t have an “about” page
53. “In this day and age every page that
has anything at all to do with
business should be https”
Security is a big deal in B2B - even without on-site purchases
54. “The pictures were of low quality
and blurry”
We know this matters to users. It’s at the easier end of ML detection
63. Can you figure out:
Will we do better if we make this change?
How much better could it be?
Which of the many ways of doing it is best?
64. Let’s look at the state of the art:
Use interactive visualisations to find issues
Calculate internal PageRank
Follow Paul Shapiro and Patrick Stox for more
65. You’ve probably all seen crawl graphs
They are distorted by starting at one page and only showing some paths
Good explainer at sitebulb.com and Ian Lurie reports some good results from colouring by indexation
66. Full link graphs are more complete, but I find them
hard to interpret
73. Internal PageRank is a powerful idea.
But by starting from “all pages are equal” we get some odd results, like the contact page being more powerful than the homepage.
74. There are case studies of people seeing real results
from radical changes to internal link structure
See Alex’s fascinating Mozcon talk [PDF]
75. but real-world changes are hard to
make, hard to undo, and could
cause lasting damage
and even worse from my perspective, it’s hard to split-test when the
expected changes are everywhere on the site
76. So our state of the art still has gaps
How much difference will a proposed fix make?
Which proposed change is a better idea?
77. It’s important because our intuition
is really bad.
Essentially what we want to do is figure out the best link structure for
distributing external authority around our site
88. How does its PageRank compare?
89. I suspect most people’s intuition about PageRank is
wrong so I did some unscientific surveying
See the survey
90. Over 1 in 5 people got even the simple question
wrong
And to be honest, depending what “significantly” means, even the 19% might not be too wrong. But it does
hint at single-iteration thinking. We’re all really bad at figuring out the convergence of iterative algorithms.
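To see why single-iteration thinking fails, here is a minimal hand-rolled PageRank on a hypothetical four-page graph (not from the talk): the scores after one iteration differ substantially from the converged values.

```python
# Minimal PageRank sketch on a made-up four-page site.
# Single-iteration intuition is misleading: scores keep shifting
# as the iterative algorithm converges.
links = {
    "home":     ["products", "contact"],
    "products": ["home", "contact"],
    "contact":  ["home"],
    "blog":     ["home"],
}
pages = list(links)
damping = 0.85

def pagerank(iterations):
    # start from "all pages are equal"
    pr = {p: 1 / len(pages) for p in pages}
    for _ in range(iterations):
        new = {p: (1 - damping) / len(pages) for p in pages}
        for p, outs in links.items():
            for out in outs:
                new[out] += damping * pr[p] / len(outs)
        pr = new
    return pr

print({p: round(v, 3) for p, v in pagerank(1).items()})   # after one step
print({p: round(v, 3) for p, v in pagerank(50).items()})  # near convergence
```

The one-step scores overweight pages with many direct in-links; only after many iterations does authority settle into its stable distribution.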
98. This is important because it means
too many recommendations are
based on bad intuition about how
PageRank works
None of us have an intuitive sense of random surfer or eigenvectors
99. There are always trade-offs, but we
can’t compare them easily
It’s rare for one approach strictly to dominate another
109. 5. Set personalization to mR probabilities
Set alpha to damping parameter (normally 0.85, we want lower)
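Assuming a NetworkX graph, step 5 might look like the sketch below; the URLs and mR probabilities are made up for illustration, and alpha is set below the usual 0.85 as the slide suggests.

```python
# Sketch: personalized PageRank in NetworkX, where the personalization
# vector holds each page's share of external authority ("mR") and
# alpha is the damping parameter (lowered from the default 0.85).
import networkx as nx

site = nx.DiGraph()
site.add_edges_from([
    ("/", "/services/"), ("/", "/resources/"),
    ("/resources/", "/"), ("/services/", "/"),
])

# Hypothetical external-authority probabilities, normalised to sum to 1
mR = {"/": 0.7, "/services/": 0.1, "/resources/": 0.2}

pr = nx.pagerank(site, alpha=0.7, personalization=mR)
print(sorted(pr.items(), key=lambda kv: -kv[1]))
```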
110. Future enhancements
● Handle nofollow correctly (see Matt Cutts’ old PageRank sculpting post)
● Handle redirects and rel canonical sensibly
● Include top mR pages (or all pages with mR?) - even if not in the crawl
○ Use as a seed and crawl from these pages
● Weight links by type to get closer to reasonable surfer model
○ This is the weight parameter in NetworkX
○ Use actual click-data for your own site to approximate an actual surfer!
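The last enhancement could be sketched like this with NetworkX's weight parameter; the link types and weight values here are hypothetical stand-ins for real click data.

```python
# Sketch: weight links by type so prominent in-content links pass
# more authority than footer links (weights are assumed, not measured).
import networkx as nx

site = nx.DiGraph()
site.add_edge("/", "/guide/", weight=3.0)   # prominent in-content link
site.add_edge("/", "/legal/", weight=0.2)   # footer link
site.add_edge("/guide/", "/", weight=1.0)
site.add_edge("/legal/", "/", weight=1.0)

# weight="weight" tells NetworkX to split each page's PageRank
# across its out-links in proportion to the edge weights
pr = nx.pagerank(site, alpha=0.85, weight="weight")
print(pr)
```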
111. Then we propose a change and see
if the treatment works
Step 1 is figuring out how to capture your proposed changes to the
internal link structure of your site
112. You can add or remove small numbers of links by
changing the crawl output in a spreadsheet
Source Destination
https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ https://www.distilled.net/
https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ https://www.distilled.net/services/
https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ https://www.distilled.net/events/
https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ https://www.distilled.net/resources/
https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ https://www.distilled.net/resources/
https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ https://www.distilled.net/resources/features/
https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ https://www.distilled.net/u/
https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ https://www.distilled.net/resources/videos/
https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ https://www.distilled.net/about/
https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ https://www.distilled.net/jobs/
113. It’s easy to make sitewide additions to the
navigation as you build the graph
# for each source page `edge` in the crawl output:
site.add_edges_from([(edge['Source'],
'https://www.distilled.net/events/searchlove-london/')])
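A self-contained sketch of the sitewide-addition idea; the crawl edges here are a made-up stand-in for the spreadsheet export.

```python
# Sketch: build the graph from crawl edges, then add the proposed
# navigation link from every crawled page (edge list is hypothetical).
import networkx as nx

crawl_edges = [  # stand-in for rows of the Source/Destination export
    ("https://www.distilled.net/", "https://www.distilled.net/resources/"),
    ("https://www.distilled.net/resources/", "https://www.distilled.net/about/"),
]

site = nx.DiGraph()
site.add_edges_from(crawl_edges)

new_target = "https://www.distilled.net/events/searchlove-london/"
for page in list(site.nodes):  # snapshot nodes before mutating the graph
    site.add_edge(page, new_target)  # proposed sitewide nav link

print(site.number_of_edges())
```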
114. Much harder to remove from global navigation
because it’s not the same as removing every link
site.remove_edges_from([(edge['Source'],
'https://www.distilled.net/events/searchlove-london/')])
This strips every link to the page, including any that also appear in body content and should survive.
117. Then crawl the preview environment
Subtleties:
● Crawl live and preview to x levels deep
● Combine into a superset of pages discovered on
each crawl
● Crawl both again from the list
Because we are comparing relative weights (normalised PR)
we need the same set of pages
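A sketch of the comparison, with hypothetical live and preview edge lists; the key point is computing PageRank over the same superset of pages in both graphs so the normalised scores are comparable.

```python
# Sketch: compare live vs preview PageRank over the SAME node set
# (edges are hypothetical), then report the per-page delta.
import networkx as nx

live = nx.DiGraph([("/", "/a/"), ("/", "/b/"), ("/a/", "/")])
preview = nx.DiGraph([("/", "/a/"), ("/a/", "/"), ("/a/", "/b/")])

# superset of pages discovered on each crawl
all_pages = set(live) | set(preview)
live.add_nodes_from(all_pages)
preview.add_nodes_from(all_pages)

pr_live = nx.pagerank(live, alpha=0.85)
pr_preview = nx.pagerank(preview, alpha=0.85)

for page in sorted(all_pages):
    print(page, round(pr_preview[page] - pr_live[page], 4))
```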
118. Generally we will care about the impact on groups of pages: label them by URL, in the crawl, or using modularity.
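The URL-labelling approach might be sketched like this, aggregating PageRank by first path segment; the URLs and scores are hypothetical.

```python
# Sketch: group (made-up) PageRank scores by URL section so you can
# see the impact on page groups rather than individual URLs.
from collections import defaultdict
from urllib.parse import urlparse

pr = {  # hypothetical page -> normalised PageRank
    "https://example.com/": 0.30,
    "https://example.com/products/a": 0.15,
    "https://example.com/products/b": 0.10,
    "https://example.com/blog/post-1": 0.25,
    "https://example.com/blog/post-2": 0.20,
}

groups = defaultdict(float)
for url, score in pr.items():
    section = urlparse(url).path.split("/")[1] or "(home)"
    groups[section] += score

print(dict(groups))
```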
119. Might it be possible to come up
with a single metric that captures
“internal link graph quality”?
I’ve been wondering about equality metrics like Gini coefficients.
Come back next year to see if I’ve made progress on this!
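For the curious, a Gini coefficient over internal PageRank scores could be computed like this; the two score sets below are hypothetical, not from a real crawl.

```python
# Sketch: Gini coefficient as a candidate "link graph equality" metric.
# 0 = authority spread perfectly evenly; values toward 1 = authority
# concentrated on a few pages.
def gini(values):
    xs = sorted(values)
    n = len(xs)
    weighted_sum = sum(i * x for i, x in enumerate(xs, start=1))
    return (2 * weighted_sum) / (n * sum(xs)) - (n + 1) / n

even = [0.25, 0.25, 0.25, 0.25]    # perfectly equal link graph
skewed = [0.70, 0.15, 0.10, 0.05]  # homepage hoards authority

print(gini(even), gini(skewed))
```

Whether a lower or higher Gini is "better" presumably depends on the site; it is a single lens, not a target.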
120. Until then: compare your proposed
changes to find the best solution to
your issue
For example, find the change that best flows authority to
under-indexed product pages.
121. So I think I’ve presented two key
new ideas in this section:
122. 1. A quantitative way of assessing
your internal link setup
by incorporating external authority into internal PR calculations
123. 2. A way of comparing different
proposed changes
by working with the data rather than just with visualisations
124. And remember, we
need this because you
need to make bold
changes
Small tweaks don’t even move the PageRank needle
126. 1. Start gathering qualitative data
For your site, for proposed changes, for competitors.
About quality and about usage.
127. 2. Use more powerful quantitative
data
For things like internal linking analysis and recommendations
See my newly-published blog post for the technical details
128. Let’s stop wasting time with
ineffective recommendations, or
damaging sites with bad ones
131. If you’re interested in the
counter-intuitive results I
presented at the beginning,
check out odn.distilled.net. We’ll
be happy to demo for you.
We’re serving ~5 billion requests
per quarter and recently
published everything from
response times to our +£100k /
month split test.
133. Image credits
● Da Vinci helicopter
● Niels Bohr
● Scream
● Statue of Liberty
● Complexity
● Head in hands
● Rorschach Test
● State of the art
● Axe
● Surfer
● Clouds / Clouds with sun
● Wrong way
● Accountant glasses
● St Paul’s cathedral
● Cactus
● Lego heads
● Anonymous
● Padlock
● Blur
● Doctor
● Boardroom
● Repetition
● Balance
● Smash
● Stars
● Table football
● Equality
● Quality
● Panda
● Leaves