It is often hard and expensive to make major changes to your website, and many businesses demand forecasts, predictions, and business cases before prioritising them. Will presents tools and approaches for figuring out whether a change is worthwhile before you make it, including ways of thinking about on-page changes, content quality, usage-data impacts, and what happens when you change your internal linking structure.
11. And let’s make some really
EFFECTIVE recommendations
I think it’s a fairly straightforward pitch
12. Ranking factors, plotted on two axes:
● Control → Influence
● We have the same data as Google → Google has data we don’t have
The factors plotted: keyword targeting, external links, internal links, usage data, website “quality”
21. External links
The less direct control you have over a factor, the harder testing and modelling becomes.
(Same axes as before: Control → Influence; we have the same data as Google → Google has data we don’t have)
27. 1. Data we have
(or can get)
only for our own site
Like usage data - see, for example, this
post by @SimoAhava explaining how to
capture bounce rate back to the SERP.
Also interesting: Rand’s video about a
possible organic quality score.
28. 2. Cases where the
real ranking factor is a
machine-learned
proxy for the real thing
e.g.
● Content quality (Panda ML)
● Link quality
○ Ignored links (ML on disavow)
What we want to measure: QUALITY
What Google can actually measure: ML PROXY FOR QUALITY
29. For usage data: it is impossible to guess what
people prefer
See whichtestwon
30. So tools like SERP Turkey can be useful
(by our very own @TomAnthonySEO)
31. When it comes to “quality”
● How do you define it?
● How do you communicate it to clients / bosses?
● How do you benchmark it against competitors?
● How do you figure out if a change improves it?
33-35. The “quality” timeline:
● Gather human rater information: Google employs thousands of human quality raters to answer questionnaires about many kinds of website
● Train ML models: Google uses the human questionnaires as training data for ML models of “quality”
● 2011, Release Panda: the Panda quality algorithm starts being used as a batch process modifying the regular core algorithm
● 2016, Make Panda real-time: “quality” becomes a first-class ranking factor in the core algorithm
36. Back in 2011, I was suggesting we run our own
Panda-like quality surveys (WBF here, instructions here)
37. Probably the only
thing that’s really
changed since then is
that you should run it
mobile-first now
Hat-tip Tom Capper
38. More executives are aware of
quality as a ranking factor these
days
Since Panda went real-time, quality issues don’t necessarily cause
obvious drops correlated with algorithm history dates
39-49. Quality survey results (percentage answering “yes”):

Question                                             Client site 1   Client site 2   Key competitor
Would you trust information from this website?       72%             64%             81%
Is this website written by experts?                  50%             46%             65%
Would you give this site your credit card details?   29%             21%             43%
Are there any noticeable errors on this page?        6%              4%              1%
Does this page provide original content or info?     76%             72%             85%
Would you recognize this site as an authority?       44%             33%             58%
Does this website contain insightful analysis?       72%             62%             81%
Would you consider bookmarking pages on this site?   44%             38%             56%
Are there excessive adverts on this website?         2%              2%              8%
Could pages from this site appear in print?          54%             54%             59%
50. We also asked for free-text feedback and
found some surprising priorities from
non-SEOs
52. “There's not enough information
about the company and why I
should use their products”
On a micro-site that doesn’t have an “about” page
53. “In this day and age every page that
has anything at all to do with
business should be https”
Security is a big deal in B2B - even without on-site purchases
54. “The pictures were of low quality
and blurry”
We know this matters to users. It’s at the easier end of ML detection
63. Can you figure out:
Will we do better if we make this change?
How much better could it be?
Which of the many ways of doing it is best?
64. Let’s look at the state of the art:
Use interactive visualisations to find issues
Calculate internal PageRank
Follow Paul Shapiro and Patrick Stox for more
65. You’ve probably all seen crawl graphs
They are distorted by starting at one page and only showing some paths
Good explainer at sitebulb.com and Ian Lurie reports some good results from colouring by indexation
66. Full link graphs are more complete, but I find them
hard to interpret
73. Internal PageRank is a powerful idea.
But by starting from “all pages are equal” we get some odd results, like the contact page being more powerful than the homepage.
74. There are case studies of people seeing real results
from radical changes to internal link structure
See Alex’s fascinating Mozcon talk [PDF]
75. but real-world changes are hard to
make, hard to undo, and could
cause lasting damage
and even worse from my perspective, it’s hard to split-test when the
expected changes are everywhere on the site
76. So our state of the art still has gaps
How much difference will a proposed fix make?
Which proposed change is a better idea?
77. It’s important because our intuition
is really bad.
Essentially what we want to do is figure out the best link structure for
distributing external authority around our site
88. How does its PageRank compare?
89. I suspect most people’s intuition about PageRank is
wrong so I did some unscientific surveying
See the survey
90. Over 1 in 5 people got even the simple question
wrong
And to be honest, depending what “significantly” means, even the 19% might not be too wrong. But it does
hint at single-iteration thinking. We’re all really bad at figuring out the convergence of iterative algorithms.
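To see why single-iteration thinking fails, here is a minimal hand-rolled PageRank on a hypothetical four-page graph (not from the talk): the scores after one iteration differ substantially from the converged values.

```python
# Minimal PageRank sketch on a made-up four-page site.
# Single-iteration intuition is misleading: scores keep shifting
# as the iterative algorithm converges.
links = {
    "home":     ["products", "contact"],
    "products": ["home", "contact"],
    "contact":  ["home"],
    "blog":     ["home"],
}
pages = list(links)
damping = 0.85

def pagerank(iterations):
    # start from "all pages are equal"
    pr = {p: 1 / len(pages) for p in pages}
    for _ in range(iterations):
        new = {p: (1 - damping) / len(pages) for p in pages}
        for p, outs in links.items():
            for out in outs:
                new[out] += damping * pr[p] / len(outs)
        pr = new
    return pr

print({p: round(v, 3) for p, v in pagerank(1).items()})   # after one step
print({p: round(v, 3) for p, v in pagerank(50).items()})  # near convergence
```

The one-step scores overweight pages with many direct in-links; only after many iterations does authority settle into its stable distribution.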
98. This is important because it means
too many recommendations are
based on bad intuition about how
PageRank works
None of us have an intuitive sense of random surfer or eigenvectors
99. There are always trade-offs, but we
can’t compare them easily
It’s rare for one approach strictly to dominate another
109. 5. Set personalization to mR probabilities
Set alpha to damping parameter (normally 0.85, we want lower)
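Assuming a NetworkX graph, step 5 might look like the sketch below; the URLs and mR probabilities are made up for illustration, and alpha is set below the usual 0.85 as the slide suggests.

```python
# Sketch: personalized PageRank in NetworkX, where the personalization
# vector holds each page's share of external authority ("mR") and
# alpha is the damping parameter (lowered from the default 0.85).
import networkx as nx

site = nx.DiGraph()
site.add_edges_from([
    ("/", "/services/"), ("/", "/resources/"),
    ("/resources/", "/"), ("/services/", "/"),
])

# Hypothetical external-authority probabilities, normalised to sum to 1
mR = {"/": 0.7, "/services/": 0.1, "/resources/": 0.2}

pr = nx.pagerank(site, alpha=0.7, personalization=mR)
print(sorted(pr.items(), key=lambda kv: -kv[1]))
```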
110. Future enhancements
● Handle nofollow correctly (see Matt Cutts’ old PageRank sculpting post)
● Handle redirects and rel canonical sensibly
● Include top mR pages (or all pages with mR?) - even if not in the crawl
○ Use as a seed and crawl from these pages
● Weight links by type to get closer to reasonable surfer model
○ This is the weight parameter in NetworkX
○ Use actual click-data for your own site to approximate an actual surfer!
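The last enhancement could be sketched like this with NetworkX's weight parameter; the link types and weight values here are hypothetical stand-ins for real click data.

```python
# Sketch: weight links by type so prominent in-content links pass
# more authority than footer links (weights are assumed, not measured).
import networkx as nx

site = nx.DiGraph()
site.add_edge("/", "/guide/", weight=3.0)   # prominent in-content link
site.add_edge("/", "/legal/", weight=0.2)   # footer link
site.add_edge("/guide/", "/", weight=1.0)
site.add_edge("/legal/", "/", weight=1.0)

# weight="weight" tells NetworkX to split each page's PageRank
# across its out-links in proportion to the edge weights
pr = nx.pagerank(site, alpha=0.85, weight="weight")
print(pr)
```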
111. Then we propose a change and see
if the treatment works
Step 1 is figuring out how to capture your proposed changes to the
internal link structure of your site
112. You can add or remove small numbers of links by
changing the crawl output in a spreadsheet
Source Destination
https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ https://www.distilled.net/
https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ https://www.distilled.net/services/
https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ https://www.distilled.net/events/
https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ https://www.distilled.net/resources/
https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ https://www.distilled.net/resources/
https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ https://www.distilled.net/resources/features/
https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ https://www.distilled.net/u/
https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ https://www.distilled.net/resources/videos/
https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ https://www.distilled.net/about/
https://www.distilled.net/resources/beginners-guide-to-traffic-drop-analysis/ https://www.distilled.net/jobs/
113. It’s easy to make sitewide additions to the
navigation as you build the graph
# for each source page `edge` in the crawl output:
site.add_edges_from([(edge['Source'],
'https://www.distilled.net/events/searchlove-london/')])
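A self-contained sketch of the sitewide-addition idea; the crawl edges here are a made-up stand-in for the spreadsheet export.

```python
# Sketch: build the graph from crawl edges, then add the proposed
# navigation link from every crawled page (edge list is hypothetical).
import networkx as nx

crawl_edges = [  # stand-in for rows of the Source/Destination export
    ("https://www.distilled.net/", "https://www.distilled.net/resources/"),
    ("https://www.distilled.net/resources/", "https://www.distilled.net/about/"),
]

site = nx.DiGraph()
site.add_edges_from(crawl_edges)

new_target = "https://www.distilled.net/events/searchlove-london/"
for page in list(site.nodes):  # snapshot nodes before mutating the graph
    site.add_edge(page, new_target)  # proposed sitewide nav link

print(site.number_of_edges())
```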
114. Much harder to remove from global navigation
because it’s not the same as removing every link
site.remove_edges_from([(edge['Source'],
'https://www.distilled.net/events/searchlove-london/')])
This strips every link to the page, including any that also appear in body content and should survive.
117. Then crawl the preview environment
Subtleties:
● Crawl live and preview to x levels deep
● Combine into a superset of pages discovered on
each crawl
● Crawl both again from the list
Because we are comparing relative weights (normalised PR)
we need the same set of pages
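A sketch of the comparison, with hypothetical live and preview edge lists; the key point is computing PageRank over the same superset of pages in both graphs so the normalised scores are comparable.

```python
# Sketch: compare live vs preview PageRank over the SAME node set
# (edges are hypothetical), then report the per-page delta.
import networkx as nx

live = nx.DiGraph([("/", "/a/"), ("/", "/b/"), ("/a/", "/")])
preview = nx.DiGraph([("/", "/a/"), ("/a/", "/"), ("/a/", "/b/")])

# superset of pages discovered on each crawl
all_pages = set(live) | set(preview)
live.add_nodes_from(all_pages)
preview.add_nodes_from(all_pages)

pr_live = nx.pagerank(live, alpha=0.85)
pr_preview = nx.pagerank(preview, alpha=0.85)

for page in sorted(all_pages):
    print(page, round(pr_preview[page] - pr_live[page], 4))
```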
118. Generally we will care about the impact on groups of pages: label them by URL, in the crawl, or using modularity.
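The URL-labelling approach might be sketched like this, aggregating PageRank by first path segment; the URLs and scores are hypothetical.

```python
# Sketch: group (made-up) PageRank scores by URL section so you can
# see the impact on page groups rather than individual URLs.
from collections import defaultdict
from urllib.parse import urlparse

pr = {  # hypothetical page -> normalised PageRank
    "https://example.com/": 0.30,
    "https://example.com/products/a": 0.15,
    "https://example.com/products/b": 0.10,
    "https://example.com/blog/post-1": 0.25,
    "https://example.com/blog/post-2": 0.20,
}

groups = defaultdict(float)
for url, score in pr.items():
    section = urlparse(url).path.split("/")[1] or "(home)"
    groups[section] += score

print(dict(groups))
```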
119. Might it be possible to come up
with a single metric that captures
“internal link graph quality”?
I’ve been wondering about equality metrics like Gini coefficients.
Come back next year to see if I’ve made progress on this!
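For the curious, a Gini coefficient over internal PageRank scores could be computed like this; the two score sets below are hypothetical, not from a real crawl.

```python
# Sketch: Gini coefficient as a candidate "link graph equality" metric.
# 0 = authority spread perfectly evenly; values toward 1 = authority
# concentrated on a few pages.
def gini(values):
    xs = sorted(values)
    n = len(xs)
    weighted_sum = sum(i * x for i, x in enumerate(xs, start=1))
    return (2 * weighted_sum) / (n * sum(xs)) - (n + 1) / n

even = [0.25, 0.25, 0.25, 0.25]    # perfectly equal link graph
skewed = [0.70, 0.15, 0.10, 0.05]  # homepage hoards authority

print(gini(even), gini(skewed))
```

Whether a lower or higher Gini is "better" presumably depends on the site; it is a single lens, not a target.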
120. Until then: compare your proposed
changes to find the best solution to
your issue
For example, find the change that best flows authority to
under-indexed product pages.
121. So I think I’ve presented two key
new ideas in this section:
122. 1. A quantitative way of assessing
your internal link setup
by incorporating external authority into internal PR calculations
123. 2. A way of comparing different
proposed changes
by working with the data rather than just with visualisations
124. And remember, we
need this because you
need to make bold
changes
Small tweaks don’t even move the PageRank needle
126. 1. Start gathering qualitative data
For your site, for proposed changes, for competitors.
About quality and about usage.
127. 2. Use more powerful quantitative
data
For things like internal linking analysis and recommendations
See my newly-published blog post for the technical details
128. Let’s stop wasting time with
ineffective recommendations, or
damaging sites with bad ones
131. If you’re interested in the
counter-intuitive results I
presented at the beginning,
check out odn.distilled.net. We’ll
be happy to demo for you.
We’re serving ~5 billion requests
per quarter and recently
published everything from
response times to our +£100k /
month split test.
133. Image credits
● Da Vinci helicopter
● Niels Bohr
● Scream
● Statue of Liberty
● Complexity
● Head in hands
● Rorschach Test
● State of the art
● Axe
● Surfer
● Clouds / Clouds with sun
● Wrong way
● Accountant glasses
● St Paul’s cathedral
● Cactus
● Lego heads
● Anonymous
● Padlock
● Blur
● Doctor
● Boardroom
● Repetition
● Balance
● Smash
● Stars
● Table football
● Equality
● Quality
● Panda
● Leaves