Log Analysis and PRO Use Cases
for Search Marketers
Dave Sottimano - Untagged.io - Madrid 2016
Prepare yourself for bullet point
hell.
It’s meant for reading:
bit.ly/untagged2016
Lo siento :(
You know what makes me sad?
Incomplete data.
Inflated & ambiguous stats
80,000
80,000
https://support.google.com/webmasters/answer/35253?hl=en
Seems broken. It is broken, this is actually an image
search result and it has been ranking through the entire
time period.
???
But hey, reporting stats for the entire
internet isn’t easy.
So, thank you Google.
..but, we need better data.
Why server log analysis is so important:
How do we try and increase crawl frequency?
Increase External link count (includes links from social sites)
List valuable pages in sitemaps and ping Google
Increase Internal link count (crawl paths)
Create new pages, and update older pages (avoid stagnation)
Ensure pages are unique, reduce internal duplication
Avoid internally linking to redirects or broken pages
Testing. Lots of testing.
What actions do SEOs take from log analysis?
● Optimize Googlebot crawl
○ restructure link architecture, apply directives, block via robots.txt
● Find server errors or Googlebot induced errors
○ Try to fix any 4xx, 5xx error codes
○ Use browser user agent referer fields to uncover source of errors
● Understand Googlebot crawl rate & behaviour for SEO testing
○ Helpful for testing and insights and constantly questioning best practices
● Block badly behaving bots, prevent bandwidth drain
○ Look for hotlinking bandwidth drain, e.g. images hotlinked from porn sites
● Find unreported links through referer fields
○ Link crawlers don’t find every link, server logs are necessary for comprehensive audits
● Double check Analytics data
○ Helpful for correcting analytics setup or understanding why referers aren’t passed correctly
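As a rough illustration of using referer fields to trace errors, the sketch below counts 4xx/5xx responses per referer and page. It assumes log lines have already been parsed into dictionaries; the key names (`page`, `status`, `referer`) and the sample records are hypothetical.

```python
from collections import Counter

def error_sources(records):
    """Count (referer, page, status) triples for 4xx and 5xx responses."""
    counts = Counter()
    for r in records:
        status = int(r["status"])
        if 400 <= status < 600:
            # "-" stands in for a missing referer, as in most access logs
            counts[(r.get("referer") or "-", r["page"], status)] += 1
    return counts.most_common()

# Hypothetical parsed log records
records = [
    {"page": "/old-shoes", "status": "404", "referer": "https://example.com/sale"},
    {"page": "/old-shoes", "status": "404", "referer": "https://example.com/sale"},
    {"page": "/shoes", "status": "200", "referer": "https://www.google.ca/"},
]
top_errors = error_sources(records)
```

The most frequent triples point at the pages that link to broken URLs, which is usually where the fix belongs.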
The hard part:
Getting the right data and
merging.
Step 1: Get the right fields logged
206.248.146.167 - - [25/Aug/2015:06:50:01 +0000] "GET /shoes HTTP/1.0" 200 251
"https://www.google.ca/" "example.com" "Mozilla/5.0 (Windows NT 6.1; WOW64)
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36"
Fields labelled on the slide:
● IP address
● Date/Time
● Method
● Page
● Response code
● Response time
● Referer
● Hostname
● User agent
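A minimal parsing sketch for the sample line above, assuming the custom format is exactly: IP, two unused fields, timestamp, request, status, one number, then quoted referer, hostname and user agent. Reading the number after the status code as the response time is an assumption based on the slide's labels; in the stock combined format that position holds the response size. The group names are illustrative.

```python
import re

# Assumed layout, matching only the sample line shown on the slide
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ '
    r'\[(?P<datetime>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<page>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<response_time>\S+) '  # assumption: see note above
    r'"(?P<referer>[^"]*)" "(?P<hostname>[^"]*)" "(?P<user_agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

sample = (
    '206.248.146.167 - - [25/Aug/2015:06:50:01 +0000] "GET /shoes HTTP/1.0" 200 251 '
    '"https://www.google.ca/" "example.com" "Mozilla/5.0 (Windows NT 6.1; WOW64) '
    'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36"'
)
fields = parse_line(sample)
```

Lines that return `None` are worth counting: a high share of non-matching lines usually means the format assumption is wrong for that server.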
Step 2: Ensure the correct originating IP is logged
Load balancers, proxies or CDNs may overwrite the original IP of the request. Use the
X-Forwarded-For header to ensure you have the original IP
IIS: http://www.loadbalancer.org/blog/iis-and-x-forwarded-for-header
Apache: http://www.loadbalancer.org/blog/apache-and-x-forwarded-for-headers
Nginx: https://easyengine.io/tutorials/nginx/forwarding-visitors-real-ip/
CloudFlare:
https://support.cloudflare.com/hc/en-us/sections/200805497-Restoring-Visitor-IPs
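If the header value ends up in the logs, the original IP can be picked out as sketched below. This assumes the usual convention that each proxy appends its peer's address, so the leftmost entry is the client as first reported. The function name is hypothetical, and the leftmost entry is only as trustworthy as the first proxy that set it.

```python
def original_ip(x_forwarded_for, remote_addr):
    """Return the leftmost X-Forwarded-For entry, falling back to the connecting IP."""
    if not x_forwarded_for:
        return remote_addr
    first = x_forwarded_for.split(",")[0].strip()
    return first or remote_addr

client = original_ip("66.249.65.63, 10.0.0.1", "10.0.0.2")
```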
Step 3: Ensure we have all of the logs
● Triple check the hostname! If you’re analyzing the desktop site at example.com,
for instance, ensure you’re not counting the mobile version (m.example.com) or
other subdomains (forum.example.com). Be very careful to get the right data
or you will pull your hair out. Ask system administrators!
● If the server stores cached copies and serves them from another server, get
those logs too and combine them for the target domain analysis.
● Too much data? Ask for selective logging for Googlebot user agent only
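The hostname check can be made explicit in code. The sketch below keeps only records whose logged hostname exactly matches the target, so m.example.com and forum.example.com are dropped. It assumes parsed records carry a `hostname` key, which is an illustrative name.

```python
def keep_host(records, target_host):
    """Keep records logged for exactly target_host (case-insensitive)."""
    target = target_host.lower()
    return [r for r in records if r.get("hostname", "").lower() == target]

# Hypothetical parsed records from a shared log
records = [
    {"hostname": "example.com", "page": "/shoes"},
    {"hostname": "m.example.com", "page": "/shoes"},
    {"hostname": "forum.example.com", "page": "/thread/1"},
]
desktop_only = keep_host(records, "example.com")
```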
Step 4: Parse the logs, grab Googlebot entries
https://www.splunk.com/en_us/download/splunk-light-2.html
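For small log sets, the same first pass can be sketched in plain Python instead of Splunk. This is not the Splunk workflow linked above, only an illustration that assumes records are already parsed into dictionaries with hypothetical `user_agent` and `page` keys. Matching on the user agent string alone does not prove the request came from Google.

```python
from collections import Counter

def googlebot_hits_per_page(records):
    """Count requests per page where the user agent claims to be Googlebot."""
    hits = Counter()
    for r in records:
        if "googlebot" in r.get("user_agent", "").lower():
            hits[r["page"]] += 1
    return hits

records = [
    {"page": "/shoes", "user_agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"},
    {"page": "/shoes", "user_agent": "Mozilla/5.0 (Windows NT 6.1; WOW64)"},
    {"page": "/boots", "user_agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"},
]
hits = googlebot_hits_per_page(records)
```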
Step 5: Verify Googlebot entries by DNS
1. Segment out logs with user-agent: Mozilla/5.0 (compatible; Googlebot/2.1;
+http://www.google.com/bot.html)
2. Take the original IP in the logs, example: 66.249.65.63
3. Reverse DNS lookup: crawl-66-249-65-63.googlebot.com
4. DNS lookup: 66.249.65.63 (confirmed!)
https://support.google.com/webmasters/answer/80553?hl=en
Software I use: http://www.nirsoft.net/utils/ipnetinfo.html
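The reverse-then-forward check can also be scripted. The sketch below uses Python's standard `socket` module and accepts hostnames under googlebot.com or google.com. The function name and the injectable lookup parameters are illustrative, added so the logic can be exercised without network access.

```python
import socket

def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse DNS the IP, check the domain, then confirm the forward lookup returns the IP."""
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip)
    except OSError:  # no PTR record or lookup failure
        return False
    if not host.lower().rstrip(".").endswith((".googlebot.com", ".google.com")):
        return False
    try:
        return ip in forward_lookup(host)
    except OSError:
        return False
```

Calling `is_verified_googlebot("66.249.65.63")` with the defaults performs live DNS lookups, so results for large log sets are worth caching per IP.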
Note to myself: Look out for Google Mobile user
agents
Mozilla/5.0+(iPhone;+CPU+iPhone+OS+6_0+like+Mac+OS+X)+AppleWebKit/536.26
+(KHTML,+like+Gecko)+Version/6.0+Mobile/10A5376e+Safari/8536.25+(compatible;
+Googlebot/2.1;++http://www.google.com/bot.html)
This is a verified Googlebot from 66.249.65.63, but it’s not listed on the official
crawlers page.
Official Google: Mobile-first Indexing
Step 6: Merge crawl data with clean logs
● Crawl as: Mozilla/5.0 (compatible; Googlebot/2.1;
+http://www.google.com/bot.html) and a popular browser user agent
● Crawler config: Disobey robots.txt, crawl all non-HTML, crawl internal
nofollow, crawl canonicals & sitemaps, ideally JS enabled
● Fields required: URL, Response code, Title, Robots directives (blocked,
noindex, nofollow etc.), Canonical, Page size, response time, crawl level,
number of internal links to page
Try DeepCrawl for free bit.ly/freecrawl - 25,000 credits for Untagged.io
Step 7: Add web analytics data
● Ensure the URLs correspond correctly (special characters, full URL)
● Ensure the date period is exactly the same period as server logs
● Use data from source/medium = Google/Organic only
DeepCrawl can merge both crawl data and analytics data from Google Analytics
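A minimal sketch of the merge, assuming all three sources have already been reduced to full URLs: crawl rows as dictionaries with a `url` key, and Googlebot hits and organic visits as URL-to-count mappings. The function names, keys and normalization rules are illustrative. The point is that special characters and host casing must be normalized the same way on every side before joining.

```python
from urllib.parse import urlsplit, urlunsplit, unquote

def normalize(url):
    """Lowercase scheme and host, decode percent-escapes in the path, drop the fragment."""
    parts = urlsplit(url.strip())
    path = unquote(parts.path) or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))

def merge_by_url(crawl_rows, googlebot_hits, organic_visits):
    """Join crawl rows with log and analytics counts on the normalized URL."""
    merged = {}
    for row in crawl_rows:
        key = normalize(row["url"])
        merged[key] = {**row, "url": key, "googlebot_hits": 0, "organic_visits": 0}
    for source, column in ((googlebot_hits, "googlebot_hits"), (organic_visits, "organic_visits")):
        for url, count in source.items():
            key = normalize(url)
            # URLs seen in logs or analytics but missing from the crawl are kept as orphans
            row = merged.setdefault(key, {"url": key, "googlebot_hits": 0, "organic_visits": 0})
            row[column] += count
    return merged

# Hypothetical inputs
crawl = [{"url": "https://Example.com/caf%C3%A9", "status": 200}]
hits = {"https://example.com/café": 3, "https://example.com/orphan": 1}
visits = {"https://example.com/caf%C3%A9": 12}
merged = merge_by_url(crawl, hits, visits)
```

URLs that appear only in logs or analytics are kept rather than dropped, since pages missing from the crawl are often the interesting ones.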
So far...
● Have all logs from the right host with the right
fields
● Have the original IP addresses
● Confirmed real Googlebot visits
● Merged crawl data and analytics data perfectly
Just when you think all the data is correct,
something will go wrong, guaranteed ;)
Real example, small site:
http://www.campgroundsigns.com/
7 million events from load balancer, IIS custom format access logs = 1.6 GB of
data
13,000 Googlebot events over 28 days
1,129 pages are indexable on
campgroundsigns.com
Caveat!
The following are observations based on 1 small website. The
observations for this site are only for this site and are not
representative.
Each website and its Googlebot crawl activity is different.
Special thanks to campgroundsigns.com for volunteering for
the analysis
What is Google crawling?
What we wanted crawled vs What Google
crawled
Based on a 28 day sample
What did Google crawl by Page Type?
How we determine page types
By URL (if possible):
example.com/products/product-123
By unique HTML template footprint (recommended):
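The URL-based option can be sketched with a small ordered rule list. The patterns below are hypothetical and only mirror the example.com/products/product-123 shape; real rules depend entirely on the site's URL scheme.

```python
import re

# Hypothetical rules, checked in order; the first match wins
PAGE_TYPE_RULES = [
    ("homepage", re.compile(r"^/$")),
    ("product", re.compile(r"^/products/[^/]+/?$")),
    ("resource", re.compile(r"\.(?:js|css|png|jpe?g|gif|svg)$", re.IGNORECASE)),
]

def page_type(path):
    """Return the first matching page type for a URL path, or 'other'."""
    for name, pattern in PAGE_TYPE_RULES:
        if pattern.search(path):
            return name
    return "other"

kind = page_type("/products/product-123")
```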
How does crawl
affect traffic?
Googlebot crawl and Organic Traffic
Based on a 28 day sample
Is Googlebot using
its crawl wisely?
Of those pages, what were the response codes?
The 4% of 410 errors are actually caused by
Google trying to render JavaScript.
127 pages per 28 day crawl are wasted.
How often did Google crawl NOINDEX pages?
Did Google crawl the right pages?
Indexable defined as: response code 200, no robots.txt block, self-referencing canonical or no canonical in head or HTTP header, no noindex
directives in head or HTTP header, no directives applied in GSC param config, no removal request, not JS/CSS or resource files. A page is not
indexable if it has a non-200 response or fails any of the other conditions.
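The definition above translates fairly directly into a predicate over merged crawl rows. This is a sketch: every key name below is a hypothetical column that the crawl export would need to provide.

```python
def is_indexable(page):
    """Apply the indexable definition to one merged crawl row (a dict)."""
    return (
        page.get("status") == 200
        and not page.get("blocked_by_robots", False)
        # no canonical, or a canonical pointing at the page itself
        and page.get("canonical") in (None, "", page.get("url"))
        and not page.get("noindex", False)
        and not page.get("gsc_param_directive", False)
        and not page.get("removal_requested", False)
        and not page.get("is_resource", False)  # JS, CSS and other resource files
    )

page = {"url": "https://example.com/shoes", "status": 200,
        "canonical": "https://example.com/shoes"}
indexable = is_indexable(page)
```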
Generally, we see reduced crawl activity to
pages with NOINDEX.
There’s something wrong.
PLA = Product listing Ad.
We tried to block the PLA
pages to divert attention
to important pages:
Based on 4 day, Mon-Thursday period before and after the block
Errr, go back, quick.
                          All requests      Unique pages crawled
Page type                 Before   After    Before   After
PLA (Blocked by robots)     1334       0       703       0
Department or other Page     404     212       270     124
Product page                 605     247       452     177
resource                     332     406        50      61
Homepage                      15      15         1       1
Totals                      2690     880      1476     363
Difference                          -67%               -75%
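The difference figures follow from the totals row. A short check, using only the numbers in the table (the helper name is illustrative):

```python
def pct_change(before, after):
    """Percentage change from before to after, rounded to a whole percent."""
    return round((after - before) / before * 100)

all_requests_change = pct_change(2690, 880)   # totals, all requests
unique_pages_change = pct_change(1476, 363)   # totals, unique pages crawled
```

Both totals fell by far more than the blocked PLA requests alone account for: requests to department and product pages also dropped.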
Turns out, Google uses their regular Googlebot
crawler to crawl them, not Adbot.
It was a mistake blocking these. We’ll try
canonicals next.
https://support.google.com/merchants/answer/160156?hl=en
Insights:
If Googlebot crawls a
page, is it always indexed?
No. Out of a sample of 691 pages, 12 were crawled and not
indexed.
Insights:
If a page gets Google
organic traffic, is it always
indexed?
No. 225 pages receiving at least 1 Google Organic visit
during the time period:
Read why: http://bit.ly/2faEoA9
How did I check
indexation?
Sorry, this section is not
available for non
attendees!
Frequent question for SEOs:
How long will it take for Google
to update index after a
migration?
Googlebot 2.1 pages crawled per day, vs Search Console
Where to find pages crawled per day in Search
Console?
Tip: Disable all CSS styles for easy copy
How many unique URLs are crawled per day?
Googlebot only crawled 766 pages out of the 1,129
we wanted crawled over 28 days.
More realistic, still estimated, but slightly less
bullshit:
● 766 unique, indexable pages were crawled over 28 days
● That gives us an Average of 27 unique pages crawled
per day.
● 1129 total indexable pages / 27 = minimum 42 days for a
full recrawl.
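The same back-of-the-envelope calculation in code, using the figures from this site. The variable names are illustrative, and the result is only as good as the assumption that Googlebot keeps crawling new unique pages at the same average rate.

```python
import math

unique_indexable_crawled = 766   # unique indexable pages crawled in the sample
sample_days = 28
total_indexable = 1129

pages_per_day = round(unique_indexable_crawled / sample_days)
min_days_full_recrawl = math.ceil(total_indexable / pages_per_day)
```

Swapping in another site's three inputs gives that site's estimate.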
Remember, this is a complete estimate.
That doesn’t even account for how many times
Google has to figure out a 301 redirect.
Same calculation, different site (with approx 86,000
indexable pages)
This is not representative of any other site.
Fixing the problems isn’t
always easy.
But it does pay off.
Here’s some
fancy charts
moving up and to
the right as proof.
Things to
remember:
If it seems Google isn’t respecting robots.txt,
check:
10 day
lag!
Server log analysis is hard. Here’s why:
● Data size challenges, example: 7 million events = 1.6 GB (and that’s tiny)
● Lots of different servers logging with custom formats
● Often, obtaining them means overcoming people problems & technical
challenges
● Any small mistakes combining crawl, analytics, search console data can make
the entire analysis useless
● Combining large datasets requires either some form of programming or
technical knowledge; it’s not for everyone.
● Many available tools aren’t comprehensive enough for SEO purposes yet.
That being said, they are the best thing since patatas bravas
con alioli.
Things that can corrupt your results
● Thinking you’re seeing Googlebot but it’s not really Googlebot
● Not accounting for changes to robots.txt restrictions or other directives in the
crawl data during the logging period
● Incorrect field mapping, e.g. mistaking the referer for the page request
● Incorrect merging of crawl and analytics data
Helpful links for log analysis
Guides:
● A Complete Guide to Log Analysis with Big Query - Dominic Woodman
● The Ultimate Guide to Log File Analysis - Daniel Butler
● SEO Finds in Your Server Log - Tim Resnik
● How to Use Server Log Analysis for Technical SEO - Samuel Scott
Software:
● Splunk
● SEO Log File Analyser
● Logz.io
● Botify
Muchas gracias Untagged!
@dsottimano
http://www.definemg.com

 
Demand Quest SEO Training - Session 2
Demand Quest SEO Training - Session 2Demand Quest SEO Training - Session 2
Demand Quest SEO Training - Session 2
 
Why Analytics is Important for Any Business - EBriks Infotech
Why Analytics is Important for Any Business - EBriks InfotechWhy Analytics is Important for Any Business - EBriks Infotech
Why Analytics is Important for Any Business - EBriks Infotech
 
Troubleshooting SEO for JS Frameworks - Patrick Stox - DTD 2018
Troubleshooting SEO for JS Frameworks - Patrick Stox - DTD 2018Troubleshooting SEO for JS Frameworks - Patrick Stox - DTD 2018
Troubleshooting SEO for JS Frameworks - Patrick Stox - DTD 2018
 
How can a data layer help my seo
How can a data layer help my seoHow can a data layer help my seo
How can a data layer help my seo
 
Ruby on Rails Performance Tuning. Make it faster, make it better (WindyCityRa...
Ruby on Rails Performance Tuning. Make it faster, make it better (WindyCityRa...Ruby on Rails Performance Tuning. Make it faster, make it better (WindyCityRa...
Ruby on Rails Performance Tuning. Make it faster, make it better (WindyCityRa...
 

Dernier

Exploring The World Of Adult Ad Networks.pdf
Exploring The World Of Adult Ad Networks.pdfExploring The World Of Adult Ad Networks.pdf
Exploring The World Of Adult Ad Networks.pdfadult marketing
 
Miss Immigrant USA Activity Pageant Program.pdf
Miss Immigrant USA Activity Pageant Program.pdfMiss Immigrant USA Activity Pageant Program.pdf
Miss Immigrant USA Activity Pageant Program.pdfMagdalena Kulisz
 
The power of SEO-driven market intelligence
The power of SEO-driven market intelligenceThe power of SEO-driven market intelligence
The power of SEO-driven market intelligenceHinde Lamrani
 
McDonald's: A Journey Through Time (PPT)
McDonald's: A Journey Through Time (PPT)McDonald's: A Journey Through Time (PPT)
McDonald's: A Journey Through Time (PPT)DEVARAJV16
 
Digital Marketing Courses In Pune- school Of Internet Marketing
Digital Marketing Courses In Pune- school Of Internet MarketingDigital Marketing Courses In Pune- school Of Internet Marketing
Digital Marketing Courses In Pune- school Of Internet MarketingShauryaBadaya
 
5 Digital Marketing Tips | Devherds Software Solutions
5 Digital Marketing Tips | Devherds Software Solutions5 Digital Marketing Tips | Devherds Software Solutions
5 Digital Marketing Tips | Devherds Software SolutionsDevherds Software Solutions
 
top marketing posters - Fresh Spar Technologies - Manojkumar C
top marketing posters - Fresh Spar Technologies - Manojkumar Ctop marketing posters - Fresh Spar Technologies - Manojkumar C
top marketing posters - Fresh Spar Technologies - Manojkumar CManojkumar C
 
Inbound Marekting 2.0 - The Paradigm Shift in Marketing | Axon Garside
Inbound Marekting 2.0 - The Paradigm Shift in Marketing | Axon GarsideInbound Marekting 2.0 - The Paradigm Shift in Marketing | Axon Garside
Inbound Marekting 2.0 - The Paradigm Shift in Marketing | Axon Garsiderobwhite630290
 
Infographics about SEO strategies and uses
Infographics about SEO strategies and usesInfographics about SEO strategies and uses
Infographics about SEO strategies and usesbhavanirupeshmoksha
 
Most Influential HR Leaders Leading the Corporate World, 2024 (Final file).pdf
Most Influential HR Leaders Leading the Corporate World, 2024 (Final file).pdfMost Influential HR Leaders Leading the Corporate World, 2024 (Final file).pdf
Most Influential HR Leaders Leading the Corporate World, 2024 (Final file).pdfCIO Business World
 
What I learned from auditing over 1,000,000 websites - SERP Conf 2024 Patrick...
What I learned from auditing over 1,000,000 websites - SERP Conf 2024 Patrick...What I learned from auditing over 1,000,000 websites - SERP Conf 2024 Patrick...
What I learned from auditing over 1,000,000 websites - SERP Conf 2024 Patrick...Ahrefs
 
What’s the difference between Affiliate Marketing and Brand Partnerships?
What’s the difference between Affiliate Marketing and Brand Partnerships?What’s the difference between Affiliate Marketing and Brand Partnerships?
What’s the difference between Affiliate Marketing and Brand Partnerships?Partnercademy
 
Research and Discovery Tools for Experimentation - 17 Apr 2024 - v 2.3 (1).pdf
Research and Discovery Tools for Experimentation - 17 Apr 2024 - v 2.3 (1).pdfResearch and Discovery Tools for Experimentation - 17 Apr 2024 - v 2.3 (1).pdf
Research and Discovery Tools for Experimentation - 17 Apr 2024 - v 2.3 (1).pdfVWO
 
A Comprehensive Guide to Technical SEO | Banyanbrain
A Comprehensive Guide to Technical SEO | BanyanbrainA Comprehensive Guide to Technical SEO | Banyanbrain
A Comprehensive Guide to Technical SEO | BanyanbrainBanyanbrain
 
Digital Marketing in 5G Era - Digital Transformation in 5G Age
Digital Marketing in 5G Era - Digital Transformation in 5G AgeDigital Marketing in 5G Era - Digital Transformation in 5G Age
Digital Marketing in 5G Era - Digital Transformation in 5G AgeDigiKarishma
 
The Impact of Digital Technologies
The Impact of Digital Technologies The Impact of Digital Technologies
The Impact of Digital Technologies bruguardarib
 
ASO Process: What is App Store Optimization
ASO Process: What is App Store OptimizationASO Process: What is App Store Optimization
ASO Process: What is App Store OptimizationAli Raza
 
From Chance to Choice - Tactical Link Building for International SEO
From Chance to Choice - Tactical Link Building for International SEOFrom Chance to Choice - Tactical Link Building for International SEO
From Chance to Choice - Tactical Link Building for International SEOSzymon Słowik
 
Introduction to marketing Management Notes
Introduction to marketing Management NotesIntroduction to marketing Management Notes
Introduction to marketing Management NotesKiranTiwari42
 
Exploring Web 3.0 Growth marketing: Navigating the Future of the Internet
Exploring Web 3.0 Growth marketing: Navigating the Future of the InternetExploring Web 3.0 Growth marketing: Navigating the Future of the Internet
Exploring Web 3.0 Growth marketing: Navigating the Future of the Internetnehapardhi711
 

Dernier (20)

Exploring The World Of Adult Ad Networks.pdf
Exploring The World Of Adult Ad Networks.pdfExploring The World Of Adult Ad Networks.pdf
Exploring The World Of Adult Ad Networks.pdf
 
Miss Immigrant USA Activity Pageant Program.pdf
Miss Immigrant USA Activity Pageant Program.pdfMiss Immigrant USA Activity Pageant Program.pdf
Miss Immigrant USA Activity Pageant Program.pdf
 
The power of SEO-driven market intelligence
The power of SEO-driven market intelligenceThe power of SEO-driven market intelligence
The power of SEO-driven market intelligence
 
McDonald's: A Journey Through Time (PPT)
McDonald's: A Journey Through Time (PPT)McDonald's: A Journey Through Time (PPT)
McDonald's: A Journey Through Time (PPT)
 
Digital Marketing Courses In Pune- school Of Internet Marketing
Digital Marketing Courses In Pune- school Of Internet MarketingDigital Marketing Courses In Pune- school Of Internet Marketing
Digital Marketing Courses In Pune- school Of Internet Marketing
 
5 Digital Marketing Tips | Devherds Software Solutions
5 Digital Marketing Tips | Devherds Software Solutions5 Digital Marketing Tips | Devherds Software Solutions
5 Digital Marketing Tips | Devherds Software Solutions
 
top marketing posters - Fresh Spar Technologies - Manojkumar C
top marketing posters - Fresh Spar Technologies - Manojkumar Ctop marketing posters - Fresh Spar Technologies - Manojkumar C
top marketing posters - Fresh Spar Technologies - Manojkumar C
 
Inbound Marekting 2.0 - The Paradigm Shift in Marketing | Axon Garside
Inbound Marekting 2.0 - The Paradigm Shift in Marketing | Axon GarsideInbound Marekting 2.0 - The Paradigm Shift in Marketing | Axon Garside
Inbound Marekting 2.0 - The Paradigm Shift in Marketing | Axon Garside
 
Infographics about SEO strategies and uses
Infographics about SEO strategies and usesInfographics about SEO strategies and uses
Infographics about SEO strategies and uses
 
Most Influential HR Leaders Leading the Corporate World, 2024 (Final file).pdf
Most Influential HR Leaders Leading the Corporate World, 2024 (Final file).pdfMost Influential HR Leaders Leading the Corporate World, 2024 (Final file).pdf
Most Influential HR Leaders Leading the Corporate World, 2024 (Final file).pdf
 
What I learned from auditing over 1,000,000 websites - SERP Conf 2024 Patrick...
What I learned from auditing over 1,000,000 websites - SERP Conf 2024 Patrick...What I learned from auditing over 1,000,000 websites - SERP Conf 2024 Patrick...
What I learned from auditing over 1,000,000 websites - SERP Conf 2024 Patrick...
 
What’s the difference between Affiliate Marketing and Brand Partnerships?
What’s the difference between Affiliate Marketing and Brand Partnerships?What’s the difference between Affiliate Marketing and Brand Partnerships?
What’s the difference between Affiliate Marketing and Brand Partnerships?
 
Research and Discovery Tools for Experimentation - 17 Apr 2024 - v 2.3 (1).pdf
Research and Discovery Tools for Experimentation - 17 Apr 2024 - v 2.3 (1).pdfResearch and Discovery Tools for Experimentation - 17 Apr 2024 - v 2.3 (1).pdf
Research and Discovery Tools for Experimentation - 17 Apr 2024 - v 2.3 (1).pdf
 
A Comprehensive Guide to Technical SEO | Banyanbrain
A Comprehensive Guide to Technical SEO | BanyanbrainA Comprehensive Guide to Technical SEO | Banyanbrain
A Comprehensive Guide to Technical SEO | Banyanbrain
 
Digital Marketing in 5G Era - Digital Transformation in 5G Age
Digital Marketing in 5G Era - Digital Transformation in 5G AgeDigital Marketing in 5G Era - Digital Transformation in 5G Age
Digital Marketing in 5G Era - Digital Transformation in 5G Age
 
The Impact of Digital Technologies
The Impact of Digital Technologies The Impact of Digital Technologies
The Impact of Digital Technologies
 
ASO Process: What is App Store Optimization
ASO Process: What is App Store OptimizationASO Process: What is App Store Optimization
ASO Process: What is App Store Optimization
 
From Chance to Choice - Tactical Link Building for International SEO
From Chance to Choice - Tactical Link Building for International SEOFrom Chance to Choice - Tactical Link Building for International SEO
From Chance to Choice - Tactical Link Building for International SEO
 
Introduction to marketing Management Notes
Introduction to marketing Management NotesIntroduction to marketing Management Notes
Introduction to marketing Management Notes
 
Exploring Web 3.0 Growth marketing: Navigating the Future of the Internet
Exploring Web 3.0 Growth marketing: Navigating the Future of the InternetExploring Web 3.0 Growth marketing: Navigating the Future of the Internet
Exploring Web 3.0 Growth marketing: Navigating the Future of the Internet
 

Log analysis and pro use cases for search marketers online version (1)

  • 1. Log Analysis and PRO Use Cases for Search Marketers Dave Sottimano - Untagged.io - Madrid 2016
  • 2. Prepare yourself for bullet point hell. It’s meant for reading: bit.ly/untagged2016 Lo siento :(
  • 3. You know what makes me sad?
  • 5. Inflated & ambiguous stats 80,000 80,000 https://support.google.com/webmasters/answer/35253?hl=en
  • 6. Seems broken. It is broken: this is actually an image search result, and it has been ranking throughout the entire time period. ???
  • 7. But hey, reporting stats for the entire internet isn’t easy. So, thank you Google.
  • 8. ..but, we need better data.
  • 9. Why server log analysis is so important:
  • 10. How do we try and increase crawl frequency?
      ● Increase external link count (includes links from social sites)
      ● List valuable pages in sitemaps and ping Google
      ● Increase internal link count (crawl paths)
      ● Create new pages, and update older pages (avoid stagnation)
      ● Ensure pages are unique, reduce internal duplication
      ● Avoid internally linking to redirects or broken pages
      ● Testing. Lots of testing.
  • 11. What actions do SEOs take from log analysis?
      ● Optimize Googlebot crawl
        ○ Restructure link architecture, apply directives, block via robots.txt
      ● Find server errors or Googlebot-induced errors
        ○ Try to fix any 4xx, 5xx error codes
        ○ Use the referer fields of browser user agent requests to uncover the source of errors
      ● Understand Googlebot crawl rate & behaviour for SEO testing
        ○ Helpful for testing, for insights, and for constantly questioning best practices
      ● Block badly behaving bots, prevent bandwidth drain
        ○ Look for hotlinking bandwidth drain, e.g. images hotlinked from porn sites
      ● Find unreported links through referer fields
        ○ Link crawlers don’t find every link; server logs are necessary for comprehensive audits
      ● Double check Analytics data
        ○ Helpful for correcting analytics setup or understanding why referers aren’t passed correctly
  • 12. The hard part: Getting the right data and merging.
  • 13. Step 1: Get the right fields logged
      206.248.146.167 - - [25/Aug/2015:06:50:01 +0000] "GET /shoes HTTP/1.0" 200 251 "https://www.google.ca/" "example.com" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36"
      Fields labelled on the slide: IP address, date/time, method, page, response code, response time, referer, hostname, user agent
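A minimal sketch of parsing a line in this format with Python's standard library. The regular expression and the names `LOG_PATTERN` and `parse_line` are illustrative assumptions, not part of the deck. The value after the response code is treated as the response time because that is how the slide labels the fields; a server with a different custom format would need a different pattern.

```python
import re
from datetime import datetime

# Hypothetical pattern for the sample line on the slide: IP, two unused
# fields, timestamp, request, status, response time, referer, hostname, UA.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<page>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<response_time>\S+) '
    r'"(?P<referer>[^"]*)" '
    r'"(?P<hostname>[^"]*)" '
    r'"(?P<user_agent>[^"]*)"'
)

SAMPLE = (
    '206.248.146.167 - - [25/Aug/2015:06:50:01 +0000] '
    '"GET /shoes HTTP/1.0" 200 251 "https://www.google.ca/" "example.com" '
    '"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36"'
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # Timestamp layout as in the sample: day/month-name/year:time offset.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    return entry


if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

Returning `None` for lines that do not match makes it easy to count how many entries the pattern fails on, which is a quick check that the field mapping is right before any analysis starts.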
  • 14. Step 2: Ensure the correct originating IP is logged
      Load balancers, proxies or CDNs may overwrite the original IP of the request. Use the X-Forwarded-For header to ensure you log the original IP.
      IIS: http://www.loadbalancer.org/blog/iis-and-x-forwarded-for-header
      Apache: http://www.loadbalancer.org/blog/apache-and-x-forwarded-for-headers
      Nginx: https://easyengine.io/tutorials/nginx/forwarding-visitors-real-ip/
      CloudFlare: https://support.cloudflare.com/hc/en-us/sections/200805497-Restoring-Visitor-IPs
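As a rough illustration of what "original IP" means once X-Forwarded-For is logged: the header is a comma-separated list, and by convention the left-most entry is the client that made the request. The helper below is a hypothetical sketch (the name `original_client_ip` is not from the deck). Note that the header can be spoofed by the client, so it is only trustworthy when your own proxy or load balancer sets it.

```python
import ipaddress


def original_client_ip(remote_addr, x_forwarded_for=None):
    """Pick the left-most valid IP from X-Forwarded-For, else the socket IP."""
    if x_forwarded_for:
        first = x_forwarded_for.split(",")[0].strip()
        try:
            ipaddress.ip_address(first)  # raises ValueError if not an IP
            return first
        except ValueError:
            pass  # malformed header value: fall back to the connecting address
    return remote_addr


if __name__ == "__main__":
    # The load balancer (10.0.0.5) connected, but the request came from Googlebot's IP.
    print(original_client_ip("10.0.0.5", "66.249.65.63, 10.0.0.1"))
```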
  • 15. Step 3: Ensure we have all of the logs
      ● Triple check the hostname! If you’re analyzing the desktop site on example.com, for instance, ensure you’re not counting the mobile version (m.example.com) or other subdomains (forum.example.com). Be very careful to get the right data or you will pull your hair out. Ask system administrators!
      ● If the server stores cached copies and serves them from another server, get those logs too and combine them for the target domain analysis.
      ● Too much data? Ask for selective logging of the Googlebot user agent only.
  • 16. Step 4: Parse the logs, grab Googlebot entries https://www.splunk.com/en_us/download/splunk-light-2.html
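If a full log platform is not available, a first pass at grabbing Googlebot entries can be sketched in a few lines of Python. This assumes, as in the sample line earlier in the deck, that the user agent is the last double-quoted field on each line; the function names are hypothetical. It only finds entries that claim to be Googlebot, so the result still needs to be verified.

```python
import re

# Assumption: the user agent is the final quoted field, as in the sample line.
LAST_QUOTED_FIELD = re.compile(r'"([^"]*)"\s*$')


def claims_googlebot(line):
    """True if the last quoted field (the user agent) mentions Googlebot."""
    match = LAST_QUOTED_FIELD.search(line)
    return bool(match) and "googlebot" in match.group(1).lower()


def googlebot_entries(lines):
    """Keep only the log lines whose user agent claims to be Googlebot."""
    return [line for line in lines if claims_googlebot(line)]


if __name__ == "__main__":
    sample = [
        '66.249.65.63 - - [25/Aug/2015:06:50:01 +0000] "GET /shoes HTTP/1.1" 200 251 '
        '"-" "example.com" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
        '206.248.146.167 - - [25/Aug/2015:06:50:02 +0000] "GET /shoes HTTP/1.0" 200 251 '
        '"https://www.google.ca/" "example.com" "Mozilla/5.0 (Windows NT 6.1; WOW64)"',
    ]
    print(len(googlebot_entries(sample)))
```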
  • 17. Step 5: Verify Googlebot entries by DNS
      1. Segment out logs with user-agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
      2. Take the original IP in the logs, example: 66.249.65.63
      3. Reverse DNS lookup on the IP: crawl-66-249-65-63.googlebot.com
      4. Forward DNS lookup on that hostname: 66.249.65.63 (confirmed!)
      https://support.google.com/webmasters/answer/80553?hl=en
      Software I use: http://www.nirsoft.net/utils/ipnetinfo.html
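The reverse-then-forward DNS check can also be scripted. The sketch below uses Python's `socket` module and accepts hostnames ending in googlebot.com or google.com. The function name and the injectable lookup parameters are assumptions added so the logic can be tried without network access; the forward lookup shown is IPv4 only.

```python
import socket

# Hostname suffixes accepted as Google crawlers in this sketch.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse-resolve the IP, check the domain, then forward-resolve it back."""
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]
    try:
        host = reverse_lookup(ip)
        if not host.endswith(GOOGLE_SUFFIXES):
            return False
        # The hostname must resolve back to the same IP we started from.
        return ip in forward_lookup(host)
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses.
        return False


if __name__ == "__main__":
    # Requires network access; the result depends on live DNS.
    print(is_verified_googlebot("66.249.65.63"))
```

DNS lookups are slow, so for large logs it is usually worth verifying each distinct IP once and caching the answer rather than checking every line.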
  • 18. Note to self: Look out for Google mobile user agents
      Mozilla/5.0+(iPhone;+CPU+iPhone+OS+6_0+like+Mac+OS+X)+AppleWebKit/536.26+(KHTML,+like+Gecko)+Version/6.0+Mobile/10A5376e+Safari/8536.25+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html)
      This is a verified Googlebot from 66.249.65.63, but it’s not listed on the official crawlers page.
      Official Google: Mobile-first Indexing
  • 19. Step 6: Merge crawl data with clean logs
      ● Crawl as: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) and as a popular browser user agent
      ● Crawler config: ignore robots.txt, crawl all non-HTML, crawl internal nofollow, crawl canonicals & sitemaps, ideally JS enabled
      ● Fields required: URL, response code, title, robots directives (blocked, noindex, nofollow etc.), canonical, page size, response time, crawl level, number of internal links to page
      Try DeepCrawl for free: bit.ly/freecrawl - 25,000 credits for Untagged.io
  • 20. Step 7: Add web analytics data
      ● Ensure the URLs correspond correctly (special characters, full URL)
      ● Ensure the date period is exactly the same as the period covered by the server logs
      ● Use data from source/medium = Google/Organic only
      DeepCrawl can merge both crawl and analytics data from Google Analytics
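A minimal sketch of the merge itself, using only the Python standard library. The function names, the shape of the inputs (a list of Googlebot-requested URLs, a dict of crawl rows, a dict of organic visits) and the normalisation rules are all assumptions for illustration; the point is that every source must be keyed on the same normalised URL before joining.

```python
from collections import Counter
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def normalise(url):
    """Lower-case scheme and host, and percent-encode the path consistently."""
    parts = urlsplit(url)
    path = quote(unquote(parts.path))
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def merge_by_url(log_urls, crawl_rows, organic_visits):
    """Join Googlebot hits, crawl data and organic visits on the normalised URL."""
    hits = Counter(normalise(u) for u in log_urls)
    crawl = {normalise(u): row for u, row in crawl_rows.items()}
    visits = {normalise(u): n for u, n in organic_visits.items()}
    merged = {}
    for url in set(hits) | set(crawl) | set(visits):
        merged[url] = {
            "googlebot_hits": hits.get(url, 0),
            "in_crawl": url in crawl,
            "status": crawl.get(url, {}).get("status"),
            "organic_visits": visits.get(url, 0),
        }
    return merged


if __name__ == "__main__":
    merged = merge_by_url(
        ["http://Example.com/caf%C3%A9", "http://example.com/café"],
        {"http://example.com/café": {"status": 200}},
        {"http://example.com/caf%C3%A9": 3},
    )
    print(merged)
```

Keeping URLs that appear in only one source (the set union) is deliberate: pages Googlebot requests that the crawler never found, and crawled pages Googlebot never requested, are often the most interesting rows.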
  • 21. So far...
      ● Have all logs from the right host with the right fields
      ● Have the original IP addresses
      ● Confirmed real Googlebot visits
      ● Merged crawl data and analytics data perfectly
  • 22. Just when you think all the data is correct, something will go wrong, guaranteed ;)
  • 23. Real example, small site: http://www.campgroundsigns.com/
      ● 7 million events from load balancer, IIS custom format access logs = 1.6 GB of data
      ● 13,000 Googlebot events over 28 days
      ● 1,129 pages are indexable on campgroundsigns.com
  • 24. Caveat! The following are observations based on 1 small website. The observations for this site are only for this site and are not representative. Each website and its Googlebot crawl activity are different. Special thanks to campgroundsigns.com for volunteering for the analysis
  • 25. What is Google crawling?
  • 26. What we wanted crawled vs What Google crawled Based on a 28 day sample
  • 27. What did Google crawl by Page Type?
  • 28. How we determine page types
      ● By URL (if possible): example.com/products/product-123
      ● By unique HTML template footprint (recommended)
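Classifying by URL can be as simple as an ordered list of patterns. The rules below are hypothetical, modelled on the example.com/products/product-123 example; a real site needs its own patterns, and the sketch assumes absolute URLs that include the scheme.

```python
import re
from urllib.parse import urlsplit

# Hypothetical rules, checked in order; the first match wins.
PAGE_TYPE_RULES = [
    ("homepage", re.compile(r"^/?$")),
    ("product", re.compile(r"^/products/[^/]+/?$")),
    ("resource", re.compile(r"\.(?:js|css|png|jpe?g|gif)$", re.IGNORECASE)),
]


def page_type(url):
    """Return the first page type whose pattern matches the URL path."""
    path = urlsplit(url).path
    for name, pattern in PAGE_TYPE_RULES:
        if pattern.search(path):
            return name
    return "other"


if __name__ == "__main__":
    print(page_type("https://example.com/products/product-123"))
```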
  • 30. Googlebot crawl and Organic Traffic Based on a 28 day sample
  • 31. Is Googlebot using its crawl wisely?
  • 32. Of those pages, what were the response codes?
  • 33. The 4% of 410 errors are actually caused by Google trying to render JavaScript. 127 pages per 28-day crawl period are wasted.
  • 34. How often did Google crawl NOINDEX pages?
  • 35. Did Google crawl the right pages?
      Indexable is defined as: response code 200, no robots.txt block, self-referencing canonical or no canonical in head or HTTP header, no noindex directives in head or HTTP header, no directives applied in GSC param config, no removal request, and not a JS/CSS or other resource file.
      A page is not indexable if it returns a non-200 response or fails any of the other conditions above.
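The indexable definition translates directly into a boolean check over the merged crawl data. The `Page` record and its field names below are assumptions chosen for illustration; they mirror the conditions listed on the slide one for one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    # Hypothetical record holding one row of merged crawl data.
    url: str
    status: int
    blocked_by_robots: bool = False
    canonical: Optional[str] = None      # None means no canonical was declared
    noindex: bool = False                # noindex in head or HTTP header
    gsc_param_directive: bool = False    # directive applied in GSC param config
    removal_requested: bool = False
    is_resource: bool = False            # JS, CSS or other resource file


def is_indexable(page):
    """Apply every condition from the slide; all must hold."""
    return (
        page.status == 200
        and not page.blocked_by_robots
        and page.canonical in (None, page.url)
        and not page.noindex
        and not page.gsc_param_directive
        and not page.removal_requested
        and not page.is_resource
    )


if __name__ == "__main__":
    print(is_indexable(Page(url="https://example.com/shoes", status=200)))
```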
  • 36. Generally, we see reduced crawl activity to pages with NOINDEX. There’s something wrong. PLA = Product listing Ad.
  • 37. We tried to block the PLA pages to divert attention to important pages:
  • 38. Based on a 4-day (Monday to Thursday) period before and after the block. Errr, go back, quick.
      Page type                | All requests: before | All requests: after | Unique pages crawled: before | Unique pages crawled: after
      PLA (blocked by robots)  | 1334                 | 0                   | 703                          | 0
      Department or other page | 404                  | 212                 | 270                          | 124
      Product page             | 605                  | 247                 | 452                          | 177
      Resource                 | 332                  | 406                 | 50                           | 61
      Homepage                 | 15                   | 15                  | 1                            | 1
      Totals                   | 2690                 | 880                 | 1476                         | 363
      Difference: -67% for all requests, -75% for unique pages crawled
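The totals and percentage differences on this slide can be reproduced from the per-page-type counts. A short sketch, with the dictionary and function names chosen here for illustration:

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


# Counts taken from the slide: (before, after) per page type.
all_requests = {
    "PLA (blocked by robots)": (1334, 0),
    "Department or other page": (404, 212),
    "Product page": (605, 247),
    "Resource": (332, 406),
    "Homepage": (15, 15),
}
unique_pages = {
    "PLA (blocked by robots)": (703, 0),
    "Department or other page": (270, 124),
    "Product page": (452, 177),
    "Resource": (50, 61),
    "Homepage": (1, 1),
}


def totals(table):
    """Sum the before and after columns across page types."""
    before = sum(b for b, _ in table.values())
    after = sum(a for _, a in table.values())
    return before, after


if __name__ == "__main__":
    for label, table in (("All requests", all_requests), ("Unique pages", unique_pages)):
        before, after = totals(table)
        print(label, before, after, round(percent_change(before, after)))
```

The same `percent_change` applied per row shows that the drop was not confined to the blocked PLA pages: product page requests also fell, from 605 to 247.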
  • 39. Turns out, Google uses its regular Googlebot crawler to crawl the PLA pages, not AdsBot. Blocking them was a mistake. We’ll try canonicals next. https://support.google.com/merchants/answer/160156?hl=en
  • 40. Insights: If Googlebot crawls a page, is it always indexed?
  • 41. No. Out of a sample of 691 pages, 12 were crawled and not indexed.
  • 42. Insights: If a page gets Google organic traffic, is it always indexed?
  • 43. No. 225 pages receiving at least 1 Google Organic visit during the time period: Read why: http://bit.ly/2faEoA9
  • 44. How did I check indexation? Sorry, this section is not available for non attendees!
  • 45. Frequent question for SEOs: How long will it take for Google to update index after a migration?
  • 46. Googlebot 2.1 pages crawled per day, vs Search Console
  • 47. Where to find pages crawled per day in Search Console?
  • 48. Tip: Disable all CSS styles for easy copy
  • 49. How many unique URLs are crawled per day?
  • 50. Googlebot only crawled 766 pages out of the 1,129 we wanted crawled over 28 days.
  • 51. More realistic, still estimated, but slightly less bullshit:
      ● 766 unique, indexable pages were crawled over 28 days
      ● That gives us an average of 27 unique pages crawled per day.
      ● 1129 total indexable pages / 27 = minimum 42 days for a full recrawl.
      Remember, this is a complete estimate.
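The same back-of-the-envelope estimate as a small function, so it can be rerun on other sites. The function name is an assumption; the arithmetic follows the slide: round the daily average, then round the recrawl time up to whole days.

```python
import math


def estimate_full_recrawl_days(unique_pages_crawled, sample_days, indexable_pages):
    """Return (average unique pages per day, minimum days for a full recrawl)."""
    pages_per_day = round(unique_pages_crawled / sample_days)
    return pages_per_day, math.ceil(indexable_pages / pages_per_day)


if __name__ == "__main__":
    # 766 unique indexable pages over 28 days, 1,129 indexable pages in total.
    print(estimate_full_recrawl_days(766, 28, 1129))
```

This is a floor rather than a forecast: it assumes Googlebot spreads its crawl evenly over pages it has not yet visited, which the deck's own data suggests it does not.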
  • 52. That doesn’t even account for how many times Google has to figure out a 301 redirect.
  • 53. Same calculation, different site (with approx 86,000 indexable pages) This is not representative of any other site.
  • 54. Fixing the problems isn’t always easy. But it does pay off.
  • 55. Here are some fancy charts moving up and to the right as proof.
  • 57. If it seems Google isn’t respecting robots.txt, check: 10 day lag!
  • 58. Server log analysis is hard. Here’s why:
      ● Data size challenges, example: 7 million events = 1.6 GB (and that’s tiny)
      ● Lots of different servers logging with custom formats
      ● Often, obtaining the logs means overcoming people problems & technical challenges
      ● Any small mistake in combining crawl, analytics and Search Console data can make the entire analysis useless
      ● Combining large datasets requires either some form of programming or technical knowledge; it’s not for everyone.
      ● Many available tools aren’t comprehensive enough for SEO purposes yet.
      That being said, they are the best thing since patatas bravas con alioli.
  • 59. Things that can corrupt your results
      ● Thinking you’re seeing Googlebot when it’s not really Googlebot
      ● Not accounting for changes to robots.txt restrictions or other directives in the crawl data during the logging period
      ● Incorrect field mapping, e.g. mistaking the referer for the requested page
      ● Incorrect merging of crawl and analytics data
  • 60. Helpful links for log analysis
      Guides:
      ● A Complete Guide to Log Analysis with Big Query - Dominic Woodman
      ● The Ultimate Guide to Log File Analysis - Daniel Butler
      ● SEO Finds in Your Server Log - Tim Resnik
      ● How to Use Server Log Analysis for Technical SEO - Samuel Scott
      Software:
      ● Splunk
      ● SEO Log File Analyser
      ● Logz.io
      ● Botify