Google Crawl Budget Optimization

On average, only 40% of a site's pages are crawled regularly by Google. SEOs know that optimizing crawl budget for large and complex websites is critical for organic performance. But how can you find your own crawl budget? And once you understand your site’s crawl budget, how can you impact it?

This presentation shares insights from our recent on-demand webinar Optimize Google Crawl Budget. Click to learn how to define, measure and optimize for Google crawl budget.

Published in: Marketing

  1. CRAWL BUDGET OPTIMIZATION: BOTIFY WEBINAR. Mark Thomas, VP Customer Success. Watch the Crawl Budget Optimization Webinar on demand now.
  2. Watch the webinar on demand. More Crawl Budget Resources:
     • Crawl Budget Webinar
     • Botify Crawl Budget Blog Posts
       o Google Crawl Budget Optimization
       o Google Confirms SEOs Should Control Their Crawl Budget
       o What is Crawl Ratio, and Why Does it Matter?
       o Time Is On Your Site: How to Use Crawl Frequency Metrics For SEO
       o Crawl Budget is a Finite Resource: Spend it Wisely
       o Expand on Google Search Console Data with Botify
       o Botify Sheds Light on Crawl Errors in Google Search Console
  3-5. THE TYRANNY OF CHOICE
  6. “Traditionally, companies said that it's all about the customer, and therefore give them everything they want,” says Glen Williams of Bain, a consultancy. “In reality, this can make it difficult to identify which products the customer really wants, and can create problems for managing the business.” Offering too many jazzy options for new cars, say, may not only confuse consumers but add to production costs and increase the potential for factory-floor bungles. A 2006 Bain study suggested that reducing complexity and narrowing choice can boost revenues by 5-40% and cut costs by 10-35%.
     COULD THE SAME BE TRUE FOR ORGANIC REVENUE?
  7. WEBINAR OBJECTIVES: DEFINE, MEASURE, OPTIMIZE
  8. DEFINE
  9. CRAWL BUDGET DEFINED
  10. “Taking crawl rate and crawl demand together we define crawl budget as the number of URLs Googlebot can and wants to crawl.”
  12. RANKING PROCESS: NO CRAWL = NO CONVERSION
  13. On average, only 40% of a site's pages are crawled regularly by Google.* Site types shown in the chart: Marketplace, Publisher, Retailer.
      * Study based on the “Crawl Ratio” of Botify customers within a 30-day period.
  14. CRAWL BUDGET. Definition: “The number of URLs Google can and wants to crawl”
      Crawl Rate Limit
      ● Crawl Health
      ● Search Console limit
      Crawl Demand
      ● Popularity
      ● Staleness
  15. CRAWL BUDGET: Negative Impact. URLs with low value to SEO:
      • Faceted navigation
      • Session IDs
      • On-site duplicate content
      • "Soft 404" error pages
      • Hacked pages
      • Infinite spaces and proxies
      • Low quality content and spam
  16. CRAWL BUDGET: Botify Suggests. Tools that normally serve constructive SEO purposes can, when used at very large scale, have a corrosive effect: they cause Google to spend too many resources on URLs that won't drive traffic of their own. Two examples:
      ● High percentage of redirects or 404 errors as a share of crawl and/or in the site structure
      ● High percentage of non-canonical URLs in the site structure
  17. MEASURE
  18. SEARCH CONSOLE LIMITS: GOOD OR BAD NEWS? Search Console lets you see how many pages are crawled each day, but not which URLs. This is useful for getting a general sense of Googlebot’s activity and for corroborating Botify data.
  19. SERVER LOGS + SEGMENT: Use log analysis with segmentation of URLs. Visualize exactly which pages are crawled by Google every day.
  20. “WE’VE MOVED MORE AND MORE TOWARDS UNDERSTANDING SECTIONS OF A SITE TO UNDERSTAND THE QUALITY OF THOSE SECTIONS.”
  21. SERVER LOGS + CRAWL + SEGMENT: Combining crawler data (a simulation of the site structure) with log analysis makes it possible to calculate the percentage of a site's pages seen by Google: “% URLS CRAWLED (BY GOOGLE)”.
  22. REFINE YOUR DATA TO CONCENTRATE ON USEFUL PAGES: Analyzing crawl rate on Compliant (or Indexable) pages only lets you measure the percentage of useful pages crawled by Google.
  23. REFINE AGAIN: A more detailed analysis looks at the useful crawl rate per segment. In the example shown, there is an opportunity to optimize the Crawl Budget on the Categories pages.
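The measurement described in slides 19 to 23 can be sketched in a few lines of Python. This is only an illustration, not Botify's method: it assumes access logs in the common "combined" format, identifies Googlebot by user-agent string alone (which can be spoofed), and uses hypothetical segment rules (`/category/`, `/product/`) and function names invented for the example.

```python
import re
from collections import defaultdict

# Hypothetical segmentation rules: (segment name, pattern on the URL path).
SEGMENTS = [
    ("categories", re.compile(r"^/category/")),
    ("products", re.compile(r"^/product/")),
]

# Extracts the request path and the user agent from a combined-format log line.
LOG_LINE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} .*"(?P<agent>[^"]*)"$')


def segment_of(path):
    """Return the first matching segment name, or 'other'."""
    for name, pattern in SEGMENTS:
        if pattern.search(path):
            return name
    return "other"


def googlebot_paths(log_lines):
    """Collect the distinct paths requested by a user agent claiming to be Googlebot."""
    paths = set()
    for line in log_lines:
        match = LOG_LINE.search(line.strip())
        if match and "Googlebot" in match.group("agent"):
            paths.add(match.group("path"))
    return paths


def crawl_ratio_by_segment(structure_paths, log_lines):
    """Share of pages in the site structure that Googlebot requested, per segment."""
    crawled = googlebot_paths(log_lines)
    totals = defaultdict(int)
    hits = defaultdict(int)
    for path in set(structure_paths):
        segment = segment_of(path)
        totals[segment] += 1
        if path in crawled:
            hits[segment] += 1
    return {segment: hits[segment] / totals[segment] for segment in totals}
```

Here `structure_paths` stands for the URLs found by your own crawl of the site, restricted to compliant (indexable) pages if you want the "useful" crawl ratio from slide 22. A segment with a low ratio is a candidate for crawl budget optimization.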
  24. OPTIMIZE
  25. OPTIMIZE
      CRAWL HEALTH
      ● Page load time
      ● Crawl errors
      ● Irrelevant actions for web crawlers (e.g. login, cart, contact form)
      POPULARITY
      ● Dilution of PageRank
      ● Low value SEO pages
      ● Depth
      ● Linking internally
      ● Orphaned pages
      STALENESS
      ● Measurement of retention time in the index
      ● Measuring index refresh rate
  26. CRAWL HEALTH: LOAD TIMES. Botify reports the load times of all pages (the time to download the HTML code). Detect pages with performance issues in the “Performance” tab. GZIP compression reduces server demands and is also reported in Botify. Free tools such as GTmetrix or PageSpeed Insights offer further advice on a per-page basis.
  27. “YOU CAN INFLUENCE [CRAWL BUDGET] ON YOUR SIDE THROUGH TECHNICAL MEANS.” “USE A REALLY FAST SERVER.”
  28. CRAWL HEALTH: CRAWL ERRORS. Bad HTTP codes (anything other than 200 or 304) will have a negative impact on Crawl Budget. WHAT SHOULD YOU DO?
      • Minimize HTTP errors
      • Check for redirect loops
      • Improve server performance (5XX errors)
      • Submit an XML sitemap with expired pages to help get them removed from Google’s index more quickly
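As a rough illustration of slide 28, the share of "bad" responses served to Googlebot can be computed from the status codes found in the logs. The function name and the shape of its result are assumptions made for this sketch; the definition of "bad" (anything other than 200 or 304) is the one given on the slide.

```python
from collections import Counter


def status_breakdown(status_codes):
    """Summarize the HTTP status codes returned to Googlebot.

    Returns the share of hits that are neither 200 nor 304 ("bad_share")
    and the share of hits per status class ("2xx", "3xx", ...).
    """
    counts = Counter(status_codes)
    total = sum(counts.values())
    if total == 0:
        return {"bad_share": 0.0, "by_class": {}}
    by_class = Counter()
    for code, hits in counts.items():
        by_class[f"{code // 100}xx"] += hits
    good = counts[200] + counts[304]
    return {
        "bad_share": (total - good) / total,
        "by_class": {name: hits / total for name, hits in by_class.items()},
    }
```

Tracking `bad_share` per segment and per day makes it easier to see whether redirects, 404s, or 5XX errors are consuming a growing part of the crawl.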
  29. POPULARITY: PAGERANK. Google generally crawls pages on the web in order of PageRank. WHAT SHOULD YOU DO? Minimize the dilution of PageRank in the site structure to maximize the Crawl Budget.
  30. “LINKS TO PRODUCTS ON YOUR HOMEPAGE GAIN A LITTLE MORE WEIGHT”
  31. POPULARITY: PAGERANK. [Diagram: PageRank values across a site structure, shown with and without low value SEO pages; without them, the strategic page receives more PageRank.] Each link to a page with low SEO value reduces the PageRank assigned to the strategic pages of the site. Remove low value SEO pages.
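The dilution effect in slide 31 can be reproduced with the textbook PageRank iteration on a toy link graph. This is a simplified model, not Google's actual ranking system, and it does not reproduce the PR values from the slide's diagram. The damping factor of 0.85, the page names, and the graph itself are all assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook PageRank by power iteration.

    `links` maps every page to the list of pages it links to; every link
    target must also appear as a key. Ranks sum to 1 across the site.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page without outlinks spreads its rank evenly over the site.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical site: the home page links to two useful pages and three facets.
with_low_value = {
    "home": ["strategic", "other", "facet1", "facet2", "facet3"],
    "strategic": ["home"],
    "other": ["home"],
    "facet1": ["home"],
    "facet2": ["home"],
    "facet3": ["home"],
}
# Same site after removing the links to the low value facet pages.
without_low_value = {
    "home": ["strategic", "other"],
    "strategic": ["home"],
    "other": ["home"],
}

print(pagerank(with_low_value)["strategic"])
print(pagerank(without_low_value)["strategic"])
```

In this toy graph the strategic page ends up with a larger share of internal PageRank once the facet links are removed, which is the point the slide makes.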
  32. POPULARITY: PAGERANK DILUTION (NOT-COMPLIANT). Non-indexable pages and facets contribute to the dilution of PageRank. WHAT SHOULD YOU DO?
      • Analyze the percentage of internal PageRank diluted to worthless pages
      • Minimize the number of these pages in the structure of the site
  33. POPULARITY: PAGERANK DILUTION (THIN CONTENT). Empty or virtually empty pages have very little chance of generating organic traffic, but contribute to the dilution of PageRank. WHAT SHOULD YOU DO?
      • Identify pages with very little content
      • Add content to these pages or remove them from the site structure
  34. POPULARITY: PAGERANK DILUTION (ROBOTS.TXT). Links pointing to pages blocked by robots.txt can potentially dilute much of the internal PageRank. WHAT SHOULD YOU DO? Identify the links pointing to these pages in the site structure and then delete them.
  35. “DON’T SEND A CANONICAL TO A URL YOU’RE TELLING THE PARAMETER HANDLING TOOL NOT TO CRAWL.”
  36. POPULARITY: PAGERANK DILUTION (DEPTH). [Diagram: PageRank values decreasing from the Home Page at Depth 0 through Depth 5.] The depth of a page is the minimum number of clicks needed to reach it from the home page. CONSEQUENCE: The deeper a page is, the less PageRank it will tend to receive.
  37. POPULARITY: PAGERANK DILUTION (DEPTH). CRAWL RATE decreases the deeper Googlebot ventures. The higher a page sits within the structure, the easier it is for Google to crawl it. WHAT SHOULD YOU DO?
      • Minimize the depth of strategic pages
      • “Fetch as Google” strategic pages and then “Submit to Index”
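Depth as defined in slide 36 (the minimum number of clicks from the home page) is a shortest-path distance, so it can be computed with a breadth-first search over the internal link graph. The sketch below assumes the link graph is already available as a dictionary, for example from your own crawl; the function name is hypothetical.

```python
from collections import deque


def page_depths(links, home):
    """Return the minimum number of clicks from `home` to each reachable page.

    `links` maps a page to the list of pages it links to. Pages that cannot
    be reached from the home page are absent from the result.
    """
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths
```

Strategic pages that come back with a large depth are candidates for additional links from pages closer to the home page.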
  38. POPULARITY: INTERNAL LINKING. [Chart: average number of links received by pages not crawled by Google vs. average number of links received by pages crawled by Google.] WHAT SHOULD YOU DO?
      • Link from key indexed pages
      • Add internal links
      • Measure and test to find the optimal inlink distribution
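The comparison charted in slide 38 can be approximated from two inputs: the internal link graph and the set of URLs Googlebot requested. The sketch below is one possible way to compute it; the function name and the choice to count each linking page once per target are assumptions, not a description of how Botify calculates the metric.

```python
from collections import Counter
from statistics import mean


def average_inlinks(links, crawled_by_google):
    """Average number of internal inlinks for crawled vs. not-crawled pages.

    `links` maps a page to the pages it links to. Each linking page counts
    once per target, and self-links are ignored.
    Returns (average for crawled pages, average for not-crawled pages).
    """
    inlinks = Counter()
    for source, targets in links.items():
        for target in set(targets):
            if target != source:
                inlinks[target] += 1
    pages = set(links) | set(inlinks)
    crawled = [inlinks[page] for page in pages if page in crawled_by_google]
    missed = [inlinks[page] for page in pages if page not in crawled_by_google]
    return (
        mean(crawled) if crawled else 0.0,
        mean(missed) if missed else 0.0,
    )
```

If crawled pages receive clearly more inlinks on average than pages Google ignores, adding internal links to the ignored strategic pages is a reasonable first test.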
  39. POPULARITY: ORPHAN PAGES. Orphaned pages are pages that are not linked in the site structure but are crawled by Google. WHAT SHOULD YOU DO?
      • If the orphan pages have SEO potential, link them into the structure.
      • If they do not have SEO potential, block search engines from crawling them (robots.txt, 410 ...)
  40. POPULARITY: SITEMAPS. Dirty Sitemaps help generate orphaned pages. WHAT SHOULD YOU DO?
      • Identify Sitemap pages that are not linked in the site structure
      • Clean Sitemaps if these pages are not supposed to be there
      • XML sitemaps should contain only the most important URLs
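Slides 39 and 40 both come down to set comparisons between three URL lists: pages found in the site structure, pages requested by Googlebot, and pages listed in XML sitemaps. A minimal sketch, with a hypothetical function name and result keys, and assuming the three lists have already been normalized to the same URL form:

```python
def classify_urls(structure_urls, googlebot_urls, sitemap_urls):
    """Find orphan pages and sitemap entries that are missing from the structure."""
    structure = set(structure_urls)
    crawled = set(googlebot_urls)
    sitemap = set(sitemap_urls)
    return {
        # Crawled by Google but not linked anywhere in the site structure.
        "orphans": crawled - structure,
        # Listed in a sitemap but not linked in the site structure.
        "sitemap_not_in_structure": sitemap - structure,
        # Orphans that Google may have discovered through the sitemap.
        "orphans_from_sitemap": (crawled & sitemap) - structure,
    }
```

Each resulting set then needs a manual decision: link the page into the structure if it has SEO potential, or remove it from the sitemap and block it otherwise.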
  41. STALENESS: DETECTION. Google: “Staleness: our systems attempt to prevent URLs from becoming stale in the index.” WHY IS IT IMPORTANT? Analyzing the staleness of pages makes it possible to know how often the pages of a site should be crawled by Google.
  42. STALENESS: INDEX REFRESH.
      • How does keeping your content fresh impact organic search performance?
      • Are your pages being penalized for content changing too frequently?
      • How can you employ redirects to improve performance?
      • Are stale pages negatively affecting SEO?
  43. Thank You
