New perspectives on duplicate content
How do ranking factors and evolving search technologies impact the way we handle duplicate content? What does the future hold for similar content on the web? Join OnCrawl Ambassador Omi Sido and Alexis Sanders as they explore the question of duplicate content.
9. o Repetitive page
o Doorway pages
o Inventory control
o Syndicated content
o PR releases
o Republishing
o Plagiarism
o Non-unique copy
o Localized content
o Thin content
o Staging sites
o HTTP vs. HTTPS
o Subdomain
o URL cases
o File extensions
o Trailing slash
o Index pages
o Parameters
o Pagination
o Mobile Configuration
o Internal site search
technical content
o Facets
o Sorts
o Image-only
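Several of the technical sources listed above (HTTP vs. HTTPS, subdomain, URL cases, trailing slash, index pages, parameters) can be surfaced by normalizing crawled URLs and grouping the ones that collapse to the same key. The following is a minimal sketch only: the ignored parameter names, the index file names, and the assumption that paths are case-insensitive are hypothetical choices for illustration, not rules from the talk.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical settings, chosen only for this illustration.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sort", "sessionid"}
INDEX_FILES = {"index.html", "index.htm", "index.php"}


def normalize_url(url):
    """Collapse common technical variants of a URL into one comparison key."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()  # .hostname drops any port
    if host.startswith("www."):
        host = host[4:]
    # Assumption: paths are case-insensitive on this site (not true everywhere).
    segments = parts.path.lower().split("/")
    if segments[-1] in INDEX_FILES:
        segments[-1] = ""
    path = "/".join(segments).rstrip("/") or "/"
    # Drop ignored parameters and sort the rest so order does not matter.
    query = sorted(
        (k, v) for k, v in parse_qsl(parts.query) if k.lower() not in IGNORED_PARAMS
    )
    return urlunsplit(("https", host, path, urlencode(query), ""))


def group_duplicates(urls):
    """Return only the keys that more than one crawled URL maps to."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize_url(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}
```

Any group with more than one member is a candidate for review; whether the variants actually serve the same content still has to be checked on the pages themselves.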
10. How can SEOs find and identify duplicate content?
#OnCrawlinOrbit
11. 1. Know your user journey
2. Create a strong hierarchical URL taxonomy
3. Prioritize duplicate content issues that are affecting performance
4. If the pages are 100% duplicate, consolidate w/ a 301 redirect
5. Leverage appropriate signaling
6. Strategically → consolidate, create, delete, optimize
7. If your content is stolen: request a canonical tag; file a DMCA request
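The split between a 301 redirect for 100% duplicates and lighter signaling for everything else can be sketched as a small helper. This is an illustration under stated assumptions: the function name and the `exact_duplicate` flag are hypothetical, the redirect line uses Apache's `Redirect` directive as one possible server syntax, and the canonical tag is the standard `rel="canonical"` link element.

```python
from urllib.parse import urlsplit


def consolidation_signal(duplicate_url, preferred_url, exact_duplicate):
    """Suggest a consolidation signal for one duplicate/preferred URL pair.

    Hypothetical helper: exact duplicates get a 301 redirect rule,
    near duplicates get a canonical tag to place on the duplicate page.
    """
    if exact_duplicate:
        # Apache mod_alias syntax: Redirect <status> <old path> <new URL>
        path = urlsplit(duplicate_url).path or "/"
        return f"Redirect 301 {path} {preferred_url}"
    # Near duplicate: keep the page live, but point to the preferred version.
    return f'<link rel="canonical" href="{preferred_url}">'
```

For example, `consolidation_signal("https://example.com/old-page", "https://example.com/new-page", True)` produces a redirect rule, while passing `False` produces the canonical tag for the duplicate page's `<head>`.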
22. What is new for duplicate content in the past year and a half?
23. What will duplicate content management look like in the future?
24. Alexis’ hopes for the future:
• Less technical-based duplicate content (as CMSs wise up)
• More automation (unit testing and external testing)
• Automatically detect high-similarity pages and page types for writers and content managers
• Google continues to improve its existing systems and detection
• Perhaps an alert system to escalate the issue of Google not using the right canonical
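One way the idea of automatically detecting high-similarity pages might look in practice is word shingling with Jaccard similarity. This is a sketch of a common near-duplicate technique, not something described in the talk; the shingle size of 3 and the 0.8 threshold are arbitrary assumptions, and a large site would need a scalable variant such as MinHash.

```python
import re
from itertools import combinations


def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in the text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items over all items."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similar_pages(pages, threshold=0.8):
    """Compare every pair of pages; return pairs at or above the threshold.

    `pages` maps a URL (or any page ID) to its extracted body text.
    """
    sets = {url: shingles(text) for url, text in pages.items()}
    results = []
    for first, second in combinations(sorted(sets), 2):
        score = jaccard(sets[first], sets[second])
        if score >= threshold:
            results.append((first, second, score))
    return results
```

A content team could run a check like this before publishing and flag any new draft that scores above the threshold against an existing page.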
25. Do you have a favorite technical trick?
26. Alexis’ tech SEO tricks:
• EC2 remote computer instance
• Check mobile-first testing tool
• Switch user agent to Googlebot
• Using TechnicalSEO.com’s robots.txt tool
• Screaming Frog Log File Analyser
• Made with Love’s htaccess checker
• Using Google Data Studio to report on changes (syncing Sheets with updates, filtering each page by relevant updates)
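The robots.txt check for a specific user agent can be approximated locally with Python's standard `urllib.robotparser`. This is a rough sketch only: the sample rules and URLs are made up, and the standard-library parser does simple prefix matching, so its answers can differ from Google's own robots.txt handling (for example, on wildcard patterns).

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content for illustration.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /search

User-agent: *
Disallow: /staging/
"""


def allowed(robots_txt, user_agent, url):
    """Return True if the given user agent may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse in-memory text, no network call
    return parser.can_fetch(user_agent, url)
```

Note that a crawler with its own `User-agent` group follows only that group, so in this sample Googlebot is blocked from `/search` but not from `/staging/`.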
27. Do you have a least favorite technical SEO question?
28. Do you have a favorite Googlebot?
29. Alexis: I like the idea that Googlebot is tired and overworked (from crawling 130 trillion URLs).
30. Do you have a favorite planet?
31. Launching the best SEO tips into space
Next up on June 27th from Bordeaux, France
Full agenda at WWW.ONCRAWL.COM/SEOINORBIT
Editor's notes
October 2015 Google Hangout
Joey: [holding up headshots] “Which one do you like better?”
Bianca: “Umm, I think I like the white shirt better.”
Joey: “Yeah, it’s-it’s more…”
Bianca: “Pensive?”
Joey: “Damn, I was going for thoughtful.”
Roll in like Star Wars?
https://www.shapechef.com/blog/star-wars-intro-crawl-in-powerpoint-2013
I have two clients that see a lot of this, one in real estate, one in health. Lot of trying to add unique content and information architecture.
From Google
Incorrect canonicals: ~2/1/19
Fixed: 4/18/19
Same as stuff in the past
Finding ways to make content unique, especially when site is massive
Machine generated content
Alexis:
Check mobile first testing tool, switch user agent to Googlebot
Using technicalseo.com’s robots.txt tool
Screaming Frog Log File Analyser
EC2 remote computer
I’ll try to think of something better.