In today’s always-on world our “to do” lists never seem to shrink. Fortunately, when it comes to SEO there are ways to work faster AND better. One sure-fire way to increase efficiency and effectiveness is automation. Join Catalyst’s Paul Shapiro as he discusses specific ways to use automation to deliver better results in less time. You’ll leave with an understanding of how automation technology can simplify technical SEO processes. Audiences will learn how to:
• Leverage SQL databases to automatically collect data from Google Search Console over time
• Automate keyword research with an open-source tool called KNIME
• Use programming concepts, such as regex for data extraction, and work with APIs to enhance your data analysis
• Implement data visualization strategies to quickly recognize critical patterns and trends
3. Paul Shapiro | @fighto #TechSEOBoost
Why Automate?
1. Work faster and free up time for the
important stuff
2. Look at more data
3. Improve consistency and procedure
What Can You Automate?
• If you’re doing something on a routine
basis, it can probably be automated.
• Any procedures that can be broken down
into smaller, micro tasks that can be
handled by a computer.
• Machine Learning can help with more
complex decision making.
How Do You Automate?
https://www.knime.org
Why KNIME?
• Fast way to put together complex
analyses
• Great for prototyping
• Large library of built-in “nodes”
• Free/Open Source
• Runs on Windows/Mac/Linux
• Very expandable – even compatible
with R, Python, Java, JavaScript
• Easy enough for non-technical staff
to grasp
Other Options
• Scripting Languages
• Python
• Ruby
• Node.js
• Go
• R
• Excel with VBA
• Google Sheets
Cron & Windows Task Scheduler
are Your Friends
What is Cron and Why?
• *NIX system daemon used to schedule tasks and
scripts.
• Windows Task Scheduler is the Windows equivalent
of Cron.
• This way we can schedule scripts and programs that
perform automated tasks on a recurring, scheduled
basis.
Quick How To
* * * * * command /to/execute
Fields, left to right:
1. Minute (0-59)
2. Hour (0-23)
3. Day of Month (1-31)
4. Month (1-12)
5. Day of Week (0-6) (Sunday = 0)
Run at Midnight on the First of Every Month
0 0 1 * * python datacollector.py
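To make the five fields concrete, here is a minimal Python sketch that checks whether a cron expression would fire at a given moment. It is an illustration only, not how cron itself is implemented: the names `field_matches` and `cron_matches` are made up for this example, and it handles only `*` and comma-separated numbers (no ranges or step values).

```python
from datetime import datetime


def field_matches(field, value):
    """Return True if one cron field ('*' or comma-separated integers) matches value."""
    if field == "*":
        return True
    return value in {int(part) for part in field.split(",")}


def cron_matches(expression, moment):
    """Return True if the five-field cron expression would fire at `moment`."""
    minute, hour, day_of_month, month, day_of_week = expression.split()
    # Python counts Monday as 0; cron counts Sunday as 0.
    cron_weekday = (moment.weekday() + 1) % 7
    time_ok = (
        field_matches(minute, moment.minute)
        and field_matches(hour, moment.hour)
        and field_matches(month, moment.month)
    )
    dom_ok = field_matches(day_of_month, moment.day)
    dow_ok = field_matches(day_of_week, cron_weekday)
    # When both day fields are restricted, cron fires if either one matches.
    if day_of_month != "*" and day_of_week != "*":
        day_ok = dom_ok or dow_ok
    else:
        day_ok = dom_ok and dow_ok
    return time_ok and day_ok


# "0 0 1 * *" fires at 00:00 on the first day of each month.
fires_on_first = cron_matches("0 0 1 * *", datetime(2024, 3, 1, 0, 0))
fires_on_second = cron_matches("0 0 1 * *", datetime(2024, 3, 2, 0, 0))
```

Here `fires_on_first` is true and `fires_on_second` is false, which is why the expression above runs the script once a month rather than every midnight.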
The Basics of KNIME
What is a Node?
• Nodes are prebuilt, drag-and-drop modules designed to perform a single task
• Nodes are strung together like a chain to accomplish larger, more complex
tasks
• Nodes can be grouped together into “meta-nodes”, which can be configured in
unison
How Do You Add Nodes &
How Do They Connect?
How do you add nodes to your “workflow”?
How do you string nodes together?
How Do You Configure & Run Nodes?
Configuring Nodes
Running Workflows
Most Keyword Research Looks Like This
Typical Time Investment for Keyword Research
[Bar chart: Hours to Complete Keyword Research by Site Size. Series: Average (Low End) and Average (High End). Site sizes: Micro (0-49 pages), Small (50-99 pages), Medium (100-249 pages), Large (250-499 pages), Extra Large (>500 pages). Data labels in slide order: 5, 8, 12, 21, 21, 6, 10, 18, 22, 28.]
Size of the Data Set
vs.
Seed Keywords
- List
- GWMT
- SEMRush Comp. KWs
- SQR Keywords
Keyword Sources
- Keyword Planner Suggestions (via GrepWords), then Google Autocomplete
- Semantic Keyword Recommendations (via MarketMuse), then Google Autocomplete
- SEMRush Domain vs. Domain Keywords, then Google Autocomplete
One Big Keyword List
Filtering + Data Manipulations
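The "One Big Keyword List" step can be sketched in a few lines of Python. This is a hedged illustration rather than the KNIME workflow from the talk: `build_keyword_list` is a hypothetical helper, and the normalization rules (lowercasing, collapsing whitespace) are assumptions about what "one list" should mean.

```python
def build_keyword_list(*sources):
    """Merge several keyword sources into one normalized, de-duplicated list.

    Each source is any iterable of keyword strings (for example seed
    keywords, planner suggestions, autocomplete results).
    """
    seen = set()
    merged = []
    for source in sources:
        for keyword in source:
            # Lowercase and collapse repeated whitespace so that
            # "SEO  Tools " and "seo tools" count as the same keyword.
            normalized = " ".join(keyword.lower().split())
            if normalized and normalized not in seen:
                seen.add(normalized)
                merged.append(normalized)
    return merged


seed_keywords = ["SEO Tools", "keyword research"]
autocomplete = ["seo  tools ", "keyword research tools", ""]
big_list = build_keyword_list(seed_keywords, autocomplete)
```

Keeping first-seen order means seed keywords stay at the top of the list, which can make later filtering easier to eyeball.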
Data Manipulations / Calculations
Organic Competition
• Get top 10 results from rank checking API (e.g., GetSTAT)
• Use Moz API nodes and find average PA to assess
competitiveness.
• Optionally, use SEMRush’s Keyword Difficulty API
Search Volume
• Get search volumes via SEMRush API or via GrepWords API
Keyword Trends
• Use 2 years of Google Trends data to calculate slope and
determine growing/declining keywords
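Two of these calculations are simple enough to sketch in plain Python. The functions below are illustrative assumptions, not the KNIME nodes used in the talk: `average_page_authority` averages the PA scores of the top results, and `trend_slope` fits an ordinary least-squares line through a series of Google Trends interest values, treating each value's position in the list as its x coordinate.

```python
def average_page_authority(pa_scores):
    """Average the Page Authority of the top-ranking URLs for a keyword."""
    return sum(pa_scores) / len(pa_scores)


def trend_slope(interest):
    """Least-squares slope of interest values ordered oldest to newest."""
    n = len(interest)
    if n < 2:
        raise ValueError("need at least two data points to fit a slope")
    mean_x = (n - 1) / 2
    mean_y = sum(interest) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(interest))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def classify_trend(interest, tolerance=0.0):
    """Label a keyword as growing, declining, or flat based on its slope."""
    slope = trend_slope(interest)
    if slope > tolerance:
        return "growing"
    if slope < -tolerance:
        return "declining"
    return "flat"


label = classify_trend([50, 40, 30])
```

The `tolerance` parameter is an assumption added for this sketch; a small positive value keeps nearly flat keywords from being labeled as growing or declining.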
String ‘em All Together and then…
Visualize
This top-right quadrant contains
keywords with:
• Low competition
• Good growth
Larger bubbles show higher
search volumes.
You can alternatively use
current rank on the x-axis to
signal organic market share like
a traditional growth-share
matrix.
Search Console
Schedule to run monthly with Cron
and back up to a SQL database:
https://searchwilderness.com/gwmt-data-python/
JR’s BigQuery vision:
http://pshapi.ro/2vmjDe8
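The storage half of that job might look like the sketch below. It is not the script from the linked article: it uses Python's built-in `sqlite3` as a stand-in for whichever SQL database you run, the table and column names are assumptions, and fetching rows from the Search Console API is left out.

```python
import sqlite3

# Hypothetical table layout; one row per date, query and page.
SCHEMA = """
CREATE TABLE IF NOT EXISTS search_analytics (
    date TEXT NOT NULL,
    query TEXT NOT NULL,
    page TEXT NOT NULL,
    clicks INTEGER,
    impressions INTEGER,
    ctr REAL,
    position REAL,
    PRIMARY KEY (date, query, page)
)
"""


def save_rows(connection, rows):
    """Store Search Console rows; re-running the job overwrites duplicates.

    `rows` is an iterable of 7-tuples:
    (date, query, page, clicks, impressions, ctr, position).
    """
    connection.execute(SCHEMA)
    connection.executemany(
        "INSERT OR REPLACE INTO search_analytics "
        "(date, query, page, clicks, impressions, ctr, position) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        rows,
    )
    connection.commit()


# Example with an in-memory database; a scheduled job would use a file path.
connection = sqlite3.connect(":memory:")
save_rows(connection, [("2017-11-01", "seo automation", "/blog/", 12, 340, 0.035, 4.2)])
```

The composite primary key is a design choice for this sketch: it lets the cron job run more than once for the same period without storing the same row twice.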
301 Redirect Mapping from Old URLs
1. Crawl Current Site → Download Rendered Pages → Extract Main Content (BoilerPipe) → Convert to Bitvector
2. Get Historic URLs from Wayback Machine API → Filter Out URLs found on Current Site → Grab Rendered Page from Wayback Machine → Extract Main Content (BoilerPipe) → Convert to Bitvector
3. Cosine Similarity → Generate .htaccess strings
Patrick Stox
https://searchengineland.com/fixing-historical-redirects-using-wayback-machine-apis-257628
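The matching step at the end of this workflow can be sketched in Python. This is an assumption-heavy illustration, not the KNIME workflow itself: crawling, the Wayback Machine API and BoilerPipe extraction are skipped, the function names are hypothetical, and the 0.5 similarity threshold is arbitrary. A set of distinct words stands in for the bit vector, since a binary vector is fully described by which positions are 1.

```python
import math
import re


def to_bit_set(text):
    """Represent text as the set of words present (a sparse bit vector)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity of two binary vectors stored as sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def map_redirects(old_pages, current_pages, threshold=0.5):
    """Build .htaccess lines mapping each old path to its closest current path.

    Both arguments are dicts of URL path -> extracted main content text.
    Old pages with no sufficiently similar current page are skipped.
    """
    current_vectors = {path: to_bit_set(text) for path, text in current_pages.items()}
    rules = []
    for old_path, old_text in old_pages.items():
        old_vector = to_bit_set(old_text)
        best_path, best_score = None, 0.0
        for path, vector in current_vectors.items():
            score = cosine_similarity(old_vector, vector)
            if score > best_score:
                best_path, best_score = path, score
        if best_path is not None and best_score >= threshold:
            rules.append(f"Redirect 301 {old_path} {best_path}")
    return rules


old_pages = {"/old-shoes": "red running shoes for men"}
current_pages = {
    "/shoes/running": "running shoes for men in red",
    "/contact": "contact our team",
}
rules = map_redirects(old_pages, current_pages)
```

Any generated rules are worth a manual review before deployment, since content similarity alone can pair pages that should not redirect to each other.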
SERP Similarity / Overlap
Riyaad Edoo
1. Download ranking data via STAT API
2. Compare results 1-10 for each query against results 1-10 for every other query.
3. Calculate percent similarity.
4. Schedule checks and examine what changed.
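Steps 2 and 3 can be sketched as a pairwise comparison in Python. The sketch assumes the rankings are already downloaded into a dict of query to ordered URL list; the STAT API call is omitted, `serp_similarity` is a hypothetical name, and defining percent similarity as shared URLs divided by the larger result count is one reasonable choice among several.

```python
from itertools import combinations


def serp_similarity(rankings, depth=10):
    """Percent of shared URLs in the top `depth` results for every query pair.

    `rankings` maps each query to its ranked list of URLs.
    Returns a dict of (query_a, query_b) -> percent similarity.
    """
    results = {}
    for (query_a, urls_a), (query_b, urls_b) in combinations(rankings.items(), 2):
        top_a, top_b = set(urls_a[:depth]), set(urls_b[:depth])
        largest = max(len(top_a), len(top_b))
        shared = len(top_a & top_b)
        results[(query_a, query_b)] = 100.0 * shared / largest if largest else 0.0
    return results


rankings = {
    "running shoes": ["u1", "u2", "u3", "u4"],
    "jogging shoes": ["u3", "u4", "u5", "u6"],
}
overlap = serp_similarity(rankings)
```

Storing each run's output makes step 4 a simple comparison of this dict against the previous one.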
Test JavaScript Rendering
http://pshapi.ro/puppetcrawl
Performance Testing with
Lighthouse
http://pshapi.ro/perfpony
CTR
1. Data collection: We collect data on query,
page and associated metrics via the Google
Search Console Search Analytics API.
2. Round average position: We round average
position to the nearest tenth (e.g., 1.19
is rounded to 1.2).
3. Math: We identify outliers using a combination
of statistical methods (modified z-score, IQR).
4. Email: If any negative outliers are identified
for a query and page combination at
an average position, an email with this data
is sent to each of the SEOs
assigned to the account to investigate.
5. Scheduling: Set your script to run on a
recurring basis.
My SEL Article: http://pshapi.ro/2Ae2LYP
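Steps 2 and 3 can be sketched with Python's standard library. This is not the code from the article: the function names are hypothetical, the cutoffs (3.5 for the modified z-score, 1.5 times the IQR) are common conventions assumed here, and requiring both tests to agree is only one way to "combine" them. Data collection and the email step are omitted.

```python
import statistics
from collections import defaultdict


def negative_outliers(ctrs, z_cutoff=3.5, iqr_factor=1.5):
    """Indices of CTR values flagged as low by BOTH modified z-score and IQR."""
    median = statistics.median(ctrs)
    # Median absolute deviation, the spread measure behind the modified z-score.
    mad = statistics.median([abs(ctr - median) for ctr in ctrs])
    q1, _, q3 = statistics.quantiles(ctrs, n=4)
    lower_fence = q1 - iqr_factor * (q3 - q1)
    flagged = []
    for index, ctr in enumerate(ctrs):
        low_by_z = mad > 0 and 0.6745 * (ctr - median) / mad < -z_cutoff
        low_by_iqr = ctr < lower_fence
        if low_by_z and low_by_iqr:
            flagged.append(index)
    return flagged


def find_ctr_outliers(rows, min_group_size=5):
    """Group rows by average position rounded to tenths, then flag low CTRs.

    Each row is a dict with "query", "page", "position" and "ctr" keys.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[round(row["position"], 1)].append(row)
    flagged = []
    for group in groups.values():
        if len(group) < min_group_size:
            continue  # too few rows at this position to judge outliers
        for index in negative_outliers([row["ctr"] for row in group]):
            flagged.append(group[index])
    return flagged
```

The returned rows are the query and page combinations that would go into the alert email in step 4.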
Reddit Data Mining
Reddit Data Mining: Python Script
https://searchwilderness.com/reddit-python-code/
1. Enter filename for output
2. Enter a search or series of searches
3. Choose a Reddit sorting method. For this purpose,
choose ‘new’
4. Choose to look at all of Reddit, or limit to
particular subreddit(s).
5. Schedule with cron to find new topic ideas on a
recurring basis.
Password: fighto
Bulk Check AMP
Pages with
AMPBench API
Python Script:
http://pshapi.ro/2AHlNaE
Requires:
• Python
• Requests package
Ideally AMPBench would run locally, but
it can be run off the appspot demo URL.
Link Building: Prospecting with Competitors
Tech Audit Related Site Changes
• New 404s
• New redirects
• Changes to robots.txt
• Content-based changes over time
• Indexation changes
• New pages
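Several of these checks reduce to comparing two crawl snapshots. The Python sketch below is a hedged illustration: it assumes each crawl has already been reduced to a dict of URL to HTTP status code, the function names are hypothetical, and content and indexation changes are not covered.

```python
import difflib

REDIRECT_CODES = {301, 302, 307, 308}


def compare_crawls(previous, current):
    """Report new pages, new 404s and new redirects between two crawls.

    Both arguments map URL -> HTTP status code.
    """
    return {
        "new_pages": sorted(url for url in current if url not in previous),
        "new_404s": sorted(
            url for url, status in current.items()
            if status == 404 and previous.get(url) != 404
        ),
        "new_redirects": sorted(
            url for url, status in current.items()
            if status in REDIRECT_CODES and previous.get(url) not in REDIRECT_CODES
        ),
    }


def robots_txt_changes(old_text, new_text):
    """Return the added (+) and removed (-) lines between two robots.txt files."""
    diff = difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        "previous", "current", lineterm="",
    )
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


previous = {"/a": 200, "/b": 200, "/c": 200}
current = {"/a": 200, "/b": 404, "/c": 301, "/d": 200}
report = compare_crawls(previous, current)
```

Scheduling this after each crawl and saving the report gives a running log of site changes to review.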
CatalystDigital.com
Paul Shapiro
https://searchwilderness.com
@fighto
Thanks!