Learn about the different things you can and cannot automate in SEO, saving you time and enabling more advanced work. Discover free tools, such as KNIME, and learn how to use them to begin your automation efforts. Finally, learn what an API is and how it can help you in your SEO work.
3. Paul Shapiro | @fighto
Why Automate?
1. Work faster and free up time for the important stuff
2. Look at more data
3. Improved consistency and procedure
4. Paul Shapiro | @fighto
What Can You Automate?
• If you’re doing something on a routine basis, it can probably be automated (but that doesn’t mean it should be).
• Any procedure that can be broken down into smaller micro-tasks can be handled by a computer.
• Machine learning can help with more complex decision making (think of the power of AlphaGo).
8. Paul Shapiro | @fighto
How to Conceptualize Automation:
Break into Micro-tasks
9. Paul Shapiro | @fighto
How to Work with APIs
API Endpoint:
http://api.grepwords.com/lookup?apikey=random_string&q=keyword
Simple API key authentication via a GET request:
• apikey=random_string: this string is unique to you (authentication)
• q=keyword: this variable changes and is often looped over
10. Paul Shapiro | @fighto
How to Work with APIs
http://api.grepwords.com/lookup?apikey=secret&q=board+games
Simple API key authentication via a GET request
Output (JSON):
[{"keyword":"board games","updated_cpc":"2018-04-30","updated_cmp":"2018-04-30","updated_lms":"2018-04-30","updated_history":"2018-04-30","lms":246000,"ams":246000,"gms":246000,"competition":0.86204091185173,"competetion":0.86204091185173,"cmp":0.86204091185173,"cpc":0.5,"m1":201000,"m1_month":"2018-02","m2":246000,"m2_month":"2018-01","m3":450000,"m3_month":"2017-12","m4":368000,"m4_month":"2017-11","m5":201000,"m5_month":"2017-10","m6":201000,"m6_month":"2017-09","m7":201000,"m7_month":"2017-08","m8":201000,"m8_month":"2017-07","m9":201000,"m9_month":"2017-06","m10":201000,"m10_month":"2017-05","m11":201000,"m11_month":"2017-04","m12":201000,"m12_month":"2017-03"}]
11. Paul Shapiro | @fighto
How to Work with APIs
Most API Outputs:
1. JSON
2. XML
3. CSV
12. Paul Shapiro | @fighto
How to Work with APIs
Last Step:
Parse it!
13. Paul Shapiro | @fighto
How to Work with APIs
Parsing Example Using Python:
import json

json_string = '[{"keyword":"board games","updated_cpc":"2018-04-30","updated_cmp":"2018-04-30","updated_lms":"2018-04-30","updated_history":"2018-04-30","lms":246000,"ams":246000,"gms":246000,"competition":0.86204091185173,"competetion":0.86204091185173,"cmp":0.86204091185173,"cpc":0.5,"m1":201000,"m1_month":"2018-02","m2":246000,"m2_month":"2018-01","m3":450000,"m3_month":"2017-12","m4":368000,"m4_month":"2017-11","m5":201000,"m5_month":"2017-10","m6":201000,"m6_month":"2017-09","m7":201000,"m7_month":"2017-08","m8":201000,"m8_month":"2017-07","m9":201000,"m9_month":"2017-06","m10":201000,"m10_month":"2017-05","m11":201000,"m11_month":"2017-04","m12":201000,"m12_month":"2017-03"}]'

parsed_json = json.loads(json_string)
print(parsed_json[0]['gms'])
14. Paul Shapiro | @fighto
How to Work with APIs
Full Python Script:
import requests
import json

r = requests.get('http://api.grepwords.com/lookup?apikey=secretapikey&q=board+games')
parsed_json = json.loads(r.text)
print(parsed_json[0]['gms'])
18. Paul Shapiro | @fighto
Why KNIME?
• Fast way to put together complex analyses
• Great for prototyping
• Large library of built-in “nodes”
• Free/open source
• Runs on Windows/Mac/Linux
• Very expandable, even compatible with R, Python, Java, JavaScript
• Easy enough for non-technical staff to grasp
20. Paul Shapiro | @fighto
Other Options
• Scripting Languages
• Python
• Ruby
• Node.js
• Go
• R
• Excel with VBA
• Google Sheets
21. Paul Shapiro | @fighto
Cron & Windows Task Scheduler Are Your Friends
22. Paul Shapiro | @fighto
What is Cron and Why?
• *NIX system daemon used to schedule tasks and
scripts.
• Windows Task Manager is the Windows equivalent
of Cron.
• This way we can schedule scripts and programs that
perform automated tasks on a recurring, scheduled
basis.
23. Paul Shapiro | @fighto
Quick How To
* * * * * /command/to/execute
Fields, left to right:
1. Minute (0-59)
2. Hour (0-23)
3. Day of Month (1-31)
4. Month (1-12)
5. Day of Week (0-6, Sunday = 0)
24. Paul Shapiro | @fighto
Run at Midnight on the First of Every Month
0 0 1 * * python datacollector.py
26. Paul Shapiro | @fighto
What is a Node?
• Nodes are prebuilt, drag-and-drop modules designed to perform a single task
• Nodes are strung together like a chain to accomplish larger, more complex tasks
• Nodes can be grouped together into “meta-nodes”, which can be configured in unison
27. Paul Shapiro | @fighto
How Do You Add Nodes &
How Do They Connect?
How do you add nodes to your “workflow”?
How do you string nodes together?
28. Paul Shapiro | @fighto
How Do You Configure & Run Nodes?
Configuring Nodes
Running Workflows
30. Paul Shapiro | @fighto
Most Keyword Research Looks Like This
31. Paul Shapiro | @fighto
Typical Time Investment for Keyword Research
Hours to Complete Keyword Research by Site Size:

Site Size                   Average (Low End)   Average (High End)
Micro (0-49 pages)          5                   6
Small (50-99 pages)         8                   10
Medium (100-249 pages)      12                  18
Large (250-499 pages)       21                  22
Extra Large (>500 pages)    21                  28
33. Paul Shapiro | @fighto
Keyword sources:
• Seed Keywords: list, GWMT, SEMRush competitor keywords, SQR keywords
• Keyword Planner Suggestions (via GrepWords)
• Semantic Keyword Recommendations (via MarketMuse)
• SEMRush Domain vs. Domain Keywords
• Google Autocomplete (applied at multiple points in the workflow)
All sources are filtered and combined through data manipulations into one big keyword list.
34. Paul Shapiro | @fighto
Data Manipulations / Calculations
Organic Competition:
• Get top 10 results from a rank checking API (e.g., GetSTAT)
• Use Moz API nodes and find the average PA to assess competitiveness
• Optionally, use SEMRush’s Keyword Difficulty API
Search Volume:
• Get search volumes via the SEMRush API or the GrepWords API
Keyword Trends:
• Use 2 years of Google Trends data to calculate slope and determine growing/declining keywords
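To make the keyword-trends step concrete, here is a minimal sketch (not the deck's actual KNIME nodes) that fits a least-squares slope to 24 months of Google Trends-style interest values; the sample keywords, numbers, and labels are illustrative assumptions.

# Minimal sketch: flag growing/declining keywords from 24 months of
# Google Trends-style interest values. Sample data is illustrative.
import numpy as np

def trend_slope(monthly_values):
    """Return the least-squares slope of interest over time."""
    y = np.asarray(monthly_values, dtype=float)
    x = np.arange(len(y))
    slope, _intercept = np.polyfit(x, y, 1)
    return slope

keywords = {
    "board games": [40, 42, 45, 43, 47, 50, 52, 55, 53, 58, 60, 63,
                    61, 66, 68, 70, 72, 71, 75, 78, 80, 82, 85, 88],
    "cd players":  [30, 29, 28, 28, 27, 26, 25, 25, 24, 23, 22, 22,
                    21, 20, 20, 19, 18, 18, 17, 16, 16, 15, 14, 14],
}

for kw, values in keywords.items():
    s = trend_slope(values)
    label = "growing" if s > 0 else "declining"
    print(f"{kw}: slope={s:.2f} ({label})")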
37. Paul Shapiro | @fighto
String ‘em All Together and then…
38. Paul Shapiro | @fighto
Visualize
This top-right quadrant contains keywords with:
• Low competition
• Good growth
Larger bubbles show higher search volumes.
You can alternatively use current rank on the x-axis to signal organic market share, like a traditional growth-share matrix.
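A minimal plotting sketch of that quadrant view, assuming a small DataFrame with competition, trend slope, and search volume columns (all column names and numbers are illustrative, not from the deck):

# Sketch: bubble chart with competition on x, trend slope (growth) on y,
# and bubble size proportional to search volume. Column names are assumed.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "keyword":       ["board games", "card games", "cd players"],
    "competition":   [0.86, 0.45, 0.20],
    "trend_slope":   [1.8, 0.9, -1.2],
    "search_volume": [246000, 74000, 12000],
})

plt.scatter(df["competition"], df["trend_slope"],
            s=df["search_volume"] / 500, alpha=0.5)
for _, row in df.iterrows():
    plt.annotate(row["keyword"], (row["competition"], row["trend_slope"]))

plt.axhline(0, linewidth=0.5)    # growth vs. decline
plt.axvline(0.5, linewidth=0.5)  # arbitrary competition cut-off
plt.gca().invert_xaxis()         # so low competition lands in the top-right quadrant
plt.xlabel("Competition")
plt.ylabel("Keyword trend (slope)")
plt.title("Keyword opportunity matrix")
plt.show()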
47. Paul Shapiro | @fighto
Search Console
Schedule to run monthly with cron and back up to a SQL database:
https://searchwilderness.com/gwmt-data-python/
JR Oakes’ BigQuery vision:
http://pshapi.ro/2vmjDe8
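The linked article contains the full script; purely as a rough sketch under stated assumptions (a service-account JSON key, the webmasters v3 API, an example property URL, and an illustrative SQLite table), a monthly pull-and-backup might look like this:

# Sketch: pull last month's Search Console data and append it to SQLite.
# The key file, property URL, and table layout are placeholders.
import sqlite3
from datetime import date, timedelta

from google.oauth2 import service_account
from googleapiclient.discovery import build

creds = service_account.Credentials.from_service_account_file(
    "service_account.json",
    scopes=["https://www.googleapis.com/auth/webmasters.readonly"])
service = build("webmasters", "v3", credentials=creds)

SITE = "https://www.example.com/"
end = date.today() - timedelta(days=3)   # GSC data lags a few days
start = end - timedelta(days=30)

body = {
    "startDate": start.isoformat(),
    "endDate": end.isoformat(),
    "dimensions": ["query", "page"],
    "rowLimit": 25000,
}
rows = service.searchanalytics().query(siteUrl=SITE, body=body).execute().get("rows", [])

conn = sqlite3.connect("gsc_backup.db")
conn.execute("""CREATE TABLE IF NOT EXISTS search_analytics
                (query TEXT, page TEXT, clicks REAL, impressions REAL,
                 ctr REAL, position REAL, start_date TEXT, end_date TEXT)""")
conn.executemany(
    "INSERT INTO search_analytics VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [(r["keys"][0], r["keys"][1], r["clicks"], r["impressions"],
      r["ctr"], r["position"], start.isoformat(), end.isoformat()) for r in rows])
conn.commit()
conn.close()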
48. Paul Shapiro | @fighto
301 Redirect Mapping
from Old URLs
49. Paul Shapiro | @fighto
301 Redirect Mapping from Old URLs
Current site branch:
1. Crawl the current site
2. Download rendered pages
3. Extract main content (BoilerPipe)
4. Convert to bitvector
Old site branch:
1. Get historic URLs from the Wayback Machine API
2. Filter out URLs found on the current site
3. Grab rendered pages from the Wayback Machine
4. Extract main content (BoilerPipe)
5. Convert to bitvector
Then:
• Cosine similarity between old and new bitvectors
• Generate .htaccess strings
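A hedged sketch of the matching step only, with crawling and content extraction already done: binary bag-of-words "bitvectors" compared by cosine similarity, and the best match emitted as an .htaccess rule. The sample page texts and the use of scikit-learn are my assumptions, not the deck's exact KNIME nodes.

# Sketch: match old URLs to new URLs by cosine similarity of their
# extracted main content, then print .htaccess 301 rules.
from urllib.parse import urlparse
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

old_pages = {  # old URL -> extracted main content (illustrative)
    "http://old.example.com/board-games-guide": "best board games for families ...",
    "http://old.example.com/about-us": "about our company history team ...",
}
new_pages = {  # new URL -> extracted main content (illustrative)
    "https://www.example.com/guides/board-games": "best board games for families ...",
    "https://www.example.com/company/about": "about our company history team ...",
}

vectorizer = CountVectorizer(binary=True)  # "bitvector" representation
matrix = vectorizer.fit_transform(list(old_pages.values()) + list(new_pages.values()))
old_vecs = matrix[: len(old_pages)]
new_vecs = matrix[len(old_pages):]

similarity = cosine_similarity(old_vecs, new_vecs)

new_urls = list(new_pages.keys())
for i, old_url in enumerate(old_pages):
    best = similarity[i].argmax()
    old_path = urlparse(old_url).path
    print(f"Redirect 301 {old_path} {new_urls[best]}  # score={similarity[i][best]:.2f}")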
52. Paul Shapiro | @fighto
1. Download ranking data via the STAT API
2. Compare the results from 1-10 for each query against the results from 1-10 for every other query
3. Calculate percent similarity
4. Schedule checks and examine what changed
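A minimal sketch of steps 2 and 3, assuming you already have the top-10 URLs per query from the STAT API (the sample data and the overlap definition are assumptions):

# Sketch: percent similarity between queries based on shared top-10 URLs.
from itertools import combinations

serps = {  # query -> top 10 ranking URLs (illustrative, truncated)
    "board games": ["example.com/a", "example.com/b", "example.com/c"],
    "best board games": ["example.com/a", "example.com/b", "example.com/d"],
    "card games": ["example.com/x", "example.com/y", "example.com/z"],
}

for q1, q2 in combinations(serps, 2):
    shared = set(serps[q1]) & set(serps[q2])
    pct = 100 * len(shared) / max(len(serps[q1]), len(serps[q2]))
    print(f"{q1!r} vs {q2!r}: {pct:.0f}% similar")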
58. Paul Shapiro | @fighto
CTR
1. Data collection: Collect data on query, page, and associated metrics via the Google Search Console Search Analytics API.
2. Round average position: Round average position to the tenths decimal place (e.g., 1.19 is rounded to 1.2).
3. Math: Identify outliers using a combination of statistical methods for identifying outliers (modified z-score, IQR).
4. Email: If any negative outliers are identified for a keyword query and page combination at an average position, an email with all of this data is sent to each of the SEOs assigned to the account to investigate.
5. Scheduling: Set your script to run on a recurring basis.
My SEL Article: http://pshapi.ro/2Ae2LYP
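A hedged sketch of the outlier math only, using the modified z-score on CTR within one rounded-position bucket; the sample rows and the common 3.5 cut-off are assumptions rather than the exact parameters from the article:

# Sketch: flag query/page rows whose CTR is a negative outlier for their
# rounded average position, using the modified z-score (MAD-based).
import numpy as np

rows = [  # (query, page, rounded position, ctr) - illustrative data
    ("board games", "/games/", 1.2, 0.31),
    ("card games", "/cards/", 1.2, 0.29),
    ("dice games", "/dice/", 1.2, 0.30),
    ("party games", "/party/", 1.2, 0.07),   # suspiciously low CTR
]

ctrs = np.array([r[3] for r in rows])
median = np.median(ctrs)
mad = np.median(np.abs(ctrs - median))
mod_z = 0.6745 * (ctrs - median) / mad  # modified z-score

for row, z in zip(rows, mod_z):
    if z < -3.5:  # common outlier threshold; negative = underperforming
        print(f"Negative CTR outlier: {row} (modified z = {z:.1f})")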
60. Paul Shapiro | @fighto
Reddit Data Mining
Reddit Data Mining: Python Script
https://searchwilderness.com/reddit-python-code/
1. Enter filename for output
2. Enter a search or series of searches
3. Choose reddit sorting method. For this purpose, choose ‘new’
4. Choose to look at all of reddit, or isolate to particular subreddit(s)
5. Schedule with cron to find new topic ideas on a recurring basis
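The linked script is the canonical version; purely as an illustrative alternative, here is a sketch using the PRAW library (client credentials, query, and subreddit are placeholders):

# Sketch: search reddit for a query, sorted by new, and print topic ideas.
# Client ID/secret and user agent are placeholders you must supply.
import praw

reddit = praw.Reddit(
    client_id="YOUR_CLIENT_ID",
    client_secret="YOUR_CLIENT_SECRET",
    user_agent="seo-topic-miner/0.1",
)

# Search all of reddit (or replace "all" with a specific subreddit name).
for submission in reddit.subreddit("all").search("board games", sort="new", limit=25):
    print(submission.created_utc, submission.subreddit, submission.title, submission.url)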
62. Paul Shapiro | @fighto
Bulk Check AMP Pages with AMPBench API
Python Script: http://pshapi.ro/2AHlNaE
Requires:
• Python
• Requests package
Ideally AMPBench would run locally, but it can be run off the appspot demo URL.
64. Paul Shapiro | @fighto
Link Building: Prospecting with Competitors
Scheduled use of the Ahrefs API:
http://apiv2.ahrefs.com/?from=backlinks_new_lost&limit=10&target=competitor.com&where=type:%22new%22,date:%222017-06-01%22&mode=domain&output=json&token=your_personal_api_key
Parsed Results + SMTP = Link Opportunities
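A rough sketch of that pipeline using the endpoint shown above; the email addresses, SMTP host, and the decision to email the raw JSON rather than a parsed report are assumptions:

# Sketch: pull a competitor's new backlinks from the Ahrefs API and email them.
import json
import smtplib
from email.message import EmailMessage

import requests

API_URL = ("http://apiv2.ahrefs.com/?from=backlinks_new_lost&limit=10"
           "&target=competitor.com&where=type:%22new%22,date:%222017-06-01%22"
           "&mode=domain&output=json&token=your_personal_api_key")

data = requests.get(API_URL).json()

msg = EmailMessage()
msg["Subject"] = "New competitor backlinks (link opportunities)"
msg["From"] = "alerts@example.com"
msg["To"] = "seo-team@example.com"
msg.set_content(json.dumps(data, indent=2))  # parse/format further as needed

with smtplib.SMTP("smtp.example.com") as server:  # placeholder SMTP host
    server.send_message(msg)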
66. Paul Shapiro | @fighto
• Detect new 404s and other errors
• New redirects
• Changes to robots.txt
• Content based changes over time
• Indexation changes
• New pages created
• Changes in rank or traffic
(rank checking API or Search Console)
Use a cloud crawler like Botify/DeepCrawl with an API, or a custom solution
67. Paul Shapiro | @fighto
Custom Solution
1. Run Screaming Frog in the cloud with lots of RAM:
• Amazon AWS: http://ipullrank.com/how-to-run-screaming-frog-and-url-profiler-on-amazon-web-services/
• Google Cloud: https://online.marketing/guide/screaming-frog-in-google-cloud/
2. Activate with the command line and Task Scheduler for scheduling
3. Use a macro program like RoboTask to generate reports and send them to a particular folder
4. Download via FTP or dump to a SQL database for analysis
5. Analysis produces alerts with SMTP
69. Paul Shapiro | @fighto
Process for Semi-Automated Meta Descriptions
1. Download webpage body contents
2. Run through text summarization engine(s) to produce small snippets of important page text
3. Have a person edit to avoid truncation and improve language
https://searchengineland.com/reducing-the-time-it-takes-to-write-meta-descriptions-for-large-websites-299887
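As a rough sketch of step 2 under stated assumptions (a deliberately naive frequency-based extractive summarizer rather than whichever engines the article uses), the snippet below pulls the highest-scoring sentence from a page's body text as a draft for human editing:

# Sketch: naive frequency-based extractive summary of page body text,
# used as a starting point for a human-edited meta description.
import re
from collections import Counter

def draft_meta_description(body_text, max_sentences=1):
    sentences = re.split(r"(?<=[.!?])\s+", body_text.strip())
    words = re.findall(r"[a-z']+", body_text.lower())
    freq = Counter(words)

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    ranked = sorted(sentences, key=score, reverse=True)
    return " ".join(ranked[:max_sentences])

page_text = ("Board games bring families together. Our guide reviews the best "
             "board games of the year, from strategy games to party games. "
             "Shipping is free on orders over $25.")
print(draft_meta_description(page_text))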
71. Paul Shapiro | @fighto
Recognize Images
• You can use custom machine learning options (https://www.tensorflow.org/tutorials/image_recognition), but it’s easier and more effective to use an API in this context.
• For APIs, you have options:
• Microsoft Computer Vision: https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/
• Google Cloud Vision API: https://cloud.google.com/vision/
• CloudSight: https://cloudsight.ai/
1. Download all images without an alt attribute
2. Run through an API and get a caption (not perfect, but better than nothing)
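A minimal sketch of step 2 using the Google Cloud Vision client library; label detection (rather than full captioning) is my substitution here, and the image path and the way labels are joined into alt text are assumptions:

# Sketch: generate rough alt-text candidates from image labels via
# Google Cloud Vision. Requires GOOGLE_APPLICATION_CREDENTIALS to be set.
from google.cloud import vision

client = vision.ImageAnnotatorClient()

def suggest_alt_text(image_path, max_labels=3):
    with open(image_path, "rb") as f:
        image = vision.Image(content=f.read())
    response = client.label_detection(image=image)
    labels = [label.description for label in response.label_annotations[:max_labels]]
    return ", ".join(labels)

print(suggest_alt_text("missing-alt-image.jpg"))  # e.g. "board game, dice, tabletop"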