30% of travel industry website visitors are unsavory competitors, hackers, spammers, and fraudsters. Fact is, travel suppliers, OTAs, and metasearch sites are all being scraped by bots, which hurts their marketing metrics, SEO, website performance, and customer loyalty.
View this presentation to understand:
- The prevalence and impact of bots on your website
- How to improve your online KPIs
- How to identify and block fraudsters and scrapers
- When a web scraper is actually good
- The future of online travel and website security
2. Speakers
Rami Essaid
CEO & Co-founder
Distil Networks
Speaker
Orion Cassetto
Dir. of Product Marketing
Distil Networks
Moderator
3. Good Bots
Search Engine Crawling
Power APIs
Check system connectivity and status
Bad Bots
Steal content
Scan for vulnerabilities
Perform Fraud
etc.
The Basics of Bots
A “Bot” is an automated program that runs on the internet
5. Three Layers of Business Damage Caused by Malicious Bots
Bots hurt your KPIs...
Slowdowns, downtime and poor CX
Decline in website traffic
Loss of revenue and customer base
7. What Kind of Data is Being Scraped?
Customer Data
Pricing Info
Editorial Content
Incentive Packages
Reviews
Keyword Placement
SEO Optimization
8. What Are Scrapers Doing with Your Travel Site?
Posting your content on competitor sites
Scrapers steal your traffic and advertising dollars. Duplicative content and high bounce rates diminish your SEO
Undermining your prices
Bots monitor your prices, ensuring competitors can undercut with lower price listings
Executing searches on your site
The resulting API calls to third parties can cost you
9. According to Ryanair, unauthorized scrapers frequently
○ Added excessive charges to European customers
○ Failed or refused to pass on vital information like customer contact info, flight changes, web check-in, and info on special needs
○ Caused missed flights and repeated problems for customers
Unauthorized Scraping Causes Problems for Passengers and Airlines Alike
10. EasyJet complained of a 60% increase in ticket prices due to eDreams adding excessive fees to its tickets
Affected over 300,000 customers during the 6-month monitoring period
EasyJet implemented bot detection tech to make informed decisions about automated clients
Aggregators Add Excessive Fees to Airline Tickets
“Utilizing an automated system for detecting scrapers gives us the information we need to act on each situation, and block individuals if we wish to.”
-- Jerry Dunn
Distribution Development Manager, EasyJet
11. Add-on sales like upgrades, travel insurance, etc. result in an average of $20 to $40 of additional revenue per sale for airlines
When scrapers insert themselves in the sale as middlemen, the upsell/cross-sell opportunity moves to their businesses
Web scrapers and travel aggregators may also charge referral fees or ask for volume discounts from airlines or hotel chains
Scraping Causes a Loss of Upsell and Cross-sell Opportunities
Source: http://www.eyefortravel.com/mobile-and-technology/scraping-single-biggest-threat-travel-industry
12. Negative SEO Attacks Damage Relevancy
Bots steal content, product lists, and prices for duplication elsewhere on the Internet
Duplicated content reduces your company’s uniqueness and thus quality score
SEO damage may result, especially if
○ Your prices are undercut
○ The content is repurposed on a more popular site
Duplicate Content Results in Diminished SEO
13. Bots Break Into Accounts with Stolen Passwords
Brute Force Account Takeover
Using a bot to try usernames and passwords stolen in breaches at other websites on your site
Newly compromised accounts are then used for various forms of fraud/theft
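One way to spot this pattern on the login endpoint is to watch for a single client failing logins against many different usernames in a short time. The sketch below is a minimal illustration under stated assumptions, not a description of any vendor's product: the `LoginMonitor` class, its thresholds, and the `client_id` value are all hypothetical. Because bots rotate through proxies, `client_id` is assumed to be something more stable than an IP address, such as a device or browser fingerprint.

```python
from collections import defaultdict, deque


class LoginMonitor:
    """Flags clients that fail logins for many distinct usernames in a time window."""

    def __init__(self, window_seconds=300, max_distinct_users=5):
        # Both thresholds are illustrative assumptions; tune them per site.
        self.window_seconds = window_seconds
        self.max_distinct_users = max_distinct_users
        self._failures = defaultdict(deque)  # client_id -> deque of (timestamp, username)

    def record_failure(self, client_id, username, timestamp):
        """Record one failed login; return True if the client now looks automated."""
        events = self._failures[client_id]
        events.append((timestamp, username))
        # Drop failures that have fallen out of the sliding window.
        while events and timestamp - events[0][0] > self.window_seconds:
            events.popleft()
        distinct_users = {user for _, user in events}
        return len(distinct_users) > self.max_distinct_users
```

A human who mistypes a password fails repeatedly on one username, so counting distinct usernames rather than raw failures keeps false positives low for legitimate users.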
14. The Ashley Madison breach released 32 million log-in credentials into the wild
Account takeover and transaction fraud have significantly increased
Lost or stolen credentials have been the top cause of data breaches since 2010
Online Fraud Boosted by Ashley Madison Breach
Source: Verizon Data Breach Investigations Report 2015
15. Brute Force Attacks Used to Pilfer Loyalty Programs
Loyalty programs are low-hanging fruit
Loyalty programs are frequent targets for hackers
Legacy systems were secured with 4-digit PINs
Points can be used for air travel, rental cars, dining, and shopping.
16. Traffic from unnecessary bots inflates “Look-to-book” ratios
Blocking unnecessary bots improves KPI tracking and analytic data accuracy
Bots Skew Key Website Analytics
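To make the skew concrete, here is a small worked example. The search and booking counts are invented for illustration, and it assumes bad bots account for 23% of searches (borrowing the bad-bot traffic share quoted for the average travel site) and that bots never book.

```python
def look_to_book(searches, bookings):
    """Return the look-to-book ratio: searches performed per completed booking."""
    if bookings <= 0:
        raise ValueError("bookings must be positive")
    return searches / bookings


# Hypothetical monthly figures, for illustration only.
total_searches = 100_000
bookings = 500
bad_bot_share = 0.23  # assumed share of searches generated by bad bots

bot_searches = round(total_searches * bad_bot_share)
human_searches = total_searches - bot_searches

raw_ratio = look_to_book(total_searches, bookings)
filtered_ratio = look_to_book(human_searches, bookings)

print(f"Reported look-to-book: {raw_ratio:.0f}:1")
print(f"Without bad bots:      {filtered_ratio:.0f}:1")
```

Under these assumptions the reported ratio is 200:1, while the ratio for real visitors is 154:1, so conversion looks noticeably worse than it is until the bot traffic is removed.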
17. Roughly 23% of traffic on the average travel website is from bad bots
Bot traffic frequently causes websites to experience performance issues or brownouts
Garbage Bot Traffic Increases Costs and Infrastructure Utilization
18. Challenges vs. Distil Results
Challenge: Bots caused brownouts, which led to immediate loss of revenue
Distil result: Increased uptime from 99.6% to 99.9% (no downtime for the first time in five years)
Challenge: Bots can hurt Google quality score and SEO
Distil result: Improved SEO and Google quality score
Challenge: Homegrown IP blocking wasn’t working; bots came in through proxies and used spoofed IP addresses
Distil result: Automated bot defense identified and blocked more than 99.99% of bots
Redtag.ca Protects SEO and Uptime by Blocking Bad Bots
Redtag.ca specializes in finding fantastic travel deals on
vacations, flights, cruises, hotels and car rentals.
“With Distil, we increased our uptime from 99.6% to 99.9%, reduced infrastructure costs and eliminated costly bot-driven API calls. Bots, be gone!”
-Rob Gennaro, Digital Marketing Officer, Red Label Vacations
19. Reviewing The Impact Bots Have on Travel Site
Profits
Customers transfer loyalty to 3rd party sites
Add-on sales happen on 3rd party sites
Excessive fees increase prices
SEO damage reduces web site searchability
Order errors and ToS breaches cause poor experiences
Loyalty programs hacked
20. Good bots make up over 35% of all traffic to the average website
○ Search engines - Google, Bing, Baidu, etc.
○ Alexa Crawler
○ Pingdom, Keynote, etc.
○ Vulnerability Scanners
○ etc.
Effective solutions block bad bots but leave good bots unhindered
The Importance of Accurately Identifying Good Bots
Source: Distil Networks, 2015 Bad Bot Landscape Report
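Because bad bots often claim a search engine's user agent, a good bot should be confirmed rather than taken at its word. A widely documented check for major search crawlers is forward-confirmed reverse DNS: resolve the client IP to a hostname, check that the hostname belongs to the crawler operator, then resolve that hostname back and confirm it includes the same IP. The sketch below is a minimal illustration of that check; the function name and the short suffix list are assumptions and would need extending for other good bots.

```python
import socket

# Hostname suffixes used by Google and Bing crawlers; extend for other good bots.
TRUSTED_SUFFIXES = (".googlebot.com", ".google.com", ".search.msn.com")


def _reverse_dns(ip):
    """Return the primary hostname registered for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def _forward_dns(hostname):
    """Return the list of IPv4 addresses a hostname resolves to."""
    return socket.gethostbyname_ex(hostname)[2]


def is_verified_crawler(ip, reverse_lookup=_reverse_dns, forward_lookup=_forward_dns):
    """Forward-confirmed reverse DNS check for a client claiming to be a crawler."""
    try:
        hostname = reverse_lookup(ip).rstrip(".").lower()
        if not hostname.endswith(TRUSTED_SUFFIXES):
            return False
        # The hostname must resolve back to the same IP, or the PTR record is forged.
        return ip in forward_lookup(hostname)
    except OSError:
        # Lookup failures (no PTR record, DNS errors) are treated as unverified.
        return False
```

The lookup functions are passed in as parameters so results can be cached or stubbed out; in production, DNS lookups on the request path would normally be cached to avoid adding latency.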
21. Partners in Disguise
Many meta search sites get their start from scraping. Once revenues appear they license API access.
Site Indexing
Search engine bots scour and prioritize content to drive inbound traffic to your site.
When Site Scraping Should be Sanctioned
22. The First Easy and Accurate Way to Defend Websites Against Malicious Bots
23. The World’s Most Accurate Bot Detection System
Inline Fingerprinting
Fingerprints stick to the bot even if it attempts to reconnect from random IP addresses or hide behind an anonymous proxy.
Known Violators Database
Real-time updates from the world’s largest Known Violators Database, which is based on the collective intelligence of all Distil-protected sites.
Browser Validation
The first solution to disallow browser spoofing by validating that each incoming request comes from the browser it reports itself to be and by detecting all known browser automation tools.
Behavioral Modeling and Machine Learning
Machine-learning algorithms pinpoint behavioral anomalies specific to your site’s unique traffic patterns.
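The slides do not describe how Distil builds its fingerprints, so the following is only a generic sketch of the underlying idea: derive an identifier from request attributes other than the IP address, so that the identifier survives a change of proxy. The header list and the `fingerprint` function are assumptions for illustration; real systems use many more signals than a handful of headers.

```python
import hashlib

# Headers assumed to stay stable for one client across requests (illustrative choice).
FINGERPRINT_HEADERS = ("user-agent", "accept", "accept-language", "accept-encoding")


def fingerprint(headers):
    """Hash selected request headers into an identifier that ignores the client IP."""
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    parts = [f"{name}={normalized.get(name, '')}" for name in FINGERPRINT_HEADERS]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


# The same client arriving through two different proxies yields one fingerprint,
# because the source IP is not part of the hash.
request_headers = {
    "User-Agent": "ExampleScraper/1.0",
    "Accept": "*/*",
    "Accept-Language": "en-US",
    "Accept-Encoding": "gzip",
}
print(fingerprint(request_headers))
```

Header-only fingerprints are easy for a determined bot to vary, which is why this works best as one signal alongside browser validation and behavioral checks rather than on its own.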
24. How Travel Companies Benefit from Distil
Increase insight & control over human, good bot & bad bot traffic
Block 99.9% of malicious bots without impacting legitimate users
Slash the high tax bots place on internal teams & web infrastructure
Protect data from web scrapers, unauthorized aggregators & hackers
Rami slide
We just want to introduce what bots are, the fact that they are both good and bad and how many of each exist on the average site.
Rami slide
This problem is here to stay. In fact, it is growing because of cheap and plentiful resources and ready-made tools to perform the attacks.
Rami slide
This sets the stage for the depth of the bot problem. Most people think of bots as scrapers, but these scrapers do all sorts of business logic attacks and can affect businesses in a number of ways. These include
Site slowdown from massive amounts of garbage traffic
Website traffic declines from SEO attacks and content theft, which may drive users to other websites for the same content
Lost revenue. When leads and data go elsewhere, so do users and their loyalty; right behind that is the revenue they bring.
Rami Slide
We want to introduce the different ways that bots impact travel businesses. To do so, we’re going to talk about some of the various aspects of online businesses from loyalty to searchability and provide examples of bots impacting each of them.
The point of this slide is to talk about the different kinds of data that can be stolen via scraping.
Customer data - Itineraries, contact information
Pricing info - prices, availability, vendors
Editorial content - unique articles about destinations, venues, etc.
Incentive packages - bundled deals, which brands typically use to overcome scrapers, can themselves be scraped and mimicked
User reviews - think travel advisor reviews
Keyword placement
SEO optimization
Travel package margins are tight
All of their prices should be on par with yours
Competitors will drop their prices on packages based on time of day
Late at night
Reviews
Diminishing data quality
Rami Slide-
This slide is a case study about Ryanair. The business impacts here are cost (excessive fees) and customer satisfaction/user experience (scrapers and aggregators not passing on vital info causes missed flights, which pisses off users).
Rami Slide -
The business values impacted in this case study are:
Costs to customers - excessive fees raise prices
http://www.travolution.co.uk/articles/2013/06/05/6785/easyjet-calls-for-crackdown-on-screen-scraping-website.html
Rami Slide
The business value here is revenue-making opportunities. Cross-sell and upsell opps make up a sizable amount of travel industry revenue. When sales happen on scrapers' sites or unlicensed aggregators, the upsell opportunities are transferred to these sites and thus there is a loss of revenue.
TL;DR: Bots take info to other sites. Customers follow the data. They purchase there and the upsell opportunity is lost.
Slide Owner: Rami
The business impact here is findability.
SEO Attacks and duplicated content make it harder for customers to find relevant websites (travel brands).
Slide Owner: Rami
This example takes a couple of slides, but the impact is that fraud hurts customer satisfaction and company profits. The first slide will explain how bots are involved with account takeover and fraud.
Rami -
The point: account takeover and fraud are not going away. They are low-hanging fruit, and the problem is growing with these huge password list dumps.
According to the 2011 to 2015 Verizon Data Breach Investigations Reports, lost or stolen credentials have been the #1 cause of data breaches. 2015 is likely to continue this trend due to Ashley Madison.
http://www.verizonenterprise.com/resources/reports/rp_data-breach-investigation-report-2015-insider_en_xg.pdf
This is where the story comes back to the travel industry. These lists are used to break into loyalty programs such as these and steal customer points.
Some recent breaches:
American Airlines - 10K+ compromised accounts in Jan
British Airways - Tens of thousands of compromised accounts in March
United Airlines - Dozens of accounts compromised in Feb
Hilton - Undisclosed number of account takeovers
http://www.computerworld.com/article/2867241/security0/stolen-credentials-used-to-access-united-airlines-mileageplus-accounts.html
http://www.dallasnews.com/business/airline-industry/20150112-american-united-airlines-targets-of-attempt-to-steal-customers-miles.ece
http://www.computerworld.com/article/2868019/united-american-airlines-account-fraud-highlights-hacker-focus-on-travel-industry.html
http://www.darkreading.com/attacks-breaches/british-airways-the-latest-loyalty-program-breach-victim/d/d-id/1319683
Rami -
Online travel businesses rely on accurate data in order to make informed decisions. All bad bots, including web scrapers and others, are basically watering down KPIs and analytic data. “Look-to-book” ratios would be hugely affected as these bots are only looking but not booking.
RAMI-
The last piece of the story has to do with costs to travel brands. Bots utilize web environment resources including but not limited to bandwidth, server utilization, etc. When bot traffic becomes excessive, it may cause site instability or downtime.
RAMI - Redtag.ca is a case study which illustrates several of these points. Specifically, they were having brownouts and their SEO was being affected. When they tried to police the bot problem themselves, it turned out to be costly.
They turned to commercial solutions (Distil) to help them efficiently deal with their bad bot problem.
RAMI - Here’s a recap of how bots impact each of these important aspects of online travel businesses.
It also shows the logos of some of the different sites we discussed which had these problems.
Slide Owner: RAMI
bot-friendly ecosystem
can’t be iron-fisted
It isn’t enough to just block bots. Good bots are an important part of a healthy web ecosystem. Great care should be taken to identify both good and bad bots, and to only block the bad bots.
RAMI -
Good bots also come in many flavors. An interesting example frequently seen in the travel world is that of partners in disguise. A famous example of this is Hipmunk, a travel aggregation site that got its start by scraping data from other sites. Once they had a viable business model with revenue, they began licensing the data sources. Another great example of good bots is site indexing bots from search engines like Google, Baidu, and Bing.
Orion to Ask: How do I avoid an arms race with the scrapers so I can focus my team and infrastructure on welcoming, not screening, visitors?
Source: http://www.tnooz.com/article/what-you-need-to-know-about-web-scraping-how-to-understand-identify-and-sometimes-stop/
Meta search sites like Hipmunk sometimes get their start by scraping travel site data. Once they have enough data and enough traffic to be valuable, they go to suppliers and OTAs with a partnership agreement. I’m naming Hipmunk because the company is one of the few to fess up to site scraping, and one of the few who claim to have quickly stopped scraping when asked.
I’d wager that Hipmunk and others use(d) web scraping because it’s easy, and getting a decision maker at a major travel supplier on the phone is not easy, and finding legitimate channels to acquire supplier data is most definitely not easy.