Using machine learning to determine drivers of bounce and conversion (part 2)


[2016 Velocity NY] There has been a lot of historical work that looks at the relationship between performance and conversions, but most of it has been after the fact or relied on linear models. Google partnered with SOASTA to train a machine-learning model on a large sample of real-world performance, conversion, and bounce data. Patrick Meenan and Tammy Everts offer an overview of the resulting model, able to predict the impact of performance work and other site metrics on conversion and bounce rates. The code used to generate the model is freely available.


Using machine learning to determine drivers of bounce and conversion (part 2)
Velocity 2016 New York

Pat Meenan
@patmeenan
Tammy Everts
@tameverts

Get the code
https://github.com/WPO-Foundation/beacon-ml

Random forest
Lots of random decision trees

Vectorizing the data
• Everything needs to be numeric
• Strings converted to several inputs as yes/no (1/0)
• e.g. device manufacturer
• “Apple” would be a discrete input
• Watch out for input explosion (UA String)
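
The yes/no encoding described on this slide can be sketched in plain Python. This is only an illustration under assumptions: the record layout and the field names `manufacturer` and `dom_ready_ms` are hypothetical and are not taken from the beacon-ml repository.

```python
def one_hot(records, field):
    """Replace a string field with one 1/0 column per distinct value."""
    # Sorted so the column order is stable between runs.
    values = sorted({r[field] for r in records})
    encoded = []
    for r in records:
        # Copy every other field unchanged.
        row = {k: v for k, v in r.items() if k != field}
        for value in values:
            row[f"{field}={value}"] = 1 if r[field] == value else 0
        encoded.append(row)
    return encoded


# Hypothetical sample records, for illustration only.
sessions = [
    {"manufacturer": "Apple", "dom_ready_ms": 1200},
    {"manufacturer": "Samsung", "dom_ready_ms": 2400},
    {"manufacturer": "Apple", "dom_ready_ms": 900},
]
vectorized = one_hot(sessions, "manufacturer")
```

A field with many distinct values, such as a raw UA string, would produce one column per distinct string, which is the input explosion the slide warns about.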

Balancing the data
• 3% conversion rate
• 97% accurate by always guessing no
• Subsample the data for 50/50 mix
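
A minimal sketch of the 50/50 subsampling, assuming labels are 1 for a conversion and 0 otherwise. The helper name `balance` and the synthetic data are illustrative, not the code from the talk.

```python
import random


def balance(rows, labels, seed=0):
    """Subsample the majority class so both classes have equal counts."""
    rng = random.Random(seed)
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(positives), len(negatives))
    keep = rng.sample(positives, n) + rng.sample(negatives, n)
    rng.shuffle(keep)  # avoid all positives coming first
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Synthetic data: 3 conversions in 100 sessions, matching the 3% rate above.
labels = [1] * 3 + [0] * 97
rows = [[i] for i in range(100)]
x_bal, y_bal = balance(rows, labels)
```

On the balanced set, always guessing "no" scores 50% rather than 97%, so accuracy above 50% reflects something the model actually learned.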

Smoothing the data
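ML works best on standardized data (StandardScaler gives each input zero mean and unit variance; it does not change the shape of the distribution)
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
# Fit on the training set only, then reuse the same scaling for validation
x_train = scaler.fit_transform(x_train)
x_val = scaler.transform(x_val)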

Validation data
• Train on 80% of the data
• Validate on the other 20% to detect overfitting
– Report accuracy from the validation set, not the training set
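
The 80/20 split, combined with the scaling step from the previous slide, might look like the sketch below. It assumes scikit-learn's `train_test_split`; the synthetic arrays and the choice to stratify by label are assumptions, not details from the talk.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
x = rng.exponential(scale=2000.0, size=(1000, 3))  # synthetic timings in ms
y = rng.integers(0, 2, size=1000)                  # synthetic 0/1 labels

# Hold out 20% of the rows; stratify keeps the class mix equal in both parts.
x_train, x_val, y_train, y_val = train_test_split(
    x, y, test_size=0.2, random_state=0, stratify=y
)

# Fit the scaler on the training rows only, then apply it to both sets.
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_val = scaler.transform(x_val)
```

Fitting the scaler before splitting would let statistics from the validation rows leak into training, which is why the split comes first here.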

Input/output relationships
• SSL highly correlated with conversions
• Long sessions highly correlated with not bouncing
• Remove such output-correlated features from training
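
One way to spot inputs that are near-copies of the output is to check each column's correlation with the label. The sketch below assumes Pearson correlation and a cutoff of 0.9; both are illustrative choices, and the feature names and data are synthetic.

```python
import numpy as np


def leaky_features(x, y, names, threshold=0.9):
    """Return names of columns whose |Pearson r| with the label exceeds threshold."""
    flagged = []
    for j, name in enumerate(names):
        r = np.corrcoef(x[:, j], y)[0, 1]
        if abs(r) > threshold:
            flagged.append(name)
    return flagged


rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=500)  # 1 = did not bounce (synthetic)
session_length = y * 5 + 1 + rng.normal(0, 0.1, size=500)  # near-copy of label
dom_ready = rng.normal(2000, 500, size=500)                # independent of label
x = np.column_stack([session_length, dom_ready])
names = ["session_length", "dom_ready"]

flagged = leaky_features(x, y, names)
keep = [j for j, n in enumerate(names) if n not in flagged]
x_clean = x[:, keep]  # training matrix without the flagged columns
```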

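Training random forest
from sklearn.ensemble import RandomForestClassifier
# FOREST_SIZE is the number of trees; n_jobs=12 runs 12 jobs in parallel
clf = RandomForestClassifier(n_estimators=FOREST_SIZE,
                             criterion='gini',
                             max_depth=None,
                             min_samples_split=2,
                             min_samples_leaf=1,
                             min_weight_fraction_leaf=0.0,
                             # 'auto' on the slide; 'sqrt' is the same setting for
                             # classifiers and is still accepted by current scikit-learn
                             max_features='sqrt',
                             max_leaf_nodes=None,
                             bootstrap=True,
                             oob_score=False,
                             n_jobs=12,
                             random_state=None,
                             verbose=2,
                             warm_start=False,
                             class_weight=None)
clf.fit(x_train, y_train)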

Feature importances
clf.feature_importances_
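
`feature_importances_` is an array with one value per input column, so it has to be paired with the column names to be readable. A small sketch on synthetic data; the feature names are hypothetical and the forest is much smaller than one you would use on real beacon data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
names = ["dom_ready", "noise_a", "noise_b"]  # hypothetical feature names
x = rng.normal(size=(600, 3))
y = (x[:, 0] > 0).astype(int)  # label depends only on the first column

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(x, y)

# Pair each importance with its feature name, most important first.
ranked = sorted(
    zip(names, clf.feature_importances_), key=lambda pair: pair[1], reverse=True
)
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```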

Training deep learning
from keras.models import Sequential
model = Sequential()
model.add(...)
model.compile(optimizer='adagrad',
              loss='binary_crossentropy',
              metrics=["accuracy"])
model.fit(x_train,
          y_train,
          epochs=EPOCH_COUNT,  # this argument was named nb_epoch in Keras 1
          batch_size=32,
          validation_data=(x_val, y_val),
          verbose=2,
          shuffle=True)

Brute force FTW
• 93 input “features”
• Train 93 models with 1 input
– Measuring the prediction accuracy of each
• Train 92 models with 2 inputs
– Top feature from first round
– Measure combined prediction accuracy
• Lather, rinse, repeat…
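
The round-by-round search above is a greedy forward selection. The sketch below shows the loop on synthetic data with four inputs instead of 93. To keep it fast it swaps in a logistic regression for the models used in the talk; the function name `forward_select` is illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def forward_select(x, y, names, rounds):
    """Each round keeps the feature that gives the best validation accuracy."""
    x_tr, x_va, y_tr, y_va = train_test_split(x, y, test_size=0.2, random_state=0)
    chosen, history = [], []
    for _ in range(rounds):
        best = None
        for j in range(x.shape[1]):
            if j in chosen:
                continue
            cols = chosen + [j]  # winners so far plus one candidate
            model = LogisticRegression().fit(x_tr[:, cols], y_tr)
            acc = model.score(x_va[:, cols], y_va)
            if best is None or acc > best[1]:
                best = (j, acc)
        chosen.append(best[0])
        history.append((names[best[0]], best[1]))
    return history


rng = np.random.default_rng(0)
x = rng.normal(size=(800, 4))
y = (x[:, 1] + 0.5 * x[:, 3] > 0).astype(int)  # only f1 and f3 carry signal
names = ["f0", "f1", "f2", "f3"]
history = forward_select(x, y, names, rounds=2)
```

Each entry of `history` is the feature added in that round and the combined validation accuracy, which is the number the slide proposes tracking from round to round.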

Visualizing the model
• Take trained model (X inputs)
• Vary inputs
– 100 ms to 20 seconds in 100 ms intervals
• Apply the data smoothing (the scaler fitted on the training set)
• model.predict_proba
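
A sketch of that sweep for a model with a single timing input. The data is synthetic and a small random forest stands in for the trained model; with several inputs, the other columns would have to be held at fixed values, which is not shown here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic training data: one timing input in ms; slower sessions bounce more.
load_ms = rng.uniform(100, 20000, size=(2000, 1))
bounced = (rng.uniform(0, 20000, size=2000) < load_ms[:, 0]).astype(int)

scaler = StandardScaler()
clf = RandomForestClassifier(n_estimators=50, min_samples_leaf=20, random_state=0)
clf.fit(scaler.fit_transform(load_ms), bounced)

# Sweep the input from 100 ms to 20 s in 100 ms steps.
sweep_ms = np.arange(100, 20001, 100).reshape(-1, 1)
# Reuse the scaler fitted on the training set; do not refit it on the sweep.
p_bounce = clf.predict_proba(scaler.transform(sweep_ms))[:, 1]
```

Plotting `p_bounce` against `sweep_ms` gives the kind of curve the findings slides describe.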

What’s in our beacon?
• Top-level – domain, timestamp, SSL
• Session – start time, length (in pages), total load time
• User agent – browser, OS, mobile ISP
• Geo – country, city, organization, ISP, network speed
• Bandwidth
• Timers – base, custom, user-defined
• Custom metrics
• HTTP headers
https://docs.soasta.com/whatsinbeacon/

Finding 1
Maybe everything doesn’t matter
after all

Finding 2
DOM ready (aka DOM content loaded)
and average session load time were
the best indicators of bounce rate

Finding 3
When it came to getting high
predictability, conversion data was
tougher than bounce data

81% prediction accuracy was as high as we got

Finding 4
Pages with more scripts were
less likely to convert

Finding 5
The number of DOM elements matters…
a lot

Finding 6
Mobile-related measurements weren’t
meaningful predictors of conversions

Finding 7
Some conventional metrics
were not as important as we thought

Feature: Start render
Importance (bounce): 69, ~top 3

Things to watch out for
(other than dangling prepositions)

1. YMMV
2. Do try this at home
3. Gather your RUM data (lots of it)
4. Run the machine learning against it
5. If you get unexpected results, keep
digging


Remaining slides (title only):

3. What we did (and why we did it)
6. Deep learning weights
16. Understanding deep learning
19. What we learned
22. Bounce rate
24. Up to 89.5% accuracy
37. Yep, checkout pages are SLOW
39. Takeaways
41. Thanks! @patmeenan @tameverts

Editor's notes

mPulse is built on top of the boomerang JavaScript library, which collects web performance data from a user’s web browser and sends it back to the mPulse servers on a beacon. The simple definition of a beacon is an HTTP(S) request with a ton of data included either as HTTP headers or as part of the request’s query string.
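
To make the beacon definition concrete, here is a minimal sketch that reads the query string of such a request with the Python standard library. The host and the parameter names in the URL are made up for illustration and are not the actual mPulse beacon format.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical beacon URL; host and parameter names are illustrative only.
beacon_url = "https://beacon.example.com/b?domain=shop.example.com&t_done=2350&ssl=1"


def parse_beacon(url):
    """Return the query-string parameters of a beacon request as a flat dict."""
    query = urlsplit(url).query
    # parse_qs returns a list per key; keep the first value of each.
    return {key: values[0] for key, values in parse_qs(query).items()}


fields = parse_beacon(beacon_url)
```

All values come back as strings, so numeric fields still need converting before they can be used as model inputs.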
“DOM ready” refers to the amount of time it takes for the page’s HTML to be received and parsed by the browser. Actual page elements, such as images, haven’t appeared yet. (It’s kind of like getting ready to cook. Your cookbook is open, your recipe is in front of you, and your ingredients are on standby.) It is the same as the DOM Content Loaded event in Navigation Timing, but measured with a polyfill so that it works across all browsers.
DOM_ready + load time gets us up to 89.5% accuracy on the predictions.
Takeaway: External blocking scripts (such as third-party ads, analytics, and social widgets) and styles (such as externally hosted CSS and fonts) have the greatest impact on DOM ready times. Site owners should measure the impact that these external elements have on their pages and conduct ongoing monitoring to ensure that scripts and styles are available and fast. Whenever possible, scripts should be served asynchronously (in parallel with the rest of the page) or in a non-blocking fashion.
Sessions that converted contained 48% more scripts (including third-party scripts, such as ads, analytics beacons, and social buttons) than sessions that didn’t.
Before around 300 scripts, it's possible that it learned the patterns of what some checkout flows looked like. Scripts are one of those things that may be more fixed than timings, so it might be easier for deep learning to just learn what all sites' checkout flows look like.
max_params_dom_In (number of DOM elements): for more complex pages, past (1000?) elements, the prediction starts to fall off.
Median_bandwidth_kbps was 44; User_agent_device_type was 79; Mobile_connection_type was 89.
Shoppers who used low-bandwidth or mobile connections didn’t convert significantly less than shoppers on faster connections. This is interesting because it confirms that we’ve entered a “mobile everywhere” phase.
Takeaway: Internet users don’t behave especially differently depending on what device they’re using. Site owners need to ensure they’re delivering consistent user experiences across device types.
Start render tells you when content begins to display in the user’s browser. Of the 1M records, 720k did not have render start times included (because the browser didn't support it), which is why it ended up being an unimportant feature.
Pat re-ran the deep learning version of the importances on a filtered dataset that only includes records with a render time, to see how it looked relative to the other timings. Filtered down to just those records, start render is basically the same importance as DOM ready. There is a similar pattern to the others where it plateaus, though the plateau seems to start pretty early (around 3 seconds), which generally makes sense since usually render < DOM ready < onload. In all cases, there doesn't seem to be a point where it isn't worth making it faster. If anything, the gains become more significant as you get closer to zero.