A talk by Dr. Arif Wider (ThoughtWorks) and Christian Deger (AutoScout24)
AutoScout24 is the largest online car marketplace in Europe for new and used cars. With more than 2.4 million listings across Europe, AutoScout24 has access to large amounts of data about historical and current market prices and wants to use this data to empower its users to make informed decisions about selling and buying cars. We created a live price estimation service for used vehicles based on a Random Forest prediction model that is continuously delivered to the end user.

Predictive analytics of this sort is often used only to guide company-internal decision-making. Delivering a predictive analytics product straight to the end user poses an entirely different set of requirements with respect to (1) performance and (2) automated quality control.

To avoid the effort of handcrafting a high-performance implementation of a complex prediction model, many companies fall back on primitive prediction models in such a situation. Learn how we achieved superb performance and scalability without manual optimization or sacrifices in prediction accuracy.

For quality control, Continuous Delivery is an established approach in modern web application development that allows for much shorter release cycles and therefore yields the ability to rapidly innovate and adapt to user needs. However, Continuous Delivery has rarely been applied to predictive analytics so far. Learn how automated verification using live test data sets in a continuous delivery pipeline allows us to release model improvements with confidence at any given time. This way, our users can benefit immediately from the work of our data scientists.
5. The task: A consumer-facing data product
GOTO Berlin 2017 Data Science, Delivered Continuously – A. Wider & C. Deger
8. The prediction model: Random forest
Car listings of the last two years
Volkswagen Golf
9. How to turn an R-based prediction model into a high-performance web application?
12. How to turn an R-based prediction model into a high-performance web application?
Continuous Delivery!
15. Typical delivery pipeline
Application code in one repository per service.
CI: Deployment package as artifact.
CD: Deliver package to servers.
18. The price for CD: Extensive model validation
20. Lessons learned
Form a cross-functional team of data scientists & software engineers!
Software engineers
… learn how data scientists work
… and understand the quirks of a prediction model
Data scientists
… learn about unit testing, stable interfaces, git, etc.
… get quick feedback about the impact of their work
Model and product iterations become much faster!
21. Lessons learned
Generating gigabytes of Java code is a challenge for the JVM.
Use the G1 garbage collector
Turn off Tiered Compilation
Do extensive warm-ups
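The three bullets above combine into JVM launch flags plus a warm-up routine. A minimal sketch, assuming a hypothetical stand-in `score` method for the real prediction call (the pricing formula below is invented for illustration):

```java
// Assumed JVM flags, set when launching the service (not in code):
//   -XX:+UseG1GC            use the G1 garbage collector
//   -XX:-TieredCompilation  turn off tiered compilation
import java.util.concurrent.ThreadLocalRandom;

public class WarmUp {
    // Hypothetical stand-in for the real model scoring call.
    public static double score(double mileageKm, double ageYears) {
        return Math.max(500, 20_000 - 0.08 * mileageKm - 900 * ageYears);
    }

    public static void main(String[] args) {
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        double sink = 0;
        // Exercise the hot scoring path many times so the JIT compiles it
        // before the instance starts serving real traffic.
        for (int i = 0; i < 100_000; i++) {
            sink += score(rnd.nextDouble(0, 250_000), rnd.nextDouble(0, 20));
        }
        if (sink < 0) throw new IllegalStateException(); // keep the result live
        System.out.println("warm-up done");
        // Only now report the instance healthy to the load balancer.
    }
}
```

The point of the loop is simply to push the scoring path past the JIT's compilation thresholds before the first user request arrives.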
22. Lessons learned – Warm up
23. Lessons learned
The approach of applying Continuous Delivery to Data Science is useful independently of the tech stack.
Successfully applied similarly to a Python- and Spark-based project.
Even more useful when quick model evolution is required because of rapidly changing inputs (e.g., user interaction).
24. Conclusions
Continuous Delivery allows us to bring prediction model changes live very quickly.
Only extensive automated end-to-end tests provide the confidence to deploy to production automatically.
Java code generation allows for very low response times and excellent scalability under high load, but it requires plenty of memory.
25. Conclusions: Price evaluation everywhere
A
This is Christian. Christian is AutoScout24's chief architect, but he actually joined AutoScout as a mere developer and then made his way to his current role as a Coding Architect.
At AutoScout24, we (ThoughtWorks) have worked a lot with Christian, and I think I can say that we've enjoyed each other's company quite a bit.
C
is a developer at ThoughtWorks Germany, where Scala is his language of choice, particularly in the context of Big Data applications.
Before joining ThoughtWorks, he was in academia doing research on applying functional programming techniques to data synchronization.
C
AutoScout24 is the largest online car marketplace in Europe, with roughly 2.4 million listings on the platform, which means they have a lot of data about how cars are sold.
A
AutoScout has a lot of data about how cars are sold and at what prices.
- Now, our task was to turn all this data into something actually useful for the end user of the page.
- So our task was to create a consumer-facing data product where users can quickly estimate the current value of their car.
This works as follows… basic information about the car
A
Optionally indicate equipment and condition
A
You get a price range
A
- What we had when we started working on this was a prediction model, because that's what the data scientists at AutoScout had already built; the language they used for it was R, and the approach is called random forest.
- How many of you have heard of random forest before?
- Let's have a look at how this works: the data of the last two years is used to train a prediction model, and what you get out of training are many such decision trees.
- Random forest is the algorithm that decides … and it is a technique to work against overfitting, i.e., against producing a prediction model that only works on the training data.
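The core idea from the notes above, many decision trees whose predictions are averaged, can be sketched in a few lines. This is an illustrative toy (not the actual AutoScout24 model); the splits and prices are invented:

```java
import java.util.List;

public class ForestSketch {
    // Each "tree" maps a feature vector (here: mileage in km, age in years) to a price.
    public interface Tree { double predict(double mileageKm, double ageYears); }

    // A random forest prediction is the average over all trees' predictions.
    public static double forestPredict(List<Tree> trees, double mileageKm, double ageYears) {
        return trees.stream()
                    .mapToDouble(t -> t.predict(mileageKm, ageYears))
                    .average()
                    .orElse(0.0);
    }

    public static void main(String[] args) {
        // Two toy trees with hand-picked splits; real forests have hundreds of deeper trees,
        // each trained on a different random sample of the data (which counters overfitting).
        Tree t1 = (km, age) -> km < 50_000 ? 15_000 : 9_000;
        Tree t2 = (km, age) -> age < 3 ? 16_000 : 10_000;
        double price = forestPredict(List.of(t1, t2), 30_000, 2);
        System.out.println(price); // (15000 + 16000) / 2 = 15500.0
    }
}
```

Averaging over trees trained on different samples is what makes the ensemble more robust than any single tree.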
C
- But our task was to turn this model into a high-performance web application.
- And in fact, that is not so common yet in the context of data science, because often such data is only used for internal decision-making.
- But if you want to create a user-facing application, you are in a very different situation, where you have to deal with load peaks etc.
- That was also the reason why we ruled out running an R server in production pretty early. The problem is that R, at least in its open-source version, does not support multi-threading, so scaling for many concurrent requests is extremely difficult.
C
- The traditional approach that we still see quite often: the model is developed by data scientists in whatever language suits their way of working best, e.g., R, and then, in order to get good performance, software engineers translate…
C
- However, with this manual approach, what do you do if the internal structure of the prediction model changes?
If a software engineer has to reimplement these changes, it first of all takes a long time, and mistakes can also be introduced in that translation.
For example, a change from a random forest to gradient-boosted machines.
For linear regression, reimplementing the model is not a big problem.
A
- We therefore looked at how we could automate this, and the technology that helped us with that was H2O.
- Has anybody heard of H2O?
- It's a Java-based analytics engine that can be programmed using R, which the data scientists liked. And, this was the important piece for us, it provides the possibility to export your fully trained prediction model as Java source code.
- This then allowed us to integrate the model generation into a continuous delivery pipeline.
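To make "export the model as Java source code" concrete, here is a miniature, hand-written illustration of the *shape* such generated code takes: each tree becomes nested branches over the input row, and scoring averages the trees. All names and numbers are invented; real exports run to gigabytes of code like this:

```java
public class GeneratedModelSketch {
    // One generated method per tree; row[0] = mileage in km, row[1] = age in years.
    static double tree0(double[] row) {
        if (row[0] < 60_000) {
            return row[1] < 4 ? 14_200 : 11_800;
        }
        return 8_900;
    }

    static double tree1(double[] row) {
        if (row[1] < 3) return 15_100;
        return row[0] < 90_000 ? 10_400 : 7_600;
    }

    // The forest score is the average over all generated trees.
    public static double score(double[] row) {
        return (tree0(row) + tree1(row)) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(score(new double[]{30_000, 2})); // (14200 + 15100) / 2 = 14650.0
    }
}
```

Because the model ends up as plain branching code, the JVM can JIT-compile it, which is where the low response times come from.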
C
Commit stage: Unit tests etc.
C
Additional database migration scripts.
C
Blue/green deployment on the instance.
C
This looks as follows: …
- Then Java code is generated, in our case gigabytes of Java source code, which is compiled into a JAR that is uploaded to AWS S3. This is the prediction model pipeline.
- Now, whenever something changes in the R-based configuration or, at least as important, when the model should be updated using the latest data from the platform, a new model JAR is generated automatically and deployed to S3.
C
- Now, for the web application, which we implemented in Scala using the Play Framework, there is another CD pipeline.
- This pipeline also generates a JAR, the application JAR, which is then deployed to AWS EC2.
- Every time this application is deployed, the pipeline also pulls the latest prediction model from S3 and then loads both into the same JVM.
- And when the model is updated, this also triggers a redeployment of the web application with the newest model.
- This way, all prediction model changes made by the data scientists go straight to production, and users can benefit immediately.
A
- However, this only works if you have enough confidence to do so.
- Therefore, we built an extensive model validation workflow.
- Let's start with how a model is usually trained and how the success of the training is evaluated.
- You, that is, the data scientist, divide the existing historical data into training data and test data; those two sets need to be disjoint. Then the model is trained using the training data.
- The test data is then used to create test estimations, and these results are compared with the actual prices in the test data.
- They will never be exactly the same, but they indicate how good the model is.
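The evaluation step just described, comparing test estimations against actual prices from the held-out set, boils down to computing an error metric. A minimal sketch, using mean absolute percentage error as one possible metric (the talk does not specify which metric was used):

```java
public class HoldoutEval {
    /** Mean absolute percentage error between predicted and actual prices. */
    public static double mape(double[] predicted, double[] actual) {
        double sum = 0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(predicted[i] - actual[i]) / actual[i];
        }
        return sum / actual.length;
    }

    public static void main(String[] args) {
        // Toy held-out test set: actual sale prices vs. model estimations.
        double[] actual    = {10_000, 20_000};
        double[] predicted = {11_000, 18_000};
        // (0.10 + 0.10) / 2 = 0.10, i.e. off by 10% on average.
        System.out.println(mape(predicted, actual));
    }
}
```

Tracking such a metric per model version is what lets the pipeline decide automatically whether a retrained model is good enough.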
- We want to validate how the model reacts to new data.
A
- Further down the pipeline, we use these test estimation results for a comprehensive end-to-end model validation.
- That means we check whether the JAR that was created by compiling the generated Java code gives us exactly the same results as directly asking the model that was created by the data scientists.
- Furthermore, we also check whether this model fulfills all the expectations that our web application poses on it.
- This is called a consumer-driven contract test (CDC); the web application in this case is the consumer of the model.
- Only if all of those checks are green do we release to production.
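The two checks just described, exact agreement with the data scientists' reference results plus the web application's contract, might look roughly like this. All names and the specific contract conditions are hypothetical:

```java
public class ModelContractCheck {
    static final double EPSILON = 1e-9;

    /** Check 1: the compiled model JAR must reproduce the reference estimations exactly
     *  (up to floating-point noise). */
    public static boolean matchesReference(double[] jarPredictions, double[] referencePredictions) {
        for (int i = 0; i < referencePredictions.length; i++) {
            if (Math.abs(jarPredictions[i] - referencePredictions[i]) > EPSILON) return false;
        }
        return true;
    }

    /** Check 2 (consumer-driven contract): expectations the web app poses on the model,
     *  e.g. every estimate is a positive, finite price. */
    public static boolean satisfiesConsumerContract(double[] predictions) {
        for (double p : predictions) {
            if (!Double.isFinite(p) || p <= 0) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        double[] reference = {14_650.0, 9_200.0};
        double[] fromJar   = {14_650.0, 9_200.0};
        boolean release = matchesReference(fromJar, reference)
                       && satisfiesConsumerContract(fromJar);
        System.out.println(release ? "green - release" : "red - block release");
    }
}
```

Gating the release on both checks is what makes fully automatic deployment of retrained models safe.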
C
A
C
The warm-up time becomes especially problematic during an incident: time to recover is drastically increased.
You also need to configure your autoscaling to take this warm-up period into account under high load.
A
A
C
We added labels for fair price, good price, and top price.
More labels are coming.