Solving the Contextual Multi-Armed Bandit
Problem at Nordstrom
John Maxwell
Nordstrom
2017/05/19
John Maxwell, Data Scientist, Nordstrom at MLconf Seattle 2017

John Maxwell, a data scientist at Nordstrom, did his graduate work in international development economics, focusing on field experiments. He has since led research projects in Indonesia and Ethiopia related to microenterprise, developed large mathematical simulation models used for investment decisions by WSDOT, built dynamic pricing algorithms at Thriftbooks.com, and led the development of Nordstrom’s open-source A/B testing service, Elwin. He currently focuses on contextual multi-armed bandit problems and machine learning infrastructure at Nordstrom.

Abstract

Solving the Contextual Multi-Armed Bandit Problem at Nordstrom:
The contextual multi-armed bandit problem, also known as associative reinforcement learning or bandits with side information, is a useful formulation of the multi-armed bandit problem that takes into account information about arms and users when deciding which arm to pull. The barrier to entry for both understanding and implementing contextual multi-armed bandits in production is high. The literature in this field pulls from disparate sources including (but not limited to) classical statistics, reinforcement learning, and information theory. Because of this, finding material that fills the gap between very basic explanations and academic journal articles is challenging. The goal of this talk is to provide those lacking intermediate materials as well as an example implementation. Specifically, I will explain key findings from some of the more cited papers in the contextual bandit literature, discuss the minimum requirements for implementation, and give an overview of a production system for solving contextual multi-armed bandit problems.



  1. Solving the Contextual Multi-Armed Bandit Problem at Nordstrom. John Maxwell, Nordstrom, 2017/05/19
  2. Motivating the Problem: Limitations of A/B testing for product recommendations
  3. Motivating the Problem: Limitations of A/B testing for product recommendations. Need to balance exploration and exploitation intelligently
  4. Motivating the Problem: Limitations of A/B testing for product recommendations. Need to balance exploration and exploitation intelligently. People aren't all the same, though maybe similar
  5. Exploration vs Exploitation: Explore first: explore, then learn (like A/B testing)
  6. Exploration vs Exploitation: ε-greedy: exploit, but also explore a little bit
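The ε-greedy rule from slide 6 fits in a few lines. A minimal sketch; the function name and inputs are illustrative, not from the talk:

```python
import random

def epsilon_greedy(avg_reward, epsilon=0.1):
    """With probability epsilon pick a uniformly random arm (explore);
    otherwise pick the arm with the best average reward so far (exploit)."""
    if random.random() < epsilon:
        return random.randrange(len(avg_reward))
    return max(range(len(avg_reward)), key=avg_reward.__getitem__)
```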
  7. Exploration vs Exploitation: Upper Confidence Bound (UCB): optimistic when uncertain
  8.-13. UCB Illustrated, steps 1-6: [bar plots of the average reward and upper confidence bound for Arm1 and Arm2 at each step; the arm whose upper bound is higher is chosen each round]
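The choices in the UCB illustration follow the standard UCB1 rule: score each arm by its average reward plus a confidence bonus that shrinks as the arm is pulled more. A minimal sketch (the bonus constant and names are illustrative):

```python
import math

def ucb_choice(counts, sums, t):
    """UCB1: average reward plus a sqrt(2 ln t / n) confidence bonus.
    Arms that have never been pulled get an infinite bonus,
    so every arm is tried at least once."""
    def ucb(n, s):
        if n == 0:
            return float("inf")
        return s / n + math.sqrt(2 * math.log(t) / n)
    scores = [ucb(n, s) for n, s in zip(counts, sums)]
    return max(range(len(scores)), key=scores.__getitem__)
```

With equal pull counts the bonus cancels and the higher average wins; an unpulled arm always wins.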
  14. Including Context: How can we use things we know about people and products (context) along with UCB?
  15. Including Context: How can we use things we know about people and products (context) along with UCB? Train a ridge regression for each arm (regress rewards on contexts)
  16. Including Context: How can we use things we know about people and products (context) along with UCB? Train a ridge regression for each arm (regress rewards on contexts). Choose the arm using the UCB idea!
  17. Including Context: How can we use things we know about people and products (context) along with UCB? Train a ridge regression for each arm (regress rewards on contexts). Choose the arm using the UCB idea!
          a_t = argmax_{a ∈ A_t} [ x_{t,a}^T θ̂_a (predicted payoff) + α √(x_{t,a}^T A_a^{-1} x_{t,a}) (standard deviation of payoff) ]
      Li et al. (2010)
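Slide 17's rule is LinUCB from Li et al. (2010): a per-arm ridge regression (A_a = D_a^T D_a + I, b_a = D_a^T r_a) scored by predicted payoff plus α times the predictive standard deviation. A minimal NumPy sketch, with illustrative variable names:

```python
import numpy as np

def linucb_choice(x, A, b, alpha=1.0):
    """LinUCB arm selection.
    x: context vector (d,); A: list of per-arm (d, d) matrices A_a;
    b: list of per-arm (d,) vectors b_a.
    For each arm: theta_hat = A_a^-1 b_a, then
    score = x . theta_hat  (predicted payoff)
          + alpha * sqrt(x^T A_a^-1 x)  (payoff standard deviation)."""
    scores = []
    for A_a, b_a in zip(A, b):
        A_inv = np.linalg.inv(A_a)
        theta_hat = A_inv @ b_a
        scores.append(x @ theta_hat + alpha * np.sqrt(x @ A_inv @ x))
    return int(np.argmax(scores))
```

Note the slide's next point: in production the A_a^-1 inverse makes this costly, since a potentially large matrix must be inverted on every call.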
  18. Including Context: This seems hard to implement
  19. Including Context: This seems hard to implement. Have to invert a potentially large matrix on every call
  20. Including Context: This seems hard to implement. Have to invert a potentially large matrix on every call. How do you deal with delayed rewards?
  21. Including Context: Notice how similar this is to classification
          arm 1   arm 2   arm 3
           1       .       .
           .       .5      .
           .       .       2
           .8      .       .
  22. Including Context: Notice how similar this is to classification
          arm 1   arm 2   arm 3
           1       .       .
           .       .5      .
           .       .       2
           .8      .       .
      We have partial feedback... how can we get full feedback?
  23. Including Context: Inverse propensity scoring:
          c_{i,t} = − r_{i,t}(a_i) · I{π(x_{i,t}) = a_i} / p_{i,t}(a_i)
          arm 1     arm 2     arm 3
          c_{1,1}   0         0
          0         c_{2,2}   0
          0         0         c_{3,3}
          c_{1,4}   0         0
      Agarwal et al. (2014)
  24. Including Context: If you think about IPS-transformed rewards as costs, you can reduce this to cost-sensitive classification
  25. Including Context: If you think about IPS-transformed rewards as costs, you can reduce this to cost-sensitive classification. Can use any cost-sensitive multi-class classification algorithm
  26. Including Context: If you think about IPS-transformed rewards as costs, you can reduce this to cost-sensitive classification. Can use any cost-sensitive multi-class classification algorithm. Simplest is probably least squares regression for each arm, with argmin to choose the cost-minimizing arm
  27. Including Context: If you think about IPS-transformed rewards as costs, you can reduce this to cost-sensitive classification. Can use any cost-sensitive multi-class classification algorithm. Simplest is probably least squares regression for each arm, with argmin to choose the cost-minimizing arm. Can do this part offline
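Slides 26-27 can be sketched as one batch least-squares fit per arm over logged (context, IPS cost) pairs, done offline, with a cheap argmin at serving time. The talk uses TensorFlow for this in production; here is an illustrative NumPy version:

```python
import numpy as np

def fit_cost_models(X, C):
    """Least-squares fit, one weight column per arm: C ≈ X @ W.
    X: (n, d) logged contexts; C: (n, k) IPS-transformed costs.
    This is the offline step."""
    W, *_ = np.linalg.lstsq(X, C, rcond=None)
    return W

def choose_arm(W, x):
    """Online step: pick the arm with the lowest predicted cost."""
    return int(np.argmin(x @ W))
```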
  28. Implementation
  29. Implementation: Dora: a node app that explores using ε-greedy
  30. Implementation: Dora: a node app that explores using ε-greedy. Logging, delayed joins
  31. Implementation: Dora: a node app that explores using ε-greedy. Logging, delayed joins. TensorFlow + TensorFlow Serving: consistent way to train and serve the cost-sensitive classifier
  32. Questions? twitter: @jhnmxwll, github: jmmaxwell, site: john-maxwell.com, email: john [at] john-maxwell.com
  33. References:
      Agarwal, Alekh, Daniel J. Hsu, Satyen Kale, John Langford, Lihong Li, and Robert E. Schapire. 2014. “Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits.” CoRR abs/1402.0555. http://arxiv.org/abs/1402.0555.
      Li, Lihong, Wei Chu, John Langford, and Robert E. Schapire. 2010. “A Contextual-Bandit Approach to Personalized News Article Recommendation.” In Proceedings of the 19th International Conference on World Wide Web, 661-70. ACM.
