Automated machine learning solutions can help address problems with data-driven activities by selecting optimal machine learning models and techniques without requiring deep machine learning expertise. Vitriol is one such solution that uses meta-learning to leverage knowledge from previous learning tasks to select imputation methods, models, and hyperparameters for new problems. It has a web application interface that allows users to easily connect databases, preprocess and complete data, select modeling tasks, and visualize results without training or space limitations. With each new problem solved, Vitriol's meta-learner continues to improve its model selection abilities.
2. "People worry that computers will get too smart and take over the world, but the real
problem is that they’re too stupid and they’ve already taken over the world."
- Pedro Domingos
Intelligent Machines?
3. Problems with Data Driven Activities
Business Perspective
We have several data streams, but what can we do with them?
More importantly, how?
Good data scientists are hard to find.
Creating a machine learning team in the organization is too costly.
Data Scientist Perspective
Experimenting with many models to find a solution is inconvenient and time-consuming.
4. No free lunch
No universal learning algorithm (for now?)
No learning algorithm is guaranteed to succeed on all learnable tasks.
Any learning algorithm has a limited scope
There is always a trade off
Bias - variance
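The bias-variance trade-off named above is usually stated through the standard decomposition of expected squared error. The symbols here are our own, not the slides': we assume $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2$, and $\hat f$ is the model fitted on a random training set.

```latex
\mathbb{E}\big[(y - \hat f(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat f(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat f(x) - \mathbb{E}[\hat f(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Making a model more flexible tends to lower the bias term and raise the variance term, which is one reason no single algorithm fits every task.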
5. Basis for all: Meta-Learning
Transfer Learning
Inductive Transfer (learning to learn)
Hundreds of thousands of learning episodes
Learning process improves progressively
6. Data Imputation
Multiple imputation methods to enrich solution space
● Statistical methods
● Machine learning based methods
➔ Thousands of possible imputation methods
• Methods employ a combination of algorithms
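A minimal sketch of how combining simple algorithms can produce a large space of imputation methods. The three statistical strategies and the per-column combination scheme are assumptions made for illustration; the slides do not list Vitriol's actual methods.

```python
from itertools import product
from statistics import mean, median, mode

# Hypothetical per-column strategies (not taken from the slides).
STRATEGIES = {"mean": mean, "median": median, "mode": mode}

def impute_column(values, strategy):
    """Replace None entries with a statistic of the observed values."""
    observed = [v for v in values if v is not None]
    fill = STRATEGIES[strategy](observed)
    return [fill if v is None else v for v in values]

def candidate_methods(columns):
    """Yield every assignment of one strategy per column."""
    names = sorted(STRATEGIES)
    for combo in product(names, repeat=len(columns)):
        yield dict(zip(columns, combo))

def impute_table(table, method):
    """Apply the chosen strategy to each column of a column-oriented table."""
    return {col: impute_column(vals, method[col]) for col, vals in table.items()}

table = {"age": [20, None, 40, 40], "income": [1.0, 2.0, None, 6.0]}
methods = list(candidate_methods(list(table)))
completed = impute_table(table, {"age": "median", "income": "mean"})
```

With s strategies and c columns this scheme already yields s^c candidates, so a handful of base algorithms can grow into thousands of combined methods.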
7. Meta-Learner
➔ A rich set of meta-data representing the information about
previous learning tasks
• Analysis of the problem
➢ More than 20 features to describe the dataset
• Selected ML model
➢ Imputation method
➢ Model algorithm
➢ Values of tunable parameters
• Evaluation results in multiple metrics
➢ Regression metrics (R2, accuracy…)
➢ Classification metrics (WeightedFMeasure, ...)
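One way to picture a single meta-data entry is as a record with the three parts listed above. This is only a sketch: the field names, the `MetaRecord` class, and the three example meta-features are hypothetical, and the slides do not say which 20+ dataset features Vitriol computes.

```python
from dataclasses import dataclass

@dataclass
class MetaRecord:
    """One solved learning task, as stored for the meta-learner (hypothetical layout)."""
    dataset_features: dict   # analysis of the problem
    imputation_method: str   # selected ML model: imputation method
    algorithm: str           # selected ML model: model algorithm
    hyperparameters: dict    # selected ML model: tunable parameter values
    metrics: dict            # evaluation results in multiple metrics

def describe_dataset(rows):
    """Compute a few illustrative meta-features for a row-oriented table."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    total = n_rows * n_cols
    missing = sum(v is None for row in rows for v in row)
    return {
        "n_rows": float(n_rows),
        "n_cols": float(n_cols),
        "missing_ratio": missing / total if total else 0.0,
    }

features = describe_dataset([[1, None], [3, 4]])
record = MetaRecord(
    dataset_features=features,
    imputation_method="median",
    algorithm="decision_tree",
    hyperparameters={"max_depth": 5},
    metrics={"weighted_f_measure": 0.9},  # made-up value for the example
)
```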
8. Meta-Learner
➔ Experiments showed that intra-domain transfer learning is more feasible
• Meta-data is categorized by data domain.
➔ How is meta-data exploited?
• Selection of the most suitable model *
➔ Meta-Learner improves with each problem solved.
➔ Online Learning for instant update.
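The slides do not say how the most suitable model is selected. A minimal sketch, assuming a nearest-neighbour lookup over dataset features within the same data domain, with an `update` step that makes each solved problem available immediately. The class name, method names, and the `k` parameter are hypothetical.

```python
import math

class MetaLearner:
    """Toy meta-learner: per-domain memory of (features, config, score)."""

    def __init__(self):
        self.records = {}  # domain -> list of (features, config, score)

    def update(self, domain, features, config, score):
        """Online update: store the outcome of a newly solved problem."""
        self.records.setdefault(domain, []).append((features, config, score))

    def recommend(self, domain, features, k=3):
        """Return the best-scoring config among the k most similar past tasks
        in the same domain, or None if the domain has no history yet."""
        history = self.records.get(domain, [])
        if not history:
            return None
        nearest = sorted(history, key=lambda r: math.dist(r[0], features))[:k]
        return max(nearest, key=lambda r: r[2])[1]

learner = MetaLearner()
learner.update("retail", [1.0, 0.1], "random_forest", 0.80)
learner.update("retail", [1.1, 0.1], "svm", 0.90)
learner.update("retail", [9.0, 0.9], "knn", 0.99)
learner.update("health", [1.0, 0.1], "naive_bayes", 1.00)

choice = learner.recommend("retail", [1.0, 0.1], k=2)
```

Here `choice` comes from the two closest retail tasks only; the distant retail task and the record from another domain are ignored even though they scored higher.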
9. Why Web Application
Easy to Use
No training needed to use it
Globally Accessible
No space limitations
Easy to update
10. Web Service Architecture
Single page application
Component-based design
Non-blocking, single-threaded I/O
Caching for session management
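The slides do not name the server framework, so the following uses Python's `asyncio` purely to illustrate the two ideas above: one thread serving several requests without blocking on I/O, and a cache that keeps session data in memory. `SESSION_CACHE`, `load_session`, and `handle_request` are hypothetical names, and the sleep stands in for a real database or network call.

```python
import asyncio

SESSION_CACHE = {}  # in-memory session cache: session_id -> session data

async def load_session(session_id):
    """Return a session, hitting the (simulated) backing store only on a cache miss."""
    if session_id in SESSION_CACHE:
        return SESSION_CACHE[session_id]
    await asyncio.sleep(0.01)  # stand-in for a non-blocking database lookup
    session = {"id": session_id, "user": f"user-{session_id}"}
    SESSION_CACHE[session_id] = session
    return session

async def handle_request(session_id):
    """Handle one request; awaiting yields the thread to other requests."""
    session = await load_session(session_id)
    return f"hello {session['user']}"

async def main():
    # Three requests are served concurrently on a single thread.
    return await asyncio.gather(*(handle_request(i) for i in (1, 2, 1)))

responses = asyncio.run(main())
```

While one request waits on I/O, the event loop runs the others, so a single thread stays busy without a thread per connection.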
12. Vitriol in Action 1: Data Preprocessing
I. User provides DB credentials
II. Chooses the table to work on
III. Can update the table at any time
IV. Sees the preprocessing results right after connecting to the database
13. Vitriol in Action 2: Model Selection
I. User selects the table that Vitriol should work on
II. Chooses Preprocessing or Model Creation
III. For preprocessing, the two options are clean and complete
IV. To create a model, chooses the column to use as the model's tag (label)
14. Conclusion: Why is this better?
➔ Complete set of method selections (including preprocessing)
➔ Machine learning knowledge is not required
➔ More candidate models are evaluated in less time
➔ Continually improving decisions