Using AI to build AI is a promising way to put the power of AI in the hands of those who cannot afford it the way multinational corporations can. The technology is also known as Automated Machine Learning (AutoML). OneClick.ai is the first deep learning AutoML platform that makes the latest AI technology accessible to anyone, with or without an AI background. This deck gives a 30-minute overview of the recent history of AutoML and how OneClick.ai innovates on it. Check out our platform at http://www.oneclick.ai
1. Use AI to Build AI
The Evolution of AutoML
Ning Jiang
CTO, OneClick.ai
2018
2. Ning Jiang
Co-founder of OneClick.ai, the first automated
deep learning platform on the market.
Previously a Dev Manager at Microsoft Bing, Ning
has over 15 years of R&D experience in AI for ads,
search, and cybersecurity.
4. { Challenges in AI Applications }
1. Never enough experienced data scientists
2. Long development cycles (typically 3 to 6 months)
3. High risk of failure
4. Endless engineering traps in implementation and maintenance
5. { Challenges That Come With Deep Learning }
1. Few experienced data scientists and engineers
2. Increasing complexity in data (mixing images, text, and numbers)
3. Algorithms need to be customized
4. More design choices and hyperparameters
5. Much harder to debug
8. { Key Challenges }
1. Satisfy semantic constraints (e.g., data types)
2. Use feedback to improve model designs
3. Minimize the number of models to train
4. Avoid local minima
5. Speed up model training
9. { Neural Architecture Search }
1. Evolutionary algorithms
(ref: https://arxiv.org/abs/1703.01041)
2. Greedy search
(ref: https://arxiv.org/abs/1712.00559)
3. Reinforcement learning
(ref: https://arxiv.org/abs/1611.01578)
4. Parameter sharing to speed up model training
(ref: https://arxiv.org/abs/1802.03268)
11. { Target Scenarios }
1. Image classification (on CIFAR-10 & ImageNet)
2. Using only Convolution & Pooling layers
3. This is what powers Google AutoML
12. { Constraints }
1. Predefined architectures
2. N = 2 (each cell is repeated N times in the predefined stack)
3. Number of filters decided by heuristics
4. NAS finds the optimal cell structure
13. { Basic Constructs }
Each construct has:
1. Two inputs
2. Processed by two operators
3. One combined output
(Diagram: Input 1 and Input 2, processed by Operator 1 and Operator 2, combined into one output.)
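To make the slide's structure concrete, here is a minimal sketch of a construct as a plain data type; the class name Construct, the field names, and the operator strings are illustrative, not taken from any published implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Construct:
    """One basic construct: two inputs, two operators, one combined output."""
    input1: int  # index of an earlier output (a prior construct or a cell input)
    input2: int
    op1: str     # operator applied to input1, e.g. "conv3x3"
    op2: str     # operator applied to input2
    # The two operator outputs are combined (e.g. element-wise addition)
    # into this construct's single output.
```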
14. { Predefined Operators }
Why these and these only?
1. 3x3 convolution
2. 5x5 convolution
3. 7x7 convolution
4. Identity (pass-through)
5. 3x3 average pooling
6. 3x3 max pooling
7. 3x3 dilated convolution
8. 1x7 followed by 7x1 convolution
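In code, this operator vocabulary is just a fixed list the search samples from. A sketch, with made-up string names for the operators:

```python
# The 8 predefined operators of the search space.
OPERATORS = (
    "conv3x3", "conv5x5", "conv7x7",  # convolutions
    "identity",                       # pass-through
    "avgpool3x3", "maxpool3x3",
    "dilated_conv3x3",
    "conv1x7_then_7x1",               # 1x7 followed by 7x1
)
```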
15. { Cells }
1. Stacking up to 5 basic constructs
2. About 5.6 × 10^14 cell candidates
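The 5.6 × 10^14 count follows from simple counting: in the PNAS setup, construct b can draw each of its two inputs from b + 1 earlier outputs (the cell's inputs plus the outputs of earlier constructs) and each of its two operators from the 8 predefined ones, giving (b + 1)^2 × 8^2 choices at step b. A quick check of the product:

```python
# Count cell candidates: construct b picks 2 inputs from (b + 1) options
# and 2 operators from the 8 predefined ones.
total = 1
for b in range(1, 6):              # up to 5 constructs per cell
    total *= (b + 1) ** 2 * 8 ** 2
print(f"{total:.1e}")              # -> 5.6e+14
```

Note that the b = 1 factor, 2^2 × 8^2 = 256, is exactly the "256 possibilities" quoted on the next slide.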
16. { Greedy Search }
1. Start with a single construct (m = 1)
2. There are 256 possibilities
3. Add one more construct
4. Pick the best K (256) cells to train
5. Repeat steps 3 and 4 until we have 5 constructs in the cell
6. ≈1,280 models to be trained in total (256 at each of the 5 steps)
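A minimal sketch of this loop, assuming three hypothetical helpers: expand(cell) enumerates every way to add one construct, train_and_score(cells) trains the cells and returns (cell, accuracy) pairs, and predict_score(cell) is the surrogate predictor described on the next slide.

```python
def greedy_search(expand, train_and_score, predict_score,
                  K=256, max_constructs=5):
    # Steps 1-2: enumerate and train all single-construct cells.
    scored = train_and_score(expand([]))          # 256 possibilities at m = 1
    for m in range(2, max_constructs + 1):
        # Step 3: grow every trained cell by one more construct.
        grown = [g for cell, _ in scored for g in expand(cell)]
        # Step 4: rank with the surrogate and train only the top K.
        top_k = sorted(grown, key=predict_score, reverse=True)[:K]
        scored = train_and_score(top_k)
    # Return the best (cell, accuracy) pair found at the final size.
    return max(scored, key=lambda pair: pair[1])
```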
17. { Pick the Best Cells }
1. Cells as a sequence of choices
2. LSTM to estimate model accuracy
3. Training data come from trained models (up to 1,024 examples)
4. 99.03% accuracy at m = 2
5. 99.52% at m = 5
(Diagram: the choices Input 1, Input 2, Operator 1, Operator 2 feed an LSTM, followed by a Dense layer that outputs the predicted accuracy.)
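A sketch of such a predictor in TensorFlow/Keras; the vocabulary size, embedding width, and LSTM width are made-up illustration values, not the paper's settings.

```python
import tensorflow as tf

VOCAB = 32  # illustrative: enough ids for all input/operator choices

def build_accuracy_predictor(seq_len=4 * 5):
    # Each cell is flattened into a sequence of discrete choices:
    # (input1, input2, operator1, operator2) for up to 5 constructs.
    choices = tf.keras.Input(shape=(seq_len,), dtype="int32")
    x = tf.keras.layers.Embedding(VOCAB, 32)(choices)
    x = tf.keras.layers.LSTM(64)(x)
    acc = tf.keras.layers.Dense(1, activation="sigmoid")(x)  # predicted accuracy
    return tf.keras.Model(choices, acc)

predictor = build_accuracy_predictor()
predictor.compile(optimizer="adam", loss="mse")
# predictor.fit(encoded_cells, measured_accuracies)  # up to ~1,024 examples
```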
18. { Summary }
1. Fewer models to train
○ Remarkable improvement over evolutionary algorithms
2. Search from simple to complex models
3. Heavy use of domain knowledge and heuristics
4. Suboptimal results due to greedy search
5. Can’t generalize to other problems
24. { Stochastic Sampling }
For example:
1. Filter size has 4 choices: 24, 36, 48, 64
2. For each convolution layer, the RNN outputs a distribution:
○ (60%, 20%, 10%, 10%)
○ With 60% chance, the filter size will be 24
3. This helps collect data to correct the controller's mistakes
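A sketch of the sampling step in NumPy, using the distribution from the slide:

```python
import numpy as np

rng = np.random.default_rng()

FILTER_SIZES = [24, 36, 48, 64]  # the 4 choices
probs = [0.6, 0.2, 0.1, 0.1]     # distribution output by the RNN controller

# Sampling (rather than always taking the argmax) means the controller
# still tries 36, 48, or 64 about 40% of the time, producing the data
# needed to correct its own mistakes.
filter_size = rng.choice(FILTER_SIZES, p=probs)
```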
25. { Training the RNN Controller }
1. Use REINFORCE to update the controller parameters
○ Binary rewards (0/1)
○ The trained model's accuracy is the probability of the reward being 1
○ Apply cross entropy to the RNN outputs
2. Designs with higher accuracy are assigned higher probability
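A sketch of one such update in TensorFlow; controller, inputs, and sampled_choices are placeholders for whatever RNN controller and design encoding are in use, not a specific published implementation.

```python
import tensorflow as tf

optimizer = tf.keras.optimizers.Adam(1e-3)

def reinforce_step(controller, inputs, sampled_choices, reward):
    """One REINFORCE update; `reward` is the sampled design's validation
    accuracy, interpreted as the probability of the binary reward being 1."""
    with tf.GradientTape() as tape:
        logits = controller(inputs)  # per-step logits over design choices
        # Cross entropy of the sampled choices = their negative log-probability.
        nll = tf.keras.losses.sparse_categorical_crossentropy(
            sampled_choices, logits, from_logits=True)
        # Weighting by the reward makes high-accuracy designs more probable.
        loss = tf.reduce_sum(nll) * reward
    grads = tape.gradient(loss, controller.trainable_variables)
    optimizer.apply_gradients(zip(grads, controller.trainable_variables))
```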
26. { Speed Up Model Training }
1. When the same layers are shared across architectures
2. Share the same layer parameters
3. Alternate training between models
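A sketch of the weight-sharing idea with illustrative names: a single pool of layers is created once and reused by every sampled architecture, so alternating training between models updates one shared set of parameters.

```python
import tensorflow as tf

# One shared pool of layers, created once and reused by every architecture.
shared_layers = {
    "conv3x3": tf.keras.layers.Conv2D(64, 3, padding="same"),
    "conv5x5": tf.keras.layers.Conv2D(64, 5, padding="same"),
    "maxpool3x3": tf.keras.layers.MaxPooling2D(3, strides=1, padding="same"),
}

def forward(architecture, x):
    # An architecture is just a sequence of operator names; each name looks
    # up the *same* layer object, so gradients from any sampled model
    # update the shared parameters.
    for op in architecture:
        x = shared_layers[op](x)
    return x
```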
27. { Summary }
1. Better model accuracy
2. Can be made to work with complex architectures
3. Able to correct controller mistakes (e.g., bias)
4. Speeds up training when layers can be shared
○ From 40K down to 16 GPU hours
5. Designed for a specific type of problem
6. Still very expensive: typically 10K GPU hours
29. { Challenges }
1. NAS algorithms are domain-specific
2. Only neural networks are supported
3. Heavy use of human heuristics
4. Expensive (thousands of GPU hours)
5. Cold-start problem: NAS has no prior knowledge about the data
30. { Our Answer }
(Diagram: a feedback loop in which the controller proposes model designs, the designs are trained on training data and validated on validation data, and the validation results flow back to the controller.)
31. { Generalized Architecture Search }
1. Accumulates domain knowledge over time
2. Works with any algorithm (neural networks or not)
3. Automated feature engineering
4. Far fewer models to train
5. GAS powers OneClick.ai
32. Use AI to Build AI
1. Custom-built deep learning models for best performance
2. Model designs improved iteratively within a few hours
3. Better models in fewer shots thanks to self-learned domain knowledge
Meta-learning evaluates millions of deep learning models in the blink of an eye. US patent pending.
33. Versatile Applications
1. Data types: numeric, categorical, date/time, textual, images
2. Applications: regression, classification, time-series forecasting, clustering, recommendations, vision
Powered by deep learning, we support an unprecedented range of applications and data types.
34. Unparalleled Simplicity
1. Users need zero AI background
2. Simpler to use than Excel
3. Advanced functions available to experts via a chatbot
Thanks to a chatbot-based UX, we can accommodate both novice and expert users.
35. Use AI to Build AI
Sign up at http://oneclick.ai
ask@oneclick.ai