2. Getting things in perspective …
[Figure: conventional supervised learning; a predictor is trained on many labeled images of a class such as "Dog"]
Let us limit the discussion to:
- Image classification
- Supervised learning
- Closed-set recognition
Limitations of this setting:
- It requires a huge amount of data for each task
- A new task requires retraining
- Humans, however, can learn such tasks effortlessly
Image source from “The CIFAR-10 dataset” (https://www.cs.toronto.edu/~kriz/cifar.html)
Image Source: https://unsplash.com/
3. Getting things in perspective …
Questions humans can answer at a glance:
- How many people are there?
- What is this place?
- Where is this place?
- What is the time of day?
- What is the temperature?
- What is the mood?
- Do they practice social distancing?
- Do they wear masks?
Humans:
- Can decompose and manipulate representations
- Can adapt to the task at hand
- Need no extra training
Data Bias!
Image Source: https://unsplash.com/
4. Getting things in perspective …
To mimic this human ability:
- Finding good priors
Blank slate vs. innate behaviors
- Good representations
Learning with the help of 'unlabeled' data, such as self-supervised learning
- Transfer learning
Knowledge transfer from one task to another (for example, improving face recognition using another model that handles different facial expressions)
- Few-shot learning
This is what we will be talking about!
5. Few-Shot Learning
- The goal is to classify new data after being given only a few samples per class
- The extreme case, with a single sample per class, is called one-shot learning
[Figure: a query image "?" to be matched against a few support samples each of Class 1 and Class 2]
- The aim is not to solve the insufficient-data problem, but to provide an alternative way of handling very little data per class
Source: https://unsplash.com/
8. Few-Shot Learning
- N-way-K-shot: a task consists of a support set with N classes and K labeled samples per class, plus a query set of samples to be classified (a sampling sketch follows below)
[Figure: a 2-way-4-shot task; the support set holds four images each of "Truck" and "Car", and the query image "?" must be classified]
Image source: https://unsplash.com/
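As a concrete illustration, here is a minimal sketch of how such an episode could be sampled from a labeled dataset. The function name and the (image, label) list layout are illustrative assumptions, not from the slides:

import random
from collections import defaultdict

def sample_episode(dataset, n_way=2, k_shot=4, n_query=1):
    """Sample one N-way-K-shot task from a list of (image, label) pairs."""
    by_class = defaultdict(list)
    for image, label in dataset:
        by_class[label].append(image)
    # Pick N classes at random and hide their identity behind
    # episode-local labels 0..N-1, so no fixed sample-class binding exists.
    classes = random.sample(list(by_class), n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        images = random.sample(by_class[cls], k_shot + n_query)
        support += [(img, episode_label) for img in images[:k_shot]]
        query += [(img, episode_label) for img in images[k_shot:]]
    return support, query

For the 2-way-4-shot truck/car example above, sample_episode(data, n_way=2, k_shot=4) would return a support set of eight labeled images plus the query image(s) to classify.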
9. Few-Shot Learning
Meta-Learning Framework
- The conventional approach trains the model on a dataset to perform classification
- Meta-learning instead 'trains' the model to learn how to use a dataset to perform classification (learning to learn; see the training-loop sketch below)
[Figure: two tasks, each with its own support samples for Class 1 and Class 2 and a query "?" to classify]
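To make the contrast concrete, a sketch of the episodic training loop, building on sample_episode() above. The model interface episode_loss(support, query) is an assumed placeholder (for instance, the prototypical sketch later in these notes):

def meta_train(model, optimizer, meta_train_data, steps, n_way, k_shot):
    for _ in range(steps):
        # Conventional training would minimize loss on one fixed dataset.
        # Meta-training instead minimizes the QUERY loss obtained after the
        # model has used the SUPPORT set, over many different tasks:
        # the model is trained to learn how to use a dataset.
        support, query = sample_episode(meta_train_data, n_way, k_shot)
        loss = model.episode_loss(support, query)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()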
11. Meta-Learning: Learning to Learn
[Figure: meta-training and meta-testing; each element is itself a 'train a predictor, then classify (e.g., Dog)' task]
- There is no fixed sample-class binding
- Each 'data sample' presented to the meta-learner is itself a task
Image source from: https://lilianweng.github.io/lil-log/2018/11/30/meta-learning.html
12. Meta-Learning: Classes, samples and labels shuffling
- Classes, samples and label assignments are reshuffled from episode to episode, so 'Class 1' and 'Class 2' are only meaningful within a single episode and the model cannot memorize a fixed binding
[Figure: three episodes, each with its own reshuffled Class 1 / Class 2 assignment]
Image source modified from: https://lilianweng.github.io/lil-log/2018/11/30/meta-learning.html
13. Meta-Learning
- Based on similarity
- Matching networks
- Prototypical networks
- Relation networks
- Based on learning algorithm
- Model agnostic meta-learning (MAML)
- Memory-augmented neural networks (MANN)
- Based on data
- Bayesian programs
14. Meta-Learning: Based on Similarity
[Figure: a matching network classifies a query by multiplying the support labels with attention weights (e.g., 0.08, 0.02, 0.1, 0.8) and summing; a prototypical network instead compares the query embedding to per-class prototypes. A code sketch follows below]
[1] Vinyals, O., Blundell, C., Lillicrap, T., & Wierstra, D. (2016). Matching networks for one shot learning. Advances in neural information processing systems, 29, 3630-3638.
[2] Snell, J., Swersky, K., & Zemel, R. S. (2017). Prototypical networks for few-shot learning. arXiv preprint arXiv:1703.05175.
Image source from original paper [1]
Image source from original paper [2]
Image modified from original paper [1]
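A minimal sketch of the prototypical-network idea [2] in PyTorch; the tiny embedding network and input shapes are illustrative assumptions. A matching network [1] would instead keep every support embedding and attention-weight the support labels, as in the figure above:

import torch
import torch.nn as nn

# Illustrative embedding network; any image encoder could be used here.
embed = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64))

def prototypical_logits(support_x, support_y, query_x, n_way):
    """Score queries by negative squared distance to class prototypes [2].
    support_x: (N*K, 1, 28, 28) images, support_y: (N*K,) labels in
    0..n_way-1, query_x: (Q, 1, 28, 28) images; returns (Q, N) logits."""
    z_support = embed(support_x)
    z_query = embed(query_x)
    # A prototype is the mean embedding of a class's support samples.
    prototypes = torch.stack(
        [z_support[support_y == c].mean(dim=0) for c in range(n_way)]
    )
    return -torch.cdist(z_query, prototypes) ** 2

Training minimizes cross-entropy between these logits and the query labels, episode after episode, so the embedding is meta-learned to make nearest-prototype classification work on classes never seen during training.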
15. Meta-Learning: Based on Similarity
Image source from: https://www.borealisai.com/en/blog/tutorial-2-few-shot-learning-and-meta-learning-i/
16. Meta-Learning: Based on Learning Algorithm
Memory Augmented Neural Network (MANN)
Learns an algorithm for storing and retrieving memories [1]
[Figure: an episode presented as a sequence of samples with shuffled labels (dog, cat, dog, dog, cat, …); the label input begins as NULL]
[1] Santoro, A., Bartunov, S., Botvinick, M., Wierstra, D., & Lillicrap, T. (2016, June). Meta-learning with memory-augmented neural networks. In International Conference on Machine Learning (pp. 1842-1850). PMLR.
Image source from original paper [1]
Image source: https://unsplash.com/
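The key training trick in [1] is that, within an episode, the label for each sample is presented one time step later (with a NULL label at the first step), forcing the network to store a sample in memory and retrieve it when its class reappears. A minimal sketch of that input construction; names and shapes are illustrative assumptions:

import torch
import torch.nn.functional as F

def mann_episode_inputs(images, labels, n_classes):
    """Build the time-offset inputs used to train a MANN [1].
    images: (T, C, H, W), labels: (T,) episode-shuffled class indices.
    At step t the network sees (image_t, one_hot(label_{t-1})); step 0
    gets an all-zero 'NULL' label, and the loss at step t targets label_t."""
    one_hot = F.one_hot(labels, n_classes).float()
    shifted = torch.cat([torch.zeros(1, n_classes), one_hot[:-1]], dim=0)
    flat = images.flatten(start_dim=1)
    return torch.cat([flat, shifted], dim=1)  # (T, C*H*W + n_classes)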
17. Meta-Learning: Based on Learning Algorithm
Model-agnostic meta-learning (MAML) [1]
[Figure: parameter space (w1, w2); a meta-learned initialization 'Init' sits where a few gradient steps can reach the optima of Task 1, Task 2 and Task 3]
[1] Finn, C., Abbeel, P., & Levine, S. (2017, July). Model-agnostic meta-learning for fast adaptation of deep networks. In International Conference on Machine Learning (pp. 1126-1135). PMLR.
Image source modified from: https://lilianweng.github.io/lil-log/2018/11/30/meta-learning.html
18. Meta-Learning: Based on Learning Algorithm
Model-agnostic meta-learning (MAML)
[Figure: inner loop; starting from the shared initialization, each task's data drives task-specific learning toward the optima of Task 1, Task 2 and Task 3 in (w1, w2) space]
19. Meta-Learning: Based on Learning Algorithm
Model-agnostic meta-learning (MAML)
[Figure: outer loop; the meta-gradients from each task's data (Task 1, Task 2, Task 3) are combined to update the shared initialization itself; see the sketch below]
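The two loops in these figures can be written down compactly. Below is a first-order sketch in PyTorch (the full method in [1] also differentiates through the inner update); the task tuple format and step sizes are illustrative assumptions:

import copy
import torch
import torch.nn.functional as F

def maml_step(model, meta_opt, tasks, inner_lr=0.01):
    """One meta-update in a first-order MAML style (simplified from [1]).
    tasks: list of (support_x, support_y, query_x, query_y) tuples."""
    meta_opt.zero_grad()
    for support_x, support_y, query_x, query_y in tasks:
        # Inner loop: adapt a copy of the shared init to this task.
        fast = copy.deepcopy(model)
        inner_loss = F.cross_entropy(fast(support_x), support_y)
        grads = torch.autograd.grad(inner_loss, list(fast.parameters()))
        with torch.no_grad():
            for p, g in zip(fast.parameters(), grads):
                p -= inner_lr * g
        # Outer loop: evaluate the adapted copy on the query set and
        # accumulate its gradients onto the shared initialization.
        F.cross_entropy(fast(query_x), query_y).backward()
        for p, fp in zip(model.parameters(), fast.parameters()):
            p.grad = fp.grad.clone() if p.grad is None else p.grad + fp.grad
    meta_opt.step()

After meta-training, the shared initialization needs only a few inner-loop steps on a new task's small support set to adapt.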
21. Meta-Learning: Based on Data
Modeling through Bayesian Programs
- The structure of the model encodes information about how the output is generated (the prior)
- Meta-learning learns how various Bayesian program modules can combine to express unseen data
- Remember probabilistic programming with Pyro?
[1] Lake, B. M., Salakhutdinov, R., & Tenenbaum, J. B. (2015). Human-level concept learning through probabilistic program induction. Science, 350(6266), 1332-1338.
Image source from original paper [1]
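As a reminder of the flavor, a toy Pyro program whose structure itself encodes a prior on how a character-like output is generated; the specific distributions and stroke modules here are illustrative assumptions, not the actual model of [1]:

import pyro
import pyro.distributions as dist
import torch

def character_program():
    """Toy generative program: its structure is the prior on how output
    is created. The reusable stroke sub-program plays the role of the
    Bayesian program modules that meta-learning would learn to combine."""
    # How many strokes compose the character.
    n_strokes = 1 + int(pyro.sample("n_strokes", dist.Poisson(2.0)).item())
    strokes = []
    for i in range(n_strokes):
        # Each stroke reuses the same sub-program (a module).
        start = pyro.sample(f"start_{i}", dist.Normal(torch.zeros(2), 1.0))
        length = pyro.sample(f"length_{i}", dist.Gamma(2.0, 2.0))
        strokes.append((start, length))
    return strokes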
22. Consideration (after getting things in perspective…)
- Do I need 'learning to learn', or do I just lack data?
Does my application justify its use?
- Is my dataset sufficient?
A huge amount of data does not necessarily mean it is sufficient
- What prior knowledge do I have?
For example: a data model, invariance assumptions
- Are there any training constraints I can impose?
For example: curriculum learning, multiple losses, feature-space constraints