This document introduces Fashion-MNIST, a dataset created by Zalando Research as a drop-in replacement for the MNIST dataset for benchmarking machine learning algorithms. Fashion-MNIST consists of 60,000 training images and 10,000 test images from 10 fashion product categories, stored in the same format as MNIST. It was created because MNIST is too easy, overused, and not representative of modern computer vision tasks. The dataset has gained popularity in the machine learning community as an alternative to MNIST, with over 2,000 stars on GitHub and support in several machine learning libraries.
Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms
1. Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms
Zalando Research
Han Xiao
Sept 25, 2017 @ Amazon Berlin
2. About Me
Han Xiao
Beijinger
Senior Research Scientist @ Zalando Research
2.5y engineering experience in Reco and Search teams @ Zalando
Ph.D. & M.Sc. in Computer Science @ TU Munich
Blog: https://hanxiao.github.io LinkedIn: https://www.linkedin.com/in/hxiao87/
3. Zalando Research Portfolio
Deep Style Insight
● Deep Learning
Advanced Image Manipulation
● Generative models
● Deep Learning
Natural Language Processing
● NLP
● Recurrent Neural Networks
Intelligent Control
● Reinforcement Learning
● Causality
● Bayesian Inference
FASHION DNA
VIRTUAL WARDROBE
SEARCH & CHATBOT
(CAUSAL ATTRIBUTION)
5. MNIST vs. Fashion-MNIST
                     MNIST                Fashion-MNIST
Published in         1997                 2017
Content              Handwritten digits   Fashion assortments
Image type           28x28 grayscale      28x28 grayscale
#classes             10, balanced         10, balanced
#training examples   60,000               60,000
#test examples       10,000               10,000
File format          IDX, gzipped         IDX, gzipped
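Since both datasets share the gzipped IDX format, the byte layout can be decoded with a few lines of standard-library Python. This is a minimal sketch, assuming unsigned-byte data only (the type used for the images and labels); the function names `parse_idx` and `read_idx_gz` are illustrative and not part of any released loader.

```python
import gzip
import struct

# IDX type code for unsigned bytes; other type codes are not handled in this sketch.
UNSIGNED_BYTE = 0x08


def parse_idx(raw: bytes):
    """Decode an uncompressed IDX buffer into (dims, payload)."""
    # Magic number: two zero bytes, one type code, one dimension count.
    zero1, zero2, dtype, ndim = struct.unpack(">BBBB", raw[:4])
    if zero1 != 0 or zero2 != 0:
        raise ValueError("not an IDX buffer")
    if dtype != UNSIGNED_BYTE:
        raise ValueError("this sketch only handles unsigned-byte data")
    # Each dimension size is a 4-byte big-endian unsigned integer.
    dims = struct.unpack(">" + "I" * ndim, raw[4:4 + 4 * ndim])
    payload = raw[4 + 4 * ndim:]
    expected = 1
    for d in dims:
        expected *= d
    if len(payload) != expected:
        raise ValueError("payload size does not match header")
    return dims, payload


def read_idx_gz(path):
    """Read a gzipped IDX file from disk (path is supplied by the caller)."""
    with gzip.open(path, "rb") as f:
        return parse_idx(f.read())


# Synthetic buffer standing in for a 2-image file of 28x28 pixels.
header = struct.pack(">BBBBIII", 0, 0, UNSIGNED_BYTE, 3, 2, 28, 28)
dims, payload = parse_idx(header + bytes(2 * 28 * 28))
print(dims)  # (2, 28, 28)
```

An image file has three dimensions (count, rows, columns) and a label file has one (count), so the same function covers both.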
6. What is (not) Fashion-MNIST?
● It is a toy dataset;
● it is a drop-in replacement for the MNIST dataset;
● it can be used for benchmarking/testing machine learning algorithms.
● It is not a new challenge to the ML community.
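"Drop-in" in practice means an existing MNIST loader keeps working when pointed at a different directory. A rough sketch follows, assuming the four files keep the usual MNIST names (`train-images-idx3-ubyte.gz`, `train-labels-idx1-ubyte.gz`, and the `t10k-` counterparts) and that headers are skipped by fixed offsets. `load_split` and the directory names in the comments are hypothetical.

```python
import gzip
import os

IMAGE_HEADER_BYTES = 16  # magic number + image count + rows + columns
LABEL_HEADER_BYTES = 8   # magic number + label count
PIXELS = 28 * 28


def load_split(data_dir, kind="train"):
    """Return (images, labels) for kind 'train' or 't10k'.

    images is a list of 784-byte rows, labels a list of ints.
    """
    labels_path = os.path.join(data_dir, f"{kind}-labels-idx1-ubyte.gz")
    images_path = os.path.join(data_dir, f"{kind}-images-idx3-ubyte.gz")
    with gzip.open(labels_path, "rb") as f:
        labels = list(f.read()[LABEL_HEADER_BYTES:])
    with gzip.open(images_path, "rb") as f:
        pixels = f.read()[IMAGE_HEADER_BYTES:]
    # Slice the flat pixel buffer into one row of 784 bytes per image.
    images = [pixels[i:i + PIXELS] for i in range(0, len(pixels), PIXELS)]
    if len(images) != len(labels):
        raise ValueError("image and label counts differ")
    return images, labels


# Only the directory changes between the two datasets (hypothetical paths):
#   images, labels = load_split("data/mnist", kind="train")
#   images, labels = load_split("data/fashion", kind="train")
```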
7. Motivation of Fashion-MNIST/Why Move Away from MNIST?
MNIST is too easy.
MNIST is overused.
MNIST cannot represent modern CV tasks.
8. Story behind Fashion-MNIST
I was working on some generative models;
I validated them on MNIST and found the task trivial and the digits boring;
I started to grab some images and build my own dataset;
I was too lazy to write another data loader, so I stored the data in the same format as MNIST.
10. Building Fashion-MNIST dataset
Images are front-look photos from Zalando's online assortment, shot by in-house photographers.
Class labels are manually annotated by in-house experts.
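The annotated labels are stored as integers 0 to 9. The sketch below maps them to names and checks the "10, balanced" property from the comparison table. The class names are taken from the dataset's README rather than from these slides, `is_balanced` is an illustrative helper, and the label list is synthetic.

```python
from collections import Counter

# Class names as listed in the Fashion-MNIST README (not stated in these slides).
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def is_balanced(labels, num_classes=10):
    """True if every class appears, and all classes appear equally often."""
    counts = Counter(labels)
    return len(counts) == num_classes and len(set(counts.values())) == 1


# Synthetic stand-in for the 60,000 training labels: 6,000 per class.
labels = [i % 10 for i in range(60000)]
print(is_balanced(labels))     # True
print(CLASS_NAMES[labels[9]])  # Ankle boot
```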
14. On Aug. 25, the dataset was released on GitHub
15. Achievements
● 2K stars on GitHub in 6 days
● GitHub trending #2 from 26.08 to 28.08
● Tons of discussions on Twitter, Reddit, Hacker News, Facebook
● Supported by 7+ ML libraries
● 15+ benchmarks submitted by researchers all over the world
● 2 translations of README.md
17. Highlight: supported by 8 machine learning libraries
● Apache MXNet Gluon (master ver.)
● deeplearn.js
● Kaggle
● PyTorch
● Keras
● Edward
● TensorFlow (master ver.)
● Torch
18. Highlight: 20+ benchmarks from the world
● 20+ submissions
● Easy to use
● More challenging than MNIST (a 2-conv-layer network gives 99.2% on MNIST vs. 92.5% on Fashion-MNIST)
● "Shallow" learning algorithms are under 90%
● Best result is 96.3%, given by Wide Residual Networks with Random Erasing Data Augmentation
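For readers unfamiliar with Random Erasing, the idea is to overwrite one randomly placed rectangle of each training image with noise, with some probability. The sketch below is a simplified standard-library version, not the code behind the 96.3% submission; the defaults for `p`, `area_range` and `min_aspect` are assumptions, and `random_erase` is an illustrative name.

```python
import math
import random


def random_erase(image, p=0.5, area_range=(0.02, 0.4), min_aspect=0.3,
                 max_attempts=100, rng=random):
    """Return a copy of image with one random rectangle replaced by noise.

    image is a list of rows of floats; the erasure is applied with probability p.
    """
    out = [row[:] for row in image]
    if rng.random() >= p:
        return out  # leave this image untouched
    height, width = len(out), len(out[0])
    for _ in range(max_attempts):
        # Sample the rectangle's area (as a fraction of the image) and aspect ratio.
        area = rng.uniform(*area_range) * height * width
        aspect = rng.uniform(min_aspect, 1.0 / min_aspect)
        h = int(round(math.sqrt(area * aspect)))
        w = int(round(math.sqrt(area / aspect)))
        if 0 < h < height and 0 < w < width:
            top = rng.randint(0, height - h)
            left = rng.randint(0, width - w)
            for y in range(top, top + h):
                for x in range(left, left + w):
                    out[y][x] = rng.random()  # uniform noise in [0, 1)
            return out
    return out  # no sampled rectangle fit inside the image


# Example on a blank 28x28 image, with erasing forced on.
blank = [[0.0] * 28 for _ in range(28)]
augmented = random_erase(blank, p=1.0, rng=random.Random(0))
```

The input is copied rather than modified in place, so the same training image can be augmented differently in each epoch.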