5 Practical Steps to a Successful Deep Learning Research

  1. 5 Practical Steps to a Successful Deep Learning Research Amir Alush, PhD, Co-founder & CTO
  2. Brodmann17 Founded in 2016, a team of 19+, mostly M.Sc. / Ph.D. machine learning researchers. Backed by: lool Ventures, Maniv Ventures, Sony Innovation & SamsungNEXT. Brodmann17 has independently designed a deep learning technology from scratch (patents pending), with optimal performance and accuracy by design. Brodmann17 is developing perception software for the world’s largest Tier-1 automotive suppliers, for pre-install/aftermarket ADAS & autonomous driving.
  3. Things I’ll Talk About 1. Requirements 2. Data Collection 3. Data Annotation 4. Research Evaluation Metric 5. Research
  4. Step 1: Set your Requirements Open research can lead to great new products, but it is also risky. Always keep this alive! Product-oriented research must have clear requirements: ● What is the task? ● What is the data? ● What is the target platform? CPU (ARM/Intel), GPU (ARM/NVIDIA)
  5. Step 1: Set your Requirements (example) Smart Doorbell “Requirements”: ● Task: ○ Alert when a human appears (once) with 98% recall, 1 false alarm per week ○ 0.3-1.5 meters distance from camera ○ Full / upper body only ○ Unique ID per person ● Input: ○ 720p RGB images, 30fps ○ Camera height: 1.5 meters from ground ● Platform & Run-Time: ○ Raspberry Pi 3, 1x A53 ARM CPU ○ 0.5 sec latency from appearance to alert
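A minimal sketch of how such requirements could be captured as one machine-readable spec that later stages (data collection, evaluation) can check against. The field names and the dataclass layout are illustrative assumptions, not from the talk:

```python
# Hypothetical sketch: the smart-doorbell requirements as a single
# source of truth for collection and evaluation code. All names are
# illustrative, not Brodmann17's actual format.
from dataclasses import dataclass

@dataclass(frozen=True)
class ProductRequirements:
    task: str = "person_alert"
    target_recall: float = 0.98            # alert on 98% of appearances
    max_false_alarms_per_week: float = 1.0
    min_distance_m: float = 0.3            # operating range from camera
    max_distance_m: float = 1.5
    input_resolution: tuple = (1280, 720)  # 720p RGB
    input_fps: int = 30
    camera_height_m: float = 1.5
    platform: str = "Raspberry Pi 3, 1x A53 ARM CPU"
    max_latency_s: float = 0.5             # appearance-to-alert budget

REQUIREMENTS = ProductRequirements()
print(REQUIREMENTS)
```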
  6. Things I’ll Talk About 1. Requirements 2. Data Collection 3. Data Annotation 4. Research Evaluation Metric 5. Research
  7. Step 2: Data Collection Data collection is a long and expensive process: ● Long: proprietary setup, requires variability, takes time, may depend on another company ● Expensive: buying data, special setups, storage, management, etc. You should: ● Start early ● Collect the right data. Plan thoughtfully; the wrong data could hold back your product release
  8. Step 2: Data Collection (quantity) How much data do I need? ● Quantity is important, but it comes with a price tag and takes time ● Quality is more important. It’s a continuous process: 1. Start with a small subset for a fast POC to reduce risks 2. Increase the collection rate 3. Collect data to improve research metrics. Also worth mentioning: ● Academic data ● Synthesizing data
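One common way to answer "how much data do I need?" empirically is a learning curve: train on growing subsets and watch the validation metric. A minimal sketch, assuming a hypothetical `train_and_evaluate(subset)` helper that trains a model and returns the research metric:

```python
# Hypothetical sketch of a learning curve for estimating data needs.
# `train_and_evaluate` is an assumed helper: it trains on the given
# samples and returns the validation metric as a float.
import random

def learning_curve(samples, train_and_evaluate, fractions=(0.1, 0.25, 0.5, 1.0)):
    random.shuffle(samples)
    results = []
    for frac in fractions:
        subset = samples[: max(1, int(len(samples) * frac))]
        metric = train_and_evaluate(subset)
        results.append((len(subset), metric))
        print(f"{len(subset):6d} samples -> metric {metric:.3f}")
    # If the curve is still rising at 100% of the data, collect more.
    return results
```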
  9. Step 2: Data Collection (quality) Meet product requirements: ● Same modality ● Cover the expected operating-condition distribution (scene appearance, object appearance, viewpoint, etc.) Using Pascal/COCO for the Smart Doorbell?
  10. Step 2: Data Collection (quality) Meet product requirements: ● Same modality ● Cover the expected operating-condition distribution (scene appearance, object appearance, viewpoint, etc.) Doorbell camera example [image grid: four unsuitable examples marked X, one suitable example marked OK]
  11. Step 2: Data Collection (quality) Meet product requirements: ● Same modality ● Cover the expected operating-condition distribution (e.g. scene appearance, object appearance, viewpoint, scene type) Traffic monitoring application [image grid: five unsuitable examples marked X, one suitable example marked OK]
  12. Step 2: Data Collection (quality) Data with variability: ● Collecting a correlated data set is easy ● Collect data under different conditions: e.g. location, time of day, season, weather ● Collect data from multiple sources (cameras, devices)
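A quick way to catch a correlated dataset is to count samples per condition bucket and look for gaps. A minimal sketch; the metadata fields and example values are illustrative assumptions:

```python
# Hypothetical sketch: check that collected data actually varies across
# the conditions that matter. Each sample carries metadata tags
# (illustrative names); skewed counts signal a correlated dataset.
from collections import Counter

samples = [
    {"location": "site_a", "time_of_day": "day",   "weather": "clear"},
    {"location": "site_a", "time_of_day": "day",   "weather": "clear"},
    {"location": "site_b", "time_of_day": "night", "weather": "rain"},
]

for key in ("location", "time_of_day", "weather"):
    counts = Counter(s[key] for s in samples)
    print(key, dict(counts))  # e.g. no night/rain footage -> collection gap
```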
  14. Things I’ll Talk About 1. Requirements 2. Data Collection 3. Data Annotation 4. Research Evaluation Metric 5. Research
  15. Step 3: Data Annotation (Quantity) More expensive and time-consuming than the collection part: ● Can cost up to several dollars per frame! ● Understand what you will need in the research phase
  16. Step 3: Data Annotation (Quantity) Choose what data to annotate: ● You should not annotate all your data ● Annotate quality data It’s a continuous process: ● Start with a small subset and a fixed annotation scheme ● Increase the annotation rate
  17. Step 3: Data Annotation (Quality) Supervised learning: ● This is the actual data your models are trained on ● Your model will only be as good as your data! Annotation guidelines are derived from product requirements: ● Usually not straightforward ● Should be finely detailed ● Use a new annotation scheme / re-annotation / cleaning to improve research metrics
  18. Step 3: Data Annotation (Quality) How would you annotate this person?
  19. Step 3: Data Annotation (Quality) How would you annotate this face? Consistency and clarity are important: ● Not to confuse your learning process ● Not to confuse your annotators ● Not to fail your research evaluation metric ● Other algorithms depend on this annotation
  20. Step 3: Data Annotation (Quality) How would you annotate these objects?
  21. Step 3: Data Annotation (Quality) Quality assurance: ● Several annotators per item → costly ● Known annotators (identified by name) are a good choice ● Tight definition of the task ● Automatic validation (see the sketch below) ● Simple tasks, or pre-process to simplify
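One concrete form of automatic validation is checking agreement between two annotators on the same frame via bounding-box IoU. A minimal sketch; boxes are (x1, y1, x2, y2) and the 0.7 threshold is an illustrative assumption, not a value from the talk:

```python
# Hypothetical QA sketch: flag frames where two annotators' boxes
# disagree (low IoU) and route them for review.

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def needs_review(box_annotator1, box_annotator2, threshold=0.7):
    return iou(box_annotator1, box_annotator2) < threshold

print(needs_review((10, 10, 50, 90), (12, 8, 52, 88)))    # False: agree
print(needs_review((10, 10, 50, 90), (40, 40, 80, 120)))  # True: review
```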
  22. Step 3: Data Annotation (Costs) Optimize costs & throughput: ● Bootstrap to initialize/prioritize annotation (see the sketch below) ● Use temporal information ● Use any available information ● Preprocess to simplify tasks ● Build your own annotation infrastructure or use a 3rd party?
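"Bootstrap to initialize/prioritize annotation" can be as simple as scoring unlabeled frames with an existing model and sending the least confident ones to annotators first. A minimal sketch, assuming a hypothetical `model(frame)` callable that returns a detection confidence:

```python
# Hypothetical sketch of bootstrap-prioritized annotation: annotate the
# frames the current model is least sure about, where human labels add
# the most value. `model` is an assumed confidence-scoring callable.

def prioritize_for_annotation(frames, model, budget=100):
    scored = [(model(f), f) for f in frames]
    scored.sort(key=lambda pair: pair[0])  # least confident first
    return [f for _, f in scored[:budget]]
```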
  23. Things I’ll Talk About 1. Requirements 2. Data Collection 3. Data Annotation 4. Research Evaluation Metric 5. Research
  24. Step 4: Research Evaluation Metric Thus far we have: 1. Product requirements 2. An initial data collection + annotation strategy. Before you start your research experiments, set a research evaluation metric (a single number*) *Andrew Ng
  25. Step 4: Research Evaluation Metric ● There are many ways to evaluate an experiment: ○ e.g. TPR, aDR, FPR, mAP, latency, etc. ○ Improving one metric can lower another ● It’s more efficient (in time & resources) to advance toward a clear target [Two plots of evaluation metric vs. time/resources: optimizing for a single evaluation metric reaches the requirements; optimizing for several evaluation metrics may never get there]
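One common way to collapse several metrics into a single number is one optimizing metric gated by satisficing constraints: report the optimizing metric only when the hard constraints hold. A minimal sketch using the doorbell thresholds from the example; the zero-on-failure convention is an assumption:

```python
# Hypothetical sketch of a single-number research metric: optimize
# recall, subject to hard constraints on false alarms and latency
# (thresholds from the smart-doorbell example).

def research_metric(recall, false_alarms_per_week, latency_s):
    meets_constraints = false_alarms_per_week <= 1.0 and latency_s <= 0.5
    return recall if meets_constraints else 0.0  # failed constraint -> 0

print(research_metric(0.97, 0.8, 0.4))  # 0.97: rank experiments on this
print(research_metric(0.99, 3.0, 0.4))  # 0.0: too many false alarms
```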
  26. Steps 1-4 Overview by Example Smart Doorbell example: Step 1 - Product Requirements. Step 2 - Data Collection: ● 720p RGB videos, 1.5 m camera height ● 20% “no objects” videos, 80% “with objects” videos. Step 3 - Data Annotation: ● Annotate only objects up to 1.5 m away ● Full-body + upper-body-only bounding boxes ● Annotate 5% full videos, 95% sampled videos. Step 4 - Evaluation Metric: ● 98% recall with 1 FPPW and <500 ms latency for the object detection task
  27. Things I’ll Talk About 1. Requirements 2. Data Collection 3. Data Annotation 4. Research Evaluation Metric 5. Research
  28. Step 5: Research [Diagram: the research loop, Research Experiments → Error Analysis → Data (collection/annotation) → back to Research Experiments]
  29. Step 5: Research Applied research is an empirical process. 1. Research Experiments Phase: ● Deep learning architectures ● Learning hyperparameters ● Data manipulations ● Other ...
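The experiments phase above is often run as a sweep over these knobs, scored by the single evaluation metric. A minimal sketch, assuming a hypothetical `train_and_score(config)` helper; the grid values are illustrative:

```python
# Hypothetical sketch of an experiment sweep over architectures,
# hyperparameters, and data manipulations. `train_and_score` is an
# assumed helper returning the single research evaluation metric.
import itertools

ARCHS = ["tiny_detector", "small_detector"]
LEARNING_RATES = [1e-2, 1e-3]
AUGMENTATIONS = [False, True]

def run_sweep(train_and_score):
    results = {}
    for arch, lr, aug in itertools.product(ARCHS, LEARNING_RATES, AUGMENTATIONS):
        config = {"arch": arch, "lr": lr, "augment": aug}
        results[tuple(config.items())] = train_and_score(config)
    best = max(results, key=results.get)  # best config under the metric
    return dict(best)
```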
  30. Step 5: Research Applied research is an empirical process. 2. Analysis Phase: ● Split data into train / validation / test ● Bias / variance (on validation and training data)* ● Rank the factors that impact the evaluation metric the most (on validation data):
      Feature             %Error   Priority
      Wrong Annotation    25%      1
      Close Objects       20%      2
      Truncated Objects   3%       3
      Umbrellas           1%       4
      ...                 ...      ...
      * https://kevinzakka.github.io/2016/09/26/applying-deep-learning/
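A ranking table like the one above can come from a simple tally over a manually tagged sample of validation failures. A minimal sketch; the error categories and counts are illustrative:

```python
# Hypothetical sketch of error analysis: tag a sample of validation
# failures with a cause, then rank causes by frequency to decide what
# to fix first. Tags and counts here are illustrative.
from collections import Counter

error_tags = ["wrong_annotation", "close_objects", "wrong_annotation",
              "truncated_objects", "close_objects", "wrong_annotation"]

for rank, (cause, count) in enumerate(Counter(error_tags).most_common(), start=1):
    share = 100.0 * count / len(error_tags)
    print(f"{rank}. {cause}: {share:.0f}% of errors")
```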
  31. Step 5: Research Applied research is an empirical process. 3. Data Phase: ● Clean data / re-annotate / change the annotation scheme ● Collect more data
  32. Step 5: Research Applied research is an empirical process. Next Iteration: ● What to explore / fix next in our Deep Learning models
  33. Step 5: Research The research phase is very resource-demanding: ● Researchers ● Compute ● Time. Optimize these to shorten time to product: ● Researchers → increase productivity ● Compute → reduce costs
  34. Step 5: Research Running an experiment involves: ● Planning the experiment (the actual work) ● Setting up a compute environment (overhead) ● Data selection, preprocessing, fetching (overhead) ● Monitoring and periodic evaluation (overhead) ● Managing a pipeline of algorithms (overhead) ● Saving intermediate results (overhead)
  35. Step 5: Research Running many experiments doesn’t scale: ● Managing compute resources and prioritization ● Monitoring many experiments ● Analysing the results of many experiments ● Experiment traceability over time: code, data, and experiment-configuration versioning. A dedicated infrastructure and management system is needed to: ● Manage shared resources ● Orchestrate the training of the different models ● Monitor the various experiments, training configurations, and models ● Build complicated algorithm pipelines and run them effortlessly. Build your own or use a 3rd party?
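Whether you build or buy, the core of experiment traceability is recording code version, data version, configuration, and results together. A minimal sketch; the JSON-lines layout and field names are assumptions, not Brodmann17's system:

```python
# Hypothetical sketch of minimal experiment traceability: append one
# JSON record per experiment with code/data versions, config, and
# results. Real systems (or 3rd-party trackers) add far more.
import json
import subprocess
import time

def log_experiment(config, metrics, path="experiments.jsonl"):
    record = {
        "timestamp": time.time(),
        "git_commit": subprocess.check_output(
            ["git", "rev-parse", "HEAD"]).decode().strip(),
        "config": config,    # architecture, hyperparameters, data version
        "metrics": metrics,  # e.g. the single research evaluation metric
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")

log_experiment({"arch": "tiny_detector", "lr": 1e-3, "data": "v3"},
               {"recall": 0.97, "latency_s": 0.42})
```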
  36. Topics Covered 1. Requirements 2. Data Collection 3. Data Annotation 4. Research Evaluation Metric 5. Research
  37. We are always looking for new talent. Passionate about AI and want to explore more? We invite you to join us on our journey! For job opportunities: https://www.linkedin.com/company/brodmann17/
  38. THANK YOU Amir Alush, PhD - Co-founder & CTO amir@brodmann17.com