India Analytics and Big Data Summit 2015
Location : Mumbai
Date : 3 Feb 2015
Name of the Speaker : Kanwal Prakash Singh, Data Scientist
Company Name : Housing
www.unicomlearning.com
www.bigdatainnovation.org
● Information and data
● Data - Raw Facts or Figures

● Information - Processed facts that make sense

● Information is derived from data 

● examples
● Why do we need data ?
● Some scenarios where scarcity of data led to dangerous
consequences 

○ Earth is Flat
○ Columbus and America vs India
○ "Prosperity will last forever", and then the stock markets crashed
● Take a guess: how much data is collected per day (scale)?

○ Housing
○ LinkedIn
○ Facebook
○ Zomato
● It’s not the data, it’s the questions you ask of the data
● Are you asking the right questions of the data ?
○ Do you have an adequate amount of data to test your hypothesis ?
○ If so, are you sure you are not forming strong beliefs by overlooking some bias in the data ?
○ Correlation != causation
● Analytics @ Housing

● What we capture, and what we do with it:

● Optimise operations / in-house data collection / recommendations and understanding users / user bucketing

● Forecasting, Price Estimates

● Heatmaps - Demand / Supply, Price, CFI
● How can data science be used for optimising operations ?
○ Flat Duplication
○ Listing Decay
○ Forecasting - Supply / Demand / Load
○ Route Optimisation 

● Problem formulation followed by solution through
Statistical methods 

● Follow the curiosity and desire for perfection
● The utilization metric (useful DC hours / total available DC hours, where DC = Data Collector) was not up to the mark.
● Why ?

● It could have been a load issue (not enough listing requests, hence DCs sat idle), but that was not the case
● In fact it appeared we were overloaded. 

● Again, how ?

● Data Collectors were travelling a lot (between two jobs)
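The utilization metric defined earlier (useful DC hours / total available DC hours) can be sketched as below. This is only an illustration: the function names and the shift figures are hypothetical and are not taken from the talk. It shows how a team can look overloaded while utilization stays low, because most of the non-useful time is travel rather than idle time.

```python
def utilization(useful_hours, available_hours):
    """Useful DC hours divided by total available DC hours."""
    if available_hours <= 0:
        raise ValueError("available_hours must be positive")
    return useful_hours / available_hours


def time_breakdown(useful_hours, travel_hours, available_hours):
    """Split a shift into useful, travel, and idle shares (all hypothetical inputs)."""
    idle_hours = available_hours - useful_hours - travel_hours
    if idle_hours < 0:
        raise ValueError("useful + travel hours exceed available hours")
    return {
        "utilization": utilization(useful_hours, available_hours),
        "travel_share": travel_hours / available_hours,
        "idle_share": idle_hours / available_hours,
    }


# Hypothetical 8-hour shift: 3 h at flats, 4 h on the road, 1 h idle.
shift = time_breakdown(useful_hours=3, travel_hours=4, available_hours=8)
print(shift)
```

With these made-up numbers the idle share is small, so load is not the issue; travel is what eats the shift.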
● Hence came the idea of Branching

● The aim was two-fold:
○ reduce the travel time per flat
○ develop capability to serve a request within 45 minutes

● Done ? Awesome, problem identification done :)
● Not Really Done !

● There was vast scope for improvement in the scheduling algorithm

● So, all in all, two problems:
○ Find new offices (delocalisation)
○ Optimise the scheduling algorithm
● New Office Identification (Constrained Cost Optimisation)

● Expanding through setup of new Branches
● Estimation of branch locations 

● Costs and capacities of new branches
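The exact formulation used in the talk is not preserved in this transcript, so the following is only a minimal sketch of one way to pose branch selection as a constrained cost optimisation. Everything in it is an assumption made for illustration: the function name `plan_branches`, the flat per-branch setup cost, the straight-line travel cost, and the cap on the number of branches. Branch capacities are ignored here for brevity.

```python
from itertools import combinations
from math import hypot


def plan_branches(requests, candidates, setup_cost, cost_per_km, max_branches):
    """Pick the subset of candidate sites with the lowest total cost.

    Total cost = setup cost of each opened branch
               + travel cost from every request to its nearest opened branch.
    Brute force over subsets, so only suitable for a handful of candidates.
    """
    best_total, best_sites = None, None
    for k in range(1, max_branches + 1):
        for chosen in combinations(range(len(candidates)), k):
            travel_km = sum(
                min(hypot(rx - candidates[c][0], ry - candidates[c][1])
                    for c in chosen)
                for rx, ry in requests
            )
            total = k * setup_cost + cost_per_km * travel_km
            if best_total is None or total < best_total:
                best_total, best_sites = total, chosen
    return best_total, best_sites


# Hypothetical data: two clusters of listing requests, three candidate sites.
requests = [(0, 0), (0, 1), (10, 0), (10, 1)]
candidates = [(0, 0.5), (10, 0.5), (5, 0.5)]
total, sites = plan_branches(requests, candidates,
                             setup_cost=2, cost_per_km=1, max_branches=2)
print(total, sites)
```

In this toy setup, opening one branch inside each cluster beats a single central office, because the saved travel outweighs the second setup cost.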
Penalty Additions
● Bingo ! Nailed it 

● Scheduling Algorithm for collection and distribution systems

● Optimal allocation of timed tasks (Listing Requests) to the work force (Data Collectors)

● Minimum cost maximum matching in a graph 

● Hungarian Algorithm 

● Optimal allocation of jobs to people - each person has some cost to perform a job

● Minimum Cost Maximum Matching in a Bipartite Graph

○ Matching - a set of edges in which no vertex appears more than once
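A compact sketch of the Hungarian algorithm for the assignment problem described above, written in plain Python. It is the textbook O(n²m) potentials formulation, not necessarily the variant used at Housing. The cost matrix is hypothetical: rows stand for Data Collectors, columns for listing requests, and entries for travel minutes.

```python
def hungarian(cost):
    """Minimum-cost assignment of every row to a distinct column.

    cost[i][j] is the cost of giving job j to person i; requires
    len(cost) <= len(cost[0]). Returns (assignment, total_cost), where
    assignment[i] is the column chosen for row i.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0] * (n + 1)      # row potentials (1-indexed)
    v = [0] * (m + 1)      # column potentials (1-indexed)
    p = [0] * (m + 1)      # p[j] = row currently matched to column j
    way = [0] * (m + 1)    # previous column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            # Shift potentials so at least one new zero-reduced-cost edge appears.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:      # reached a free column: path is complete
                break
        # Flip matched/unmatched edges along the augmenting path.
        while j0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [None] * n
    for j in range(1, m + 1):
        if p[j]:
            assignment[p[j] - 1] = j - 1
    return assignment, sum(cost[i][assignment[i]] for i in range(n))


# Hypothetical travel minutes: 3 Data Collectors x 3 listing requests.
travel_minutes = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
assignment, total = hungarian(travel_minutes)
print(assignment, total)
```

For larger instances, a library routine such as SciPy's `scipy.optimize.linear_sum_assignment` solves the same problem and is usually the more practical choice.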
● Nailed it now :D

● Operational cost reduced by 30%

● The best part - solution is transferable

○ All Delivery and collection systems 

○ Any general Density Based Branching model 

● Takeaways

○ Data is brahmastra 

○ A noob can't master the brahmastra, so rise to the level of the elite warriors (the Mahabharata had several)

○ How ?
■ Mindset - Curious / Hardworking/ Focused
■ Read/ Learn - Blogs / Books / Courses / Peers
■ Apply - Personal Projects / Kaggle
■ Teach
● Acknowledgements
○ Mr. Shanu Vivek, Operations BI, Housing
○ Mr. Vaibhav Krishan, Sr. Quant Analyst
○ Mr. Jaspreet Saluja, Co-Founder, Housing
○ Mr. Rishabh Gupta, Operations, Housing
○ Mr. Arpit Agarwal, Operations, Housing
○ Mr. Abhishek Anand, CTO, Housing
○ Mr. Nitin Sangwan, DSL, Housing 

Speaker Name: Kanwal Prakash Singh
Email ID: kanwalprakashsingh@gmail.com
India Analytics and Big Data Summit 2015
Organized by
UNICOM Trainings & Seminars Pvt. Ltd.

contact@unicomlearning.com
THANK YOU
