Athens BD Jun2018 | p1
Embeddings of Categorical Variables
Athens BD Jun2018 | p2
Definition
An embedding (or vector space (VS) representation, or VS method) of a categorical variable x is any mapping of its categories to R^k.
k is called the 'embedding dimension'.
We usually encode categories as positive integers, so embeddings are mappings
Z → R^k
To learn the embedding of a categorical variable in an ML task means to find a map
categories → R^k
where
k << number of categories
Consider VS embeddings as an evolution of the one-hot (OH) encoding we traditionally use to represent categories.
But why have we been using OH encoding anyway?
Why not just use successive integers to represent categories?
Athens BD Jun2018 | p3
Motivation
With the exception of classification and regression trees (CART), learning algorithms operate on subsets of R^n, where n is the input dimension.
A naive encoding of categories as (say positive and consecutive) integers suffers from several issues:
1. The model performance depends on the choice of the encoding
Suppose we're given {blue, orange, green} → {1, 2, 3}
so that x1 = 1, x2 = 2, x3 = 3
and y1 = 2, y2 = 6, y3 = -2
A linear model (LM) doesn't fit these three points.
However, if we change the encoding to
{blue, orange, green} → {2, 3, 1}, the fit will be perfect (y = 4x - 6).
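The dependence on the encoding can be checked numerically. The sketch below is a plain least-squares line fit written for this illustration (the helper names `fit_line` and `fit_encoding` are not from the talk); it fits the three points under both encodings and compares the residuals.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b.

    Returns (a, b, sse) where sse is the sum of squared residuals.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    sse = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


# Targets per category, taken from the slide
targets = {"blue": 2.0, "orange": 6.0, "green": -2.0}


def fit_encoding(encoding):
    """Fit the line after replacing each category by its integer code."""
    xs = [encoding[c] for c in targets]
    ys = [targets[c] for c in targets]
    return fit_line(xs, ys)


a1, b1, sse1 = fit_encoding({"blue": 1, "orange": 2, "green": 3})
a2, b2, sse2 = fit_encoding({"blue": 2, "orange": 3, "green": 1})
print("encoding {1,2,3}: slope", a1, "intercept", b1, "SSE", sse1)
print("encoding {2,3,1}: slope", a2, "intercept", b2, "SSE", sse2)
```

The same data gives a poor fit under one labeling and an exact fit under the other, although the labels carry no meaning.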
Athens BD Jun2018 | p4
Motivation
2. The use of integers to represent the values of categorical inputs distorts the learning process by treating the gradient over different categories unequally:
Assume that the model function contains a multiplicative dependency w_x∙x, i.e.
f(x, ...) = f(w_x∙x, ...) for a categorical x, and we're provided with a training example where x = j.
For any objective J, the partial derivative at x = j is
∂J/∂w_x|x=j = j ∙ ∂J/∂z|x=j, where z = w_x∙x
For the same upstream gradient, the jth category contributes to model training j times as much as the 1st category!
3. What if one category contributes positively to the output and another category negatively?
Using a single parameter to model the categorical variable will most probably send the parameter to zero by the end of the training process!
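A small numeric sketch of issue 2, assuming the simplest possible model z = w∙x with a squared-error objective J = 0.5∙(z - y)^2 (this choice of J and all names below are illustrative assumptions). The targets are picked so that every category has the same upstream gradient ∂J/∂z; the weight gradient then scales with the integer code alone.

```python
def upstream_gradient(w, x, y):
    """dJ/dz for J = 0.5 * (z - y)**2 with z = w * x."""
    return w * x - y


def weight_gradient(w, x, y):
    """dJ/dw = x * dJ/dz, by the chain rule."""
    return x * upstream_gradient(w, x, y)


w = 0.5
codes = [1, 2, 3, 4, 5]  # integer codes of five categories

# Choose targets so that dJ/dz equals 1.0 for every category
ys = [w * j - 1.0 for j in codes]

# Integer encoding: the input value is the code j itself
integer_grads = [weight_gradient(w, j, y) for j, y in zip(codes, ys)]

# One-hot encoding: each category has its own weight and always sees input 1.0
one_hot_grads = [1.0 * upstream_gradient(w, j, y) for j, y in zip(codes, ys)]

print("integer encoding gradients:", integer_grads)
print("one-hot encoding gradients:", one_hot_grads)
```

With integer codes the gradient of category j is j times that of category 1, while with one-hot inputs all categories are treated equally.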
Athens BD Jun2018 | p5
Why do CARTs not require encoding?
CARTs partition the input observable space using a sequence of coordinate splits that greedily minimize an objective.
By "greedily" we mean that the objective is minimized at each split. A greedy optimum is not necessarily the optimum over all the possible partitions of the input space, though.
More formally, given a training set T = {X = [x1,...,xn], Y = (y1,...,yn)} with xj ∈ R^k, j = 1,...,n
A coordinate split at level 0 divides T in 2 subsets T1 = {X1, Y1} and T2 = {X2, Y2} such that the sum of the values of the objective applied to each subset is minimized.
Level-0 loop:
  for each coordinate
    for each coordinate value
      evaluate the objective
      check minimum
  return the coord and coord-value of minimum
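The level-0 loop can be sketched in a few lines. This is a minimal illustration for numeric coordinates, assuming a threshold split (`<=` versus `>`) and the sum of squared errors around each subset's mean as the objective; the function names are made up for this example and it is not the code of any particular CART library.

```python
def sse(ys):
    """Sum of squared deviations of ys from their mean (0.0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def best_split(X, Y):
    """Exhaustive level-0 search.

    X is a list of rows (lists of numbers), Y the list of targets.
    Returns (coordinate index, split value, objective at the minimum).
    """
    best = None
    for coord in range(len(X[0])):                       # for each coordinate
        for value in sorted({row[coord] for row in X}):  # for each coordinate value
            left = [y for row, y in zip(X, Y) if row[coord] <= value]
            right = [y for row, y in zip(X, Y) if row[coord] > value]
            if not left or not right:
                continue                                 # not a real split
            objective = sse(left) + sse(right)           # evaluate the objective
            if best is None or objective < best[2]:      # check minimum
                best = (coord, value, objective)
    return best


X = [[1, 5], [2, 1], [3, 4], [4, 2]]
Y = [10, 10, 20, 20]
print(best_split(X, Y))
```

A full tree repeats the same search recursively inside each of the two subsets.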
Athens BD Jun2018 | p6
Why do CARTs not require encoding?
In regression tasks the objective is the MSE of the y's in Yj, j = 1,2.
In a binary classification task, T1 is associated with class C1 and T2 with class C2, and the objective is the number of wrong guesses of Cj in Tj (equivalently, the split maximizes the number of correct guesses).
The crucial thing is that for the splitting process to work:
1. the types of X and Y are not required to be numerical,
2. no ordering of the values of X and Y is implicitly assumed.
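To illustrate the two points, here is a hedged sketch of a split search on raw category strings and string class labels. It assumes a simplified one-vs-rest split (one category against all the others) and counts misclassifications under a majority vote; real CART implementations may consider richer groupings of categories. Only equality tests are used, so no numeric code and no ordering is needed.

```python
from collections import Counter


def misclassified(labels):
    """Errors made when the majority label is predicted for the whole subset."""
    if not labels:
        return 0
    return len(labels) - Counter(labels).most_common(1)[0][1]


def best_category_split(xs, ys):
    """Try every 'x == category' versus 'x != category' split.

    Returns (category, number of misclassified examples) for the best split.
    """
    best = None
    for category in dict.fromkeys(xs):  # distinct categories, first-seen order
        inside = [y for x, y in zip(xs, ys) if x == category]
        outside = [y for x, y in zip(xs, ys) if x != category]
        if not inside or not outside:
            continue
        errors = misclassified(inside) + misclassified(outside)
        if best is None or errors < best[1]:
            best = (category, errors)
    return best


colors = ["blue", "orange", "green", "blue", "green", "orange"]
labels = ["buy", "skip", "buy", "buy", "buy", "skip"]
print(best_category_split(colors, labels))
```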
Athens BD Jun2018 | p7
Learning Embeddings in TensorFlow
We're using an example from the retail industry.
The data is sales counts of prepared meat and burger products for a group of stores of a large food retailer in the US. Line items are sales counts per store, calendar day and stock keeping unit (SKU).
The objective is to estimate sales given a SKU, location and day.
We'll employ a feed-forward neural network (FFNN) with just a single hidden layer and an objective other than the MSE, because the MSE is not suited for count data.
A random variable Y ∈ Z+ is said to have the Poisson distribution with parameter μ > 0 if it takes non-negative integer values y = 0,1,2,... with probability
P(Y = y) = e^(-μ)⋅μ^y / y!
Athens BD Jun2018 | p8
Learning Embeddings in TensorFlow
The reason for using the above as a model for the distribution of SKU sales is its relation to the binomial distribution (Bernoulli trials):
If Xj, j = 1,2,... are independent Bernoulli variables, i.e.
Xj ~ 𝓑(πj), with every πj small and
Σj πj → μ < ∞, then approximately
Σj Xj ~ Poisson(μ)
Fix a product, say S, that sold n items yesterday at Whole Foods Midtown ATL.
Each Xj roughly represents a customer that buys S with probability πj, and n = Σj Xj.
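The approximation can be checked without simulation. The sketch below computes the exact distribution of a sum of independent Bernoulli variables by dynamic programming and compares it with the Poisson probabilities. The purchase probabilities (1000 hypothetical customers buying with probability 0.002 or 0.004) are made-up numbers for illustration, not data from the talk.

```python
import math


def poisson_pmf(y, mu):
    """P(Y = y) = exp(-mu) * mu**y / y!"""
    return math.exp(-mu) * mu ** y / math.factorial(y)


def bernoulli_sum_pmf(probs, max_count):
    """Exact P(sum of independent Bernoulli(p) == k) for k = 0..max_count."""
    pmf = [1.0] + [0.0] * max_count
    for p in probs:
        # Go downwards so pmf[k - 1] still holds the value before this customer
        for k in range(max_count, 0, -1):
            pmf[k] = pmf[k] * (1.0 - p) + pmf[k - 1] * p
        pmf[0] *= 1.0 - p
    return pmf


# Hypothetical customers: half buy with probability 0.002, half with 0.004
probs = [0.002 if j % 2 == 0 else 0.004 for j in range(1000)]
mu = sum(probs)

exact = bernoulli_sum_pmf(probs, max_count=20)
gap = max(abs(exact[k] - poisson_pmf(k, mu)) for k in range(21))
print("mu =", mu, "largest pointwise gap =", gap)
```

The gap shrinks as the individual probabilities get smaller while their sum stays fixed.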
From this point the process of deriving a loss is pretty much standard:
we set y = w^T x, where w is the weight vector and x the input, and minimize the negative log-likelihood (equivalently, maximize the likelihood).
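As a sketch of the resulting loss, the function below computes the mean Poisson negative log-likelihood for a linear predictor. It assumes the usual log link, μ = exp(w^T x), so that the Poisson parameter stays positive; the slide does not spell out the link, so treat this as one common choice rather than the talk's exact formulation.

```python
import math


def poisson_nll(w, xs, ys):
    """Mean negative log-likelihood of counts ys under Poisson(mu_i).

    Assumes the log link mu_i = exp(w . x_i). For one example:
    -log P(Y = y) = mu - y * log(mu) + log(y!)
    """
    total = 0.0
    for x, y in zip(xs, ys):
        eta = sum(wi * xi for wi, xi in zip(w, x))  # linear predictor w^T x
        mu = math.exp(eta)
        total += mu - y * eta + math.lgamma(y + 1)  # lgamma(y + 1) = log(y!)
    return total / len(ys)


# One input feature fixed to 1.0 (an intercept) and a single observed count of 2
xs = [[1.0]]
ys = [2]
loss_at_mu_1 = poisson_nll([0.0], xs, ys)          # mu = 1
loss_at_mu_2 = poisson_nll([math.log(2)], xs, ys)  # mu = 2, the observed count
print(loss_at_mu_1, loss_at_mu_2)
```

The loss is lower when the predicted mean matches the observed count, which is what gradient descent on this objective exploits.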
Athens BD Jun2018 | p9
Input Encodings
OH encoding (ohh…)
SKU IDs, calendar days and store locations are OH encoded.
This creates an input space of several hundred or thousand binary variables, depending on the size of the assortment and the number of stores.
This is an issue for memory as soon as the number of training examples is more than a few thousand (certain precautions can be taken though!)
Vector space encoding
Instead of store IDs we use geospatial coordinates (lat | long). Calendar days are mapped to R^2 using a VS representation that brings closely together days around a year's end:
day number j → (cos(2πj/365), sin(2πj/365))
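The calendar-day mapping is easy to try out. In this sketch the function names are illustrative; it encodes a day number as a point on the unit circle and compares distances, showing that 31 December and 1 January end up next to each other while days half a year apart are far away.

```python
import math


def encode_day(day_number, period=365):
    """Map a day number j to (cos(2*pi*j/period), sin(2*pi*j/period))."""
    angle = 2 * math.pi * day_number / period
    return (math.cos(angle), math.sin(angle))


def distance(p, q):
    """Euclidean distance between two points of R^2."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


year_end_gap = distance(encode_day(365), encode_day(1))   # 31 Dec vs 1 Jan
half_year_gap = distance(encode_day(1), encode_day(183))  # about six months apart
print("31 Dec to 1 Jan:", year_end_gap)
print("1 Jan to early Jul:", half_year_gap)
```

With plain integer day numbers the first pair would be 364 apart, which is the discontinuity this encoding removes.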
Athens BD Jun2018 | p10
How does it work?
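[Figure: network diagram. The category index j selects a K-dim vector embedding_j, which enters the first layer (W(1), b(1), a(1), h(1)) together with the other inputs; the last layer gives a(n), h(n) = yhat.]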
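A hedged sketch of the forward pass in the diagram, in plain Python rather than TensorFlow. It assumes one hidden layer with a ReLU activation and an exponential output (both are assumptions for this illustration), and shows that looking up row j of the embedding table gives the same vector as multiplying a one-hot vector by the table, which is why the embedding can be trained like any other weight matrix.

```python
import math


def one_hot(j, n):
    """One-hot vector of length n with a 1.0 at position j."""
    return [1.0 if i == j else 0.0 for i in range(n)]


def lookup_via_one_hot(E, j):
    """Compute one_hot(j)^T E explicitly; equals row j of E."""
    v = one_hot(j, len(E))
    return [sum(v[i] * E[i][k] for i in range(len(E))) for k in range(len(E[0]))]


def forward(j, other_inputs, E, W1, b1, w2, b2):
    """embedding_j ++ other inputs -> a(1), h(1) -> a(2), yhat."""
    x = E[j] + other_inputs                      # concatenate K-dim embedding and other inputs
    a1 = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    h1 = [max(0.0, a) for a in a1]               # ReLU (assumed activation)
    a2 = sum(w * h for w, h in zip(w2, h1)) + b2
    return math.exp(a2)                          # positive output for count data (assumed)


# Toy parameters: 3 categories, K = 2, one extra numeric input
E = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]]
W1 = [[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]]
b1 = [0.0, 0.0]
w2 = [1.0, 1.0]
b2 = 0.0

yhat = forward(1, [1.0], E, W1, b1, w2, b2)
print("embedding of category 1:", E[1], "yhat:", yhat)
```

During training the gradient of the loss flows back into row j of E only, so each category updates its own K numbers.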
Athens BD Jun2018 | p11
The TensorFlow code (go to Jupyter)
Athens BD Jun2018 | p12
The gain of SKU embedding
Suppose the objective is to estimate a kind of 'market basket' when cashier transaction data is not available, i.e. groups of SKUs with approximately the same sales across days and stores.
This is a core problem in assortment planning:
estimate the number | percentage of product items I'll need to stock for the next week | month | season.
Probably more involved is the use of assortments in demand forecasting: estimate a product's sales for the next period from its sales history.
How is the above related to the learnt VS embeddings of SKUs?
The core insight is that neighboring values in the embedding space have similar sales across stores and days.
Well, not exactly: currently the best theoretical result we have is this:
m∙‖e1 - e2‖ ≤ E_x‖yhat(e1,x) - yhat(e2,x)‖ ≤ M∙‖e1 - e2‖ with m ≤ M
Practice shows, though, that the insight holds.
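Grouping SKUs by proximity in the embedding space can be sketched as a nearest-neighbor query. The SKU names and the 2-dimensional vectors below are invented for illustration; learnt embeddings would come from the trained network.

```python
import math


def nearest_neighbors(embeddings, key, k=2):
    """Return the k names whose vectors are closest (Euclidean) to embeddings[key]."""
    target = embeddings[key]
    others = [
        (math.dist(target, vec), name)
        for name, vec in embeddings.items()
        if name != key
    ]
    return [name for _, name in sorted(others)[:k]]


# Hypothetical learnt embeddings of four SKUs
sku_embeddings = {
    "burger_4pk": [0.9, 0.1],
    "burger_8pk": [1.0, 0.2],
    "meatballs": [0.2, 0.8],
    "sausage": [0.1, 1.0],
}

print(nearest_neighbors(sku_embeddings, "burger_4pk", k=1))
print(nearest_neighbors(sku_embeddings, "sausage", k=2))
```

By the inequality above, SKUs returned as close neighbors are the ones whose predicted sales differ least on average.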
Athens BD Jun2018 | p13
Embedding projectors
An embedding projector tries to create a 2D or 3D scatterplot from a multidimensional set of points.
The purpose is to retain as much variance in the original set as possible.
PCA is the most widely used method; however, being linear, it can fail in high-dimensional spaces or complex geometries.
The proposed method there is t-SNE.
It learns the positions of 2D | 3D points by minimizing the KL divergence between the probability distributions it defines for the original space and for its t-SNE image (what a hack!).
The reference examples of MNIST and Word2Vec are in the TensorBoard projector page.
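The quantity t-SNE minimizes can be evaluated directly. This sketch computes the low-dimensional similarities q_ij with the Student-t kernel used by t-SNE and the KL divergence KL(P‖Q) for two candidate 2D layouts. The high-dimensional similarities p_ij are specified by hand here as an assumption (t-SNE derives them from Gaussian kernels in the original space), and no optimization is performed.

```python
import math


def student_t_similarities(points):
    """q_ij proportional to 1 / (1 + ||y_i - y_j||^2), normalized over all pairs i != j."""
    n = len(points)
    weights = {
        (i, j): 1.0 / (1.0 + math.dist(points[i], points[j]) ** 2)
        for i in range(n)
        for j in range(n)
        if i != j
    }
    total = sum(weights.values())
    return {pair: w / total for pair, w in weights.items()}


def kl_divergence(p, q):
    """KL(P || Q) = sum over pairs of p_ij * log(p_ij / q_ij)."""
    return sum(p_ij * math.log(p_ij / q[pair]) for pair, p_ij in p.items() if p_ij > 0)


# Hand-made original-space similarities: points 0 and 1 are similar, point 2 is apart
p = {(0, 1): 0.4, (1, 0): 0.4, (0, 2): 0.05, (2, 0): 0.05, (1, 2): 0.05, (2, 1): 0.05}

layout_good = [(0.0, 0.0), (0.1, 0.0), (5.0, 0.0)]  # keeps 0 and 1 together
layout_bad = [(0.0, 0.0), (5.0, 0.0), (0.1, 0.0)]   # puts 0 next to 2 instead

kl_good = kl_divergence(p, student_t_similarities(layout_good))
kl_bad = kl_divergence(p, student_t_similarities(layout_bad))
print("KL for the faithful layout:", kl_good)
print("KL for the unfaithful layout:", kl_bad)
```

t-SNE moves the 2D | 3D points by gradient descent to push this divergence down, so the faithful layout is the one it would prefer.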
Athens BD Jun2018 | p14
An example from telecoms
Telecom operators exploit the call graph of their subscribers using elementary or more advanced methods.
Given a log of calls between subscribers (voice and texts) over a time period of N days, they define the strength of a relation between subscribers by the number and duration of calls they make to one another.
Variations take into account the time of day, the day of week, the uniformity of call frequency etc.
The network | community of a subscriber X is the set of subscribers with the strongest relation with X.
An approach in line with our discussion is to use the call graph to map the subscribers into an embedding space. A subscriber's community is then the set of nearest neighbors in the embedding space (obviously).
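The elementary method can be sketched as follows. The strength formula (number of calls plus a weight times total minutes), the subscriber names and the log entries are all assumptions made for this illustration; operators use their own definitions. The embedding-based approach would replace the ranking by strength with a nearest-neighbor search over learnt vectors.

```python
from collections import defaultdict


def relation_strengths(call_log, duration_weight=1.0):
    """Aggregate an undirected strength per pair of subscribers.

    call_log holds (caller, callee, minutes) records. The strength used here,
    calls + duration_weight * minutes, is an assumed formula.
    """
    strengths = defaultdict(float)
    for caller, callee, minutes in call_log:
        if caller == callee:
            continue  # ignore self-calls
        strengths[frozenset((caller, callee))] += 1.0 + duration_weight * minutes
    return strengths


def community(strengths, subscriber, size=2):
    """The `size` subscribers with the strongest relation to `subscriber`."""
    ties = []
    for pair, strength in strengths.items():
        if subscriber in pair:
            (other,) = pair - {subscriber}
            ties.append((strength, other))
    ties.sort(key=lambda t: (-t[0], t[1]))  # strongest first, names break ties
    return [other for _, other in ties[:size]]


# Hypothetical call log: (caller, callee, minutes)
call_log = [
    ("ann", "bob", 10.0),
    ("bob", "ann", 5.0),
    ("ann", "cat", 1.0),
    ("dan", "ann", 0.5),
    ("bob", "cat", 3.0),
]

strengths = relation_strengths(call_log)
print(community(strengths, "ann", size=2))
```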
Athens BD Jun2018 | p15
An example from telecoms
There are several benefits of this approach:
▪ Embeddings have memory. As soon as a new call record becomes available, a few iterations of the neural network will accommodate the new information in the existing embedding vectors. This permits real-time community updates.
▪ Embeddings facilitate the visualization of various customer-level measures on their projected manifolds: we can view, for example, the distribution of rate plans or rate plan categories, or the distribution of customer tenure, over the embedding vectors.
▪ The most useful property, though, is the way embeddings can be used to predict the community of a new customer for whom there's no call log yet (but a few things are known initially, e.g. the rate plan, service subscriptions and demographics).
Athens BD Jun2018 | p16
How far can we go?
Word2Vec was the first REALLY impressive use of a certain novel kind of word embedding.
It learns word vectors from a text corpus by training a model to predict a word from its surrounding words (or the surrounding words from the word).
Language models built on such embeddings go further: given a part of a sentence they predict the rest of it.
A related application is machine translation: throw in a sentence in Greek and it will translate it to Swahili.
Try this out in Google Translate.
More?
Sunspring, made 2 years ago, is the first movie script completely written by a machine.
Athens BD Jun2018 | p17
Thanks guys
For more pizzas you can track me here:
http://www.mltrain.cc
http://www.linkedin.con/in.cmalliopoulos

14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
 
11th Athens Big Data Meetup - 2nd Talk - Beyond Bitcoin; Blockchain Technolog...
11th Athens Big Data Meetup - 2nd Talk - Beyond Bitcoin; Blockchain Technolog...11th Athens Big Data Meetup - 2nd Talk - Beyond Bitcoin; Blockchain Technolog...
11th Athens Big Data Meetup - 2nd Talk - Beyond Bitcoin; Blockchain Technolog...
 
9th Athens Big Data Meetup - 2nd Talk - Lead Scoring And Grading
9th Athens Big Data Meetup - 2nd Talk - Lead Scoring And Grading9th Athens Big Data Meetup - 2nd Talk - Lead Scoring And Grading
9th Athens Big Data Meetup - 2nd Talk - Lead Scoring And Grading
 

Dernier

Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 

Dernier (20)

Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 

13th Athens Big Data Meetup - 2nd Talk - Training Neural Networks With Enterprise Relational Data

  • 1. Athens BD Jun2018 | p1 Embeddings of Categorical Variables
  • 2. Definition: We usually encode categories as positive integers, so embeddings are mappings Z → R^k, where k is called the 'embedding dimension'. An embedding (also called a vector-space (VS) representation or VS method) of a categorical variable x is any mapping of its categories to R^k. To learn the embedding of a categorical variable in an ML task means to find a map categories → R^k where k << number of categories. Consider VS embeddings as an evolution of the one-hot (OH) encoding we traditionally use to represent categories. But why have we been using OH encoding anyway? Why not just use successive integers to represent categories?
  • 3. Motivation: With the exception of classification and regression trees (CART), learning algorithms operate on subsets of R^n, where n is the input dimension. A naive encoding of categories as (say, positive and consecutive) integers suffers from several issues. 1. The model performance depends on the choice of the encoding. Suppose we're given {blue, orange, green} → {1, 2, 3}, so that x1 = 1, x2 = 2, x3 = 3, and y1 = 2, y2 = 6, y3 = -2. A linear model (LM) doesn't fit these points. However, if we change the encoding to {blue, orange, green} → {2, 3, 1}, the fit will be perfect (y = 4x - 6).
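The blue/orange/green example can be checked numerically. The following is a minimal sketch, assuming an ordinary least-squares line fitted with NumPy; the helper name `linear_fit_sse` is hypothetical and not from the talk.

```python
import numpy as np

# Targets for blue, orange, green, as given on the slide.
y = np.array([2.0, 6.0, -2.0])

def linear_fit_sse(x, y):
    """Fit y ~ a*x + b by least squares; return (a, b, sum of squared errors)."""
    a, b = np.polyfit(x, y, deg=1)
    residuals = y - (a * x + b)
    return float(a), float(b), float(np.sum(residuals ** 2))

# Encoding {blue, orange, green} -> {1, 2, 3}: no line passes through all points.
a_naive, b_naive, sse_naive = linear_fit_sse(np.array([1.0, 2.0, 3.0]), y)

# Encoding {blue, orange, green} -> {2, 3, 1}: the points are collinear.
a_perm, b_perm, sse_perm = linear_fit_sse(np.array([2.0, 3.0, 1.0]), y)

print(f"naive encoding:    y = {a_naive:.2f}x + {b_naive:.2f}, SSE = {sse_naive:.2f}")
print(f"permuted encoding: y = {a_perm:.2f}x + {b_perm:.2f}, SSE = {sse_perm:.2e}")
```

The same data and the same model give very different fit quality, and the only thing that changed is an arbitrary relabelling of the categories.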
  • 4. Motivation: 2. The use of integers to represent the values of categorical inputs disrupts the learning process by treating the gradient over different categories unequally. Assume that the model function contains a multiplicative dependency w_x∙x, i.e. f(x, ...) = f(w_x∙x, ...) for a categorical x, and we're provided with a training example where x = j. For any objective J, the partial derivative at x = j is ∂J/∂w_x|x=j ~ j ∙ ∂J/∂x|x=j. The jth category contributes to model training j times as much as the 1st category! 3. What if one category contributes positively to the output and another category negatively? Using a single parameter to model the categorical variable will most probably send the parameter to zero by the end of the training process!
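The gradient-scaling argument can be illustrated with a toy model. This is a sketch under stated assumptions: a squared-error objective J = 0.5 (w∙x - t)^2 (the slide allows any objective), and targets chosen so that the prediction error is the same for every category. The names `loss`, `grad_w` and `targets` are hypothetical.

```python
import numpy as np

def loss(w, x, target):
    """Squared-error objective for the one-parameter model z = w * x."""
    return 0.5 * (w * x - target) ** 2

def grad_w(w, x, target):
    """Analytic dJ/dw = (w*x - target) * x, by the chain rule."""
    return (w * x - target) * x

w = 0.5
categories = [1, 2, 3]                      # integer codes used as inputs
# Pick targets so the error term (w*x - target) equals 1 for every category.
targets = [w * j - 1.0 for j in categories]

grads = np.array([grad_w(w, j, t) for j, t in zip(categories, targets)])
ratios = grads / grads[0]
print(ratios)  # the gradient for code j is j times the gradient for code 1

# Finite-difference check of the analytic gradient for category 3.
eps = 1e-6
numeric = (loss(w + eps, 3, targets[2]) - loss(w - eps, 3, targets[2])) / (2 * eps)
```

With identical prediction errors, the update for the category coded 3 is three times the update for the category coded 1, purely because of the label values.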
  • 5. Why do CARTs not require encoding? CARTs partition the observable input space using a sequence of coordinate splits that greedily minimize an objective. By "greedily" we mean that the objective is minimized at each split. A greedy optimum is not the optimum over all the possible partitions of the input space, though. More formally, given a training set T = {X = [x1, ..., xn], Y = (y1, ..., yn)} with x_j ∈ R^k, j = 1, ..., n, a coordinate split at level 0 divides T into 2 subsets T1 = {X1, Y1} and T2 = {X2, Y2} such that the sum of the values of the objective applied to each subset is minimized. Level-0 loop: for each coordinate, for each coordinate value: evaluate the objective and check for the minimum; return the coordinate and coordinate value of the minimum.
  • 6. Why do CARTs not require encoding? In regression tasks the objective is the MSE of the y's in Yj, j = 1, 2. In a binary classification task, T1 is associated with class C1 and T2 with class C2, and the objective is the number of correct guesses of Cj in Tj. The crucial thing is that, for the splitting process to work: 1. the types of X and Y are not required to be numerical, 2. no ordering of the values of X and Y is implicitly assumed.
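The level-0 loop for a regression task could be sketched as follows. This is an illustration, not a full CART implementation: it assumes one-vs-rest equality splits (a row goes to T1 when its coordinate equals the candidate value) and uses the sum of the two subsets' MSEs as the objective; the function names and the toy data are hypothetical.

```python
def mse(values):
    """Mean squared deviation of the values from their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def best_level0_split(X, Y):
    """Greedy level-0 search over every (coordinate, value) pair.

    Rows only need to support equality tests, so string-valued
    categories work without any numeric encoding or ordering.
    """
    best = None
    for coord in range(len(X[0])):
        # sorted(..., key=str) only fixes the iteration order for reproducibility.
        for value in sorted({row[coord] for row in X}, key=str):
            y1 = [y for row, y in zip(X, Y) if row[coord] == value]
            y2 = [y for row, y in zip(X, Y) if row[coord] != value]
            if not y1 or not y2:
                continue  # skip splits that leave one side empty
            score = mse(y1) + mse(y2)
            if best is None or score < best[0]:
                best = (score, coord, value)
    return best

# Toy data: (colour, region) -> sales-like target.
X = [("blue", "north"), ("orange", "north"), ("green", "south"), ("orange", "south")]
Y = [2.0, 6.0, -2.0, 6.0]

score, coord, value = best_level0_split(X, Y)
print(score, coord, value)
```

Nothing in the search compares categories with `<` or does arithmetic on them, which is the point of the two conditions listed on the slide.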
  • 7. Learning Embeddings in Tensorflow: We're using an example from the retail industry. The data is sales counts of prepared meat and burger products for a group of stores of a large food retailer in the US. Line items are sales counts per store, calendar day and stock keeping unit (SKU). The objective is to estimate sales given a SKU, location and day. We'll employ a feed-forward neural network (FFNN) with just a single hidden layer and an objective that is not the MSE, because the MSE is not suited for count data. A random variable Y ∈ Z+ is said to have the Poisson distribution with parameter μ if it takes non-negative integer values y = 0, 1, 2, ... with probability P(Y = y) = e^(-μ)⋅μ^y / y!
  • 8. Learning Embeddings in Tensorflow: The reason for using the above as a model for the distribution of SKU sales is its relation to the binomial distribution (Bernoulli trials): if Xj, j = 1, 2, ... are independent Bernoulli variables, i.e. Xj ~ 𝓑(πj), and Σj πj → μ < ∞, then Σj Xj ~ Poisson(μ). Fix a product, say S, that sold n items yesterday at Whole Foods Midtown ATL. Each Xj roughly represents a customer that buys S with probability πj, and n = Σj Xj. From this point the process of deriving a loss is pretty much standard: we set y = w^T x, where w is the weight vector and x the input, and minimize the negative log likelihood.
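The Poisson probability and the resulting loss can be written out directly. This is a minimal NumPy sketch, not the talk's TensorFlow code: the negative log likelihood of a count y under mean μ is μ - y∙log(μ) + log(y!), and the function names are hypothetical. How μ is produced from w^T x (for example through an exponential to keep it positive) is left out as an assumption the slides do not settle.

```python
import math
import numpy as np

def poisson_pmf(y, mu):
    """P(Y = y) = exp(-mu) * mu**y / y! for a non-negative integer y."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

def poisson_nll(y, mu):
    """Mean negative log likelihood of counts y under Poisson means mu."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    # lgamma(y + 1) = log(y!), computed stably for large counts.
    log_factorial = np.array([math.lgamma(v + 1.0) for v in y])
    return float(np.mean(mu - y * np.log(mu) + log_factorial))

counts = [2, 4, 6]                      # observed sales counts (toy values)
nll_at_mean = poisson_nll(counts, [4.0, 4.0, 4.0])
nll_low = poisson_nll(counts, [3.0, 3.0, 3.0])
nll_high = poisson_nll(counts, [5.0, 5.0, 5.0])
pmf_total = sum(poisson_pmf(k, 3.0) for k in range(60))
```

For a constant prediction the loss is smallest when μ equals the sample mean of the counts, which is the behaviour one wants from a count-data objective.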
  • 9. Input Encodings. OH encoding (ohh…): SKU IDs, calendar days and store locations are OH encoded. This creates an input space of several hundred or thousand binary variables, depending on the size of the assortment and the number of stores. This is an issue for memory as soon as the number of training examples is more than a few thousand (certain precautions can be taken, though!). Vector space encoding: instead of store IDs we use geospatial coordinates (lat | long). Calendar days are mapped to R^2 using a VS representation that brings closely together days around a year's end: day number j → (cos(2πj/365), sin(2πj/365)).
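The calendar-day mapping is easy to try out. The sketch below implements the cos/sin formula from the slide; the function names and the chosen example days are assumptions for illustration.

```python
import math

def encode_day(day_number, period=365):
    """Map a day number j to the point (cos(2*pi*j/period), sin(2*pi*j/period))."""
    angle = 2.0 * math.pi * day_number / period
    return (math.cos(angle), math.sin(angle))

def distance(p, q):
    """Euclidean distance between two points in R^2."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

dec31 = encode_day(365)
jan1 = encode_day(1)
jul1 = encode_day(182)

year_end_gap = distance(dec31, jan1)   # days 365 and 1 are neighbours on the circle
half_year_gap = distance(jan1, jul1)   # days half a year apart are nearly opposite
print(year_end_gap, half_year_gap)
```

As raw integers, day 365 and day 1 are 364 apart; on the circle they are adjacent, which is the property the slide is after.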
  • 10. How does it work? [Diagram; recoverable labels: j → embedding_j (K-dim), other inputs, W(1), b(1), a(1), h(1), ..., a(n), h(n) = yhat]
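One plausible reading of the diagram is sketched below in NumPy: the category index j picks row j of an embedding table, the K-dim vector is joined with the other inputs, and the result passes through one hidden layer to produce yhat. Everything here is an assumption for illustration (the sizes, the names `E`, `W1`, `b1`, `w2`, `b2`, the ReLU activation and the exponential output); it is not the speaker's TensorFlow code.

```python
import numpy as np

rng = np.random.default_rng(0)
n_categories, K, n_other, n_hidden = 1000, 8, 3, 16

# Trainable parameters (randomly initialised here; training is not shown).
E = rng.normal(scale=0.1, size=(n_categories, K))        # one K-dim row per category
W1 = rng.normal(scale=0.1, size=(K + n_other, n_hidden))
b1 = np.zeros(n_hidden)
w2 = rng.normal(scale=0.1, size=n_hidden)
b2 = 0.0

def forward(j, other_inputs):
    """Forward pass for one example with category index j."""
    e_j = E[j]                                # embedding lookup
    x = np.concatenate([e_j, other_inputs])   # join with the other inputs
    a1 = x @ W1 + b1                          # pre-activation a(1)
    h1 = np.maximum(a1, 0.0)                  # hidden activation h(1)
    a2 = h1 @ w2 + b2
    return float(np.exp(a2))                  # positive output, e.g. a Poisson mean

# The lookup E[j] equals a one-hot vector times E, without building the vector.
one_hot = np.zeros(n_categories)
one_hot[42] = 1.0
lookup_matches_one_hot = bool(np.allclose(one_hot @ E, E[42]))

yhat = forward(42, np.array([0.1, 0.2, 0.3]))
```

The equality between the row lookup and the one-hot product is why an embedding can be seen as an evolution of OH encoding: the first layer's weights for a category are its embedding.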
  • 11. The Tensorflow code (go to Jupyter)
  • 12. The gain of SKU embedding: Suppose the objective is to estimate a kind of 'market basket' when cashier transaction data is not available, i.e. groups of SKUs with approximately the same sales across days and stores. This is a core problem in assortment planning: estimate the number | percentage of product items I'll need to stock for the next week | month | season. Probably more involved is the use of assortments in demand forecasting: estimate a product's sales for the next period from its sales history. How is the above related to the learnt VS embeddings of SKUs? The core insight is that neighboring values in the embedding space have similar sales across stores and days. Well, not exactly: currently the best theoretical result we have is this: m∙‖e1 - e2‖ ≤ E_x‖yhat(e1, x) - yhat(e2, x)‖ ≤ M∙‖e1 - e2‖ with m ≤ M. Practice shows, though, that the insight holds.
  • 13. Embedding projectors: An embedding projector tries to create a 2D or 3D scatterplot from a multidimensional set of points. The purpose is to retain as much variance in the original set as possible. PCA is the most widely used method; however, it fails in high-dimensional spaces or complex geometries. The proposed method there is t-SNE. It learns the positions of 2D | 3D points by minimizing the KL divergence between probability distributions it defines for the original space and its t-SNE image (what a hack!). The reference examples of MNIST and Word2Vec are in the tensorboard-projector page.
  • 14. An example from telecoms: Telecom operators exploit the call graph of their subscribers using elementary or more advanced methods. Given a log of calls between subscribers (voice and texts) over a time period of N days, they define the strength of a relation between subscribers by the number and duration of calls they make to one another. Variations take into account the time of day, the day of week, the uniformity of call frequency etc. The network | community of a subscriber X is the set of subscribers with the strongest relation with X. An approach in line with our discussion is to use the call graph to map the subscribers into an embedding space. A subscriber's community is then the set of nearest neighbors in the embedding space (obviously).
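Once subscribers have embedding vectors, reading off a community is a nearest-neighbor query. The sketch below assumes Euclidean distance and a brute-force search over a small, made-up embedding matrix; the function name and the data are hypothetical, and how the vectors are learnt from the call graph is not shown.

```python
import numpy as np

def nearest_neighbors(embeddings, index, k):
    """Return the indices of the k rows closest to row `index` (itself excluded)."""
    dists = np.linalg.norm(embeddings - embeddings[index], axis=1)
    dists[index] = np.inf          # a subscriber is not its own neighbor
    return np.argsort(dists)[:k]

# Toy 2-dim embeddings for five subscribers: two tight groups.
embeddings = np.array([
    [0.0, 0.0],
    [0.1, 0.0],
    [0.0, 0.2],
    [5.0, 5.0],
    [5.1, 5.0],
])

community_of_0 = nearest_neighbors(embeddings, 0, 2)
community_of_3 = nearest_neighbors(embeddings, 3, 1)
print(community_of_0, community_of_3)
```

For a real subscriber base a brute-force scan would likely be replaced by an approximate nearest-neighbor index, but the definition of the community stays the same.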
  • 15. An example from telecoms: There are several benefits of this approach: ▪ Embeddings have memory. As soon as a new call record becomes available, a few iterations of the neural network will accommodate the new information in the existing embedding vectors. This permits real-time community updates. ▪ Embeddings facilitate the visualization of various customer-level measures on their projected manifolds: we can view, for example, the distribution of rate plans or rate plan categories, or the distribution of customer tenure, over the embedding vectors. ▪ The most useful property, though, is the way embeddings can be used to predict the community of a new customer for whom there's no call log yet (but a few things are known initially, e.g. the rate plan, service subscriptions and demographics).
  • 16. How far can we go? Word2Vec was the first REALLY impressive use of a certain novel kind of word embedding. It constructs a language model from a text corpus, i.e. given a part of a sentence it will predict the rest of it. A direct consequence is machine translation: throw in a sentence in Greek and it will translate it to Swahili. Try this out in Google Translate. More? Sunspring is the first movie script completely written by a machine, 2 years ago.
  • 17. Thanx guys. For more pizzas you can track me here: http://www.mltrain.cc http://www.linkedin.con/in.cmalliopoulos