Data Science for Retail Broking
2. Page 2 © AlgoAnalytics All rights reserved
Overview
• About AlgoAnalytics
• Applications of Analytics in Retail Broking
• Retail Broking Case Studies
CEO and Company Profile
Aniruddha Pant
CEO and Founder of AlgoAnalytics
PhD, Control Systems, University of California at Berkeley, USA, 2001
Highlights
• 20+ years in application of advanced mathematical techniques to academic and enterprise problems.
• Experience in application of machine learning to various business problems.
• Experience in financial markets trading, in Indian as well as global markets.
Expertise
• Experience in cross-domain application of the basic scientific process.
• Research in areas ranging from biology to financial markets to military applications.
• Close collaboration with premier educational institutes in India, USA & Europe.
• Active involvement in the startup ecosystem in India.
Prior Experience
• Vice President, Capital Metrics and Risk Solutions
• Head of Analytics Competency Center, Persistent Systems
• Scientist and Group Leader, Tata Consultancy Services
About AlgoAnalytics
Analytics Consultancy
• Work at the intersection of mathematics and other domains
• Harness data to provide insight and solutions to our clients
Led by Aniruddha Pant
• 30+ data scientists with experience in mathematics and engineering
• Team strengths include the ability to deal with structured/unstructured data, and classical ML as well as deep learning using cutting-edge methodologies
Expertise in Mathematics and Computer Science
• Develop advanced mathematical models or solutions for a wide range of industries:
• Financial services, retail, economics, healthcare, BFSI, telecom, …
Working with Domain Specialists
• Work closely with domain experts, either from the client's side or our own, to effectively model the problem to be solved
4. AlgoAnalytics - One Stop AI Shop
Structured Data
• We use structured data to design our predictive analytics solutions like churn and recommender systems
• We use techniques like clustering and Recurrent Neural Networks
Text Data
• We use text data analytics for designing solutions like sentiment analysis, news summarization and much more
• We use techniques like natural language processing, word2vec, deep learning and TF-IDF
Image Data
• Image data is used for predicting the existence of a particular pathology, image recognition and many others
• We use techniques like deep learning (convolutional neural networks, artificial neural networks) and technologies like TensorFlow
Sound Data
• We use sound data to design factory solutions like air leakage detection, identification of empty and loaded strokes from press data, and engine-compressor fault detection
• We use techniques like deep learning
Retail
•Churn Analysis
•Recommender System
•Image Analytics – image recognition,
tagging, original and substitute
differentiation
Healthcare
•Medical Image Diagnostics
•Work flow optimization
•Cash flow forecasting
Legal
•Contracts Management
•Structured Document decomposition
•Document similarity in text analytics
Internet of Things
•Assisted Living
•Predictive in ovens
•Air leakage detection
•Engine/compressor fault detection
Others
•Algorithmic trading strategies
•Risk sensing – network theory
•Network failure model
•Multilanguage sentiment analytics
Technology:
Data Science for Broking
• Predict Dormancy – Finding which clients are unlikely to trade and take action
• Stock Recommender System – Suggesting stocks that will increase probability of trading for a particular customer
• Dormancy visualization – Identifying KPIs for visualizing dormancy
• RM change – Assessing impact of change of relationship manager on trading activity
• Mobile Brokerage – Origination, pricing and valuation of loans
• Dynamic pricing models – Predicting brokerage slabs and sensitivities of trading volumes to broking charges
• Channel adoption and preference – Use demographics and trading data to build a classification model
• Portfolio Analytics – Analysis of client portfolio and suggestions for changes as per risk profile
• Risk profiling – Margin and default risks
• Document similarity in text analytics – Automatic email classification, determining the topic of complaint etc.
• Automated news download – Downloading news that is relevant for customers
• Sentiment analysis – Using text analytics to decide customer response to various offerings
• News Summarization – Automatic download of relevant news items, news summarization
Customer churn
Customer churn prediction: take customers' past activity data and use distributed file systems and cluster computing to predict churn.
Process Data → Compute Features → Train the classification model → Predict / Score
Predicted Returning Customers
• Target with loyalty programs
Predicted ‘Not Returning’ Customers
• Target with other offers and discounts
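The process above can be sketched end to end in a few lines. This is a minimal single-machine illustration, not the distributed setup the slide refers to: the two features, the toy data and the 0.5 cut-off are all assumptions, and a hand-written logistic regression stands in for whatever classifier was actually used.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by full-batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def score(w, b, x):
    """Predicted probability that the customer does not return."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical features in [0, 1]:
# (scaled days since last activity, scaled activity frequency).
# Label 1 = customer did not return, 0 = customer returned.
X_train = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.2]]
y_train = [0, 0, 1, 1]
w, b = train_logistic(X_train, y_train)

# Score new customers and route them to the two campaigns.
customers = {"C1": [0.15, 0.85], "C2": [0.85, 0.15]}
loyalty, offers = [], []
for cid, feats in customers.items():
    (offers if score(w, b, feats) >= 0.5 else loyalty).append(cid)
```

Here `loyalty` collects predicted returning customers and `offers` collects predicted 'not returning' customers, mirroring the two target groups on the slide.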
Dormancy Prediction: predicts customers likely to stop trading
Trades data → Data aggregation quarter-wise → Train data → Modelling (Machine learning algorithms) → Prediction → Result Evaluation
• Train data: features from Q1 Q2 Q3 Q4, label from Q5. Customers with no trades in Q5 were marked as DORMANT.
• Test data: features from Q2 Q3 Q4 Q5
• Roughly 6% of the clients are responsible for ~80% of the loss
Examples of features used
• Past Brokerage
• Number of Trades
• Margin Amount
• Exchange
• Ledger Amount
Client profiles in terms of attributes computed from past trading.
- Active clients = 1.03Mn
- Active clients for which trade data is available = 346K
- Average count of active clients who traded at least once during train period = 254K
Prediction for quarter  Jul – Sep 2015  Oct – Dec 2015
Accuracy                81.10%          78.30%
Sensitivity             88.35%          75.42%
Specificity             72.78%          81.90%
Prevalence              53.4%           55.57%
AUC                     89.56%          88.21%
Total clients           252845          255873
% Growth in Nifty       -5.01%          -0.03%
• INR 1.6 M brokerage from 2,200 reactivated clients (11% of 20K, CRM assigned).
• INR 309 K brokerage from 1,881 reactivated clients (4.8% of 39K, CRM not assigned).
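The quarter-wise train and test layout described above amounts to a sliding window over quarterly aggregates. The sketch below is one plausible reading of it, not the authors' pipeline: the dictionary layout, the client IDs and the use of trade counts as the only feature are assumptions made for illustration.

```python
def build_window(trades_by_quarter, feature_quarters, label_quarter=None):
    """Build per-client feature rows (trade counts per quarter) and, when a
    label quarter is given, a DORMANT flag for clients with no trades in it."""
    clients = sorted({c for q in feature_quarters for c in trades_by_quarter[q]})
    rows = {}
    for c in clients:
        features = [trades_by_quarter[q].get(c, 0) for q in feature_quarters]
        dormant = None
        if label_quarter is not None:
            dormant = trades_by_quarter[label_quarter].get(c, 0) == 0
        rows[c] = (features, dormant)
    return rows

# Hypothetical quarterly trade counts per client.
trades = {
    "Q1": {"A": 5, "B": 2},
    "Q2": {"A": 3, "B": 1},
    "Q3": {"A": 4},
    "Q4": {"A": 6, "B": 1},
    "Q5": {"A": 2},
}

# Train on Q1..Q4 with the label taken from Q5.
train = build_window(trades, ["Q1", "Q2", "Q3", "Q4"], label_quarter="Q5")
# Shift the window by one quarter for scoring; no label is attached here.
test = build_window(trades, ["Q2", "Q3", "Q4", "Q5"])
```

In this toy data client B has no trades in Q5, so its training row carries the DORMANT flag, while client A's does not.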
Recommender System
What is RecSys?
• Aims to predict user preferences based on historical activity and implicit / explicit feedback
Value of Recommendation
• Helps in presenting the most relevant information (e.g. list of products / services)
RecSys Modeling and Applications
• Collaborative filtering: uses the user's behavior and that of similar users
• Content-based filtering: uses discrete characteristics of items
• Modeling: nearest neighbor modeling, matrix factorization and factorization machines, classification learning models
• Applications: movies, music, news, books, search queries, social tags, etc.
• Applications: financial services, insurance, Intel business units (BUs), sales and marketing
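To make the nearest-neighbor flavor of collaborative filtering concrete, here is a minimal item-based sketch. It is an illustration only: the client IDs, the ticker-like item names and the choice of cosine similarity over co-holding users are assumptions, not details from the slides.

```python
from collections import defaultdict
from math import sqrt

def item_similarities(user_items):
    """Cosine similarity between items, based on which users hold both."""
    item_users = defaultdict(set)
    for user, items in user_items.items():
        for item in items:
            item_users[item].add(user)
    sims = defaultdict(dict)
    for a in item_users:
        for b in item_users:
            if a != b:
                overlap = len(item_users[a] & item_users[b])
                if overlap:
                    sims[a][b] = overlap / sqrt(len(item_users[a]) * len(item_users[b]))
    return sims

def recommend(user, user_items, sims, k=2):
    """Rank items the user has not seen by summed similarity to the user's items."""
    seen = user_items[user]
    scores = defaultdict(float)
    for item in seen:
        for other, s in sims.get(item, {}).items():
            if other not in seen:
                scores[other] += s
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]

# Hypothetical implicit feedback: stocks each client has traded.
user_items = {
    "u1": {"INFY", "SBI"},
    "u2": {"INFY", "SBI", "LT"},
    "u3": {"INFY", "LT"},
    "u4": {"SBI"},
}
sims = item_similarities(user_items)
print(recommend("u4", user_items, sims))
```

Matrix factorization or a classification model would replace `item_similarities` with a learned scoring function; the ranking step stays the same.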
Stock Based Recommender System
Data Filtering
• Discard short-lived sessions
• Remove rare items
• Consider only top ‘n’ most popular items
Training and Testing
• Training, validation and testing sets
• Deep learning
• Final recommendations
Evaluation
• Recall: share of test cases in which the actual next item in the sequence is in the top ‘k’ recommendations
• Observed Recall@5: 30.39%
Last 5 stocks bought            Recommended Stock              Probability
Intraday INFOSYS_TECHNOLOGIES   Intraday State Bank of India   0.1629
Intraday MOTHERSON_SUMI_SYSTEM  Intraday L&T                   0.137
Intraday UTI BANK LTD           Longterm ICICI Banking Copora  0.124
Intraday ASIAN PAINT            Intraday ICICI Banking Copora  0.0709
Intraday LIC HOUSING FIN        Intraday Bharat Forge          0.061
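The data filtering steps and the Recall@k metric for the stock recommender can be sketched as follows. The thresholds, the order in which the filters are applied and the toy sessions are assumptions for illustration; the deep learning model itself is out of scope here.

```python
from collections import Counter

def filter_sessions(sessions, min_len=2, min_count=2, top_n=100):
    """Keep the top_n most popular items seen at least min_count times,
    then drop sessions that end up shorter than min_len."""
    counts = Counter(item for s in sessions for item in s)
    keep = {item for item, c in counts.most_common(top_n) if c >= min_count}
    filtered = [[item for item in s if item in keep] for s in sessions]
    return [s for s in filtered if len(s) >= min_len]

def recall_at_k(recommendations, actual_next, k=5):
    """Share of test cases whose true next item is among the top-k recommendations."""
    hits = sum(1 for recs, truth in zip(recommendations, actual_next) if truth in recs[:k])
    return hits / len(actual_next)

# Hypothetical sessions of traded items.
sessions = [["A", "B", "C"], ["A", "B"], ["Z"], ["B", "C", "Y"]]
print(filter_sessions(sessions))

# Hypothetical ranked recommendations and the item actually traded next.
recs = [["A", "B", "C"], ["C", "A", "B"], ["B", "C", "A"]]
truth = ["B", "B", "D"]
print(recall_at_k(recs, truth, k=2))
```

With `k=5` the same function gives the kind of Recall@5 figure quoted for this slide, computed over the held-out test sequences.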
RM Change Risk Evaluation
RM’s Customer Ranking
• Rank clients under an RM in terms of their net worth and brokerage
Change in Ranking
• Determine changes in ranking of net worth and brokerage
Develop Classification Model
• Use RM and client features to determine whether a client will rank worse after an RM change
• Evaluating the impact of an RM change by identifying the at-risk customers managed by the RM who is leaving the organization:
Take 90 days before and after an RM change → Sum pre and post brokerage and net worth → Sort clients according to brokerage and net worth ranking → Develop model to predict change in ranking on RM change
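The first three steps (90-day windows, pre and post sums, ranking) produce the target label for the classification model. A minimal sketch follows, using brokerage only; the record layout, client IDs and dates are hypothetical, and net worth could be handled the same way.

```python
from datetime import date, timedelta

def window_totals(records, change_date, days=90):
    """Sum each client's brokerage in the `days` before and after the RM change."""
    start = change_date - timedelta(days=days)
    end = change_date + timedelta(days=days)
    pre, post = {}, {}
    for client, day, brokerage in records:
        if start <= day < change_date:
            pre[client] = pre.get(client, 0.0) + brokerage
        elif change_date <= day < end:
            post[client] = post.get(client, 0.0) + brokerage
    return pre, post

def ranks(totals, clients):
    """Rank 1 = highest total; clients with no activity count as zero."""
    ordered = sorted(clients, key=lambda c: (-totals.get(c, 0.0), c))
    return {c: i + 1 for i, c in enumerate(ordered)}

def ranked_worse(records, change_date):
    """Label each client True if their rank is worse after the RM change."""
    pre, post = window_totals(records, change_date)
    clients = set(pre) | set(post)
    before, after = ranks(pre, clients), ranks(post, clients)
    return {c: after[c] > before[c] for c in clients}

# Hypothetical (client, trade date, brokerage) records around an RM change.
change = date(2015, 7, 1)
records = [
    ("A", date(2015, 5, 10), 900.0),
    ("B", date(2015, 6, 1), 500.0),
    ("C", date(2015, 6, 15), 100.0),
    ("A", date(2015, 7, 20), 50.0),
    ("B", date(2015, 8, 1), 400.0),
    ("C", date(2015, 8, 5), 300.0),
]
labels = ranked_worse(records, change)
```

In this toy data client A drops from rank 1 to rank 3, so only A is labelled as ranking worse; these labels would then be joined with RM and client features to train the classifier.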
• Results based on changes in net worth and brokerage revenue:
           Accuracy  Sensitivity  Specificity  Prevalence  Kappa   AUC
Brokerage  74.53%    70.92%       77.02%       40.88%      0.4764  83.09%
Net Worth  74.88%    73.53%       76.07%       46.76%      0.4957  82.28%
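The reported metrics are linked: accuracy and Cohen's kappa follow from sensitivity, specificity and prevalence. The short check below works in rates rather than counts, since the slides do not give the confusion matrix; the function name is illustrative.

```python
def accuracy_and_kappa(sensitivity, specificity, prevalence):
    """Derive accuracy and Cohen's kappa from sensitivity, specificity and prevalence."""
    tp = sensitivity * prevalence              # true positive rate of the population
    tn = specificity * (1 - prevalence)        # true negative rate of the population
    fp = (1 - specificity) * (1 - prevalence)  # false positive rate of the population
    accuracy = tp + tn
    predicted_pos = tp + fp
    # Agreement expected by chance, from the marginal rates.
    expected = predicted_pos * prevalence + (1 - predicted_pos) * (1 - prevalence)
    kappa = (accuracy - expected) / (1 - expected)
    return accuracy, kappa

# Brokerage row: sensitivity 70.92%, specificity 77.02%, prevalence 40.88%.
acc, kappa = accuracy_and_kappa(0.7092, 0.7702, 0.4088)
print(f"accuracy={acc:.4f} kappa={kappa:.4f}")
```

For the brokerage row this comes out close to the reported 74.53% accuracy and 0.4764 kappa, and the net worth row behaves the same way, which suggests the table is internally consistent.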
Cluster Discovery, Description & Visualization using Predicted Labels
Cluster Discovery
• Cluster analysis can help identify groups or clusters of similar clients.
• A functional description of each cluster can help make inferences on client dormancy.
• Cluster analysis can be performed for predicted dormants.
• Features such as traded days, net worth and turnover (TO) may group clients into separate clusters.
Cluster Visualization
• Figure: Prominent 2D View of Clusters
• Figure: 3D Visualization for Cluster Separation
Cluster Description
• Each cluster can be used as a business action item for reaching out to clients.
• Represent each cluster by a single point in the feature space.
• Make inferences on client behavior from these representations.
Cluster Size  Q4 Traded Days  Q4 Min Net Worth (Lacs)  M3 Traded Days  M3 Peak Net Worth (Lacs)  Q4 NSE Traded Days
13736         1.48            1.43                     0.00            0.00                      1.27
373           17.97           0.20                     0.76            0.19                      17.68
7952          1.22            0.60                     1.07            0.82                      1.09
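Representing each cluster by a single point is commonly done with the per-cluster mean of each feature. The sketch below assumes the cluster labels already come from some clustering algorithm, which the slides do not name; the clients and feature values are hypothetical.

```python
def centroids(points, labels):
    """Mean feature vector per cluster label."""
    sums, counts = {}, {}
    for p, lab in zip(points, labels):
        acc = sums.setdefault(lab, [0.0] * len(p))
        for j, v in enumerate(p):
            acc[j] += v
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

# Hypothetical clients: (Q4 traded days, Q4 min net worth in lacs).
points = [(1.0, 1.5), (2.0, 1.3), (18.0, 0.2), (17.0, 0.3)]
labels = [0, 0, 1, 1]
summary = centroids(points, labels)
```

Each entry of `summary` plays the role of one row in a cluster description table: a rarely trading group versus a frequently trading group with low net worth, for example.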
Mobile Brokerage Analysis
For example: in June 2015 there were 8693 clients on mobile, of whom 5390 remained until January 2016. We computed different brokerage features based on the clients present in June.
Take the number of mobile and non-mobile users in June 2015 → Trace the same set of clients in the following months → Analyze the difference in their trade behavior over the months
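Tracing a fixed June cohort through later months is a set intersection per month. A minimal sketch, with hypothetical client IDs and month keys:

```python
def cohort_retention(active_by_month, base_month):
    """For the clients active in base_month, return the share still trading in each month."""
    cohort = active_by_month[base_month]
    return {month: len(cohort & clients) / len(cohort)
            for month, clients in active_by_month.items()}

# Hypothetical client IDs trading in each month.
active = {
    "2015-06": {"c1", "c2", "c3", "c4"},
    "2015-07": {"c1", "c2", "c3", "c9"},
    "2015-08": {"c1", "c2"},
}
retention = cohort_retention(active, "2015-06")
```

Client `c9` joins in July but is ignored because it was not in the June cohort. Applied to the figures quoted above, 5390 of 8693 mobile clients corresponds to a retention of about 0.62.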
Mobile Brokerage Analysis: Results
Drop in average total brokerage and ratio of total clients retained are both expressed relative to June 2015 (= 1).
Non-Mobile: clients with mobile brokerage <= 0 in June 2015
Month           # traded on mobile  Total # traded  Total brokerage  Avg total brokerage  Drop in avg brokerage  Retained ratio
June 2015       0                   48669           121861024.8      2503.873612          1                      1
July 2015       1878                32408           128207743        3956.052303          1.57997284             0.6659
August 2015     3625                31692           124188548.5      3918.608751          1.56501859             0.6512
September 2015  3475                29102           95712270.28      3288.855415          1.313506959            0.5980
October 2015    2968                24533           94536048.35      3853.423892          1.538984984            0.5041
November 2015   2970                23876           84523884.11      3540.11912           1.413856954            0.4906
December 2015   3575                24383           75354303.71      3090.444314          1.234265299            0.5010
January 2016    3179                23220           63526022.15      2735.832134          1.092639868            0.4771
Mobile: clients with mobile brokerage > 0 in June 2015
Month           # traded on mobile  Total # traded  Total brokerage  Avg total brokerage  Drop in avg brokerage  Retained ratio
June 2015       8693                8693            31603303.92      3635.488775          1                      1
July 2015       5412                7244            31868831.43      4399.341721          1.21011011             0.8333
August 2015     5474                7089            31138534.24      4392.514351          1.208232131            0.8155
September 2015  4606                6417            23532889.71      3667.272824          1.008742717            0.7382
October 2015    4036                5693            20816096.02      3656.437031          1.005762157            0.6549
November 2015   3874                5565            19891240.14      3574.346835          0.98318192             0.6402
December 2015   3961                5556            17288433.75      3111.669141          0.855914936            0.6391
January 2016    3823                5390            14874570.71      2759.660613          0.759089296            0.6200
Figure: Comparison of clients retained (ratio of total clients retained in mobile vs. non-mobile customers, Jun-15 to Jan-16)
Figure: Comparing average drop in total brokerage in mobile and non-mobile (Jun-15 to Jan-16)
Total mobile clients retained = 0.62
Total non-mobile clients retained = 0.48
More mobile clients are retained than non-mobile clients.
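The two derived columns can be recomputed from the raw counts and brokerage totals. The check below uses only the June 2015 and January 2016 rows from the tables above; the function and variable names are illustrative.

```python
# (total traded clients, total brokerage) for June 2015 and January 2016.
rows = {
    "mobile": {"June 2015": (8693, 31603303.92), "January 2016": (5390, 14874570.71)},
    "non-mobile": {"June 2015": (48669, 121861024.8), "January 2016": (23220, 63526022.15)},
}

def relative_to_base(segment, month, base="June 2015"):
    """Return (clients retained, average brokerage) for `month`, each as a ratio to `base`."""
    clients, brokerage = segment[month]
    base_clients, base_brokerage = segment[base]
    retained = clients / base_clients
    avg_ratio = (brokerage / clients) / (base_brokerage / base_clients)
    return retained, avg_ratio

for name, segment in rows.items():
    retained, avg_ratio = relative_to_base(segment, "January 2016")
    print(f"{name}: retained={retained:.2f}, average brokerage ratio={avg_ratio:.2f}")
```

The retained ratios round to the 0.62 and 0.48 quoted above. The average brokerage per retained client ends below its June level for the mobile group and slightly above it for the non-mobile group.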
Text Analytics for Retail broking
Text analytics
Identifying Important News Items
• Creation of an algorithm that separates out market/stock-relevant news and filters out noise using machine learning
• Supervised and unsupervised machine learning approaches can be tried to enhance the performance of the model.
• Latest research from Natural Language Processing (NLP) and text analytics can yield interesting results.
News Summarization
• Automatic summarization is the process of reducing a text document with a computer program in order to create a summary that retains the most important points of the original document
Extraction
• Includes topic detection, scoring and extraction of the most relevant text segment
ML Techniques
• Techniques like RNNLM (recurrent neural network language modeling) and recurrent convolutional neural networks will be applied
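The scoring-and-extraction idea can be shown with a very simple extractive summarizer. This is a frequency-based stand-in, not the RNNLM or recurrent convolutional approach named on the slide; the stopword list and the sample article are assumptions.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "on", "for", "its", "as", "by"}

def summarize(text, n_sentences=1):
    """Score sentences by the average frequency of their non-stopword terms
    and return the top ones in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    tokenized = [[w for w in re.findall(r"[a-z']+", s.lower()) if w not in STOPWORDS]
                 for s in sentences]
    freq = Counter(w for words in tokenized for w in words)
    scores = [sum(freq[w] for w in words) / len(words) if words else 0.0
              for words in tokenized]
    top = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))[:n_sentences]
    return [sentences[i] for i in sorted(top)]

# Hypothetical news item.
article = (
    "The bank reported higher quarterly profit. "
    "Profit at the bank rose as loan growth improved. "
    "Analysts met in Mumbai on Tuesday."
)
print(summarize(article))
```

Sentences that repeat the document's most frequent terms score highest, so the unrelated last sentence is dropped first. A learned language model would replace the frequency score while keeping the same select-and-reorder step.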
Document Similarity in Text Analytics
Problem Statement - For each clause in the input document, find the semantically most similar clause in a standard clause library
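A simple baseline for this matching problem is TF-IDF vectors compared by cosine similarity. The sketch below is an assumption about one workable approach, not the method used in the case study: TF-IDF captures word overlap rather than meaning, so an embedding method such as word2vec would be closer to true semantic matching. The clause texts are hypothetical and the IDF uses a common smoothed form.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * (math.log((1 + n) / (1 + df[t])) + 1)
                        for t, c in tf.items()})
    return vectors

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

def most_similar_clause(clause, library):
    """Index of the library clause closest to `clause` by TF-IDF cosine similarity."""
    vectors = tfidf_vectors(library + [clause])
    query = vectors[-1]
    return max(range(len(library)), key=lambda i: cosine(query, vectors[i]))

# Hypothetical standard clause library and input clause.
library = [
    "Either party may terminate this agreement with thirty days written notice.",
    "The receiving party shall keep all confidential information secret.",
    "Payment is due within thirty days of the invoice date.",
]
clause = "Each party shall keep confidential information of the other party secret."
print(most_similar_clause(clause, library))
```

Running `most_similar_clause` once per clause of the input document gives the clause-by-clause mapping described in the problem statement.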
Algorithmic Trading Strategy
Methodology Technology Infrastructure