Machine learning in finance using python
1. MACHINE LEARNING IN FINANCE
USING PYTHON
ERIC THAM
Director, Quant Strategies
Presentation Slides on
http://www.slideshare.net/erictham/machine-learning-in-finance-using-python
3. MACHINE LEARNING IN FINANCE
Questions:
How do you recognise patterns in financial data?
What data do you need, and what do you use it for?
Unlike the usual applications such as facial recognition and NLP
4. MACHINE LEARNING IN FINANCE
i. Sentiment analysis (behavioural finance)
ii. Credit analytics
iii. Financial forecasting
iv. Portfolio allocation
5. MACHINE LEARNING PYTHON LIBRARIES
Libraries:
i. scikit-learn
ii. Theano
iii. statsmodels
Sentiment analysis generally uses machine learning.
6. GENERAL FORECASTING: (MACHINE LEARNING)
Three steps to any forecasting (or machine learning) task:
1. Preprocess and transform data:
- On both output and input: this is key; it is an art and a science;
- in finance: these could be economic variables, sentiment data, price data
2. Model :
- CART, neural network, logistic regression etc.
- time period
3. Assess and backtest
- statistical output;
- in sample and out of sample
Go back to 1 if necessary.
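The three steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not the presenter's code: the data is synthetic, the model is a one-variable least-squares fit standing in for CART or a neural network, and the names `zscore_params`, `fit_line` and `mse` are hypothetical. What it tries to show is that the transformation is estimated on the in-sample period only, and that the error is reported both in sample and out of sample.

```python
import random

def zscore_params(values):
    """Mean and standard deviation, estimated on the in-sample data only."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var ** 0.5

def fit_line(x, y):
    """Ordinary least squares for y = a + b * x with a single input."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def mse(x, y, a, b):
    """Mean squared error of the fitted line on (x, y)."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / len(x)

# Synthetic data (assumption): an input signal and a noisy output driven by it.
rng = random.Random(0)
signal = [rng.gauss(5.0, 2.0) for _ in range(300)]
target = [0.3 * s + rng.gauss(0.0, 0.5) for s in signal]

# Step 1: preprocess. Split in time order, then normalise with in-sample statistics.
split = 200
mean, std = zscore_params(signal[:split])
x_in = [(s - mean) / std for s in signal[:split]]
x_out = [(s - mean) / std for s in signal[split:]]
y_in, y_out = target[:split], target[split:]

# Step 2: model.
a, b = fit_line(x_in, y_in)

# Step 3: assess in sample and out of sample.
in_sample_mse = mse(x_in, y_in, a, b)
out_sample_mse = mse(x_out, y_out, a, b)
print(f"in-sample MSE={in_sample_mse:.3f}, out-of-sample MSE={out_sample_mse:.3f}")
```

If the out-of-sample error is much worse than the in-sample error, that is the cue to go back to step 1.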
7. BUILDING A FINANCIAL FORECASTING MODEL IN
PYTHON
1. Sourcing data - retrieves data from sources e.g. Quandl, pandas.io, Yahoo
Finance, proprietary databases (go to datasource.py file)
8. BUILDING FINANCIAL FORECASTING MODEL IN
PYTHON
2. Technical transformation of data (dataTechnical.py)
- technical indicators like RSI, MACD, KDJ:
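As an illustration of one such indicator, here is a minimal RSI sketch. It is not taken from dataTechnical.py; the function name `rsi` and the default 14-period window are assumptions, and it uses simple averages of gains and losses rather than Wilder's smoothing, so values can differ slightly from charting packages.

```python
def rsi(prices, period=14):
    """Relative Strength Index over the last `period` price changes.

    Simple-average variant: RSI = 100 - 100 / (1 + avg_gain / avg_loss).
    """
    if len(prices) < period + 1:
        raise ValueError("need at least period + 1 prices")
    # Price changes between consecutive observations in the window.
    changes = [b - a for a, b in zip(prices[-period - 1:-1], prices[-period:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        # No losses in the window; a completely flat window is treated as neutral.
        return 50.0 if avg_gain == 0 else 100.0
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)

print(rsi([1, 2, 1, 2, 1], period=4))
```

Values near 100 indicate that recent changes were mostly gains, values near 0 mostly losses.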
10. BUILDING FINANCIAL FORECASTING MODEL IN
PYTHON
Training - applies different model parameters (possibly thousands of
combinations) to find the best results
Go to dataTrain.py
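Trying many parameter combinations can be sketched as an exhaustive grid search. This is a hedged illustration, not the contents of dataTrain.py: `grid_search`, `toy_score` and the parameter names `lookback`, `threshold` and `depth` are hypothetical, and `toy_score` is a stand-in for "train the model with these parameters and return a validation score".

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Score every combination in param_grid; return (best_params, best_score, n_tried)."""
    names = list(param_grid)
    best_params, best_score, n_tried = None, float("-inf"), 0
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        n_tried += 1
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score, n_tried

def toy_score(lookback, threshold, depth):
    """Hypothetical stand-in for a validation score; peaks at (20, 0.5, 3)."""
    return -((lookback - 20) ** 2) - 100 * (threshold - 0.5) ** 2 - (depth - 3) ** 2

param_grid = {
    "lookback": range(5, 55, 5),                  # 10 values
    "threshold": [i / 10 for i in range(1, 11)],  # 10 values
    "depth": range(1, 11),                        # 10 values
}
best_params, best_score, n_tried = grid_search(param_grid, toy_score)
print(n_tried, best_params)
```

Three parameters with ten values each already give 1000 combinations, so the score should come from data held out of training; otherwise the search mostly selects the combination that overfits best.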
11. PORTFOLIO SELECTION & ALLOCATION
1. clusterPortfolio.py (K-means)
- aggregates stock features eg. sentiment, technical indicators,
momentum indicators, historical returns, betas etc.
- X (n × m): feature matrix of n stocks with m features each
- these are clustered into K clusters, with the best cluster being
selected
- criteria to use: mean scores, risk levels, portfolio themes, backtest
results etc.
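The clustering step can be sketched with a small K-means written in plain Python. This is an assumption-laden illustration rather than clusterPortfolio.py itself: the tickers and feature values are made up, the features are assumed to be already scaled, the farthest-point initialisation is a deterministic choice made for this sketch, and "best cluster" is taken to mean the cluster with the highest mean past return.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(X, k, n_iter=50):
    """Lloyd's algorithm on X (n rows, m features); returns (labels, centroids)."""
    # Deterministic farthest-point initialisation (assumption for this sketch).
    centroids = [list(X[0])]
    while len(centroids) < k:
        farthest = max(X, key=lambda row: min(sq_dist(row, c) for c in centroids))
        centroids.append(list(farthest))
    labels = [0] * len(X)
    for _ in range(n_iter):
        # Assign each stock to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(row, centroids[c])) for row in X]
        # Move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [row for row, lab in zip(X, labels) if lab == c]
            new_centroids.append(
                [sum(col) / len(members) for col in zip(*members)] if members else centroids[c]
            )
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

# Made-up data: n = 6 stocks, m = 3 features (sentiment, momentum, past return).
tickers = ["A", "B", "C", "D", "E", "F"]
X = [
    [0.90, 0.80, 0.10],
    [0.80, 0.90, 0.12],
    [0.85, 0.85, 0.11],
    [0.10, 0.20, -0.05],
    [0.20, 0.10, -0.04],
    [0.15, 0.15, -0.06],
]
labels, centroids = kmeans(X, k=2)

# Select the cluster whose centroid has the highest mean past return (column 2).
best = max(range(2), key=lambda c: centroids[c][2])
selected = [t for t, lab in zip(tickers, labels) if lab == best]
print(selected)
```

Any of the other criteria listed above (risk level, theme, backtest result) could replace the mean past return in the selection line.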
13. CONCLUSION:
Thank you !
Remember it is an art not a science; machine learning in finance gives you
a framework to understand the system;
Still need intuition and trial-and-error (luck)
My Email : erictham115@yahoo.com
Editor's notes
A self-introduction:
Studied for a PhD in finance at the University of Lausanne, Switzerland (洛桑大学)
Master's in Financial Engineering at Columbia University (哥伦比亚大学)
Master's in Business Analytics (Big Data) at the National University of Singapore
Presently a partner in a data analytics start-up doing web and consumer analytics
Now have an interest in Big Data, especially NLP in finance. Paper: real-time analysis of Twitter sentiment on the NASDAQ markets. Hoping to get it published with some more work!
First real-time (20 mins) study, which sets it apart from other papers
Some interesting findings (to elaborate later)
Definitions in Wikipedia…
Key words – in layman's terms, supervised learning uses a reference (learning from past, labelled experience), whilst unsupervised learning learns from unlabelled data, e.g. clustering, PCA
Questions need to be answered in context; a few areas that I think of are as follows:
The answers: I will not talk much about sentiment analysis, as there was a previous talk on NLTK+;
there are a number of other open source libraries as well, like jieba for NLP (sentiment analysis as a whole uses SVMs and recurrent neural networks)
- Unstructured data analysis
See my link on how Twitter mood drives markets;
Writing a paper on how sentiment drives markets and markets drive sentiment; hope to complete it in the next couple of months
Credit analytics: uses classification for credit scoring: logistic regression, tree-based regression
Assesses a person's creditworthiness based on his credit scores
The first two are not the main point of my presentation; the next two are more so;
It is not my aim to go through all the excellent ML libraries, but I will share those that I use and apply, especially scikit-learn and statsmodels
There is a separate presentation (I understand) using NLTK, and another by a Theano expert (deep learning), so I will not touch on those!
scikit-learn and statsmodels are both good; scikit-learn generally has more functions; for ordinary regressions statsmodels is good enough
Step 1: this actually tests your understanding of the subject matter;
Transformation could be normalisation or thresholding
-> normally involves categorisation, a mixture model, or the frequency of the data
Anything
Step 2: complex models are not necessarily best; model complexity tends to be defined by parameterisation, non-linearity, time variation (stochasticity), meta-models,
and the number of dimensions (of the data)
In forecasting, the model basically says that given this scenario or set of data under this situation, you should get this output with a certain degree of probability.
It is the same with other machine learning in computer science – whether NLP, speech etc.
Step 3: did the model achieve what you want?
Why is financial forecasting so difficult? Because it is a social science! It is hard to deterministically model human emotions, reactions and actions;
Structural changes to model
See code on GitHub;
Criteria can be risk, different returns, drawdown, Sharpe ratio etc.
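Two of these criteria can be sketched as follows. This is a minimal illustration, not the code in the GitHub repository: the function names, the 252-periods-per-year default and the sample numbers are assumptions, and `risk_free` is a per-period rate.

```python
import math

def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualised Sharpe ratio from per-period returns (sample standard deviation)."""
    excess = [r - risk_free for r in returns]
    mean = sum(excess) / len(excess)
    var = sum((e - mean) ** 2 for e in excess) / (len(excess) - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)

def max_drawdown(equity):
    """Largest peak-to-trough fall of an equity curve, as a fraction of the peak."""
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst

# The curve peaks at 120 and then falls to 90, a fall of 30/120 from the peak.
print(max_drawdown([100, 120, 90, 130]))
print(sharpe_ratio([0.01, -0.01, 0.02, 0.0]))
```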
Code: See python slide
See code on GitHub:
In portfolio allocation, Imaibo has the advantage that it has the sentiment data.