Talk @ Taiwan AI Academy, November 17, 2018
Textual Data Analytics in Finance
Dr. Chuan-Ju Wang (王釧茹)
Research Center for Information Technology
Innovation, Academia Sinica
Computational Finance and Data Analytics
Laboratory (CFDA Lab)
http://cfda.csie.org
Chuan-Ju Wang (CITI, AS) Talk @ Taiwan AI Academy November 17, 2018
Quant — Data Scientist
Source: http://www.indeed.com/jobtrends
Source: http://www.computerweekly.com/blogs/Data-Matters/2014/06/data-scientist-the-new-quant.html
Data Science in Finance
Text Analytics
❖ Big Data
❖ Structured Data
❖ user logs, sensor logs, click-through logs, …
❖ Unstructured Data
❖ web texts, user conversations, public opinions, reports, …
❖ Big Data for Unstructured Text – Text Analytics
❖ Goal: turn text into data for analysis by applying natural language processing (NLP) and analytical methods
https://insidebigdata.com/2015/06/05/text-analytics-the-next-generation-of-big-data/
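As a minimal illustration of "turning text into data", the sketch below converts a short passage into word counts (a bag-of-words representation). The tokenizer and the sample sentence are assumptions made for illustration; the talk does not prescribe a particular representation.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split it into alphabetic tokens, and count them."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)


# Hypothetical sample sentence, not taken from a real report.
sample = "The company may default on its debt. A default would hurt the company."
counts = bag_of_words(sample)
print(counts.most_common(3))
```

Once text is in this form, each document becomes a numeric vector that standard analytical methods can consume.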
Textual Sentiment Analysis for
Financial Risk Prediction
On the Risk Prediction and Analysis of Soft
Information in Finance Reports. European Journal of
Operational Research (EJOR), 257(1), 243-250, 2017.
Soft and Hard Information in Finance
❖ The growing amount of financial data makes it increasingly important to learn how to discover valuable information for various financial applications.
❖ In finance, there are typically two kinds of information:
❖ Soft information: text, including opinions, ideas, and market
commentary.
❖ Hard information: numerical values, such as financial measures and
historical prices.
❖ Our work aims to exploit soft information for financial risk prediction.
Risk Proxy: Stock Return Volatility
❖ Stock return
$$R_t = \frac{S_t - S_{t-1}}{S_{t-1}}$$
❖ Stock return volatility
❖ A common risk metric measured by the standard deviation of returns over a period of time.
$$v_{[t-n,t]} = \sqrt{\frac{\sum_{i=t-n}^{t} (R_i - \bar{R})^2}{n}}, \quad \text{where } \bar{R} = \frac{\sum_{i=t-n}^{t} R_i}{n+1}.$$
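The two definitions on this slide can be sketched in a few lines of Python. This is a minimal sketch: it assumes S_t is a daily closing price, and the price series is made up for illustration.

```python
import math


def returns(prices):
    """R_t = (S_t - S_{t-1}) / S_{t-1} for each pair of consecutive prices."""
    return [(cur - prev) / prev for prev, cur in zip(prices, prices[1:])]


def volatility(rets):
    """Standard deviation of the n + 1 returns R_{t-n}..R_t, with divisor n."""
    n = len(rets) - 1
    if n < 1:
        raise ValueError("need at least two returns")
    mean = sum(rets) / (n + 1)
    return math.sqrt(sum((r - mean) ** 2 for r in rets) / n)


# Hypothetical closing prices (assumption, not data from the talk).
prices = [100.0, 102.0, 99.96, 101.9592]
rets = returns(prices)
vol = volatility(rets)
print(rets, vol)
```

Because the window holds n + 1 returns and the sum is divided by n, this is the usual sample standard deviation.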
Financial Sentiment Analysis
❖ In this work, we attempt to apply sentiment analysis to the risk prediction task.
❖ A finance-specific sentiment lexicon is adopted for analysis.
❖ Two machine learning techniques are adopted for the task:
❖ Regression approach: Predict the stock return volatilities.
❖ Ranking approach: Rank companies according to their relative risk levels.
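The difference between the two targets can be shown with a small sketch: regression needs the volatility values themselves, while ranking only needs their order. The company names and volatilities below are hypothetical, and ranking directly by volatility is an assumption, since the slide does not say how relative risk levels are derived.

```python
def risk_ranks(volatilities):
    """Map each company to its rank by volatility (1 = most volatile)."""
    ordered = sorted(volatilities, key=volatilities.get, reverse=True)
    return {company: rank for rank, company in enumerate(ordered, start=1)}


# Hypothetical companies and volatilities (regression targets).
vols = {"A": 0.012, "B": 0.045, "C": 0.027}

# Ranking targets derived from the same values.
ranks = risk_ranks(vols)
print(ranks)
```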
Financial Sentiment Lexicon
❖ Words often have different meanings in the finance domain than in general usage, for example:
❖ vice: immoral or wicked behavior (general usage)
❖ vice: secondary (finance context)
❖ Almost three-fourths of the words in 10-K financial reports from 1994 to 2008 that the widely used Harvard Psychosociological Dictionary identifies as negative are typically not considered negative in financial contexts.
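A mismatch of this kind can be measured by counting how many tokens flagged by a general-purpose negative list are absent from a finance-specific negative list. The sketch below uses tiny hypothetical word lists and a made-up token sequence, so its result is true by construction and is not a reproduction of the reported statistic.

```python
def share_not_negative_in_finance(tokens, general_negative, finance_negative):
    """Among tokens flagged negative by the general list, return the fraction
    that the finance list does not treat as negative."""
    flagged = [t for t in tokens if t in general_negative]
    if not flagged:
        return 0.0
    return sum(t not in finance_negative for t in flagged) / len(flagged)


# Toy word lists and tokens (assumptions for illustration only).
general_negative = {"vice", "loss", "tax", "cost"}
finance_negative = {"loss"}
tokens = ["vice", "president", "reported", "a", "loss",
          "and", "higher", "tax", "cost"]

share = share_not_negative_in_finance(tokens, general_negative, finance_negative)
print(share)
```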
Six Finance-Specific Lexicons
❖ Loughran and McDonald (2011)
❖ When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10-Ks. Journal of Finance.
Problem Formulation
❖ Prediction targets: future stock return volatility (regression) and future relative risk level (ranking)
❖ Features
❖ Soft textual information: all words or financial sentiment words
❖ Hard numerical information: each company's volatility over the twelve months before the report, v(-12)
[Figure: timeline with the report filing date at 2006/3/22; v(-12) covers 2005/3/22 to 2006/3/22 and v(+12) covers 2006/3/22 to 2007/3/22.]
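The feature setup above can be sketched as a single vector per report: one weight per sentiment word (soft information) followed by the prior twelve-month volatility (hard information). The log(1 + count) weighting, the three-word lexicon subset, and the sample text are assumptions for illustration; the slide does not state which term weighting is used.

```python
import math
from collections import Counter


def build_features(tokens, sentiment_words, prior_volatility):
    """Return [log(1 + count) for each sentiment word..., v(-12)]."""
    counts = Counter(tokens)
    text_part = [math.log1p(counts[w]) for w in sentiment_words]
    return text_part + [prior_volatility]


# Hypothetical lexicon subset and report text.
sentiment_words = ["default", "deficit", "profit"]
tokens = ("the deficit widened and the default risk rose "
          "after the default notice").split()

features = build_features(tokens, sentiment_words, prior_volatility=0.031)
print(features)
```

A regression or ranking model would then be trained on such vectors, with v(+12) or the derived risk level as the target.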
Corpora: The 10-K Corpus
❖ A Form 10-K is an annual report required by the U.S. Securities and Exchange Commission (SEC).
❖ Only Section 7, “Management’s Discussion and Analysis of Financial Condition and Results of Operations” (MD&A), is used.
❖ The Sarbanes-Oxley Act of 2002 explains the drastic increase in report length during the 2002-2003 period.
Experimental Results
Financial Sentiment Terms Analysis
[Figure: top stemmed terms in two panels labeled SEN and ORG, numbered 1 to 5 and marked by category (Fin-Neg, Fin-Pos, Fin-Lit, Fin-Unc, Non). Stems shown include amend, deficit, forbear, delist, default, sureti, discontinu, unabl, disput, concern, profit, violat, regain, uncomplet, breach, doubt, coven, waiver, and nasdaq.]
Word forms listed for five of the stems:
❖ deficit: deficit, deficits
❖ default: default, defaulted, defaulting, defaults
❖ delist: delist, delisted, delisting, delists
❖ amend: amend, amendable, amendatory, amended, amending, amendment, amendments, amends
❖ forbear: forbear, forbearance, forbearances, forbearing, forbears
FIN10K Prototype Demo
https://cfda.csie.org/10K/
FIN10K: A Web-based Information System for
Financial Report Analysis and Visualization.
ACM CIKM (Demo paper), 2016.
Financial Keyword Expansion via
Continuous Word Vector Representations
Discovering Finance Keywords via Continuous
Space Language Models. ACM Transactions on
Management Information Systems, 7(3), 7:1-7:17, 2016.
Sentiment Analysis — the Lexicon
❖ For sentiment analysis, the lexicon is one of the most important and common resources.
❖ It usually has a great impact on the results and the corresponding analyses.
❖ In finance, the lexicon is usually semi-manually generated.
❖ This results in inadequate word coverage.
❖ In this work, we attempt to use continuous space language models to expand finance keywords automatically.
Continuous Space Language Models
❖ “You shall know a word by the company it keeps” (J. R. Firth, 1957)
❖ One of the most successful ideas of modern statistical NLP!
Continuous Space Language Models
❖ Continuous space language models
❖ a.k.a. continuous word embeddings
❖ Words are represented as low-dimensional dense vectors.
❖ Recent studies show their superiority in capturing syntactic and contextual regularities in language.
Keyword Expansion
❖ Our Proposed Keyword Expansion Method
❖ Adapt this technique to incorporate syntactic information, so that the retrieved keywords are closer in meaning.
❖ Learn vector representations of words from a large collection of financial reports (domain-specific).
❖ Words in the financial sentiment lexicon are used as seed words; for each seed, the N words with the highest cosine similarity are retrieved.
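The seed-word retrieval step can be sketched as a nearest-neighbor search by cosine similarity. The three-dimensional vectors below are hypothetical stand-ins for embeddings learned from financial reports, and the sketch covers only the similarity lookup, not the syntactic adaptation mentioned on the slide.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand(seed, vectors, top_n):
    """Return the top_n words most similar to the seed word."""
    scores = [(cosine(vectors[seed], vec), word)
              for word, vec in vectors.items() if word != seed]
    scores.sort(reverse=True)
    return [word for _, word in scores[:top_n]]


# Hypothetical embeddings (assumption, not learned from real reports).
vectors = {
    "default": [0.9, 0.1, 0.0],
    "delist": [0.8, 0.2, 0.1],
    "breach": [0.7, 0.3, 0.0],
    "profit": [0.0, 0.2, 0.9],
}

expanded = expand("default", vectors, top_n=2)
print(expanded)
```

Running this for every word in the seed lexicon and taking the union of the results gives the expanded keyword set.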
Keyword Expansion
❖ Keyword Expansion with Syntactic Information
The New 10-K Corpus
Four Prediction Tasks
❖ Four prediction tasks are conducted.
❖ To demonstrate that our approach is effective for discovering predictive keywords
1) Post-event volatility
2) Stock volatility
3) Abnormal trading volume
4) Excess returns
Post-event Volatility Prediction
FIN10K Prototype Demo
https://cfda.csie.org/10K/
FIN10K: A Web-based Information System for Financial Report Analysis
and Visualization. ACM CIKM (Demo paper), 2016.
Beyond Word-Level Analysis
❖ Multi-word expression detection and analysis
❖ Beyond Word-Level to Sentence-Level Sentiment Analysis for
Financial Reports
❖ RiskFinder: A Sentence-level Risk Detector for Financial Reports,
NAACL’18
❖ https://cfda.csie.org/RiskFinder/
❖ FRIDAYS: A Financial Risk Information Detecting and Analyzing
System, AAAI’18
❖ https://cfda.csie.org/FRIDAYS/
Summary
❖ If structured data is big, then unstructured data is huge.
❖ 20% (structured) vs. 80% (unstructured)
❖ There is a massive potential waiting to be leveraged in
the analysis of unstructured data in the field of finance.
Thank You for Listening!

[2018 Taiwan AI Academy Alumni Annual Meeting] Textual Data Analytics in Finance / Chuan-Ju Wang (王釧茹)

  • 1. Talk @ Taiwan AI Academy, November 17, 2018 Textual Data Analytics in Finance Dr. Chuan-Ju Wang (王釧茹) Research Center for Information Technology Innovation, Academia Sinica Computational Finance and Data Analytics Laboratory (CFDA Lab) http://cfda.csie.org
  • 2. Chuan-Ju Wang (CITI, AS) Talk @ Taiwan AI Academy November 17, 2018 Quant — Data Scientist Source: http://www.indeed.com/jobtrends Source: http://www.computerweekly.com/blogs/Data-Matters/2014/06/data-scientist-the-new-quant.html
  • 3. Data Science in Finance
  • 4. Text Analytics ❖ Big Data ❖ Structured Data ❖ user logs, sensor logs, click-through logs, … ❖ Unstructured Data ❖ web texts, user conversations, public opinions, reports, … ❖ Big Data for Unstructured Text: Text Analytics ❖ Goal: turn text into data for analysis by applying natural language processing (NLP) and analytical methods https://insidebigdata.com/2015/06/05/text-analytics-the-next-generation-of-big-data/
  • 5. Textual Sentiment Analysis for Financial Risk Prediction On the Risk Prediction and Analysis of Soft Information in Finance Reports. European Journal of Operational Research (EJOR), 257(1), 243-250, 2017.
  • 6. Soft and Hard Information in Finance ❖ The growing amount of financial data makes it increasingly important to learn how to discover valuable information for various financial applications. ❖ In finance, there are typically two kinds of information: ❖ Soft information: text, including opinions, ideas, and market commentary. ❖ Hard information: numerical values, such as financial measures and historical prices. ❖ Our work aims to exploit soft information for financial risk prediction.
  • 7. Risk Proxy: Stock Return Volatility ❖ Stock return: $R_t = \frac{S_t - S_{t-1}}{S_{t-1}}$ ❖ Stock return volatility: a common risk metric measured by the standard deviation of returns over a period of time, $v_{[t-n,t]} = \sqrt{\frac{\sum_{i=t-n}^{t} (R_i - \bar{R})^2}{n}}$, where $\bar{R} = \frac{\sum_{i=t-n}^{t} R_i}{n+1}$.
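The return and volatility definitions on slide 7 can be sketched in a few lines of Python. This is only an illustration, assuming that S_t denotes a stock price at time t; the function names and the price series are hypothetical and not taken from the talk.

```python
import math


def simple_returns(prices):
    """R_t = (S_t - S_{t-1}) / S_{t-1} for each pair of consecutive prices."""
    return [(cur - prev) / prev for prev, cur in zip(prices, prices[1:])]


def volatility(returns):
    """Standard deviation of n + 1 returns, using n as the divisor."""
    if len(returns) < 2:
        raise ValueError("need at least two returns")
    mean = sum(returns) / len(returns)  # R-bar over the n + 1 observations
    n = len(returns) - 1
    return math.sqrt(sum((r - mean) ** 2 for r in returns) / n)


# Hypothetical closing prices, chosen only to make the arithmetic easy to follow.
prices = [100.0, 102.0, 99.96, 101.9592]
rets = simple_returns(prices)
vol = volatility(rets)
print(rets, vol)
```

With n + 1 observations and divisor n, this is the ordinary sample standard deviation of the returns in the window.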
  • 8. Financial Sentiment Analysis ❖ In this work, we apply sentiment analysis to the risk prediction task. ❖ A finance-specific sentiment lexicon is adopted for analysis. ❖ Two machine learning techniques are adopted for the task: ❖ Regression approach: predict the stock return volatilities. ❖ Ranking approach: rank the companies in line with their relative risk levels.
  • 9. Financial Sentiment Lexicon ❖ Words often have different meanings in the finance domain than in general usage, such as ❖ vice: immoral or wicked behavior (general usage) ❖ vice: secondary (finance context) ❖ Almost three-fourths of the words in 10-K financial reports from 1994 to 2008 that the widely used Harvard Psychosociological Dictionary identifies as negative are typically not considered negative in financial contexts.
  • 10. Six Finance-Specific Lexicons ❖ Loughran and McDonald (2011) ❖ When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10-Ks. Journal of Finance.
  • 11. Problem Formulation ❖ Prediction targets: future stock return volatility (regression) and future relative risk levels (ranking) ❖ Features ❖ Soft textual information: all words or financial sentiment words ❖ Hard numerical information: each company's volatility over the twelve months before the report, v(-12) [Timeline figure: v(-12) covers 2005/3/22 to the report filing date, 2006/3/22; v(+12) covers 2006/3/22 to 2007/3/22]
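The feature setup on slide 11 can be illustrated with a minimal sketch that combines soft and hard information in one vector. It assumes a plain count-based bag of words restricted to lexicon terms; the mini-lexicon, the sample sentence, and the function names are hypothetical, and the term weighting used in the actual work may differ.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon for illustration only, not the actual word list.
LEXICON = ["default", "deficit", "delist", "amend", "forbear"]


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def feature_vector(text, prior_volatility, lexicon=LEXICON):
    """Soft features (lexicon word counts) followed by one hard feature, v(-12)."""
    counts = Counter(tokenize(text))
    soft = [counts[word] for word in lexicon]
    return soft + [prior_volatility]


# Hypothetical report excerpt and prior-year volatility.
report = "The company may default on its loans. A default could widen the deficit."
x = feature_vector(report, prior_volatility=0.031)
print(x)
```

A regression or ranking model would then be trained on such vectors, with v(+12) as the target.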
  • 12. Corpora: The 10-K Corpus ❖ A Form 10-K is an annual report required by the U.S. Securities and Exchange Commission (SEC) ❖ Only Section 7, “Management’s Discussion and Analysis of Financial Conditions and Results of Operations” (MD&A), is used ❖ The Sarbanes-Oxley Act of 2002 explains the drastic increase in length during the 2002-2003 period
  • 13. Experimental Results
  • 14. Financial Sentiment Terms Analysis [Figure: top-ranked stemmed terms (e.g., amend, deficit, forbear, delist, default), each labeled by category: Fin-Neg, Fin-Pos, Fin-Lit, Fin-Unc, Non SEN, ORG] ❖ Word forms behind the top five stems: ❖ deficit: deficit, deficits ❖ default: default, defaulted, defaulting, defaults ❖ delist: delist, delisted, delisting, delists ❖ amend: amend, amendable, amendatory, amended, amending, amendment, amendments, amends ❖ forbear: forbear, forbearance, forbearances, forbearing, forbears
  • 15. FIN10K Prototype Demo https://cfda.csie.org/10K/ FIN10K: A Web-based Information System for Financial Report Analysis and Visualization. ACM CIKM (Demo paper), 2016.
  • 16. Financial Keyword Expansion via Continuous Word Vector Representations Discovering Finance Keywords via Continuous Space Language Models. ACM Transactions on Management Information Systems, 7(3), 7:1-7:17, 2016.
  • 17. Sentiment Analysis: The Lexicon ❖ For sentiment analysis, the lexicon is one of the most important and common resources. ❖ It usually has a great impact on results and the corresponding analyses. ❖ In finance, the lexicon is usually generated semi-manually. ❖ This results in inadequate word coverage. ❖ In this work, we use continuous space language models to expand finance keywords automatically.
  • 18. Continuous Space Language Models ❖ “You shall know a word by the company it keeps” (J. R. Firth, 1957) ❖ One of the most successful ideas of modern statistical NLP!
  • 19. Continuous Space Language Models ❖ Continuous space language models ❖ a.k.a. continuous word embeddings ❖ Words are represented as low-dimensional dense vectors. ❖ Recent studies show their superiority in capturing syntactic and contextual regularities in language.
  • 20. Keyword Expansion ❖ Our Proposed Keyword Expansion Method ❖ Adapt continuous word embeddings to incorporate syntactic information, so as to capture more keywords with similar meanings. ❖ Learn vector representations of words from a large, domain-specific collection of financial reports. ❖ Words in the financial sentiment lexicon are used as seed words; for each seed, the top N nearest words by cosine similarity are retrieved.
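The seed-based retrieval step on slide 20 can be sketched as a nearest-neighbor search under cosine similarity. This sketch covers only the retrieval step and leaves out the syntactic information that the proposed method incorporates; the three-dimensional vectors, the vocabulary, and the function names are hypothetical stand-ins for embeddings learned from financial reports.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def expand_keywords(seeds, embeddings, top_n):
    """For each seed word, return the top_n most similar other words."""
    expanded = {}
    for seed in seeds:
        scored = [
            (cosine_similarity(embeddings[seed], vec), word)
            for word, vec in embeddings.items()
            if word != seed
        ]
        scored.sort(reverse=True)  # highest similarity first
        expanded[seed] = [word for _, word in scored[:top_n]]
    return expanded


# Hypothetical toy embeddings; real ones would be learned from a 10-K corpus.
EMBEDDINGS = {
    "default": [0.9, 0.1, 0.0],
    "delinquency": [0.8, 0.2, 0.1],
    "bankruptcy": [0.7, 0.3, 0.0],
    "profit": [0.0, 0.1, 0.9],
    "gain": [0.1, 0.0, 0.8],
}
result = expand_keywords(["default", "profit"], EMBEDDINGS, top_n=2)
print(result)
```

In practice, a vector library with an indexed similarity search would replace the brute-force loop, which compares each seed against the whole vocabulary.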
  • 21. Keyword Expansion ❖ Keyword Expansion with Syntactic Information
  • 22. The New 10-K Corpus
  • 23. Four Prediction Tasks ❖ Four prediction tasks are conducted to demonstrate that our approach is effective for discovering keywords with predictive power: 1) Post-event volatility 2) Stock volatility 3) Abnormal trading volume 4) Excess returns
  • 24. Post-event Volatility Prediction
  • 25. FIN10K Prototype Demo https://cfda.csie.org/10K/ FIN10K: A Web-based Information System for Financial Report Analysis and Visualization. ACM CIKM (Demo paper), 2016.
  • 26. Beyond Word-Level Analysis ❖ Multi-word expression detection and analysis ❖ Beyond Word-Level to Sentence-Level Sentiment Analysis for Financial Reports ❖ RiskFinder: A Sentence-level Risk Detector for Financial Reports, NAACL’18 ❖ https://cfda.csie.org/RiskFinder/ ❖ FRIDAYS: A Financial Risk Information Detecting and Analyzing System, AAAI’18 ❖ https://cfda.csie.org/FRIDAYS/
  • 27. Summary ❖ If structured data is big, then unstructured data is huge. ❖ 20% (structured) vs. 80% (unstructured) ❖ There is massive potential waiting to be leveraged in the analysis of unstructured data in the field of finance.
  • 28. Thanks for Listening!