Recorded Future harvests and organizes the predictive information contained on the web. The company provides clients with tools to analyze past, planned, and speculative events extracted from web sources. Clients can use these tools to create customized media-analytic feeds and apply the data to tasks such as liquidity management, short-term trading, strategy allocation, and risk modeling. The case studies indicate that the data has predictive power and can improve models when combined with traditional market data.
Recorded Future News Analytics for Financial Services
1. Recorded Future
David Moon, Global Head of Financial Services
Bill Ladd, Chief Analytic Officer
2. What is Recorded Future? 3/1/2011 2 We believe that the content of the web has predictive power. So... We’ve harvested and organized the only real-time source for past, planned and speculative events on the web. To... Allow users to “slice-and-dice” the web to make predictions.
3. Web is Loaded with Predictions
Silicon Valley executives head to Vail, Colo. next week for the annual Pacific Crest Technology Leadership Forum
Drought and malnutrition hinder next year’s development plans in Yemen...
“Strange new Russian worm set to unleash botnet on 4/1/2012...”
The carrier may select partners to set up a new carrier as early as next month
“According to TechCrunch China’s new 4G network will be deployed by mid-2010”
“... Dr Sarkar says the new facility will be operational by March 2014...”
“2010 is the year when Iran will kick out Islam. Ya Ahura we will.”
“...opposition organizers plan to meet on Thursday to protest...”
“Excited to see Mubarak speak this weekend...”
9. Case Studies
Liquidity Management: Predicting liquidity with media coverage
Short Term Trading: “Future event” study
Strategy Allocation: Measuring investment strategy crowdedness with online media
Risk Modeling: Anticipating future volatility with media sentiment and macroeconomic discussion
10. Case 1 – Liquidity Management: Predicting Liquidity with Momentum
Recorded Future momentum contains predictive information for dollar volume of S&P 500 companies.
  Control for trailing market volume on a 1-day and 20-day basis.
  Use 1-day trailing momentum.

Call:
lm(formula = Dollarvol.1 ~ 0 + lDollarvol.1 + smaDvol.Dollarvol.1 + smaxlMo, data = seriesdf)

Residuals:
       Min         1Q     Median         3Q        Max
-5.039e+09 -2.215e+07 -2.284e+06  1.813e+07  1.597e+10

Coefficients:
                    Estimate Std. Error t value Pr(>|t|)
lDollarvol.1        0.513193   0.003237  158.54  < 2e-16 ***
smaDvol.Dollarvol.1 0.471645   0.003817  123.56  < 2e-16 ***
smaxlMo             0.077162   0.015683    4.92 8.67e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 170900000 on 72109 degrees of freedom
Multiple R-squared: 0.8539, Adjusted R-squared: 0.8539
F-statistic: 1.405e+05 on 3 and 72109 DF, p-value: < 2.2e-16
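A minimal Python sketch of how the Case 1 regression inputs could be assembled and fit. The slide does not define its variable names, so this assumes they stand for a 1-day lagged dollar volume, a 20-day trailing average of dollar volume, and 1-day trailing momentum. The function names are illustrative, and the fit is a plain no-intercept least squares (matching the `~ 0 +` in the formula), not Recorded Future's own code.

```python
from typing import List, Sequence, Tuple


def trailing_features(
    dollar_vol: Sequence[float], momentum: Sequence[float], window: int = 20
) -> List[Tuple[float, List[float]]]:
    """Build (target, [lag-1 volume, trailing mean volume, lag-1 momentum]) rows.

    Every regressor uses only values from days before the target day.
    """
    rows = []
    for t in range(window, len(dollar_vol)):
        lag1 = dollar_vol[t - 1]
        sma = sum(dollar_vol[t - window:t]) / window
        rows.append((dollar_vol[t], [lag1, sma, momentum[t - 1]]))
    return rows


def ols_no_intercept(y: Sequence[float], X: Sequence[Sequence[float]]) -> List[float]:
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]
```

In use, the rows from `trailing_features` for every company would be stacked, split into `y` and `X`, and passed to `ols_no_intercept`. A statistics package would be the practical choice when standard errors and t-values are needed.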
11. Case 2 – Short Term Trading: Future Event Distributions
Non-earnings related events are negative.
  We controlled for earnings and non-earnings related news.
The study queried instances where there was advance notice of specific future events.
  Events defined as one day long, with S&P 500 constituents
  These typically provided one to three days advance notice
  ~19,000 unique events satisfied these criteria
[Figure: timeline, t (days), ~1-3 days]
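The selection criteria on this slide can be sketched in Python as a simple filter. The record layout is an assumption: the keys `ticker`, `published`, `event_start`, and `event_end` are hypothetical and are not taken from the Recorded Future data format.

```python
from datetime import date
from typing import Dict, Iterable, List, Set


def select_future_events(instances: Iterable[Dict], constituents: Set[str]) -> List[Dict]:
    """Keep one-day events on index constituents that were published in advance.

    Each kept record gains a 'notice_days' field: days between publication
    and the event day.
    """
    selected = []
    for inst in instances:
        one_day = inst["event_start"] == inst["event_end"]
        in_advance = inst["published"] < inst["event_start"]
        if one_day and in_advance and inst["ticker"] in constituents:
            notice = (inst["event_start"] - inst["published"]).days
            selected.append({**inst, "notice_days": notice})
    return selected


# Example with made-up records: only the first satisfies all three criteria.
sample = [
    {"ticker": "IBM", "published": date(2010, 6, 1),
     "event_start": date(2010, 6, 3), "event_end": date(2010, 6, 3)},
    {"ticker": "IBM", "published": date(2010, 6, 4),
     "event_start": date(2010, 6, 3), "event_end": date(2010, 6, 3)},
    {"ticker": "XYZ", "published": date(2010, 6, 1),
     "event_start": date(2010, 6, 2), "event_end": date(2010, 6, 2)},
]
kept = select_future_events(sample, {"IBM"})
```

The distribution of `notice_days` over the kept records is what a figure of advance notice would summarize.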
12. Case 2 – Short Term Trading: News “Should” be Priced in Immediately
“Buy the rumor, sell the news” describes earnings related events.
  Market adjusted returns increase on approach to the event day and decline thereafter.
It does not describe non-earnings related events.
  No increase in returns on approach to event day
  Statistically significant increase in volume (0.3σ) and decrease in market adjusted returns
  Non-earnings related events were net negative.
[Figure labels: Typical Publication Day, Predicted Event Day]
13. Case 3 – Strategy Allocation: Quantifying Strategy Crowdedness
Recorded Future data yielded an inverse correlation between the performance of a momentum strategy and the business media’s discussion of momentum.
The study introduced a synthetic linguistic score.
  Relied on standard API queries
  Scored fragments based on momentum-related terms
Increased discussion of momentum-related trading correlated with declining returns.
  Inverse correlation with $NAV/share of momentum mutual fund
  Monthly correlation of -0.56 over the past year
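One way to picture the linguistic score and the correlation is the Python sketch below. The term list and the counting rule are hypothetical, since the slide does not say how fragments were scored. The correlation is the ordinary Pearson coefficient, computed between a monthly score series and a monthly NAV/share series.

```python
import math
from typing import Sequence

# Hypothetical term list; the study's actual vocabulary is not given on the slide.
MOMENTUM_TERMS = ("momentum", "trend following", "relative strength")


def fragment_score(fragment: str) -> int:
    """Count occurrences of momentum-related terms in one text fragment."""
    text = fragment.lower()
    return sum(text.count(term) for term in MOMENTUM_TERMS)


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

Summing `fragment_score` over all fragments published in a month gives one point of the discussion series, and `pearson` applied to twelve such points and the matching fund values gives a single monthly correlation figure.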
14. Case 4 – Risk Modeling: Volatility Forecasting Methodology
Data Extraction
  Extract all references to S&P 500 Companies from Recorded Future’s structured content database from January 1, 2009 to December 9, 2010.
  Includes synonyms (IBM vs. International Business Machines, etc.)
  Reduce to only mentions on “Blog” sources.
  Compute sentiment and momentum of text surrounding references to the Index over the time period.
Data Aggregation
  Compute daily series of count-weighted mean sentiment and momentum.
Modeling
  Calculate exponential moving averages of these values over a 26-day trailing window.
  Regress against 1-month forward realized volatility of S&P 500.
Model Assessment
  Economic evaluation of model parameters – do they make sense?
  Comparison to other volatility metrics – how does the signal compare?
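The aggregation and smoothing steps can be sketched in Python as follows. Two assumptions are made: each day's input is a list of (score, mention count) pairs, and the exponential moving average uses the common recursive form with smoothing factor 2 / (span + 1). The slide does not state which EMA definition was used.

```python
from typing import List, Optional, Sequence, Tuple


def count_weighted_mean(records: Sequence[Tuple[float, int]]) -> Optional[float]:
    """Mean score for one day, weighting each score by its mention count."""
    total = sum(count for _, count in records)
    if total == 0:
        return None  # no mentions that day
    return sum(score * count for score, count in records) / total


def ema(values: Sequence[float], span: int = 26) -> List[float]:
    """Recursive exponential moving average, seeded with the first value."""
    alpha = 2.0 / (span + 1)  # assumed convention, not stated in the source
    smoothed: List[float] = []
    previous: Optional[float] = None
    for value in values:
        previous = value if previous is None else alpha * value + (1 - alpha) * previous
        smoothed.append(previous)
    return smoothed
```

Applying `count_weighted_mean` to each day and then `ema` to the resulting daily series yields the smoothed sentiment and momentum series that would enter the regression.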
15. Case 4 – Risk Modeling: Model Summary

Call:
lm(formula = spyvol ~ vix + emamo + emaneg, data = blogus)

Residuals:
       Min         1Q     Median         3Q        Max
-0.0087503 -0.0020655 -0.0004415  0.0020463  0.0100361

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.237e-02  2.460e-03  -5.028 7.03e-07 ***
vix          3.938e-04  2.511e-05  15.681  < 2e-16 ***
emamo        2.337e-02  8.164e-03   2.863  0.00439 **
emaneg       3.204e-01  3.631e-02   8.824  < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.003263 on 478 degrees of freedom
  (25 observations deleted due to missingness)
Multiple R-squared: 0.6867, Adjusted R-squared: 0.6848
F-statistic: 349.3 on 3 and 478 DF, p-value: < 2.2e-16

Regressors are VIX value, and 28-day EMAs of average momentum and negative sentiment in text surrounding S&P 500 companies.
Controlling for VIX, an increase in chatter around S&P 500 companies and an increase in negative sentiment around S&P 500 companies lead increases in one-month forward realized volatility.
Positive sentiment NOT a statistically significant term in this model.
  Volatility driven by fear, not euphoria?
R-squared of 0.68 respectable compared to VIX’s ability to predict 1-month forward volatility – R-squared 0.63.
RF data orthogonal to market data – controlling for VIX leads to models with R-squared > 0.63
16. Getting Started – Data & Aggregates
Data
  Instances
  Sources & Documents
  Entities & Events
    Canonical events
    Entity identifiers: tickers, industry taxonomy
  Time
    Publication Date
    Event Date
  Calculated Scores
    Momentum, Sentiment
Aggregates
  US equities aggregates
    Daily composite momentum and sentiment scores for constituents of the Russell 3000
  Custom aggregates built on data elements
[Diagram labels: Canonical info; Sentiment; Momentum; Event time; Co-occurring entities; Source metadata; Document metadata; RF State Data; Entity information]
17. Access – Historical & Live Data
Data Formats – JSON, CSV
Historical Data Delivery – API, FTP
  API – Historical results from raw data via web-service calls
  FTP – Files of aggregates, and bulk history
Live Data Delivery – API
  Customized calls – as frequently as intra-day
  RF Aggregates – calculated daily
[Diagram labels: Recorded Future Web Service API; Recorded Future FTP Archive; JSON HTTP Request; JSON/CSV Response; FTP Request; .zip archive (csv aggregates, json/tsv instances); Historical Batch Download; Live Download; Load RF Data; RF Customer Analytic Environment (R, Matlab, Java, Python, Excel, etc.)]
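Loading a CSV aggregate file into a client-side Python environment might look like the sketch below. The column names `date`, `ticker`, `momentum`, and `sentiment` are hypothetical, since the actual file layout is not described in the deck. The example parses text that has already been downloaded and makes no assumptions about API endpoints or FTP paths.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Tuple


def load_aggregates(csv_text: str) -> Dict[str, Dict[str, Tuple[float, float]]]:
    """Parse daily aggregates into {ticker: {date: (momentum, sentiment)}}.

    Assumes a header row with the hypothetical columns
    date, ticker, momentum, sentiment.
    """
    series: Dict[str, Dict[str, Tuple[float, float]]] = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        series[row["ticker"]][row["date"]] = (
            float(row["momentum"]),
            float(row["sentiment"]),
        )
    return dict(series)


# Made-up sample content in the assumed layout.
sample_csv = (
    "date,ticker,momentum,sentiment\n"
    "2010-12-01,IBM,0.4,0.1\n"
    "2010-12-02,IBM,0.5,-0.2\n"
)
aggregates = load_aggregates(sample_csv)
```

Keying the result by ticker and date makes it straightforward to join the scores with market data held in the same environment.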
18. Applications – Slicing the Data
Case Studies, revisited
  Liquidity Management: Pull aggregate Day/Company momentum data for S&P 500
  Short Term Trading: Pull instance data for S&P 500 companies where publish date is before event date
  Strategy Allocation: Pull instance data where document category is “Business/Finance” and score fragments based on word/phrase choice
  Risk Modeling: Pull aggregate momentum and sentiment data for the S&P 500 companies for specified time period
Different slices entail unique media-analytic feeds
19. Summary
Recorded Future provides the world’s only real-time source of past, planned and speculative events.
Designed for clients to create unique media-analytic feeds via web-services API and FTP access.
Applied to liquidity planning, short term trading, strategy allocation, risk modeling, among other scenarios.