The document discusses modeling mergers and acquisitions (M&A) in the high tech industry. It proposes using topic modeling to measure business proximity between companies and an exponential random graph model (ERGM) to model the interdependent relationships between M&A deals. Evaluation of the models using M&A transaction data from CrunchBase found that business proximity is a significant factor in M&A deals, even after accounting for industry and geographic selective mixing. A proposed interface called VentureMap could utilize the models to recommend potential M&A matches.
Towards modeling M&A in high tech industries
1. Towards Modeling M&A in High Tech Industry
December 4th 2013
Gene Moo Lee
Department of Computer Science
The University of Texas at Austin
Research Preparation Exam
2. Startups in high tech industry
High tech startups are very active these days, thanks to many platforms, including:
● Mobile platforms
● Cloud platforms
● Financial platforms
3. M&A is important in high tech
Mergers and acquisitions: buying, selling, dividing, and combining companies
● Startups (sellers): M&A and IPO are the main exit strategies
● Established companies (buyers): pursue innovation by acquisitions
4. M&A matching and challenges
Q: Can we model M&A matchings?
Q: Which factors play important roles in M&A?
Challenges
● How to measure proximities among companies
→ Topic modeling for business proximity
● How to incorporate the interdependency of M&A deals
→ Random graph model (ERGM)
● How to access venture data: mostly private
→ CrunchBase: a Wikipedia for the venture industry
● How to make the data accessible
→ Visualization with VentureMap interface
[Figure: Buyer A and Seller B. Will they do M&A? If so, why?]
5. Academic literature
● M&A analysis
○ interview on 12 deals [Graebner, Eisenhardt, Admin. Science Quarterly, 2004]
○ geography [Erel et al., J. Finance, 2012] [Kalnins, Lafontaine, Amer. Econ. J., 2013]
○ social networks [Hochberg et al., J. Finance, 2007] [Cohen et al., J. Finance, 2010]
● Matching problem
○ matching in graph [Mucha, Sankowski, Foundations of Computer Science, 2004]
○ kidney exchange [Roth et al, Quarterly Journal of Economics, 2004]
○ medical interns/residents [Roth, Journal of Political Economy, 1984]
● Link prediction in complex networks
○ social networks [Liben-Nowell, Kleinberg, Conf. Info. Knowledge Mgmt., 2003]
○ biological networks [Yu et al., Science, 2008]
● Innovation & entrepreneurship
○ two-sided market [Weyl, American Economic Review, 2010]
○ entrepreneurship [Glaeser, Kerr, Ponzetto, Journal of Urban Economics, 2010]
For the complete reference list, see the References slides.
6. Roadmap
1. Introduction
a. Startups and M&A in high tech industry
b. Problem definition
2. Model
a. Proximity measures
b. M&A graph
c. ERGM
3. Evaluation
4. Platform
8. Proximity measures
How do we quantify the closeness between firms?
- hypothesis: companies that are closer under these measures are more likely
to have M&A deals
1. Business proximity [Hagiu, Yoffie, J. Economic Perspectives, 2013]
- closeness in business area and intellectual property
2. Social linkage [Hochberg et al., J. Finance, 2007] [Cohen et al., J. Finance, 2010]
- socially connected by board members, executives, developers
3. Common ownership
- backed by same VCs or angels
4. Geography [Erel et al., J. Finance, 2012] [Kalnins, Lafontaine, Amer. Economic J., 2013]
- distance matters in decision making
[Figure: Firms A, B, and C connected by pairwise similarities sim(A,B), sim(B,C), sim(A,C)]
9. Business proximity & topic modeling
● Topic modeling [Blei, Ng, Jordan, J. Machine Learning Research, 2003]
○ To discover abstract topics in a collection of documents
○ Inputs: business descriptions and # of topics
○ Outputs: (1) keywords in each topic, (2) distribution of topics
for each company description
● Business proximity
○ Measure similarity in topic distribution
10. More proximity measures
Social linkage
● board members
● executives
● developers
→ Count common people in two firms
Common ownership
● VCs
● angels
● institutions
→ Count shared investors of two firms
Geographic distance
● lat, long
● city
● state
→ Use great circle distance between two coordinates
* We can extend the measures with multiple-hop connections [Liben-Nowell, Kleinberg, CIKM, 2003]
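The counting and distance measures above can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the function names, the haversine formula as the great-circle method, and the 6371 km Earth radius are all assumptions made for the example.

```python
import math


def shared_count(members_a, members_b):
    """Number of people (or investors) that two firms have in common."""
    return len(set(members_a) & set(members_b))


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two (lat, long) points.

    Uses the haversine formula; the mean Earth radius is an assumed constant.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against tiny floating-point overshoot.
    return 2 * radius_km * math.asin(math.sqrt(min(1.0, a)))


# Hypothetical firms: one shared board member, offices in Austin and San Francisco.
social = shared_count({"alice", "bob"}, {"bob", "carol"})
distance = great_circle_km(30.27, -97.74, 37.77, -122.42)
```

The same `shared_count` helper covers both social linkage (common people) and common ownership (shared investors), since both reduce to a set intersection.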
11. More factors for M&A
● Selective mixing (homophily)
○ Companies with the same characteristics are more likely to do M&A
○ Same state in the US: tax, regulations
○ Same industry sector
● Power law
○ Companies that have acquired many startups are likely to make
more M&A transactions
○ Or companies that have already acquired many startups have
incentives to buy more
12. Roadmap
1. Introduction
a. Startups and M&A in high tech industry
b. Problem definition
2. Model
a. Proximity measures
b. M&A graph
c. ERGM
3. Evaluation
4. Platform
13. M&A graph
We use graph models that incorporate link interdependency
● M&A deals are interdependent
● But conventional models (logit, probit) assume independence:
they treat each M&A deal separately
[Figure: example M&A graph linking photo, video, blog, and face recognition companies]
14. M&A graph
Let Y = <V, E> be an M&A graph, where
● V is the set of companies (nodes)
● E is the set of M&A transactions (undirected edges)
Want to explain an observed graph Y with statistics on E and V
Some notations before moving on...
15. ERGM 101
Exponential Random Graph Model [Erdos, Renyi, Pub. Math., 1959], [Newman, SIAM Review, 2003], [Robins et al., Social Networks, 2007]
● Given a fixed set of n nodes, there are 2^{n(n-1)/2} possible graphs (Y)
● Generative model to explain an observed graph
○ based on various properties on nodes and edges
In an ERGM, we want to estimate the \theta that maximizes P(Y = y), where
P(Y = y) = \frac{1}{\kappa} \exp\left( \sum_{k=1}^{K} \theta_k z_k(y) \right)
where
● z_k(y) is a certain property of the graph y
○ function of graph y and exogenous variables on nodes
○ e.g. # of edges with nodes having the same category
● \theta_k = parameter for the kth statistic (want to estimate this)
● \kappa = normalization constant (requires exponential computation)
● K = # of statistics we are interested in
(ERGM vs logit comparison)
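To see why the normalization constant requires exponential computation, the following sketch enumerates every graph on a handful of nodes and normalizes by brute force. It is only an illustration under assumptions: the helper names and the single edge-count statistic are hypothetical, and the parameters are written `theta` to match the standard ERGM form.

```python
import itertools
import math


def all_graphs(n):
    """Yield every undirected graph on n labeled nodes as a frozenset of edges."""
    dyads = list(itertools.combinations(range(n), 2))
    for mask in itertools.product([0, 1], repeat=len(dyads)):
        yield frozenset(d for d, bit in zip(dyads, mask) if bit)


def edge_count(graph):
    """Example statistic z_k(y): the number of edges in the graph."""
    return len(graph)


def ergm_probability(graph, n, theta, stats):
    """P(Y = y) for a tiny graph, normalizing over all 2^(n(n-1)/2) graphs."""
    def weight(g):
        return math.exp(sum(t * z(g) for t, z in zip(theta, stats)))

    kappa = sum(weight(g) for g in all_graphs(n))  # exponential in n
    return weight(graph) / kappa


n_graphs = sum(1 for _ in all_graphs(4))  # 2^(4*3/2) graphs on 4 nodes
p_empty = ergm_probability(frozenset(), 4, [0.0], [edge_count])
```

With all parameters at zero every graph is equally likely, so `p_empty` is one over the number of graphs. Already at a few dozen nodes this enumeration is infeasible, which is why estimation relies on sampling-based methods instead.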
16. Our M&A model
● Degree distribution
○ t = # of M&A deals (network density)
○ d_2 = # of companies w/ 2+ deals (power law)
● Selective mixing (nodal attributes)
○ h_s^{sta} = # of deals within the same US state s (50 states)
○ h_c^{cat} = # of deals within the same industry c (30 categories)
● Proximity (dyad attributes)
○ p_b = sum of business proximities in all deals
○ p_s = sum of social proximities in all deals
○ p_f = sum of investment proximities in all deals
○ p_g = sum of geographic proximities in all deals
[Equation: model terms grouped as degree, selective mixing, and proximity]
(conditional form)
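As a rough illustration of how these statistics could be computed from an observed deal list, here is a small sketch on toy data. The firm names, attribute dictionaries, and proximity values are invented for the example, and only the business proximity sum is shown; the social, investment, and geographic sums would follow the same pattern.

```python
from collections import Counter


def model_statistics(edges, state, category, business_sim):
    """Compute degree, selective-mixing, and proximity statistics of a deal graph."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return {
        "t": len(edges),                                    # number of deals
        "d2": sum(1 for d in degree.values() if d >= 2),    # firms with 2+ deals
        "h_sta": Counter(state[a] for a, b in edges if state[a] == state[b]),
        "h_cat": Counter(category[a] for a, b in edges
                         if category[a] == category[b]),
        "p_b": sum(business_sim[frozenset((a, b))] for a, b in edges),
    }


# Toy data (hypothetical): firm A acquires B and C; D has no deals.
edges = [("A", "B"), ("A", "C")]
state = {"A": "CA", "B": "CA", "C": "NY", "D": "TX"}
category = {"A": "web", "B": "mobile", "C": "web", "D": "web"}
business_sim = {frozenset(("A", "B")): 0.8, frozenset(("A", "C")): 0.5}

stats = model_statistics(edges, state, category, business_sim)
```

Here `h_sta` and `h_cat` are kept as per-state and per-category counters, mirroring the one-statistic-per-state and per-category structure described on the slide.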
21. Business topics from topic modeling
● Inputs: company profiles from CrunchBase
● Unsupervised learning with minimal manual effort
(selecting stop words)
● Outputs: 50 extracted topics (topic = set of related keywords)
For complete list of 50 topics
22. Business proximity by topic model
● A 50-dimensional vector is assigned to each company
● Business proximity
= cosine similarity
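A minimal sketch of the cosine similarity computation, using 3-dimensional topic vectors in place of the 50-dimensional ones for readability. The vectors and the function name are made up for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two topic-distribution vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for an all-zero vector
    return dot / (norm_u * norm_v)


# Hypothetical topic distributions for two company descriptions.
firm_a = [0.7, 0.2, 0.1]
firm_b = [0.6, 0.3, 0.1]

proximity = cosine_similarity(firm_a, firm_b)
business_distance = 1 - proximity  # distance as (1 - topic similarity)
```

Because topic distributions are non-negative, the similarity falls between 0 and 1, so the derived business distance also stays in that range.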
23. M&A and proximity measures
Measure the distance of company pairs: M&A vs. random
● geo distance (km) by great circle distance
● business distance (0~1) by (1 - topic similarity)
[Figure: two panels, geographic distance and business distance, comparing M&A pairs with random pairs]
M&A pairs have significantly lower distances than random pairs
25. Evaluation
Dataset
● US companies founded from 2008 to 2012: |V| = 25,692
● M&A transactions within the US: |E| = 1,243
● # of possible networks (Y) exceeds # of atoms in universe
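The size claim can be checked with a short calculation. This is a worked example only; the comparison value of roughly 10^80 atoms is a commonly quoted order-of-magnitude estimate and is an assumption, not a figure from the slides.

```python
import math

n = 25_692                        # |V|, companies in the dataset
dyads = n * (n - 1) // 2          # possible undirected company pairs
log10_graphs = dyads * math.log10(2)  # log10 of 2^dyads possible networks

ATOMS_LOG10 = 80                  # assumed rough estimate: ~10^80 atoms
exceeds_atoms = log10_graphs > ATOMS_LOG10
```

The number of candidate networks has on the order of a hundred million decimal digits, which is why the estimation works on sampled subsets of companies.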
Estimate our ERGM M&A model
● Sample 25% of the companies in V for computational feasibility
● Run 100 times with different samples
● Estimate model coefficients by Markov chain Monte Carlo
(MCMC) maximum likelihood estimation (MLE)
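The repeated sampling step might look like the sketch below. The slides do not say how the sample is drawn, so uniform sampling without replacement and keeping only deals whose two endpoints are both sampled are assumptions, as are the function names and the fixed seed. The ERGM fit itself is not shown.

```python
import random


def sample_induced_subgraph(nodes, edges, fraction, rng):
    """Sample a fraction of nodes and keep the deals among the sampled nodes."""
    k = round(len(nodes) * fraction)
    kept = set(rng.sample(sorted(nodes), k))
    kept_edges = [(a, b) for a, b in edges if a in kept and b in kept]
    return kept, kept_edges


def repeated_samples(nodes, edges, fraction=0.25, runs=100, seed=0):
    """Draw `runs` independent node samples; each would be fit separately."""
    rng = random.Random(seed)
    return [sample_induced_subgraph(nodes, edges, fraction, rng)
            for _ in range(runs)]


# Toy example: 100 hypothetical companies and a chain of deals.
toy_nodes = list(range(100))
toy_edges = [(i, i + 1) for i in range(0, 99, 3)]
samples = repeated_samples(toy_nodes, toy_edges)
```

Coefficients estimated on each sample could then be summarized across the 100 runs, for example by their mean and spread.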
26. Proximity measures
● Business > social > investor >> geographic
● Business proximity is statistically significant in our model
○ Even with the selective mixing of industries
● Geographic distance is less significant
○ Due to selective mixing of states
27. Selective mixing: industry sectors
● Selective mixing holds for industry sectors
○ but it is coarse grained
● Proposed business proximity provides even
finer grained measures
28. Selective mixing: state locations
● Selective mixing holds for state locations
○ CA, MA, NJ, NY, TX, WA
29. Evaluation summary
1. Selective mixing holds for geography and industry
2. Topic modeling results give very significant and fine-grained
proximity measures
3. Social links play important roles
4. Geographic distance plays a limited role
a. state-level binary relation vs geographic distance
Implication: we can use the proposed proximity measures to
understand/recommend/predict M&A deals
31. Platform for M&A
● M&A market is a two-sided platform
○ buyers: established companies
○ sellers: startups
● We can increase the efficiency of this two-sided market by
○ building an interface, VentureMap, to make data accessible
○ recommending matchings with our M&A model
● Potential beneficiaries
○ Established firms: intelligence/M&A department
○ Startups: identify opportunities, potential buyers
○ Venture capitalists
○ Market intelligence firms
○ Researchers in finance field
32. VentureMap: search M&A deals
● Search M&A deals by
○ date, buyers, sellers, industry, etc.
34. Concluding remarks
We showed how Big Data analytics can serve the M&A market
● Proposed new business proximity measures
● Built a generative model to explain M&A deals
● Developed a new interface to support venture industry
Future directions
● Improve proximity to distinguish complementarity & substitution
● Scale up ERGM model using distributed systems
● Build M&A prediction models
35. Thank you!
Gene Moo Lee: gene@cs.utexas.edu
Center for Research in Electronic Commerce
The University of Texas at Austin
36. References
1. M&A analysis
a. M. Graebner, K. Eisenhardt, The Seller’s Side of the Story: Acquisition as Courtship and Governance as Syndicate in Entrepreneurial Firms, Administrative Science Quarterly, 2004
2. Link prediction
a. D. Liben-Nowell, J. Kleinberg, The Link Prediction Problem for Social Networks, Proc. 12th International Conference on Information and Knowledge Management (CIKM), 2003
b. H. Yu, et al., High-Quality Binary Protein Interaction Map of the Yeast Interactome Network, Science, 2008
3. Matching problem
a. M. Mucha, P. Sankowski, Maximum Matchings via Gaussian Elimination, Proc. of Foundations of Computer Science (FOCS), 2004
b. A. Roth, T. Sonmez, M. Unver, Kidney Exchange, Quarterly Journal of Economics, 2004
c. A. E. Roth, The College Admissions Problem Is Not Equivalent to the Marriage Problem, Journal of Economic Theory, 1985
d. A. E. Roth, The Evolution of the Labor Market for Medical Interns and Residents: A Case Study in Game Theory, Journal of Political Economy, 1984
4. Innovation and entrepreneurship
a. W. Kerr, Breakthrough Inventions and Migrating Clusters of Innovation, Journal of Urban Economics, 2010
5. Topic modeling
a. D. Blei, A. Ng, M. Jordan, Latent Dirichlet Allocation, Journal of Machine Learning Research, 2003
37. References
1. Random graph
a. P. Erdos, A. Renyi, On Random Graphs, Publicationes Mathematicae, 1959
b. M. Newman, The Structure and Function of Complex Networks, SIAM Review, 2003
c. G. Robins, P. Pattison, Y. Kalish, D. Lusher, An Introduction to Exponential Random Graph Models for Social Networks, Social Networks, 2007
2. Business
a. A. Hagiu, D. Yoffie, The New Patent Intermediaries: Platforms, Defensive Aggregators, and Super-Aggregators, Journal of Economic Perspectives, 2013
3. Geography
a. I. Erel, R. Liao, M. Weisbach, Determinants of Cross-Border Mergers and Acquisitions, Journal of Finance, 2012
b. A. Kalnins, F. Lafontaine, Too Far Away? The Effect of Distance to Headquarters on Business Establishment Performance, American Economic Journal: Microeconomics, 2013
4. Social links
a. L. Cohen, A. Frazzini, C. Malloy, Sell-Side School Ties, Journal of Finance, 2010
b. Y. Hochberg, A. Ljungqvist, Y. Liu, Whom You Know Matters: Venture Capital Networks and Investment Performance, Journal of Finance, 2007
c. M. Conyon, M. Muldoon, The Small World of Corporate Boards, Journal of Business Finance & Accounting, 2006
5. Two-sided markets
a. G. Weyl, A Price Theory of Multi-Sided Platforms, American Economic Review, 2010
b. A. Hagiu, Two-Sided Platforms: Product Variety and Pricing Structures, Journal of Economics & Management Strategy, 2009
38. M&A and states
○ Many deals are within California or related to California
○ Still, cross-state deal volume is substantial
39. M&A and industries
○ Many deals are within the software/web industry
○ Still, cross-industry deal volume is substantial
40. Power law in M&A
The distribution of the # of M&A deals per company follows a power law
43. ERGM vs. Logit Model
In logit/probit models,
● we assume that all the M&A deals are independent
● calculate the probability of observing an individual M&A deal
● maximize the product of the individual deals' likelihood functions
In ERGM,
● we assume that M&A deals are interdependent
● calculate the probability of observing the whole M&A graph
● maximize the likelihood of the graph as a whole
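The contrast can be written out in standard textbook notation. The symbols below are assumptions introduced for illustration: y_{ij} indicates a deal between firms i and j, x_{ij} is a vector of pair covariates with coefficients \beta, z_k(y) are graph statistics with parameters \theta_k, and \kappa(\theta) is the normalizing constant summed over all possible graphs.

```latex
% Logit: deals are independent, so the likelihood factorizes over pairs.
L_{\text{logit}}(\beta) = \prod_{i<j} p_{ij}^{\,y_{ij}} (1 - p_{ij})^{1 - y_{ij}},
\qquad p_{ij} = \frac{1}{1 + e^{-\beta^\top x_{ij}}}

% ERGM: one probability for the whole graph; it does not factorize in general.
P_\theta(Y = y) = \frac{1}{\kappa(\theta)} \exp\Big( \sum_{k=1}^{K} \theta_k z_k(y) \Big),
\qquad \kappa(\theta) = \sum_{y'} \exp\Big( \sum_{k=1}^{K} \theta_k z_k(y') \Big)

% Conditional form: log-odds of one deal given the rest of the graph,
% where y^{+}_{ij} and y^{-}_{ij} are y with the (i,j) deal present and absent.
\log \frac{P(Y_{ij} = 1 \mid y_{-ij})}{P(Y_{ij} = 0 \mid y_{-ij})}
= \sum_{k=1}^{K} \theta_k \big[ z_k(y^{+}_{ij}) - z_k(y^{-}_{ij}) \big]
```

When every statistic depends on each pair separately, the change in z_k no longer depends on the rest of the graph and the conditional form reduces to an ordinary logit model; statistics such as the count of firms with 2+ deals are what introduce the interdependence.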
45. Density and node degree
● Degree > 2 coefficient is positive
○ Power law is observed from the data
● Edge coefficient is a constant for the model