Home runs, singles and strike outs in venture capital returns

Posted in research on September 2nd, 2009 by Michael Ewens

Yesterday I posted a graph of the implied distribution of returns as an entrepreneurial firm increases its capital stock.  Today I present one important piece of that picture: the probabilities of return “regimes.”  First, the mixture model with mixing probabilities as a function of capital stock results in the following set of returns pdfs.

Individual return regimes and full pdf

It is clear from the figure that the return regimes separate nicely into the outcomes “high,” “medium” and “low.”  Venture capitalists like to call the outcomes in their portfolios “home runs,” “singles” or “strikeouts” and they typically set goals for proportions of each in their portfolio.  The mean log returns and volatilities for each regime show extreme separation between the two tails.

Distribution of Returns by Regime

Regime        E[\ln R]   \sigma(\ln R)   Probability
Home run        231%        123%            20%
Break-even       -1%         80%            60%
Bankruptcy     -273%        137%            20%
Full Model       -9%        112%            N/A

Includes all returns observations.  Estimated with sample selection and endogeneity corrections.
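
As a rough check on the table, the mean of a mixture is the probability-weighted average of the regime means. The sketch below uses the rounded values from the table and assumes, purely for illustration, that each regime is normal in log returns (the post does not state the component densities). The names `REGIMES`, `mixture_pdf` and `mixture_mean` are hypothetical.

```python
import math

# (probability, mean log return, std of log return): rounded values from the table.
# Treating each regime as a normal density in ln R is an assumption.
REGIMES = {
    "home_run": (0.20, 2.31, 1.23),
    "break_even": (0.60, -0.01, 0.80),
    "bankruptcy": (0.20, -2.73, 1.37),
}


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, regimes=REGIMES):
    """Full pdf: the regime densities weighted by their probabilities."""
    return sum(p * normal_pdf(x, mu, sigma) for p, mu, sigma in regimes.values())


def mixture_mean(regimes=REGIMES):
    """Mean log return of the mixture: sum of probability times regime mean."""
    return sum(p * mu for p, mu, _ in regimes.values())


print(f"mixture mean log return: {mixture_mean():.1%}")
```

With these rounded inputs the weighted mean comes out at about -9%, in line with the "Full Model" row.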

The mixing probabilities are a function of lagged capital stock, so I can plot the probability of each outcome for a range of dollars invested.  Figure 2 below shows that the bankruptcy risk is constant across capital stock while the probability of a home-run is highest for small firms.  Similarly, as firms raise more capital (and thus avoid bankruptcy) the most likely outcome becomes the “break-even” state with a 0% return.

The probability of each regime as a function of capital stock

Tomorrow I will discuss the motivation — theoretical and statistical — for the mixture model and parameterization of the mixing probabilities.

VC Returns by Stage

Posted in research, visualization on September 2nd, 2009 by Michael Ewens

Warning: Preliminary Dissertation Results Below

My work on VC risk and return currently focuses on fitting a mixture model to the selection-corrected round-to-round returns data. This model can incorporate non-normality, skewness, kurtosis and outliers. Recently, I introduced lagged capital stock into the mixing probabilities through a multinomial logit model because analysis of the full model on “small” and “large” firms illustrated significant differences in results across firm size. With a continuous variable like capital stock, I can produce the estimated mixture pdf for a wide range of entrepreneurial firm sizes. The video below shows the progression of the selection and endogeneity-corrected (they are different!) mixture pdf.

[vimeo]http://www.vimeo.com/6393464[/vimeo]
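
To make the multinomial logit idea concrete, here is a minimal sketch of mixing probabilities that depend on lagged (log) capital stock. The coefficients are hypothetical placeholders, not estimates from the dissertation, and the choice of the break-even regime as the base category is also an assumption.

```python
import math

# Hypothetical (intercept, slope) pairs per regime; NOT estimates from the paper.
# The base category (here break-even) has its coefficients fixed at zero.
COEFS = {
    "break_even": (0.0, 0.0),
    "home_run": (0.5, -0.4),
    "bankruptcy": (-1.0, 0.0),
}


def mixing_probabilities(log_capital, coefs=COEFS):
    """Multinomial logit (softmax) probabilities given lagged log capital stock."""
    scores = {k: a + b * log_capital for k, (a, b) in coefs.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    weights = {k: math.exp(s - top) for k, s in scores.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


for log_k in (0.0, 2.0, 4.0):
    probs = mixing_probabilities(log_k)
    print(log_k, {k: round(p, 3) for k, p in probs.items()})
```

The probabilities sum to one at every firm size, so evaluating the function over a grid of capital stock values gives one mixture pdf per grid point, which is the kind of progression the video animates.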

The most dramatic change as firm size increases is in the right tail: larger firms have significantly more mass in the middle of the distribution. The underlying regimes match a world of “Losers,” “Winners” and “Break Even” as seen in the figure below.

3-regime VC returns


I have discovered that incorporating lagged capital stock into the mixing probability helps to separate the individual regimes. I will be posting some more information about my results later in the week.

Angels, VCs and Homeruns

Posted in research on April 22nd, 2009 by Michael Ewens

Stephen Fleming argues that differences in fund structure and investment horizons create conflict between the interests of angels and VCs:

  1. Venture economics dictate that VC funds must have a certain number of home runs to make up for the number of deals that simply go broke.
  2. The average size of a venture fund has grown from $100M to $350M in ten years. That means the home runs have to be bigger… as a rule of thumb, you probably need to exit at $200M to “move the needle.”

This article is another motivation for my mixture model.

Posted via web from Michael’s posterous

Venture Capital Spinoffs, 1992 – 2007

Posted in economics, research on April 7th, 2009 by Michael Ewens

One of my research papers studies spinoffs in the venture capital industry. I seek to understand the process of spinoff formation and the performance of spinoff firms versus both their parent firms and other new firms.  If the data permits, I also hope to study the characteristics and investment performance of spinoff founders.

The data was a lot more difficult to collect than I originally planned for, so the following graph is quite exciting:


New firms and spinoffs in VC. Shows the total number of new firms and spinoffs from 1992 - 2007.

Of all new firms each year, some 15% are spinoffs founded by employees of existing VC firms.  Defining a spinoff requires rich data on the partners who found the firms and the investment activity of all actors over time.  I define a spinoff in the following way:

  1. New firm post-1992
  2. Has a partner who sat on the board of an entrepreneurial firm within two years of the firm’s founding
  3. Any partner has past experience at an existing VC firm (the “parent”) and did not found any previous VC firm

Right now, I assume that if the partners found in step 2 satisfy step 3, they are founders.  Of course, a partner could instead have been hired into the new firm by the real founders.  Fortunately, at least one of these partners is a founder, so the number of spinoff firms is basically known.  Still, I have to use Mechanical Turk to identify the true founders from the set of “potential founders.”  There are some 390 individuals who I know had past VC employment before working at the new firm, but for whom it is unknown whether they are simply an employee or a founder.  So, I will use Mechanical Turk to:

  1. Collect the biographies of each potential founder
  2. Read each bio to determine whether they founded the firm in question.

I will update this post with my progress — and new graphs — soon.

Using Mechanical Turk for Research

Posted in economics, research on April 5th, 2009 by Michael Ewens

Amazon’s Mechanical Turk is a service that allows you to hire 100s of people from across the world to tag photos, complete surveys or find websites. According to Wikipedia:

The Amazon Mechanical Turk (MTurk) is one of the suite of Amazon Web Services, a crowdsourcing marketplace that enables computer programs to co-ordinate the use of human intelligence to perform tasks which computers are unable to do. Requesters, the human beings that write these programs, are able to pose tasks known as HITs (Human Intelligence Tasks), such as choosing the best among several photographs of a storefront, writing product descriptions, or identifying performers on music CDs. Workers (called Providers in Mechanical Turk’s Terms of Service) can then browse among existing tasks and complete them for a monetary payment set by the Requester. To place HITs, the requesting programs use an open Application Programming Interface, or the somewhat limited Mturk Requester site.

Why would economists find this service useful? An example from my own work might help. I am collecting all the individuals employed by new VC firms founded from 1992 – 2007. I need demographic information and employment histories for each VC partner. After scraping the web to get the VC firm websites and “team pages” I have a set of locations for an individual to find the online biography of each of some 3000 VC partners. I submit a job to Mechanical Turk that asks the Turk’er to go to the website, find the individual’s biography and copy and paste the text. Further jobs could ask the Turk to read the bio and answer questions like: 1) Does this person have an MBA or PhD? 2) Male or Female? 3) Founder of firm? Unfortunately, submitting HITs to the Turk system is somewhat difficult. Enter Smartsheet.

Smartsheet as a Frontend to Mechanical Turk

Although the Mechanical Turk service has a system for submitting HITs, it is a little cumbersome and requires some strange formatting steps. If you value your time just a little (say, $10/hour), I recommend using SmartSourcing, a service of the project management webapp Smartsheet. Again, I refer to my research. I created an Excel file with columns like “Name”, “Firm Name”, “Website” and an empty column “Biography.” I upload this file to Smartsheet, select the rows for which I need the biography filled in and walk through their SmartSourcing steps. In about 3 minutes I have submitted a HIT to 1000s of workers that will be complete in 12 hours. I can approve or reject the responses while I watch them populate the online spreadsheet.

Such a service is not free. For a $9.95/month fee (or $99/annual…ask them for a non-profit coupon), you get access to SmartSourcing. Then, on top of the standard Turk fees, you have Smartsheet charges:

Any paid Smartsheet subscriber has access to the Crowdsourcing feature. Monthly charges include Amazon fees plus the cost for work performed (number of tasks completed * cents paid per task) and a low Smartsheet processing fee ($.01 + 10% per task completed – usually $10-$30 per 1,000 tasks).

For example, I paid about $7 for 116 biographies of VC partners. It sounds relatively expensive, but this service has:

  • increased the potential sample size of my studies
  • expanded the set of possible control variables
  • made it possible to request multiple workers per task for error checking
  • kept me sane by outsourcing mundane data tasks
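
The fee structure quoted above can be turned into a back-of-the-envelope cost estimate. The sketch below is only an illustration: it reads the Smartsheet fee as $.01 plus 10% of the per-task payment, and both the Amazon fee rate and the per-task reward in the example are assumed values that the post does not report.

```python
def crowdsourcing_cost(n_tasks, reward_per_task,
                       amazon_fee_rate=0.10,      # assumption: not stated in the post
                       smartsheet_flat_fee=0.01,  # "$.01 ... per task completed"
                       smartsheet_rate=0.10):     # "10% per task", read as 10% of the reward
    """Total cost in dollars for n_tasks completed at reward_per_task each."""
    worker_pay = n_tasks * reward_per_task
    amazon_fee = amazon_fee_rate * worker_pay
    smartsheet_fee = n_tasks * (smartsheet_flat_fee + smartsheet_rate * reward_per_task)
    return worker_pay + amazon_fee + smartsheet_fee


# Hypothetical example: 116 biographies at an assumed reward of 4 cents each.
print(f"${crowdsourcing_cost(116, 0.04):.2f}")
```

Under these assumed inputs the total lands a little under $7, the same order of magnitude as the amount mentioned above; the actual reward paid per biography is not known from the post.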

Venture Capital Returns, Mixture Models and Reality

Posted in economics, research on March 20th, 2009 by Michael Ewens

Fred Wilson of Union Square Ventures asks:

But is 3x a good venture return? It depends entirely on the stage you invest in and your “batting average”.

As an economist, I also think it matters how long it took to earn this return.  Ignoring that, Fred explains his terms:

In VC parlance, the batting average is the number of times you make a successful investment divided by the total number of investments you make.

Depending on what types of investments you make — late stage (less risky) or early stage (more risky) — the expected batting average will be different.  In order to earn a respectable final return then, a low batting average has to include at least a couple of home runs to “make the fund”:

[I]f you are an early stage investor (like our firm Union Square Ventures), then it is a different story. I’ve said many times [...] that our target batting average is “1/3, 1/3, 1/3” which means that we expect to lose our entire investment on 1/3 of our investments, we expect to get our money back (or maybe make a small return) on 1/3 of our investments, and we expect to generate the bulk of our returns on 1/3 of our investments.

Surprisingly, this division of returns looks very similar to the empirical results of the mixture model in my venture capital returns paper.  Some 1/3 of investments earn a positive mean return, while the remainder earn a negative annualized return. The large positive alphas in the bottom 1/3 regimes have a negative expected return and significant systematic risk. Fred confirms that at least for early stage investors, outliers generate the bulk of the returns:

I’ve also said on this blog a bunch of times that we look for one investment to return the entire fund. In the case of our 2004 fund, that would be a $125mm return on one single investment.

Fred suggests that late-stage investors typically hit “100%” but have lower average returns. How can I reconcile these anecdotal facts with my paper’s results?  First, what Fred states are goals, not actual results.  Next, the final weights and returns look much like those earned by early-stage VCs rather than late-stage investors.  My model effectively averages across all VC investments, so the final mixture weights are across all stages and industries.   The model says something about the full population of VC investment opportunities that the average VC faces.

Ignoring investment skill or sorting, suppose that a VC simply draws from one of the three “bins” in the final mixture distribution.  My last draft suggests these are the possible outcomes:

Full VC Returns Mixture Results
Probability   Mean Log Return (annualized)   Std. log return
32.5%         -32%                           146%
34%           4.5%                           32%
33.6%         -19%                           103%
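
One way to read the table above: a bin with a negative mean can still produce a large return because its volatility is so high. The sketch below assumes each bin is normal in annualized log returns (an assumption for illustration) and uses a hypothetical cutoff of a 100% log return; `BINS`, `prob_above` and `THRESHOLD` are illustrative names.

```python
import math

# (probability, mean annualized log return, std of log return) from the table.
BINS = [
    (0.325, -0.32, 1.46),
    (0.340, 0.045, 0.32),
    (0.336, -0.19, 1.03),
]


def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal with mean mu and std sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def prob_above(threshold, mu, sigma):
    """P(X > threshold) under the same normal assumption."""
    return 1.0 - normal_cdf(threshold, mu, sigma)


THRESHOLD = 1.0  # hypothetical cutoff: a log return of 100%

# Share of probability mass sitting in bins with a negative mean.
negative_share = sum(w for w, mu, _ in BINS if mu < 0)
print(f"share of bins with negative mean: {negative_share:.1%}")

for _, mu, sigma in BINS:
    print(f"bin mean {mu:+.1%}: P(log return > 100%) = "
          f"{prob_above(THRESHOLD, mu, sigma):.3f}")
```

Under the normality assumption, the two negative-mean bins carry a much larger chance of clearing the cutoff than the low-volatility middle bin, even though their expected returns are lower.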

66% of the time the investment will have a negative expected return (in an annualized sense).  However, once a VC chooses a bad “bin” they face an enormous amount of idiosyncratic risk, so they could earn a large return with a small probability.  I ran the mixture model separately for early and late-stage investments:

Early-stage VC Returns Mixture Results
Probability   Mean Log Return (annualized)   Std. log return
22%           -50%                           157%
33%           7%                             37%
44%           -15%                           90%

Here, the mixture weights suggest that again, 66% of investments have a negative expected return. Probability has shifted from the very low return state to the low return state. The last table shows the distribution of returns for late-stage investments:

Late-stage VC Returns Mixture Results
Probability   Mean Log Return (annualized)   Std. log return
10%           1.7%                           6%
52%           2%                             53%
38%           -16%                           127%

Note: the low volatility and probability of the first state may suggest a two-state, rather than three-state model.

For late-stage investments, only 38% have a negative expected return, while 52% “break-even.” The expected return to all late-stage investments is -5% versus -15% for early-stage investments.  Late-stage investors earn higher average returns, face much less risk, but don’t have the same opportunities for outliers.  It is not immediately clear whether the late-stage estimates confirm the “100%” batting average, as the model’s final estimates predict that 50% of both late and early-stage investments lose money.  The implied cdfs of each sub-sample show that early-stage investments have much more left-tail risk:

The cdf for early and late stage investments
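
The stage-level numbers can be roughly reproduced from the two tables. The sketch below assumes normal components in annualized log returns (an illustrative assumption) and rescales the rounded early-stage weights, which sum to 99%, so that they add to one. `EARLY`, `LATE`, `mixture_mean` and `mixture_cdf` are hypothetical names.

```python
import math

# (probability, mean annualized log return, std of log return), rounded table values.
EARLY = [(0.22, -0.50, 1.57), (0.33, 0.07, 0.37), (0.44, -0.15, 0.90)]
LATE = [(0.10, 0.017, 0.06), (0.52, 0.02, 0.53), (0.38, -0.16, 1.27)]


def normalize(components):
    """Rescale the rounded weights so they sum to one."""
    total = sum(p for p, _, _ in components)
    return [(p / total, mu, sigma) for p, mu, sigma in components]


def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def mixture_mean(components):
    """Probability-weighted mean log return of the mixture."""
    return sum(p * mu for p, mu, _ in normalize(components))


def mixture_cdf(x, components):
    """P(log return <= x) under the mixture."""
    return sum(p * normal_cdf(x, mu, sigma) for p, mu, sigma in normalize(components))


for name, comps in (("early", EARLY), ("late", LATE)):
    print(f"{name}: mean = {mixture_mean(comps):.1%}, "
          f"P(log return < 0) = {mixture_cdf(0.0, comps):.2f}, "
          f"P(log return < -100%) = {mixture_cdf(-1.0, comps):.2f}")
```

With these inputs the means come out near -15% and -5%, roughly half of the mass in each sub-sample lies below a zero log return, and the early-stage cdf sits above the late-stage cdf deep in the left tail.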


Do late-stage investors bat “100%”?  The data suggest they have less left-tail risk, a near-zero expected return and low volatility.  Although presented for early-stage investors, the “1/3, 1/3, 1/3” model proposed by Fred Wilson looks much like the full population of VC returns.