Pipeline Physics produces profit
Gary Summers, PhD	1700 University Blvd, #936
President, Pipeline Physics LLC	Round Rock, TX 78665-8016
gary.summers@PipelinePhysics.com	503-332-4095

How erroneous data causes project selection errors

Whether data comes from forecasts, guesstimates or measurements, all data used to evaluate projects contains errors, called estimation errors. Two types of errors occur: inaccuracy and imprecision, which are also called bias and error. I will use the terms inaccuracy and imprecision because I use the term error in other ways, such as estimation errors, project evaluation errors and project selection errors. Figure 1 illustrates inaccuracy and imprecision. The target on the left illustrates inaccuracy: estimation errors persist in the same direction. The target on the right illustrates imprecision: estimation errors occur in random directions.

An illustration of inaccuracy (bias) and imprecision, showing the distribution of arrows shot at a target.

Figure 1: The left-side target illustrates precise but inaccurate (biased) estimates. Its "shots" are tightly clustered (precise) but to the right of the bulls-eye (inaccurate). The right-side target illustrates accurate but imprecise estimates. If averaged together the "shots" would hit the bulls-eye (accurate, no bias), but they vary around the bulls-eye considerably, in all directions (imprecision).
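The distinction in Figure 1 can also be seen numerically. The following is a minimal sketch, not part of the original analysis: it assumes a hypothetical true value of 100, a constant bias for the left-side target and normally distributed noise for both targets. Inaccuracy shows up as a nonzero mean error, while imprecision shows up as a large standard deviation.

```python
import random
import statistics


def simulate_estimates(true_value, bias, spread, n, rng):
    """Draw n estimates of true_value with a constant bias plus normal noise."""
    return [true_value + bias + rng.gauss(0.0, spread) for _ in range(n)]


def summarize(estimates, true_value):
    """Return (inaccuracy, imprecision): mean error and std dev of the estimates."""
    errors = [e - true_value for e in estimates]
    return statistics.mean(errors), statistics.stdev(estimates)


rng = random.Random(0)   # fixed seed so the run is repeatable
true_value = 100.0       # hypothetical true project value (assumption)

# Left-side target: tightly clustered shots that miss in the same direction.
left_target = simulate_estimates(true_value, bias=20.0, spread=2.0, n=10_000, rng=rng)
# Right-side target: shots centered on the bulls-eye but widely scattered.
right_target = simulate_estimates(true_value, bias=0.0, spread=20.0, n=10_000, rng=rng)

for name, shots in [("precise but inaccurate", left_target),
                    ("accurate but imprecise", right_target)]:
    inaccuracy, imprecision = summarize(shots, true_value)
    print(f"{name}: mean error = {inaccuracy:.1f}, std dev = {imprecision:.1f}")
```

Averaging many estimates removes imprecision but leaves inaccuracy untouched, which is why the two problems call for different remedies.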

When evaluating compounds and projects, both inaccuracy and imprecision are problems.

Inaccuracy (bias): Some managers have told me that biases do not matter if they affect all projects, but this assertion is wrong. Optimistic biases, which are pervasive, favor risky projects, which causes managers to unknowingly build portfolios that are too risky. (See my discussion, "Optimistic probability estimates, or 'How to unknowingly pick the wrong projects and mistakenly increase portfolio risk.'")
Imprecision: With near ubiquity, managers underestimate the imprecision in their estimates. (See my discussion, "Overconfidence and underestimating risk," and my forthcoming discussion, "Four common errors in PPM's Monte Carlo analysis.")

Knowing that imprecision is a significant problem, let's see how it produces project selection errors. Suppose one must choose between two projects, one of which creates more value. If the estimates of the projects' values, such as NPVs or eNPVs, are imprecise, what is the probability of mistakenly selecting the less valuable project?

To answer this question, Figure 2 presents the imprecise evaluations of two projects. The center of each distribution represents a project's true value. Project B has a higher true value than project A, so the correct decision is to choose project B. The curves in the figure are probability distributions that show potential estimated values of the projects. In technical terms, the distributions depict the imprecision in the project evaluations. Because the curves overlap, the estimated value of project A can exceed the estimated value of project B, and that event would cause one to select project A, which is a project selection error.

Overlapping probability distributions show the imprecision in estimating the value of two projects.

Figure 2: Imprecise estimates of project value.

How likely is such an error? Greater overlap of the curves increases the probability of an error, which implies two conclusions. If two projects have similar values, the curves are closer together and there is a greater chance of error. Of course, if the two projects have similar values, selecting the lesser project causes little harm. The second conclusion is more insidious. As the imprecision in the projects' evaluations increases, the curves get wider, their overlap increases and selection errors become more likely.
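The two conclusions above can be made concrete with a small calculation. This is a sketch under stated assumptions rather than the method behind the figures: it assumes each project's estimated value is unbiased, normally distributed around its true value, and independent of the other project's estimate. The function name and the example numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def selection_error_probability(true_a, true_b, sd_a, sd_b):
    """Probability that the lesser project receives the higher estimate.

    Assumes unbiased, independent, normally distributed estimates with
    standard deviations sd_a and sd_b (the imprecision of each evaluation).
    """
    gap = abs(true_b - true_a)          # distance between the curves' centers
    sd_diff = sqrt(sd_a**2 + sd_b**2)   # std dev of the difference of the estimates
    if sd_diff == 0:
        return 0.0 if gap > 0 else 0.5
    return NormalDist().cdf(-gap / sd_diff)


# Hypothetical values: project B is truly worth 120, project A is worth 100.
for sd in (10.0, 40.0):
    p = selection_error_probability(100.0, 120.0, sd, sd)
    print(f"imprecision (std dev) {sd:>4}: P(select A by mistake) = {p:.3f}")
```

Shrinking the gap or widening either curve moves the probability toward 0.5, which is the result of choosing at random.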

One wishes to know the relationship between these three variables: imprecision in the project evaluations, difference in the projects' true values and probability of selecting the inferior project. Let's get some answers by considering an example. The decision tree of Figure 3 is for a drug that is being considered for clinical trials. Let this project be project A from Figure 2. Furthermore, for this example, assume all the numbers in the decision tree are true values, not estimates. Finally, assume project B of Figure 2 has an identical decision tree, except that, if successful, it has larger revenues.

For simplicity, assume the evaluations of both projects perfectly estimate all variables except for the revenues. Focusing on imprecise revenue estimates is useful because these estimation errors dominate the project evaluations made with decision trees and capital asset budgeting models. (See my discussion, "Revenue forecasting errors dominate decision trees.")

Figure 3: A decision tree for a compound entering clinical trials in drug development. The four chance nodes (branching points) represent the three phases of clinical trials and FDA approval. The top branch of each node represents success and the bottom branch represents failure. For example, the drug has a 70% chance of success in phase I clinical trials, and if successful, it has a 45% chance of success in phase II trials. The red numbers within the decision tree represent the costs of each stage, while the green number represents revenue (all presented as present values). For example, phase I clinical trials cost $3 million and phase II trials cost $6.5 million.
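For readers who want to see how such a tree is evaluated, here is a minimal sketch of rolling back a sequential decision tree. It assumes each stage's cost is paid on entering the stage, failure ends the project with no further cash flows, and all amounts are present values in millions of dollars. The phase I and phase II figures and the $700 million revenue come from this discussion; the phase III and FDA approval figures are hypothetical placeholders, so the result is not expected to reproduce the expected value read from Figure 3.

```python
from math import prod


def expected_npv(stages, revenue):
    """Roll back a sequential tree of (cost, p_success) stages.

    stages  : list of (cost, p_success) in chronological order
    revenue : payoff received only if every stage succeeds
    """
    value = revenue
    for cost, p_success in reversed(stages):
        # Pay the stage cost for certain; continue only on success.
        value = -cost + p_success * value
    return value


def cumulative_success_probability(stages):
    """Probability of passing every stage and reaching the market."""
    return prod(p for _, p in stages)


stages = [
    (3.0, 0.70),    # phase I: cost and probability from the discussion
    (6.5, 0.45),    # phase II: cost and probability from the discussion
    (20.0, 0.65),   # phase III: HYPOTHETICAL placeholder values
    (1.0, 0.90),    # FDA approval: HYPOTHETICAL placeholder values
]

print(f"expected value: {expected_npv(stages, 700.0):.2f} million")
print(f"cumulative probability of success: {cumulative_success_probability(stages):.3f}")
```

Because revenue enters the calculation only through the all-success path, an error in the revenue estimate changes the expected value by that error multiplied by the cumulative probability of success.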

I'll represent project B's expected value as a percent of project A's expected value. For example, suppose project B has an expected value of $141.85 million. I will report it as being 120% of project A's expected value, since $141.85 million = 120% * $118.21 million. Referring to Figure 2, 120% specifies the distance between the means of the curves. If project B's expected value is only 110% of project A's expected value, the curves are close together. If project B's expected value is 130% of project A's expected value, the curves are further apart.

We can now answer the question, "If project evaluations are imprecise, how likely are project selection errors?" Figure 4 presents the answer. The horizontal axis shows the estimated value of project B as a percent of project A's estimated value. The vertical axis shows the probability of a project selection error. The three curves illustrate different amounts of imprecision in revenue forecasts for projects A and B, reported as a percent of project A's revenues. Recall the true revenue for project A is $700 million. If the estimate of this revenue has an imprecision of 10%, the revenue is $700 ± $70 million. (Technically, the standard deviation of the imprecision is $70 million.) Before exploring each curve, consider random selection. Each project has a 50% chance of being selected. This is the worst possible performance for project selection.

The probabilities of erroneously selecting the less valuable projects as a function of their expected values and the imprecision in estimating their expected values.

Figure 4: Probability of selecting the lesser of two projects as a function of imprecision and difference in value.
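One way to reason about the shape of these curves is the following derivation. It is a sketch under assumptions that go beyond the text: the revenue estimation errors of the two projects are independent, unbiased and normally distributed with the same standard deviation, and only revenues are misestimated. The symbols are my own notation for this illustration.

```latex
% Assumed notation (introduced for this sketch):
%   V_A, V_B : true expected values of projects A and B, with V_B > V_A
%   P        : cumulative probability of success
%              (product of the four chance-node probabilities)
%   \sigma_R : standard deviation of the revenue estimation error
%              (for example, 10% of $700 million = $70 million)
\begin{align}
\hat{V}_A &= V_A + P\,\varepsilon_A, \qquad
\hat{V}_B = V_B + P\,\varepsilon_B, \qquad
\varepsilon_A, \varepsilon_B \sim \mathcal{N}(0, \sigma_R^2)
  \text{ independent} \\
\hat{V}_A - \hat{V}_B &\sim
  \mathcal{N}\!\left(V_A - V_B,\; 2 P^2 \sigma_R^2\right) \\
\Pr(\text{selection error}) &= \Pr(\hat{V}_A > \hat{V}_B)
  = \Phi\!\left(-\frac{V_B - V_A}{\sqrt{2}\, P\, \sigma_R}\right)
\end{align}
```

Here $\Phi$ is the standard normal cumulative distribution function. The first line holds because revenue is received only on the path where every stage succeeds, so a revenue error is scaled by $P$ in the expected value. As $V_B$ approaches $V_A$, or as $\sigma_R$ grows, the probability approaches $\Phi(0) = 0.5$, which matches the random-selection ceiling described above.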

Let's begin with an imprecision of 10% so that both projects' revenues are estimated with an error of ±$70 million. The probability of selecting the wrong project (project A) is high only if the projects have nearly identical values. For example, if project B's true expected value is 110% of project A's true expected value, the probability of a selection error is about 30%. If project B's estimated expected value is 120% of project A's expected value, the probability of a selection error is only 10%.

An average error of only 10% is amazing. If one can estimate revenues with an imprecision of only 10%, project selection will be exceptional, and the few errors that occur will be among similarly valued projects, so these errors have small costs.

Now consider demand forecasts with imprecision of 40%. The imprecision in estimating project A's and project B's revenues is ±$280 million (280 = 40% * 700). Project selection errors are much more probable and hence more frequent. If project B's true expected value is 120% of project A's true expected value, the chance of a selection error is nearly 40%. This situation is only slightly better than random selection (each project has a 50% chance of being selected).

To many managers, imprecision between 10% and 40% may seem likely, but imprecision is probably much larger. McKinsey and Company measured the imprecision in forecasting demand for compounds two years prior to launch. The imprecision in the estimated demand was 75%. (See my discussion, "Revenue forecasting errors dominate decision trees.") With such imprecision, revenue estimates for project A would be $700 ± $525 million.

Figure 4 shows the impact of 75% imprecision. If project B's estimated expected value is 150% of project A's expected value, there is a 34% chance of selecting the wrong project. With such large errors in estimated revenues, project selection is fraught with errors.

For several reasons, selection errors are even more likely:

The 75% imprecision is for revenue forecasts made two years prior to market launch. Compounds being considered for phase I, II or III are typically 8 years, 6.5 years and 4 years away from launch, respectively, so their revenue forecasts will contain even larger errors.
While errors in estimating revenues have the biggest effect, imprecision in estimating costs, probabilities, discount rates and other qualities of projects increases the likelihood of project selection errors.
All project evaluation models are abstractions, providing imperfect and incomplete representations of projects. These imperfections increase the probability of project selection errors.
The above example assumes imprecise data is normally distributed. Normal curves have fat middles and skinny tails, reducing the overlap of the curves and potentially underestimating the likelihood of selection errors. However, the assumption of normally distributed imprecision is a poor model for pharmaceuticals for which a small percent of candidates are blockbusters. It's a poor assumption for any PPM in which projects follow the 80-20 rule: 80% of benefits are produced by 20% of projects.

Fortunately, one can mitigate the impact of estimation errors in three ways. First, some evaluation models propagate estimation errors through their calculations to produce large project evaluation errors. Other evaluation models dampen the propagation of errors, lessening the impact of estimation errors on project evaluations. If there are large evaluation errors, select an evaluation method that mitigates their impact. (See my forthcoming discussion, "To combine or not to combine? That is the question.")

Second, some project selection techniques manage project evaluation errors better than others. Generally, complex selection techniques are sensitive to project evaluation errors while appropriately designed simpler selection techniques are more robust to project evaluation errors. If project evaluation errors are a problem, use a simpler selection technique. Table 1 provides general guidelines for choosing a selection technique.

*Table 1*: How to match your selection technique to your project evaluation errors.

| Project evaluation errors | Simpler selection technique | Sophisticated selection technique |
|---|---|---|
| Small | Poor result (value left on the table) | Best result (achieves action flexibility) |
| Large | Good result (achieves state flexibility) | Poor result (too many avoidable errors) |

The third way of managing estimation errors makes more fundamental changes to PPM. One can replace its static, portfolio optimization model with a dynamic model of value flowing through a pipeline (phase-gate system). To learn about this approach, see my discussions, "How to make drug development more productive" and "Managing drug development pipelines."

After reading my discussions, many managers wish to share their experiences, thoughts and critiques of my ideas. I always welcome and reply to their comments.

Please share your thoughts with me by using the form below. I will reply to you via email. If you prefer to be contacted by phone, fax or postal mail, please send your comments via my contact page.