NBER Reporter 2010 Number 1: Research Summary

Measuring Returns to Healthcare

Joseph Doyle *

Healthcare spending in the United States comprises 16 percent of GDP, nearly 80 percent more than in the median OECD country and 45 percent above that of the second-highest spending nation, France. Across countries, and across markets within the United States, the vast disparities in spending are not associated with better health outcomes.1 However, evidence from time series and panel data suggests that higher healthcare spending has generated benefits that, when converted to dollar magnitudes in various ways, appear to exceed their costs.2 Of course, the type of variation in treatment intensity differs across these two types of comparisons, but the question remains: are the returns to healthcare large or small?

Estimating such returns can be confounded because medical providers attempt to provide each patient with a particular level of care. With heterogeneous returns, greater care is likely provided to those with the highest returns. This would tend to bias results toward finding beneficial effects of treatment. At the same time, patients with the highest returns may be those in relatively poor health. Indeed, hospitalized patients who receive more care are much more likely to die in the hospital, even after controlling for a host of observable characteristics: more care is provided to patients in worse health. With the raw correlation between treatment and health seemingly negative, estimating returns is an uphill battle.
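The selection problem described above can be illustrated with a small simulation. Everything in it is invented for illustration (the severity scale, the probabilities, and the function names are assumptions, not estimates from any of the studies): care truly lowers the risk of death, yet because sicker patients receive more care, the raw comparison makes care look harmful.

```python
import random

def simulate(n=20_000, seed=0):
    """Generate (high_care, died) pairs from a made-up model of selection."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        severity = rng.random()              # unobserved illness: 0 healthy, 1 very sick
        high_care = rng.random() < severity  # sicker patients are more likely to get intensive care
        # Assumed truth: intensive care LOWERS death risk by 5 points,
        # while severity raises it.
        p_death = 0.02 + 0.30 * severity - (0.05 if high_care else 0.0)
        died = rng.random() < max(p_death, 0.0)
        rows.append((high_care, died))
    return rows

def death_rate(rows, high_care):
    """Share of patients who died within one care group."""
    group = [died for care, died in rows if care == high_care]
    return sum(group) / len(group)

rows = simulate()
naive_gap = death_rate(rows, True) - death_rate(rows, False)
print(f"Death rate, high care: {death_rate(rows, True):.3f}")
print(f"Death rate, low care:  {death_rate(rows, False):.3f}")
print(f"Naive gap (high minus low): {naive_gap:+.3f}")
```

In this toy model the naive gap comes out positive even though the assumed effect of care is beneficial, which is the pattern that motivates the search for natural experiments.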

In a series of research studies, my co-authors and I have explored natural experiments that can shed some light on the returns to healthcare. Most of these papers consider conditions where selection bias associated with admission into the hospital is less of an issue: childbirth (the most common reason for hospitalization in the United States) and emergency admissions. This research summary briefly describes this work and points to future work in the area.

Evidence from At-Risk Newborns

One project, joint with Douglas Almond, Amanda Kowalski, and Heidi Williams, uses the idea that diagnostic thresholds can offer the potential to estimate returns to healthcare.3 If physicians provide greater levels of care to patients falling just above a diagnostic criterion, then researchers can compare treatment and health outcomes for patients just above and below the threshold. The nature of the variation allows us to measure marginal returns, which are crucial for the interpretation of whether additional care saves lives.

Our work focuses on at-risk newborns on either side of the "very low birthweight" threshold of 1500 grams (3 lbs. 5 oz.). The underlying health of newborns who weigh 1499 grams is similar to that of newborns weighing 1500 grams, yet the rules of thumb used by physicians and hospital protocols call for additional attention for newborns below the threshold. By comparing newborns on either side of the threshold, we are able to avoid some of the confounding factors that usually affect measurements of the returns to health care.
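A threshold comparison of this kind can be sketched in a few lines. The version below is a generic regression-discontinuity illustration, not the estimation used in the paper: it fits a separate straight line on each side of the cutoff and compares the two fitted values at 1500 grams. The data are synthetic, and the outcome scale, bandwidth, and function names are all assumptions made for the example.

```python
import random

THRESHOLD = 1500  # grams; the "very low birthweight" cutoff

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def rd_jump(weights, outcomes, bandwidth):
    """Fitted outcome just below the cutoff minus fitted outcome just above it."""
    # Re-center weights at the cutoff so each intercept is the fitted value at 1500 g.
    below = [(w - THRESHOLD, y) for w, y in zip(weights, outcomes)
             if THRESHOLD - bandwidth <= w < THRESHOLD]
    above = [(w - THRESHOLD, y) for w, y in zip(weights, outcomes)
             if THRESHOLD <= w <= THRESHOLD + bandwidth]
    a_below, _ = fit_line(*zip(*below))
    a_above, _ = fit_line(*zip(*above))
    return a_below - a_above

# Synthetic data: the outcome falls smoothly with birthweight, plus an
# assumed jump of 5 units for newborns below the cutoff, plus noise.
rng = random.Random(0)
TRUE_JUMP = 5.0
weights = [rng.randint(1415, 1585) for _ in range(4000)]
outcomes = [50 - 0.1 * (w - THRESHOLD)
            + (TRUE_JUMP if w < THRESHOLD else 0.0)
            + rng.gauss(0, 2)
            for w in weights]

estimate = rd_jump(weights, outcomes, bandwidth=85)
print(f"Estimated jump at the cutoff: {estimate:.2f} (assumed true jump: {TRUE_JUMP})")
```

Fitting a line on each side, rather than simply comparing averages, keeps the smooth relationship between birthweight and the outcome from being mistaken for a jump at the threshold.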

We show that newborns with birthweights just below 1500 grams have discontinuously higher hospital costs than slightly heavier newborns, on the order of $10,000 each. When we study data from the census of U.S. births over twenty years, we find that newborns with birthweights just below 1500 grams have an infant mortality rate that is one percentage point lower than that of newborns with birthweights just above this cutoff, even though mortality risk tends to decrease with birthweight. This constitutes a relatively large reduction when compared to a mortality rate of 5.5 percent just above 1500 grams. We conclude that the additional medical attention afforded to very low birthweight newborns is highly cost effective at saving lives.
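The cost-effectiveness claim can be made concrete with back-of-the-envelope arithmetic. The sketch below simply divides the rounded figures quoted above; it is an illustration of the logic, not the paper's own estimate, and the function name is hypothetical.

```python
def cost_per_life_saved(extra_cost_per_patient, mortality_reduction):
    """Extra spending per patient divided by the drop in mortality risk per patient."""
    if mortality_reduction <= 0:
        raise ValueError("mortality_reduction must be positive")
    return extra_cost_per_patient / mortality_reduction

EXTRA_COST = 10_000         # dollars per newborn, rounded figure quoted in the text
MORTALITY_DROP = 0.01       # one percentage point
BASELINE_MORTALITY = 0.055  # mortality rate just above 1500 grams

implied_cost = cost_per_life_saved(EXTRA_COST, MORTALITY_DROP)
relative_drop = MORTALITY_DROP / BASELINE_MORTALITY

print(f"Implied cost per life saved: ${implied_cost:,.0f}")
print(f"Relative reduction in mortality: {relative_drop:.0%}")
```

With these rounded inputs the implied cost is about $1 million per life saved, and the one-point drop amounts to roughly an 18 percent reduction relative to the 5.5 percent baseline.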

The same project shows that hospitals with the most state-of-the-art neonatal intensive care units (NICUs) are less likely to use the threshold, whereas changes in treatment and mortality are found at those hospitals with lower-level or no NICUs.

Evidence from "Uncomplicated" Births

In another study with Douglas Almond, we test whether a longer stay in the hospital after a birth affects the health of newborns and mothers.4 We use insurance rules that provide coverage for one or two days in the hospital after birth, and these days are counted as "the number of midnights in care." That is, a newborn delivered at 12:05 a.m. will have one more night of reimbursable care than an infant born a few minutes earlier. In a dataset of California births from 1991 to 2002, including nearly 100,000 births within 20 minutes of midnight, we find that the discontinuous change in insurance coverage leads to significantly longer stays for those born just after midnight than for those born before midnight. We find no differences in major health problems, summarized by hospital readmissions and mortality, for either the infants or the mothers. Together with a 1997 law that mandated coverage for a minimum of two days, these results suggest that increases in the length of stay from one to two days or from two to three days impose substantial costs without apparent health benefits.
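The "midnights in care" counting rule is easy to state in code. The sketch below shows why two births ten minutes apart end up with different amounts of covered time; the dates, the two-midnight coverage level, and the function names are hypothetical, and how insurers treat the final day is an assumption rather than something taken from the study.

```python
from datetime import datetime, timedelta

def midnights_in_care(birth, discharge):
    """Number of midnights that pass between birth and discharge."""
    return (discharge.date() - birth.date()).days

def last_covered_date(birth, covered_midnights):
    """Calendar date reached once the covered number of midnights has passed."""
    return birth.date() + timedelta(days=covered_midnights)

# Two hypothetical births ten minutes apart, on either side of midnight.
born_before = datetime(1995, 6, 5, 23, 55)
born_after = datetime(1995, 6, 6, 0, 5)

# Under a two-midnight policy, the later birth is covered through a later date.
print(last_covered_date(born_before, 2))
print(last_covered_date(born_after, 2))

# Discharged at the same moment, the earlier birth has used up one more midnight.
discharge = datetime(1995, 6, 7, 12, 0)
print(midnights_in_care(born_before, discharge))
print(midnights_in_care(born_after, discharge))
```

Because coverage is tied to calendar midnights rather than elapsed hours, the infant born just after midnight gains almost a full additional day of covered care, which is the discontinuity the study relies on.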

In contrast to the findings on at-risk newborns described above, the results of this study apply to "uncomplicated" deliveries. These newborns are representative of the typical birth. While new parents may benefit from the additional night of supervision, we conclude that in this instance the insurance mandates result in moral hazard: greater use of hospital resources with little benefit in terms of major health problems. This is consistent with efforts by insurers to reduce stays to one night in care.

Evidence from Health Emergencies

As noted earlier, there is a large amount of regional variation in healthcare spending within the United States. The Dartmouth Atlas of Healthcare shows that some markets spend 60 percent more than others, yet survival from a heart attack is remarkably similar across these areas. Is it possible that individuals in high-spending areas are in worse health in ways that are difficult to control for in the comparisons?

While it is not possible to randomly assign patients to different healthcare systems, I have compared the outcomes of patients who are exposed to different healthcare systems not designed for them: patients who are far from home when a health emergency strikes.5 Patients who experience these health shocks may find themselves in an area that spends a great deal on patients or in one that tends to spend less. For example, West Palm Beach and Fort Lauderdale are neighboring cities on Florida's east coast with similar lodging prices, yet Fort Lauderdale tends to spend 30 percent more on heart attack patients. The idea is that these types of cities are close demand substitutes in terms of destinations, and they attract "close substitutes" in terms of patients.

In contrast to the literature that focuses on local patients, I find that visitors to Florida who have a serious heart-related emergency in a high-spending market have a 20 percent lower mortality rate than those in low-spending areas. These estimates are robust across different types of patients, including patient-income levels, and within groups of similar destinations. In addition, the results suggest that intensive-care unit services drive cost differences, and they appear to be cost effective.

The results apply to emergency care, and specifically to a set of patients healthy enough to travel. To the extent that the results may apply more broadly, it appears that high-spending areas may not be as wasteful as the previous cross-section results suggest.

The Consequences of Being Uninsured

An earlier study also considers health shocks by comparing patients with and without health insurance following a severe automobile accident.6 This is a sudden health emergency in which individuals have no choice but to visit the hospital. The analysis uses a dataset originally intended for highway-safety research that links hospital discharge records with police reports. I find that the uninsured receive 20 percent less care and have a significantly higher mortality rate.

Another innovation in this project is the comparison group available in the rich data source: individuals who have health insurance but do not have automobile insurance according to the police report. These patients are quite similar to those who do not have health insurance, and the results are similar when the analysis is restricted to these two groups. This is one of the few studies to examine the potential effects of health insurance directly on health outcomes.7 The results again suggest that greater treatment intensity yields health benefits for trauma care.

Returns to Physician Quality

Some physicians provide much more care to patients than others, which is a major source of variation within and across cities. Patients are referred or choose their physicians, however, and it is not clear how much of the variance in care can be explained by differences among the patients themselves.

A project with Todd Wagner and Steven Ewer studies a setting where over 30,000 patients in a large, urban hospital were randomly assigned to physician teams.8 Further, the teams are affiliated with one of two academic institutions: one institution is among the top medical schools in the United States, while the other institution is ranked lower in the quality distribution. Because of the randomization, patients treated by the two teams have essentially identical observable characteristics on average. Further, both teams have access to a single set of facilities and ancillary staff, because care is located in the same hospital.

We show that across common conditions, the more-prestigious teams provide care that is 10-25 percent less costly than the less-prestigious ones. Health outcomes are not related to the physician-team assignment, and the estimates are precise: they (statistically) rule out better health outcomes associated with assignment to the more-prestigious team. As for the source of the treatment differences, the results are consistent with physicians from the lower-ranked institution substituting diagnostic tests and specialist consultations for the faster judgments of physicians from the top-ranked institution.

The comparison is between only two institutions, but the results have a number of implications. First, local-area variation in care can be substantial, even after controlling for patient characteristics. Second, inequality in access to high-quality physicians may lead to differences in the use of specialists and testing, but not to health disparities. Third, a relaxation of accreditation standards may not adversely affect the quality of care, but it may raise operating costs. Fourth, previous studies have found that high-cost areas are associated with lower-quality care, a greater reliance on specialists, and little difference in health outcomes, and have interpreted this as evidence of wasteful spending; these results suggest an alternative interpretation. Areas with lower-quality providers may require greater treatment intensity and the use of specialists in order to achieve outcomes on par with areas with higher-quality providers. This appears to be a fruitful area for future research.


Measuring returns to healthcare can be confounded by the nature of the delivery: more care is provided to patients in worse health. My research has investigated instances when additional care is less likely to be related to underlying patient health and found large returns for at-risk newborns and patients receiving emergency care, but small returns for longer postpartum hospital stays among typical births. Future work should continue to consider additional types of patients and treatments, begin to consider chronic conditions, and investigate the interaction between physician quality and the cost of care.

* Doyle is a Faculty Research Fellow in the NBER's Program on Aging and the Alfred Henry and Jean Morrison Hayes Career Development Associate Professor of Applied Economics at MIT's Sloan School. His profile appears later in this issue.

1. See, for example, E. Fisher, D. Wennberg, T. Stukel, D. Gottlieb, F. Lucas, and E. Pinder, "Implications of regional variations in Medicare spending, Part 2: health outcomes and satisfaction with care," Annals of Internal Medicine, 138(4) (2003), pp. 288-98; and A.M. Garber and J.S. Skinner, "Is American Healthcare Uniquely Inefficient?" Journal of Economic Perspectives, 22(4) (2008), pp. 27-50.

2. D. Cutler, A. Rosen, and S. Vijan, "The Value of Medical Spending in the United States, 1960-2000," New England Journal of Medicine, 355 (2006), pp. 920-27; and K.M. Murphy and R. Topel, "The Economic Value of Medical Research," in Measuring the Gains from Medical Research: An Economic Approach, K.M. Murphy and R. Topel, eds., Chicago: University of Chicago Press, 2003.

3. D. Almond, J. Doyle, A. Kowalski, and H. Williams, "Estimating Marginal Returns to Medical Care: Evidence from Care for At-Risk Newborns," NBER Working Paper No. 14522, December 2008, and Quarterly Journal of Economics, forthcoming.

4. D. Almond and J. Doyle, "After Midnight: A Regression Discontinuity Design in Length of Postpartum Hospital Stays," NBER Working Paper No. 13877, March 2008.

5. J. Doyle, "Returns to Local-Area Health Care Spending: Using Health Shocks to Patients Far from Home," NBER Working Paper No. 13301, August 2007.

6. J. Doyle, "Health Insurance, Treatment, and Outcomes: Using Auto Accidents as Health Shocks," NBER Working Paper No. 11099, February 2005, and Review of Economics and Statistics, 87(2) (2005), pp. 256-70.

7. Institute of Medicine, Hidden Costs, Value Lost: Uninsurance in America, Washington, D.C.: National Academies Press, 2003, p. 141.

8. J. Doyle, S. Ewer, and T. Wagner, "Returns to Physician Human Capital: Analyzing Patients Randomized to Physician Teams," NBER Working Paper No. 14174, July 2008.

