J Am Coll Cardiol, 2003; 42:1896-1899, doi:10.1016/j.jacc.2003.09.008
© 2003 by the American College of Cardiology Foundation

EDITORIAL COMMENT

Risk assessment for percutaneous coronary intervention: our version of the weather report?*

Donald E. Cutlip, MD, FACC†‡, Kalon K. L. Ho, MD, MSc, FACC†‡, Richard E. Kuntz, MD, MSc, FACC†§ and Donald S. Baim, MD, FACC†§,*

† Harvard Medical School, Boston, Massachusetts, USA
‡ Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA
§ Brigham and Women's Hospital, Boston, Massachusetts, USA

* Reprint requests and correspondence: Dr. Donald S. Baim, Professor of Medicine, Harvard Medical School, and Director, Center for Integration of Medicine and Innovative Technology (CIMIT), Brigham and Women's Hospital, Boston, Massachusetts 02115, USA.
dbaim@partners.org


Human beings have an intrinsic need to extract predictability from apparent chaos, like the weather. Weather reports consequently take various forms: 1) based on all available current meteorological data, what is the estimated probability of rain next Saturday morning (when I am supposed to play golf)? 2) summarizing and comparing trends and variances, how does last month's rainfall compare to what we usually expect for July in Boston? In the world of coronary intervention, the analogous questions are as follows: 1) what is the chance that Mrs. Jones is going to die during her planned procedure or before hospital discharge? 2) how do the outcomes of her interventionalist (Dr. Smith) compare with those of other good interventionalists treating a similar mix of cases (the latter question now known as "score-card" medicine)? Once these models are established and an increased risk is anticipated, it is reasonable to also ask, "given the increased risks in this case, what can I do differently to prevent or attenuate the anticipated complications?"

Even in the earliest years of coronary angioplasty, Andreas Gruntzig established the precedent of collecting detailed demographic and angiographic data as well as short- and long-term procedural outcomes of his patients. As the number of coronary interventions increased in the mid- and late 1980s, such data collection continued and was used to populate databases whose summary outcomes could be examined. The availability of personal computers and powerful statistical tools then allowed these databases to be examined in greater detail to identify correlates of particular outcomes of interest. In fact, the very evolution of modern percutaneous intervention has been driven by the information gained from these clinical databases, whether large single-center experiences, regional registries, medical society or government-sponsored initiatives, or multi-center clinical trials. The extent to which we are able to properly measure this type of data and use our analysis to drive improvements in practice will determine whether the quality of care and the outcome after percutaneous coronary intervention (PCI) will continue to improve.

The study by Qureshi et al. (1) in this issue of the Journal is the latest in that series of efforts that concentrate on defining the most important clinical variables for predicting the single most important adverse outcome: in-hospital death. The facts are familiar: the overall mortality is 1.3%, but there are some factors (acute myocardial infarction [MI], age, multi-vessel disease, and baseline renal dysfunction) whose presence separately or in combination increases the likelihood of death to 30% and whose absence reduces the risk of mortality to 0.2%. What use can we expect to make of this model, and how does it differ from earlier models?


    The role of outcomes databases
 
The value of any such model depends on many factors: the number of patients, the detail and quality control of data collection (baseline, procedural, and outcome), the completeness of ascertainment (particularly of short-term and follow-up outcome events), referral biases that strongly affect outcome (e.g., a regional cardiogenic shock center vs. an elective-only center), the quality of the operators, and the interventional tools available during the data collection period. In a field with rapidly evolving device and pharmacologic treatment strategies, such as interventional cardiology over the past decade, these questions are of great importance. In fact, our current expectations with drug-eluting stents, distal embolic protection, and glycoprotein IIb/IIIa blockers (success >98%, major complications <3%, late recurrence <7%) would seem utterly fantastic to an interventionalist in 1990 or even 1995.

Of equal importance are the statistical tools used to analyze the dataset and a clear understanding of their robustness for the particular forecasting uses that are planned. Published estimates of risk for an individual patient may aid the patient and family in the consenting process or assist the operator in selecting or avoiding specific devices or adjunctive pharmacotherapy. The certainty of this type of prediction, however, is limited if there are differences between the model set and the patients to whom the model is being applied (some differences not being fully captured in standard angiographic and clinical variables) or if there are other statistical issues such as sampling variability and random variability in operator performance. These limitations are of particular concern when the model is going to be used to provide a performance "scorecard" for other operators.


    Developing a risk prediction model
 
The general strategy of the risk-prediction process includes having access to a large, detailed, and relatively contemporaneous dataset and understanding its intrinsic limitations (which variables were collected, whether they were ascertained in all patients, whether data coding used uniform definitions and unbiased collection agents, whether angiographic data were evaluated by the operators or by a core laboratory, and so forth). A classic multivariable model requires that a relatively small number of pre-specified potential risk factors be selected based on clinical logic (too many candidate variables or post hoc selection of such variables increases the risk of type 1 [false positive] results). Because these variables may be related to each other (e.g., congestive heart failure, left ventricular function, previous MI), careful multivariable modeling should then be used to identify which remain as independent predictors after adjustments are made for all other variables in the model.

The currently available models have good utility, but each has some level of limitation (2–9). Several were developed within single centers (6,8,9), specialized centers (4), or particular geographic regions (2,5) and thus may have limited generalizability to other populations. Many years may elapse between the collection of data, analysis, and eventual publication, compromising applicability to contemporary practice by the time the results are available. Moreover, robust models require a large sample size to predict outcomes that occur infrequently. For in-hospital mortality after PCI, a 10,000-patient database with a 1.5% overall mortality has only 150 events, which limits the number of variables it can test (roughly 20 events are required for each variable tested). Most of the databases used for model development and validation have thus included far too few patients for complex models of mortality prediction.
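The sample-size arithmetic in the preceding paragraph can be sketched in a few lines. This is only an illustration of the rule of thumb quoted above (roughly 20 events per variable tested); the function name and its parameters are hypothetical.

```python
def max_candidate_variables(n_patients, event_rate, events_per_variable=20):
    """Return (expected events, number of candidate variables those events support).

    events_per_variable defaults to the rule of thumb quoted in the text.
    """
    events = round(n_patients * event_rate)
    return events, events // events_per_variable


# The example from the text: 10,000 patients with 1.5% in-hospital mortality.
events, n_vars = max_candidate_variables(10_000, 0.015)
print(f"{events} deaths support about {n_vars} candidate variables")
```

Under that rule, a 10,000-patient database supports only a handful of candidate variables, which is why smaller registries cannot sustain complex mortality models.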

The quality of any model must also be measured carefully. A model that is constructed using a given population (the test set) is then validated by testing the model either in another portion of the same database (by jack-knifing or bootstrapping) or in a separate external database (the validation set). These procedures reduce the chance that a detected predictor was due to a unique property of the test set rather than being a robust predictor. The statistical quality of the proposed models is judged by two measures: discrimination and goodness-of-fit. Discrimination is usually measured by the c statistic. This reflects the area under the receiver-operating characteristic curve and thus is a measure of the model's ability to assign true positive outcomes as opposed to false positives. A model with a c statistic of 1 has perfect discrimination, with a false-positive rate of 0% and a true-positive rate or sensitivity of 100%. Logistic regression models with c statistics in the range of 0.80 are usually considered to have high discriminatory ability, but this means that, in roughly 20% of pairings of a patient who had the adverse event with one who did not, the model still assigns the higher risk to the wrong patient. In fact, such a model should be judged against one with no discriminatory ability, which has an area under the receiver-operating characteristic curve (a straight diagonal line) of 0.50! The goodness-of-fit is frequently assessed using the Hosmer-Lemeshow test, which compares the event rate predicted by the model with the observed rate. A p value >0.1 usually indicates that the model provides a good fit for the data and that the differences are not statistically significant, but it does not exclude potentially clinically significant differences between observed and predicted outcomes. Therefore, high scores for discrimination and goodness of fit do not necessarily mean that the model has high predictive accuracy for individual patients. A mortality predictor model for a population of 10,000 patients may thus predict a mortality of 10% for the highest decile, but even with a perfect fit, we do not know which 100 of those 1,000 patients will die.
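One common way to read the c statistic is as a concordance probability: among all pairings of a patient who died with a patient who survived, the fraction in which the model gave the higher predicted risk to the patient who died. The sketch below computes it directly from that definition. The predicted risks and outcomes are invented for illustration and do not come from any of the cited models.

```python
def c_statistic(predicted, died):
    """Fraction of (death, survivor) pairs in which the death had the higher
    predicted risk; tied predictions count as half a concordant pair."""
    events = [p for p, d in zip(predicted, died) if d]
    nonevents = [p for p, d in zip(predicted, died) if not d]
    if not events or not nonevents:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for e in events:
        for s in nonevents:
            if e > s:
                concordant += 1.0
            elif e == s:
                concordant += 0.5
    return concordant / (len(events) * len(nonevents))


# Hypothetical predicted risks and outcomes (1 = died), for illustration only.
risk = [0.02, 0.05, 0.06, 0.30, 0.01, 0.08]
died = [0, 0, 1, 1, 0, 0]
print(c_statistic(risk, died))
```

A model that assigns every patient the same risk scores 0.5 on this measure, which is the "no discriminatory ability" reference point mentioned above.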

The good news is that the available PCI mortality models provide some reassuring features despite these inherent limitations. First, each of the models included in Table 1 does have high discriminatory and goodness-of-fit scores. Second, even though the models represent patients from different eras and various populations, their strongest predictors are remarkably consistent and relate mostly to patient rather than technical variables. This has been substantiated in a recent report from the National Heart, Lung, and Blood Institute (NHLBI) Dynamic Registry, in which three of the five tested models developed in the pre-stent era (New York State, Northern New England Cooperative Group, and Cleveland Clinic Foundation) showed excellent correlation for predicted and observed mortality among patients in the NHLBI database treated between 1997 and 1999 (10). This is somewhat less certain for angiographic lesion factors, however, because many of the lesion characteristics in the American College of Cardiology/American Heart Association classification scheme (e.g., lesion eccentricity) have been eliminated as technology has improved. Also, some of the remaining angiographic variables are actually surrogates for basic clinical variables (e.g., recent total occlusion is a surrogate for acute MI) (8).


 
Table 1 Multivariable Predictors of Mortality in Various Published Interventional Models

 
The model presented by Qureshi et al. (1) is simple enough to use at the bedside, appears to be useful for forecasting procedural risk for some important patient groups, and has high discriminatory ability and calibration. Knowing that a patient is in the highest risk group, whose expected mortality is >10 times higher than the lowest risk group, may be useful in giving patients and families a more refined risk estimate than the routinely quoted "1% mortality" and may assist the operator in making decisions regarding the use of certain therapeutic options.

But knowledge of the most reliable predictors should also allow comparison of outcomes observed for different operators or hospitals to the outcomes expected based on the predictive model. This would ideally allow appropriate and complete adjustment of the treated population for significant differences in baseline risk, and thus allow fair comparisons between operators or hospitals (the rainfall in July question). There are several concerns, however, with using this model for comparing different operators and hospitals in a scorecard fashion. The boundary selected by Qureshi et al. (1) for each of the four variables is arbitrary and does not distinguish among various levels of increased risk. For example, although patients over age 65 are at higher risk, this risk is certainly higher for an 86-year-old than a 66-year-old patient. Likewise, no one would question a higher overall risk for patients with MI within 14 days, but the highest-risk patients would be those being treated for acute MI within 24 h, particularly if they have hemodynamic instability. Similar arguments can be made for the other two variables of creatinine >1.5 mg/dl and multi-vessel disease, which the model considers as unqualified binary variables. This considerable degree of smoothing of the overall risk curve by using these dichotomous cutoffs may lead to a systematic underestimation of the risk for the truly high-risk patient and a significant overestimation of the risk for many other patients, thus failing to adjust adequately for higher- or lower-risk cohorts across operators and hospitals for which the distribution of variable values may differ from the test set.
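The smoothing effect of a dichotomous cutoff can be made concrete with a toy example. The sketch below assumes a smooth age-risk curve whose coefficients are entirely invented (they are not taken from Qureshi et al. or any other model) and compares it with what a binary "age >65" variable would assign, namely one average risk for everyone on the same side of the cutoff.

```python
import math


def continuous_risk(age):
    """Hypothetical smooth age-risk curve; coefficients invented for illustration."""
    return 1.0 / (1.0 + math.exp(-(-7.5 + 0.06 * age)))


def dichotomized_risk(age, ages, cutoff=65):
    """Risk a binary 'age > cutoff' variable assigns: the mean risk of all
    patients on the same side of the cutoff as this patient."""
    group = [a for a in ages if (a > cutoff) == (age > cutoff)]
    return sum(continuous_risk(a) for a in group) / len(group)


ages = list(range(40, 91))  # hypothetical cohort, one patient per year of age
for age in (66, 86):
    print(age, round(continuous_risk(age), 3), round(dichotomized_risk(age, ages), 3))
```

In this toy cohort the 66-year-old and the 86-year-old receive the same dichotomized risk, which overstates the former and understates the latter relative to the assumed curve. An operator whose practice is weighted toward the oldest patients would therefore be under-adjusted.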


    Risk adjustment models, scorecards, and quality improvement
 
Although such simplified risk predictor models for scoring individual patients may have limited utility beyond what an experienced clinician can surmise using even less complex methods of clinical assessment, the future for true risk-adjustment models appears much brighter. Keeping performance scores for individual operators or, more commonly, for institutions has been of increasing interest over the past 10 to 15 years, following the lead of cardiac surgery. Even though such scoring systems are initiated for the purpose of quality assessment and improvement, public reporting of results and the dissemination of provider rankings add an element of fear and anxiety for many providers if data unadjusted for risk are published, as they were by Medicare for coronary artery bypass grafting surgery in 1987. This underscores the importance of using the most refined and scientifically valid methods for risk adjustment.

Unfortunately, even the best and most sophisticated multiple regression models developed for the purpose of risk adjustment have serious deficiencies and limitations, as discussed previously. Moreover, even the best models cannot compensate for the smaller sample sizes present at the institution or operator level and the associated statistical uncertainty. The wide resulting confidence intervals make it virtually impossible to provide any meaningful estimation of appropriateness of outcome for the low-volume operator or institution. A low-volume operator may look very good or very bad depending on how his or her last case went, and such models cannot fully correct for all confounders of risk in a small sample size. There are additional problems, such as failure to account for sampling variability, unmeasured confounding, and random variability (noise) between operators that are not fully correctable by any model, so that when resulting data are disclosed publicly, any risk-adjustment effort must be viewed as imperfect rather than as a true leveling of the playing field.
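The width of the confidence interval at low volume is easy to illustrate. The sketch below uses the Wilson score interval for a proportion, which is one standard choice and not necessarily the method used in any of the cited reports; the case counts are hypothetical.

```python
import math


def wilson_interval(deaths, cases, z=1.96):
    """Approximate 95% Wilson score interval for an observed mortality proportion."""
    p = deaths / cases
    denom = 1.0 + z * z / cases
    centre = (p + z * z / (2 * cases)) / denom
    half = z * math.sqrt(p * (1 - p) / cases + z * z / (4 * cases * cases)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical volumes: a low-volume operator before and after one more death,
# and a high-volume institution with a similar underlying rate.
for deaths, cases in [(1, 75), (2, 75), (40, 3000)]:
    low, high = wilson_interval(deaths, cases)
    print(f"{deaths}/{cases}: {low:.2%} to {high:.2%}")
```

For the low-volume operator a single additional death doubles the observed rate, yet both rates sit comfortably inside each other's interval, so the data cannot distinguish a good year from a bad one.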

Given these significant problems with multiple regression models, Shahian et al. (11) have suggested the use of hierarchical or random-effect models for risk adjustment among cardiac surgery providers. The hierarchical models reduce the overly optimistic precision estimates by attempting to adjust for confounding by variances in treatment decisions between physicians and patients in the predictor dataset. Accounting for random operator effects dampens variability toward the mean and thereby provides more reliable estimates (12). Although these models are much more complex, they are not beyond the capacity of groups involved in the risk-adjustment exercise.
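The dampening of variability toward the mean can be sketched with a deliberately simplified stand-in for a hierarchical model. Here each operator's observed rate is blended with the overall rate using a fixed weight, `prior_cases`, which is a hypothetical assumption; a true random-effects model would estimate the degree of shrinkage from the data and would also adjust for patient risk. The operator counts are invented.

```python
def shrunken_rate(deaths, cases, overall_rate, prior_cases=200):
    """Pull an operator's observed rate toward the overall rate.

    prior_cases is a hypothetical weight: the overall rate counts as if it were
    that many additional cases, so low-volume operators are pulled harder.
    """
    return (deaths + prior_cases * overall_rate) / (cases + prior_cases)


operators = {"A": (3, 50), "B": (0, 60), "C": (30, 2000)}  # hypothetical (deaths, cases)
total_deaths = sum(d for d, _ in operators.values())
total_cases = sum(n for _, n in operators.values())
overall = total_deaths / total_cases

for name, (d, n) in operators.items():
    print(name, round(d / n, 4), round(shrunken_rate(d, n, overall), 4))
```

The two low-volume operators, whose raw rates look alarming and perfect respectively, both move substantially toward the overall rate, whereas the high-volume operator's estimate barely changes.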

In summary, the objective for any risk-prediction or adjustment tool should be to foster continuous quality improvement. Although simple bedside scoring as proposed by Qureshi et al. (1) may be of some use for classifying patients into broad risk categories, the ramifications of bona-fide risk adjustment demand more complex systems. Public presentation of the results must be undertaken cautiously and with adequate explanation of limitations to avoid unnecessary punitive components that might lead to gaming of the system (e.g., by avoiding high-risk cases, which may deny benefit to the patients with the most to gain from a high-quality procedure). Although the tracking of performance scores within individual centers and the comparisons with regional or national standards are desirable, those centers should also implement the minimum volume standards that have been shown to be reasonable, if not perfect, surrogates for performance quality (4,13). Finally, it is not clear whether mortality is the appropriate outcome measure, given its low frequency and the increasing difficulty in predicting risk as the frequency of the studied event diminishes. Other ways of measuring the success of a procedure and sound judgment, rather than the natural history of an acute illness, may be more useful. Physician-led continuous quality improvement initiatives that include the reporting of specified measurements of the success of a procedure have been effective in cardiac surgery (14). Regardless of the statistical methods used, however, the goal of continuous quality improvement is essential to our delivering the brightest forecast for the safety of our interventional cardiology patients.


    Footnotes
 
* Editorials published in the Journal of the American College of Cardiology reflect the views of the authors and do not necessarily represent the views of JACC or the American College of Cardiology.


    References
 
1. Qureshi MA, Safian RD, Grines CL, et al. Simplified scoring system for predicting mortality after percutaneous coronary intervention. J Am Coll Cardiol. 2003;42:1890–1895.

2. Hannan EL, Arani DT, Johnson LW, Kemp HG Jr., Lukacik G. Percutaneous transluminal coronary angioplasty in New York State. Risk factors and outcomes. JAMA. 1992;268:3092–3097.

3. Kimmel SE, Berlin JA, Strom BL, Laskey WK. Development and validation of simplified predictive index for major complications in contemporary percutaneous transluminal coronary angioplasty practice. The Registry Committee of the Society for Cardiac Angiography and Interventions. J Am Coll Cardiol. 1995;26:931–938.

4. Ellis SG, Weintraub W, Holmes D, Shaw R, Block PC, King SB 3rd. Relation of operator volume and experience to procedural outcome of percutaneous coronary revascularization at hospitals with high interventional volumes. Circulation. 1997;95:2479–2484.

5. O'Connor GT, Malenka DJ, Quinton H, et al. Multivariate prediction of in-hospital mortality after percutaneous coronary interventions in 1994–1996. Northern New England Cardiovascular Disease Study Group. J Am Coll Cardiol. 1999;34:681–691.

6. Moscucci M, Kline-Rogers E, Share D, et al. Simple bedside additive tool for prediction of in-hospital mortality after percutaneous coronary interventions. Circulation. 2001;104:263–268.

7. Shaw RE, Anderson HV, Brindis RG, et al. Development of a risk adjustment mortality model using the American College of Cardiology-National Cardiovascular Data Registry (ACC-NCDR) experience: 1998–2000. J Am Coll Cardiol. 2002;39:1104–1112.

8. Ellis SG, Guetta V, Miller D, Whitlow PL, Topol EJ. Relation between lesion characteristics and risk with percutaneous intervention in the stent and glycoprotein IIb/IIIa era: an analysis of results from 10,907 lesions and proposal for new classification scheme. Circulation. 1999;100:1971–1976.

9. Resnic FS, Ohno-Machado L, Selwyn A, Simon DI, Popma JJ. Simplified risk score models accurately predict the risk of major in-hospital complications following percutaneous coronary intervention. Am J Cardiol. 2001;88:5–9.

10. Holmes DR, Selzer F, Johnston JM, et al. Modeling and risk prediction in the current era of interventional cardiology: a report from the National Heart, Lung, and Blood Institute Dynamic Registry. Circulation. 2003;107:1871–1876.

11. Shahian DM, Normand SL, Torchiana DF, et al. Cardiac surgery report cards: comprehensive review and statistical critique. Ann Thorac Surg. 2001;72:2155–2168.

12. Grunkemeier GL, Zerr KJ, Jin R. Cardiac surgery report cards: making the grade. Ann Thorac Surg. 2001;72:1845–1848.

13. Hannan EL, Racz M, Ryan TJ, et al. Coronary angioplasty volume-outcome relationships for hospitals and cardiologists. JAMA. 1997;277:892–898.

14. Ferguson TB Jr., Peterson ED, Coombs LP, et al. Use of continuous quality improvement to increase use of process measures in patients undergoing coronary artery bypass graft surgery: a randomized controlled trial. JAMA. 2003;290:49–56.



