Classification versus association models: should the same methods apply?

Publication Type: Journal Article


Authors: Feng, Ziding


Source: Scandinavian Journal of Clinical and Laboratory Investigation. Supplementum, Volume 242, pp. 53-58 (2010)


Keywords: 2010, Algorithms, Biological Markers, Center-Authored Paper, Epidemiologic Methods, False Negative Reactions, False Positive Reactions, Humans, Laboratory Techniques and Procedures, Logistic Models, Models, Statistical, Odds Ratio, Predictive Value of Tests, Prognosis, Proportional Hazards Models, Public Health Sciences Division, Risk Factors, ROC Curve, Sensitivity and Specificity


Association and classification models differ fundamentally in their objectives, their measurements, and how specific they are to a clinical context. Association studies aim to identify associations between biomarkers and disease in a study population and to provide etiologic insights. Common measures of association are the odds ratio, the hazard ratio, and the correlation coefficient. Classification studies aim to evaluate how well a biomarker aids specific clinical decisions for individual patients. Common measures of classification performance are sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Good association is usually a necessary, but not a sufficient, condition for good classification. Methods for developing classification models have mainly borrowed the criteria used for association models, usually minimizing total classification error without considering the clinical application setting, and are therefore not optimal for classification purposes. We suggest that developing classification models by focusing on the region of the receiver operating characteristic (ROC) curve relevant to the intended clinical application optimizes the model for that setting.
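The distinction drawn in the abstract can be illustrated with a small numerical sketch. The Python code below is not taken from the paper; the function names, the example numbers, and the toy scores are all assumptions chosen for illustration. It shows (1) how modest the sensitivity implied by a given odds ratio can be at a fixed false positive rate, (2) how PPV and NPV depend on prevalence as well as on sensitivity and specificity, and (3) one possible way to summarize only the clinically relevant region of an ROC curve, here a trapezoidal partial area under the curve up to a maximum false positive rate.

```python
def tpr_from_odds_ratio(odds_ratio, fpr):
    """Sensitivity of a binary marker implied by its odds ratio and
    false positive rate: OR = [TPR/(1-TPR)] / [FPR/(1-FPR)]."""
    odds = odds_ratio * fpr / (1.0 - fpr)
    return odds / (1.0 + odds)


def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence
    using Bayes' rule."""
    tp = sensitivity * prevalence
    fn = (1.0 - sensitivity) * prevalence
    fp = (1.0 - specificity) * (1.0 - prevalence)
    tn = specificity * (1.0 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


def roc_points(case_scores, control_scores):
    """Empirical ROC curve as a list of (FPR, TPR) points, calling a
    subject positive when score >= threshold."""
    thresholds = sorted(set(case_scores) | set(control_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        fpr = sum(s >= t for s in control_scores) / len(control_scores)
        tpr = sum(s >= t for s in case_scores) / len(case_scores)
        points.append((fpr, tpr))
    return points


def partial_auc(points, max_fpr):
    """Trapezoidal area under the ROC curve restricted to FPR in [0, max_fpr]."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Linearly interpolate the curve at the FPR cutoff.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# An odds ratio of 3 at a 10% false positive rate implies only 25% sensitivity.
print(round(tpr_from_odds_ratio(3.0, 0.10), 4))  # 0.25

# Illustrative values: 90% sensitivity and specificity at 1% prevalence.
ppv, npv = predictive_values(0.90, 0.90, 0.01)
print(round(ppv, 4), round(npv, 4))  # 0.0833 0.9989

# Hypothetical marker scores, invented for this example only.
cases = [0.9, 0.8, 0.4, 0.35]
controls = [0.7, 0.3, 0.2, 0.1]
curve = roc_points(cases, controls)
print(partial_auc(curve, 0.25))  # area over the low-FPR region only
```

In a setting such as population screening, where only very low false positive rates are acceptable, a restricted summary like `partial_auc(curve, max_fpr)` compares markers on the part of the curve that matters for the decision, whereas total classification error or the full area weights all regions of the curve equally.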