Building a Patient-Specific Risk Score with a Large Database of Discharge Summary Reports.

Publication Type:

Journal Article


Medical science monitor : international medical journal of experimental and clinical research, Volume 22, p.2097-104 (2016)


BACKGROUND There is increasing interest in clinical research with electronic medical data, but such research often faces the challenge of heterogeneity between hospitals. Our objective was to develop a single numerical score for characterizing this heterogeneity by estimating inpatient mortality among acute myocardial infarction (AMI) patients from the diagnostic information recorded in a database of Discharge Summary Reports (DSRs).

MATERIAL AND METHODS Using 4 216 135 DSRs from 49 tertiary hospitals in Beijing from 2006 to 2010, more than 200 secondary diagnoses were identified as candidates for developing a risk score for AMI (n=50 531). The risk score was independently validated with 21 571 DSRs from 65 tertiary hospitals in 2012. The c-statistic of the new risk score was computed as a measure of discrimination and was compared with that of the Charlson comorbidity index (CCI) and its adaptations for further validation.

RESULTS We ultimately identified and weighted 22 secondary diagnoses using a logistic regression model. The novel risk score performed better than the widely used CCI in predicting in-hospital mortality of AMI patients, including in the external validation (c-statistics: 0.829, 0.832, and 0.824 vs. 0.775, 0.773, and 0.710 in the training, testing, and validation datasets, respectively).

CONCLUSIONS The new risk score developed from DSRs outperforms existing administrative-data-based comorbidity indices when applied to healthcare data from China. The risk score can be used to adjust for heterogeneity between hospitals when clinical data from multiple hospitals are combined.
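The c-statistic used above as the measure of discrimination can be illustrated with a minimal sketch. It is the proportion of (death, survivor) patient pairs in which the patient who died received the higher risk score, with ties counted as one half. The function name `c_statistic` and the toy scores and outcomes below are hypothetical and are not taken from the study; this is an illustration of the general definition, not the authors' implementation.

```python
from itertools import product


def c_statistic(scores, died):
    """Concordance (c-statistic) for a binary outcome.

    scores: risk score per patient (higher means higher predicted risk).
    died:   1 if the patient died in hospital, 0 otherwise.
    """
    pos = [s for s, d in zip(scores, died) if d == 1]  # scores of patients who died
    neg = [s for s, d in zip(scores, died) if d == 0]  # scores of survivors
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    # Each (death, survivor) pair contributes 1 if concordant, 0.5 if tied, 0 otherwise.
    concordant = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return concordant / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only (not from the DSR database).
toy_scores = [5, 3, 3, 1, 0, 4]
toy_died = [1, 1, 0, 0, 0, 0]
print(c_statistic(toy_scores, toy_died))
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation of deaths from survivors, which is the scale on which the reported values for the new score and the CCI are compared.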