Semiparametric estimation exploiting covariate independence in two-phase randomized trials.

Publication Type:

Journal Article


Biometrics, Volume 65, Issue 1, p.178-87 (2009)


2009, Algorithms, Biological Markers, Biometry, Center-Authored Paper, Data Interpretation, Statistical, Humans, Models, Theoretical, Public Health Sciences Division, Randomized Controlled Trials as Topic, Treatment Outcome


Recent results for case-control sampling suggest that when the covariate distribution is constrained by gene-environment independence, semiparametric estimation exploiting such independence yields substantial efficiency gains. We consider the efficient estimation of the treatment-biomarker interaction in two-phase sampling nested within randomized clinical trials, incorporating the independence between a randomized treatment and the baseline markers. We develop a Newton-Raphson algorithm based on the profile likelihood to compute the semiparametric maximum likelihood estimate (SPMLE). Our algorithm accommodates both continuous phase-one outcomes and continuous phase-two biomarkers. The profile information matrix is computed explicitly via numerical differentiation. In certain situations where computing the SPMLE is slow, we propose a maximum estimated likelihood estimator (MELE), which is also capable of incorporating the covariate independence. This estimated likelihood approach uses a one-step empirical covariate distribution and is thus straightforward to maximize. It offers a closed-form variance estimate with a limited increase in variance relative to the fully efficient SPMLE. Our results suggest that exploiting the covariate independence in two-phase sampling increases efficiency substantially, particularly for estimating treatment-biomarker interactions.
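
The abstract states that the profile information matrix is obtained by numerical differentiation. The following is a minimal sketch of that general idea only, not the authors' implementation: it approximates the information matrix as the negative Hessian of a log-likelihood using central finite differences. The names `numerical_information` and `toy_loglik`, the step size `h`, and the quadratic toy function (standing in for a real profile log-likelihood) are all assumptions made for illustration.

```python
def numerical_information(loglik, theta, h=1e-4):
    """Approximate the information matrix as the negative Hessian of
    `loglik` at `theta`, using central finite differences.

    `loglik` is any function mapping a list of floats to a float.
    In practice it would be a profile log-likelihood; here it is
    left generic (hypothetical interface, assumed for illustration).
    """
    p = len(theta)
    info = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i, p):
            def shifted(di, dj):
                # Evaluate loglik with coordinates i and j perturbed
                # by di*h and dj*h (both shifts stack when i == j).
                t = list(theta)
                t[i] += di * h
                t[j] += dj * h
                return loglik(t)

            second = (
                shifted(1, 1) - shifted(1, -1)
                - shifted(-1, 1) + shifted(-1, -1)
            ) / (4.0 * h * h)
            # Information is minus the second derivative; fill symmetrically.
            info[i][j] = info[j][i] = -second
    return info


def toy_loglik(t):
    # Quadratic stand-in for a log-likelihood (an assumption, not the
    # model from the paper). Its exact Hessian is [[-2, -1], [-1, -6]].
    return -(t[0] ** 2 + 3.0 * t[1] ** 2 + t[0] * t[1])


info = numerical_information(toy_loglik, [0.5, -0.2])
```

For this toy function the exact information matrix is [[2, 1], [1, 6]], so the finite-difference result can be checked against a known answer. With a real profile log-likelihood, the step size would need to be tuned to balance truncation and rounding error.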