Some Simplifications for the Expectation-Maximization (EM) Algorithm: The Linear Regression Model Case

by Daniel A. Griffith.

Abstract: The EM algorithm is a generic tool that offers maximum likelihood solutions when datasets are incomplete with data values missing at random or completely at random. At least for its simplest form, the algorithm can be rewritten in terms of an ANCOVA regression specification. This formulation allows several analytical results to be derived that permit the EM algorithm solution to be expressed in terms of new observation predictions and their variances. Implementations can be made with a linear regression or a nonlinear regression model routine, allowing missing value imputations, even when they must satisfy constraints. Thirteen example datasets gleaned from the EM algorithm literature are reanalyzed. Imputation results have been verified with SAS PROC MI. Five theorems are proved that broadly contextualize imputation findings in terms of the theory, methodology, and practice of statistical science.
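The abstract's central claim, that the EM solution for missing values can be expressed as new-observation predictions from a regression fit, can be illustrated with a minimal sketch. This is a toy example with hypothetical data, not code from the paper: for the simple case of missing responses in a linear regression, the E-step replaces each missing y with its fitted value and the M-step re-estimates the coefficients, and the fixed point coincides with the complete-case OLS fit.

```python
import numpy as np

# Hypothetical toy data: y = 2 + 3x plus noise, with two responses missing.
rng = np.random.default_rng(42)
x = np.arange(10, dtype=float)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=10)
missing = np.array([3, 7])            # indices of the missing y values
x_obs = np.delete(x, missing)
y_obs = np.delete(y, missing)

# EM for missing responses in a linear model:
#   E-step: replace each missing y with its conditional expectation (fitted value)
#   M-step: re-estimate the regression coefficients from the completed data
beta = np.polyfit(x_obs, y_obs, 1)    # initialize from the complete cases
for _ in range(20):
    y_fill = y.copy()
    y_fill[missing] = np.polyval(beta, x[missing])   # E-step
    beta = np.polyfit(x, y_fill, 1)                  # M-step

# Because the imputed points lie exactly on the fitted line, refitting leaves the
# coefficients unchanged: the imputations are simply new-observation predictions.
print(beta)
print(y_fill[missing])
```

The loop converges immediately here, since points imputed at their fitted values contribute zero residual; this mirrors the paper's point that, in the simplest case, the EM solution reduces to prediction for new observations.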

Key Words: EM algorithm, missing value, imputation, new observation prediction, prediction error

Author:
Daniel A. Griffith, dagriffith@utdallas.edu

Editor: Sapra, Sunil K., ssapra@exchange.calstatela.edu

READING THE ARTICLE: You can read the article in portable document format (.pdf, 216,527 bytes).

NOTE: The content of this article is the intellectual property of the author, who retains all rights to future publication.


