Applying Logistic Regression Model to The Second Primary Cancer Data.

by Amr I. Abdelrahman.

Abstract: A logistic regression model is used to determine the sociodemographic risk factors that affect the occurrence of a second cancer in 200 patients who were initially treated for stage I first primary cancer and were cancer-free for at least 1 year after that treatment. The 200 patients were classified as "having a second cancer" or "not having a second cancer". The sociodemographic risk factors considered are age at first cancer, gender, area of residence, marital status, family history, smoking, education, and obesity, in addition to treatment by radiation. The binary logistic regression model is used to estimate the probability of a second cancer occurring and to identify the risk factors that significantly affect its occurrence. The odds ratio analysis compares whether the probability of having a second primary cancer is the same across the groups of each covariate. Significance tests of the logistic coefficients using the Wald test and the likelihood ratio test show that five risk factors are significant. The Hosmer and Lemeshow test is used to assess the goodness of fit of the model. The logistic regression model proved to have low sensitivity, which is attributed to clinical risk factors not considered in this study.
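
To make the odds ratio and Wald test mentioned in the abstract concrete, the following is a minimal sketch for the simplest case of a single binary risk factor, where the logistic coefficient equals the log odds ratio of the 2x2 table. The counts, the function names, and the choice of smoking as the example covariate are illustrative assumptions; they are not data or results from the article, which fits a model with several covariates.

```python
import math


def odds_ratio_wald(a, b, c, d):
    """Odds ratio and Wald test for a 2x2 table.

    a = exposed, second cancer      b = exposed, no second cancer
    c = unexposed, second cancer    d = unexposed, no second cancer
    With one binary covariate, the maximum-likelihood logistic
    coefficient is log(odds ratio) and its standard error is
    sqrt(1/a + 1/b + 1/c + 1/d).
    """
    odds_ratio = (a * d) / (b * c)
    beta = math.log(odds_ratio)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = beta / se  # Wald statistic
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    ci_95 = (math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se))
    return {"odds_ratio": odds_ratio, "beta": beta, "se": se,
            "z": z, "p_value": p_value, "ci_95": ci_95}


def predicted_probability(intercept, beta, x):
    """Logistic model: P(second cancer | x) = 1 / (1 + exp(-(b0 + b1*x)))."""
    return 1.0 / (1.0 + math.exp(-(intercept + beta * x)))


# Hypothetical counts for 200 patients (NOT taken from the study).
a, b, c, d = 30, 50, 20, 100
result = odds_ratio_wald(a, b, c, d)

# The intercept is the log odds of the event in the unexposed group.
intercept = math.log(c / d)
p_exposed = predicted_probability(intercept, result["beta"], 1)

print("odds ratio:", result["odds_ratio"])
print("Wald z:", round(result["z"], 3), "p-value:", round(result["p_value"], 4))
print("P(second cancer | exposed):", round(p_exposed, 3))
```

With these made-up counts the odds ratio is 3, and the fitted probability for the exposed group reproduces the observed proportion 30/80. A model with several covariates would be fitted numerically with a statistics package instead.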

Key Words: Logistic regression model; Wald test; Odds ratio; Cross-validation; ROC curve; Second primary cancer

Editor: Ahmed Youssef

READING THE ARTICLE: You can read the article in portable document (.pdf) format (246009 bytes).

NOTE: The content of this article is the intellectual property of the author, who retains all rights to future publication.

Return to the InterStat Home Page.