An Economical Sample Size Determination Algorithm for Clinical Data Statistical Analysis
by Hassan Assareh, Mary Waterhouse, Ian Smith, Russell Brighouse, Kelley Foster, and Kerrie Mengersen.
For most data analysis problems, sample size formulae are constructed by focusing on statistical characteristics rather than economic constraints. When performing a complicated statistical analysis involving clinical data, such as risk model construction, choosing a sample size that simultaneously satisfies statistical requirements (accuracy and precision) and economic requirements (cost of data inspection and error modification) is non-trivial. This research presents a general data capturing algorithm that addresses this issue. It uses Value of Information theory from a Bayesian decision-making context and the concept of Utility. We propose a customized version of the algorithm to determine an appropriate sample size for risk model construction using logistic regression, and then apply it to the calibration of the Acute Physiology and Chronic Health Evaluation II (APACHE II), a disease severity scoring system used in intensive care units, under various utility scenarios. We also outline extensions that could be made to the framework and techniques.
Bayesian Statistics, Data Quality, Logistic Regression, Modification Cost, Optimization, Risk Model, Sample Size, Utility, Value of Information
Hassan Assareh, firstname.lastname@example.org
Mary Waterhouse, email@example.com
Ian Smith, firstname.lastname@example.org
Russell Brighouse, email@example.com
Kelley Foster, firstname.lastname@example.org
Kerrie Mengersen, email@example.com
James Knaub, James.Knaub@eia.gov
READING THE ARTICLE: You can read the article in portable document (.pdf) format (375506 bytes).
NOTE: The content of this article is the intellectual property of the authors, who retain all rights to future publication.
This page has been accessed 1288 times since JUNE 18, 2013.