Journal of Biometrics and Biostatistics

Volume 6, Issue 1 (2015)

Review Article

Count Data Analysis in Randomised Clinical Trials

Jakobsen JC, Tamborrino M, Winkel P, Haase N, Perner A, Wetterslev J and Gluud C

Choosing the best model for analysis of count data in randomised clinical trials is complicated. In this paper, we review count data analysis with different parametric and non-parametric methods used in randomised clinical trials, and we define procedures for choosing between the two methods and their subtypes. We focus on analysis of simple count data and do not consider methods for analyzing longitudinal count data or Bayesian statistical analysis. We recommend that: (1) a detailed statistical analysis plan is published prior to access to trial data; (2) if there is lack of evidence for a parametric model, both non-parametric tests (either the van Elteren test or the Tadap2 test, based on an aligned rank test with equal stratum weights) and bootstrapping should be used as default methods; and (3) if more than two intervention groups are compared, then the Kruskal–Wallis test may be considered. If our recommendations are followed, the risk of biased results ensuing from analysis of count data in randomised clinical trials is expected to decrease.
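As a rough illustration of the bootstrapping mentioned in recommendation (2), the following sketch computes a percentile bootstrap confidence interval for the difference in mean counts between two groups. It is a minimal example under stated assumptions, not the authors' procedure: the function name, the counts, and the number of resamples are hypothetical, and the sketch ignores stratification, which the van Elteren test is designed to handle.

```python
import random
from statistics import mean


def bootstrap_mean_difference(group_a, group_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(group_a) - mean(group_b)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping the group sizes.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(mean(resample_a) - mean(resample_b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return mean(group_a) - mean(group_b), lower, upper


# Hypothetical count outcomes (for example, events per participant).
control = [0, 1, 0, 2, 5, 0, 1, 3, 0, 7]
treated = [0, 0, 1, 0, 2, 0, 0, 1, 0, 3]
estimate, ci_lower, ci_upper = bootstrap_mean_difference(control, treated)
print(estimate, ci_lower, ci_upper)
```

The percentile interval is only one of several bootstrap intervals; a statistical analysis plan would need to fix the variant and the number of resamples in advance.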

Research Article

Detection of Thyroid Cancer Clusters in Algeria

Oumelkheir Moussi, Boudrissa N, Bouakline S, Semrouni M and Hasbellaoui F

In the mid-seventies, the Pierre and Marie Curie Center of Algiers (CPMC) was the only structure at the national level caring for patients with thyroid cancer; it recorded 15 to 20 cases per year. Today, although Algeria has ten structures for the management of these patients, CPMC recorded more than 100 new cases of thyroid cancer per year over the period 2007-2011. This disease is the third most common cancer among women. These observations lead us to ask several questions: Are there wilayas with an excessive number of cancer cases? Is the concentration of cases abnormal? Is the spatial distribution of these cases random? Answering these questions requires describing the spatial heterogeneity, that is, identifying potential spatial clusters. A cluster is a spatial organization defined as an aggregation of the nearest cases, with proximity defined in terms of geographical distance. Various statistical methods have been used to study spatial heterogeneity: global methods, for the detection of clusters, the study of spatial correlation, and the detection of cases tending toward "clustering"; and local methods, for identifying clusters of cases that are inconsistent with the null hypothesis of no clusters and for evaluating their significance level. To study the spatial distribution of the disease, several tests were applied, such as the Pearson test, Moran's index, the Tango test, and the Potthoff-Whittinghill test, to check the hypothesis of constant risk of thyroid cancer incidence and to study the tendency toward "clustering". Next, "two-step clustering" was used to localize the clusters, and Kulldorff's scan test was used to identify potential clusters and to confirm and assess their significance.
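Of the global statistics listed above, Moran's index is the simplest to write down. The sketch below computes it as I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where W is the sum of all weights. The four regions, their values, and the adjacency weights are hypothetical and are not taken from the Algerian data.

```python
def morans_i(values, weights):
    """Global Moran's I for regional values and a spatial weight matrix."""
    n = len(values)
    mean_value = sum(values) / n
    deviations = [v - mean_value for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted sum of cross-products of deviations over all ordered pairs.
    cross_products = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    squared_deviations = sum(d * d for d in deviations)
    return (n / total_weight) * (cross_products / squared_deviations)


# Hypothetical example: four regions in a row, each adjacent to its neighbours.
incidence = [1.0, 2.0, 3.0, 4.0]
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i(incidence, adjacency))
```

A positive value suggests that neighbouring regions tend to have similar values. Assessing significance would additionally require a permutation or asymptotic test, which this sketch omits.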

Review Article

A Comparison of Six Methods for Missing Data Imputation

Peter Schmitt, Jonas Mandel and Mickael Guedj

Missing data are part of almost all research and introduce an element of ambiguity into data analysis. It follows that we need to handle them appropriately in order to provide an efficient and valid analysis. In the present study, we compare six imputation methods: mean, K-nearest neighbors (KNN), fuzzy K-means (FKM), singular value decomposition (SVD), Bayesian principal component analysis (bPCA) and multiple imputation by chained equations (MICE). The comparison was performed on four real datasets of various sizes (from 4 to 65 variables), under a missing completely at random (MCAR) assumption, and based on four evaluation criteria: root mean squared error (RMSE), unsupervised classification error (UCE), supervised classification error (SCE) and execution time. Our results suggest that bPCA and FKM are two imputation methods of interest that deserve further consideration in practice.
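To make the RMSE criterion concrete, the sketch below applies mean imputation, the simplest of the six methods, to a single variable and scores it against the known true values at the masked positions. The data, the mask, and the function names are hypothetical; the study's own datasets and implementations are not reproduced here.

```python
import math


def mean_impute(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill_value = sum(observed) / len(observed)
    return [fill_value if v is None else v for v in values]


def rmse_on_missing(true_values, imputed_values, missing_mask):
    """RMSE computed only over the positions that were masked out."""
    squared_errors = [
        (t - i) ** 2
        for t, i, is_missing in zip(true_values, imputed_values, missing_mask)
        if is_missing
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical complete variable and a mask chosen by hand for illustration.
complete = [2.0, 4.0, 6.0, 8.0, 10.0]
mask = [False, True, False, False, True]
with_gaps = [None if is_missing else v for v, is_missing in zip(complete, mask)]

imputed = mean_impute(with_gaps)
print(rmse_on_missing(complete, imputed, mask))
```

Under an MCAR assumption the mask would be drawn at random rather than chosen by hand, and the procedure would be repeated over many masks.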

Research Article

Survival Analysis of Premature Infants Admitted to Neonatal Intensive Care Unit (NICU) in Northwest Ethiopia using Semi-Parametric Frailty Model

Sheferaw Yehuala, Salie Ayalew and Zinabu Teka

In this research, the Cox proportional hazards model and the semi-parametric gamma frailty model were compared on the survival of premature infants admitted to a neonatal intensive care unit from December 29, 2011 to April 6, 2014. A retrospective study design was used to collect the data from patient charts. A frailty effect (θ = 0.252, P-value = 0.0031 < α = 0.05) was obtained from the semi-parametric gamma frailty model, and mortality was dependent within and across categories of premature infants based on their gestational age. The frailty values were dispersed and hence induce greater heterogeneity in the infant hazards. Therefore, when there is heterogeneity, the semi-parametric gamma frailty model could be used and lead to acceptable conclusions. Both models identify antenatal care visits, gravidity of 6-10, HIV status of the mother, respiratory distress syndrome, prenatal asphyxia, anemia and breastfeeding initiation as the most important determinants, statistically associated with time to death of premature infants admitted to the NICU. Based on the model comparison analysis, the semi-parametric gamma frailty model fitted the data best.

Short Communication

Comparison of Horvitz and Thompson Estimator with that of Rao, Hartley and Cochran Estimator in PPS without Replacement Scheme

Anieting AE

This paper compares the Horvitz and Thompson estimator of the population total with the Rao, Hartley and Cochran estimator under a PPS without replacement scheme when a sample of size six is taken from the same finite population. The data used were from the Nigerian Bureau of Statistics bulletin. The results showed that the variances of both estimators were positive, but the variance of the Rao, Hartley and Cochran estimator was smaller, making it the better estimator.
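For orientation, the two point estimators of the population total can be sketched as follows. The Horvitz-Thompson estimator sums y_i / pi_i over the sample, where pi_i is the inclusion probability. The Rao-Hartley-Cochran estimator sums y_i * P_g / p_i over the random groups, where p_i is the selection probability of the unit drawn from group g and P_g is the sum of the selection probabilities in that group. All numbers below are hypothetical and are not the Nigerian data, and the variance estimators compared in the paper are not shown.

```python
def horvitz_thompson_total(sample_values, inclusion_probs):
    """Horvitz-Thompson estimate of the total: sum of y_i / pi_i."""
    return sum(y / pi for y, pi in zip(sample_values, inclusion_probs))


def rao_hartley_cochran_total(sample_values, selection_probs, group_prob_sums):
    """RHC estimate of the total: sum of y_i * P_g / p_i, one unit per group."""
    return sum(
        y * group_sum / p
        for y, p, group_sum in zip(sample_values, selection_probs, group_prob_sums)
    )


# Hypothetical sample of three units with their first-order inclusion probabilities.
print(horvitz_thompson_total([10.0, 20.0, 30.0], [0.5, 0.5, 0.75]))

# Hypothetical RHC setup: two random groups, one unit selected from each.
print(rao_hartley_cochran_total([10.0, 20.0], [0.1, 0.2], [0.4, 0.6]))
```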

Research Article

A Stochastic Segmentation Model for Recurrent Copy Number Alteration Analysis

Haipeng Xing and Ying Cai

Recurrent DNA copy number alterations (CNAs) are key genetic events in the study of human genetics and disease. Analysis of recurrent DNA CNA data often involves inferring the true signal levels of individual samples and the cross-sample recurrent regions at each location. For the analysis of multiple-sample CNA data, we propose a new stochastic segmentation model and an associated inference procedure that has attractive statistical and computational properties. An important feature of our model is that it yields explicit formulas for posterior probabilities of recurrence at each location, which can be used to estimate the recurrent regions directly. We propose an approximation method whose computational complexity is only linear in sequence length, which makes our model applicable to data of higher density. Simulation studies and analyses of an ovarian cancer dataset with 15 samples and a lung cancer dataset with 10 samples are conducted to illustrate the advantage of the proposed model.

Research Article

Relative Likelihood Differences to Examine Asymptotic Convergence: A Bootstrap Simulation Approach

Milan Bimali and Michael Brimacombe

Maximum likelihood estimators (mle) and their large sample properties are extensively used in descriptive as well as inferential statistics. In the framework of the large sample distribution of the mle, it is important to know the relationship between the sample size and asymptotic convergence, i.e., for what sample size the mle behaves satisfactorily and attains asymptotic normality. Previous works have discussed the undesirable impacts of using large sample approximations of the mle when such approximations do not hold. It has been argued that relative likelihood functions must be examined before making inferences based on the mle. It was also demonstrated that transformation of the mle can help achieve asymptotic normality with smaller sample sizes. Little has been explored regarding the appropriate sample size that would allow the mle to achieve asymptotic normality directly from the relative likelihood perspective. Our work proposes a bootstrap/simulation-based approach to examining the relationship between sample size and the asymptotic behavior of the mle. We propose two measures of the convergence of the observed relative likelihood function to the asymptotic relative likelihood function, namely differences in areas and dissimilarity in shape between the two relative likelihood functions. These two measures were applied to datasets from the literature as well as simulated datasets.

Research Article

Systemize the Probabilistic Discrete Event Systems with Moore-Penrose Generalized-inverse Matrix Theory for Cross-sectional Behavioral Data

Ding-Geng (Din) Chen, Xinguang Chen, Feng Lin, Y.L. Lio and Harriet Kitzman

Moore-Penrose (M-P) generalized inverse matrix theory provides a powerful approach to solving an admissible linear-equation system when the inverse of the coefficient matrix does not exist. M-P matrix theory has been used in different areas to solve challenging research questions, including operations research, signal processing, and system controls. In this study, we report our work to systemize probabilistic discrete event systems (PDES) modeling in characterizing the progression of health risk behaviors. A novel PDES model was devised by Lin and Chen to extract and investigate longitudinal properties of multi-stage smoking behavioral progression from cross-sectional survey data. Despite its success, this PDES model requires extra exogenous equations for the model to be solvable and practically implementable. However, exogenous equations are often difficult if not impossible to obtain. Even if the additional exogenous equations are derived, the data used to generate the equations are often error-prone. By applying the M-P theory, our research demonstrates that Lin and Chen's PDES model can be solved without using exogenous equations. For practical application, we demonstrate the M-P approach using the open-source R software with real data from the 2000 National Survey of Drug Use and Health. Removing the need for extra data makes it easier for researchers to use the novel PDES method in examining human behaviors, particularly health-related behaviors, for disease prevention and health promotion. Successful application of the M-P matrix theory in solving the PDES model suggests the potential of this method in system modeling to solve challenging problems in other medical and health-related research.
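The core idea, solving a consistent linear system that has more unknowns than equations, can be sketched in a few lines. When the rows of A are linearly independent, the M-P generalized inverse is A^T (A A^T)^{-1}, and x = A^T (A A^T)^{-1} b is the minimum-norm solution of A x = b. The sketch below is restricted to two equations so that the 2 x 2 inverse can be written out by hand. It is written in Python rather than the R used by the authors, and the matrix, the right-hand side, and the function name are hypothetical, not the PDES equations.

```python
from fractions import Fraction


def min_norm_solution_two_equations(A, b):
    """Return x = A^T (A A^T)^{-1} b for a 2 x n matrix A with independent rows."""
    row0 = [Fraction(v) for v in A[0]]
    row1 = [Fraction(v) for v in A[1]]
    b0, b1 = Fraction(b[0]), Fraction(b[1])

    # Gram matrix G = A A^T (2 x 2, symmetric).
    g00 = sum(v * v for v in row0)
    g01 = sum(u * v for u, v in zip(row0, row1))
    g11 = sum(v * v for v in row1)
    det = g00 * g11 - g01 * g01
    if det == 0:
        raise ValueError("rows of A are linearly dependent")

    # w = G^{-1} b, using the closed-form inverse of a 2 x 2 matrix.
    w0 = (g11 * b0 - g01 * b1) / det
    w1 = (g00 * b1 - g01 * b0) / det

    # x = A^T w
    return [w0 * u + w1 * v for u, v in zip(row0, row1)]


# Hypothetical underdetermined system: x1 + x2 = 1 and x2 + x3 = 1.
A = [[1, 1, 0], [0, 1, 1]]
b = [1, 1]
print(min_norm_solution_two_equations(A, b))
```

Among the infinitely many solutions of this small system, the M-P solution is the one with the smallest Euclidean norm. For larger or rank-deficient systems a numerical library routine would be the practical choice.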

Research Article

Multimodal Biometrics for Robust Fusion Systems using Logic Gates

N. Celik, N. Manivannan, W. Balachandran and S. Kosunalp

Many professionals indicate that unimodal biometric recognition systems have many shortcomings associated with performance accuracy rates. In order to make the system design more robust, we propose a multimodal biometric system that combines fingerprint and face recognition using logical AND operators at decision-level fusion. In this paper, we also discuss some security concerns regarding the identification and verification processes of the multimodal recognition system against invaders and threatening attackers. While the unimodal fingerprint and face biometrics give recognition rates of 94% and 90.8% respectively, the multimodal approach gave a recognition rate of 98% at decision-level fusion, showing an improvement in accuracy. Also, both the FAR and FRR have been considerably reduced, showing that the implemented multimodal system is more robust.

Research Article

Using Available Information in the Assessment of Diagnostic Protocols

Cecilia A Cotton, Oana Danila, Stefan H Steiner, Daniel Severn and R Jock MacKay

A new binary screening or diagnostic test may be combined sequentially with an existing test using either a believe-the-positive or a believe-the-negative protocol. Interest then lies in estimating the properties of the new combined protocol and in comparing the new protocol with the existing test via sensitivity, specificity, or likelihood ratios that capture the trade-off between sensitivity and specificity. We consider a paired assessment study with complete verification via a gold standard. Our goal is to quantify the gain in precision for the estimators of the sensitivity, specificity and the ratio of likelihood ratios in protocols when baseline information on the performance of the existing test is available. We find maximum likelihood estimators of the quantities of interest and derive their asymptotic standard deviations. The methods are illustrated using previously published mammography and ultrasound test results from a cohort of symptomatic women. We find that incorporating baseline information has a large impact on the precision of the estimator for the specificity of the believe-the-positive protocol and for the sensitivity of the believe-the-negative protocol. Including available baseline information can improve the precision of estimators of the sensitivity, specificity, and the ratio of likelihood ratios and/or reduce the number of subjects needed in an assessment study to evaluate the protocol.
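The two protocols can be illustrated with a small sketch: under believe-the-positive the combined result is positive if either test is positive, and under believe-the-negative it is positive only if both tests are positive. The records below are hypothetical paired results with gold-standard disease status, not the mammography and ultrasound data. The sketch computes plain empirical sensitivity and specificity and does not implement the authors' maximum likelihood estimators or their use of baseline information.

```python
def combined_result(existing_positive, new_positive, protocol):
    """Combine two binary test results under the named protocol."""
    if protocol == "BP":  # believe the positive: either test positive
        return existing_positive or new_positive
    if protocol == "BN":  # believe the negative: both tests positive
        return existing_positive and new_positive
    raise ValueError("protocol must be 'BP' or 'BN'")


def sensitivity_specificity(records, protocol):
    """Empirical sensitivity and specificity of the combined protocol."""
    true_pos = false_neg = true_neg = false_pos = 0
    for existing_positive, new_positive, diseased in records:
        positive = combined_result(existing_positive, new_positive, protocol)
        if diseased and positive:
            true_pos += 1
        elif diseased:
            false_neg += 1
        elif positive:
            false_pos += 1
        else:
            true_neg += 1
    return true_pos / (true_pos + false_neg), true_neg / (true_neg + false_pos)


# Hypothetical paired records: (existing test, new test, gold-standard disease).
records = [
    (True, True, True), (True, False, True), (False, True, True), (False, False, True),
    (False, False, False), (False, False, False), (True, False, False), (False, True, False),
]
print(sensitivity_specificity(records, "BP"))
print(sensitivity_specificity(records, "BN"))
```

On these toy records the believe-the-positive protocol has the higher sensitivity and the believe-the-negative protocol the higher specificity, which is the trade-off the likelihood ratios are meant to capture.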

Research Article

Chaotic Maps for Biometric Template Protection-A Proposal

Supriya VG and Ramachandra Manjunatha

Modern biometric technologies claim to provide an alternative solution to traditional authentication processes. Even though the biometric process has various advantages, it is vulnerable to attacks that can weaken its security. To address these concerns and improve public confidence, Biometric Cryptosystems and Cancelable Biometrics have emerged as new technologies. This paper presents a comprehensive survey of Biometric Cryptosystems and Cancelable Biometrics along with the open issues and challenges. A new approach based on Cancelable Biometrics using chaotic maps is proposed to address these open issues and challenges; chaotic maps are known to possess the desirable properties of pseudo-randomness, high sensitivity to initial conditions and a very large key space.
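The sensitivity to initial conditions mentioned above can be seen with the logistic map x_{n+1} = r x_n (1 - x_n), a standard textbook chaotic map. The abstract does not say which map the authors use, so the choice of map, the parameter r = 4, and the starting values are assumptions made only for illustration. The sketch iterates two orbits whose starting points differ by 1e-10 and reports the largest gap between them.

```python
def logistic_orbit(x0, r=4.0, steps=100):
    """Iterate the logistic map x -> r * x * (1 - x) starting from x0."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    return orbit


# Two hypothetical "keys" that differ only in the tenth decimal place.
orbit_a = logistic_orbit(0.3)
orbit_b = logistic_orbit(0.3 + 1e-10)

# Largest separation between the two orbits over the whole run.
largest_gap = max(abs(a - b) for a, b in zip(orbit_a, orbit_b))
print(largest_gap)
```

A template-protection scheme would need far more than this, for example a way to turn the orbit into a transformation of the biometric features; the sketch only shows why a tiny change in the initial value (the key) leads to a very different sequence.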
