Comp Stat and Data Analysis (CSDA)

A comparison of block and semi-parametric bootstrap methods for variance estimation in spatial statistics

June 8, 2010
Publication year: 2010
Source: Computational Statistics & Data Analysis, In Press, Accepted Manuscript, Available online 8 June 2010

N. Iranpanah, M. Mohammadzadeh, C.C. Taylor

Efron (1979) introduced the bootstrap method for independent data, but it cannot be easily applied to spatial data because of their dependency. For spatial data that are correlated in terms of their locations in the underlying space, the moving block bootstrap method is usually used to estimate the precision measures of the estimators. The precision of the moving block bootstrap estimators is related to the block size, which is difficult to select. Moreover, the moving block bootstrap underestimates the variance. In this paper, first the semi-parametric bootstrap is used to estimate the precision measures of...
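
As a rough illustration of the moving block bootstrap idea mentioned in this abstract, here is a minimal sketch for a one-dimensional dependent series. It is not the spatial procedure or the semi-parametric method from the paper; the function name `moving_block_bootstrap_variance` and its parameters are hypothetical, chosen only for this example. It resamples overlapping blocks with replacement and reports the variance of the resampled means.

```python
import random
import statistics


def moving_block_bootstrap_variance(series, block_len, n_boot=500, seed=0):
    """Estimate the variance of the sample mean of a dependent 1-D series.

    Illustrative 1-D simplification (an assumption of this sketch), not the
    spatial method discussed in the abstract.
    """
    n = len(series)
    if not 1 <= block_len <= n:
        raise ValueError("block_len must be between 1 and len(series)")
    rng = random.Random(seed)
    # All overlapping ("moving") blocks of length block_len.
    blocks = [series[i:i + block_len] for i in range(n - block_len + 1)]
    n_blocks = -(-n // block_len)  # ceil(n / block_len)
    boot_means = []
    for _ in range(n_boot):
        resample = []
        for _ in range(n_blocks):
            resample.extend(rng.choice(blocks))
        # Trim to the original length before computing the statistic.
        boot_means.append(statistics.fmean(resample[:n]))
    return statistics.pvariance(boot_means)
```

The result depends on `block_len`, which reflects the block-size sensitivity the abstract points out.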
Categories: Statistical Journals

Bayesian nonlinear regression models with scale mixtures of skew normal distributions: Estimation and case influence diagnostics

June 8, 2010
Publication year: 2010
Source: Computational Statistics & Data Analysis, In Press, Accepted Manuscript, Available online 8 June 2010

Vicente G. Cancho, Dipak K. Dey, Victor H. Lachos, Marinho G. Andrade

The purpose of this paper is to develop a Bayesian analysis for nonlinear regression models under scale mixtures of skew-normal distributions. This novel class of models provides a useful generalization of the symmetrical nonlinear regression models since the error distributions cover both skewness and heavy-tailed distributions such as the skew-t, skew-slash and the skew-contaminated normal distributions. The main advantage of this class of distributions is that they have a nice hierarchical representation that allows the implementation of Markov chain Monte Carlo (MCMC) methods to simulate samples from the joint posterior distribution. In order to examine the robust aspects of this...
Categories: Statistical Journals

On robust tail index estimation

June 4, 2010
Publication year: 2010
Source: Computational Statistics & Data Analysis, In Press, Accepted Manuscript, Available online 4 June 2010

Jan Beran, Dieter Schell

A new approach to tail index estimation based on huberization of the Pareto MLE is considered. The proposed estimator is robust in a nonstandard way in that it protects against deviations from the central model at low quantiles. Asymptotic normality with the parametric √n-rate of convergence is obtained with a bounded asymptotic bias under deviations from the Pareto model. The method is particularly useful for small samples where Hill-type estimators tend to be highly volatile. This is illustrated by a simulation study with sample sizes n≤100.
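
For readers unfamiliar with the two baseline estimators named in this abstract, the following sketch shows the plain Pareto maximum likelihood estimator (with a known scale) and the classical Hill estimator. It does not implement the huberized estimator proposed in the paper; the function names `pareto_mle_alpha` and `hill_alpha` are hypothetical and used only for illustration.

```python
import math


def pareto_mle_alpha(sample, x_min):
    """Pareto tail-index MLE with known scale x_min: n / sum(log(x_i / x_min))."""
    if any(x < x_min for x in sample):
        raise ValueError("all observations must be >= x_min")
    log_sum = sum(math.log(x / x_min) for x in sample)
    if log_sum <= 0:
        raise ValueError("need at least one observation above x_min")
    return len(sample) / log_sum


def hill_alpha(sample, k):
    """Hill estimator of the tail index from the k largest observations."""
    xs = sorted(sample, reverse=True)
    if not 1 <= k < len(xs):
        raise ValueError("k must satisfy 1 <= k < len(sample)")
    threshold = xs[k]  # the (k+1)-th largest observation
    mean_log_excess = sum(math.log(x / threshold) for x in xs[:k]) / k
    return 1.0 / mean_log_excess
```

The Hill estimate can change sharply with the choice of `k`, which is one way to see the small-sample volatility the abstract refers to.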
Categories: Statistical Journals

Cost-efficiency considerations in the choice of a microarray platform for time course experimental designs

June 4, 2010
Publication year: 2010
Source: Computational Statistics & Data Analysis, In Press, Accepted Manuscript, Available online 4 June 2010

Valéria Lima Passos, Frans E.S. Tan, Martijn P.F. Berger

Customarily, the choice between one- or two-colour microarray platforms is based on their respective practical and technical merits, contingent on objectives and constraints of the study at stake. Statistical efficiency, if accounted for, plays a secondary role. A cost-efficiency comparison of the one- and two-colour designs for a 2×4 time course experiment was conducted. It is shown that differences in costs between the platforms’ designs, once adjusted for statistical efficiency, are not always negligible. The extent of these differences is largely influenced by subjects and array prices as well as by biological and error variances in their relative magnitude. Circumstances...
Categories: Statistical Journals

Non linear methods for inverse statistical problems

June 4, 2010
Publication year: 2010
Source: Computational Statistics & Data Analysis, In Press, Accepted Manuscript, Available online 4 June 2010

Pierre Barbillon, Gilles Celeux, Agnès Grimaud, Yannick Lefebvre, Étienne De Rocquigny

In the uncertainty treatment framework considered, the intrinsic variability of the inputs of a physical simulation model is modelled by a multivariate probability distribution. The objective is to identify this probability distribution - the dispersion of which is independent of the sample size since intrinsic variability is at stake - based on observation of some model outputs. Moreover, in order to limit to a reasonable level the number of (usually burdensome) physical model runs inside the inversion algorithm, a non linear approximation methodology making use of Kriging and stochastic EM algorithm is presented. It is compared with iterated linear approximation...
Categories: Statistical Journals

Feature selection in the Laplacian support vector machine

June 3, 2010
Publication year: 2010
Source: Computational Statistics & Data Analysis, In Press, Accepted Manuscript, Available online 2 June 2010

Sangjun Lee, Changyi Park, Ja-Yong Koo

Traditional classifiers, including support vector machines, use only labeled data in training. However, labeled instances are often difficult, costly, or time consuming to obtain, while unlabeled instances are relatively easy to collect. The goal of semi-supervised learning is to improve classification accuracy by using unlabeled data together with a few labeled data in training classifiers. Recently, the Laplacian support vector machine has been proposed as an extension of the support vector machine to semi-supervised learning. Like the support vector machine, the Laplacian support vector machine has drawbacks in its interpretability. It also performs poorly when there are many...
Categories: Statistical Journals

Error rates for multivariate outlier detection

June 2, 2010
Publication year: 2010
Source: Computational Statistics & Data Analysis, In Press, Accepted Manuscript, Available online 1 June 2010

Andrea Cerioli, Alessio Farcomeni

Multivariate outlier identification requires the choice of reliable cut-off points for the robust distances that measure the discrepancy from the fit provided by high-breakdown estimators of location and scatter. Multiplicity issues affect the identification of the appropriate cut-off points. It is described how careful choice of the error rate which is controlled during the outlier detection process can yield a good compromise between high power and low swamping, when alternatives to the Family Wise Error Rate are considered. Correspondingly, multivariate outlier detection rules based on the False Discovery Rate and the False Discovery Exceedance criteria are proposed. The properties of...
Categories: Statistical Journals

Quadratic approximation on SCAD penalized estimation

June 2, 2010
Publication year: 2010
Source: Computational Statistics & Data Analysis, In Press, Accepted Manuscript, Available online 1 June 2010

Sunghoon Kwon, Hosik Choi, Yongdai Kim

In this paper, we propose a method of quadratic approximation that unifies various types of smoothly clipped absolute deviation (SCAD) penalized estimations. For convenience, we call it the quadratically approximated SCAD penalized estimation (Q-SCAD). We prove that the proposed Q-SCAD estimator achieves the oracle property and requires only the least angle regression (LARS) algorithm for computation. Numerical studies including simulations and real data analysis confirm that the Q-SCAD estimator performs as efficiently as the original SCAD estimator.
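
As background for this abstract, the sketch below evaluates the usual piecewise SCAD penalty for a single coefficient. It shows only the original penalty, not the quadratic approximation (Q-SCAD) proposed in the paper; the function name `scad_penalty` is hypothetical, and the default `a=3.7` is the value commonly used in the literature, assumed here for illustration.

```python
def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty for one coefficient (requires lam > 0 and a > 2)."""
    if lam <= 0 or a <= 2:
        raise ValueError("need lam > 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Lasso-like linear region near zero.
        return lam * t
    if t <= a * lam:
        # Quadratic transition region.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant region: large coefficients are not penalized further.
    return lam * lam * (a + 1) / 2
```

The three pieces join continuously at `|theta| = lam` and `|theta| = a * lam`, and the flat region for large coefficients is what distinguishes SCAD from the lasso penalty.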
Categories: Statistical Journals

A space-time filter for panel data models containing random effects

June 2, 2010
Publication year: 2010
Source: Computational Statistics & Data Analysis, In Press, Accepted Manuscript, Available online 1 June 2010

Olivier Parent, James P. LeSage

A space-time filter structure is introduced that can be used to accommodate dependence across space and time in the error components of panel data models that contain random effects. This specification provides insights regarding several space-time structures that have been used recently in the panel data literature. Markov Chain Monte Carlo methods are set forth for estimating the model which allow simple treatment of initial period observations as endogenous or exogenous. Performance of the approach is demonstrated using both Monte Carlo experiments and an applied illustration.
Categories: Statistical Journals

Model-based classification via mixtures of multivariate t-distributions

June 2, 2010
Publication year: 2010
Source: Computational Statistics & Data Analysis, In Press, Accepted Manuscript, Available online 1 June 2010

Jeffrey L. Andrews, Paul D. McNicholas, Sanjeena Subedi

A novel model-based classification technique is introduced based on mixtures of multivariate t-distributions. A family of four models is defined by constraining, or not, the covariance matrices and the degrees of freedom to be equal across mixture components. Parameters for each of the resulting four models are estimated using a multicycle expectation conditional-maximization algorithm, where convergence is determined using a criterion based on Aitken’s acceleration. A straightforward, but very effective, technique for the initialization of the unknown component memberships is introduced and compared with a popular, more sophisticated, initialization technique. This novel four-member family is applied to real and simulated...
Categories: Statistical Journals

An extension of an over-dispersion test for count data

June 1, 2010
Publication year: 2010
Source: Computational Statistics & Data Analysis, In Press, Accepted Manuscript, Available online 1 June 2010

M. Fazil Baksh, Dankmar Böhning, Rattana Lerdsuwansri

While over-dispersion in capture-recapture studies is well known to lead to poor estimation of population size, current diagnostic tools to detect the presence of heterogeneity have not been specifically developed for capture-recapture studies. To address this, a simple and efficient method of testing for over-dispersion in zero-truncated count data is developed and evaluated. The proposed method generalizes an over-dispersion test previously suggested for un-truncated count data and may also be used for testing residual over-dispersion in zero-inflation data. Simulations suggest that the asymptotic distribution of the test statistic is standard normal and that this approximation is also reasonable for small...
Categories: Statistical Journals

Robust inference in generalized partially linear models

June 1, 2010
Publication year: 2010
Source: Computational Statistics & Data Analysis, In Press, Accepted Manuscript, Available online 1 June 2010

Graciela Boente, Daniela Rodriguez

In many situations, data follow a generalized partially linear model in which the mean of the responses is modeled, through a link function, linearly on some covariates and nonparametrically on the remaining ones. A new class of robust estimates is defined for the smooth function η, associated with the nonparametric component, and for the parameter related to the linear one. The robust estimators are based on a three-step procedure, where large values of the deviance or Pearson residuals are bounded through a score function. These estimators allow easier inference on the regression parameter and also...
Categories: Statistical Journals

Exact and approximate algorithms for variable selection in linear discriminant analysis

June 1, 2010
Publication year: 2010
Source: Computational Statistics & Data Analysis, In Press, Accepted Manuscript, Available online 1 June 2010

Michael J. Brusco, Douglas Steinley

Variable selection is a venerable problem in multivariate statistics. In the context of discriminant analysis, the goal is to select a subset of variables that accomplishes one of two objectives: (1) the provision of a parsimonious, yet descriptive, representation of group structure, or (2) the ability to correctly allocate new cases to groups. We present an exact (branch-and-bound) algorithm for variable selection in linear discriminant analysis that identifies subsets of variables that minimize Wilks’ Λ. An important feature of this algorithm is a variable reordering scheme that greatly reduces computation time. We also present an approximate procedure based on tabu...
Categories: Statistical Journals

Inference in HIV dynamics models via hierarchical likelihood

May 29, 2010
Publication year: 2010
Source: Computational Statistics & Data Analysis, In Press, Uncorrected Proof, Available online 28 May 2010

D. Commenges, D. Jolly, J. Drylewicz, H. Putter, R. Thiébaut

HIV dynamical models are often based on non-linear systems of ordinary differential equations (ODE), which do not have an analytical solution. Introducing random effects in such models leads to very challenging non-linear mixed-effects models. To avoid the numerical computation of multiple integrals involved in the likelihood, a hierarchical likelihood (h-likelihood) approach, treated in the spirit of a penalized likelihood, is proposed. The asymptotic distribution of the maximum h-likelihood estimators (MHLE) for fixed effects is given. The MHLE are slightly biased, but the bias can be made negligible by using a parametric bootstrap procedure. An efficient algorithm for maximizing the h-likelihood...
Categories: Statistical Journals

Generalized weighted likelihood density estimators with application to finite mixture of exponential family distributions

May 27, 2010
Publication year: 2010
Source: Computational Statistics & Data Analysis, In Press, Accepted Manuscript, Available online 26 May 2010

Tingting Zhan, Inna Chevoneva, Boris Iglewicz

The family of weighted likelihood estimators largely overlaps with minimum divergence estimators. They are more robust to data contamination than the MLE. We define the class of generalized weighted likelihood estimators (GWLE), provide its influence function and discuss the efficiency requirements. We introduce a new truncated cubic-inverse weight, which is both first and second order efficient and more robust than previously reported weights. We also discuss new ways of selecting the smoothing bandwidth and weighted starting values for the iterative algorithm. The advantage of the truncated cubic-inverse weight is illustrated in a simulation study of a three-component normal mixture model with large...
Categories: Statistical Journals

Hierarchical multilinear models for multiway data

May 27, 2010
Publication year: 2010
Source: Computational Statistics & Data Analysis, In Press, Accepted Manuscript, Available online 26 May 2010

Peter D. Hoff

Reduced-rank decompositions provide descriptions of the variation among the elements of a matrix or array. In such decompositions, the elements of an array are expressed as products of low-dimensional latent factors. This article presents a model-based version of such a decomposition, extending the scope of reduced rank methods to accommodate a variety of data types such as longitudinal social networks and continuous multivariate data that are cross-classified by categorical variables. The proposed model-based approach is hierarchical, in that the latent factors corresponding to a given dimension of the array are not a priori independent, but exchangeable. Such a hierarchical approach...
Categories: Statistical Journals

Partially varying coefficient single-index proportional hazards regression models

May 25, 2010
Publication year: 2010
Source: Computational Statistics & Data Analysis, In Press, Accepted Manuscript, Available online 24 May 2010

Jianbo Li, Riquan Zhang

In this paper, the partially varying-coefficient single-index proportional hazards regression models are discussed. All unknown functions are fitted by polynomial B-splines. The index parameters and B-spline coefficients are estimated by the partial likelihood method and a two-step Newton-Raphson algorithm. Consistency and asymptotic normality of the estimators of all the parameters are derived. Through a simulation study and the VA data example, we illustrate that the proposed estimation procedure is accurate, rapid and stable.
Categories: Statistical Journals

Similarity analysis in Bayesian random partition models

May 25, 2010
Publication year: 2010
Source: Computational Statistics & Data Analysis, In Press, Accepted Manuscript, Available online 24 May 2010

Carlos A. Navarrete, Fernando A. Quintana

This work proposes a method to assess the influence of individual observations in the clustering generated by any process that involves random partitions. We call it Similarity Analysis. It basically consists of decomposing the estimated similarity matrix into an intrinsic and an extrinsic part, coupled with a new approach for representing and interpreting partitions. Individual influence is associated with the particular ordering induced by individual covariates, which in turn provides an interpretation of the underlying clustering mechanism. We present applications in the context of Species Sampling Mixture Models (SSMMs), including Bayesian density estimation and dependent linear regression models.
Categories: Statistical Journals

Computational issues with fitting joint location/dispersion models in unreplicated 2^k factorials

May 25, 2010
Publication year: 2010
Source: Computational Statistics & Data Analysis, In Press, Accepted Manuscript, Available online 24 May 2010

Thomas M. Loughin, Jorge E. Rodríguez

Maximum likelihood estimation for a joint location/dispersion model has been found occasionally to experience convergence problems when applied to experiments of the 2^k factorial series. We explore these problems and identify models for which the likelihood diverges or is multimodal. We derive the conditions under which this occurs and provide simple ways to check for problems both before and during computation.
Categories: Statistical Journals

Editorial Board

May 23, 2010
Publication year: 2010
Source: Computational Statistics & Data Analysis, Volume 54, Issue 10, 1 October 2010, Pages iii-v

[No author name available]
Categories: Statistical Journals