J Royal Stat Soc, Ser B (JRSS,B)

Wiley Online Library : Journal of the Royal Statistical Society: Series B (Statistical Methodology)

Cumulative incidence association models for bivariate competing risks data

January 6, 2012

Summary.  Association models, like frailty and copula models, are frequently used to analyse clustered survival data and to evaluate within-cluster associations. The assumption of non-informative censoring is commonly applied to these models, though it may not be true in many situations. We consider bivariate competing risk data and focus on association models specified for the bivariate cumulative incidence function (CIF), which is a non-parametrically identifiable quantity. Copula models are proposed which relate the bivariate CIF to its corresponding univariate CIFs, similarly to independently right-censored data, and accommodate frailty models for the bivariate CIF. Two estimating equations are developed to estimate the association parameter, permitting the univariate CIFs to be estimated either parametrically or non-parametrically. Goodness-of-fit tests are presented for formally evaluating the parametric models. Both estimators perform well with moderate sample sizes in simulation studies. The practical use of the methodology is illustrated in an analysis of dementia associations.

Categories: Statistical Journals

Local shrinkage rules, Lévy processes and regularized regression

January 5, 2012

Summary.  We use Lévy processes to generate joint prior distributions, and therefore penalty functions, for a location parameter as p grows large. This generalizes the class of local–global shrinkage rules based on scale mixtures of normals, illuminates new connections between disparate methods and leads to new results for computing posterior means and modes under a wide class of priors. We extend this framework to large-scale regularized regression problems where p>n, and we provide comparisons with other methodologies.

Fast subset scan for spatial pattern detection

January 5, 2012

Summary.  We propose a new ‘fast subset scan’ approach for accurate and computationally efficient event detection in massive data sets. We treat event detection as a search over subsets of data records, finding the subset which maximizes some score function. We prove that many commonly used functions (e.g. Kulldorff's spatial scan statistic and extensions) satisfy the ‘linear time subset scanning’ property, enabling exact and efficient optimization over subsets. In the spatial setting, we demonstrate that proximity-constrained subset scans substantially improve the timeliness and accuracy of event detection, detecting emerging outbreaks of disease 2 days faster than existing methods.
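
The 'linear time subset scanning' idea can be illustrated with a small sketch. It assumes the expectation-based Poisson score and the priority ratio count/baseline, which are choices made here for illustration and are not spelled out in the summary; the function names and toy data are hypothetical. Records are sorted by priority and only the top-k subsets are scored, and the result is compared with an exhaustive search over all subsets.

```python
import math
from itertools import combinations


def ebp_score(count, baseline):
    """Expectation-based Poisson log-likelihood ratio for an aggregated subset."""
    if count <= baseline:
        return 0.0
    return count * math.log(count / baseline) + baseline - count


def fast_subset_scan(counts, baselines):
    """Score only the top-k subsets after sorting records by count/baseline."""
    order = sorted(range(len(counts)),
                   key=lambda i: counts[i] / baselines[i], reverse=True)
    best_score, best_subset = 0.0, []
    c_sum = b_sum = 0.0
    for k, i in enumerate(order, start=1):
        c_sum += counts[i]
        b_sum += baselines[i]
        score = ebp_score(c_sum, b_sum)
        if score > best_score:
            best_score, best_subset = score, sorted(order[:k])
    return best_score, best_subset


def brute_force_scan(counts, baselines):
    """Exhaustive search over all non-empty subsets (exponential cost)."""
    n = len(counts)
    best = 0.0
    for r in range(1, n + 1):
        for subset in combinations(range(n), r):
            c = sum(counts[i] for i in subset)
            b = sum(baselines[i] for i in subset)
            best = max(best, ebp_score(c, b))
    return best


# Hypothetical toy data: observed counts and expected baselines per record.
counts = [12, 3, 9, 4, 20, 5]
baselines = [5.0, 4.0, 6.0, 4.5, 8.0, 5.5]
fast_score, fast_subset = fast_subset_scan(counts, baselines)
exact_score = brute_force_scan(counts, baselines)
```

After the sort, the fast scan evaluates N subsets instead of 2^N, which is what makes exact optimization feasible on large record sets.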

The dynamic ‘expectation–conditional maximization either’ algorithm

January 5, 2012

Summary.  The ‘expectation–conditional maximization either’ (ECME) algorithm has proven to be an effective way of accelerating the expectation–maximization algorithm for many problems. Recognizing the limitation of using prefixed acceleration subspaces in the ECME algorithm, we propose a dynamic ECME (DECME) algorithm which allows the acceleration subspaces to be chosen dynamically. The simplest DECME implementation is what we call DECME-1, which uses the line that is determined by the two most recent estimates as the acceleration subspace. The investigation of DECME-1 leads to an efficient, simple, stable and widely applicable DECME implementation, which uses two-dimensional acceleration subspaces and is referred to as DECME-2. The fast convergence of DECME-2 is established by the theoretical result that, in a small neighbourhood of the maximum likelihood estimate, it is equivalent to a conjugate direction method. The remarkable accelerating effect of DECME-2 and its variant is also demonstrated with several numerical examples.

Semiparametric tests for sufficient cause interaction

January 5, 2012

Summary.  A sufficient cause interaction between two exposures signals the presence of individuals for whom the outcome would occur only under certain values of the two exposures. When the outcome is dichotomous and all exposures are categorical, then, under certain no confounding assumptions, empirical conditions for sufficient cause interactions can be constructed on the basis of the sign of linear contrasts of conditional outcome probabilities between differently exposed subgroups, given confounders. It is argued that logistic regression models are unsatisfactory for evaluating such contrasts, and that Bernoulli regression models with linear link are prone to misspecification. We therefore develop semiparametric tests for sufficient cause interactions under models which postulate probability contrasts in terms of a finite dimensional parameter, but which are otherwise unspecified. Estimation is often not feasible in these models because it would require non-parametric estimation of auxiliary conditional expectations given high dimensional variables. We therefore develop ‘multiply robust tests’ under a union model which assumes that at least one of several working submodels holds. In the special case of a randomized experiment or a family-based genetic study in which the joint exposure distribution is known by design or Mendelian inheritance, the procedure leads to asymptotically distribution-free tests of the null hypothesis of no sufficient cause interaction.

Adaptive and dynamic adaptive procedures for false discovery rate control and estimation

January 1, 2012

Summary.  Many methods for estimation or control of the false discovery rate (FDR) can be improved by incorporating information about π0, the proportion of all tested null hypotheses that are true. Estimates of π0 are often based on the number of p-values that exceed a threshold λ. We first give a finite sample proof for conservative point estimation of the FDR when the λ-parameter is fixed. Then we establish a condition under which a dynamic adaptive procedure, whose λ-parameter is determined by data, will lead to conservative π0- and FDR estimators. We also present asymptotic results on simultaneous conservative FDR estimation and control for a class of dynamic adaptive procedures. Simulation results show that a novel dynamic adaptive procedure achieves more power through smaller estimation errors for π0 under independence and mild dependence conditions. We conclude by discussing the connection between estimation and control of the FDR and show that several recently developed FDR control procedures can be cast in a unifying framework where the strength of the procedures can be easily evaluated.
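
As a rough illustration of a fixed-λ estimate, the sketch below uses the widely known Storey-type estimator, which divides the number of p-values above λ by m(1 − λ), and plugs it into an FDR estimate for a rejection threshold t. The function names and p-values are hypothetical, and this is not necessarily the exact estimator analysed in the paper.

```python
def storey_pi0(p_values, lam=0.5):
    """Estimate pi0 from the share of p-values exceeding the threshold lam."""
    m = len(p_values)
    n_above = sum(1 for p in p_values if p > lam)
    # Under a true null, p-values are uniform, so about pi0 * m * (1 - lam)
    # of them are expected to land above lam.
    return min(1.0, n_above / (m * (1.0 - lam)))


def estimated_fdr(p_values, t, lam=0.5):
    """Plug-in FDR estimate for the rule 'reject when p <= t'."""
    m = len(p_values)
    n_rejected = sum(1 for p in p_values if p <= t)
    pi0 = storey_pi0(p_values, lam)
    return min(1.0, pi0 * m * t / max(n_rejected, 1))


# Hypothetical p-values from eight tests.
p_values = [0.001, 0.01, 0.02, 0.3, 0.5, 0.6, 0.8, 0.9]
pi0_hat = storey_pi0(p_values, lam=0.5)
fdr_hat = estimated_fdr(p_values, t=0.05, lam=0.5)
```

A dynamic adaptive procedure in the sense of the summary would choose `lam` from the data rather than fixing it in advance.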

Control variates for estimation based on reversible Markov chain Monte Carlo samplers

January 1, 2012

Summary.  A general methodology is introduced for the construction and effective application of control variates to estimation problems involving data from reversible Markov chain Monte Carlo samplers. We propose the use of a specific class of functions as control variates, and we introduce a new consistent estimator for the values of the coefficients of the optimal linear combination of these functions. For a specific Markov chain Monte Carlo scenario, the form and proposed construction of the control variates is shown to provide an exact solution of the associated Poisson equation. This implies that the estimation variance in this case (in the central limit theorem regime) is exactly zero. The new estimator is derived from a novel, finite dimensional, explicit representation for the optimal coefficients. The resulting variance reduction methodology is primarily (though certainly not exclusively) applicable when the simulated data are generated by a random-scan Gibbs sampler. Markov chain Monte Carlo examples of Bayesian inference problems demonstrate that the corresponding reduction in the estimation variance is significant, and that in some cases it can be quite dramatic. Extensions of this methodology are discussed and simulation examples are presented illustrating the utility of the methods proposed. All methodological and asymptotic arguments are rigorously justified under essentially minimal conditions.

New consistent and asymptotically normal parameter estimates for random-graph mixture models

January 1, 2012

Summary.  Random-graph mixture models are very popular for modelling real data networks. Parameter estimation procedures usually rely on variational approximations, either combined with the expectation–maximization (EM) algorithm or with Bayesian approaches. Despite good results on synthetic data, the validity of the variational approximation is, however, not established. Moreover, these variational approaches aim at approximating the maximum likelihood or the maximum a posteriori estimators, whose behaviour in an asymptotic framework (as the sample size increases to ∞) remains unknown for these models. In this work, we show that, in many different affiliation contexts (for binary or weighted graphs), parameter estimators based either on moment equations or on the maximization of some composite likelihood are strongly consistent and √n convergent, when the number n of nodes increases to ∞. As a consequence, our result establishes that the overall structure of an affiliation model can be (asymptotically) caught by the description of the network in terms of its number of triads (order 3 structures) and edges (order 2 structures). Moreover, these parameter estimates are either explicit (as for the moment estimators) or may be approximated by using a simple EM algorithm, whose convergence properties are known. We illustrate the efficiency of our method on simulated data and compare its performances with other existing procedures. A data set of cross-citations among economics journals is also analysed.

Conditional quantile analysis when covariates are functions, with application to growth data

January 1, 2012

Summary.  Motivated by the conditional growth charts problem, we develop a method for conditional quantile analysis when predictors take values in a functional space. The method proposed aims at estimating conditional distribution functions under a generalized functional regression framework. This approach facilitates balancing of model flexibility and the curse of dimensionality for the infinite dimensional functional predictors. Its good performance in comparison with other methods, both for sparsely and for densely observed functional covariates, is demonstrated through theory as well as in simulations and an application to growth curves, where the method proposed can, for example, be used to assess the entire growth pattern of a child by relating it to the predicted quantiles of adult height.

A full scale approximation of covariance functions for large spatial data sets

January 1, 2012

Summary.  Gaussian process models have been widely used in spatial statistics but face tremendous computational challenges for very large data sets. The model fitting and spatial prediction of such models typically require O(n³) operations for a data set of size n. Various approximations of the covariance functions have been introduced to reduce the computational cost. However, most existing approximations cannot simultaneously capture both the large- and the small-scale spatial dependence. A new approximation scheme is developed to provide a high quality approximation to the covariance function at both the large and the small spatial scales. The new approximation is the summation of two parts: a reduced rank covariance and a compactly supported covariance obtained by tapering the covariance of the residual of the reduced rank approximation. Whereas the former part mainly captures the large-scale spatial variation, the latter part captures the small-scale, local variation that is unexplained by the former part. By combining the reduced rank representation and sparse matrix techniques, our approach allows for efficient computation for maximum likelihood estimation, spatial prediction and Bayesian inference. We illustrate the new approach with simulated and real data sets.
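
The two-part construction can be sketched on a small one-dimensional example. The exponential covariance, the spherical taper, the knot layout and all function names below are assumptions chosen for illustration. The sketch also builds the dense covariance matrix so that it can be compared with the approximation, which a real implementation for large n would avoid.

```python
import numpy as np


def exp_cov(a, b, range_=0.3):
    """Exponential covariance between two sets of 1-D locations."""
    return np.exp(-np.abs(a[:, None] - b[None, :]) / range_)


def spherical_taper(h, gamma):
    """Compactly supported spherical correlation: zero for h >= gamma."""
    x = np.minimum(h / gamma, 1.0)
    return (1.0 - x) ** 2 * (1.0 + x / 2.0)


def full_scale_approx(sites, knots, gamma):
    """Return the exact, reduced rank, and full scale covariance matrices."""
    C = exp_cov(sites, sites)
    C_sk = exp_cov(sites, knots)
    C_kk = exp_cov(knots, knots)
    # Reduced rank part built from a small set of knots (large-scale variation).
    low_rank = C_sk @ np.linalg.solve(C_kk, C_sk.T)
    # Tapered residual part (small-scale variation), sparse beyond gamma.
    h = np.abs(sites[:, None] - sites[None, :])
    tapered_residual = (C - low_rank) * spherical_taper(h, gamma)
    return C, low_rank, low_rank + tapered_residual


sites = np.linspace(0.0, 1.0, 50)
knots = np.linspace(0.0, 1.0, 6)
gamma = 0.1
C, low_rank, full_scale = full_scale_approx(sites, knots, gamma)
err_low_rank = np.linalg.norm(C - low_rank)
err_full_scale = np.linalg.norm(C - full_scale)
```

Because the taper lies between 0 and 1, the elementwise error of the combined approximation cannot exceed that of the reduced rank part alone, and the diagonal is reproduced exactly.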

Hybrid confidence regions based on data depth

January 1, 2012

Summary.  We consider the general problem of constructing confidence regions for, possibly multi-dimensional, parameters when we have available more than one approach for the construction. These approaches may be motivated by different model assumptions, different levels of approximation, different settings of tuning parameters or different Monte Carlo algorithms. Their effectiveness is often governed by different sets of conditions which are difficult to vindicate in practice. We propose two procedures for constructing hybrid confidence regions which endeavour to integrate all such individual approaches. The procedures employ the concept of data depth to calibrate the confidence region in two different ways, the first rendering its coverage error minimax and the second rendering its coverage error conservative. The resulting region reconciles in many important aspects the discrepancies between the various approaches, and is robust against misspecification of their governing conditions. Theoretical and empirical properties of our procedures are investigated in comparison with those of the constituent individual approaches.

Variance estimation using refitted cross-validation in ultrahigh dimensional regression

January 1, 2012

Summary.  Variance estimation is a fundamental problem in statistical modelling. In ultrahigh dimensional linear regression where the dimensionality is much larger than the sample size, traditional variance estimation techniques are not applicable. Recent advances in variable selection in ultrahigh dimensional linear regression make this problem accessible. One of the major problems in ultrahigh dimensional regression is the high spurious correlation between the unobserved realized noise and some of the predictors. As a result, the realized noises are actually predicted when extra irrelevant variables are selected, leading to a serious underestimate of the level of noise. We propose a two-stage refitted procedure via a data splitting technique, called refitted cross-validation, to attenuate the influence of irrelevant variables with high spurious correlations. Our asymptotic results show that the resulting procedure performs as well as the oracle estimator, which knows in advance the mean regression function. The simulation studies lend further support to our theoretical claims. The naive two-stage estimator and the plug-in one-stage estimators using the lasso and smoothly clipped absolute deviation are also studied and compared. Their performances can be improved by the refitted cross-validation method proposed.
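
A minimal sketch of the data splitting idea follows. Variables are selected on one half of the data, the variance is estimated by refitting ordinary least squares on the other half, the roles are swapped, and the two estimates are averaged. The selection step here is simple marginal correlation screening, used as a stand-in; the function names, the simulated data and the choice of k are hypothetical.

```python
import numpy as np


def screen_top_k(X, y, k):
    """Indices of the k columns most correlated (in absolute value) with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return np.argsort(corr)[-k:]


def ols_variance(X, y, cols):
    """Residual variance of an OLS fit of y on the chosen columns (no intercept)."""
    Xs = X[:, cols]
    coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    resid = y - Xs @ coef
    return resid @ resid / (len(y) - len(cols))


def refitted_cv_variance(X, y, k):
    """Select on one half, refit on the other half, swap roles, and average."""
    half = len(y) // 2
    X1, y1, X2, y2 = X[:half], y[:half], X[half:], y[half:]
    var_a = ols_variance(X2, y2, screen_top_k(X1, y1, k))
    var_b = ols_variance(X1, y1, screen_top_k(X2, y2, k))
    return 0.5 * (var_a + var_b)


# Simulated example with p > n and true noise variance 1 (an assumption).
rng = np.random.default_rng(0)
n, p = 200, 500
X = rng.standard_normal((n, p))
y = 4.0 * X[:, 0] + 4.0 * X[:, 1] + rng.standard_normal(n)
sigma2_rcv = refitted_cv_variance(X, y, k=5)
```

Refitting on data that played no part in the selection is what keeps spuriously correlated variables from absorbing part of the noise.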

Strong rules for discarding predictors in lasso-type problems

November 3, 2011

Summary.  We consider rules for discarding predictors in lasso regression and related problems, for computational efficiency. El Ghaoui and his colleagues have proposed ‘SAFE’ rules, based on univariate inner products between each predictor and the outcome, which guarantee that a coefficient will be 0 in the solution vector. This provides a reduction in the number of variables that need to be entered into the optimization. We propose strong rules that are very simple and yet screen out far more predictors than the SAFE rules. This great practical improvement comes at a price: the strong rules are not foolproof and can mistakenly discard active predictors, i.e. predictors that have non-zero coefficients in the solution. We therefore combine them with simple checks of the Karush–Kuhn–Tucker conditions to ensure that the exact solution to the convex problem is delivered. Of course, any (approximate) screening method can be combined with the Karush–Kuhn–Tucker conditions to ensure the exact solution; the strength of the strong rules lies in the fact that, in practice, they discard a very large number of the inactive predictors and almost never commit mistakes. We also derive conditions under which they are foolproof. Strong rules provide substantial savings in computational time for a variety of statistical optimization problems.
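
The screen-then-check pattern can be sketched as follows. The sketch assumes the lasso objective 0.5·‖y − Xb‖² + λ‖b‖₁ and the basic form of the strong rule, which discards predictor j when |xⱼᵀy| < 2λ − λmax. That inequality, the plain coordinate descent solver, and all names are assumptions for illustration rather than details given in the summary.

```python
import numpy as np


def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=500):
    """Coordinate descent for 0.5 * ||y - X b||^2 + lam * ||b||_1."""
    b = np.zeros(X.shape[1])
    col_sq = (X ** 2).sum(axis=0)
    r = y.astype(float).copy()  # current residual y - X b
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            rho = X[:, j] @ r + col_sq[j] * b[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * (new - b[j])
            b[j] = new
    return b


def lasso_with_strong_rule(X, y, lam):
    """Screen with the basic strong rule, then repair any KKT violations."""
    c = np.abs(X.T @ y)
    keep = c >= 2.0 * lam - c.max()  # c.max() is lambda_max
    while True:
        b = np.zeros(X.shape[1])
        b[keep] = lasso_cd(X[:, keep], y, lam)
        grad = np.abs(X.T @ (y - X @ b))
        # A discarded predictor violates the KKT conditions if |x_j' r| > lam.
        violations = (~keep) & (grad > lam * (1.0 + 1e-8))
        if not violations.any():
            return b, keep
        keep |= violations


rng = np.random.default_rng(1)
X = rng.standard_normal((50, 30))
y = X[:, :3] @ np.array([3.0, -2.0, 1.5]) + rng.standard_normal(50)
lam = 0.8 * np.abs(X.T @ y).max()
b_screened, kept = lasso_with_strong_rule(X, y, lam)
b_full = lasso_cd(X, y, lam)
```

The loop re-adds any wrongly discarded predictor and solves again, so the screened fit agrees with the fit on all predictors while the solver usually works on far fewer columns.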

Achieving near perfect classification for functional data

November 3, 2011

Summary.  We show that, in functional data classification problems, perfect asymptotic classification is often possible, making use of the intrinsic very high dimensional nature of functional data. This performance is often achieved by linear methods, which are optimal in important cases. These results point to a marked contrast between classification for functional data and its counterpart in conventional multivariate analysis, where the dimension is kept fixed as the sample size diverges. In the latter setting, linear methods can sometimes be quite inefficient, and there are no prospects for asymptotically perfect classification, except in pathological cases where, for example, a variance vanishes. By way of contrast, in finite samples of functional data, good performance can be achieved by truncated versions of linear methods. Truncation can be implemented by partial least squares or projection onto a finite number of principal components, using, in both cases, cross-validation to determine the truncation point. We establish consistency of the cross-validation procedure.

Reduced rank stochastic regression with a sparse singular value decomposition

November 3, 2011

Summary.  For a reduced rank multivariate stochastic regression model of rank r*, the regression coefficient matrix can be expressed as a sum of r* unit rank matrices each of which is proportional to the outer product of the left and right singular vectors. For improving predictive accuracy and facilitating interpretation, it is often desirable that these left and right singular vectors be sparse or enjoy some smoothness property. We propose a regularized reduced rank regression approach for solving this problem. Computation algorithms and regularization parameter selection methods are developed, and the properties of the new method are explored both theoretically and by simulation. In particular, the regularization method proposed is shown to be selection consistent and asymptotically normal and to enjoy the oracle property. We apply the proposed model to perform biclustering analysis with microarray gene expression data.
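
The decomposition in the first sentence can be checked numerically. The sketch below builds a hypothetical rank-2 coefficient matrix, takes its singular value decomposition, and rebuilds it as a sum of two unit rank layers; it illustrates only the decomposition, not the regularized estimator proposed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, rank = 6, 4, 2

# Hypothetical coefficient matrix of rank 2 (p predictors, q responses).
C = rng.standard_normal((p, rank)) @ rng.standard_normal((rank, q))

# Thin SVD: C = U diag(d) Vt, with singular values in decreasing order.
U, d, Vt = np.linalg.svd(C, full_matrices=False)

# Each layer is a unit rank matrix proportional to the outer product of a
# left and a right singular vector.
layers = [d[k] * np.outer(U[:, k], Vt[k]) for k in range(rank)]
C_rebuilt = sum(layers)
```

Sparsity in a column of `U` or a row of `Vt` would zero out whole rows or columns of the corresponding layer, which is why sparse singular vectors help interpretation.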

Penalized classification using Fisher's linear discriminant

November 1, 2011

Summary.  We consider the supervised classification setting, in which the data consist of p features measured on n observations, each of which belongs to one of K classes. Linear discriminant analysis (LDA) is a classical method for this problem. However, in the high dimensional setting where p ≫ n, LDA is not appropriate for two reasons. First, the standard estimate for the within-class covariance matrix is singular, and so the usual discriminant rule cannot be applied. Second, when p is large, it is difficult to interpret the classification rule that is obtained from LDA, since it involves all p features. We propose penalized LDA, which is a general approach for penalizing the discriminant vectors in Fisher's discriminant problem in a way that leads to greater interpretability. The discriminant problem is not convex, so we use a minorization–maximization approach to optimize it efficiently when convex penalties are applied to the discriminant vectors. In particular, we consider the use of L1 and fused lasso penalties. Our proposal is equivalent to recasting Fisher's discriminant problem as a biconvex problem. We evaluate the performances of the resulting methods on a simulation study, and on three gene expression data sets. We also survey past methods for extending LDA to the high dimensional setting and explore their relationships with our proposal.

Estimation of direct effects for survival data by using the Aalen additive hazards model

November 1, 2011

Summary.  We extend the definition of the controlled direct effect of a point exposure on a survival outcome, other than through some given, time-fixed intermediate variable, to the additive hazard scale. We propose two-stage estimators for this effect when the exposure is dichotomous and randomly assigned and when the association between the intermediate variable and the survival outcome is confounded only by measured factors, which may themselves be affected by the exposure. The first stage of the estimation procedure involves assessing the effect of the intermediate variable on the survival outcome via Aalen's additive regression for the event time, given exposure, intermediate variable and confounders. The second stage involves applying Aalen's additive model, given the exposure alone, to a modified stochastic process (i.e. a modification of the observed counting process based on the first-stage estimates). We give the large sample properties of the estimator proposed and investigate its small sample properties by Monte Carlo simulation. A real data example is provided for illustration.
