Directed acyclic graphs are commonly used to represent causal relationships among random variables in graphical models. Applications of these models arise in the study of physical and biological systems where directed edges between nodes represent the influence of components of the system on each other. Estimation of directed graphs from observational data is computationally NP-hard. In addition, directed graphs with the same structure may be indistinguishable based on observations alone. When the nodes exhibit a natural ordering, the problem of estimating directed graphs reduces to the problem of estimating the structure of the network. In this paper, we propose an efficient penalized likelihood method for estimation of the adjacency matrix of directed acyclic graphs, when variables inherit a natural ordering. We study variable selection consistency of lasso and adaptive lasso penalties in high-dimensional sparse settings, and propose an error-based choice for selecting the tuning parameter. We show that although the lasso is only variable selection consistent under stringent conditions, the adaptive lasso can consistently estimate the true graph under the usual regularity assumptions.
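When the variables have a known ordering, each node can only depend on its predecessors, so the adjacency matrix can be estimated one row at a time by a penalized regression of each variable on the variables that precede it. The following is a minimal sketch of that idea using a plain lasso penalty and a hand-written coordinate descent solver. The function names (`lasso_cd`, `ordered_dag_lasso`), the fixed tuning parameter `lam`, and the convention that `A[j, k] != 0` means an edge from node k to node j are assumptions made for illustration, not the paper's implementation; the adaptive lasso and the error-based choice of tuning parameter are not shown.

```python
import numpy as np

def soft_threshold(z, t):
    # Soft-thresholding operator used by the lasso coordinate update.
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.copy()                          # current residual y - Xb
    for _ in range(n_iter):
        for k in range(p):
            if col_sq[k] == 0.0:
                continue
            r += X[:, k] * b[k]           # partial residual without column k
            rho = X[:, k] @ r / n
            b[k] = soft_threshold(rho, lam) / col_sq[k]
            r -= X[:, k] * b[k]
    return b

def ordered_dag_lasso(X, lam):
    """Columns of X are assumed to be in the natural (causal) order."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    A = np.zeros((p, p))
    for j in range(1, p):
        # Regress variable j on its predecessors only.
        A[j, :j] = lasso_cd(Xc[:, :j], Xc[:, j], lam)
    return A

# Hypothetical example: a three-node chain x1 -> x2 -> x3.
rng = np.random.default_rng(0)
n = 500
x1 = rng.standard_normal(n)
x2 = 0.8 * x1 + rng.standard_normal(n)
x3 = 0.8 * x2 + rng.standard_normal(n)
A = ordered_dag_lasso(np.column_stack([x1, x2, x3]), lam=0.1)
print(np.round(A, 2))
```

Because each regression uses only earlier variables, the estimate is strictly lower triangular by construction and therefore always corresponds to an acyclic graph.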
In this paper we propose a new regression interpretation of the Cholesky factor of the covariance matrix, as opposed to the well-known regression interpretation of the Cholesky factor of the inverse covariance, which leads to a new class of regularized covariance estimators suitable for high-dimensional problems. Regularizing the Cholesky factor of the covariance via this regression interpretation always results in a positive definite estimator. In particular, one can obtain a positive definite banded estimator of the covariance matrix at the same computational cost as the popular banded estimator of Bickel & Levina (2008b), which is not guaranteed to be positive definite. We also establish theoretical connections between banding Cholesky factors of the covariance matrix and its inverse and constrained maximum likelihood estimation under the banding constraint, and compare the numerical performance of several methods in simulations and on a sonar data example.
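The positive definiteness claim can be illustrated with a simplified construction: if a lower triangular factor with positive diagonal is banded and then multiplied by its transpose, the product is banded and positive definite. The sketch below truncates the Cholesky factor of the sample covariance directly. This is not the regression-based estimator proposed in the paper; it assumes more observations than variables so that the sample covariance is positive definite, and the function name `banded_cholesky_cov` and bandwidth `k` are hypothetical.

```python
import numpy as np

def banded_cholesky_cov(X, k):
    """Keep the diagonal and the first k sub-diagonals of the Cholesky factor.

    Assumes n > p so that the sample covariance is positive definite.
    """
    S = np.cov(X, rowvar=False)
    L = np.linalg.cholesky(S)               # S = L L^T, L lower triangular
    p = L.shape[0]
    i, j = np.indices((p, p))
    L_band = np.where(i - j <= k, L, 0.0)   # zero entries below the k-th sub-diagonal
    return L_band @ L_band.T

# Hypothetical example: 200 observations of 6 variables, bandwidth 1.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6))
Sigma_hat = banded_cholesky_cov(X, k=1)
print(np.round(Sigma_hat, 2))
print(np.linalg.eigvalsh(Sigma_hat).min() > 0)
```

The banded factor keeps its positive diagonal, so it remains nonsingular and the product is positive definite for any bandwidth, in contrast to banding the covariance matrix entrywise.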
Regularization methods are characterized by loss functions measuring data fits and penalty terms constraining model parameters. The commonly used quadratic loss is not suitable for classification with binary responses, whereas the loglikelihood function is not readily applicable to models where the exact distribution of observations is unknown or not fully specified. We introduce the penalized Bregman divergence by replacing the negative loglikelihood in the conventional penalized likelihood with Bregman divergence, which encompasses many commonly used loss functions in the regression analysis, classification procedures and machine learning literature. We investigate new statistical properties of the resulting class of estimators with the number $p_n$ of parameters either diverging with the sample size $n$ or even nearly comparable with $n$, and develop statistical inference tools. It is shown that the resulting penalized estimator, combined with appropriate penalties, achieves the same oracle property as the penalized likelihood estimator, but asymptotically does not rely on the complete specification of the underlying distribution. Furthermore, the choice of loss function in the penalized classifiers has an asymptotically relatively negligible impact on classification performance. We illustrate the proposed method for quasilikelihood regression and binary classification with simulation evaluation and real-data application.
A family of shape curves is introduced that is useful for modelling the changes in shape in a series of geometrical objects. The relationship between the preshape sphere and the shape space is used to define a general family of curves based on horizontal geodesics on the preshape sphere. Methods for fitting geodesics and more general curves in the non-Euclidean shape space of point sets are discussed, based on minimizing sums of squares of Procrustes distances. Likelihood-based inference is considered. We illustrate the ideas by carrying out statistical analysis of two-dimensional landmarks on rats’ skulls at various times in their development and three-dimensional landmarks on lumbar vertebrae from three primate species.
We study a class of monotone univariate regression estimators. We use B-splines to approximate an underlying regression function and estimate spline coefficients based on grouped data. We investigate asymptotic properties of two monotone estimators: a grouped Brunk estimator and a penalized monotone estimator. These estimators are consistent at the boundary and their mean square errors achieve optimal convergence rates under suitable assumptions on the true regression function. Asymptotic distributions are developed and are shown to be independent of spline degrees and the number of knots. Simulation results and car data illustrate the performance of the proposed estimators.
This paper considers the asymptotic distribution of the likelihood ratio statistic T for testing a subset of the parameters of interest, based on a pseudolikelihood in which the nuisance parameter is replaced by a consistent estimator. We show that the asymptotic distribution of T under H0 is a weighted sum of independent chi-squared variables. Some sufficient conditions are provided for the limiting distribution to be a chi-squared variable. When the true value of the parameter of interest or the true value of the nuisance parameter lies on the boundary of the parameter space, the problem is shown to be asymptotically equivalent to the problem of testing the restricted mean of a multivariate normal distribution based on one observation from a multivariate normal distribution with misspecified covariance matrix, or from a mixture of multivariate normal distributions. A variety of examples are provided for which the limiting distributions of T may be mixtures of chi-squared variables. We conducted simulation studies to examine the performance of the likelihood ratio test statistics in variance component models and teratological experiments.
In this paper we propose accurate parameter and over-identification tests for indirect inference. Under the null hypothesis the new tests are asymptotically chi-squared distributed with a relative error of order $n^{-1}$. They exhibit better finite sample accuracy than classical tests for indirect inference, which have the same asymptotic distribution but an absolute error of order $n^{-1/2}$. Robust versions of the tests are also provided. We illustrate their accuracy in nonlinear regression, Poisson regression with overdispersion and diffusion models.
We discuss the detection of local signals that occur at the same location in multiple one-dimensional noisy sequences, with particular attention to relatively weak signals that may occur in only a fraction of the sequences. We propose simple scan and segmentation algorithms based on the sum of the chi-squared statistics for each individual sample, which is equivalent to the generalized likelihood ratio for a model where the errors in each sample are independent. The simple geometry of the statistic allows us to derive accurate analytic approximations to the significance level of such scans. The formulation of the model is motivated by the biological problem of detecting recurrent DNA copy number variants in multiple samples. We show using replicates and parent-child comparisons that pooling data across samples results in more accurate detection of copy number variants. We also apply the multisample segmentation algorithm to the analysis of a cohort of tumour samples containing complex nested and overlapping copy number aberrations, for which our method gives a sparse and intuitive cross-sample summary.
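The scan statistic described above can be sketched as follows: for each candidate interval, standardize the interval sum within each sequence, square it, and add the squares across sequences. The sketch assumes independent standard normal noise, scans all intervals up to a maximum width, and returns only the single best interval. The function name `sum_chisq_scan`, the `max_width` parameter and the simulated data are assumptions for illustration; the significance approximations and the segmentation algorithm are not shown.

```python
import numpy as np

def sum_chisq_scan(Y, max_width):
    """Scan for a shared interval with a mean shift in some of the rows of Y.

    Y has shape (N sequences, T positions); noise is assumed N(0, 1).
    Returns (statistic, start, end) for the best half-open interval [start, end).
    """
    N, T = Y.shape
    csum = np.concatenate([np.zeros((N, 1)), np.cumsum(Y, axis=1)], axis=1)
    best = (-np.inf, 0, 0)
    for w in range(1, max_width + 1):
        sums = csum[:, w:] - csum[:, :-w]      # interval sums for every start
        stat = (sums ** 2).sum(axis=0) / w     # sum over sequences of Z_i^2
        s = int(np.argmax(stat))
        if stat[s] > best[0]:
            best = (float(stat[s]), s, s + w)
    return best

# Hypothetical example: 20 sequences, half of which carry a signal on [100, 120).
rng = np.random.default_rng(1)
Y = rng.standard_normal((20, 300))
Y[:10, 100:120] += 2.0
stat, start, end = sum_chisq_scan(Y, max_width=40)
print(stat, start, end)
```

Under the null hypothesis each term is a squared standard normal, so the statistic for a fixed interval is chi-squared with N degrees of freedom, which is what makes pooling across sequences straightforward.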
Definitions are given for weak and strong sufficient cause interactions in settings in which the outcome is binary and in which there are two exposures of interest that are categorical or ordinal. Weak sufficient cause interactions concern cases in which a mechanism will operate under certain values of the two exposures but not when one or the other of the exposures takes some other value. Strong sufficient cause interactions concern cases in which a mechanism will operate under certain values of the two exposures but not when one or the other of the exposures takes any other value. Empirical conditions are derived for such interactions when exposures have two or three levels and are related to regression coefficients in linear and log-linear models. When the exposures are binary, the notions of a weak and a strong sufficient cause interaction coincide, but not when the exposures are categorical or ordinal. The results are applied to examples concerning gene-gene and gene-environment interactions.
Consider estimating the mean of an outcome in the presence of missing data or estimating population average treatment effects in causal inference. A doubly robust estimator remains consistent if an outcome regression model or a propensity score model is correctly specified. We build on a previous nonparametric likelihood approach and propose new doubly robust estimators, which have desirable properties in efficiency if the propensity score model is correctly specified, and in boundedness even if the inverse probability weights are highly variable. We compare the new and existing estimators in a simulation study and find that the robustified likelihood estimators yield overall the smallest mean squared errors.
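To make the double robustness property concrete, the sketch below implements the classical augmented inverse probability weighted estimator of a mean with missing outcomes. It is a standard textbook estimator, not one of the new likelihood-based estimators proposed in the paper. The function name `aipw_mean`, the simulated data and the two deliberately misspecified working models are assumptions for illustration.

```python
import numpy as np

def aipw_mean(y, r, pi_hat, m_hat):
    """Augmented IPW estimate of E[Y].

    y: outcomes (may be NaN where missing), r: 1 if observed, 0 if missing,
    pi_hat: fitted propensity scores, m_hat: fitted outcome regression values.
    """
    y_obs = np.where(r == 1, y, 0.0)       # missing outcomes never enter the sum
    return np.mean(r * y_obs / pi_hat - (r - pi_hat) / pi_hat * m_hat)

# Hypothetical simulation with true mean E[Y] = 1.
rng = np.random.default_rng(2)
n = 20000
x = rng.standard_normal(n)
y_full = 1.0 + 2.0 * x + rng.standard_normal(n)
pi_true = 1.0 / (1.0 + np.exp(-0.5 * x))
r = (rng.uniform(size=n) < pi_true).astype(float)
y = np.where(r == 1, y_full, np.nan)

# Correct propensity score, wrong outcome model (identically zero).
est_pi_ok = aipw_mean(y, r, pi_true, np.zeros(n))
# Wrong propensity score (constant 0.5), correct outcome model.
est_m_ok = aipw_mean(y, r, np.full(n, 0.5), 1.0 + 2.0 * x)
print(est_pi_ok, est_m_ok)
```

Both estimates should be close to 1, since only one of the two working models needs to be correct. The weights `1 / pi_hat` in this classical form can become highly variable, which is the behaviour the bounded estimators are meant to control.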
Complex diseases like cancers can often be classified into subtypes using various pathological and molecular traits of the disease. In this article, we develop methods for analysis of disease incidence in cohort studies incorporating data on multiple disease traits using a two-stage semiparametric Cox proportional hazards regression model that allows one to examine the heterogeneity in the effect of the covariates by the levels of the different disease traits. For inference in the presence of missing disease traits, we propose a generalization of an estimating equation approach for handling missing cause of failure in competing-risk data. We prove asymptotic unbiasedness of the estimating equation method under a general missing-at-random assumption and propose a novel influence-function-based sandwich variance estimator. The methods are illustrated using simulation studies and a real data application involving the Cancer Prevention Study II nutrition cohort.
We propose a semiparametric additive rate model for modelling recurrent events in the presence of a terminal event. The dependence between recurrent events and terminal event is nonparametric. A general transformation model is used to model the terminal event. We construct an estimating equation for parameter estimation and derive the asymptotic distributions of the proposed estimators. Simulation studies demonstrate that the proposed inference procedure performs well in realistic settings. Application to a medical study is presented.
Attributable fractions are commonly used to measure the impact of risk factors on disease incidence in the population. These static measures can be extended to functions of time when the time to disease occurrence or event time is of interest. The present paper deals with nonparametric and semiparametric estimation of attributable fraction functions for cohort studies with potentially censored event time data. The semiparametric models include the familiar proportional hazards model and a broad class of transformation models. The proposed estimators are shown to be consistent, asymptotically normal and asymptotically efficient. Extensive simulation studies demonstrate that the proposed methods perform well in practical situations. An application to a cardiovascular health study is provided. Connections to causal inference are discussed.
We propose a Poisson-compound gamma approach for species richness estimation. Based on the denseness and nesting properties of the gamma mixture, we fix the shape parameter of each gamma component at a unified value, and estimate the mixture using nonparametric maximum likelihood. A least-squares crossvalidation procedure is proposed for the choice of the common shape parameter. The performance of the resulting estimator of N is assessed using numerical studies and genomic data.
Nested sampling is a simulation method for approximating marginal likelihoods. We establish that nested sampling has an approximation error that vanishes at the standard Monte Carlo rate and that this error is asymptotically Gaussian. It is shown that the asymptotic variance of the nested sampling approximation typically grows linearly with the dimension of the parameter. We discuss the applicability and efficiency of nested sampling in realistic problems, and compare it with two current methods for computing marginal likelihood. Finally, we propose an extension that avoids resorting to Markov chain Monte Carlo simulation to obtain the simulated points.
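A toy version of nested sampling helps fix ideas. The sketch below uses a uniform prior on [0, 1] and an unnormalized Gaussian likelihood centred at 0.5, so the marginal likelihood is known in closed form and the constrained prior is an interval that can be sampled exactly. That exact draw is a feature of this one-dimensional toy problem, not the extension proposed in the paper. The function name `nested_sampling_1d` and all parameter values are assumptions for illustration.

```python
import numpy as np

def nested_sampling_1d(sigma=0.05, n_live=500, n_iter=5000, seed=0):
    """Estimate Z = integral over [0, 1] of exp(-(t - 0.5)^2 / (2 sigma^2)) dt."""
    rng = np.random.default_rng(seed)

    def lik(th):
        return np.exp(-0.5 * ((th - 0.5) / sigma) ** 2)

    theta = rng.uniform(0.0, 1.0, n_live)      # live points drawn from the prior
    Z, X_prev = 0.0, 1.0
    for i in range(1, n_iter + 1):
        L = lik(theta)
        worst = int(np.argmin(L))
        X = np.exp(-i / n_live)                # deterministic prior-mass shrinkage
        Z += L[worst] * (X_prev - X)
        X_prev = X
        # Draw from the prior restricted to likelihood above the current threshold:
        # here that region is the interval |t - 0.5| < d.
        d = abs(theta[worst] - 0.5)
        theta[worst] = rng.uniform(0.5 - d, 0.5 + d)
    Z += X_prev * lik(theta).mean()            # contribution of the remaining live points
    return Z

Z_hat = nested_sampling_1d()
Z_true = 0.05 * np.sqrt(2.0 * np.pi)           # tails beyond [0, 1] are negligible
print(Z_hat, Z_true)
```

Increasing `n_live` reduces the error of the estimate, in line with the Monte Carlo rate discussed above, at the cost of proportionally more iterations to reach the same final prior mass.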
We consider empirical likelihood for the mean similarity shape of objects in two dimensions described by labelled landmarks. The restriction to two dimensions permits the representation of preshapes as complex unit vectors. We focus on the use of empirical likelihood techniques for the construction of confidence regions for the mean shape and for testing the hypothesis of a common mean shape across several populations. Theoretical properties and computational details are discussed and the results of a simulation study are presented. Our results show that bootstrap calibrated empirical likelihood performs well in practice in the planar shape setting.
Necessary and sufficient conditions for the existence of a strictly stationary solution of the equations defining an autoregressive moving average process driven by an independent and identically distributed noise sequence are determined. No moment assumptions on the driving noise sequence are made.
Objective Bayes methodology is considered for conditional frequentist inference about a canonical parameter in a multi-parameter exponential family. A condition is derived under which posterior Bayes quantiles match the conditional frequentist coverage to a higher-order approximation in terms of the sample size. This condition is on the model, not on the prior, and it ensures that any first-order probability matching prior in the unconditional sense automatically yields higher-order conditional probability matching. Objective Bayes methods are compared to parametric bootstrap and analytic methods for higher-order conditional frequentist inference.
This paper discusses copula model selection procedures and goodness-of-fit tests under censoring. The proposed methodology is based on a comparison of nonparametric and model-based estimators of the probability integral transformation, K. New weighted estimators for K are introduced. The resulting tests are compared to an existing approach by simulation and illustrated with an example involving bleeding changes in a woman’s reproductive history.
We derive locally D- and EDp-optimal designs for the exponential, log-linear and three-parameter emax models. For each model the locally D- and EDp-optimal designs are supported at the same set of points, while the corresponding weights are different. This indicates that for a given model, D-optimal designs are efficient for estimating the smallest dose that achieves 100p% of the maximum effect in the observed dose range. Conversely, EDp-optimal designs also yield good D-efficiencies. We illustrate the results using several examples and demonstrate that locally D- and EDp-optimal designs for the emax, log-linear and exponential models are relatively robust with respect to misspecification of the model parameters.