To compare survival between screen-detected and clinically detected cancers, we applied a series of non-homogeneous stochastic processes to deal with lead time, length bias, and over-detection, using full information on detection modes obtained from the Finnish randomized controlled trial for prostate cancer screening. The results show that, after 9 years of follow-up, the hazard ratio of prostate cancer death for screen-detected cases against clinically detected cases increased from 0.24 (95% CI: 0.16–0.35) without correction for these biases, to 0.76 after correction for lead-time and length biases, and finally to 1.03 (95% CI: 0.79–1.33) after further adjustment for over-detection. Adjustment for lead-time and length bias but not for over-detection thus indicated a 24% reduction in prostate cancer death as a result of the prostate-specific antigen test. The further adjustment for over-detection indicates no gain in survival of screen-detected prostate cancers (excluding over-detected cases, considered as stayers in the mover–stayer model) as compared with the control group in the absence of screening, which is considered as the mover. However, the robustness of the model assumption on over-detection should be validated with other data sets and longer follow-up.
This paper focuses on the problem of functional statistical classification of gene expression curves. A local-wavelet-vaguelette-based functional logistic regression approach is presented. This approach is especially suitable for the classification of non-stationary singular (non-differentiable) curves. The performance of the proposed methodology is illustrated by applying it to the classification of yeast cell-cycle temporal gene expression profiles. A simulation study is also carried out for comparison with other functional classification methodologies.
Cancer survival is one of the most important measures for evaluating the effectiveness of treatment and early diagnosis. The ultimate goal of cancer research and patient care is the cure of cancer. As cancer treatments progress, cure becomes a reality for many cancers if patients are diagnosed early and receive effective treatment. If a cure does exist for a certain type of cancer, it is useful to estimate the time of cure. For cancers that impose excess risk of mortality, it is informative to understand the difference in survival between cancer patients and the general cancer-free population. In population-based cancer survival studies, relative survival is the standard measure of excess mortality due to cancer. Cure is achieved when the survival of cancer patients is equivalent to that of the general population. This definition of cure is usually called statistical cure, and it is an important measure of the burden due to cancer. In this paper, a minimum version of the log-rank test is proposed to test the equivalence of cancer patients' survival using relative survival data. Performance of the proposed test is evaluated by simulation. Relative survival data from population-based cancer registries in the SEER Program are used to examine patients' survival after diagnosis for various major cancer sites.
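As a rough illustration of the relative survival measure described above, the following Python sketch divides observed patient survival by the expected survival of a comparable general population and then derives interval-specific ratios. The function names and all numbers are hypothetical and chosen only for illustration; this is not the test proposed in the paper.

```python
def relative_survival(observed, expected):
    """Cumulative relative survival: observed patient survival divided by
    the expected survival of a comparable general population."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return [o / e for o, e in zip(observed, expected)]


def interval_relative_survival(cumulative):
    """Convert cumulative relative survival into interval-specific ratios."""
    ratios = [cumulative[0]]
    for previous, current in zip(cumulative, cumulative[1:]):
        ratios.append(current / previous)
    return ratios


# Hypothetical cumulative survival at years 1, 2, 3 (illustrative values only).
observed = [0.80, 0.72, 0.6912]
expected = [0.96, 0.9216, 0.884736]

cumulative = relative_survival(observed, expected)
by_interval = interval_relative_survival(cumulative)
```

In this made-up example the interval-specific ratio reaches about 1 in the third year, which is the pattern that the statistical cure definition describes: from that point on, patients die at the same rate as the general population.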
We present a hierarchical Bayesian model (HBM) to estimate the growth parameters, production, and production over biomass ratio (P/B) of resident brown trout (Salmo trutta fario) populations. The data required to run the model are removal sampling and air temperature data, which are conveniently gathered by freshwater biologists. The model is a combination of eight submodels: abundance, weight, biomass, growth, growth rate, time of emergence, water temperature, and production. Abundance is modeled as a mixture of Gaussian cohorts; cohort centers and standard deviations are related by a von Bertalanffy growth function; time of emergence and growth rate are functions of water temperature; water temperature is predicted from air temperature; biomass, production, and P/B are subsequently computed. We illustrate the capabilities of the model by investigating the growth and production of a brown trout population (Neste d'Oueil, Pyrénées, France) using data collected in the field from 2005 to 2010.
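For readers unfamiliar with the von Bertalanffy growth function mentioned above, a minimal Python sketch of its standard textbook form follows. The exact parameterization used in the HBM is not given in the abstract, so the form below, the function name, and the parameter values are assumptions for illustration only.

```python
import math


def von_bertalanffy_length(age, l_inf, k, t0=0.0):
    """Standard von Bertalanffy curve: L(t) = L_inf * (1 - exp(-k * (t - t0))).

    l_inf is the asymptotic length, k the growth rate, and t0 the
    theoretical age at zero length (all values used here are hypothetical).
    """
    return l_inf * (1.0 - math.exp(-k * (age - t0)))


# Hypothetical parameters: asymptotic length 300 mm, growth rate 0.4 per year.
lengths = [von_bertalanffy_length(age, l_inf=300.0, k=0.4) for age in range(6)]
```

The resulting lengths increase with age and level off below the asymptote, which is what lets such a curve tie the centers of successive cohorts together.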
An estimator of the hazard rate function from discrete failure time data is obtained by semiparametric smoothing of the (nonsmooth) maximum likelihood estimator, which is achieved by repeated multiplication of a Markov chain transition-type matrix. This matrix is constructed so as to have a given standard discrete parametric hazard rate model, termed the vehicle model, as its stationary hazard rate. As with the discrete density estimation case, the proposed estimator gives improved performance when the vehicle model is a good one and otherwise provides a nonparametric method comparable to the only purely nonparametric smoother discussed in the literature. The proposed semiparametric smoothing approach is then extended to hazard models with covariates and is illustrated by applications to simulated and real data sets.
Analysis of adverse events (AEs) for drug safety assessment presents challenges to statisticians in observational studies as well as in clinical trials, since AEs are typically recurrent with varying duration and severity. Routine analyses often concentrate on the number of patients who had at least one occurrence of a specific AE or a group of AEs, or on the time to occurrence of the first event. We argue that other information in AE data, particularly the cumulative duration of events, is also important, especially for benefit-risk assessment. We propose a nonparametric method to estimate the mean cumulative duration (MCD) based on the nonparametric cumulative mean function estimate, together with a robust estimate for the variance of the estimate, as in Lawless and Nadeau (1995). This approach can be easily used to analyze multiple, overlapping, and severity-weighted AE durations. This method can also be used for estimating the difference between two MCDs. Estimation in the presence of censoring due to informative dropouts and/or a terminal event is also considered. The method can be implemented in standard software such as SAS. We illustrate the use of the method with a numerical example. Small-sample properties of this approach are examined via simulation.
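One way to picture a mean cumulative duration estimate is sketched below in Python. It assumes the estimator integrates, over time, the proportion of at-risk patients who are currently experiencing an AE, with censored patients leaving the risk set at the end of their follow-up. This reading, the function name, and the toy data are assumptions; the abstract does not give the estimator's exact form, and the robust variance estimate and severity weighting are not shown.

```python
def mean_cumulative_duration(subjects, t):
    """Sketch of an MCD estimate at time t under independent censoring.

    subjects: list of (follow_up_end, [(ae_start, ae_end), ...]) tuples.
    Overlapping episodes within one subject are counted once.
    """
    # Collect every time point at which the risk set or AE status can change.
    cuts = {0.0, float(t)}
    for follow_up_end, episodes in subjects:
        cuts.add(float(follow_up_end))
        for start, stop in episodes:
            cuts.add(float(start))
            cuts.add(float(stop))
    grid = sorted(c for c in cuts if 0.0 <= c <= t)

    total = 0.0
    for a, b in zip(grid, grid[1:]):
        # Subjects still under observation throughout the interval [a, b].
        at_risk = [eps for end, eps in subjects if end >= b]
        if not at_risk:
            break
        # At-risk subjects whose AE covers the whole interval.
        in_ae = sum(1 for eps in at_risk if any(s <= a and b <= e for s, e in eps))
        total += (b - a) * in_ae / len(at_risk)
    return total


# Toy data: one patient with an AE from day 2 to day 6 followed to day 10,
# and one AE-free patient censored at day 4.
toy = [(10.0, [(2.0, 6.0)]), (4.0, [])]
mcd_at_10 = mean_cumulative_duration(toy, 10.0)
```

For the toy data the AE days are weighted by 1/2 while both patients are at risk and by 1 after the second patient is censored, which gives 1 + 2 = 3 days.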
A modification of the principal component test is presented. It uses a weighted combination of the sums of squares for different principal components and is thus more powerful in high-dimensional settings with small sample sizes. Under usual normality assumptions, a rotation test is proposed which enables an exact conditional parametric test. The procedure is demonstrated with microarray data for the bacterial composition in the rhizosphere of different potato cultivars. In simulation studies, the power of the proposed statistic is compared with the competing multivariate parametric tests.
The classification accuracy of new diagnostic tests is based on receiver operating characteristic (ROC) curves. The area under the ROC curve (AUC) is one of the well-accepted summary measures for describing the accuracy of diagnostic tests. The AUC summary measure can vary by patient and testing characteristics. Thus, the performance of the test may differ in certain subpopulations of patients and readers. For this purpose, we propose a direct semi-parametric regression model for the non-parametric AUC measure for ordinal data while accounting for discrete and continuous covariates. The proposed method can be used to estimate the AUC value under degenerate data where certain rating categories are not observed. We discuss the non-standard asymptotic theory, which is needed because the estimating functions are based on cross-correlated random variables. Simulation studies based on different classification models showed that the proposed model worked reasonably well, with small percent bias and percent mean-squared error. The proposed method was applied to the prostate cancer study to estimate the AUC for four readers, and to the carotid vessel study with age, gender, history of previous stroke, and total number of risk factors as covariates, to estimate the accuracy of the diagnostic test in the presence of subject-level covariates.
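The non-parametric AUC for ordinal ratings that such a regression model targets can be illustrated with a short Python sketch. It computes the usual Mann-Whitney-type estimate, counting tied ratings as one half; the covariate regression itself is not shown, and the function name and ratings are hypothetical.

```python
def nonparametric_auc(diseased, non_diseased):
    """Mann-Whitney estimate of the AUC for ordinal ratings.

    Each (diseased, non-diseased) pair scores 1 if the diseased rating is
    higher, 0.5 if the two ratings are tied, and 0 otherwise.
    """
    if not diseased or not non_diseased:
        raise ValueError("both groups must contain at least one rating")
    score = 0.0
    for d in diseased:
        for n in non_diseased:
            if d > n:
                score += 1.0
            elif d == n:
                score += 0.5
    return score / (len(diseased) * len(non_diseased))


# Hypothetical 5-point ratings; category 2 is unobserved among the diseased
# and categories 4 and 5 among the non-diseased.
auc = nonparametric_auc([3, 4, 5, 5], [1, 2, 3, 3])
```

Because the estimate only compares pairs of ratings, it remains defined when some rating categories are never observed, as in the example.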
Online risk prediction tools for common cancers are now easily accessible and widely used by patients and doctors for informed decision-making concerning screening and diagnosis. A practical problem is that, as cancer research moves forward and new biomarkers and risk factors are discovered, the risk algorithms need to be updated to include them. Typically, the new markers and risk factors cannot be retrospectively measured on the same study participants used to develop the original prediction tool, necessitating the merging of a separate study of different participants, which may be much smaller in sample size and of a different design. Validation of the updated tool on a third independent data set is warranted before the updated tool can go online. This article reports on the application of Bayes rule for updating risk prediction tools to include a set of biomarkers measured in a study external to the original study used to develop the risk prediction tool. The procedure is illustrated in the context of updating the online Prostate Cancer Prevention Trial Risk Calculator to incorporate the new markers %freePSA and [-2]proPSA measured in an external case–control study performed in Texas, U.S. Recent state-of-the-art methods for validating risk prediction tools and for evaluating the improvement of updated over original tools are implemented using an external validation set provided by the U.S. Early Detection Research Network.
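The Bayes-rule update described above can be sketched in odds form: the posterior odds of cancer equal the prior odds from the original tool multiplied by a likelihood ratio for the new markers. The Python sketch below assumes the likelihood ratio has already been estimated from the external study; the function name and numbers are hypothetical, and how the ratio is estimated is not shown here.

```python
def update_risk(prior_risk, likelihood_ratio):
    """Update a predicted risk with new marker information via Bayes rule.

    prior_risk: risk from the original tool, strictly between 0 and 1.
    likelihood_ratio: P(markers | cancer) / P(markers | no cancer),
    assumed to have been estimated from an external study.
    """
    if not 0.0 < prior_risk < 1.0:
        raise ValueError("prior_risk must be strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    prior_odds = prior_risk / (1.0 - prior_risk)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical example: a 25% prior risk and marker values twice as likely
# among cases as among controls.
updated = update_risk(0.25, 2.0)
```

With these illustrative inputs the updated risk is 0.4, and a likelihood ratio of 1 leaves the original risk unchanged.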
This paper discusses multiplicity issues arising in confirmatory clinical trials with hierarchically ordered multiple objectives. In order to protect the overall type I error rate, multiple objectives are analyzed using multiple testing procedures. When the objectives are ordered and grouped in multiple families (e.g. families of primary and secondary endpoints), gatekeeping procedures are employed to account for this hierarchical structure. We discuss considerations arising in the process of building gatekeeping procedures, including proper use of relevant trial-specific information and criteria for selecting gatekeeping procedures. The methods and principles discussed in this paper are illustrated using a clinical trial in patients with type II diabetes mellitus.
For all pairwise comparisons for equivalence of k (k≥2) treatments, Lauzon and Caffo proposed simply dividing the type I error level α by k−1 to achieve Bonferroni-based familywise error control when declaring pairs of treatments equivalent. This rule is shown to be too liberal for k≥4. It works for k=3, but for reasons not considered by Lauzon and Caffo. Based on the two one-sided tests procedure and using the closure test principle, we develop valid alternatives based on Bonferroni's inequality. The set H of intersection hypotheses reveals a rich structure, which makes it possible to represent H as a directed acyclic graph (DAG). This in turn allows the use of graph-theoretical theorems and eases proving properties of the resulting multiple testing problems.
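To make the arithmetic concrete, the Python sketch below compares the per-comparison level α/(k−1) with a plain Bonferroni level over all k(k−1)/2 pairs, and applies the usual two one-sided tests decision for a single pair. The plain Bonferroni level is shown only as a conservative baseline; it is not the closure-based alternative developed in the paper, and all function names are hypothetical.

```python
def n_pairs(k):
    """Number of pairwise comparisons among k treatments."""
    return k * (k - 1) // 2


def level_divided_by_k_minus_1(alpha, k):
    """Per-comparison level under the alpha / (k - 1) rule."""
    return alpha / (k - 1)


def full_bonferroni_level(alpha, k):
    """Per-comparison level when alpha is split over all pairs (baseline)."""
    return alpha / n_pairs(k)


def tost_declares_equivalence(p_lower, p_upper, level):
    """Two one-sided tests: equivalence is declared only if both
    one-sided p-values fall at or below the per-comparison level."""
    return max(p_lower, p_upper) <= level


# Per-comparison levels for alpha = 0.05 and k = 3, 4, 5 treatments.
levels = {
    k: (n_pairs(k), level_divided_by_k_minus_1(0.05, k), full_bonferroni_level(0.05, k))
    for k in (3, 4, 5)
}
```

The gap between the two levels widens as k grows, because the number of pairs increases quadratically while k−1 increases only linearly.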
We propose a robust Cox regression model for data with outliers. The model is fit by trimming the smallest contributions to the partial likelihood. To do so, we implement a Metropolis-type maximization routine and show its convergence to a global optimum. We discuss global robustness properties of the approach, which is illustrated and compared through simulations. Finally, we fit the model to an original and to a benchmark data set.
We describe a nonparametric Bayesian approach for estimating the three-way ROC surface based on mixtures of finite Polya trees (MFPT) priors. Mixtures of finite Polya trees are robust models that can handle nonstandard features in the data. We address the difficulties in modeling continuous diagnostic data with skewness, multimodality, or other nonstandard features, and show how parametric approaches can lead to misleading results in such cases. Robust, data-driven inference for the ROC surface and for the volume under the ROC surface is obtained. A simulation study is performed to assess the performance of the proposed method. The methods are applied to data from a magnetic resonance spectroscopy study of human immunodeficiency virus patients.