EBookClubs

Read Books & Download eBooks Full Online


Book Variable Selection with Penalized Gaussian Process Regression Models

Download or read book Variable Selection with Penalized Gaussian Process Regression Models, written by Gang Yi. This book was released in 2010. Available in PDF, EPUB and Kindle. Book excerpt:

Book Gaussian Process Regression Analysis for Functional Data

Download or read book Gaussian Process Regression Analysis for Functional Data, written by Jian Qing Shi and published by CRC Press. This book was released on 2011-07-01 with a total of 218 pages. Available in PDF, EPUB and Kindle. Book excerpt: Gaussian Process Regression Analysis for Functional Data presents nonparametric statistical methods for functional regression analysis, specifically methods based on a Gaussian process prior in a functional space. The authors focus on problems involving functional response variables and mixed covariates of functional and scalar variables. Covering the basics of Gaussian process regression, the first several chapters discuss functional data analysis, theoretical aspects based on the asymptotic properties of Gaussian process regression models, and new methodological developments for high-dimensional data and variable selection. The remainder of the text explores advanced topics of functional regression analysis, including novel nonparametric statistical methods for curve prediction, curve clustering, functional ANOVA, and functional regression analysis of batch data, repeated curves, and non-Gaussian data. Many flexible models based on Gaussian processes provide efficient ways of learning a model, interpreting its structure, and carrying out inference, particularly when dealing with high-dimensional functional data. This book shows how to use these Gaussian process regression models in the analysis of functional data. Some MATLAB® and C code is available on the first author's website.
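The posterior computation at the heart of Gaussian process regression is compact enough to sketch. The following minimal example is not taken from the book; the squared-exponential kernel, its hyperparameters, and the toy data are illustrative assumptions. It shows how the posterior mean and covariance at test inputs follow from the training data by standard Gaussian conditioning.

```python
import numpy as np

def sq_exp_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between 1-D input vectors a and b."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length_scale**2)

# Toy data: noisy observations of sin(x).
rng = np.random.default_rng(0)
x_train = rng.uniform(0, 5, 20)
y_train = np.sin(x_train) + 0.1 * rng.standard_normal(20)
x_test = np.linspace(0, 5, 100)

noise_var = 0.1**2
K = sq_exp_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
K_s = sq_exp_kernel(x_train, x_test)
K_ss = sq_exp_kernel(x_test, x_test)

# Posterior mean and covariance by Gaussian conditioning, via Cholesky solves.
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
post_mean = K_s.T @ alpha
v = np.linalg.solve(L, K_s)
post_cov = K_ss - v.T @ v
```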

Book Consistent Bi-level Variable Selection Via Composite Group Bridge Penalized Regression

Download or read book Consistent Bi-level Variable Selection Via Composite Group Bridge Penalized Regression, written by Indu Seetharaman. This book was released in 2013. Available in PDF, EPUB and Kindle. Book excerpt: We study composite group bridge penalized regression methods for conducting bi-level variable selection in high-dimensional linear regression models with a diverging number of predictors. The proposed method combines the ideas of bridge regression (Huang et al., 2008a) and group bridge regression (Huang et al., 2009) to achieve variable selection consistency at both the individual and group levels simultaneously, i.e., the important groups and the important individual variables within each group can both be correctly identified with probability approaching one as the sample size increases to infinity. The method takes full advantage of prior grouping information, and the established bi-level oracle properties ensure that the method is immune to possible group misidentification. A related adaptive group bridge estimator, which uses adaptive penalization to improve bi-level selection, is also investigated. Simulation studies show that the proposed methods have superior performance compared with many existing methods.
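To make the bi-level idea concrete, here is a sketch of one plausible composite group bridge penalty: a concave bridge power applied on top of group-level sums of concave bridge powers. The exact exponents and group weights in the thesis's estimator may differ; this illustrates the structure, not the author's exact formulation.

```python
import numpy as np

def composite_group_bridge_penalty(beta, groups, lam=1.0, gamma_outer=0.5, gamma_inner=0.5):
    """Outer bridge power applied to group-level sums of inner bridge powers.
    Concave exponents in (0, 1) are what allow exact zeros to occur at both
    the group level and the individual-coefficient level."""
    return lam * sum(
        np.sum(np.abs(beta[idx]) ** gamma_inner) ** gamma_outer for idx in groups
    )

beta = np.array([0.0, 0.0, 1.2, -0.5, 0.0, 2.0])
groups = [np.array([0, 1]), np.array([2, 3]), np.array([4, 5])]
print(composite_group_bridge_penalty(beta, groups))  # the all-zero group contributes 0
```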

Book Variable Selection Via Penalized Regression and the Genetic Algorithm Using Information Complexity, with Applications for High-dimensional Omics Data

Download or read book Variable Selection Via Penalized Regression and the Genetic Algorithm Using Information Complexity, with Applications for High-dimensional Omics Data, written by Tyler J. Massaro. This book was released in 2016 with a total of 360 pages. Available in PDF, EPUB and Kindle. Book excerpt: This dissertation is a collection of examples, algorithms, and techniques for researchers interested in selecting influential variables in statistical regression models. Chapters 1, 2, and 3 provide background information that will be used throughout the remaining chapters, on topics including, but not limited to, information complexity, model selection, covariance estimation, stepwise variable selection, penalized regression, and especially the genetic algorithm (GA) approach to variable subsetting. In Chapter 4, we fully develop the framework for performing GA subset selection in logistic regression models. We present the advantages of this approach over stepwise and elastic-net regularized regression in selecting variables from a classical set of ICU data. We further compare these results to an entirely new procedure for variable selection developed explicitly for this dissertation, called post hoc adjustment of measured effects (PHAME). In Chapter 5, we reproduce many of the same results from Chapter 4, for the first time, in a multinomial logistic regression setting. The utility and convenience of the PHAME procedure are demonstrated on a set of cancer genomic data. Chapter 6 marks a departure from supervised learning problems as we shift our focus to unsupervised problems involving mixture distributions of count data from epidemiologic fields. We start by reintroducing minimum Hellinger distance estimation, alongside model selection techniques, as a worthy alternative to the EM algorithm for generating mixtures of Poisson distributions. We also create, for the first time, a GA that derives mixtures of negative binomial distributions. The work from Chapter 6 is incorporated into Chapters 7 and 8, where we conclude the dissertation with a novel analysis of mixtures of count-data regression models. We provide algorithms based on single- and multi-target genetic algorithms that fit mixtures of penalized count-data regression models, and we demonstrate the usefulness of this technique on HIV count data used in a previous study published by Gray, Massaro et al. (2015), as well as on time-to-event data taken from the cancer genomic data sets introduced earlier.
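The genetic-algorithm approach to variable subsetting is easy to sketch. The toy implementation below scores binary inclusion masks with BIC rather than the information-complexity (ICOMP) criteria the dissertation develops, and uses generic tournament selection, uniform crossover, and bit-flip mutation; all hyperparameters are illustrative, not the dissertation's settings.

```python
import numpy as np

def bic_score(X, y, mask):
    """BIC of an OLS fit using only the columns selected by the binary mask."""
    k = mask.sum()
    if k == 0:
        return np.inf
    Xs = X[:, mask.astype(bool)]
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    resid = y - Xs @ beta
    n = len(y)
    return n * np.log(resid @ resid / n) + k * np.log(n)

def ga_select(X, y, pop=40, gens=60, p_mut=0.05, seed=0):
    """Toy GA over inclusion masks: tournament selection, uniform
    crossover, bit-flip mutation, scored by BIC (lower is better)."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    population = rng.integers(0, 2, size=(pop, p))
    for _ in range(gens):
        scores = np.array([bic_score(X, y, m) for m in population])
        children = []
        for _ in range(pop):
            i, j = rng.integers(0, pop, 2)  # tournament of size 2
            parent_a = population[i if scores[i] < scores[j] else j]
            i, j = rng.integers(0, pop, 2)
            parent_b = population[i if scores[i] < scores[j] else j]
            cross = rng.integers(0, 2, p).astype(bool)  # uniform crossover
            child = np.where(cross, parent_a, parent_b)
            flip = rng.random(p) < p_mut  # bit-flip mutation
            children.append(np.where(flip, 1 - child, child))
        population = np.array(children)
    scores = np.array([bic_score(X, y, m) for m in population])
    return population[np.argmin(scores)]

X = np.random.default_rng(1).standard_normal((100, 10))
y = X[:, [0, 3]] @ np.array([2.0, -1.5]) + 0.5 * np.random.default_rng(2).standard_normal(100)
print(ga_select(X, y))  # should usually recover columns 0 and 3
```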

Book Subset Selection in Regression

Download or read book Subset Selection in Regression, written by Alan Miller and published by CRC Press. This book was released on 2002-04-15 with a total of 258 pages. Available in PDF, EPUB and Kindle. Book excerpt: Originally published in 1990, the first edition of Subset Selection in Regression filled a significant gap in the literature, and its critical and popular success has continued for more than a decade. Thoroughly revised to reflect progress in theory, methods, and computing power, the second edition promises to continue that tradition.

Book Penalized Regressions for Variable Selection Model, Single-Index Model, and an Analysis of Mass Spectrometry Data

Download or read book Penalized Regressions for Variable Selection Model, Single-Index Model, and an Analysis of Mass Spectrometry Data, written by Yubing Wan. This book was released in 2014 with a total of 84 pages. Available in PDF, EPUB and Kindle. Book excerpt: The focus of this dissertation is to develop statistical methods, under the framework of penalized regression, to handle three different problems. The first research topic addresses the missing-data problem for variable selection models, including the elastic net (ENet) method and sparse partial least squares (SPLS). I proposed a multiple-imputation-based weighted ENet (MI-WENet) method that uses stacked MI data and a weighting scheme for each observation. Numerical simulations were implemented to examine the performance of the MI-WENet method and compare it with competing alternatives. I then applied the MI-WENet method to examine the predictors of endothelial function, characterized by median effective dose and maximum effect, in an ex-vivo experiment. The second topic is to develop monotonic single-index models for assessing drug interactions. In single-index models, the link function f is not necessarily monotonic. However, in combination drug studies, it is desirable to have a monotonic link function f. I proposed to estimate f using penalized splines with an I-spline basis. An algorithm for estimating f and the parameter a in the index was developed. Simulation studies were conducted to examine the performance of the proposed models in terms of accuracy in estimating f and a. Moreover, I applied the proposed method to examine the interaction of two drugs in a real case study. The third topic focuses on SPLS- and ENet-based accelerated failure time (AFT) models for predicting patient survival time with mass spectrometry (MS) data. A typical MS data set contains a limited number of spectra, while each spectrum contains tens of thousands of intensity measurements representing an unknown number of peptide peaks as the key features of interest. Due to the high dimension and high correlations among features, traditional linear regression modeling is not applicable. A semi-parametric AFT model with an unspecified error distribution is a well-accepted approach in survival analysis. To reduce the bias introduced in the denoising step, I proposed a nonparametric imputation approach based on the Kaplan-Meier estimator. Numerical simulations and a real case study were conducted using the proposed method.
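The weighting idea behind MI-WENet can be illustrated with a generic observation-weighted elastic net fitted by coordinate descent. The sketch below is a simplification: the dissertation derives its own weights for the stacked imputed rows, while here the weights, penalty level, and mixing parameter are arbitrary placeholders.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def weighted_enet(X, y, w, lam=0.1, alpha=0.5, n_iter=200):
    """Coordinate descent for an observation-weighted elastic net:
    minimize (1/2n) sum_i w_i (y_i - x_i'b)^2
             + lam * (alpha*||b||_1 + (1-alpha)/2 * ||b||_2^2).
    Stacked multiply-imputed rows would each carry a weight w_i so that
    every original observation counts once in total."""
    n, p = X.shape
    beta = np.zeros(p)
    z = (w[:, None] * X**2).mean(axis=0)  # weighted column scales
    r = y - X @ beta                      # residuals
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * beta[j]        # remove coordinate j's contribution
            rho = np.mean(w * X[:, j] * r)
            beta[j] = soft_threshold(rho, lam * alpha) / (z[j] + lam * (1 - alpha))
            r -= X[:, j] * beta[j]
    return beta

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = X @ np.array([1.0, 0.0, -2.0, 0.0, 0.5]) + 0.3 * rng.standard_normal(200)
w = np.ones(200)  # equal weights here; stacked MI rows would get, e.g., 1/m each
print(weighted_enet(X, y, w))
```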

Book Variable Selection Via Penalized Likelihood

Download or read book Variable Selection Via Penalized Likelihood. This book was released in 2014 with a total of 121 pages. Available in PDF, EPUB and Kindle. Book excerpt: Variable selection via penalized likelihood plays an important role in high-dimensional statistical modeling, and it has attracted great attention in the recent literature. This thesis is devoted to the study of the variable selection problem. It consists of three major parts, all of which fall within the penalized least squares regression framework. In the first part of this thesis, we propose a family of nonconvex penalties named the K-Smallest Items (KSI) penalty for variable selection, which improves the performance of variable selection and reduces estimation bias in the estimates of the important coefficients. We fully investigate the theoretical properties of the KSI method and show that it possesses the weak oracle property and the oracle property in the high-dimensional setting, where the number of coefficients is allowed to be much larger than the sample size. To demonstrate its numerical performance, we applied the KSI method to several simulation examples as well as the well-known Boston housing dataset. We also extend the idea of the KSI method to handle the group variable selection problem. In the second part of this thesis, we propose another nonconvex penalty, named the Self-adaptive Penalty (SAP), for variable selection. It is distinguished from other existing methods in that the penalization of each individual coefficient directly takes into account the influence of the other estimated coefficients. We also thoroughly study the theoretical properties of the SAP method and show that it possesses the weak oracle property under desirable conditions. The proposed method is applied to glioblastoma cancer data obtained from The Cancer Genome Atlas. In many scientific and engineering applications, covariates are naturally grouped. When group structure is available among covariates, one is usually interested in identifying both important groups and important variables within the selected groups; in statistics, this is a group variable selection problem. In the third part of this thesis, we propose a novel Log-Exp-Sum (LES) penalty for group variable selection. The LES penalty is strictly convex. It can identify important groups as well as select important variables within groups. We develop an efficient group-level coordinate descent algorithm to fit the model. We also derive non-asymptotic error bounds and asymptotic group selection consistency for our method in the high-dimensional setting. Numerical results demonstrate the good performance of our method in both variable selection and prediction. We applied the proposed method to an American Cancer Society breast cancer survivor dataset. The findings are clinically meaningful and may help design intervention programs to improve the quality of life of breast cancer survivors.
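For the third part, a log-exp-sum construction can be sketched as follows. The exact functional form, scaling, and group weights of the thesis's LES penalty are not reproduced here, so treat this as one convex penalty in that spirit rather than the proposed method itself.

```python
import numpy as np

def log_exp_sum_penalty(beta, groups, lam=1.0, a=2.0):
    """A convex group penalty in the log-exp-sum spirit: each group
    contributes (1/a) * log(sum_j exp(a*|beta_j|)). Unlike concave bridge
    penalties this is convex, and a group's largest coefficients dominate
    its contribution; the penalty is minimized at beta = 0."""
    return lam * sum(
        np.log(np.exp(a * np.abs(beta[idx])).sum()) / a for idx in groups
    )

beta = np.array([0.0, 0.0, 1.2, -0.5])
groups = [np.array([0, 1]), np.array([2, 3])]
print(log_exp_sum_penalty(beta, groups))
```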

Book Gaussian Processes for Machine Learning

Download or read book Gaussian Processes for Machine Learning, written by Carl Edward Rasmussen and published by MIT Press. This book was released on 2005-11-23 with a total of 266 pages. Available in PDF, EPUB and Kindle. Book excerpt: A comprehensive and self-contained introduction to Gaussian processes (GPs), which provide a principled, practical, probabilistic approach to learning in kernel machines. GPs have received increased attention in the machine-learning community over the past decade, and this book provides a long-needed systematic and unified treatment of theoretical and practical aspects of GPs in machine learning. The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics. The book deals with the supervised-learning problem for both regression and classification, and includes detailed algorithms. A wide variety of covariance (kernel) functions are presented and their properties discussed. Model selection is discussed from both Bayesian and classical perspectives. Many connections to other well-known techniques from machine learning and statistics are discussed, including support-vector machines, neural networks, splines, regularization networks, relevance vector machines, and others. Theoretical issues, including learning curves and the PAC-Bayesian framework, are treated, and several approximation methods for learning with large datasets are discussed. The book contains illustrative examples and exercises, and code and datasets are available on the Web. Appendixes provide mathematical background and a discussion of Gaussian Markov processes.
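One of the book's central tools, Bayesian model selection via the log marginal likelihood, fits in a few lines. The sketch below uses toy data, an RBF kernel, and a grid search over length-scales instead of the gradient-based optimization the book describes; all of those choices are illustrative assumptions.

```python
import numpy as np

def rbf(a, b, ell):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)

def log_marginal_likelihood(x, y, ell, noise_var=0.01):
    """log p(y | x, ell) for a zero-mean GP with RBF kernel and Gaussian
    noise: -0.5*y'K^{-1}y - 0.5*log|K| - (n/2)*log(2*pi)."""
    n = len(x)
    K = rbf(x, x, ell) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ alpha - np.log(np.diag(L)).sum() - 0.5 * n * np.log(2 * np.pi)

rng = np.random.default_rng(1)
x = rng.uniform(0, 5, 30)
y = np.sin(x) + 0.1 * rng.standard_normal(30)
grid = [0.1, 0.3, 1.0, 3.0]
best = max(grid, key=lambda ell: log_marginal_likelihood(x, y, ell))
print(best)  # length-scale with the highest evidence on this toy data
```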

Book Variable Selection and Parameter Estimation Using a Continuous and Differentiable Approximation to the L0 Penalty Function

Download or read book Variable Selection and Parameter Estimation Using a Continuous and Differentiable Approximation to the L0 Penalty Function, written by Douglas Nielsen VanDerwerken. This book was released in 2011 with a total of 48 pages. Available in PDF, EPUB and Kindle. Book excerpt: L0 penalized likelihood procedures like Mallows' Cp, AIC, and BIC directly penalize the number of variables included in a regression model. This is a straightforward approach to the problem of overfitting, and these methods are now part of every statistician's repertoire. However, these procedures have been shown to sometimes result in unstable parameter estimates as a result of the L0 penalty's discontinuity at zero. One proposed alternative, seamless-L0 (SELO), utilizes a continuous penalty function that mimics L0 and allows for stable estimates. Like other similar methods (e.g., LASSO and SCAD), SELO produces sparse solutions because the penalty function is non-differentiable at the origin. Because these penalized likelihoods are singular (non-differentiable) at zero, there is no closed-form solution for the extremum of the objective function. We propose a continuous and everywhere-differentiable penalty function that can have an arbitrarily steep slope in a neighborhood near zero, thus mimicking the L0 penalty but allowing for a nearly closed-form solution for the beta-hat vector. Because our function is not singular at zero, beta-hat will have no zero-valued components, although some will have been shrunk arbitrarily close to zero. We employ the BIC-selected tuning parameter used in the shrinkage step to perform zero-thresholding as well. We call the resulting vector of coefficients the ShrinkSet estimator. It is comparable to SELO in terms of model performance (selecting the truly nonzero coefficients, overall MSE, etc.), but we believe it to be more intuitive and simpler to compute. We provide strong evidence that the estimator enjoys favorable asymptotic properties, including the oracle property.
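The dissertation's exact penalty is not given in this blurb, but the mechanism can be sketched with a hypothetical smooth L0 surrogate, p(b) = lam * b^2 / (b^2 + a), which is everywhere differentiable and arbitrarily steep near zero for small a. Freezing the denominator at the current iterate turns each update into a ridge solve, which is the "nearly closed-form" idea; the fixed threshold below stands in for the BIC-selected zero-thresholding step.

```python
import numpy as np

def smooth_l0_ridge(X, y, lam=1.0, a=1e-4, n_iter=50, thresh=1e-3):
    """Fixed-point iteration for a hypothetical differentiable L0 surrogate
    p(b) = lam * b^2/(b^2 + a) (not VanDerwerken's exact function). Each
    step freezes the denominator at the current iterate and solves a ridge
    system; a final threshold zeroes coefficients shrunk close to zero."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        d = lam / (beta**2 + a)  # per-coefficient ridge weights
        beta = np.linalg.solve(X.T @ X + np.diag(d), X.T @ y)
    beta[np.abs(beta) < thresh] = 0.0  # stand-in for BIC-selected thresholding
    return beta

rng = np.random.default_rng(3)
X = rng.standard_normal((100, 8))
y = X @ np.array([3.0, 0, 0, -2.0, 0, 0, 0, 1.0]) + 0.2 * rng.standard_normal(100)
print(smooth_l0_ridge(X, y))  # near-zero coefficients are shrunk out
```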

Book Variable Selection in Non-parametric Regression

Download or read book Variable Selection in Non-parametric Regression, written by Pʻing Chang. This book was released in 1990 with a total of 212 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book Handbook of Bayesian Variable Selection

Download or read book Handbook of Bayesian Variable Selection, written by Mahlet G. Tadesse and published by CRC Press. This book was released on 2021-12-24 with a total of 491 pages. Available in PDF, EPUB and Kindle. Book excerpt: Bayesian variable selection has experienced substantial developments over the past 30 years with the proliferation of large data sets. Identifying relevant variables to include in a model allows simpler interpretation, avoids overfitting and multicollinearity, and can provide insights into the mechanisms underlying an observed phenomenon. Variable selection is especially important when the number of potential predictors is substantially larger than the sample size and sparsity can reasonably be assumed. The Handbook of Bayesian Variable Selection provides a comprehensive review of theoretical, methodological, and computational aspects of Bayesian methods for variable selection. The topics covered include spike-and-slab priors, continuous shrinkage priors, Bayes factors, Bayesian model averaging, partitioning methods, as well as variable selection in decision trees and edge selection in graphical models. The handbook targets graduate students and established researchers who seek to understand the latest developments in the field. It also provides a valuable reference for all interested in applying existing methods and/or pursuing methodological extensions. Features: Provides a comprehensive review of methods and applications of Bayesian variable selection. Divided into four parts: Spike-and-Slab Priors; Continuous Shrinkage Priors; Extensions to Various Modeling; Other Approaches to Bayesian Variable Selection. Covers theoretical and methodological aspects, as well as worked-out examples with R code provided in the online supplement. Includes contributions by experts in the field. Supported by a website with code, data, and other supplementary material.
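As a taste of the spike-and-slab machinery covered in the first part, the snippet below computes the posterior inclusion probability for a single coefficient under a point-mass spike and a Gaussian slab. It is a textbook-style illustration with made-up prior settings, not code from the handbook's online supplement (which is in R).

```python
import numpy as np
from scipy.stats import norm

def posterior_inclusion_prob(beta_hat, se, theta=0.5, tau=1.0):
    """Posterior inclusion probability for one coefficient under the prior
    beta ~ (1-theta)*delta_0 + theta*N(0, tau^2), with Gaussian likelihood
    beta_hat ~ N(beta, se^2). Marginals under each component give the
    Bayes-factor weighting."""
    m_slab = norm.pdf(beta_hat, 0.0, np.sqrt(se**2 + tau**2))  # marginal under slab
    m_spike = norm.pdf(beta_hat, 0.0, se)                      # marginal under spike
    return theta * m_slab / (theta * m_slab + (1 - theta) * m_spike)

print(posterior_inclusion_prob(0.1, 0.2))  # weak signal: small inclusion probability
print(posterior_inclusion_prob(1.5, 0.2))  # strong signal: probability near 1
```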

Book Variable Selection and Parameter Estimation for Normal Linear Regression Models

Download or read book Variable Selection and Parameter Estimation for Normal Linear Regression Models, written by Peter J. Kempthorne. This book was released in 1985 with a total of 159 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book Design and Modeling for Computer Experiments

Download or read book Design and Modeling for Computer Experiments, written by Kai-Tai Fang and published by Chapman and Hall/CRC. This book was released in 2006 with a total of 312 pages. Available in PDF, EPUB and Kindle. Book excerpt: Emphasizing a practical approach, 'Design and Modeling for Computer Experiments' provides statisticians, engineers, and scientists with useful techniques for applying the methodologies presented.

Book Variable Selection and Function Estimation Using Penalized Methods

Download or read book Variable Selection and Function Estimation Using Penalized Methods, written by Ganggang Xu. This book was released in 2012. Available in PDF, EPUB and Kindle. Book excerpt: Penalized methods are becoming more and more popular in statistical research. This dissertation covers two major applications of penalized methods: variable selection and nonparametric function estimation. The following two paragraphs give brief introductions to each topic. Infinite-variance autoregressive models are important for modeling heavy-tailed time series. We use a penalty method to conduct model selection for autoregressive models with innovations in the domain of attraction of a stable law with index α ∈ (0, 2). We show that by combining the least absolute deviation loss function with the adaptive lasso penalty, we can consistently identify the true model. At the same time, the resulting coefficient estimator converges at a rate of n^(1/α). The proposed approach gives a unified variable selection procedure for both finite- and infinite-variance autoregressive models. While automatic smoothing parameter selection for nonparametric function estimation has been extensively researched for independent data, it is much less developed for clustered and longitudinal data. Although leave-subject-out cross-validation (CV) has been widely used, its theoretical properties are unknown and its minimization is computationally expensive, especially when there are multiple smoothing parameters. By focusing on penalized modeling methods, we show that leave-subject-out CV is optimal in that its minimization is asymptotically equivalent to the minimization of the true loss function. We develop an efficient Newton-type algorithm to compute the smoothing parameters that minimize the CV criterion. Furthermore, we derive a simplification of the leave-subject-out CV, which leads to a more efficient algorithm for selecting the smoothing parameters. We show that the simplified CV criterion is asymptotically equivalent to the unsimplified one and thus enjoys the same optimality property. This CV criterion also provides a completely data-driven approach to selecting the working covariance structure using generalized estimating equations in longitudinal data analysis. Our results are applicable to additive models, linear varying-coefficient models, and nonlinear models with data from exponential families.
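A rough sketch of the first part's estimator: a least-absolute-deviation AR loss plus an adaptive lasso penalty. The pilot estimate, penalty level, and the generic Nelder-Mead solver below are placeholders, not the algorithm or tuning used in the dissertation.

```python
import numpy as np
from scipy.optimize import minimize

def lad_adaptive_lasso_objective(phi, y, p, lam, w):
    """LAD loss for an AR(p) fit plus an adaptive lasso penalty with
    per-coefficient weights w (typically 1/|pilot estimate|). LAD keeps
    the loss well-behaved even when innovations have infinite variance."""
    lags = np.column_stack([y[p - j - 1 : len(y) - j - 1] for j in range(p)])
    resid = y[p:] - lags @ phi
    return np.abs(resid).sum() + lam * np.sum(w * np.abs(phi))

rng = np.random.default_rng(2)
n = 500
eps = rng.standard_cauchy(n)  # alpha = 1: infinite-variance innovations
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.5 * y[t - 1] - 0.2 * y[t - 2] + eps[t]

phi0 = np.array([0.5, -0.2]) + 0.1 * rng.standard_normal(2)  # pilot estimate (assumed)
w = 1.0 / np.abs(phi0)
res = minimize(lad_adaptive_lasso_objective, phi0, args=(y, 2, 5.0, w),
               method="Nelder-Mead")  # crude solver; serious work would use an LP
print(res.x)
```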

Book Regularized Regression Methods for Variable Selection and Estimation

Download or read book Regularized Regression Methods for Variable Selection and Estimation, written by Lee Herbrandson Dicker. This book was released in 2010 with a total of 222 pages. Available in PDF, EPUB and Kindle. Book excerpt: We make two contributions to the body of work on the variable selection and estimation problem. First, we propose a new penalized likelihood procedure, the seamless-L0 (SELO) method, which utilizes a continuous penalty function that closely approximates the discontinuous L0 penalty. The SELO penalized likelihood procedure consistently selects the correct variables and is asymptotically normal, provided the number of variables grows more slowly than the number of observations. The SELO method is efficiently implemented using a coordinate descent algorithm. Tuning parameter selection is crucial to the performance of the SELO procedure. We propose a BIC-like tuning parameter selection method for SELO which consistently identifies the correct model, even if the number of variables diverges. Simulation results show that the SELO procedure with BIC tuning parameter selection performs very well in a variety of settings, outperforming other popular penalized likelihood procedures by a substantial margin. Using SELO, we analyze a publicly available HIV drug resistance and mutation dataset and obtain interpretable results.
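The SELO penalty itself is simple to write down. The snippet below uses the commonly cited form (lam/log 2) * log(1 + |b|/(|b| + tau)); if the dissertation parameterizes it slightly differently, the shape is the same: near zero the penalty vanishes, and for large |b| it saturates at lam, tracking the L0 penalty lam * 1{b != 0}.

```python
import numpy as np

def selo(beta, lam=1.0, tau=0.01):
    """Seamless-L0 penalty (commonly cited form; exact parameterization
    hedged): continuous, ~0 at beta = 0, and flattening out near lam once
    |beta| >> tau, so it closely approximates the discontinuous L0 penalty."""
    b = np.abs(beta)
    return (lam / np.log(2.0)) * np.log1p(b / (b + tau))

print(selo(np.array([0.0, 0.005, 0.1, 1.0, 10.0])))
# rises steeply from 0 and saturates near lam = 1 once |beta| >> tau
```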