EBookClubs

Read Books & Download eBooks Full Online

Book Bayesian Variable Selection Using Lasso

Download or read book Bayesian Variable Selection Using Lasso written by Yuchen Han and published by . This book was released on 2017 with total page 44 pages. Available in PDF, EPUB and Kindle. Book excerpt: This thesis proposes to combine the Kuo and Mallick approach (1998) and the Bayesian Lasso approach (2008) by introducing a Laplace distribution on the conditional prior of the regression parameters given the indicator variables. Gibbs sampling is used to sample from the joint posterior distribution. We compare these two new methods to existing Bayesian variable selection methods, such as those of Kuo and Mallick, George and McCulloch, and Park and Casella, and provide an overall qualitative assessment of the efficiency of mixing and separation. We also use an air pollution dataset to test the proposed methodology, with the goal of identifying the main factors controlling pollutant concentration.
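In symbols, the combination described above amounts to something like the following (a sketch in generic notation; the thesis's exact hyperparameterization may differ):

```latex
% Kuo-Mallick indicators with a Laplace (Bayesian Lasso) conditional prior:
y = X(\gamma \circ \beta) + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I_n),
\qquad \gamma_j \sim \mathrm{Bernoulli}(\pi),
\qquad \pi(\beta_j \mid \gamma_j) = \tfrac{\lambda}{2}\, e^{-\lambda |\beta_j|}.
```

Here each indicator γ_j switches predictor j in or out of the mean, the Laplace density shrinks the active coefficients, and Gibbs sampling cycles through the full conditionals of γ, β and σ².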

Book Handbook of Bayesian Variable Selection

Download or read book Handbook of Bayesian Variable Selection written by Mahlet G. Tadesse and published by CRC Press. This book was released on 2021-12-24 with total page 762 pages. Available in PDF, EPUB and Kindle. Book excerpt: Bayesian variable selection has experienced substantial developments over the past 30 years with the proliferation of large data sets. Identifying relevant variables to include in a model allows simpler interpretation, avoids overfitting and multicollinearity, and can provide insights into the mechanisms underlying an observed phenomenon. Variable selection is especially important when the number of potential predictors is substantially larger than the sample size and sparsity can reasonably be assumed. The Handbook of Bayesian Variable Selection provides a comprehensive review of theoretical, methodological and computational aspects of Bayesian methods for variable selection. The topics covered include spike-and-slab priors, continuous shrinkage priors, Bayes factors, Bayesian model averaging, partitioning methods, as well as variable selection in decision trees and edge selection in graphical models. The handbook targets graduate students and established researchers who seek to understand the latest developments in the field. It also provides a valuable reference for all who are interested in applying existing methods and/or pursuing methodological extensions. Features: Provides a comprehensive review of methods and applications of Bayesian variable selection. Divided into four parts: Spike-and-Slab Priors; Continuous Shrinkage Priors; Extensions to Various Modeling; Other Approaches to Bayesian Variable Selection. Covers theoretical and methodological aspects, as well as worked-out examples with R code provided in the online supplement. Includes contributions by experts in the field. Supported by a website with code, data, and other supplementary material.

Book A Two-stage Bayesian Variable Selection Method with the Extension of Lasso for Geo-referenced Count Data

Download or read book A Two-stage Bayesian Variable Selection Method with the Extension of Lasso for Geo-referenced Count Data written by Yuqian Shen and published by . This book was released on 2019 with total page 59 pages. Available in PDF, EPUB and Kindle. Book excerpt: Due to the complex nature of geo-referenced data, multicollinearity of the risk factors in public health spatial studies is a commonly encountered issue, which leads to low parameter estimation accuracy because it inflates the variance in the regression analysis. To address this issue, we propose a two-stage variable selection method that extends the least absolute shrinkage and selection operator (Lasso) to the Bayesian spatial setting, to investigate the impact of risk factors on health outcomes. Specifically, in stage I, we perform variable selection using the Bayesian Lasso and several other variable selection approaches. Then, in stage II, we perform model selection with only the variables selected in stage I and again compare the methods. To evaluate the performance of the two-stage variable selection methods, we conducted a simulation study with different distributions for the risk factors, using geo-referenced count data as the outcome and Michigan as the research region. We considered the cases where all candidate risk factors are independently normally distributed or follow a multivariate normal distribution with different correlation levels. Two other Bayesian variable selection methods, a binary indicator and the combination of a binary indicator and Lasso, are considered and compared as alternatives. The simulation results indicate that the proposed two-stage Bayesian Lasso variable selection method performs best in both the independent and the dependent cases considered. Compared with the one-stage approach and the two alternative methods, the two-stage Bayesian Lasso approach provides the highest estimation accuracy in all scenarios considered.
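As a rough illustration of the two-stage structure only (not the thesis's Bayesian spatial Lasso for counts — scikit-learn's frequentist LassoCV stands in for stage I, and a plain least-squares refit for stage II):

```python
# A minimal two-stage sketch: stage I screens candidate risk factors with a
# Lasso, stage II refits a model using only the survivors. All names and
# data here are illustrative stand-ins for the thesis's Bayesian machinery.
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

rng = np.random.default_rng(0)
n, p = 200, 20
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]          # only three active risk factors
y = X @ beta_true + rng.normal(size=n)

# Stage I: shrinkage-based screening of the candidate pool.
stage1 = LassoCV(cv=5).fit(X, y)
selected = np.flatnonzero(stage1.coef_ != 0)
print("stage I keeps columns:", selected)

# Stage II: refit using only the selected variables.
stage2 = LinearRegression().fit(X[:, selected], y)
print("stage II coefficients:", stage2.coef_.round(2))
```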

Book Bayesian Variable Selection and Estimation

Download or read book Bayesian Variable Selection and Estimation written by Xiaofan Xu and published by . This book was released on 2014 with total page 76 pages. Available in PDF, EPUB and Kindle. Book excerpt: The paper considers the classical Bayesian variable selection problem and an important subproblem in which grouping information of predictors is available. We propose the Half Thresholding (HT) estimator for simultaneous variable selection and estimation with shrinkage priors. Under an orthogonal design matrix, variable selection consistency and the asymptotic distribution of HT estimators are investigated, and the oracle property is established with Three Parameter Beta Mixture of Normals (TPBN) priors. We then revisit the Bayesian group lasso and use spike-and-slab priors for variable selection at the group level. In the process, the connection of our model with penalized regression is demonstrated, and the role of the posterior median for thresholding is pointed out. We show that the posterior median estimator has the oracle property for group variable selection and estimation under an orthogonal design, while the group lasso has a suboptimal asymptotic estimation rate when variable selection consistency is achieved. Next we consider the Bayesian sparse group lasso, again with spike-and-slab priors, to select variables both at the group level and within groups, and develop the necessary algorithm for its implementation. We demonstrate via simulation that the posterior median estimator of our spike-and-slab models has excellent performance for both variable selection and estimation.
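A small sketch of the posterior-median thresholding idea the excerpt points to: with spike-and-slab output, a coefficient whose indicator is off in most draws has a posterior median of exactly zero, so the median serves simultaneously as estimator and selection rule. The draws below are mocked up, not produced by the paper's model:

```python
# Selection by the coordinatewise posterior median. `draws` mimics an
# (n_samples, p) array of MCMC output from a spike-and-slab model, where a
# coefficient is exactly 0 whenever its inclusion indicator is off.
import numpy as np

rng = np.random.default_rng(1)
p, n_samples = 5, 4000
inclusion_prob = np.array([0.95, 0.90, 0.40, 0.10, 0.05])
gamma = rng.random((n_samples, p)) < inclusion_prob          # mock indicators
slab = rng.normal(loc=[2, -1, 0.5, 0.2, 0.1], scale=0.3, size=(n_samples, p))
draws = np.where(gamma, slab, 0.0)                            # mock posterior draws

post_median = np.median(draws, axis=0)       # exactly 0 when inclusion < 1/2
selected = np.flatnonzero(post_median != 0)
print("posterior medians:", post_median.round(2))
print("selected variables:", selected)
```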

Book Monte Carlo Simulation and Resampling Methods for Social Science

Download or read book Monte Carlo Simulation and Resampling Methods for Social Science written by Thomas M. Carsey and published by SAGE Publications. This book was released on 2013-08-05 with total page 304 pages. Available in PDF, EPUB and Kindle. Book excerpt: Taking the topics of a quantitative methodology course and illustrating them through Monte Carlo simulation, this book examines abstract principles, such as bias, efficiency, and measures of uncertainty in an intuitive, visual way. Instead of thinking in the abstract about what would happen to a particular estimator "in repeated samples," the book uses simulation to actually create those repeated samples and summarize the results. The book includes basic examples appropriate for readers learning the material for the first time, as well as more advanced examples that a researcher might use to evaluate an estimator he or she was using in an actual research project. The book also covers a wide range of topics related to Monte Carlo simulation, such as resampling methods, simulations of substantive theory, simulation of quantities of interest (QI) from model results, and cross-validation. Complete R code from all examples is provided so readers can replicate every analysis presented using R.
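The flavor of the approach, as a minimal sketch (the book's own worked examples are in R; this analogue uses Python): draw many samples from a known model, estimate the quantity of interest in each, and summarize the resulting sampling distribution instead of reasoning about it abstractly.

```python
# Repeated-samples experiment: simulate 5000 datasets from a known linear
# model, estimate the slope each time, and summarize the estimates.
import numpy as np

rng = np.random.default_rng(42)
true_slope, n, reps = 2.0, 50, 5000
estimates = np.empty(reps)
for r in range(reps):
    x = rng.normal(size=n)
    y = 1.0 + true_slope * x + rng.normal(size=n)
    estimates[r] = np.polyfit(x, y, 1)[0]     # OLS slope for this sample

print("mean estimate:", estimates.mean().round(3))   # ~2.0, i.e., unbiased
print("empirical SE :", estimates.std(ddof=1).round(3))
```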

Book Flexible Imputation of Missing Data, Second Edition

Download or read book Flexible Imputation of Missing Data, Second Edition written by Stef van Buuren and published by CRC Press. This book was released on 2018-07-17 with total page 444 pages. Available in PDF, EPUB and Kindle. Book excerpt: Missing data pose challenges to real-life data analysis. Simple ad-hoc fixes, like deletion or mean imputation, only work under highly restrictive conditions, which are often not met in practice. Multiple imputation replaces each missing value by multiple plausible values. The variability between these replacements reflects our ignorance of the true (but missing) value. Each of the completed data sets is then analyzed by standard methods, and the results are pooled to obtain unbiased estimates with correct confidence intervals. Multiple imputation is a general approach that also inspires novel solutions to old problems by reformulating the task at hand as a missing-data problem. This is the second edition of a popular book on multiple imputation, focused on explaining the application of methods through detailed worked examples using the MICE package as developed by the author. This new edition incorporates the recent developments in this fast-moving field. This class-tested book avoids mathematical and technical details as much as possible: formulas are accompanied by verbal statements that explain the formula in accessible terms. The book sharpens the reader’s intuition on how to think about missing data, and provides all the tools needed to execute a well-grounded quantitative analysis in the presence of missing data.
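The pooling step the excerpt alludes to is Rubin's rules; a minimal sketch with made-up per-imputation numbers (the book itself demonstrates this with the author's R package mice):

```python
# Rubin's rules: combine one estimate per completed data set into a pooled
# estimate whose variance reflects both within- and between-imputation
# uncertainty. The inputs below are illustrative, not from the book.
import numpy as np

est = np.array([1.02, 0.95, 1.10, 0.99, 1.05])       # estimate per completed data set
var = np.array([0.040, 0.042, 0.039, 0.041, 0.040])  # its squared standard error
m = len(est)

q_bar = est.mean()             # pooled point estimate
u_bar = var.mean()             # average within-imputation variance
b = est.var(ddof=1)            # between-imputation variance
t = u_bar + (1 + 1/m) * b      # total variance of the pooled estimate
print(f"pooled estimate {q_bar:.3f}, pooled SE {np.sqrt(t):.3f}")
```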

Book Statistical Learning with Sparsity

Download or read book Statistical Learning with Sparsity written by Trevor Hastie and published by CRC Press. This book was released on 2015-05-07 with total page 354 pages. Available in PDF, EPUB and Kindle. Book excerpt: Discover new methods for dealing with high-dimensional data. A sparse statistical model has only a small number of nonzero parameters or weights; therefore, it is much easier to estimate and interpret than a dense model. Statistical Learning with Sparsity: The Lasso and Generalizations presents methods that exploit sparsity to help recover the underlying signal in a set of data.
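For orientation, the lasso program at the heart of the book, together with the standard closed form it reduces to when X has orthonormal columns:

```latex
\hat\beta = \arg\min_{\beta}\; \tfrac{1}{2}\,\lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1,
\qquad
\hat\beta_j = \mathrm{sign}(\tilde\beta_j)\,\bigl(|\tilde\beta_j| - \lambda\bigr)_+
\quad \text{with } \tilde\beta = X^\top y \text{ when } X^\top X = I.
```

The soft-thresholding operator on the right is what produces exact zeros, and hence sparsity, in the fitted coefficients.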

Book Bayesian Variable Selection with Spike-and-slab Priors

Download or read book Bayesian Variable Selection with Spike-and-slab Priors written by Anjali Agarwal and published by . This book was released on 2016 with total page 90 pages. Available in PDF, EPUB and Kindle. Book excerpt: A major focus of intensive methodological research in recent times has been on knowledge extraction from high-dimensional datasets made available by advances in research technologies. Coupled with the growing popularity of Bayesian methods in statistical analysis, a range of new techniques have evolved that allow innovative model-building and inference in high-dimensional settings – an important one among these being Bayesian variable selection (BVS). The broad goal of this thesis is to explore different BVS methods and demonstrate their application in high-dimensional psychological data analysis. In particular, the focus will be on a class of sparsity-enforcing priors called 'spike-and-slab' priors, which are mixture priors on regression coefficients with density functions that are peaked at zero (the 'spike') and also have large probability mass for a wide range of non-zero values (the 'slab'). It is demonstrated that BVS with spike-and-slab priors achieved a reasonable degree of dimensionality reduction when applied to a psychiatric dataset in a logistic regression setup. BVS performance was also compared to that of LASSO (least absolute shrinkage and selection operator), a popular machine-learning technique, as reported in Ahn et al. (2016). The findings indicate that BVS with a spike-and-slab prior provides a competitive alternative to machine-learning methods, with the additional advantages of ease of interpretation and the potential to handle more complex models. In conclusion, this thesis serves to add a new cutting-edge technique to the lab’s tool-shed and helps introduce Bayesian variable selection to researchers in Cognitive Psychology, where it still remains relatively unexplored as a dimensionality-reduction tool.
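Written out, a spike-and-slab prior is a two-component mixture on each coefficient; the point-mass form below is one common variant (the thesis may use a continuous spike):

```latex
\beta_j \mid \gamma_j \sim (1-\gamma_j)\,\delta_0 + \gamma_j\,\mathcal{N}(0, \tau^2),
\qquad \gamma_j \sim \mathrm{Bernoulli}(\pi),
```

where δ₀ (the spike) forces exact zeros, the wide normal slab accommodates large signals, and the posterior of γ_j gives the inclusion probability of variable j.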

Book Bayesian Variable Selection Via a Benchmark

Download or read book Bayesian Variable Selection Via a Benchmark written by and published by . This book was released on 2013 with total page 84 pages. Available in PDF, EPUB and Kindle. Book excerpt: With the increasing appearance of high-dimensional data over the past decades, variable selection through likelihood penalization remains a popular yet challenging research area in statistics. Ridge and Lasso, two of the most popular penalized regression methods, served as the foundation of the regularization technique and motivated several extensions to accommodate various circumstances, mostly through frequentist models. These two regularization problems can also be solved by their Bayesian counterparts, by putting proper priors on the regression parameters and then applying Gibbs sampling. Compared to the frequentist version, the Bayesian framework enables easier interpretation and more straightforward inference on the parameters, based on the posterior distributional results. In general, however, Bayesian approaches do not provide sparse estimates for the regression coefficients. In this thesis, an innovative Bayesian variable selection method via a benchmark variable, in conjunction with a modified BIC, is proposed under the framework of linear regression models as a first attempt to promote both model sparsity and accuracy. The motivation for introducing such a benchmark is discussed, and the statistical properties regarding its role in the model are demonstrated. In short, it serves as a criterion to measure the importance of each variable based on the posterior inference of the corresponding coefficients, and only the most important variables, those providing the minimal modified BIC value, are included. The Bayesian approach via a benchmark is extended to accommodate linear models with covariates exhibiting group structures. An iterative algorithm is implemented to identify both important groups and important variables within the selected groups. Moreover, the method is further developed and modified to select variables for generalized linear models by taking advantage of the normal approximation to the likelihood function. Simulation studies are carried out to assess and compare the performance of the proposed approaches and other state-of-the-art methods for each of the above three scenarios. The numerical results consistently illustrate that our Bayesian variable selection approaches tend to select exactly the true variables or groups, while producing prediction errors comparable to those of other methods. Beyond the numerical work, several real data sets are analyzed by these methods and the corresponding performances are further compared. The variable selection results of our approach are intuitively appealing and generally consistent with the existing literature.
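The mechanics of criterion-guided subset selection, in a generic sketch (plain Gaussian-likelihood BIC over all subsets of a small pool; the thesis's benchmark variable and modified BIC are its own contribution and are not reproduced here):

```python
# Enumerate all subsets of a small candidate pool and keep the one
# minimizing BIC. Purely illustrative data and a textbook BIC formula.
import numpy as np
from itertools import combinations

rng = np.random.default_rng(7)
n, p = 120, 6
X = rng.normal(size=(n, p))
y = 1.5 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(size=n)

def bic(subset):
    if subset:
        Xs = X[:, list(subset)]
        resid = y - Xs @ np.linalg.lstsq(Xs, y, rcond=None)[0]
    else:
        resid = y
    rss = resid @ resid
    # len(subset) coefficients plus the error variance count as parameters
    return n * np.log(rss / n) + (len(subset) + 1) * np.log(n)

best = min((s for k in range(p + 1) for s in combinations(range(p), k)), key=bic)
print("BIC-optimal subset:", best)   # expect (0, 2)
```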

Book Jointness in Bayesian Variable Selection with Applications to Growth Regression

Download or read book Jointness in Bayesian Variable Selection with Applications to Growth Regression written by and published by World Bank Publications. This book was released on with total page 17 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book Advanced Mean Field Methods

Download or read book Advanced Mean Field Methods written by Manfred Opper and published by MIT Press. This book was released on 2001 with total page 300 pages. Available in PDF, EPUB and Kindle. Book excerpt: A major problem in modern probabilistic modeling is the huge computational complexity involved in typical calculations with multivariate probability distributions when the number of random variables is large. Because exact computations are infeasible in such cases and Monte Carlo sampling techniques may reach their limits, there is a need for methods that allow for efficient approximate computations. One of the simplest approximations is based on the mean field method, which has a long history in statistical physics. The method is widely used, particularly in the growing field of graphical models. Researchers from disciplines such as statistical physics, computer science, and mathematical statistics are studying ways to improve this and related methods and are exploring novel application areas. Leading approaches include the variational approach, which goes beyond factorizable distributions to achieve systematic improvements; the TAP (Thouless-Anderson-Palmer) approach, which incorporates correlations by including effective reaction terms in the mean field theory; and the more general methods of graphical models. Bringing together ideas and techniques from these diverse disciplines, this book covers the theoretical foundations of advanced mean field methods, explores the relation between the different approaches, examines the quality of the approximation obtained, and demonstrates their application to various areas of probabilistic modeling.
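The simplest member of this family is the naive mean field approximation, which restricts the approximating distribution to a fully factorized form and improves it one coordinate at a time:

```latex
q(x) = \prod_i q_i(x_i), \qquad
\log q_i^{*}(x_i) = \mathbb{E}_{q_{-i}}\bigl[\log p(x)\bigr] + \text{const},
```

with each update lowering the Kullback-Leibler divergence KL(q ∥ p). The variational and TAP approaches the blurb mentions refine this factorized starting point.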

Book Bayesian LASSO Survival Analysis

Download or read book Bayesian LASSO Survival Analysis written by Justin P. Neely and published by . This book was released on 2019 with total page 38 pages. Available in PDF, EPUB and Kindle. Book excerpt: This thesis examines the use of Bayesian LASSO regression for survival data to estimate the survival function and select significant covariates simultaneously. We consider survival times of patients with adenocarcinoma lung cancer. The survival and genetic data are available from the Cancer Genome Atlas (TCGA) Research Network. As a pilot study, within chromosome 5, we apply Bayesian LASSO regression to explore genetic markers that may help identify crucial genes determining the survival times of patients. Using Gibbs sampling, we can obtain Markov chain Monte Carlo samples for the regression coefficients and model variance, as well as the LASSO penalty, from their full conditional distributions. However, under the Cox proportional hazards model, sampling from the full conditional distribution of the Bayesian LASSO regression coefficients is computationally difficult. Therefore, we use latent variables for the survival likelihood and perform Bayesian inference. We compare the Bayesian LASSO with a common variable selection method and a frequentist LASSO for the estimation of the survival function and the identification of critical covariates.
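A compact sketch of the linear-model core such a thesis builds on — the Gibbs sampler of Park and Casella (2008), with the penalty λ held fixed for brevity (the full treatment also samples λ, and the survival extension via latent variables is beyond this sketch):

```python
# Bayesian LASSO Gibbs sampler for linear regression (Park & Casella, 2008),
# on simulated data. Conditionals: beta is Gaussian, sigma^2 inverse-gamma,
# and each 1/tau_j^2 inverse-Gaussian.
import numpy as np

rng = np.random.default_rng(3)
n, p, lam = 100, 8, 1.0
X = rng.normal(size=(n, p))
beta_true = np.array([3, 0, 0, -2, 0, 0, 0, 1.5])
y = X @ beta_true + rng.normal(size=n)

n_iter = 3000
beta, sigma2, inv_tau2 = np.zeros(p), 1.0, np.ones(p)
keep = np.empty((n_iter, p))
XtX, Xty = X.T @ X, X.T @ y

for it in range(n_iter):
    # beta | rest ~ N(A^{-1} X'y, sigma2 * A^{-1}),  A = X'X + diag(1/tau^2)
    A_inv = np.linalg.inv(XtX + np.diag(inv_tau2))
    beta = rng.multivariate_normal(A_inv @ Xty, sigma2 * A_inv)
    # sigma2 | rest ~ Inverse-Gamma((n-1)/2 + p/2, RSS/2 + beta'D^{-1}beta/2)
    resid = y - X @ beta
    rate = resid @ resid / 2 + beta @ (inv_tau2 * beta) / 2
    sigma2 = 1.0 / rng.gamma((n - 1) / 2 + p / 2, 1.0 / rate)
    # 1/tau_j^2 | rest ~ Inverse-Gaussian(sqrt(lam^2 sigma2 / beta_j^2), lam^2)
    inv_tau2 = rng.wald(np.sqrt(lam**2 * sigma2 / beta**2), lam**2)
    keep[it] = beta

print("posterior means:", keep[500:].mean(axis=0).round(2))  # shrunk toward 0
```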

Book Multivariate Statistical Machine Learning Methods for Genomic Prediction

Download or read book Multivariate Statistical Machine Learning Methods for Genomic Prediction written by Osval Antonio Montesinos López and published by Springer Nature. This book was released on 2022-02-14 with total page 707 pages. Available in PDF, EPUB and Kindle. Book excerpt: This open access book (published under a CC BY 4.0 license) brings together the latest genome-based prediction models currently being used by statisticians, breeders and data scientists. It provides an accessible way to understand the theory behind each statistical learning tool, the required pre-processing, the basics of model building, how to train statistical learning methods, the basic R scripts needed to implement each statistical learning tool, and the output of each tool. To do so, for each tool the book provides background theory, some elements of the R statistical software for its implementation, the conceptual underpinnings, and at least two illustrative examples with data from real-world genomic selection experiments. Lastly, worked-out examples help readers check their own comprehension. The book will greatly appeal to readers in plant (and animal) breeding, genetics and statistics, as it provides in a very accessible way the necessary theory, the appropriate R code, and illustrative examples for a complete understanding of each statistical learning tool. In addition, it weighs the advantages and disadvantages of each tool.

Book Bayesian Variable Selection and Functional Data Analysis

Download or read book Bayesian Variable Selection and Functional Data Analysis written by Asish Kumar Banik and published by . This book was released on 2019 with total page 157 pages. Available in PDF, EPUB and Kindle. Book excerpt: High-dimensional statistics is one of the most studied topics in the field of statistics. The most interesting problem to arise in the last 15 years is variable selection or subset selection. Variable selection is a strong statistical tool that can be explored in functional data analysis. In the first part of this thesis, we implement a Bayesian variable selection method for automatic knot selection. We propose a spike-and-slab prior on knots and formulate a conjugate stochastic search variable selection for significant knots. The computation is substantially faster than existing knot selection methods, as we use Metropolis-Hastings algorithms and a Gibbs sampler for estimation. This work focuses on a single nonlinear covariate, modeled as regression splines. In the next stage, we study Bayesian variable selection in additive models with high-dimensional predictors. The selection of nonlinear functions in models is highly important in recent research, and the Bayesian method of selection has more advantages than contemporary frequentist methods. Chapter 2 examines Bayesian sparse group lasso theory based on spike-and-slab priors to determine its applicability for variable selection and function estimation in nonparametric additive models. The primary objective of Chapter 3 is to build a classification method using longitudinal volumetric magnetic resonance imaging (MRI) data from five regions of interest (ROIs). A functional data analysis method is used to handle the longitudinal measurement of ROIs, and the functional coefficients are later used in the classification models. We propose a Pólya-gamma augmentation method to classify normal controls and diseased patients based on functional MRI measurements. We obtain fast posterior sampling by avoiding the slow and complicated Metropolis-Hastings algorithm. Our main motivation is to determine the important ROIs that have the highest separating power to classify our dichotomous response. We compare the sensitivity, specificity, and accuracy of the classification based on single ROIs and on various combinations of them. We obtain a sensitivity of over 85% and a specificity of around 90% for most of the combinations. Next, we work with Bayesian classification and selection methodology. The main goal of Chapter 4 is to employ longitudinal trajectories in a significant number of sub-regional brain volumetric MRI measures as statistical predictors for Alzheimer's disease (AD) classification. We use logistic regression in a Bayesian framework that includes many functional predictors. Direct sampling of regression coefficients from the Bayesian logistic model is difficult due to its complicated likelihood function. In high-dimensional scenarios, the selection of predictors is paramount with the introduction of spike-and-slab priors, non-local priors, or horseshoe priors. We seek to avoid the complicated Metropolis-Hastings approach and to develop an easily implementable Gibbs sampler. In addition, the Bayesian estimation provides proper estimates of the model parameters, which are also useful for building inference.
Another advantage of working with logistic regression is that it calculates the log odds of relative risk for AD compared to normal controls based on the selected longitudinal predictors, rather than simply classifying patients based on cross-sectional estimates. Ultimately, however, we combine approaches and use a probability threshold to classify individual patients. We employ 49 functional predictors consisting of volumetric estimates of brain sub-regions, chosen for their established clinical significance. Moreover, the use of spike-and-slab priors ensures that many redundant predictors are dropped from the model. Finally, we present a new approach to Bayesian model-based clustering for spatiotemporal data in Chapter 5. A simple linear mixed-effects model (LME) derived from a functional model is used to model spatiotemporal cerebral white matter data extracted from healthy aging individuals. The LME provides prior information for the spatial covariance structure and brain segmentation based on white matter intensity. This motivates us to build stochastic model-based clustering to group voxels, considering their longitudinal and location information. The cluster-specific random effect induces correlation among repeated measures. The problem of finding partitions is handled by imposing a prior structure on cluster partitions in order to derive a stochastic objective function.
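The Pólya-gamma augmentation used for the classification chapters rests on the integral identity of Polson, Scott and Windle (2013),

```latex
\frac{(e^{\psi})^{a}}{(1+e^{\psi})^{b}}
= 2^{-b}\, e^{\kappa\psi} \int_{0}^{\infty} e^{-\omega\psi^{2}/2}\, p(\omega)\, d\omega,
\qquad \kappa = a - \tfrac{b}{2}, \quad \omega \sim \mathrm{PG}(b, 0),
```

which, applied with ψ = xᵢᵀβ, makes the logistic likelihood conditionally Gaussian in β given ω. The sampler then alternates conjugate Gaussian updates for β with Pólya-gamma draws for ω, with no Metropolis-Hastings step required.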

Book Bayesian Variable Selection in Linear and Non-linear Models

Download or read book Bayesian Variable Selection in Linear and Non-linear Models written by Arnab Kumar Maity and published by . This book was released on 2016 with total page 124 pages. Available in PDF, EPUB and Kindle. Book excerpt: Appropriate feature selection is a fundamental problem in the field of statistics. Models with a large number of features or variables require special attention due to the computational complexity of the huge model space. This is generally known as the variable or model selection problem in statistics, whereas in machine learning and other literature it is also known as feature selection, attribute selection or variable subset selection. Variable selection is the process of efficiently selecting an optimal subset of relevant variables for use in model construction. The central assumption in this methodology is that the data contain many redundant variables: those which do not provide significant additional information beyond the optimally selected subset of variables. Variable selection is widely used in all application areas of data analytics, ranging from optimal selection of genes in large-scale microarray studies, to optimal selection of biomarkers for targeted therapy in cancer genomics, to selection of optimal predictors in business analytics. Under the Bayesian approach, the formal way to perform this optimal selection is to select the model with the highest posterior probability. Using this fact, the problem may be thought of as an optimization problem over the model space, where the objective function is the posterior probability of a model and the maximization takes place with respect to the models. We propose an efficient method for implementing this optimization and illustrate its feasibility in high-dimensional problems. By means of various simulation studies, this new approach has been shown to be efficient and to outperform other statistical feature selection methods, namely the median probability model and a sampling method with frequency-based estimators. Theoretical justifications are provided. Applications to logistic regression and survival regression are discussed.
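The "highest posterior probability" criterion in the excerpt is the usual Bayesian model selection rule:

```latex
\widehat{M} = \arg\max_{k}\; p(M_k \mid y), \qquad
p(M_k \mid y) = \frac{p(y \mid M_k)\, p(M_k)}{\sum_{l} p(y \mid M_l)\, p(M_l)},
\qquad
p(y \mid M_k) = \int p(y \mid \theta_k, M_k)\, p(\theta_k \mid M_k)\, d\theta_k,
```

so the optimization the excerpt describes is a search for the maximizer of p(M_k | y) over a model space of size 2^p, which is what makes an efficient search method necessary in high dimensions.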