EBookClubs

Read Books & Download eBooks Full Online

Book Bayesian Variable Selection in High-dimensional Applications

Download or read book Bayesian Variable Selection in High-dimensional Applications written by Veronika Ročková and published by . This book was released on 2013 with total page 195 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book Bayesian Variable Selection for High Dimensional Data Analysis

Download or read book Bayesian Variable Selection for High Dimensional Data Analysis written by Yang Aijun and published by LAP Lambert Academic Publishing. This book was released on 2011-09 with total page 92 pages. Available in PDF, EPUB and Kindle. Book excerpt: In the practice of statistical modeling, it is often desirable to have an accurate predictive model. Modern data sets usually have a large number of predictors, so parsimony is an especially important issue. Best-subset selection is a conventional method of variable selection, but due to the large number of variables, the relatively small sample size, and severe collinearity among the variables, standard statistical methods for selecting relevant variables often face difficulties. Bayesian stochastic search variable selection has gained much empirical success in a variety of applications. This book therefore proposes a modified Bayesian stochastic search variable selection approach for variable selection and two-class or multi-class classification based on a (multinomial) probit regression model. We demonstrate the performance of the approach on several real data sets. The results show that our approach selects smaller numbers of relevant variables and obtains competitive classification accuracy.
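For orientation, a generic latent-variable formulation of stochastic search variable selection in a binary probit model (illustrative notation only; the exact prior specification in the book may differ, and the multinomial case adds one latent utility per class) is:

\[
z_i = x_i^{\top}\beta + \varepsilon_i,\qquad \varepsilon_i \sim N(0,1),\qquad y_i = \mathbf{1}\{z_i > 0\},
\]
\[
\beta_j \mid \gamma_j \;\sim\; (1-\gamma_j)\,N(0,\tau_0^{2}) + \gamma_j\,N(0,\tau_1^{2}),\qquad \gamma_j \sim \mathrm{Bernoulli}(\pi_j),\qquad \tau_0^{2}\ll\tau_1^{2}.
\]

A Gibbs sampler over (z, beta, gamma) then visits promising subsets, and variables are ranked by their posterior inclusion probabilities P(gamma_j = 1 | y).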

Book Handbook of Bayesian Variable Selection

Download or read book Handbook of Bayesian Variable Selection written by Mahlet G. Tadesse and published by CRC Press. This book was released on 2021-12-24 with total page 762 pages. Available in PDF, EPUB and Kindle. Book excerpt: Bayesian variable selection has experienced substantial developments over the past 30 years with the proliferation of large data sets. Identifying relevant variables to include in a model allows simpler interpretation, avoids overfitting and multicollinearity, and can provide insights into the mechanisms underlying an observed phenomenon. Variable selection is especially important when the number of potential predictors is substantially larger than the sample size and sparsity can reasonably be assumed. The Handbook of Bayesian Variable Selection provides a comprehensive review of theoretical, methodological and computational aspects of Bayesian methods for variable selection. The topics covered include spike-and-slab priors, continuous shrinkage priors, Bayes factors, Bayesian model averaging, partitioning methods, as well as variable selection in decision trees and edge selection in graphical models. The handbook targets graduate students and established researchers who seek to understand the latest developments in the field. It also provides a valuable reference for all interested in applying existing methods and/or pursuing methodological extensions. Features: provides a comprehensive review of methods and applications of Bayesian variable selection; is divided into four parts, covering Spike-and-Slab Priors, Continuous Shrinkage Priors, Extensions to Various Modeling Frameworks, and Other Approaches to Bayesian Variable Selection; covers theoretical and methodological aspects, as well as worked-out examples with R code provided in the online supplement; includes contributions by experts in the field; and is supported by a website with code, data, and other supplementary material.
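As a quick illustration of the two prior families named first in that list (textbook forms, not the handbook's own notation), a point-mass spike-and-slab prior and the horseshoe, a typical continuous shrinkage prior, can be written as:

\[
\text{spike-and-slab:}\quad \beta_j \mid \gamma_j \;\sim\; (1-\gamma_j)\,\delta_0 + \gamma_j\, N(0,\tau^2), \qquad \gamma_j \sim \mathrm{Bernoulli}(\pi),
\]
\[
\text{horseshoe:}\quad \beta_j \mid \lambda_j, \tau \;\sim\; N(0, \lambda_j^2 \tau^2), \qquad \lambda_j \sim \mathrm{C}^{+}(0,1).
\]

The first puts exact zeros in the prior through the point mass at zero; the second shrinks continuously, with the half-Cauchy local scales allowing large signals to escape shrinkage.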

Book Bayesian Variable Selection for High-dimensional Data with an Ordinal Response

Download or read book Bayesian Variable Selection for High-dimensional Data with an Ordinal Response written by Yiran Zhang (Ph.D. in biostatistics) and published by . This book was released on 2019 with total page 136 pages. Available in PDF, EPUB and Kindle. Book excerpt: Health outcome and disease status measurements frequently appear on an ordinal scale, that is, the outcome is categorical but has inherent ordering. Many previous studies have shown associations between gene expression and disease status. Identification of important genes may be useful for developing novel diagnostic and prognostic tools to predict or classify stage of disease. Gene expression data is usually high-dimensional, meaning that the number of genes is much greater than the sample size or number of patients. We will describe some existing frequentist methods for high-dimensional data with an ordinal response. Following Tibshirani (1996), who described the LASSO estimate as the Bayesian posterior mode when the regression parameters have independent Laplace priors, we propose a new approach for high-dimensional data with an ordinal response that is rooted in the Bayesian paradigm. We show that our proposed Bayesian approach outperforms the existing frequentist methods through simulation studies. We then compare the performance of frequentist and Bayesian approaches using hepatocellular carcinoma studies.
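The Tibshirani (1996) observation cited here can be stated compactly: under independent Laplace (double-exponential) priors on the coefficients of a Gaussian linear model, the posterior mode coincides with the lasso estimate. In generic, illustrative notation:

\[
p(\beta_j) \propto \exp(-\lambda\,|\beta_j|)
\quad\Longrightarrow\quad
\hat\beta_{\mathrm{MAP}} \;=\; \arg\min_{\beta}\;\Big\{\tfrac{1}{2\sigma^{2}}\,\|y - X\beta\|_2^{2} \;+\; \lambda\,\|\beta\|_1\Big\},
\]

which is the lasso solution with penalty parameter 2 sigma^2 lambda; the thesis builds its ordinal-response approach on this Bayesian reading of the penalty.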

Book Variable Selection in High Dimensional Data Analysis with Applications

Download or read book Variable Selection in High Dimensional Data Analysis with Applications written by and published by . This book was released on 2015 with total page 108 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book Jointness in Bayesian Variable Selection with Applications to Growth Regression

Download or read book Jointness in Bayesian Variable Selection with Applications to Growth Regression written by and published by World Bank Publications. This book was released on with total page 17 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book Bayesian Variable Selection and Functional Data Analysis

Download or read book Bayesian Variable Selection and Functional Data Analysis written by Asish Kumar Banik and published by . This book was released on 2019 with total page 157 pages. Available in PDF, EPUB and Kindle. Book excerpt: High-dimensional statistics is one of the most studied topics in the field of statistics. One of the most interesting problems to arise in the last 15 years is variable selection, or subset selection. Variable selection is a powerful statistical tool that can be explored in functional data analysis. In the first part of this thesis, we implement a Bayesian variable selection method for automatic knot selection. We propose a spike-and-slab prior on knots and formulate a conjugate stochastic search variable selection for significant knots. The computation is substantially faster than existing knot selection methods, as we use Metropolis-Hastings algorithms and a Gibbs sampler for estimation. This work focuses on a single nonlinear covariate, modeled as regression splines. In the next stage, we study Bayesian variable selection in additive models with high-dimensional predictors. The selection of nonlinear functions in models is highly important in recent research, and the Bayesian method of selection has more advantages than contemporary frequentist methods. Chapter 2 examines Bayesian sparse group lasso theory based on spike-and-slab priors to determine its applicability for variable selection and function estimation in nonparametric additive models.

The primary objective of Chapter 3 is to build a classification method using longitudinal volumetric magnetic resonance imaging (MRI) data from five regions of interest (ROIs). A functional data analysis method is used to handle the longitudinal measurement of ROIs, and the functional coefficients are later used in the classification models. We propose a Pólya-gamma augmentation method to classify normal controls and diseased patients based on functional MRI measurements. We obtain fast posterior sampling by avoiding the slow and complicated Metropolis-Hastings algorithm. Our main motivation is to determine the important ROIs that have the highest separating power to classify our dichotomous response. We compare the sensitivity, specificity, and accuracy of the classification based on single ROIs and on various combinations of them. We obtain a sensitivity of over 85% and a specificity of around 90% for most of the combinations.

Next, we work with Bayesian classification and selection methodology. The main goal of Chapter 4 is to employ longitudinal trajectories in a significant number of sub-regional brain volumetric MRI data as statistical predictors for Alzheimer's disease (AD) classification. We use logistic regression in a Bayesian framework that includes many functional predictors. The direct sampling of regression coefficients from the Bayesian logistic model is difficult due to its complicated likelihood function. In high-dimensional scenarios, the selection of predictors is paramount with the introduction of either spike-and-slab priors, non-local priors, or horseshoe priors. We seek to avoid the complicated Metropolis-Hastings approach and to develop an easily implementable Gibbs sampler. In addition, the Bayesian estimation provides proper estimates of the model parameters, which are also useful for building inference. Another advantage of working with logistic regression is that it calculates the log odds of relative risk for AD compared to normal controls based on the selected longitudinal predictors, rather than simply classifying patients based on cross-sectional estimates. Ultimately, however, we combine approaches and use a probability threshold to classify individual patients. We employ 49 functional predictors consisting of volumetric estimates of brain sub-regions, chosen for their established clinical significance. Moreover, the use of spike-and-slab priors ensures that many redundant predictors are dropped from the model.

Finally, we present a new approach to Bayesian model-based clustering for spatiotemporal data in Chapter 5. A simple linear mixed model (LME) derived from a functional model is used to model spatiotemporal cerebral white matter data extracted from healthy aging individuals. The LME provides us with prior information for the spatial covariance structure and brain segmentation based on white matter intensity. This motivates us to build stochastic model-based clustering to group voxels considering their longitudinal and location information. The cluster-specific random effect causes correlation among repeated measures. The problem of finding partitions is dealt with by imposing a prior structure on cluster partitions in order to derive a stochastic objective function.
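For context on the Pólya-gamma augmentation mentioned above, the device rests on an integral identity due to Polson, Scott, and Windle (2013), written here in their generic notation rather than anything specific to this thesis:

\[
\frac{(e^{\psi})^{a}}{(1+e^{\psi})^{b}}
\;=\; 2^{-b}\, e^{\kappa \psi} \int_0^\infty e^{-\omega \psi^{2}/2}\, p(\omega)\, d\omega,
\qquad \kappa = a - b/2,\qquad \omega \sim \mathrm{PG}(b, 0).
\]

Conditional on the latent variable omega, the logistic likelihood in the linear predictor psi becomes Gaussian, so the regression coefficients can be updated with a Gibbs step instead of a Metropolis-Hastings move.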

Book Bayesian Variable Selection with Spike-and-Slab Priors

Download or read book Bayesian Variable Selection with Spike-and-Slab Priors written by Anjali Agarwal and published by . This book was released on 2016 with total page 90 pages. Available in PDF, EPUB and Kindle. Book excerpt: A major focus of intensive methodological research in recent times has been on knowledge extraction from high-dimensional datasets made available by advances in research technologies. Coupled with the growing popularity of Bayesian methods in statistical analysis, a range of new techniques has evolved that allow innovative model-building and inference in high-dimensional settings – an important one among these being Bayesian variable selection (BVS). The broad goal of this thesis is to explore different BVS methods and demonstrate their application in high-dimensional psychological data analysis. In particular, the focus will be on a class of sparsity-enforcing priors called 'spike-and-slab' priors, which are mixture priors on regression coefficients with density functions that are peaked at zero (the 'spike') and also have large probability mass for a wide range of non-zero values (the 'slab'). It is demonstrated that BVS with spike-and-slab priors achieved a reasonable degree of dimensionality reduction when applied to a psychiatric dataset in a logistic regression setup. BVS performance was also compared to that of LASSO (least absolute shrinkage and selection operator), a popular machine-learning technique, as reported in Ahn et al. (2016). The findings indicate that BVS with a spike-and-slab prior provides a competitive alternative to machine-learning methods, with the additional advantages of ease of interpretation and the potential to handle more complex models. In conclusion, this thesis serves to add a new cutting-edge technique to the lab’s tool-shed and helps introduce Bayesian variable selection to researchers in Cognitive Psychology, where it still remains relatively unexplored as a dimensionality-reduction tool.

Book Bayesian Solutions to High-dimensional Data Challenges Using Hybrid Search

Download or read book Bayesian Solutions to High-dimensional Data Challenges Using Hybrid Search written by Shiqiang Jin and published by . This book was released on 2021 with total page 0 pages. Available in PDF, EPUB and Kindle. Book excerpt: In the era of Big Data, variable selection with high-dimensional data has drawn increasing attention. A large number of predictors poses a major challenge for model fitting and prediction. In this dissertation, we propose three different yet interconnected methodologies, covering theory, computation, and real applications for various scenarios of regression analysis. The primary goal in this dissertation is to develop powerful Bayesian solutions to high-dimensional data challenges using a new variable selection strategy, called hybrid search. To effectively reduce computation costs in high-dimensional data analysis, we propose novel computational strategies that can quickly evaluate a large number of marginal likelihoods simultaneously within a single computation. In Chapter 1, we discuss background and current challenges in high-dimensional variable selection, and motivate our study. In Chapter 2, we introduce a new Bayesian method of best subset selection in the context of linear regression. The proposed method rapidly finds the best subset via a hybrid search algorithm that combines deterministic local search and stochastic global search. In Chapter 3, building on the approach in Chapter 2, we extend it to a framework of multivariate linear regression, which analyzes the relationship between multiple response variables and a common set of predictors. In Chapter 4, we propose a general Bayesian method to perform high-dimensional variable selection for various data types, such as binary, count, continuous and time-to-event (survival) data. Using Bayesian approximation techniques, we develop a general computing strategy that enables us to assess the marginal likelihoods of many candidate models within a single computation. In addition, to accelerate convergence, we employ a hybrid search algorithm that can quickly explore the model space and accurately obtain the global maximum of the marginal posterior probabilities.
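As a rough, schematic picture of what a hybrid search of this kind does, the following toy Python sketch combines a stochastic global phase with a deterministic local phase. It is an illustration under simplifying assumptions, not the dissertation's algorithm: a BIC-style score stands in for the closed-form marginal likelihoods evaluated there, and the move set is deliberately minimal.

import numpy as np

def bic_score(y, X, subset):
    # BIC-style stand-in for a log marginal likelihood (hypothetical scoring
    # function; the dissertation evaluates exact marginal likelihoods instead).
    n = len(y)
    Xs = np.column_stack([np.ones(n)] + [X[:, j] for j in sorted(subset)])
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    rss = float(np.sum((y - Xs @ beta) ** 2))
    return -0.5 * (n * np.log(rss / n) + Xs.shape[1] * np.log(n))

def hybrid_search(y, X, n_flips=500, jump_prob=0.05, seed=0):
    # Stochastic global phase: random single-variable flips, occasionally
    # accepting a worse subset to escape local modes.
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    current, cur = set(), bic_score(y, X, set())
    best, best_score = set(), cur
    for _ in range(n_flips):
        cand = current ^ {int(rng.integers(p))}
        s = bic_score(y, X, cand)
        if s > cur or rng.random() < jump_prob:
            current, cur = cand, s
        if cur > best_score:
            best, best_score = set(current), cur
    # Deterministic local phase: greedy add/drop moves until no single flip
    # improves the score.
    current, cur = set(best), best_score
    improved = True
    while improved:
        improved = False
        for j in range(p):
            s = bic_score(y, X, current ^ {j})
            if s > cur:
                current, cur, improved = current ^ {j}, s, True
    return sorted(current), cur

# Example with simulated data in which only columns 3 and 17 carry signal:
# rng = np.random.default_rng(1)
# X = rng.normal(size=(100, 50))
# y = X[:, 3] * 2.0 - X[:, 17] * 1.5 + rng.normal(size=100)
# print(hybrid_search(y, X))

The stochastic phase keeps the search from getting trapped near its starting subset, while the greedy phase ensures the returned subset is at least a local maximizer of the score.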

Book Topics on Variable Selection in High Dimensional Data

Download or read book Topics on Variable Selection in High Dimensional Data written by Jia Wang and published by . This book was released on 2021 with total page pages. Available in PDF, EPUB and Kindle. Book excerpt: Variable selection has been extensively studied in the last few decades as it provides a principled solution to high dimensionality arising in a broad spectrum of real applications, such as bioinformatics, health studies, social science and econometrics. This dissertation is concerned with variable selection for ultrahigh-dimensional data when the dimension is allowed to grow with the sample size or the network size at an exponential rate. We propose new Bayesian approaches to selecting variables under several model frameworks, including (1) partially linear models (2) static social network models with degree heterogeneity and (3) time-varying network models. Firstly for partially linear models, we develop a procedure which employs the difference-based method to reduce the impact from the estimation of the nonparametric component, and incorporates Bayesian subset modeling with diffusing prior (BSM-DP) to shrink the corresponding estimator in the linear component. Secondly, a class of network models where the connection probability depends on ultrahigh-dimensional nodal covariates (homophily) and node-specific popularity (degree heterogeneity) is considered. We propose a Bayesian method to select nodal features in both dense and sparse networks under a relaxed assumption on popularity parameters. To alleviate the computational burden for large sparse networks, we particularly develop another working model in which parameters are updated based on a dense sub-graph at each step. Lastly, we extend the static model to time-varying cases, where the connection probability at time t is modeled based on observed nodal attributes at time t and node-specific continuous-time baseline functions evaluated at time t. Those Bayesian proposals are shown to be analogous to a mixture of L0 and L2 penalized methods and work well in the setting of highly correlated predictors. Corresponding model selection consistency is studied for all aforementioned models, in the sense that the probability of the true model being selected converges to one asymptotically. The finite sample performance of the proposed models is further examined by simulation studies and analyses on social-media and financial datasets.
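For readers unfamiliar with this class of network models, one common way to express a connection probability with both homophily and degree heterogeneity (illustrative notation; the dissertation's exact link function and parameterization may differ) is:

\[
\operatorname{logit} P(A_{ij} = 1) \;=\; \alpha_i + \alpha_j + z_{ij}^{\top}\beta,
\]

where A_{ij} indicates an edge between nodes i and j, the node-specific parameters alpha_i capture popularity (degree heterogeneity), and z_{ij} collects pairwise covariates formed from the ultrahigh-dimensional nodal attributes (homophily); variable selection is then performed on beta.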

Book Bayesian Variable Selection in Parametric and Semiparametric High-Dimensional Survival Analysis

Download or read book Bayesian Variable Selection in Parametric and Semiparametric High-Dimensional Survival Analysis written by Kyu Ha Lee and published by . This book was released on 2011 with total page 159 pages. Available in PDF, EPUB and Kindle. Book excerpt: In this dissertation, we propose several Bayesian variable selection schemes for Bayesian parametric and semiparametric survival models for right-censored survival data. In the first chapter we introduce a special shrinkage prior on the coefficients corresponding to the predictor variables. The shrinkage prior is obtained through a scale mixture representation of Normal and Gamma distributions. The likelihood function is constructed based on the Cox proportional hazards model framework, where the cumulative baseline hazard function is modeled a priori by a gamma process. In the second chapter we extend the idea of the shrinkage prior so that it can incorporate the existing grouping structure among the covariates. Our selected priors are similar to the elastic-net, group lasso, and fused lasso penalties. The proposed models are highly useful when we want to take the grouping structure into consideration. In the third chapter we propose a Bayesian variable selection method for high-dimensional survival analysis in the context of the parametric accelerated failure time (AFT) model. To identify subsets of relevant covariates, the regression coefficients are assumed to follow a conditional Laplace distribution as in the first chapter. We use a data augmentation approach to impute the survival times of censored subjects.
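A scale mixture of Normal and Gamma distributions of the kind described can be written generically as follows (the hyperparameters a and b are illustrative placeholders, not the dissertation's exact specification):

\[
\beta_j \mid \psi_j \;\sim\; N(0, \psi_j), \qquad \psi_j \;\sim\; \mathrm{Gamma}(a, b).
\]

Integrating out the latent variance psi_j yields a prior on beta_j that is more peaked at zero and heavier-tailed than a Gaussian; the special case of exponential mixing (a = 1) gives the Laplace prior underlying the Bayesian lasso.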

Book Bayesian Variable Selection in Linear and Non-linear Models

Download or read book Bayesian Variable Selection in Linear and Non-linear Models written by Arnab Kumar Maity and published by . This book was released on 2016 with total page 124 pages. Available in PDF, EPUB and Kindle. Book excerpt: Appropriate feature selection is a fundamental problem in the field of statistics. Models with a large number of features or variables require special attention due to the computational complexity of the huge model space. This is generally known as the variable or model selection problem in statistics, whereas in machine learning and other literature it is also known as feature selection, attribute selection or variable subset selection. Variable selection is the process of efficiently selecting an optimal subset of relevant variables for use in model construction. The central assumption in this methodology is that the data contain many redundant variables: those which do not provide any significant additional information beyond the optimally selected subset of variables. Variable selection is widely used in all application areas of data analytics, ranging from optimal selection of genes in large-scale micro-array studies, to optimal selection of biomarkers for targeted therapy in cancer genomics, to selection of optimal predictors in business analytics. Under the Bayesian approach, the formal way to perform this optimal selection is to select the model with the highest posterior probability. The problem may therefore be viewed as an optimization problem over the model space, where the objective function is the posterior probability of a model and the maximization is taken with respect to the models. We propose an efficient method for implementing this optimization and we illustrate its feasibility in high-dimensional problems. By means of various simulation studies, this new approach has been shown to be efficient and to outperform other statistical feature selection methods, namely the median probability model and a sampling method with frequency-based estimators. Theoretical justifications are provided. Applications to logistic regression and survival regression are discussed.
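Stated as a formula, the optimization described above scores each candidate model M_k by its posterior probability, which is driven by the marginal likelihood obtained by integrating out the model-specific coefficients (generic notation):

\[
p(M_k \mid y) \;\propto\; p(M_k)\int p(y \mid \beta_k, M_k)\,p(\beta_k \mid M_k)\,d\beta_k,
\qquad
\hat M \;=\; \arg\max_k\, p(M_k \mid y).
\]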

Book Bayesian Model Selection Consistency for High-dimensional Regression

Download or read book Bayesian Model Selection Consistency for High-dimensional Regression written by Min Hua and published by . This book was released on 2022 with total page 0 pages. Available in PDF, EPUB and Kindle. Book excerpt: Bayesian model selection has enjoyed considerable prominence in high-dimensional variable selection in recent years. Despite its popularity, the asymptotic theory for high-dimensional variable selection has not yet been fully explored. In this study, we aim to identify prior conditions for Bayesian model selection consistency under high-dimensional regression settings. In a Bayesian framework, posterior model probabilities can be used to quantify the importance of models given the observed data. Hence, our focus is on the asymptotic behavior of posterior model probabilities when the number of potential predictors grows with the sample size. This dissertation contains the following three projects. In the first project, we investigate the asymptotic behavior of posterior model probabilities under Zellner's g-prior, which is one of the most popular choices for model selection in Bayesian linear regression. We establish a simple and intuitive condition on Zellner's g-prior under which the posterior model distribution tends to be concentrated at the true model as the sample size increases, even if the number of predictors grows much faster than the sample size does. Simulation study results indicate that satisfying our condition is essential for the success of Bayesian high-dimensional variable selection under the g-prior. In the second project, we extend our framework to a general class of priors. The most pressing challenge in our generalization is that the marginal likelihood cannot be expressed in closed form. To address this problem, we develop a general form of Laplace approximation under a high-dimensional setting. As a result, we establish general sufficient conditions for high-dimensional Bayesian model selection consistency. Our simulation study and real data analysis demonstrate that the proposed condition allows us to identify the true data-generating model consistently. In the last project, we extend our framework to Bayesian generalized linear regression models. The distinctive feature of our proposed framework is that we do not impose any specific form of data distribution. In this project we develop a general condition under which the true model tends to maximize the marginal likelihood even when the number of predictors increases faster than the sample size. Our condition provides useful guidelines for the specification of priors, including hyperparameter selection. Our simulation study demonstrates the validity of the proposed condition for Bayesian model selection consistency with non-Gaussian data.
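For reference, Zellner's g-prior on the coefficients beta_k of a candidate model with design matrix X_k is conventionally written as

\[
\beta_k \mid \sigma^2, g \;\sim\; N\!\big(0,\; g\,\sigma^{2}\,(X_k^{\top} X_k)^{-1}\big),
\]

so that a single scalar g governs how diffuse the prior is relative to the information in the design; the conditions studied in the first project concern how g may be chosen or allowed to grow as the sample size and the number of predictors increase.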

Book Variable Selection in High Dimensional Complex Data and Bayesian Estimation of Reduction Subspace

Download or read book Variable Selection in High Dimensional Complex Data and Bayesian Estimation of Reduction Subspace written by Moumita Karmakar and published by . This book was released on 2015 with total page 200 pages. Available in PDF, EPUB and Kindle. Book excerpt: Nowadays researchers are collecting large amounts of data for which the number of predictors p is often too large to allow a thorough graphical visualization of the data for regression modeling. Commonly, regression data are collected jointly on (Y, X), where X = (X_1, ..., X_p)^T is a random p-dimensional predictor and Y is a univariate response. In a high-dimensional setup, frequently encountered problems for variable selection or estimation in regression analyses are (i) nonlinear relationships between predictors and the response, (ii) a number of predictors much larger than the sample size, and (iii) the presence of sparsity.

Book Classification and Data Mining

Download or read book Classification and Data Mining written by Antonio Giusti and published by Springer Science & Business Media. This book was released on 2012-12-18 with total page 291 pages. Available in PDF, EPUB and Kindle. Book excerpt: This volume contains both methodological papers showing new original methods and papers on applications illustrating how new domain-specific knowledge can be made available from data by clever use of data analysis methods. The volume is subdivided into three parts: Classification and Data Analysis; Data Mining; and Applications. The selected peer-reviewed papers were presented at a meeting of classification societies held in Florence, Italy, in the area of "Classification and Data Mining".