EBookClubs

Read Books & Download eBooks Full Online

Book Hypothesis Testing with High dimensional Data

Download or read book Hypothesis Testing with High dimensional Data written by Sen Zhao. This book was released on 2017 with total page 185 pages. Available in PDF, EPUB and Kindle. Book excerpt: In the past two decades, vast high-dimensional biomedical datasets have become a mainstay in various biomedical applications from genomics to neuroscience. These high-dimensional data enable researchers to answer scientific questions that are impossible to answer with classical, low-dimensional datasets. However, due to the "curse of dimensionality", such high-dimensional datasets also pose serious statistical challenges. Motivated by these emerging applications, statisticians have devoted much effort to developing estimation methods for high-dimensional linear models and graphical models. However, there is still little progress on quantifying the uncertainty of the estimates, e.g., obtaining p-values and confidence intervals, which are crucial for drawing scientific conclusions. While encouraging advances have been made in this area over the past couple of years, the majority of existing high-dimensional hypothesis testing methods still suffer from low statistical power or high computational intensity. In this dissertation, we focus on developing hypothesis testing methods for high-dimensional linear and graphical models. In Chapter 2, we investigate a naive and simple two-step hypothesis testing procedure for linear models. We show that, under appropriate conditions, such a simple procedure controls type-I error rate, and is closely connected to more complicated alternatives. We also show in numerical studies that such a simple procedure achieves similar performance as procedures that are computationally more intense. In Chapter 3, we consider hypothesis testing for linear regression that incorporates external information about the relationship between variables represented by a graph, such as the gene regulatory network.
We show in theory and numerical studies that by incorporating informative external information, our proposal is substantially more powerful than existing methods that ignore such information. We also propose a more robust procedure for settings where the external information is potentially inaccurate or imprecise. This robust procedure could adaptively choose the amount of external information to be incorporated based on the data. In Chapter 4, we shift our focus to Gaussian graphical models. We propose a novel procedure to test whether two Gaussian graphical models share the same edge set, while controlling the false positive rate. In the case that two networks are different, our proposals could identify specific nodes and edges that show differential connectivity. In this chapter, we also demonstrate that when the goal is to identify differentially connected nodes and edges, the results from our proposal are more interpretable than existing procedures based on covariance or precision matrices. We finish the dissertation with a discussion in Chapter 5, in which we present viable future research directions, and discuss a possible extension of our proposals to vector autoregression models for time series.

Book Inference Methods for High Dimensional Data

Download or read book Inference Methods for High Dimensional Data written by Zhe Zhang. This book was released on 2023. Available in PDF, EPUB and Kindle. Book excerpt: This dissertation aims to develop new statistical inference procedures for high-dimensional regression models, and focuses on three fundamental problems: (a) individual hypothesis testing without specification of high-dimensional regression models, (b) high-dimensional linear hypothesis testing in the linear regression model and (c) individual hypothesis testing in the partial linear model. In Chapter 3, we propose an effective model-free inference procedure for high-dimensional regression models. We first reformulate the hypothesis testing problem via a sufficient dimension reduction framework. With the aid of this reformulation, we propose a new test statistic and show that its asymptotic distribution is a $\chi^2$ distribution whose degrees of freedom do not depend on the unknown population distribution. We further conduct power analysis under local alternative hypotheses. In addition, we study how to control the false discovery rate of the proposed chi-squared tests, which are correlated, to identify important predictors under a model-free framework. To this end, we propose a multiple testing procedure and establish its theoretical guarantees. Monte Carlo simulation studies are conducted to assess the performance of the proposed tests and an empirical analysis of a real-world data set is used to illustrate the proposed methodology. In Chapter 4, we present a novel transformation-based inference method for conducting linear hypothesis tests in high-dimensional linear regression models. Our method uses score functions to construct a new random vector and links high-dimensional coefficient tests to high-dimensional one sample mean tests. We provide a formulation for a U-statistic with a kernel of order two and demonstrate its asymptotic normality.
The presence of high-dimensional nuisance parameters presents a significant challenge in our model setting; however, we have shown that their impact can be disregarded asymptotically under mild conditions. Additionally, we have studied the influence of the power enhancement term on power performance through both theoretical analysis and simulations. The results indicate that the enhancement term does not impact the type-I error rate and can improve power performance in scenarios where the U-statistic may not perform well. In Chapter 5, we consider testing the treatment effect in high-dimensional partial linear models. Due to the slow convergence rate of the unknown nuisance function estimator from some machine learning algorithms, we cannot directly estimate and plug in the nuisance function on the same data. To overcome this limitation, we update the estimation of the nuisance function recursively. This leads to an explicit expression of the estimators of the parameters of interest. Our approach has been shown to have asymptotic normality, and we assess its finite sample performance through simulations. The results indicate that our statistic offers higher power than in cases of model misspecification.

Book High dimensional Data Analysis

Download or read book High dimensional Data Analysis written by Tianwen Tony Cai and published by World Scientific Publishing Company Incorporated. This book was released on 2011 with total page 307 pages. Available in PDF, EPUB and Kindle. Book excerpt: Over the last few years, significant developments have been taking place in high-dimensional data analysis, driven primarily by a wide range of applications in many fields such as genomics and signal processing. In particular, substantial advances have been made in the areas of feature selection, covariance estimation, classification and regression. This book intends to examine important issues arising from high-dimensional data analysis to explore key ideas for statistical inference and prediction. It is structured around topics on multiple hypothesis testing, feature selection, regression, classification, dimension reduction, as well as applications in survival analysis and biomedical research. The book will appeal to graduate students and new researchers interested in the plethora of opportunities available in high-dimensional data analysis.

Book Bootstrap Based Hypothesis Testing for High dimensional Data

Download or read book Bootstrap Based Hypothesis Testing for High dimensional Data written by Nilanjan Chakraborty. This book was released on 2022. Available in PDF, EPUB and Kindle. Book excerpt: Over the last two decades inference problems on high-dimensional data that arise in finance, genetics and information technology have gained huge momentum. In this work, the main focus will be on developing bootstrap testing procedures in the high-dimensional setup for the following two hypothesis testing problems: (i) high-dimensional multivariate analysis of variance and (ii) testing the equality of two covariance matrices in the two-sample setup. The statistics considered for testing are infinity norm based statistics over either weighted sums or differences across various samples. We provide Gaussian approximation results for normalized sums of high dimensional random vectors and U-statistics under some weak conditions on moments and tails of their marginal distributions. The obtained results are free from the assumption of sparsity and correlation structures among the components of the random vectors. For the implementation of these tests, we develop multiplier bootstrap and jackknifed multiplier bootstrap procedures. These newly developed bootstrap techniques ensure first order accuracy of the asymptotic level and power of the formulated tests, enhancing their applicability. We also provide consistency of the proposed test against both fixed and local alternatives.
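
To make the idea of a multiplier bootstrap for an infinity norm statistic concrete, here is a minimal sketch for a much simpler problem than the ones treated in the dissertation: a one-sample test of a zero mean vector. The function names, the choice of standard Gaussian multipliers, and the one-sample setting are all assumptions made for illustration; this is not the author's MANOVA or covariance procedure.

```python
import math
import random


def max_stat(rows, weights=None):
    """Return max_j |n^{-1/2} * sum_i w_i * rows[i][j]| (weights default to 1)."""
    n, p = len(rows), len(rows[0])
    if weights is None:
        weights = [1.0] * n
    sums = [0.0] * p
    for w, row in zip(weights, rows):
        for j, x in enumerate(row):
            sums[j] += w * x
    return max(abs(s) for s in sums) / math.sqrt(n)


def multiplier_bootstrap_pvalue(rows, n_boot=500, seed=0):
    """Illustrative test of H0: E[X] = 0 using a sup-norm statistic.

    Assumption: i.i.d. rows and standard Gaussian multipliers.
    """
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    # Center each coordinate so the bootstrap mimics the null distribution.
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in rows]
    observed = max_stat(rows)
    exceed = 0
    for _ in range(n_boot):
        multipliers = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if max_stat(centered, multipliers) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_boot + 1)
```

The bootstrap draws reuse the same centered data and only resample the multipliers, which is what lets the procedure avoid assumptions on the correlation structure across coordinates.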

Book Handbook of Big Data Analytics

Download or read book Handbook of Big Data Analytics written by Wolfgang Karl Härdle and published by Springer. This book was released on 2018-07-20 with total page 532 pages. Available in PDF, EPUB and Kindle. Book excerpt: Addressing a broad range of big data analytics in cross-disciplinary applications, this essential handbook focuses on the statistical prospects offered by recent developments in this field. To do so, it covers statistical methods for high-dimensional problems, algorithmic designs, computation tools, analysis flows and the software-hardware co-designs that are needed to support insightful discoveries from big data. The book is primarily intended for statisticians, computer experts, engineers and application developers interested in using big data analytics with statistics. Readers should have a solid background in statistics and computer science.

Book Tests of Hypotheses on Regression Coefficients in High Dimensional Regression Models

Download or read book Tests of Hypotheses on Regression Coefficients in High Dimensional Regression Models written by Ye Alex Zhao. This book was released on 2022. Available in PDF, EPUB and Kindle. Book excerpt: Statistical inference in high-dimensional settings has become an important area of research due to the increased production of high-dimensional data in a wide variety of areas. However, few approaches towards simultaneous hypothesis testing of high-dimensional regression coefficients have been proposed. In the first project of this dissertation, we introduce a new method for simultaneous tests of the coefficients in a high-dimensional linear regression model. Our new test statistic is based on the sum-of-squares of the score function mean with an additional power-enhancement term. The asymptotic distribution and power of the test statistic are derived, and our procedure is shown to outperform existing approaches. We conduct Monte Carlo simulations to demonstrate performance improvements over existing methods and apply the testing procedure to a real data example. In the second project, we propose a test statistic for regression coefficients in a high-dimensional setting that applies to generalized linear models. Building on previous work on testing procedures for high-dimensional linear regression models, we extend this approach to create a new testing methodology for GLMs, with specific illustrations for the Poisson and logistic regression scenarios. The asymptotic distribution of the test statistic is established, and both simulation results and a real data analysis are conducted to illustrate the performance of our proposed method. The final project of this dissertation introduces two new approaches for testing high-dimensional regression coefficients in the partial linear model setting and more generally for linear hypothesis tests in linear models.
Our proposed statistic is motivated by the profile least squares method and decorrelation score method for high-dimensional inference, which we show to be equivalent in these particular cases. We outline the empirical performance of the new test statistic with simulation studies and real data examples. These results indicate generally satisfactory performance under a wide range of settings and applicability to real world data problems.

Book Analysis and Testing of Sparse High Dimensional Discrete Data

Download or read book Analysis and Testing of Sparse High Dimensional Discrete Data written by Amanda Rae Plunkett. This book was released on 2015 with total page 266 pages. Available in PDF, EPUB and Kindle. Book excerpt: High dimensional data analysis has been one of the most challenging problems in statistics and related areas for the last two decades. High dimensions occur in many applications where computers are able to capture large amounts of information related to a collected sample. Applications include genetic research, image processing, natural language processing, and signal processing, to name a few. We focus on the problem of two-sample hypothesis testing for two cases: 1) sparse high dimensional multinomial data, and 2) sparse high dimensional binary data. We propose new statistical tests for each, prove their theoretical validity, and test their performance in various scenarios through simulations and analysis of applied problems. Additionally, we perform follow-up analysis of these datasets using statistical classification methods.

Book Statistical Analysis for High Dimensional Data

Download or read book Statistical Analysis for High Dimensional Data written by Arnoldo Frigessi and published by Springer. This book was released on 2016-02-16 with total page 313 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book features research contributions from The Abel Symposium on Statistical Analysis for High Dimensional Data, held in Nyvågar, Lofoten, Norway, in May 2014. The focus of the symposium was on statistical and machine learning methodologies specifically developed for inference in “big data” situations, with particular reference to genomic applications. The contributors, who are among the most prominent researchers on the theory of statistics for high dimensional inference, present new theories and methods, as well as challenging applications and computational solutions. Specific themes include, among others, variable selection and screening, penalised regression, sparsity, thresholding, low dimensional structures, computational challenges, non-convex situations, learning graphical models, sparse covariance and precision matrices, semi- and non-parametric formulations, multiple testing, classification, factor models, clustering, and preselection. Highlighting cutting-edge research and casting light on future research directions, the contributions will benefit graduate students and researchers in computational biology, statistics and the machine learning community.

Book Pattern Analysis, Dimensionality Reduction and Hypothesis Testing in High dimensional Data from Animal Studies with Small Sample Sizes

Download or read book Pattern Analysis, Dimensionality Reduction and Hypothesis Testing in High dimensional Data from Animal Studies with Small Sample Sizes written by Hristo Todorov. This book was released on 2020. Available in PDF, EPUB and Kindle.

Book Data adaptive Statistics for Multiple Hypothesis Testing in High dimensional Settings

Download or read book Data adaptive Statistics for Multiple Hypothesis Testing in High dimensional Settings written by Weixin Cai. This book was released on 2017 with total page 25 pages. Available in PDF, EPUB and Kindle.

Book Handbook of Beta Distribution and Its Applications

Download or read book Handbook of Beta Distribution and Its Applications written by Arjun K. Gupta and published by CRC Press. This book was released on 2004-06-21 with total page 594 pages. Available in PDF, EPUB and Kindle. Book excerpt: A milestone in the published literature on the subject, this first-ever Handbook of Beta Distribution and Its Applications clearly enumerates the properties of beta distributions and related mathematical notions. It summarizes modern applications in a variety of fields, reviews up-and-coming progress from the front lines of statistical research and practice, and demonstrates the applicability of beta distributions in fields such as economics, quality control, soil science, and biomedicine. The book discusses the centrality of beta distributions in Bayesian inference, the beta-binomial model and applications of the beta-binomial distribution, and applications of Dirichlet integrals.

Book Topics on Power Enhancement in High dimensional Hypothesis Tests

Download or read book Topics on Power Enhancement in High dimensional Hypothesis Tests written by Xiufan Yu. This book was released on 2021. Available in PDF, EPUB and Kindle. Book excerpt: In recent years, power-enhanced tests with high-dimensional data have received growing attention in theoretical and applied statistics. Many scientific research questions can be converted into hypothesis testing problems, for example, the discovery of associations between gene-sets and disease outcomes, or the evaluation of the validity of a pricing model for a financial market. Various tests possess different high-power regions. In practice, we may lack prior knowledge about the alternatives when testing for a problem of interest. It is important to develop powerful testing procedures against more general alternatives. In this dissertation, we propose new methods to achieve power enhancement (PE) in tests for high-dimensional data. In particular, we consider the problem of enhancing test power in three topics: (1) a one-sample test on multi-factor pricing models for large panels, (2) a two-sample test on the equality of high-dimensional covariance matrices, and (3) a simultaneous test on the equality of two-sample mean vectors and covariance matrices of high dimensions. Methodologically, we provide a new perspective to the literature by studying and utilizing the asymptotic joint distribution of different statistics. We show two PE techniques of (i) aggregating information via the combination of p-values, and (ii) constructing PE components, to achieve enhanced test power in two aspects: (a) expanding high-power regions towards a wider alternative space with respect to one parameter of interest, and (b) expanding test capability to alternative spaces with respect to more parameters. Theoretically, we derive joint limiting laws of the corresponding test statistics.
We prove that the proposed power-enhanced tests achieve the desired PE properties following the guidance of the three general PE principles (Fan, Liao and Yao, 2015). Practically, the test efficacy is demonstrated by Monte Carlo simulations as well as empirical studies on testing market efficiency and identifying differentially expressed gene-sets.
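
As a small illustration of what "combining p-values" can look like, the sketch below implements Fisher's classical combination rule. The excerpt does not say which combination rule the dissertation uses, so the choice of Fisher's method, the function name, and the independence assumption are all assumptions made here for illustration.

```python
import math


def fisher_combine(p_values):
    """Combine p-values with Fisher's method.

    Assumption: the p-values are independent (at least asymptotically)
    and each lies in (0, 1].
    """
    k = len(p_values)
    # Fisher's statistic is -2 * sum(log p_i); we work with half of it.
    half_stat = -sum(math.log(p) for p in p_values)
    # Under H0 the statistic is chi-square with 2k degrees of freedom.
    # For even degrees of freedom the survival function has a closed form:
    #   exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half_stat / j
        total += term
    return math.exp(-half_stat) * total


if __name__ == "__main__":
    # Two moderately small p-values reinforce each other.
    print(fisher_combine([0.05, 0.05]))
```

With a single input the function returns that p-value unchanged (up to floating-point rounding), which is a quick sanity check on the closed-form survival function.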

Book Large Sample Covariance Matrices and High Dimensional Data Analysis

Download or read book Large Sample Covariance Matrices and High Dimensional Data Analysis written by Jianfeng Yao and published by Cambridge University Press. This book was released on 2015-03-26. Available in PDF, EPUB and Kindle. Book excerpt: High-dimensional data appear in many fields, and their analysis has become increasingly important in modern statistics. However, it has long been observed that several well-known methods in multivariate analysis become inefficient, or even misleading, when the data dimension p is larger than, say, several tens. A seminal example is the well-known inefficiency of Hotelling's $T^2$-test in such cases. This example shows that classical large sample limits may no longer hold for high-dimensional data; statisticians must seek new limiting theorems in these instances. Thus, the theory of random matrices (RMT) serves as a much-needed and welcome alternative framework. Based on the authors' own research, this book provides a first-hand introduction to new high-dimensional statistical methods derived from RMT. The book begins with a detailed introduction to useful tools from RMT, and then presents a series of high-dimensional problems with solutions provided by RMT methods.
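
For reference, the classical one-sample Hotelling statistic mentioned in the excerpt can be written as follows. The notation is assumed here rather than taken from the book: $n$ is the sample size, $p$ the dimension, $\bar{x}$ the sample mean, $S$ the sample covariance matrix, and $\mu_0$ the hypothesized mean.

```latex
T^2 = n\,(\bar{x} - \mu_0)^{\top} S^{-1} (\bar{x} - \mu_0),
\qquad
S = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^{\top},
\qquad
\frac{n - p}{p\,(n - 1)}\, T^2 \sim F_{p,\, n-p}
\quad \text{under } H_0 \text{ with Gaussian data and } p < n .
```

Since $S$ has rank at most $n - 1$, it is not invertible once $p \ge n$, so the statistic is undefined there; this is one simple way to see why the classical test cannot be carried over unchanged to high dimensions.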

Book Permutation Tests for Complex Data

Download or read book Permutation Tests for Complex Data written by Fortunato Pesarin and published by John Wiley & Sons. This book was released on 2010-02-25 with total page 448 pages. Available in PDF, EPUB and Kindle. Book excerpt: Complex multivariate testing problems are frequently encountered in many scientific disciplines, such as engineering, medicine and the social sciences. As a result, modern statistics needs permutation testing for complex data with low sample size and many variables, especially in observational studies. The Authors give a general overview of permutation tests with a focus on recent theoretical advances within univariate and multivariate complex permutation testing problems. This book brings the reader completely up to date with current thinking. Key Features: Examines the most up-to-date methodologies of univariate and multivariate permutation testing. Includes extensive software codes in MATLAB, R and SAS, featuring worked examples, and uses real case studies from both experimental and observational studies. Includes a standalone free software NPC Test Release 10 with a graphical interface which allows practitioners from every scientific field to easily implement almost all complex testing procedures included in the book. Presents and discusses solutions to the most important and frequently encountered real problems in multivariate analyses. Offers a supplementary website containing all of the data sets examined in the book along with ready-to-use software codes. Together with a wide set of application cases, the Authors present a thorough theory of permutation testing, both with formal description and proofs and through the analysis of real case studies. Practitioners and researchers working in different scientific fields such as engineering, biostatistics, psychology or medicine will benefit from this book.
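
For readers unfamiliar with the basic idea, the following is a minimal sketch of a univariate two-sample permutation test for a difference in means. It is far simpler than the multivariate procedures the book covers and is unrelated to the book's own software; the function name, the test statistic, and the Monte Carlo approximation with `n_perm` random shuffles are assumptions made for illustration.

```python
import random


def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation p-value for a difference in means.

    Assumption: under H0 the pooled observations are exchangeable.
    """
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n_x = len(x)

    def mean_diff(values):
        # Absolute difference between the means of the two groups.
        left, right = values[:n_x], values[n_x:]
        return abs(sum(left) / len(left) - sum(right) / len(right))

    observed = mean_diff(pooled)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        if mean_diff(pooled) >= observed:
            hits += 1
    # Add-one correction counts the observed labelling as one permutation.
    return (hits + 1) / (n_perm + 1)
```

Because the reference distribution is built from the data themselves, no distributional assumption beyond exchangeability is needed, which is why such tests remain usable with small samples.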

Book Introduction to Property Testing

Download or read book Introduction to Property Testing written by Oded Goldreich and published by Cambridge University Press. This book was released on 2017-11-23 with total page 473 pages. Available in PDF, EPUB and Kindle. Book excerpt: An extensive and authoritative introduction to property testing, the study of super-fast algorithms for the structural analysis of large quantities of data in order to determine global properties. This book can be used both as a reference book and a textbook, and includes numerous exercises.

Book Introduction to High Dimensional Statistics

Download or read book Introduction to High Dimensional Statistics written by Christophe Giraud and published by CRC Press. This book was released on 2021-08-25 with total page 410 pages. Available in PDF, EPUB and Kindle. Book excerpt: Praise for the first edition: "[This book] succeeds singularly at providing a structured introduction to this active field of research. ... it is arguably the most accessible overview yet published of the mathematical ideas and principles that one needs to master to enter the field of high-dimensional statistics. ... recommended to anyone interested in the main results of current research in high-dimensional statistics as well as anyone interested in acquiring the core mathematical skills to enter this area of research." (Journal of the American Statistical Association) Introduction to High-Dimensional Statistics, Second Edition preserves the philosophy of the first edition: to be a concise guide for students and researchers discovering the area and interested in the mathematics involved. The main concepts and ideas are presented in simple settings, avoiding thereby unessential technicalities. High-dimensional statistics is a fast-evolving field, and much progress has been made on a large variety of topics, providing new insights and methods. Offering a succinct presentation of the mathematical foundations of high-dimensional statistics, this new edition: Offers revised chapters from the previous edition, with the inclusion of many additional materials on some important topics, including compressed sensing, estimation with convex constraints, the slope estimator, simultaneously low-rank and row-sparse linear regression, or aggregation of a continuous set of estimators. Introduces three new chapters on iterative algorithms, clustering, and minimax lower bounds.
Provides enhanced appendices, mainly with the addition of the Davis-Kahan perturbation bound and of two simple versions of the Hanson-Wright concentration inequality. Covers cutting-edge statistical methods including model selection, sparsity and the Lasso, iterative hard thresholding, aggregation, support vector machines, and learning theory. Provides detailed exercises at the end of every chapter with collaborative solutions on a wiki site. Illustrates concepts with simple but clear practical examples.