EBookClubs

Read Books & Download eBooks Full Online

Book Performance of Augmented Inverse Probability Weighting Estimation for High dimensional Data

Download or read book Performance of Augmented Inverse Probability Weighting Estimation for High dimensional Data written by Xiaoyu Wei. This book was released in 2018. Available in PDF, EPUB and Kindle. Book excerpt: "Doubly-robust estimators have been used extensively for estimating the treatment effect, for their property of being unbiased when either the outcome regression model or the propensity score model is correctly specified. As the number of data dimension increases nowadays, little is known about how these methods perform in high-dimensional data. In this thesis, we aimed to examine the performance of one doubly-robust estimator, augmented inverse probability weighting (AIPW) estimator, in such data. Several Monte Carlo simulation studies were conducted, and the treatment effect was estimated under both model specification and misspecification. Simulation results showed that propensity score estimation was challenging in such settings. Advanced methods other than multiple logistic regression should be utilized for propensity score estimation and eliminating imbalance. We also investigated further into a high-dimensional propensity score algorithm, a variable selection method for confounding adjustment in high-dimensional data. We incorporated this algorithm in the estimation process, and explored the optimal value for the number of variables to adjust for. Finally, we presented a plasmode simulation study based on a real data set from Clinical Practice Research Datalink, where the effect of post-myocardial infarction statin use on the rate of one-year mortality was studied."
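
For readers unfamiliar with the estimator named in this excerpt, the following is a minimal sketch of the standard AIPW formula for an average treatment effect. It is not taken from the thesis. The simulated data, the function names `aipw_ate` and `simulate`, and the use of known rather than fitted nuisance models are all assumptions made for illustration. The sketch shows the double-robustness property: the estimate stays close to the true effect when either the outcome model or the propensity model is correct, but not necessarily when both are wrong.

```python
import numpy as np

def aipw_ate(y, a, e_hat, m1_hat, m0_hat):
    """AIPW estimate of the average treatment effect E[Y(1)] - E[Y(0)].

    y: outcomes; a: 0/1 treatment indicator;
    e_hat: propensity scores P(A=1 | X);
    m1_hat, m0_hat: outcome-model predictions under treatment and control.
    """
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=float)
    # Outcome-model prediction plus an inverse-probability-weighted residual.
    mu1 = m1_hat + a * (y - m1_hat) / e_hat
    mu0 = m0_hat + (1.0 - a) * (y - m0_hat) / (1.0 - e_hat)
    return float(np.mean(mu1 - mu0))

def simulate(n=50_000, seed=0):
    """Hypothetical data: one confounder x and a true treatment effect of 2."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    e_true = 1.0 / (1.0 + np.exp(-x))     # true propensity score
    a = rng.binomial(1, e_true)            # treatment assignment
    y = 2.0 * a + x + rng.normal(size=n)   # observed outcome
    return x, a, y, e_true

x, a, y, e_true = simulate()
zeros = np.zeros_like(x)
half = np.full_like(x, 0.5)

# Correct outcome model, wrong (constant) propensity model.
est_outcome_ok = aipw_ate(y, a, half, 2.0 + x, x)
# Wrong (all-zero) outcome model, correct propensity model.
est_propensity_ok = aipw_ate(y, a, e_true, zeros, zeros)
# Both models wrong: the robustness guarantee no longer applies.
est_both_wrong = aipw_ate(y, a, half, zeros, zeros)

print(est_outcome_ok, est_propensity_ok, est_both_wrong)
```

In practice both nuisance models would be estimated from data, which is the difficulty in high-dimensional settings that the excerpt describes.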

Book Semiparametric Theory and Missing Data

Download or read book Semiparametric Theory and Missing Data written by Anastasios Tsiatis and published by Springer Science & Business Media. This book was released on 2007-01-15 with total page 392 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book summarizes current knowledge regarding the theory of estimation for semiparametric models with missing data, in an organized and comprehensive manner. It starts with the study of semiparametric methods when there are no missing data. The description of the theory of estimation for semiparametric models is both rigorous and intuitive, relying on geometric ideas to reinforce the intuition and understanding of the theory. These methods are then applied to problems with missing, censored, and coarsened data with the goal of deriving estimators that are as robust and efficient as possible.

Book Targeted Learning

    Book Details:
  • Author : Mark J. van der Laan
  • Publisher : Springer Science & Business Media
  • Release : 2011-06-17
  • ISBN : 1441997822
  • Pages : 628 pages

Download or read book Targeted Learning written by Mark J. van der Laan and published by Springer Science & Business Media. This book was released on 2011-06-17 with total page 628 pages. Available in PDF, EPUB and Kindle. Book excerpt: The statistics profession is at a unique point in history. The need for valid statistical tools is greater than ever; data sets are massive, often measuring hundreds of thousands of measurements for a single subject. The field is ready to move towards clear objective benchmarks under which tools can be evaluated. Targeted learning allows (1) the full generalization and utilization of cross-validation as an estimator selection tool so that the subjective choices made by humans are now made by the machine, and (2) targeting the fitting of the probability distribution of the data toward the target parameter representing the scientific question of interest. This book is aimed at both statisticians and applied researchers interested in causal inference and general effect estimation for observational and experimental data. Part I is an accessible introduction to super learning and the targeted maximum likelihood estimator, including related concepts necessary to understand and apply these methods. Parts II-IX handle complex data structures and topics applied researchers will immediately recognize from their own research, including time-to-event outcomes, direct and indirect effects, positivity violations, case-control studies, censored data, longitudinal data, and genomic studies.

Book Robust High dimensional Data Analysis Using a Weight Shrinkage Rule

Download or read book Robust High dimensional Data Analysis Using a Weight Shrinkage Rule written by Bin Luo. This book was released in 2016 with a total of 73 pages. Available in PDF, EPUB and Kindle. Book excerpt: "In high-dimensional settings, a penalized least squares approach may lose its efficiency in both estimation and variable selection due to the existence of either outliers or heteroscedasticity. In this thesis, we propose a novel approach to perform robust high-dimensional data analysis in a penalized weighted least square framework. The main idea is to relate the irregularity of each observation to a weight vector and obtain the outlying status data-adaptively using a weight shrinkage rule. By usage of L-1 type regularization on both the coefficients and weight vectors, the proposed method is able to perform simultaneous variable selection and outliers detection efficiently. Eventually, this procedure results in estimators with potentially strong robustness and non-asymptotic consistency. We provide a unified link between the weight shrinkage rule and a robust M-estimation in general settings. We also establish the non-asymptotic oracle inequalities for the joint estimation of both the regression coefficients and weight vectors. These theoretical results allow the number of variables to far exceed the sample size. The performance of the proposed estimator is demonstrated in both simulation studies and real examples."--Abstract from author supplied metadata.

Book Targeted Minimum Loss Based Estimation

Download or read book Targeted Minimum Loss Based Estimation written by Samuel David Lendle. This book was released in 2015 with a total of 76 pages. Available in PDF, EPUB and Kindle. Book excerpt: Causal inference generally requires making some assumptions on a causal mechanism followed by statistical estimation. The statistical estimation problem in causal inference is often that of estimating a pathwise differentiable parameter in a semiparametric or nonparametric model. Targeted minimum loss-based estimation (TMLE) is a framework for constructing an asymptotically linear plug-in estimator for such parameters. The natural direct effect (NDE) is a parameter that quantifies how some treatment affects some outcome directly, as opposed to indirectly through some mediator value between the treatment and outcome on the causal pathway. In Chapter 2, we introduce the NDE among the untreated and show that under some assumptions the NDE among the untreated is identifiable and equivalent to a statistical parameter as the so called average treatment effect among the untreated. We then present a locally efficient, doubly robust TMLE for the statistical target parameter and apply it to the estimation of the NDE among the untreated in simulations and of the NDE in a data set from an RCT. Some estimators that adjust for the propensity score (PS) nonparametrically, such as PS matching or stratification by the PS, are robust to slight misspecification of the PS estimator. In particular, if the PS estimator fails to estimate the true propensity score, but still approximates some other balancing score, such methods are still consistent for average treatment effect (ATE). In Chapter 3, we extend a traditional TMLE for the ATE to have this property while still being locally efficient and doubly robust and investigate the performance of the proposed estimator in a simulation study.
Online estimators are estimators that process a relatively small piece of a data set at a time, and can be updated as more data becomes available. Typically, online estimators are used in the large scale machine learning literature, but to our knowledge, have not been used to estimate statistical parameters associated with causal parameters. In Chapter 4, we propose two online estimators for the ATE that are asymptotically efficient and doubly robust in a single pass through a data set. The first is similar to the augmented inverse probability of treatment weighting estimator in the batch setting, and the second involves an additional targeting step inspired by TMLE, which improves performance in some cases. We investigate the performance of both in a simulation study.

Book Unified Methods for Censored Longitudinal Data and Causality

Download or read book Unified Methods for Censored Longitudinal Data and Causality written by Mark J. van der Laan and published by Springer Science & Business Media. This book was released on 2012-11-12 with total page 412 pages. Available in PDF, EPUB and Kindle. Book excerpt: A fundamental statistical framework for the analysis of complex longitudinal data is provided in this book. It provides the first comprehensive description of optimal estimation techniques based on time-dependent data structures. The techniques go beyond standard statistical approaches and can be used to teach masters and Ph.D. students. The text is ideally suitable for researchers in statistics with a strong interest in the analysis of complex longitudinal data.

Book Foundations of Data Science

Download or read book Foundations of Data Science written by Avrim Blum and published by Cambridge University Press. This book was released on 2020-01-23 with total page 433 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book provides an introduction to the mathematical and algorithmic foundations of data science, including machine learning, high-dimensional geometry, and analysis of large networks. Topics include the counterintuitive nature of data in high dimensions, important linear algebraic techniques such as singular value decomposition, the theory of random walks and Markov chains, the fundamentals of and important algorithms for machine learning, algorithms and analysis for clustering, probabilistic models for large networks, representation learning including topic modelling and non-negative matrix factorization, wavelets and compressed sensing. Important probabilistic techniques are developed including the law of large numbers, tail inequalities, analysis of random projections, generalization guarantees in machine learning, and moment methods for analysis of phase transitions in large random graphs. Additionally, important structural and complexity measures are discussed such as matrix norms and VC-dimension. This book is suitable for both undergraduate and graduate courses in the design and analysis of algorithms for data.

Book Journal of the American Statistical Association

Download or read book Journal of the American Statistical Association. This book was released in 2008 with a total of 920 pages. Available in PDF, EPUB and Kindle.

Book Statistical Foundations of Data Science

Download or read book Statistical Foundations of Data Science written by Jianqing Fan and published by CRC Press. This book was released on 2020-09-21 with total page 942 pages. Available in PDF, EPUB and Kindle. Book excerpt: Statistical Foundations of Data Science gives a thorough introduction to commonly used statistical models, contemporary statistical machine learning techniques and algorithms, along with their mathematical insights and statistical theories. It aims to serve as a graduate-level textbook and a research monograph on high-dimensional statistics, sparsity and covariance learning, machine learning, and statistical inference. It includes ample exercises that involve both theoretical studies as well as empirical applications. The book begins with an introduction to the stylized features of big data and their impacts on statistical analysis. It then introduces multiple linear regression and expands the techniques of model building via nonparametric regression and kernel tricks. It provides a comprehensive account on sparsity explorations and model selections for multiple regression, generalized linear models, quantile regression, robust regression, hazards regression, among others. High-dimensional inference is also thoroughly addressed and so is feature screening. The book also provides a comprehensive account on high-dimensional covariance estimation, learning latent factors and hidden structures, as well as their applications to statistical estimation, inference, prediction and machine learning problems. It also introduces thoroughly statistical machine learning theory and methods for classification, clustering, and prediction. These include CART, random forests, boosting, support vector machines, clustering algorithms, sparse PCA, and deep learning.

Book Large-Dimensional Panel Data Econometrics: Testing, Estimation and Structural Changes

Download or read book Large dimensional Panel Data Econometrics Testing Estimation And Structural Changes written by Feng Qu and published by World Scientific. This book was released on 2020-08-24 with total page 167 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book aims to fill the gap between panel data econometrics textbooks, and the latest development on 'big data', especially large-dimensional panel data econometrics. It introduces important research questions in large panels, including testing for cross-sectional dependence, estimation of factor-augmented panel data models, structural breaks in panels and group patterns in panels. To tackle these high dimensional issues, some techniques used in Machine Learning approaches are also illustrated. Moreover, the Monte Carlo experiments, and empirical examples are also utilised to show how to implement these new inference methods. Large-Dimensional Panel Data Econometrics: Testing, Estimation and Structural Changes also introduces new research questions and results in recent literature in this field.

Book The Prevention and Treatment of Missing Data in Clinical Trials

Download or read book The Prevention and Treatment of Missing Data in Clinical Trials written by National Research Council and published by National Academies Press. This book was released on 2010-12-21 with total page 163 pages. Available in PDF, EPUB and Kindle. Book excerpt: Randomized clinical trials are the primary tool for evaluating new medical interventions. Randomization provides for a fair comparison between treatment and control groups, balancing out, on average, distributions of known and unknown factors among the participants. Unfortunately, these studies often lack a substantial percentage of data. This missing data reduces the benefit provided by the randomization and introduces potential biases in the comparison of the treatment groups. Missing data can arise for a variety of reasons, including the inability or unwillingness of participants to meet appointments for evaluation. And in some studies, some or all of data collection ceases when participants discontinue study treatment. Existing guidelines for the design and conduct of clinical trials, and the analysis of the resulting data, provide only limited advice on how to handle missing data. Thus, approaches to the analysis of data with an appreciable amount of missing values tend to be ad hoc and variable. The Prevention and Treatment of Missing Data in Clinical Trials concludes that a more principled approach to design and analysis in the presence of missing data is both needed and possible. Such an approach needs to focus on two critical elements: (1) careful design and conduct to limit the amount and impact of missing data and (2) analysis that makes full use of information on all randomized participants and is based on careful attention to the assumptions about the nature of the missing data underlying estimates of treatment effects. 
In addition to the highest priority recommendations, the book offers more detailed recommendations on the conduct of clinical trials and techniques for analysis of trial data.

Book Inverse Probability Tilting for Moment Condition Models with Missing Data

Download or read book Inverse Probability Tilting for Moment Condition Models with Missing Data written by Daniel Egel. This book was released in 2010 with a total of 42 pages. Available in PDF, EPUB and Kindle. Book excerpt: We propose a new inverse probability weighting (IPW) estimator for moment condition models with missing data. Our estimator is easy to implement and compares favorably with existing IPW estimators, including augmented inverse probability weighting (AIPW) estimators, in terms of efficiency, robustness, and higher order bias. We illustrate our method with a study of the relationship between early Black-White differences in cognitive achievement and subsequent differences in adult earnings. In our dataset the early childhood achievement measure, the main regressor of interest, is missing for many units.

Book Inverse Probability Tilting for Moment Condition Models with Missing Data

Download or read book Inverse Probability Tilting for Moment Condition Models with Missing Data written by Bryan S. Graham. This book was released in 2008. Available in PDF, EPUB and Kindle. Book excerpt: We propose a new inverse probability weighting (IPW) estimator for moment condition models with missing data. Our estimator is easy to implement and compares favorably with existing IPW estimators, including augmented inverse probability weighting (AIPW) estimators, in terms of efficiency, robustness, and higher order bias. We illustrate our method with a study of the relationship between early Black-White differences in cognitive achievement and subsequent differences in adult earnings. In our dataset the early childhood achievement measure, the main regressor of interest, is missing for many units.

Book Matched Sampling for Causal Effects

Download or read book Matched Sampling for Causal Effects written by Donald B. Rubin and published by Cambridge University Press. This book was released on 2006-09-04 with total page 5 pages. Available in PDF, EPUB and Kindle. Book excerpt: Matched sampling is often used to help assess the causal effect of some exposure or intervention, typically when randomized experiments are not available or cannot be conducted. This book presents a selection of Donald B. Rubin's research articles on matched sampling, from the early 1970s, when the author was one of the major researchers involved in establishing the field, to recent contributions to this now extremely active area. The articles include fundamental theoretical studies that have become classics, important extensions, and real applications that range from breast cancer treatments to tobacco litigation to studies of criminal tendencies. They are organized into seven parts, each with an introduction by the author that provides historical and personal context and discusses the relevance of the work today. A concluding essay offers advice to investigators designing observational studies. The book provides an accessible introduction to the study of matched sampling and will be an indispensable reference for students and researchers.

Book Elements of Causal Inference

Download or read book Elements of Causal Inference written by Jonas Peters and published by MIT Press. This book was released on 2017-11-29 with total page 289 pages. Available in PDF, EPUB and Kindle. Book excerpt: A concise and self-contained introduction to causal inference, increasingly important in data science and machine learning. The mathematization of causality is a relatively recent development, and has become increasingly important in data science and machine learning. This book offers a self-contained and concise introduction to causal models and how to learn them from data. After explaining the need for causal models and discussing some of the principles underlying causal inference, the book teaches readers how to use causal models: how to compute intervention distributions, how to infer causal models from observational and interventional data, and how causal ideas could be exploited for classical machine learning problems. All of these topics are discussed first in terms of two variables and then in the more general multivariate case. The bivariate case turns out to be a particularly hard problem for causal learning because there are no conditional independences as used by classical methods for solving multivariate cases. The authors consider analyzing statistical asymmetries between cause and effect to be highly instructive, and they report on their decade of intensive research into this problem. The book is accessible to readers with a background in machine learning or statistics, and can be used in graduate courses or as a reference for researchers. The text includes code snippets that can be copied and pasted, exercises, and an appendix with a summary of the most important technical concepts.

Book Analog Estimation Methods in Econometrics

Download or read book Analog Estimation Methods in Econometrics written by Charles F. Manski and published by Chapman and Hall/CRC. This book was released on 1988-06-15 with total page 186 pages. Available in PDF, EPUB and Kindle. Book excerpt: Presents familiar elements of estimation theory from an analog perspective discussing recent developments in the theory of analog estimation and new results that offer flexibility in empirical research. Annotation copyrighted by Book News, Inc., Portland, OR

Book High Dimensional Covariance Estimation

Download or read book High Dimensional Covariance Estimation written by Mohsen Pourahmadi and published by John Wiley & Sons. This book was released on 2013-06-24 with total page 204 pages. Available in PDF, EPUB and Kindle. Book excerpt: Methods for estimating sparse and large covariance matrices Covariance and correlation matrices play fundamental roles in every aspect of the analysis of multivariate data collected from a variety of fields including business and economics, health care, engineering, and environmental and physical sciences. High-Dimensional Covariance Estimation provides accessible and comprehensive coverage of the classical and modern approaches for estimating covariance matrices as well as their applications to the rapidly developing areas lying at the intersection of statistics and machine learning. Recently, the classical sample covariance methodologies have been modified and improved upon to meet the needs of statisticians and researchers dealing with large correlated datasets. High-Dimensional Covariance Estimation focuses on the methodologies based on shrinkage, thresholding, and penalized likelihood with applications to Gaussian graphical models, prediction, and mean-variance portfolio management. The book relies heavily on regression-based ideas and interpretations to connect and unify many existing methods and algorithms for the task. High-Dimensional Covariance Estimation features chapters on: Data, Sparsity, and Regularization Regularizing the Eigenstructure Banding, Tapering, and Thresholding Covariance Matrices Sparse Gaussian Graphical Models Multivariate Regression The book is an ideal resource for researchers in statistics, mathematics, business and economics, computer sciences, and engineering, as well as a useful text or supplement for graduate-level courses in multivariate analysis, covariance estimation, statistical learning, and high-dimensional data analysis.