EBookClubs

Read Books & Download eBooks Full Online

Book The Use of Semi parametric Methods in Achieving Robust Inference

Download or read book The Use of Semi parametric Methods in Achieving Robust Inference written by and published by . This book was released on 1996 with total page 290 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book Robust Nonparametric and Semiparametric Modeling

Download or read book Robust Nonparametric and Semiparametric Modeling written by Bo Kai and published by . This book was released on 2009 with total page pages. Available in PDF, EPUB and Kindle. Book excerpt: In this dissertation, several new statistical procedures in nonparametric and semiparametric models are proposed. The concerns of the research are efficiency, robustness and sparsity. In Chapter 3, we propose complete composite quantile regression (CQR) procedures for estimating both the regression function and its derivatives in fully nonparametric regression models by using local smoothing techniques. The CQR estimator was recently proposed by Zou and Yuan (2008) for estimating the regression coefficients in the classical linear regression model. The asymptotic theory of the proposed estimator was established. We show that, compared with the classical local linear least squares estimator, the new method can significantly improve the estimation efficiency of the local linear least squares estimator for commonly used non-normal error distributions, and at the same time, the loss in efficiency is at most 8.01% in the worst case scenario. In Chapter 4, we further consider semiparametric models. The complexity of semiparametric models poses new challenges to parametric inferences and model selection that frequently arise from real applications. We propose new robust inference procedures for the semiparametric varying-coefficient partially linear model. We first study a quantile regression estimate for the nonparametric varying-coefficient functions and the parametric regression coefficients. To improve efficiency, we further develop a composite quantile regression procedure for both parametric and nonparametric components. To achieve sparsity, we develop a variable selection procedure for this model to select significant variables. We study the sampling properties of the resulting quantile regression estimate and composite quantile regression estimate. 
With proper choices of penalty functions and regularization parameters, we show the proposed variable selection procedure possesses the oracle property in the terminology of Fan and Li (2001). In Chapter 5, we propose a novel estimation procedure for varying coefficient models based on local ranks. By allowing the regression coefficients to change with certain covariates, the class of varying coefficient models offers a flexible semiparametric approach to modeling nonlinearity and interactions between covariates. Varying coefficient models are useful nonparametric regression models and have been well studied in the literature. However, the performance of existing procedures can be adversely influenced by outliers. The new procedure provides a highly efficient and robust alternative to the local linear least squares method and can be conveniently implemented using existing R software packages. We study the sample properties of the proposed procedure and establish the asymptotic normality of the resulting estimate. We also derive the asymptotic relative efficiency of the proposed local rank estimate to the local linear estimate for the varying coefficient model. The gain of the local rank regression estimate over the local linear regression estimate can be substantial. We further develop nonparametric inferences for the rank-based method. Monte Carlo simulations are conducted to assess the finite sample performance of the proposed estimation procedure. The simulation results are promising and consistent with our theoretical findings. All the proposed procedures are supported by intensive finite sample simulation studies and most are illustrated with real data examples.
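
As a rough illustration of the composite quantile regression (CQR) idea mentioned in the excerpt above, the sketch below writes out the CQR objective for the classical linear model: one shared slope vector, a separate intercept per quantile level, and check losses summed over all levels. It is a minimal sketch and not the dissertation's local smoothing procedure; the function names `check_loss` and `cqr_objective` and the use of NumPy are assumptions made for illustration.

```python
import numpy as np


def check_loss(u, tau):
    """Quantile check loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    # Positive residuals are weighted by tau, negative ones by (tau - 1).
    return u * np.where(u < 0, tau - 1.0, tau)


def cqr_objective(beta, intercepts, X, y, taus):
    """CQR objective for a linear model (illustrative, hypothetical helper).

    One slope vector `beta` is shared by all quantile levels in `taus`,
    while each level has its own intercept in `intercepts`.
    """
    resid = np.asarray(y, dtype=float) - np.asarray(X, dtype=float) @ np.asarray(beta, dtype=float)
    return sum(check_loss(resid - b, tau).sum() for b, tau in zip(intercepts, taus))


# Tiny usage example with two quantile levels.
X = np.array([[1.0], [2.0]])
y = np.array([1.0, 2.0])
value = cqr_objective([0.0], [0.0, 0.0], X, y, [0.25, 0.75])
```

Minimizing this objective jointly over `beta` and `intercepts` (for instance with a linear programming or general-purpose optimizer) would give a CQR fit; equally spaced levels such as k/(K+1) are a common choice for `taus`.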

Book Contributions to Semiparametric Inference and Its Applications

Download or read book Contributions to Semiparametric Inference and Its Applications written by Seong Ho Lee and published by . This book was released on 2023 with total page 0 pages. Available in PDF, EPUB and Kindle. Book excerpt: This dissertation focuses on developing statistical methods for semiparametric inference and its applications. Semiparametric theory provides statistical tools that are flexible and robust to model misspecification. Utilizing the theory, this work proposes robust estimation approaches that are applicable to several scenarios with mild conditions, and establishes their asymptotic properties for inference. Chapter 1 provides a brief review of the literature related to this work. It first introduces the concept of semiparametric models and the efficiency bound. It further discusses two nonparametric techniques employed in the following chapters, kernel regression and B-spline approximation. The chapter then addresses the concept of dataset shift. In Chapter 2, novel estimators of causal effects for categorical and continuous treatments are proposed by using an optimal covariate balancing strategy for inverse probability weighting. The resulting estimators are shown to be consistent for causal contrasts and asymptotically normal, when either the model explaining the treatment assignment is correctly specified, or the correct set of bases for the outcome models has been chosen and the assignment model is sufficiently rich. Asymptotic results are complemented with simulations illustrating the finite sample properties. A data analysis suggests a nonlinear effect of BMI on self-reported health decline among the elderly. In Chapter 3, we consider a semiparametric generalized linear model and study estimation of both marginal mean effects and marginal quantile effects in this model. 
We propose an approximate maximum likelihood estimator and rigorously establish the consistency, the asymptotic normality, and the semiparametric efficiency of our method in both the marginal mean effect and the marginal quantile effect estimation. Simulation studies are conducted to illustrate the finite sample performance, and we apply the new tool to analyze non-labor income data and discover a new interesting predictor. In Chapter 4, we propose a procedure to select the best training subsample for a classification model. Identifying patient's disease status from electronic health records (EHR) is a frequently encountered task in EHR related research. However, assessing patient's phenotype is costly and labor intensive, hence a proper selection of EHR as a training set is desired. We propose a procedure to tailor the training subsample for a classification model minimizing its mean squared error (MSE). We provide theoretical justification on its optimality in terms of MSE. The performance gain from our method is illustrated through simulation and a real data example, and is found often satisfactory under criteria beyond mean squared error. In Chapter 5, we study label shift assumption and propose robust estimators for quantities of interest. In studies ranging from clinical medicine to policy research, the quantity of interest is often sought for a population from which only partial data is available, based on complete data from a related but different population. In this work, we consider this setting under the so-called label shift assumption. We propose an estimation procedure that only needs standard nonparametric techniques to approximate a conditional expectation, while by no means needs estimates for other model components. We develop the large sample theory for the proposed estimator, and examine its finite-sample performance through simulation studies, as well as an application to the MIMIC-III database.

Book Towards Distribution free Interpretation  Inference and Network Estimation

Download or read book Towards Distribution free Interpretation Inference and Network Estimation written by Yue Gao (Ph.D.) and published by . This book was released on 2023 with total page 0 pages. Available in PDF, EPUB and Kindle. Book excerpt: In the era of AI, statistical or machine learning methods towards distribution-free assumptions are becoming increasingly important due to the growing amount of data that is being collected and analyzed. Traditional parametric methods may not always be appropriate or may lead to model mis-specification and inaccurate results when dealing with large or complex data sets. Besides, as specific distributional assumptions or parametric modeling are removed, the challenge of model interpretation and prediction inference arises and is currently at the forefront of research efforts. One problem of interest to us in this regard is non-parametric or semi-parametric network estimation for data that are not independent. Specifically, influence network estimation from a multi-variate point process or time series data is a problem of fundamental importance. Prior work has focused on parametric approaches that require a known parametric model, which makes estimation procedures less robust to model mis-specification, non-linearities and heterogeneities. In Chapter 2, we develop a semi-parametric approach based on the monotone single-index multi-variate autoregressive model (SIMAM) which addresses these challenges. In particular, rather than using standard parametric approaches, we use the monotone single index model (SIM) for network estimation. We provide theoretical guarantees for dependent data, and an alternating projected gradient descent algorithm. Significantly, we achieve rates of the form O(T^{-1/3} \sqrt{s\log(TM)}) (optimal in the independent design case), where s is the number of edges in the influence network that indicates the sparsity level, M is the number of actors and T is the number of time points. 
In addition, we demonstrate the performance of SIMAM both on simulated data and two real data examples, and show it outperforms state-of-the-art parametric methods both in terms of prediction and network estimation. Another aspect important for distribution-free or model-free learning is the interpretation, i.e. to make the complicated non-parametric predictive models explainable. A number of model-agnostic methods for measuring variable importance (VI) have emerged in recent times, which assess the difference in predictive power between a full model trained on all variables and a reduced model that omits the variable(s) of interest. However, these methods typically encounter a bottleneck when estimating the reduced model for each variable or subset of variables, which is both costly and lacks theoretical guarantees. To address this problem, Chapter 3 proposes an efficient and adaptable approach for approximating the reduced model while ensuring important inferential guarantees. Specifically, we replace the need for fully retraining a wide neural network with a linearization that is initiated using the full model parameters. By including a ridge-like penalty to make the problem convex, we establish that our method can estimate the variable importance measure with an error rate of O(1/\sqrt{n}), where n represents the number of training samples, provided that the ridge penalty parameter is adequately large. Furthermore, we demonstrate that our estimator is asymptotically normal, enabling us to provide confidence bounds for the VI estimates. Finally, we demonstrate the method's speed and accuracy under different data-generating regimes and showcase its applicability in a real-world seasonal climate forecasting example. In addition to semi-parametric network estimation and fast estimation of variable importance for interpretation, an efficient method for prediction inference without specific distributional assumptions on the data is of our interest as well. 
In Chapter 4, we present a novel, computationally-efficient algorithm for predictive inference (PI) that requires no distributional assumptions on the data and can be computed faster than existing bootstrap-type methods for neural networks. Specifically, if there are n training samples, bootstrap methods require training a model on each of the n subsamples of size n-1; for large models like neural networks, this process can be computationally prohibitive. In contrast, the proposed method trains one neural network on the full dataset with (ε, δ)-differential privacy (DP) and then approximates each leave-one-out model efficiently using a linear approximation around the neural network estimate. With exchangeable data, we prove that our approach has a rigorous coverage guarantee that depends on the preset privacy parameters and the stability of the neural network, regardless of the data distribution. Simulations and experiments on real data demonstrate that our method satisfies the coverage guarantees with substantially reduced computation compared to bootstrap methods.

Book Robust Semi Parametric Inference in Semi Supervised Settings

Download or read book Robust Semi Parametric Inference in Semi Supervised Settings written by Abhishek Chakrabortty and published by . This book was released on 2016 with total page pages. Available in PDF, EPUB and Kindle. Book excerpt: In Chapter 1, we propose a class of Efficient and Adaptive Semi-Supervised Estimators (EASE) for linear regression. These are semi-non-parametric imputation based two-step estimators adaptive to model mis-specification, leading to improved efficiency under model mis-specification, and equal (optimal) efficiency when the linear model holds. This adaptive property is crucial for advocating safe use of U. We provide asymptotic results establishing our claims, followed by simulations and application to real data.

Book Nonparametric and Semiparametric Models

Download or read book Nonparametric and Semiparametric Models written by Wolfgang Karl Härdle and published by Springer Science & Business Media. This book was released on 2012-08-27 with total page 317 pages. Available in PDF, EPUB and Kindle. Book excerpt: The statistical and mathematical principles of smoothing with a focus on applicable techniques are presented in this book. It naturally splits into two parts: The first part is intended for undergraduate students majoring in mathematics, statistics, econometrics or biometrics whereas the second part is intended to be used by master and PhD students or researchers. The material is easy to accomplish since the e-book character of the text gives a maximum of flexibility in learning (and teaching) intensity.

Book Essays on Semi  non parametric Methods in Econometrics

Download or read book Essays on Semi non parametric Methods in Econometrics written by Sungwon Lee and published by . This book was released on 2018 with total page 416 pages. Available in PDF, EPUB and Kindle. Book excerpt: My dissertation contains three chapters focusing on semi-/non-parametric models in econometrics. The first chapter, which is a joint work with Sukjin Han, considers parametric/semiparametric estimation and inference in a class of bivariate threshold crossing models with dummy endogenous variables. We investigate the consequences of common practices employed by empirical researchers using this class of models, such as the specification of the joint distribution of the unobservables to be a bivariate normal distribution, resulting in a bivariate probit model. To address the problem of misspecification, we propose a semiparametric estimation framework with parametric copula and nonparametric marginal distributions. This specification is an attempt to ensure robustness while achieving point identification and efficient estimation. We establish asymptotic theory for the sieve maximum likelihood estimators that can be used to conduct inference on the individual structural parameters and the average treatment effects. Numerical studies suggest the sensitivity of parametric specification and the robustness of semiparametric estimation. This paper also shows that the absence of excluded instruments may result in the failure of identification, unlike what some practitioners believe. The second chapter develops nonparametric significance tests for quantile regression models with duration outcomes. It is common for empirical studies to specify models with many covariates to eliminate the omitted variable bias, even if some of them are potentially irrelevant. In the case where models are nonparametrically specified, such a practice results in the curse of dimensionality. 
I adopt the integrated conditional moment (ICM) approach, which was developed by Bierens (1982) and Bierens (1990) to construct test statistics. The proposed test statistics are functionals of a stochastic process which converges weakly to a centered Gaussian process. The test has non-trivial power against local alternatives at the parametric rate. A subsampling procedure is proposed to obtain critical values. The third chapter considers identification of treatment effect and its distribution under some distributional assumptions. I assume that a binary treatment is endogenously determined. The main identification objects are the quantile treatment effect and the distribution of the treatment effect. I construct a counterfactual model and apply Manski's approach (Manski (1990)) to find the quantile treatment effects. For the distribution of the treatment effect, I adapt the approach proposed by Fan and Park (2010). Some distributional assumptions called stochastic dominance are imposed on the model to tighten the bounds on the parameters of interest. It also provides confidence regions for identified sets that are pointwise consistent in level. An empirical study on the return to college confirms that the stochastic dominance assumptions improve the bounds on the distribution of the treatment effect.

Book Efficient and Adaptive Estimation for Semiparametric Models

Download or read book Efficient and Adaptive Estimation for Semiparametric Models written by Peter J. Bickel and published by Springer. This book was released on 1998-06-01 with total page 588 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book deals with estimation in situations in which there is believed to be enough information to model parametrically some, but not all of the features of a data set. Such models have arisen in a wide context in recent years, and involve new nonlinear estimation procedures. Statistical models of this type are directly applicable to fields such as economics, epidemiology, and astronomy.

Book Semiparametric Regression with R

Download or read book Semiparametric Regression with R written by Jaroslaw Harezlak and published by Springer. This book was released on 2018-12-12 with total page 331 pages. Available in PDF, EPUB and Kindle. Book excerpt: This easy-to-follow applied book on semiparametric regression methods using R is intended to close the gap between the available methodology and its use in practice. Semiparametric regression has a large literature but much of it is geared towards data analysts who have advanced knowledge of statistical methods. While R now has a great deal of semiparametric regression functionality, many of these developments have not trickled down to rank-and-file statistical analysts. The authors assemble a broad range of semiparametric regression R analyses and put them in a form that is useful for applied researchers. There are chapters devoted to penalized splines, generalized additive models, grouped data, bivariate extensions of penalized splines, and spatial semi-parametric regression models. Where feasible, the R code is provided in the text; however, the book is also accompanied by an external website complete with datasets and R code. Because of its flexibility, semiparametric regression has proven to be of great value with many applications in fields as diverse as astronomy, biology, medicine, economics, and finance. This book is intended for applied statistical analysts who have some familiarity with R.

Book Nonparametric and Semiparametric Methods in Econometrics and Statistics

Download or read book Nonparametric and Semiparametric Methods in Econometrics and Statistics written by William A. Barnett and published by Cambridge University Press. This book was released on 1991-06-28 with total page 512 pages. Available in PDF, EPUB and Kindle. Book excerpt: Papers from a 1988 symposium on the estimation and testing of models that impose relatively weak restrictions on the stochastic behaviour of data.

Book Introduction to Empirical Processes and Semiparametric Inference

Download or read book Introduction to Empirical Processes and Semiparametric Inference written by Michael R. Kosorok and published by Springer Science & Business Media. This book was released on 2007-12-29 with total page 482 pages. Available in PDF, EPUB and Kindle. Book excerpt: Kosorok’s brilliant text provides a self-contained introduction to empirical processes and semiparametric inference. These powerful research techniques are surprisingly useful for developing methods of statistical inference for complex models and in understanding the properties of such methods. This is an authoritative text that covers all the bases, and also a friendly and gradual introduction to the area. The book can be used as research reference and textbook.

Book Robust Estimation in Semiparametric Models

Download or read book Robust Estimation in Semiparametric Models written by Zaiqian Shen and published by . This book was released on 1992 with total page 212 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book Recent Advances in Statistical Models

Download or read book Recent Advances in Statistical Models written by Wenqian Qiao and published by . This book was released on 2012 with total page 104 pages. Available in PDF, EPUB and Kindle. Book excerpt: This dissertation consists of three chapters. It develops new methodologies to address two specific problems of recent statistical research: * How to incorporate hierarchical structure in high dimensional regression model selection. * How to achieve semi-parametric efficiency in the presence of missing data. For the first problem, we provide a new approach to explicitly incorporate a given hierarchical structure among the predictors into high dimensional regression model selection. The proposed estimation approach has a hierarchical grouping property so that a pair of variables that are "close" in the hierarchy will be more likely grouped in the estimated model than those that are "far away". We also prove that the proposed method can consistently select the true model. These properties are demonstrated numerically in simulation and a real data analysis on peripheral-blood mononuclear cell (PBMC) study. For the second problem, two frameworks are considered: generalized partially linear model (GPLM) and causal inference of observational study. Specifically, under the GPLM framework, we consider a broad range of missing patterns which subsume most publications on the same topic. We use the concept of least favorable curve and extend the generalized profile likelihood approach [Severini and Wong (1992)] to estimate the parametric component of the model, and prove that the proposed estimator is consistent and semi-parametrically efficient. Also, under the causal inference framework, we propose to estimate the mean treatment effect with non-randomized treatment exposures in the presence of missing data. 
An appealing aspect of this development is that we incorporate the post-baseline covariates which are often excluded from causal effect inference due to their inherent confounding effect with treatment. We derive the semiparametric efficiency bound for regular asymptotically linear (RAL) estimators and propose an estimator which achieves this bound. Moreover, we prove that the proposed estimator is robust against four types of model mis-specifications. The performance of the proposed approaches is illustrated numerically through simulations and real data analysis on a group testing dataset from the Nebraska Infertility Prevention Project and a burden of illness dataset from Duke University Medical Center.

Book Semiparametric Inference

Download or read book Semiparametric Inference written by Zhi He and published by . This book was released on 2010 with total page pages. Available in PDF, EPUB and Kindle. Book excerpt: Semi-parametric and nonparametric modeling and inference have been widely studied during the last two decades. In this manuscript, we do statistical inference based on semi-parametric and nonparametric models in several different scenarios. Firstly, we develop a semi-parametric additivity test for nonparametric multi-dimensional models. The test statistic can test two-way or higher-order interactions and achieves the greatest local power when the interaction terms have Tukey's format. Secondly, we develop a two-step iterative estimating algorithm for generalized linear models with nonparametric varying dispersion. The algorithm is derived for heteroscedastic error generalized linear models, but it can be extended to more general settings, for example censored data. Thirdly, we develop a multivariate intersection-union bioequivalence test. The intersection-union test is uniformly more powerful than other commonly used tests for multivariate bioequivalence. Fourthly, we extend the multivariate bioequivalence test to functional data, which can also be considered as high dimensional multivariate data. We develop two bioequivalence tests based on the L2 and L-infinity norms. We illustrate the issues and methodology through simulation and in the context of an ultrasound safety study, a backscatter coefficient vs. frequency study, and a pharmacokinetics study.

Book Moving Beyond Non Informative Prior Distributions  Achieving the Full Potential of Bayesian Methods for Psychological Research

Download or read book Moving Beyond Non Informative Prior Distributions Achieving the Full Potential of Bayesian Methods for Psychological Research written by Christoph Koenig and published by Frontiers Media SA. This book was released on 2022-02-01 with total page 197 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book Robust Inference Using Higher Order Influence Functions

Download or read book Robust Inference Using Higher Order Influence Functions written by Lingling Li and published by . This book was released on 2007 with total page 258 pages. Available in PDF, EPUB and Kindle. Book excerpt: We present a theory of point and interval estimation for nonlinear functionals in parametric, semi-, and non-parametric models based on higher order influence functions (Robins 2004, Sec. 9, Li et al., 2006, Tchetgen et al., 2006, Robins et al., 2007). Higher order influence functions are higher order U-statistics. Our theory extends the first order semiparametric theory of Bickel et al. (1993) and van der Vaart (1991) by incorporating the theory of higher order scores considered by Pfanzagl (1990), Small and McLeish (1994), and Lindsay and Waterman (1996). The theory reproduces many previous results, produces new non-√n results, and opens up the ability to perform optimal non-√n inference in complex high dimensional models. We present novel rate-optimal point and interval estimators for various functionals of central importance to biostatistics in settings in which estimation at the expected √n rate is not possible, owing to the curse of dimensionality. We also show that our higher order influence functions have a multi-robustness property that extends the double robustness property of first order influence functions described by Robins and Rotnitzky (2001) and van der Laan and Robins (2003).