EBookClubs

Read Books & Download eBooks Full Online

Book Modified Stochastic Variance Reduction Gradient Descent Algorithm and Its Application

Download or read book Modified Stochastic Variance Reduction Gradient Descent Algorithm and Its Application written by Cai Fei and published by . This book was released on 2018 with total page pages. Available in PDF, EPUB and Kindle. Book excerpt: While machine learning is becoming an indispensable element of modern society, various algorithms have been developed to help decision makers solve complicated problems. A major theme of this study is to review and analyze popular algorithms, with a focus on Stochastic Gradient (SG) based methods for large-scale machine learning problems. While SG is a fundamental method that plays an essential role in optimization, the algorithm has been further modified by various researchers for improved performance. Stochastic Variance Reduced Gradient (SVRG) is a method known for its low computation cost and fast convergence rate in solving convex optimization problems. In nonconvex settings, however, the existence of saddle points degrades the performance of the algorithm. Since practical real-world problems largely lie in nonconvex settings, a new algorithm is designed and discussed in this study to further improve the performance of SVRG. The new algorithm combines traditional SVRG with two additional features introduced by Perturbed Accelerated Gradient Descent (Perturbed AGD) to help the algorithm escape saddle points more quickly, which ultimately leads to convergence in nonconvex optimization. This study focuses on the elaboration of the modified SVRG algorithm and its implementation on a synthetic and an empirical dataset.
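The excerpt above describes combining an SVRG loop with perturbations borrowed from Perturbed AGD so that iterates can escape saddle points. The following is a minimal sketch of one way such a combination can look, assuming generic gradient oracles; the function names, the perturbation trigger, and the hyperparameters are illustrative and should not be read as the author's exact algorithm.

```python
import numpy as np

def perturbed_svrg(grad_full, grad_i, w0, n, eta=0.01, epochs=20, m=None,
                   g_thresh=1e-3, radius=1e-2, rng=None):
    """Sketch of SVRG with an isotropic perturbation near suspected saddle points.

    grad_full(w) -> full gradient over all n samples
    grad_i(w, i) -> gradient of the i-th sample's loss
    The perturbation follows the perturbed-gradient-descent idea: when the
    reference gradient is small, jitter the iterate inside a small ball.
    """
    rng = np.random.default_rng() if rng is None else rng
    m = n if m is None else m                 # inner-loop length
    w_snap = np.asarray(w0, dtype=float)
    for _ in range(epochs):
        mu = grad_full(w_snap)                # full gradient at the snapshot
        w = w_snap.copy()
        if np.linalg.norm(mu) <= g_thresh:
            # small gradient: possibly near a saddle point, so perturb
            ball = rng.normal(size=w.shape)
            w = w + radius * ball / (np.linalg.norm(ball) + 1e-12)
        for _ in range(m):
            i = rng.integers(n)
            v = grad_i(w, i) - grad_i(w_snap, i) + mu   # variance-reduced gradient
            w = w - eta * v
        w_snap = w                            # use the last iterate as the next snapshot
    return w_snap
```

In this sketch the perturbation is triggered only when the snapshot gradient is small, the regime in which a saddle point, rather than ordinary descent, is the likely cause of slow progress.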

Book Riemannian Optimization and Its Applications

Download or read book Riemannian Optimization and Its Applications written by Hiroyuki Sato and published by Springer Nature. This book was released on 2021-02-17 with total page 129 pages. Available in PDF, EPUB and Kindle. Book excerpt: This brief describes the basics of Riemannian optimization—optimization on Riemannian manifolds—introduces algorithms for Riemannian optimization problems, discusses the theoretical properties of these algorithms, and suggests possible applications of Riemannian optimization to problems in other fields. To provide the reader with a smooth introduction to Riemannian optimization, brief reviews of mathematical optimization in Euclidean spaces and Riemannian geometry are included. Riemannian optimization is then introduced by merging these concepts. In particular, the Euclidean and Riemannian conjugate gradient methods are discussed in detail. A brief review of recent developments in Riemannian optimization is also provided. Riemannian optimization methods are applicable to many problems in various fields. This brief discusses some important applications including the eigenvalue and singular value decompositions in numerical linear algebra, optimal model reduction in control engineering, and canonical correlation analysis in statistics.
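The recipe described in this excerpt (compute a Riemannian gradient by projecting the Euclidean gradient onto the tangent space, then retract back onto the manifold) can be illustrated on the eigenvalue application it mentions. Below is a minimal sketch of Riemannian gradient descent on the unit sphere for the leading eigenvector of a symmetric matrix; the book treats conjugate gradient methods in detail, but plain gradient descent keeps the example short, and the function names and step size are illustrative.

```python
import numpy as np

def sphere_rgd(A, x0, step=0.1, iters=500):
    """Riemannian gradient descent on the unit sphere for the leading
    eigenvector of a symmetric matrix A (minimize f(x) = -x^T A x).

    Tangent-space projection: P_x(g) = g - (x^T g) x
    Retraction:               R_x(v) = (x + v) / ||x + v||
    """
    x = np.asarray(x0, dtype=float)
    x = x / np.linalg.norm(x)
    for _ in range(iters):
        egrad = -2.0 * A @ x                 # Euclidean gradient of f
        rgrad = egrad - (x @ egrad) * x      # project onto the tangent space at x
        x = x - step * rgrad                 # step in the tangent direction
        x = x / np.linalg.norm(x)            # retract back onto the sphere
    return x

# usage: x_hat approximates the top eigenvector of A
A = np.array([[3.0, 1.0], [1.0, 2.0]])
x_hat = sphere_rgd(A, np.array([1.0, 0.0]))
```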

Book On Variants of Stochastic Gradient Descent

Download or read book On Variants of Stochastic Gradient Descent written by Vatsal Nilesh Shah and published by . This book was released on 2020 with total page 444 pages. Available in PDF, EPUB and Kindle. Book excerpt: Stochastic Gradient Descent (SGD) has played a crucial role in the success of modern machine learning methods. The popularity of SGD arises from its ease of implementation, low memory and computational requirements, and applicability to a wide variety of optimization problems. However, SGD suffers from numerous issues; chief among them are high variance, slow rate of convergence, poor generalization, non-robustness to outliers, and poor performance for imbalanced classification. In this thesis, we propose variants of stochastic gradient descent to tackle one or more of these issues in different problem settings. In the first chapter, we analyze the trade-off between variance and complexity to improve the convergence rate of SGD. A common alternative in the literature to SGD is Stochastic Variance Reduced Gradient (SVRG), which achieves linear convergence. However, SVRG involves the computation of a full gradient every few epochs, which is often intractable. We propose the Cheap Stochastic Variance Reduced Gradient (CheapSVRG) algorithm, which attains linear convergence up to a neighborhood around the optimum without requiring a full gradient computation step. In the second chapter, we compare the generalization capabilities of adaptive and non-adaptive methods for over-parameterized linear regression. Of the many possible solutions, SGD tends to gravitate towards the solution with minimum l2-norm, while adaptive methods do not. We provide specific conditions on the pre-conditioner matrices under which a subclass of adaptive methods has the same generalization guarantees as SGD for over-parameterized linear regression. With synthetic examples and real data, we show that minimum-norm solutions are not a reliable certificate of better generalization. In the third chapter, we propose a simple variant of SGD, MKL-SGD, that guarantees robustness. Instead of considering SGD with one sample, we take a mini-batch and choose the sample with the lowest loss. For the noiseless framework with and without outliers, we provide conditions for the convergence of MKL-SGD to a provably better solution than SGD in the worst case. We also perform the standard rate-of-convergence analysis for both noiseless and noisy settings. In the final chapter, we tackle the challenges introduced by imbalanced class distributions in SGD. Instead of using all the samples to update the parameters, our proposed Balancing SGD (B-SGD) algorithm rejects samples with low loss, as they are redundant and do not play a role in determining the separating hyperplane. Imposing this label-dependent loss-based thresholding scheme on incoming samples allows us to improve the rate of convergence and achieve better generalization.
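The third-chapter idea quoted above (draw a mini-batch and update with the lowest-loss sample) is simple enough to sketch directly. The code below is an illustrative rendering under assumed loss and gradient oracles, not the thesis's exact MKL-SGD procedure or its analysis.

```python
import numpy as np

def min_loss_sgd(loss, grad, X, y, w0, batch=8, eta=0.05, steps=1000, rng=None):
    """Sketch of a min-loss-in-batch SGD step: draw a mini-batch, keep only the
    sample with the smallest loss, and update with that sample's gradient
    (intended to down-weight outliers).

    loss(w, x_i, y_i) -> scalar loss;  grad(w, x_i, y_i) -> gradient w.r.t. w
    """
    rng = np.random.default_rng() if rng is None else rng
    w = np.asarray(w0, dtype=float)
    n = len(y)
    for _ in range(steps):
        idx = rng.choice(n, size=batch, replace=False)   # mini-batch of indices
        losses = [loss(w, X[i], y[i]) for i in idx]
        j = idx[int(np.argmin(losses))]                  # sample with the lowest loss
        w = w - eta * grad(w, X[j], y[j])                # SGD step on that sample alone
    return w
```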

Book Optimization for Machine Learning

Download or read book Optimization for Machine Learning written by Suvrit Sra and published by MIT Press. This book was released on 2012 with total page 509 pages. Available in PDF, EPUB and Kindle. Book excerpt: An up-to-date account of the interplay between optimization and machine learning, accessible to students and researchers in both communities. The interplay between optimization and machine learning is one of the most important developments in modern computational science. Optimization formulations and methods are proving to be vital in designing algorithms to extract essential knowledge from huge volumes of data. Machine learning, however, is not simply a consumer of optimization technology but a rapidly evolving field that is itself generating new optimization ideas. This book captures the state of the art of the interaction between optimization and machine learning in a way that is accessible to researchers in both fields. Optimization approaches have enjoyed prominence in machine learning because of their wide applicability and attractive theoretical properties. The increasing complexity, size, and variety of today's machine learning models call for the reassessment of existing assumptions. This book starts the process of reassessment. It describes the resurgence in novel contexts of established frameworks such as first-order methods, stochastic approximations, convex relaxations, interior-point methods, and proximal methods. It also devotes attention to newer themes such as regularized optimization, robust optimization, gradient and subgradient methods, splitting techniques, and second-order methods. Many of these techniques draw inspiration from other fields, including operations research, theoretical computer science, and subfields of optimization. The book will enrich the ongoing cross-fertilization between the machine learning community and these other fields, and within the broader optimization community.

Book Optimization Algorithms for Machine Learning

Download or read book Optimization Algorithms for Machine Learning written by Anant Raj and published by . This book was released on 2020 with total page 0 pages. Available in PDF, EPUB and Kindle. Book excerpt: With the advent of massive datasets and increasingly complex tasks, modern machine learning systems pose several new challenges in terms of scalability to high-dimensional data as well as to large datasets. In this thesis, we study scalable descent methods, such as coordinate descent and stochastic coordinate descent, which are based on stochastic approximation of the full gradient. In the first part of the thesis, we propose faster and scalable coordinate-based optimization that scales to high-dimensional problems. As a first step toward scalable coordinate-based descent approaches, we propose a new framework to derive screening rules for convex optimization problems based on the duality gap, which covers a large class of constrained and penalized optimization formulations. In later stages, we develop a new approximately greedy coordinate selection strategy for coordinate descent in large-scale optimization. This novel coordinate selection strategy provably works better than uniformly random selection and can reach the efficiency of steepest coordinate descent (SCD) in the best case, which may enable an acceleration by a factor of up to n, the number of coordinates. With a similar objective in mind, we further propose an adaptive sampling strategy for stochastic gradient based optimization. The proposed safe sampling scheme provably achieves faster convergence than any fixed deterministic sampling scheme for coordinate descent and stochastic gradient descent methods. Exploiting the connection between matching pursuit, where a more general notion of directions is considered, and greedy coordinate descent, where all moving directions are orthogonal, we also propose a unified analysis of both approaches and extend it to obtain accelerated rates. In the second part of this thesis, we focus on providing provably faster and scalable mini-batch stochastic gradient descent (SGD) algorithms. Variance-reduced SGD methods converge significantly faster than their vanilla SGD counterparts. We propose a variance-reduced algorithm, k-SVRG, that addresses issues of SVRG [98] and SAGA [54] by making the best use of the available memory and minimizing stalling phases without progress. In a later part of the work, we provide a simple framework that uses the idea of optimistic updates to obtain accelerated stochastic algorithms; accelerated variance-reduced algorithms as well as accelerated universal algorithms follow as a direct consequence of this framework. Going further, we also employ the idea of local-sensitivity-based importance sampling in an iterative optimization method and analyze its convergence while optimizing over the selected subset. In the final part of the thesis, we connect the dots between coordinate descent and stochastic gradient descent methods in the interpolation regime. We show that better stochastic-gradient-based dual algorithms with fast rates of convergence can be obtained to optimize the convex objective in the interpolation regime.
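For reference, steepest coordinate descent (SCD), the baseline the approximately greedy strategy above aims to match, picks the coordinate with the largest partial derivative at every step. The sketch below shows that selection rule on a smooth problem; it uses the full gradient only to make the rule explicit, whereas the point of the thesis's approximate strategy is to avoid exactly that cost. Names and constants are illustrative.

```python
import numpy as np

def greedy_coordinate_descent(grad, w0, lipschitz, iters=500):
    """Steepest (Gauss-Southwell) coordinate descent sketch: at each step, pick the
    coordinate with the largest partial derivative and update only that coordinate.

    grad(w)   -> full gradient vector (used here only to select and step one coordinate)
    lipschitz -> per-coordinate Lipschitz constants of the gradient
    """
    w = np.array(w0, dtype=float)         # copy so the caller's array is untouched
    L = np.asarray(lipschitz, dtype=float)
    for _ in range(iters):
        g = grad(w)
        j = int(np.argmax(np.abs(g)))     # steepest coordinate
        w[j] -= g[j] / L[j]               # exact coordinate step for an L_j-smooth f
    return w

# usage on a quadratic f(w) = 0.5 * w^T Q w - b^T w, whose minimizer solves Q w = b
Q = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
w_star = greedy_coordinate_descent(lambda w: Q @ w - b, np.zeros(2), np.diag(Q))
```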

Book Stochastic Approximation and Recursive Algorithms and Applications

Download or read book Stochastic Approximation and Recursive Algorithms and Applications written by Harold Kushner and published by Springer Science & Business Media. This book was released on 2006-05-04 with total page 485 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book presents a thorough development of the modern theory of stochastic approximation or recursive stochastic algorithms for both constrained and unconstrained problems. This second edition is a thorough revision, although the main features and structure remain unchanged. It contains many additional applications and results as well as more detailed discussion.
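At the core of the theory developed in this book is the Robbins-Monro recursion, which updates an iterate using a noisy observation scaled by diminishing step sizes. Below is a minimal sketch under standard step-size assumptions; the example objective and all names are illustrative, and the constrained and weak-convergence settings treated in the book are not covered.

```python
import numpy as np

def robbins_monro(noisy_grad, x0, steps=10000, a0=1.0):
    """Classical Robbins-Monro recursion: x_{k+1} = x_k - a_k * Y_k, where Y_k is a
    noisy observation of the gradient (or of the function whose root is sought)
    and the step sizes satisfy sum a_k = inf and sum a_k^2 < inf.
    """
    x = np.asarray(x0, dtype=float)
    for k in range(1, steps + 1):
        a_k = a0 / k                      # a_k = a0 / k satisfies both conditions
        x = x - a_k * noisy_grad(x)
    return x

# usage: minimize E[(x - Z)^2] / 2 with Z ~ N(3, 1); the noisy gradient is x - Z,
# so the recursion converges to x* = 3
rng = np.random.default_rng(0)
x_hat = robbins_monro(lambda x: x - rng.normal(3.0, 1.0), x0=0.0)
```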

Book Distributed Optimization: Advances in Theories, Methods, and Applications

Download or read book Distributed Optimization: Advances in Theories, Methods, and Applications written by Huaqing Li and published by Springer Nature. This book was released on 2020-08-04 with total page 243 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book offers a valuable reference guide for researchers in distributed optimization and for senior undergraduate and graduate students alike. Focusing on the nature and functions of agents, communication networks, and algorithms in the context of distributed optimization for networked control systems, this book introduces readers to the background of distributed optimization; recent developments in distributed algorithms for various types of underlying communication networks; the implementation of computation-efficient and communication-efficient strategies in the execution of distributed algorithms; and the frameworks of convergence analysis and performance evaluation. On this basis, the book then thoroughly studies 1) distributed constrained optimization and the random sleep scheme, from an agent perspective; 2) asynchronous broadcast-based algorithms, event-triggered communication, quantized communication, unbalanced directed networks, and time-varying networks, from a communication network perspective; and 3) accelerated algorithms and stochastic gradient algorithms, from an algorithm perspective. Finally, the applications of distributed optimization in large-scale statistical learning, in wireless sensor networks, and in optimal energy management for smart grids are discussed.
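As a concrete, deliberately simplified illustration of the agent, communication-network, and algorithm split described above, the sketch below implements generic decentralized gradient descent: each agent mixes its neighbors' iterates through a doubly stochastic matrix and then takes a local gradient step. The variable names and the synchronous, unquantized setting are assumptions for the sketch, not a method taken from the book.

```python
import numpy as np

def decentralized_gd(grads, W, X0, step=0.05, iters=300):
    """Decentralized gradient descent over a network: each agent i performs
    x_i <- sum_j W[i, j] * x_j - step * grad_i(x_i).

    grads -> list of per-agent gradient functions grad_i(x), each returning shape (dim,)
    W     -> doubly stochastic mixing matrix matching the communication graph
    X0    -> array of shape (num_agents, dim) with each agent's initial iterate
    """
    X = np.asarray(X0, dtype=float)
    for _ in range(iters):
        mixed = W @ X                                     # consensus (communication) step
        local = np.stack([g(x) for g, x in zip(grads, X)])
        X = mixed - step * local                          # local gradient step per agent
    return X
```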

Book First-order and Stochastic Optimization Methods for Machine Learning

Download or read book First-order and Stochastic Optimization Methods for Machine Learning written by Guanghui Lan and published by Springer Nature. This book was released on 2020-05-15 with total page 591 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book covers not only foundational material but also the most recent progress made during the past few years in the area of machine learning algorithms. In spite of intensive research and development in this area, there does not exist a systematic treatment that introduces the fundamental concepts and recent progress in machine learning algorithms, especially those based on stochastic optimization methods, randomized algorithms, nonconvex optimization, distributed and online learning, and projection-free methods. This book will benefit a broad audience in the machine learning, artificial intelligence, and mathematical programming communities by presenting these recent developments in a tutorial style, starting from the basic building blocks and progressing to the most carefully designed and complicated algorithms for machine learning.

Book Proceedings of COMPSTAT 2010

Download or read book Proceedings of COMPSTAT 2010 written by Yves Lechevallier and published by Springer Science & Business Media. This book was released on 2010-11-08 with total page 627 pages. Available in PDF, EPUB and Kindle. Book excerpt: Proceedings of the 19th International Symposium on Computational Statistics, held in Paris, August 22-27, 2010. Together with 3 keynote talks, there were 14 invited sessions and more than 100 peer-reviewed contributed communications.

Book Non-convex Optimization for Machine Learning

Download or read book Non-convex Optimization for Machine Learning written by Prateek Jain and published by Foundations and Trends in Machine Learning. This book was released on 2017-12-04 with total page 218 pages. Available in PDF, EPUB and Kindle. Book excerpt: Non-convex Optimization for Machine Learning takes an in-depth look at the basics of non-convex optimization with applications to machine learning. It introduces the rich literature in this area, as well as equips the reader with the tools and techniques needed to apply and analyze simple but powerful procedures for non-convex problems. Non-convex Optimization for Machine Learning is as self-contained as possible while not losing focus of the main topic of non-convex optimization techniques. The monograph initiates the discussion with entire chapters devoted to presenting a tutorial-like treatment of basic concepts in convex analysis and optimization, as well as their non-convex counterparts. The monograph concludes with a look at four interesting applications in the areas of machine learning and signal processing, exploring how the non-convex optimization techniques introduced earlier can be used to solve these problems. The monograph also contains, for each of the topics discussed, exercises and figures designed to engage the reader, as well as extensive bibliographic notes pointing towards classical works and recent advances. Non-convex Optimization for Machine Learning can be used for a semester-length course on the basics of non-convex optimization with applications to machine learning. On the other hand, it is also possible to cherry-pick individual portions, such as the chapter on sparse recovery or the EM algorithm, for inclusion in a broader course. Several courses such as those in machine learning, optimization, and signal processing may benefit from the inclusion of such topics.

Book Neural Networks: Tricks of the Trade

Download or read book Neural Networks: Tricks of the Trade written by Grégoire Montavon and published by Springer. This book was released on 2012-11-14 with total page 753 pages. Available in PDF, EPUB and Kindle. Book excerpt: The last twenty years have been marked by an increase in available data and computing power. In parallel to this trend, the focus of neural network research and the practice of training neural networks have undergone a number of important changes, for example, the use of deep learning machines. The second edition of the book augments the first edition with more tricks, which have resulted from 14 years of theory and experimentation by some of the world's most prominent neural network researchers. These tricks can make a substantial difference (in terms of speed, ease of implementation, and accuracy) when it comes to putting algorithms to work on real problems.

Book Probabilistic Machine Learning

Download or read book Probabilistic Machine Learning written by Kevin P. Murphy and published by MIT Press. This book was released on 2022-03-01 with total page 858 pages. Available in PDF, EPUB and Kindle. Book excerpt: A detailed and up-to-date introduction to machine learning, presented through the unifying lens of probabilistic modeling and Bayesian decision theory. This book offers a detailed and up-to-date introduction to machine learning (including deep learning) through the unifying lens of probabilistic modeling and Bayesian decision theory. The book covers mathematical background (including linear algebra and optimization), basic supervised learning (including linear and logistic regression and deep neural networks), as well as more advanced topics (including transfer learning and unsupervised learning). End-of-chapter exercises allow students to apply what they have learned, and an appendix covers notation. Probabilistic Machine Learning grew out of the author’s 2012 book, Machine Learning: A Probabilistic Perspective. More than just a simple update, this is a completely new book that reflects the dramatic developments in the field since 2012, most notably deep learning. In addition, the new book is accompanied by online Python code, using libraries such as scikit-learn, JAX, PyTorch, and TensorFlow, which can be used to reproduce nearly all the figures; this code can be run inside a web browser using cloud-based notebooks, and provides a practical complement to the theoretical topics discussed in the book. This introductory text will be followed by a sequel that covers more advanced topics, taking the same probabilistic approach.

Book Advances in Knowledge Discovery and Data Mining

Download or read book Advances in Knowledge Discovery and Data Mining written by De-Nian Yang and published by Springer Nature. This book was released on with total page 448 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book Advances in Knowledge Discovery and Data Mining

Download or read book Advances in Knowledge Discovery and Data Mining written by Hisashi Kashima and published by Springer Nature. This book was released on 2023-05-27 with total page 563 pages. Available in PDF, EPUB and Kindle. Book excerpt: The 4-volume set LNAI 13935 - 13938 constitutes the proceedings of the 27th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2023, which took place in Osaka, Japan during May 25–28, 2023. The 143 papers presented in these proceedings were carefully reviewed and selected from 813 submissions. They deal with new ideas, original research results, and practical development experiences from all KDD related areas, including data mining, data warehousing, machine learning, artificial intelligence, databases, statistics, knowledge engineering, big data technologies, and foundations.

Book Sample Efficient Nonconvex Optimization Algorithms in Machine Learning and Reinforcement Learning

Download or read book Sample Efficient Nonconvex Optimization Algorithms in Machine Learning and Reinforcement Learning written by Pan Xu and published by . This book was released on 2021 with total page 246 pages. Available in PDF, EPUB and Kindle. Book excerpt: Machine learning and reinforcement learning have achieved tremendous success in solving problems in various real-world applications. Many modern learning problems boil down to a nonconvex optimization problem, where the objective function is the average or the expectation of some loss function over a finite or infinite dataset. Solving such nonconvex optimization problems, in general, can be NP-hard. Thus one often tackles such a problem through incremental steps based on the nature and the goal of the problem: finding a first-order stationary point, finding a second-order stationary point (or a local optimum), and finding a global optimum. With the size and complexity of the machine learning datasets rapidly increasing, it has become a fundamental challenge to design efficient and scalable machine learning algorithms that can improve the performance in terms of accuracy and save computational cost in terms of sample efficiency at the same time. Though many algorithms based on stochastic gradient descent have been developed and widely studied theoretically and empirically for nonconvex optimization, it has remained an open problem whether we can achieve the optimal sample complexity for finding a first-order stationary point and for finding local optima in nonconvex optimization. In this thesis, we start with the stochastic nested variance reduced gradient (SNVRG) algorithm, which is developed based on stochastic gradient descent methods and variance reduction techniques. We prove that SNVRG achieves the near-optimal convergence rate among its type for finding a first-order stationary point of a nonconvex function. We further build algorithms to efficiently find the local optimum of a nonconvex objective function by examining the curvature information at the stationary point found by SNVRG. With the ultimate goal of finding the global optimum in nonconvex optimization, we then provide a unified framework to analyze the global convergence of stochastic gradient Langevin dynamics-based algorithms for a nonconvex objective function. In the second part of this thesis, we generalize the aforementioned sample-efficient stochastic nonconvex optimization methods to reinforcement learning problems, including policy gradient, actor-critic, and Q-learning. For these problems, we propose novel algorithms and prove that they enjoy state-of-the-art theoretical guarantees on the sample complexity. The works presented in this thesis form an incomplete collection of the recent advances and developments of sample-efficient nonconvex optimization algorithms for both machine learning and reinforcement learning.
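The SNVRG algorithm and its second-order extensions summarized above are involved; as a small, self-contained illustration of one ingredient the excerpt mentions for global convergence, here is a sketch of a stochastic gradient Langevin dynamics (SGLD) step, which adds scaled Gaussian noise to a stochastic gradient update. The parameter names and the constant temperature are illustrative assumptions, not the thesis's exact algorithm.

```python
import numpy as np

def sgld(stoch_grad, x0, step=1e-3, temperature=1.0, iters=5000, rng=None):
    """Stochastic gradient Langevin dynamics sketch: a stochastic gradient step plus
    Gaussian noise scaled so the iterates approximately sample from
    exp(-f(x) / temperature), the mechanism behind Langevin-based global-convergence
    analyses for nonconvex objectives.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        noise = rng.normal(size=x.shape)
        x = x - step * stoch_grad(x) + np.sqrt(2.0 * step * temperature) * noise
    return x
```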

Book On Deterministic and Stochastic Optimization Algorithms for Problems with Riemannian Manifold Constraints

Download or read book On Deterministic and Stochastic Optimization Algorithms for Problems with Riemannian Manifold Constraints written by Dewei Zhang (Ph. D. in systems engineering) and published by . This book was released on 2021 with total page 0 pages. Available in PDF, EPUB and Kindle. Book excerpt: Optimization methods have been extensively studied given their broad applications in areas such as applied mathematics, statistics, engineering, healthcare, business, and finance. In the past two decades, the fast growth of machine learning and artificial intelligence and their increasing applications in different industries have resulted in various optimization challenges related to scalability, uncertainty, or the requirement to satisfy certain constraints. This dissertation mainly looks into optimization problems whose solutions are required to satisfy certain (possibly nonlinear) constraints, namely Riemannian manifold constraints, or to conform to sparsity structures given by directed acyclic graphs. More specifically, the dissertation explores the following research directions. 1) To optimize finite-sum objective functions over Riemannian manifolds, the dissertation proposes a stochastic variance-reduced cubic-regularized Newton algorithm. The proposed algorithm requires full gradient and Hessian updates at the beginning of each epoch, while it performs stochastic variance-reduced updates in the iterations within each epoch. The iteration complexity of the algorithm to obtain an (ε, √ε)-second-order stationary point, i.e., a point whose Riemannian gradient norm is upper bounded by ε and whose minimum Riemannian Hessian eigenvalue is lower bounded by -√ε, is shown to be O(ε^(-3/2)). Furthermore, the dissertation proposes a computationally more appealing extension of the algorithm that only requires an inexact solution of the cubic-regularized Newton subproblem, with the same iteration complexity. 2) To optimize nested compositions of two or more functions containing expectations over Riemannian manifolds, the dissertation proposes multi-level stochastic compositional algorithms. For two-level compositional optimization, it presents a Riemannian Stochastic Compositional Gradient Descent (R-SCGD) method that finds an approximate stationary point, with expected squared Riemannian gradient smaller than ε, in O(ε^(-2)) calls to the stochastic gradient oracle of the outer function and the stochastic function and gradient oracles of the inner function. It further generalizes R-SCGD to problems with multi-level nested compositional structures, with the same O(ε^(-2)) complexity for first-order stochastic oracles. 3) In many statistical learning problems, it is desirable that the optimal solution conform to an a priori known sparsity structure represented by a directed acyclic graph. Inducing such structures by means of convex regularizers requires nonsmooth penalty functions that exploit group overlap. The dissertation investigates evaluating the proximal operator of the latent overlapping group lasso through an optimization algorithm with parallelizable subproblems. It implements an Alternating Direction Method of Multipliers with a sharing scheme to solve large-scale instances of the underlying optimization problem efficiently. In the absence of strong convexity, global linear convergence of the algorithm is established using error bound theory; in particular, the work establishes primal and dual error bounds when the nonsmooth component of the objective function does not have a polyhedral epigraph. The theoretical results established in each chapter are numerically verified through carefully designed simulation studies and are also implemented on real applications with real data sets.
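The two-level R-SCGD scheme summarized above tracks the inner function value with a running average while taking chain-rule-style gradient steps. The sketch below shows a Euclidean simplification of that idea under assumed oracle interfaces; the Riemannian projection and retraction steps of the dissertation's method are omitted, and all names are illustrative.

```python
import numpy as np

def scgd(sample_inner, sample_outer_grad, x0, y0, alpha=0.01, beta=0.1, iters=5000):
    """Euclidean sketch of two-level stochastic compositional gradient descent for
    min_x f(g(x)) with g(x) = E[g_w(x)] and f(y) = E[f_v(y)].

    sample_inner(x)      -> (g_w(x), jac_g_w(x)): a stochastic inner value and its Jacobian
    sample_outer_grad(y) -> grad f_v(y): a stochastic gradient of the outer function
    """
    x = np.asarray(x0, dtype=float)
    y = np.asarray(y0, dtype=float)
    for _ in range(iters):
        g_val, g_jac = sample_inner(x)
        y = (1.0 - beta) * y + beta * g_val               # running estimate of g(x)
        x = x - alpha * g_jac.T @ sample_outer_grad(y)    # chain-rule-style update
    return x
```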

Book Approximate Computing Techniques

Download or read book Approximate Computing Techniques written by Alberto Bosio and published by Springer Nature. This book was released on 2022-06-10 with total page 541 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book serves as a single-source reference to the latest advances in Approximate Computing (AxC), a promising technique for increasing performance or reducing the cost and power consumption of a computing system. The authors discuss the different AxC design and validation techniques, and their integration. They also describe real AxC applications, spanning from mobile to high performance computing and also safety-critical applications.