Computers

Our Experience Converting an IBM Forecasting Solution from R to IBM SPSS Modeler

Book Details:

Author : Pitipong JS Lin
Publisher : IBM Redbooks
Release : 2015-03-06
ISBN : 0738454141
Pages : 82 pages

Book excerpt: This IBM® Redpaper™ publication presents the process and steps that were taken to move from an R language forecasting solution to an IBM SPSS® Modeler solution. The paper identifies the key challenges that the team faced and the lessons they learned. It describes the journey from analysis through design to key actions that were taken during development to make the conversion successful. The solution approach is described in detail so that you can learn how the team broke the original R solution architecture into logical components in order to plan for the conversion project. You see key aspects of the conversion from R to IBM SPSS Modeler and how basic parts, such as data preparation, verification, pre-screening, and automating data quality checks, are accomplished. The paper consists of three chapters: Chapter 1 introduces the business background and the problem domain. Chapter 2 explains critical technical challenges that the team confronted and solved. Chapter 3 focuses on lessons that were learned during this process and ideas that might apply to your conversion project. This paper applies to various audiences: decision makers and IT architects who focus on the architecture, roadmap, software platform, and total cost of ownership; and solution development team members who are involved in creating statistical/analytics-based solutions and who are familiar with R and IBM SPSS Modeler.

Computers

Introduction to R in IBM SPSS Modeler

Book Details:

Author : Wannes Rosius
Publisher : IBM Redbooks
Release : 2016-10-14
ISBN : 0738455601
Pages : 54 pages

Book excerpt: This IBM Redpaper™ publication focuses on the integration between IBM® SPSS® Modeler and R. The paper is aimed at people who know IBM SPSS Modeler and have only a very limited knowledge of R. Chapters 2, 3, and 4 provide you with a high-level understanding of R integration within SPSS Modeler, enabling you to create or recreate some very basic R models within SPSS Modeler, even if you have only a basic knowledge of R. Chapter 5 provides more detailed tips and tricks. This chapter is for the experienced user and consists of items that might help you get up to speed with more detailed functions of the integration and understand some pitfalls.

Computers

Systems of Insight for Digital Transformation Using IBM Operational Decision Manager Advanced and Predictive Analytics

Book Details:

Author : Whei-Jen Chen
Publisher : IBM Redbooks
Release : 2015-12-03
ISBN : 073844118X
Pages : 266 pages

Book excerpt: Systems of record (SORs) are engines that generate value for your business. Systems of engagement (SOEs) are always evolving and generating new customer-centric experiences and new opportunities to capitalize on the value in the systems of record. The highest value is gained when systems of record and systems of engagement are brought together to deliver insight. Systems of insight (SOIs) monitor and analyze what is going on with various behaviors in the systems of engagement and information being stored or transacted in the systems of record. SOIs seek new opportunities, risks, and operational behavior that needs to be reported or have action taken to optimize business outcomes. Systems of insight are at the core of the Digital Experience, which tries to derive insights from the enormous amount of data generated by automated processes and customer interactions. Systems of insight can also provide the ability to apply analytics and rules to real-time data as it flows within, throughout, and beyond the enterprise (applications, databases, mobile, social, Internet of Things) to gain the wanted insight. Deriving this insight is a key step toward being able to make the best decisions and take the most appropriate actions. Examples of such actions are to improve the number of satisfied clients, identify clients at risk of leaving and incentivize them to stay loyal, identify patterns of risk or fraudulent behavior and take action to minimize it as early as possible, and detect patterns of behavior in operational systems and transportation that lead to failures, delays, and maintenance and take early action to minimize risks and costs.
IBM® Operational Decision Manager is a decision management platform that provides capabilities that support both event-driven insight patterns and business-rule-driven scenarios. It also can easily be used in combination with other IBM Analytics solutions, as the detailed examples will show. IBM Operational Decision Manager Advanced, along with complementary IBM software offerings that also provide capability for systems of insight, provides a way to deliver the greatest value to your customers and your business. IBM Operational Decision Manager Advanced brings together data from different sources to recognize meaningful trends and patterns. It empowers business users to define, manage, and automate repeatable operational decisions. As a result, organizations can create and shape customer-centric business moments. This IBM Redbooks® publication explains the key concepts of systems of insight and how to implement a system of insight solution with examples. It is intended for IT architects and professionals who are responsible for implementing a system of insight solution requiring event-based context pattern detection and deterministic decision services to enhance other analytics solution components with IBM Operational Decision Manager Advanced.

Computers

Enabling Real time Analytics on IBM z Systems Platform

Book Details:

Author : Lydia Parziale
Publisher : IBM Redbooks
Release : 2016-08-08
ISBN : 0738441864
Pages : 218 pages

Book excerpt: Regarding online transaction processing (OLTP) workloads, IBM® z Systems™ platform, with IBM DB2®, data sharing, Workload Manager (WLM), geoplex, and other high-end features, is the widely acknowledged leader. Most customers now integrate business analytics with OLTP by running, for example, scoring functions from transactional context for real-time analytics or by applying machine-learning algorithms on enterprise data that is kept on the mainframe. As a result, IBM adds investment so clients can keep the complete lifecycle for data analysis, modeling, and scoring on z Systems control in a cost-efficient way, keeping the qualities of service in availability, security, and reliability that z Systems solutions offer. Because of the changed architecture and tighter integration, IBM has shown, in a customer proof-of-concept, that a particular client was able to achieve an orders-of-magnitude improvement in performance, allowing that client's data scientist to investigate the data in a more interactive process. Open technologies, such as Predictive Model Markup Language (PMML), can help customers update single components instead of being forced to replace everything at once. As a result, you have the possibility to combine your preferred tool for model generation (such as SAS Enterprise Miner or IBM SPSS® Modeler) with a different technology for model scoring (such as Zementis, a company focused on PMML scoring). IBM SPSS Modeler is a leading data mining workbench that can apply various algorithms in data preparation, cleansing, statistics, visualization, machine learning, and predictive analytics. It has over 20 years of experience and continued development, and is integrated with z Systems.
With IBM DB2 Analytics Accelerator 5.1 and SPSS Modeler 17.1, the possibility exists to do the complete predictive model creation, including data transformation, within DB2 Analytics Accelerator. So, instead of moving the data to a distributed environment, algorithms can be pushed to the data, using cost-efficient DB2 Accelerator for the required resource-intensive operations. This IBM Redbooks® publication explains the overall z Systems architecture, how the components can be installed and customized, how the new IBM DB2 Analytics Accelerator loader can help with efficient data loading for z Systems data and external data, how in-database transformation, in-database modeling, and in-transactional real-time scoring can be used, and what other related technologies are available. This book is intended for technical specialists, architects, and data scientists who want to use the technology on the z Systems platform. Most of the technologies described in this book require IBM DB2 for z/OS®. For acceleration of the data investigation, data transformation, and data modeling process, DB2 Analytics Accelerator is required. Most value can be achieved if most of the data already resides on z Systems platforms, although adding external data (such as from social sources) poses no problem at all.

Business & Economics

SPSS Statistics For Dummies

Book Details:

Author : Jesus Salcedo
Publisher : John Wiley & Sons
Release : 2020-09-09
ISBN : 1119560837
Pages : 487 pages

Book excerpt: The fun and friendly guide to mastering IBM’s Statistical Package for the Social Sciences. Written by an author team with a combined 55 years of experience using SPSS, this updated guide takes the guesswork out of the subject and helps you get the most out of using the leader in predictive analysis. Covering the latest release and updates to SPSS 27.0, and including more than 150 pages of basic statistical theory, it helps you understand the mechanics behind the calculations, perform predictive analysis, produce informative graphs, and more. You’ll even dabble in programming as you expand SPSS functionality to suit your specific needs. Master the fundamental mechanics of SPSS. Learn how to get data into and out of the program. Graph and analyze your data more accurately and efficiently. Program SPSS with Command Syntax. Get ready to start handling data like a pro, with step-by-step instruction and expert advice!

Computers

Optimization and Decision Support Design Guide Using IBM ILOG Optimization Decision Manager

Book Details:

Author : Axel Buecker
Publisher : IBM Redbooks
Release : 2012-10-10
ISBN : 0738437360
Pages : 368 pages

Book excerpt: Today many organizations face challenges when developing a realistic plan or schedule that provides the best possible balance between customer service and revenue goals. Optimization technology has long been used to find the best solutions to complex planning and scheduling problems. A decision-support environment that enables the flexible exploration of all the trade-offs and sensitivities needs to provide the following capabilities: flexibility to develop and compare realistic planning and scheduling scenarios; quality sensitivity analysis and explanations; collaborative planning and scenario sharing; and decision recommendations. This IBM® Redbooks® publication introduces you to the IBM ILOG® Optimization Decision Manager (ODM) Enterprise. This decision-support application provides the capabilities you need to take full advantage of optimization technology. Applications built with IBM ILOG ODM Enterprise can help users create, compare, and understand planning or scheduling scenarios. They can also adjust any of the model inputs or goals, and fully understand the binding constraints, trade-offs, sensitivities, and business options. This book enables business analysts, architects, and administrators to design and use their own operational decision management solution.

Computers

Performance and Capacity Implications for Big Data

Book Details:

Author : Dave Jewell
Publisher : IBM Redbooks
Release : 2014-02-07
ISBN : 0738453587
Pages : 48 pages

Book excerpt: Big data solutions enable us to change how we do business by exploiting previously unused sources of information in ways that were not possible just a few years ago. In IBM® Smarter Planet® terms, big data helps us to change the way that the world works. The purpose of this IBM Redpaper™ publication is to consider the performance and capacity implications of big data solutions, which must be taken into account for them to be viable. This paper describes the benefits that big data approaches can provide. We then cover performance and capacity considerations for creating big data solutions. We conclude with what this means for big data solutions, both now and in the future. Intended readers for this paper include decision-makers, consultants, and IT architects.

Computers

Performance and Capacity Themes for Cloud Computing

Book Details:

Author : Elisabeth Stahl
Publisher : IBM Redbooks
Release : 2013-03-20
ISBN : 0738451207
Pages : 76 pages

Book excerpt: This IBM® Redpaper™ is the second in a series that addresses the performance and capacity considerations of the evolving cloud computing model. The first Redpaper publication (Performance Implications of Cloud Computing, REDP-4875) introduced cloud computing with its various deployment models, support roles, and offerings along with IT performance and capacity implications associated with these deployment models and offerings. In this Redpaper, we discuss lessons learned in the two years since the first paper was written. We offer practical guidance about how to select workloads that work best with cloud computing, and about how to address areas such as performance testing, monitoring, service level agreements, and capacity planning considerations for both single and multi-tenancy environments. We also provide an example of a recent project where cloud computing solved current business needs (such as cost reduction, optimization of infrastructure utilization, and more efficient systems management and reporting capabilities) and how the solution addressed performance and capacity challenges. We conclude with a summary of the lessons learned and a perspective about how cloud computing can affect performance and capacity in the future.

Computers

Building Big Data and Analytics Solutions in the Cloud

Book Details:

Author : Wei-Dong Zhu
Publisher : IBM Redbooks
Release : 2014-12-08
ISBN : 0738453994
Pages : 114 pages

Book excerpt: Big data is currently one of the most critical emerging technologies. Organizations around the world are looking to exploit the explosive growth of data to unlock previously hidden insights in the hope of creating new revenue streams, gaining operational efficiencies, and obtaining greater understanding of customer needs. It is important to think of big data and analytics together. Big data is the term used to describe the recent explosion of different types of data from disparate sources. Analytics is about examining data to derive interesting and relevant trends and patterns, which can be used to inform decisions, optimize processes, and even drive new business models. With today's deluge of data come the problems of processing that data, obtaining the correct skills to manage and analyze that data, and establishing rules to govern the data's use and distribution. The big data technology stack is ever growing and sometimes confusing, even more so when we add the complexities of setting up big data environments with large up-front investments. Cloud computing seems to be a perfect vehicle for hosting big data workloads. However, working on big data in the cloud brings its own challenge of reconciling two contradictory design principles. Cloud computing is based on the concepts of consolidation and resource pooling, but big data systems (such as Hadoop) are built on the shared-nothing principle, where each node is independent and self-sufficient. A solution architecture that can allow these mutually exclusive principles to coexist is required to truly exploit the elasticity and ease-of-use of cloud computing for big data environments.
This IBM® Redpaper™ publication is aimed at chief architects, line-of-business executives, and CIOs to provide an understanding of the cloud-related challenges they face and give prescriptive guidance for how to realize the benefits of big data solutions quickly and cost-effectively.

Computers

AI and Big Data on IBM Power Systems Servers

Book Details:

Author : Scott Vetter
Publisher : IBM Redbooks
Release : 2019-04-10
ISBN : 0738457515
Pages : 162 pages

Book excerpt: As big data becomes more ubiquitous, businesses are wondering how they can best leverage it to gain insight into their most important business questions. Using machine learning (ML) and deep learning (DL) in big data environments can identify historical patterns and build artificial intelligence (AI) models that can help businesses to improve customer experience, add services and offerings, identify new revenue streams or lines of business (LOBs), and optimize business or manufacturing operations. The power of AI for predictive analytics is being harnessed across all industries, so it is important that businesses familiarize themselves with all of the tools and techniques that are available for integration with their data lake environments. In this IBM® Redbooks® publication, we cover the best practices for deploying and integrating some of the best AI solutions on the market, including: IBM Watson Machine Learning Accelerator (see note for product naming); IBM Watson Studio Local; IBM Power Systems™; IBM Spectrum™ Scale; IBM Data Science Experience (IBM DSX); IBM Elastic Storage™ Server; Hortonworks Data Platform (HDP); Hortonworks DataFlow (HDF); and H2O Driverless AI. We map out all the integrations that are possible with our different AI solutions and how they can integrate with your existing or new data lake. We also walk you through some of our client use cases and show you how some of the industry leaders are using Hortonworks, IBM PowerAI, and IBM Watson Studio Local to drive decision making. We also advise you on your deployment options, when to use a GPU, and why you should use the IBM Elastic Storage Server (IBM ESS) to improve storage management.
Lastly, we describe how to integrate IBM Watson Machine Learning Accelerator and Hortonworks with or without IBM Watson Studio Local, how to access real-time data, and security. Note: IBM Watson Machine Learning Accelerator is the new product name for IBM PowerAI Enterprise. Note: Hortonworks merged with Cloudera in January 2019. The new company is called Cloudera. References to Hortonworks as a business entity in this publication are now referring to the merged company. Product names beginning with Hortonworks continue to be marketed and sold under their original names.

Computers

SPSS Statistics for Data Analysis and Visualization

Book Details:

Author : Keith McCormick
Publisher : John Wiley & Sons
Release : 2017-05-01
ISBN : 1119003555
Pages : 528 pages

Book excerpt: Dive deeper into SPSS Statistics for more efficient, accurate, and sophisticated data analysis and visualization. SPSS Statistics for Data Analysis and Visualization goes beyond the basics of SPSS Statistics to show you advanced techniques that exploit the full capabilities of SPSS. The authors explain when and why to use each technique, and then walk you through the execution with a pragmatic, nuts and bolts example. Coverage includes extensive, in-depth discussion of advanced statistical techniques, data visualization, predictive analytics, and SPSS programming, including automation and integration with other languages like R and Python. You'll learn the best methods to power through an analysis, with more efficient, elegant, and accurate code. IBM SPSS Statistics is complex: true mastery requires a deep understanding of statistical theory, the user interface, and programming. Most users don't encounter all of the methods SPSS offers, leaving many little-known modules undiscovered. This book walks you through tools you may have never noticed, and shows you how they can be used to streamline your workflow and enable you to produce more accurate results. Conduct a more efficient and accurate analysis. Display complex relationships and create better visualizations. Model complex interactions and master predictive analytics. Integrate R and Python with SPSS Statistics for more efficient, more powerful code. These "hidden tools" can help you produce charts that simply wouldn't be possible any other way, and the support for other programming languages gives you better options for solving complex problems.
If you're ready to take advantage of everything this powerful software package has to offer, SPSS Statistics for Data Analysis and Visualization is the expert-led training you need.

Computers

IBM Software Defined Infrastructure for Big Data Analytics Workloads

Book Details:

Author : Dino Quintero
Publisher : IBM Redbooks
Release : 2015-06-29
ISBN : 0738440779
Pages : 180 pages

Book excerpt: This IBM® Redbooks® publication documents how IBM Platform Computing, with its IBM Platform Symphony® MapReduce framework, IBM Spectrum Scale (based upon IBM GPFS™), IBM Platform LSF®, and the Advanced Service Controller for Platform Symphony work together as an infrastructure to manage not just Hadoop-related offerings, but many popular industry offerings, such as Apache Spark, Storm, MongoDB, Cassandra, and so on. It describes the different ways to run Hadoop in a big data environment, and demonstrates how IBM Platform Computing solutions, such as Platform Symphony and Platform LSF with its MapReduce Accelerator, can help performance and agility when running Hadoop on distributed workload managers offered by IBM. This information is for technical professionals (consultants, technical support staff, IT architects, and IT specialists) who are responsible for delivering cost-effective cloud services and big data solutions on IBM Power Systems™ to help uncover insights in clients' data so they can optimize product development and business results.

Psychology

Multilevel and Longitudinal Modeling with IBM SPSS

Book Details:

Author : Ronald H. Heck
Publisher : Routledge
Release : 2013-08-22
ISBN : 1135074240
Pages : 753 pages

Book excerpt: This book demonstrates how to use multilevel and longitudinal modeling techniques available in the IBM SPSS mixed-effects program (MIXED). Annotated screen shots provide readers with a step-by-step understanding of each technique and navigating the program. Readers learn how to set up, run, and interpret a variety of models. Diagnostic tools, data management issues, and related graphics are introduced throughout. Annotated syntax is also available for those who prefer this approach. Extended examples illustrate the logic of model development to show readers the rationale of the research questions and the steps around which the analyses are structured. The data used in the text and syntax examples are available at www.routledge.com/9780415817110. Highlights of the new edition include: Updated throughout to reflect IBM SPSS Version 21. Further coverage of growth trajectories, coding time-related variables, covariance structures, individual change and longitudinal experimental designs (Ch. 5). Extended discussion of other types of research designs for examining change (e.g., regression discontinuity, quasi-experimental) over time (Ch. 6). New examples specifying multiple latent constructs and parallel growth processes (Ch. 7). Discussion of alternatives for dealing with missing data and the use of sample weights within multilevel data structures (Ch. 1). The book opens with the conceptual and methodological issues associated with multilevel and longitudinal modeling, followed by a discussion of SPSS data management techniques which facilitate working with multilevel, longitudinal, and cross-classified data sets.
Chapters 3 and 4 introduce the basics of multilevel modeling: developing a multilevel model, interpreting output, and trouble-shooting common programming and modeling problems. Models for investigating individual and organizational change are presented in chapters 5 and 6, followed by models with multivariate outcomes in chapter 7. Chapter 8 provides an illustration of multilevel models with cross-classified data structures. The book concludes with ways to expand on the various multilevel and longitudinal modeling techniques and issues when conducting multilevel analyses. It's ideal for courses on multilevel and longitudinal modeling, multivariate statistics, and research design taught in education, psychology, business, and sociology.

Social Science

IBM SPSS by Example

Book Details:

Author : Alan C. Elliott
Publisher : SAGE Publications
Release : 2014-12-31
ISBN : 1483319040
Pages : 278 pages

Book excerpt: The updated Second Edition of Alan C. Elliott and Wayne A. Woodward’s "cut to the chase" IBM SPSS guide quickly explains the when, where, and how of statistical data analysis as it is used for real-world decision making in a wide variety of disciplines. This one-stop reference provides succinct guidelines for performing an analysis using SPSS software, avoiding pitfalls, interpreting results, and reporting outcomes. Written from a practical perspective, IBM SPSS by Example, Second Edition provides a wealth of information—from assumptions and design to computation, interpretation, and presentation of results—to help users save time, money, and frustration.

Medical

Applied Predictive Modeling

Book Details:

Author : Max Kuhn
Publisher : Springer Science & Business Media
Release : 2013-05-17
ISBN : 1461468493
Pages : 595 pages

Book excerpt: Applied Predictive Modeling covers the overall predictive modeling process, beginning with the crucial steps of data preprocessing, data splitting and foundations of model tuning. The text then provides intuitive explanations of numerous common and modern regression and classification techniques, always with an emphasis on illustrating and solving real data problems. The text illustrates all parts of the modeling process through many hands-on, real-life examples, and every chapter contains extensive R code for each step of the process. This multi-purpose text can be used as an introduction to predictive models and the overall modeling process, a practitioner’s reference handbook, or as a text for advanced undergraduate or graduate level predictive modeling courses. To that end, each chapter contains problem sets to help solidify the covered concepts and uses data available in the book’s R package. This text is intended for a broad audience as both an introduction to predictive models as well as a guide to applying them. Non-mathematical readers will appreciate the intuitive explanations of the techniques while an emphasis on problem-solving with real data across a wide variety of applications will aid practitioners who wish to extend their expertise. Readers should have knowledge of basic statistical ideas, such as correlation and linear regression analysis. While the text is biased against complex equations, a mathematical background is needed for advanced topics.

Mathematics

Data Mining with Rattle and R

Book Details:

Author : Graham Williams
Publisher : Springer Science & Business Media
Release : 2011-08-04
ISBN : 144199890X
Pages : 382 pages

Book excerpt: Data mining is the art and science of intelligent data analysis. By building knowledge from information, data mining adds considerable value to the ever increasing stores of electronic data that abound today. In performing data mining many decisions need to be made regarding the choice of methodology, the choice of data, the choice of tools, and the choice of algorithms. Throughout this book the reader is introduced to the basic concepts and some of the more popular algorithms of data mining. With a focus on the hands-on end-to-end process for data mining, Williams guides the reader through various capabilities of the easy to use, free, and open source Rattle Data Mining Software built on the sophisticated R Statistical Software. The focus on doing data mining rather than just reading about data mining is refreshing. The book covers data understanding, data preparation, data refinement, model building, model evaluation, and practical deployment. The reader will learn to rapidly deliver a data mining project using software easily installed for free from the Internet. Coupling Rattle with R delivers a very sophisticated data mining environment with all the power, and more, of the many commercial offerings.

Business & Economics

Modern Data Science with R

Book Details:

Author : Benjamin S. Baumer
Publisher : CRC Press
Release : 2021-03-31
ISBN : 0429575394
Pages : 830 pages

Book excerpt: From a review of the first edition: "Modern Data Science with R... is rich with examples and is guided by a strong narrative voice. What’s more, it presents an organizing framework that makes a convincing argument that data science is a course distinct from applied statistics" (The American Statistician). Modern Data Science with R is a comprehensive data science textbook for undergraduates that incorporates statistical and computational thinking to solve real-world data problems. Rather than focus exclusively on case studies or programming syntax, this book illustrates how statistical programming in the state-of-the-art R/RStudio computing environment can be leveraged to extract meaningful information from a variety of data in the service of addressing compelling questions. The second edition is updated to reflect the growing influence of the tidyverse set of packages. All code in the book has been revised and styled to be more readable and easier to understand. New functionality from packages like sf, purrr, tidymodels, and tidytext is now integrated into the text. All chapters have been revised, and several have been split, re-organized, or re-imagined to meet the shifting landscape of best practice.