EBookClubs

Read Books & Download eBooks Full Online

EBookClubs

Read Books & Download eBooks Full Online

Book Data Observability for Data Engineering

Download or read book Data Observability for Data Engineering written by Michele Pinto and published by Packt Publishing Ltd. This book was released on 2023-12-29 with total page 228 pages. Available in PDF, EPUB and Kindle. Book excerpt: Discover actionable steps to maintain healthy data pipelines to promote data observability within your teams with this essential guide to elevating data engineering practices Key Features Learn how to monitor your data pipelines in a scalable way Apply real-life use cases and projects to gain hands-on experience in implementing data observability Instil trust in your pipelines among data producers and consumers alike Purchase of the print or Kindle book includes a free PDF eBook Book DescriptionIn the age of information, strategic management of data is critical to organizational success. The constant challenge lies in maintaining data accuracy and preventing data pipelines from breaking. Data Observability for Data Engineering is your definitive guide to implementing data observability successfully in your organization. This book unveils the power of data observability, a fusion of techniques and methods that allow you to monitor and validate the health of your data. You’ll see how it builds on data quality monitoring and understand its significance from the data engineering perspective. Once you're familiar with the techniques and elements of data observability, you'll get hands-on with a practical Python project to reinforce what you've learned. Toward the end of the book, you’ll apply your expertise to explore diverse use cases and experiment with projects to seamlessly implement data observability in your organization. Equipped with the mastery of data observability intricacies, you’ll be able to make your organization future-ready and resilient and never worry about the quality of your data pipelines again.What you will learn Implement a data observability approach to enhance the quality of data pipelines Collect and analyze key metrics through coding examples Apply monkey patching in a Python module Manage the costs and risks associated with your data pipeline Understand the main techniques for collecting observability metrics Implement monitoring techniques for analytics pipelines in production Build and maintain a statistics engine continuously Who this book is for This book is for data engineers, data architects, data analysts, and data scientists who have encountered issues with broken data pipelines or dashboards. Organizations seeking to adopt data observability practices and managers responsible for data quality and processes will find this book especially useful to increase the confidence of data consumers and raise awareness among producers regarding their data pipelines.

Book Fundamentals of Data Observability

Download or read book Fundamentals of Data Observability written by Andy Petrella and published by "O'Reilly Media, Inc.". This book was released on 2023-08-14 with total page 267 pages. Available in PDF, EPUB and Kindle. Book excerpt: Quickly detect, troubleshoot, and prevent a wide range of data issues through data observability, a set of best practices that enables data teams to gain greater visibility of data and its usage. If you're a data engineer, data architect, or machine learning engineer who depends on the quality of your data, this book shows you how to focus on the practical aspects of introducing data observability in your everyday work. Author Andy Petrella helps you build the right habits to identify and solve data issues, such as data drifts and poor quality, so you can stop their propagation in data applications, pipelines, and analytics. You'll learn ways to introduce data observability, including setting up a framework for generating and collecting all the information you need. Learn the core principles and benefits of data observability Use data observability to detect, troubleshoot, and prevent data issues Follow the book's recipes to implement observability in your data projects Use data observability to create a trustworthy communication framework with data consumers Learn how to educate your peers about the benefits of data observability

Book 97 Things Every Data Engineer Should Know

Download or read book 97 Things Every Data Engineer Should Know written by Tobias Macey and published by "O'Reilly Media, Inc.". This book was released on 2021-06-11 with total page 243 pages. Available in PDF, EPUB and Kindle. Book excerpt: Take advantage of today's sky-high demand for data engineers. With this in-depth book, current and aspiring engineers will learn powerful real-world best practices for managing data big and small. Contributors from notable companies including Twitter, Google, Stitch Fix, Microsoft, Capital One, and LinkedIn share their experiences and lessons learned for overcoming a variety of specific and often nagging challenges. Edited by Tobias Macey, host of the popular Data Engineering Podcast, this book presents 97 concise and useful tips for cleaning, prepping, wrangling, storing, processing, and ingesting data. Data engineers, data architects, data team managers, data scientists, machine learning engineers, and software engineers will greatly benefit from the wisdom and experience of their peers. Topics include: The Importance of Data Lineage - Julien Le Dem Data Security for Data Engineers - Katharine Jarmul The Two Types of Data Engineering and Data Engineers - Jesse Anderson Six Dimensions for Picking an Analytical Data Warehouse - Gleb Mezhanskiy The End of ETL as We Know It - Paul Singman Building a Career as a Data Engineer - Vijay Kiran Modern Metadata for the Modern Data Stack - Prukalpa Sankar Your Data Tests Failed! Now What? - Sam Bail

Book Fundamentals of Data Observability

Download or read book Fundamentals of Data Observability written by Andy Petrella and published by O'Reilly Media. This book was released on 2023-09-29 with total page 0 pages. Available in PDF, EPUB and Kindle. Book excerpt: Quickly detect, troubleshoot, and prevent a wide range of data issues through data observability, a set of best practices that enable data teams to gain greater visibility of data and its usage. If you're a data engineer, data architect, or machine learning engineer, or if the quality of your work depends on the quality of your data, this book shows you how to focus on the practical aspects of introducing data observability in your everyday work. Author Andy Petrella helps you build the right habits to identify and solve data issues, such as data drifts and poor quality, so you can stop their propagation in data applications, pipelines, and analytics. You'll learn ways to introduce data observability, including setting up a framework for generating and collecting all the information you need. Learn the core principles and benefits of data observability Use data observability to detect, troubleshoot, and prevent data issues Follow the book�¢??s recipes to implement observability in your data projects Use data observability to create a trustable communication framework with data consumers Learn how to educate your peers about the benefits of data observability

Book Fundamentals of Data Observability

Download or read book Fundamentals of Data Observability written by Andy Petrella and published by "O'Reilly Media, Inc.". This book was released on 2023-08-14 with total page 275 pages. Available in PDF, EPUB and Kindle. Book excerpt: Quickly detect, troubleshoot, and prevent a wide range of data issues through data observability, a set of best practices that enables data teams to gain greater visibility of data and its usage. If you're a data engineer, data architect, or machine learning engineer who depends on the quality of your data, this book shows you how to focus on the practical aspects of introducing data observability in your everyday work. Author Andy Petrella helps you build the right habits to identify and solve data issues, such as data drifts and poor quality, so you can stop their propagation in data applications, pipelines, and analytics. You'll learn ways to introduce data observability, including setting up a framework for generating and collecting all the information you need. Learn the core principles and benefits of data observability Use data observability to detect, troubleshoot, and prevent data issues Follow the book's recipes to implement observability in your data projects Use data observability to create a trustworthy communication framework with data consumers Learn how to educate your peers about the benefits of data observability

Book Fundamentals of Data Engineering

Download or read book Fundamentals of Data Engineering written by Joe Reis and published by "O'Reilly Media, Inc.". This book was released on 2022-06-22 with total page 454 pages. Available in PDF, EPUB and Kindle. Book excerpt: Data engineering has grown rapidly in the past decade, leaving many software engineers, data scientists, and analysts looking for a comprehensive view of this practice. With this practical book, you'll learn how to plan and build systems to serve the needs of your organization and customers by evaluating the best technologies available through the framework of the data engineering lifecycle. Authors Joe Reis and Matt Housley walk you through the data engineering lifecycle and show you how to stitch together a variety of cloud technologies to serve the needs of downstream data consumers. You'll understand how to apply the concepts of data generation, ingestion, orchestration, transformation, storage, and governance that are critical in any data environment regardless of the underlying technology. This book will help you: Get a concise overview of the entire data engineering landscape Assess data engineering problems using an end-to-end framework of best practices Cut through marketing hype when choosing data technologies, architecture, and processes Use the data engineering lifecycle to design and build a robust architecture Incorporate data governance and security across the data engineering lifecycle

Book 97 Things Every Data Engineer Should Know

Download or read book 97 Things Every Data Engineer Should Know written by Tobias Macey and published by "O'Reilly Media, Inc.". This book was released on 2021-06-11 with total page 263 pages. Available in PDF, EPUB and Kindle. Book excerpt: Take advantage of today's sky-high demand for data engineers. With this in-depth book, current and aspiring engineers will learn powerful real-world best practices for managing data big and small. Contributors from notable companies including Twitter, Google, Stitch Fix, Microsoft, Capital One, and LinkedIn share their experiences and lessons learned for overcoming a variety of specific and often nagging challenges. Edited by Tobias Macey, host of the popular Data Engineering Podcast, this book presents 97 concise and useful tips for cleaning, prepping, wrangling, storing, processing, and ingesting data. Data engineers, data architects, data team managers, data scientists, machine learning engineers, and software engineers will greatly benefit from the wisdom and experience of their peers. Topics include: The Importance of Data Lineage - Julien Le Dem Data Security for Data Engineers - Katharine Jarmul The Two Types of Data Engineering and Data Engineers - Jesse Anderson Six Dimensions for Picking an Analytical Data Warehouse - Gleb Mezhanskiy The End of ETL as We Know It - Paul Singman Building a Career as a Data Engineer - Vijay Kiran Modern Metadata for the Modern Data Stack - Prukalpa Sankar Your Data Tests Failed! Now What? - Sam Bail

Book Data Pipelines Pocket Reference

Download or read book Data Pipelines Pocket Reference written by James Densmore and published by O'Reilly Media. This book was released on 2021-02-10 with total page 277 pages. Available in PDF, EPUB and Kindle. Book excerpt: Data pipelines are the foundation for success in data analytics. Moving data from numerous diverse sources and transforming it to provide context is the difference between having data and actually gaining value from it. This pocket reference defines data pipelines and explains how they work in today's modern data stack. You'll learn common considerations and key decision points when implementing pipelines, such as batch versus streaming data ingestion and build versus buy. This book addresses the most common decisions made by data professionals and discusses foundational concepts that apply to open source frameworks, commercial products, and homegrown solutions. You'll learn: What a data pipeline is and how it works How data is moved and processed on modern data infrastructure, including cloud platforms Common tools and products used by data engineers to build pipelines How pipelines support analytics and reporting needs Considerations for pipeline maintenance, testing, and alerting

Book Data Mesh

    Book Details:
  • Author : Zhamak Dehghani
  • Publisher : "O'Reilly Media, Inc."
  • Release : 2022-03-08
  • ISBN : 1492092363
  • Pages : 387 pages

Download or read book Data Mesh written by Zhamak Dehghani and published by "O'Reilly Media, Inc.". This book was released on 2022-03-08 with total page 387 pages. Available in PDF, EPUB and Kindle. Book excerpt: Many enterprises are investing in a next-generation data lake, hoping to democratize data at scale to provide business insights and ultimately make automated intelligent decisions. In this practical book, author Zhamak Dehghani reveals that, despite the time, money, and effort poured into them, data warehouses and data lakes fail when applied at the scale and speed of today's organizations. A distributed data mesh is a better choice. Dehghani guides architects, technical leaders, and decision makers on their journey from monolithic big data architecture to a sociotechnical paradigm that draws from modern distributed architecture. A data mesh considers domains as a first-class concern, applies platform thinking to create self-serve data infrastructure, treats data as a product, and introduces a federated and computational model of data governance. This book shows you why and how. Examine the current data landscape from the perspective of business and organizational needs, environmental challenges, and existing architectures Analyze the landscape's underlying characteristics and failure modes Get a complete introduction to data mesh principles and its constituents Learn how to design a data mesh architecture Move beyond a monolithic data lake to a distributed data mesh.

Book Data Engineering with Google Cloud Platform

Download or read book Data Engineering with Google Cloud Platform written by Adi Wijaya and published by Packt Publishing Ltd. This book was released on 2022-03-31 with total page 440 pages. Available in PDF, EPUB and Kindle. Book excerpt: Build and deploy your own data pipelines on GCP, make key architectural decisions, and gain the confidence to boost your career as a data engineer Key Features Understand data engineering concepts, the role of a data engineer, and the benefits of using GCP for building your solution Learn how to use the various GCP products to ingest, consume, and transform data and orchestrate pipelines Discover tips to prepare for and pass the Professional Data Engineer exam Book DescriptionWith this book, you'll understand how the highly scalable Google Cloud Platform (GCP) enables data engineers to create end-to-end data pipelines right from storing and processing data and workflow orchestration to presenting data through visualization dashboards. Starting with a quick overview of the fundamental concepts of data engineering, you'll learn the various responsibilities of a data engineer and how GCP plays a vital role in fulfilling those responsibilities. As you progress through the chapters, you'll be able to leverage GCP products to build a sample data warehouse using Cloud Storage and BigQuery and a data lake using Dataproc. The book gradually takes you through operations such as data ingestion, data cleansing, transformation, and integrating data with other sources. You'll learn how to design IAM for data governance, deploy ML pipelines with the Vertex AI, leverage pre-built GCP models as a service, and visualize data with Google Data Studio to build compelling reports. Finally, you'll find tips on how to boost your career as a data engineer, take the Professional Data Engineer certification exam, and get ready to become an expert in data engineering with GCP. By the end of this data engineering book, you'll have developed the skills to perform core data engineering tasks and build efficient ETL data pipelines with GCP.What you will learn Load data into BigQuery and materialize its output for downstream consumption Build data pipeline orchestration using Cloud Composer Develop Airflow jobs to orchestrate and automate a data warehouse Build a Hadoop data lake, create ephemeral clusters, and run jobs on the Dataproc cluster Leverage Pub/Sub for messaging and ingestion for event-driven systems Use Dataflow to perform ETL on streaming data Unlock the power of your data with Data Studio Calculate the GCP cost estimation for your end-to-end data solutions Who this book is for This book is for data engineers, data analysts, and anyone looking to design and manage data processing pipelines using GCP. You'll find this book useful if you are preparing to take Google's Professional Data Engineer exam. Beginner-level understanding of data science, the Python programming language, and Linux commands is necessary. A basic understanding of data processing and cloud computing, in general, will help you make the most out of this book.

Book The Self Service Data Roadmap

Download or read book The Self Service Data Roadmap written by Sandeep Uttamchandani and published by "O'Reilly Media, Inc.". This book was released on 2020-09-10 with total page 297 pages. Available in PDF, EPUB and Kindle. Book excerpt: Data-driven insights are a key competitive advantage for any industry today, but deriving insights from raw data can still take days or weeks. Most organizations can’t scale data science teams fast enough to keep up with the growing amounts of data to transform. What’s the answer? Self-service data. With this practical book, data engineers, data scientists, and team managers will learn how to build a self-service data science platform that helps anyone in your organization extract insights from data. Sandeep Uttamchandani provides a scorecard to track and address bottlenecks that slow down time to insight across data discovery, transformation, processing, and production. This book bridges the gap between data scientists bottlenecked by engineering realities and data engineers unclear about ways to make self-service work. Build a self-service portal to support data discovery, quality, lineage, and governance Select the best approach for each self-service capability using open source cloud technologies Tailor self-service for the people, processes, and technology maturity of your data platform Implement capabilities to democratize data and reduce time to insight Scale your self-service portal to support a large number of users within your organization

Book Database Reliability Engineering

Download or read book Database Reliability Engineering written by Laine Campbell and published by "O'Reilly Media, Inc.". This book was released on 2017-10-26 with total page 294 pages. Available in PDF, EPUB and Kindle. Book excerpt: The infrastructure-as-code revolution in IT is also affecting database administration. With this practical book, developers, system administrators, and junior to mid-level DBAs will learn how the modern practice of site reliability engineering applies to the craft of database architecture and operations. Authors Laine Campbell and Charity Majors provide a framework for professionals looking to join the ranks of today’s database reliability engineers (DBRE). You’ll begin by exploring core operational concepts that DBREs need to master. Then you’ll examine a wide range of database persistence options, including how to implement key technologies to provide resilient, scalable, and performant data storage and retrieval. With a firm foundation in database reliability engineering, you’ll be ready to dive into the architecture and operations of any modern database. This book covers: Service-level requirements and risk management Building and evolving an architecture for operational visibility Infrastructure engineering and infrastructure management How to facilitate the release management process Data storage, indexing, and replication Identifying datastore characteristics and best use cases Datastore architectural components and data-driven architectures

Book Site Reliability Engineering

    Book Details:
  • Author : Niall Richard Murphy
  • Publisher : "O'Reilly Media, Inc."
  • Release : 2016-03-23
  • ISBN : 1491951176
  • Pages : 552 pages

Download or read book Site Reliability Engineering written by Niall Richard Murphy and published by "O'Reilly Media, Inc.". This book was released on 2016-03-23 with total page 552 pages. Available in PDF, EPUB and Kindle. Book excerpt: The overwhelming majority of a software system’s lifespan is spent in use, not in design or implementation. So, why does conventional wisdom insist that software engineers focus primarily on the design and development of large-scale computing systems? In this collection of essays and articles, key members of Google’s Site Reliability Team explain how and why their commitment to the entire lifecycle has enabled the company to successfully build, deploy, monitor, and maintain some of the largest software systems in the world. You’ll learn the principles and practices that enable Google engineers to make systems more scalable, reliable, and efficient—lessons directly applicable to your organization. This book is divided into four sections: Introduction—Learn what site reliability engineering is and why it differs from conventional IT industry practices Principles—Examine the patterns, behaviors, and areas of concern that influence the work of a site reliability engineer (SRE) Practices—Understand the theory and practice of an SRE’s day-to-day work: building and operating large distributed computing systems Management—Explore Google's best practices for training, communication, and meetings that your organization can use

Book Data Engineering with AWS

Download or read book Data Engineering with AWS written by Gareth Eagar and published by Packt Publishing Ltd. This book was released on 2023-10-31 with total page 637 pages. Available in PDF, EPUB and Kindle. Book excerpt: Looking to revolutionize your data transformation game with AWS? Look no further! From strong foundations to hands-on building of data engineering pipelines, our expert-led manual has got you covered. Key Features Delve into robust AWS tools for ingesting, transforming, and consuming data, and for orchestrating pipelines Stay up to date with a comprehensive revised chapter on Data Governance Build modern data platforms with a new section covering transactional data lakes and data mesh Book DescriptionThis book, authored by a seasoned Senior Data Architect with 25 years of experience, aims to help you achieve proficiency in using the AWS ecosystem for data engineering. This revised edition provides updates in every chapter to cover the latest AWS services and features, takes a refreshed look at data governance, and includes a brand-new section on building modern data platforms which covers; implementing a data mesh approach, open-table formats (such as Apache Iceberg), and using DataOps for automation and observability. You'll begin by reviewing the key concepts and essential AWS tools in a data engineer's toolkit and getting acquainted with modern data management approaches. You'll then architect a data pipeline, review raw data sources, transform the data, and learn how that transformed data is used by various data consumers. You’ll learn how to ensure strong data governance, and about populating data marts and data warehouses along with how a data lakehouse fits into the picture. After that, you'll be introduced to AWS tools for analyzing data, including those for ad-hoc SQL queries and creating visualizations. Then, you'll explore how the power of machine learning and artificial intelligence can be used to draw new insights from data. In the final chapters, you'll discover transactional data lakes, data meshes, and how to build a cutting-edge data platform on AWS. By the end of this AWS book, you'll be able to execute data engineering tasks and implement a data pipeline on AWS like a pro!What you will learn Seamlessly ingest streaming data with Amazon Kinesis Data Firehose Optimize, denormalize, and join datasets with AWS Glue Studio Use Amazon S3 events to trigger a Lambda process to transform a file Load data into a Redshift data warehouse and run queries with ease Visualize and explore data using Amazon QuickSight Extract sentiment data from a dataset using Amazon Comprehend Build transactional data lakes using Apache Iceberg with Amazon Athena Learn how a data mesh approach can be implemented on AWS Who this book is forThis book is for data engineers, data analysts, and data architects who are new to AWS and looking to extend their skills to the AWS cloud. Anyone new to data engineering who wants to learn about the foundational concepts, while gaining practical experience with common data engineering services on AWS, will also find this book useful. A basic understanding of big data-related topics and Python coding will help you get the most out of this book, but it’s not a prerequisite. Familiarity with the AWS console and core services will also help you follow along.

Book Fighting Churn with Data

Download or read book Fighting Churn with Data written by Carl S. Gold and published by Manning Publications. This book was released on 2020-12-22 with total page 502 pages. Available in PDF, EPUB and Kindle. Book excerpt: The beating heart of any product or service business is returning clients. Don't let your hard-won customers vanish, taking their money with them. In Fighting Churn with Data you'll learn powerful data-driven techniques to maximize customer retention and minimize actions that cause them to stop engaging or unsubscribe altogether. Summary The beating heart of any product or service business is returning clients. Don't let your hard-won customers vanish, taking their money with them. In Fighting Churn with Data you'll learn powerful data-driven techniques to maximize customer retention and minimize actions that cause them to stop engaging or unsubscribe altogether. This hands-on guide is packed with techniques for converting raw data into measurable metrics, testing hypotheses, and presenting findings that are easily understandable to non-technical decision makers. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications. About the technology Keeping customers active and engaged is essential for any business that relies on recurring revenue and repeat sales. Customer turnover—or “churn”—is costly, frustrating, and preventable. By applying the techniques in this book, you can identify the warning signs of churn and learn to catch customers before they leave. About the book Fighting Churn with Data teaches developers and data scientists proven techniques for stopping churn before it happens. Packed with real-world use cases and examples, this book teaches you to convert raw data into measurable behavior metrics, calculate customer lifetime value, and improve churn forecasting with demographic data. By following Zuora Chief Data Scientist Carl Gold’s methods, you’ll reap the benefits of high customer retention. What's inside Calculating churn metrics Identifying user behavior that predicts churn Using churn reduction tactics with customer segmentation Applying churn analysis techniques to other business areas Using AI for accurate churn forecasting About the reader For readers with basic data analysis skills, including Python and SQL. About the author Carl Gold (PhD) is the Chief Data Scientist at Zuora, Inc., the industry-leading subscription management platform. Table of Contents: PART 1 - BUILDING YOUR ARSENAL 1 The world of churn 2 Measuring churn 3 Measuring customers 4 Observing renewal and churn PART 2 - WAGING THE WAR 5 Understanding churn and behavior with metrics 6 Relationships between customer behaviors 7 Segmenting customers with advanced metrics PART 3 - SPECIAL WEAPONS AND TACTICS 8 Forecasting churn 9 Forecast accuracy and machine learning 10 Churn demographics and firmographics 11 Leading the fight against churn

Book Data Quality Fundamentals

Download or read book Data Quality Fundamentals written by Barr Moses and published by "O'Reilly Media, Inc.". This book was released on 2022-09 with total page 311 pages. Available in PDF, EPUB and Kindle. Book excerpt: Do your product dashboards look funky? Are your quarterly reports stale? Is the data set you're using broken or just plain wrong? These problems affect almost every team, yet they're usually addressed on an ad hoc basis and in a reactive manner. If you answered yes to these questions, this book is for you. Many data engineering teams today face the "good pipelines, bad data" problem. It doesn't matter how advanced your data infrastructure is if the data you're piping is bad. In this book, Barr Moses, Lior Gavish, and Molly Vorwerck, from the data observability company Monte Carlo, explain how to tackle data quality and trust at scale by leveraging best practices and technologies used by some of the world's most innovative companies. Build more trustworthy and reliable data pipelines Write scripts to make data checks and identify broken pipelines with data observability Learn how to set and maintain data SLAs, SLIs, and SLOs Develop and lead data quality initiatives at your company Learn how to treat data services and systems with the diligence of production software Automate data lineage graphs across your data ecosystem Build anomaly detectors for your critical data assets

Book Data Engineering with AWS

Download or read book Data Engineering with AWS written by Gareth Eagar and published by Packt Publishing Ltd. This book was released on 2021-12-29 with total page 482 pages. Available in PDF, EPUB and Kindle. Book excerpt: The missing expert-led manual for the AWS ecosystem — go from foundations to building data engineering pipelines effortlessly Purchase of the print or Kindle book includes a free eBook in the PDF format. Key Features Learn about common data architectures and modern approaches to generating value from big data Explore AWS tools for ingesting, transforming, and consuming data, and for orchestrating pipelines Learn how to architect and implement data lakes and data lakehouses for big data analytics from a data lakes expert Book DescriptionWritten by a Senior Data Architect with over twenty-five years of experience in the business, Data Engineering for AWS is a book whose sole aim is to make you proficient in using the AWS ecosystem. Using a thorough and hands-on approach to data, this book will give aspiring and new data engineers a solid theoretical and practical foundation to succeed with AWS. As you progress, you’ll be taken through the services and the skills you need to architect and implement data pipelines on AWS. You'll begin by reviewing important data engineering concepts and some of the core AWS services that form a part of the data engineer's toolkit. You'll then architect a data pipeline, review raw data sources, transform the data, and learn how the transformed data is used by various data consumers. You’ll also learn about populating data marts and data warehouses along with how a data lakehouse fits into the picture. Later, you'll be introduced to AWS tools for analyzing data, including those for ad-hoc SQL queries and creating visualizations. In the final chapters, you'll understand how the power of machine learning and artificial intelligence can be used to draw new insights from data. By the end of this AWS book, you'll be able to carry out data engineering tasks and implement a data pipeline on AWS independently.What you will learn Understand data engineering concepts and emerging technologies Ingest streaming data with Amazon Kinesis Data Firehose Optimize, denormalize, and join datasets with AWS Glue Studio Use Amazon S3 events to trigger a Lambda process to transform a file Run complex SQL queries on data lake data using Amazon Athena Load data into a Redshift data warehouse and run queries Create a visualization of your data using Amazon QuickSight Extract sentiment data from a dataset using Amazon Comprehend Who this book is for This book is for data engineers, data analysts, and data architects who are new to AWS and looking to extend their skills to the AWS cloud. Anyone new to data engineering who wants to learn about the foundational concepts while gaining practical experience with common data engineering services on AWS will also find this book useful. A basic understanding of big data-related topics and Python coding will help you get the most out of this book but it’s not a prerequisite. Familiarity with the AWS console and core services will also help you follow along.