Jeremy Stanley, Paige Schwartz
Automating Data Quality Monitoring at Scale: Scaling Beyond Rules with Machine Learning
- Condition: Brand new
- UK Delivery times: Usually arrives within 2 - 3 working days
- UK Shipping: Fee starts at £2.39. Subject to product weight & dimension
- More about Automating Data Quality Monitoring at Scale: Scaling Beyond Rules with Machine Learning
Businesses generate 2.5 quintillion bytes of data daily, but much of it is poor quality or useless. This book provides practical advice on using automated data quality monitoring to ensure high-quality records: covering all tables efficiently, alerting proactively on issues, and resolving problems immediately. It also helps readers understand the limits of automated monitoring and how to overcome them, and shows how to deploy and manage the solution at scale.
Format: Paperback / softback
Length: 170 pages
Publication date: 30 January 2024
Publisher: O'Reilly Media
The world's businesses ingest a staggering 2.5 quintillion bytes of data every day, information that is used to build products, power AI systems, and drive business decisions. But how much of this data is of poor quality or simply bad? This practical book addresses that question and shows how to ensure that the data your organization relies on contains only high-quality records.
While many data engineers, data analysts, and data scientists genuinely care about data quality, they often lack the time, resources, or understanding to create a data quality monitoring solution that succeeds at scale. In this book, Jeremy Stanley and Paige Schwartz from Anomalo offer valuable insights on how to leverage automated data quality monitoring to effectively cover all your tables, proactively alert on every category of issue, and resolve problems immediately.
Here are some key takeaways from the book:
* Data quality is a business imperative: Recognize that data quality is not just a technical concern but a critical factor in driving business success. Poor-quality data can lead to incorrect decisions, wasted resources, and customer dissatisfaction.
* Understand and assess unsupervised learning models for detecting data issues: Unsupervised learning models identify patterns and anomalies in data without labeled examples. These models can help detect data quality issues such as missing values, duplicate records, and outliers.
* Implement notifications that reduce alert fatigue and let you triage and resolve issues quickly: Tailor notifications to the specific issues and categories of concern so that you can prioritize problems and focus on resolving the critical ones.
* Integrate automated data quality monitoring with data catalogs, orchestration layers, and BI and ML systems: Connecting monitoring to your existing data infrastructure centralizes management and helps streamline your processes.
* Understand the limits of automated data quality monitoring and how to overcome them: Identify the areas where manual intervention may be necessary, such as complex data relationships, sensitive data, and specific business requirements.
* Learn how to deploy and manage your monitoring solution at scale: Deploy the solution in a scalable and efficient manner by optimizing performance, monitoring system health, and ensuring that it can handle a growing volume of data.
* Maintain automated data quality monitoring for the long term: Keep the solution aligned with your evolving business requirements through regular maintenance, updates, and training, so that your team is equipped to handle new challenges and emerging technologies.
In conclusion, this book provides valuable insights and practical guidance on ensuring data quality in your organization. By leveraging automated data quality monitoring, you can improve the accuracy, reliability, and usability of your data, leading to better business decisions and increased competitive advantage. Whether you are a data engineer, data analyst, or data scientist, this book will help you build a data quality monitoring solution that succeeds at scale and drives your organization's success.
Weight: 394g
Dimensions: 176 x 234 x 15 mm
ISBN-13: 9781098145934
UK and International shipping information
UK Delivery and returns information:
- Delivery within 2 - 3 days when ordering in the UK.
- Shipping fee for UK customers from £2.39. Fully tracked shipping service available.
- Returns policy: Return within 30 days of receipt for full refund.
International deliveries:
Shulph Ink now ships to Australia, Belgium, Canada, France, Germany, Ireland, Italy, India, Luxembourg, Saudi Arabia, Singapore, Spain, Netherlands, New Zealand, United Arab Emirates, United States of America.
- Delivery times: within 5 - 10 days for international orders.
- Shipping fee: charges vary for overseas orders. Only tracked services are available for most international orders. Some countries have untracked shipping options.
- Customs charges: If ordering to addresses outside the United Kingdom, you may incur additional customs duties and fees on local delivery.
