{"product_id":"data-analysis-with-python-and-pyspark-9781617297205","title":"Data Analysis with Python and PySpark","description":"\u003cp\u003e\u003c\/p\u003e\u003cblockquote\u003e\n\u003cbr\u003ePySpark is a powerful data analysis platform that combines the Spark big data processing engine with the Python programming language. It provides a scalable and efficient way to perform data analysis tasks, enabling you to handle large datasets and extract insights from them. This guide helps you leverage PySpark to deliver successful Python-driven data projects, covering topics such as distributed processing, data abstraction, and integration with Python-based data science tools. \u003c\/blockquote\u003e\u003cp\u003e\u003cstrong\u003eFormat\u003c\/strong\u003e: Paperback \/ softback\u003cbr\u003e\u003cstrong\u003eLength\u003c\/strong\u003e: 425 pages\u003cbr\u003e\u003cstrong\u003ePublication date\u003c\/strong\u003e: 09 March 2022\u003cbr\u003e\u003cstrong\u003ePublisher\u003c\/strong\u003e: Manning Publications\u003cbr\u003e\u003c\/p\u003e \u003cp\u003e\u003cbr\u003eWhen it comes to data analytics, it pays to think big. PySpark blends the powerful Spark big data processing engine with the Python programming language to provide a data analysis platform that can scale up for nearly any task.\u003cbr\u003e\u003cbr\u003eData Analysis with Python and PySpark is your guide to delivering successful Python-driven data projects.\u003cbr\u003e\u003cbr\u003eData Analysis with Python and PySpark is a carefully engineered tutorial that helps you use PySpark to deliver your data-driven applications at any scale. This clear and hands-on guide shows you how to enlarge your processing capabilities across multiple machines with data from any source, ranging from Hadoop-based clusters to Excel worksheets. You'll learn how to break down big analysis tasks into manageable chunks and how to choose and use the best PySpark data abstraction for your unique needs.\u003cbr\u003e\u003cbr\u003eThe Spark data processing engine is an amazing analytics factory: raw data comes in, and insight comes out. Thanks to its ability to handle massive amounts of data distributed across a cluster, Spark has been adopted as standard by organizations both big and small. PySpark, which wraps the core Spark engine with a Python-based API, puts Spark-based data pipelines in the hands of programmers and data scientists working with the Python programming language. PySpark simplifies Spark's steep learning curve, and provides a seamless bridge between Spark and an ecosystem of Python-based data science tools.\u003c\/p\u003e\u003cp\u003e\u003cstrong\u003eWeight\u003c\/strong\u003e: 760g\u003cbr\u003e\u003cstrong\u003eDimension\u003c\/strong\u003e: 234 x 186 x 30 (mm)\u003cbr\u003e\u003cstrong\u003eISBN-13\u003c\/strong\u003e: 9781617297205\u003c\/p\u003e","brand":"Jonathan Rioux","offers":[{"title":"Paperback \/ softback","offer_id":44100796481786,"sku":"9781617297205","price":47.01,"currency_code":"GBP","in_stock":true}],"thumbnail_url":"\/\/cdn.shopify.com\/s\/files\/1\/0522\/4297\/2845\/products\/1649417729461_book.jpg?v=1649437963","url":"https:\/\/shulphink.com\/products\/data-analysis-with-python-and-pyspark-9781617297205","provider":"Shulph Ink","version":"1.0","type":"link"}