Jonathan Rioux

Data Analysis with Python and PySpark

💎 Earn 235 Points (£2.35) on this item.

Low Stock: Only 2 copies remaining
Regular price £47.01 GBP
Taxes included. Shipping calculated at checkout.
  • Condition: Brand new
  • UK Delivery times: Usually arrives within 2 - 3 working days
  • UK Shipping: Fee starts at £2.39. Subject to product weight & dimension

Bulk ordering. Want 15 or more copies? Get a personalised quote and bigger discounts. Learn more about bulk orders.

  • More about Data Analysis with Python and PySpark


PySpark is a powerful data analysis platform that combines the Spark big data processing engine with the Python programming language. It provides a scalable and efficient way to perform data analysis tasks, enabling you to handle large datasets and extract insights from them. This guide helps you leverage PySpark to deliver successful Python-driven data projects, covering topics such as distributed processing, data abstraction, and integration with Python-based data science tools.

Format: Paperback / softback
Length: 425 pages
Publication date: 09 March 2022
Publisher: Manning Publications


When it comes to data analytics, it pays to think big. PySpark blends the powerful Spark big data processing engine with the Python programming language to provide a data analysis platform that can scale up for nearly any task.

Data Analysis with Python and PySpark is your guide to delivering successful Python-driven data projects.

Data Analysis with Python and PySpark is a carefully engineered tutorial that helps you use PySpark to deliver your data-driven applications at any scale. This clear and hands-on guide shows you how to scale your processing capabilities across multiple machines with data from any source, ranging from Hadoop-based clusters to Excel worksheets. You'll learn how to break down big analysis tasks into manageable chunks and how to choose and use the best PySpark data abstraction for your unique needs.

The Spark data processing engine is an amazing analytics factory: raw data comes in, and insight comes out. Thanks to its ability to handle massive amounts of data distributed across a cluster, Spark has been adopted as a standard by organizations both big and small. PySpark, which wraps the core Spark engine with a Python-based API, puts Spark-based data pipelines in the hands of programmers and data scientists working with the Python programming language. PySpark flattens Spark's steep learning curve and provides a seamless bridge between Spark and an ecosystem of Python-based data science tools.

Weight: 760g
Dimension: 234 x 186 x 30 (mm)
ISBN-13: 9781617297205

UK and International shipping information

UK Delivery and returns information:

  • Delivery within 2 - 3 days when ordering in the UK.
  • Shipping fee for UK customers from £2.39. Fully tracked shipping service available.
  • Returns policy: Return within 30 days of receipt for full refund.

International deliveries:

Shulph Ink now ships to Australia, Belgium, Canada, France, Germany, Ireland, Italy, India, Luxembourg, Saudi Arabia, Singapore, Spain, the Netherlands, New Zealand, the United Arab Emirates, and the United States of America.

  • Delivery times: within 5 - 10 days for international orders.
  • Shipping fee: charges vary for overseas orders. Only tracked services are available for most international orders. Some countries have untracked shipping options.
  • Customs charges: For orders shipped to addresses outside the United Kingdom, you may incur additional customs duties and fees on local delivery.