
Peter Bruce, Andrew Bruce, Peter Gedeck

Practical Statistics for Data Scientists: 50+ Essential Concepts Using R and Python



Regular price: £63.99 GBP
Sale price: £46.45 GBP
Taxes included. Shipping calculated at checkout.

YOU SAVE £17.54

  • Condition: Brand new
  • UK Delivery times: Usually arrives within 2 - 3 working days
  • UK Shipping: Fee starts at £2.39. Subject to product weight & dimension

Bulk ordering. Want 15 or more copies? Get a personalised quote and bigger discounts. Learn more about bulk orders.



The second edition of this popular guide offers practical guidance on applying statistical methods to data science, adds comprehensive examples in Python, and helps readers avoid misusing those methods. It bridges the gap between statistical training and data science, and suits readers who are familiar with the R or Python programming languages and have some exposure to statistics.

Format: Paperback / softback
Length: 350 pages
Publication date: 24 June 2020
Publisher: O'Reilly Media, Inc, USA


Data science is a rapidly growing field that relies heavily on statistical methods to analyze and interpret data. However, despite the importance of statistical training in data science, many data scientists lack formal training in statistics. Courses and books on basic statistics often focus on theoretical concepts and mathematical formulas, rather than providing practical guidance on applying statistical methods to data science.

The second edition of this popular guide addresses that gap. It adds comprehensive examples in Python, offers practical guidance on applying statistical methods to data science, shows how to avoid misusing them, and advises on which methods are essential and which are not.

While many data science resources incorporate statistical methods, they often lack a deeper statistical perspective. This quick reference is designed to bridge the gap for individuals who are familiar with the R or Python programming languages and have some exposure to statistics. It provides a concise and accessible format for learning about statistical methods in data science.

The book covers a range of topics, including exploratory data analysis, random sampling, experimental design, regression analysis, key classification techniques, statistical machine learning methods, and unsupervised learning methods. Each topic is explained in detail, with practical examples and code snippets to illustrate the concepts.

Exploratory data analysis is a crucial preliminary step in data science, as it helps researchers understand the structure and characteristics of their data. Random sampling can reduce bias and yield a higher-quality dataset, even when working with very large datasets. Experimental design yields definitive answers to questions by controlling variables and measuring outcomes. Regression analysis is used to estimate outcomes and detect anomalies. Key classification techniques, such as logistic regression and decision trees, are used to predict which category a record belongs to.

Statistical machine learning methods, such as neural networks and support vector machines, learn from data and can be used for a wide range of applications, including image recognition, natural language processing, and fraud detection. Unsupervised learning methods, such as clustering and dimensionality reduction, are used to extract meaning from unlabeled data.

In conclusion, statistical methods are a critical component of data science, yet many data scientists lack formal training in statistics. This quick reference offers comprehensive examples in Python, practical guidance on applying statistical methods, and advice on avoiding their misuse, for readers who are familiar with the R or Python programming languages and have some exposure to statistics. With a firmer grasp of these methods, data scientists can understand their data more deeply and make better-informed decisions based on their findings.

Weight: 620g
Dimensions: 178 x 233 x 21 mm
ISBN-13: 9781492072942
Edition: 2nd, new edition


UK Delivery and returns information:

  • Delivery within 2 - 3 days when ordering in the UK.
  • Shipping fee for UK customers from £2.39. Fully tracked shipping service available.
  • Returns policy: Return within 30 days of receipt for full refund.

International deliveries:

Shulph Ink now ships to Australia, Belgium, Canada, France, Germany, Ireland, Italy, India, Luxembourg, Saudi Arabia, Singapore, Spain, Netherlands, New Zealand, United Arab Emirates, and United States of America.

  • Delivery times: within 5 - 10 days for international orders.
  • Shipping fee: charges vary for overseas orders. Only tracked services are available for most international orders. Some countries have untracked shipping options.
  • Customs charges: If ordering to an address outside the United Kingdom, you may incur additional customs and duty fees on local delivery.