# Awesome Python Data Science Books

A curated list of books on data science with Python.

## Contents

- Statistics
- Data Analysis
- Data Intuition
- Feature Engineering
- Machine Learning
- Time Series
- Natural Language Processing
- Deep Learning
- Code Optimization
- Scraping
- Career in Data Science

## Statistics

## Practical Statistics for Data Scientists: 50 Essential Concepts - Peter Bruce & Andrew Bruce

Learn how to apply various statistical methods to data science and how to avoid their misuse. Understand which statistical concepts are important and which are not.

## Pattern Recognition and Machine Learning - Christopher M. Bishop

Learn approximate inference algorithms that permit fast approximate answers in situations where exact answers are not feasible. Familiarity with multivariate calculus and basic linear algebra is required.

## Think Bayes: Bayesian Statistics in Python - Allen B. Downey

Learn how to solve statistical problems with Python code instead of mathematical notation. Learn how to work with problems involving estimation, prediction, decision analysis, evidence, and hypothesis testing.

## Probabilistic Programming & Bayesian Methods for Hackers - Cameron Davidson-Pilon

Learn Bayesian inference from a computational/understanding-first, and mathematics-second, point of view.

## An Introduction to Statistical Learning - Gareth James, Daniela Witten, Trevor Hastie, & Rob Tibshirani

Learn key topics in statistical learning. This book is perfect for those who want a gentle introduction to popular machine learning algorithms.

## Data Analysis

## Storytelling with Data: A Data Visualization Guide for Business Professionals - Cole Nussbaumer Knaflic

Learn how to determine the appropriate type of graph for your situation, eliminate irrelevant information, and direct your audience's attention to the most important parts of your data.

## Data Science from Scratch, 2nd Edition - Joel Grus

Learn data science libraries, frameworks, modules, tools and algorithms by implementing them from scratch.

## Data Intuition

## Head First Data Analysis: A learner's guide to big numbers, statistics, and good decisions - Michael Milton

Learn how to determine which data sources to use for collecting information, distinguish signal from noise, cope with ambiguous information, design experiments to test hypotheses, organize your data using segmentation, and communicate the results of your analysis.

## Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management - Gordon S. Linoff & Michael J. A. Berry

Learn how to harness the newest data mining methods and techniques to prepare data for analysis and create the necessary infrastructure for data mining at your company. Learn core data mining techniques, including decision trees, neural networks, collaborative filtering, association rules, link analysis, and survival analysis.

## Behind Every Good Decision: How Anyone Can Use Business Analytics to Turn Data into Profitable Insight - Piyanka Jain & Puneet Sharma

Learn how to clarify the business question, lay out a hypothesis-driven plan, convert relevant data to insights, and make decisions that make an impact.

## The Book of Why: The New Science of Cause and Effect - Judea Pearl & Dana Mackenzie

Learn how to explore the world that is and the worlds that could have been by understanding causality. Learn to answer hard questions, like whether a drug cured an illness.

## Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy - Cathy O'Neil

Learn how the models being used today reinforce discrimination, prop up the lucky, and punish the downtrodden. The book empowers us to ask tough questions, uncover the truth, and demand change.

## Business Analytics: The Science of Data-Driven Decision Making - U Dinesh Kumar

Learn the foundations of data science and components of analytics such as descriptive, predictive and prescriptive analytics topics using examples from several industries, as well as nine analytics case studies. The book gives equal importance to theory and practice with examples across industries and the case studies provide a deeper understanding of analytics techniques and deployment of analytics-driven solutions.

## Feature Engineering

## Feature Engineering for Machine Learning: Principles and Techniques for Data Scientists - Alice Zheng & Amanda Casari

Learn techniques for extracting and transforming features into formats for machine-learning models through practical application with exercises using tools such as NumPy, pandas, scikit-learn, and Matplotlib.

## Python Data Science Handbook - Jake VanderPlas

Learn how to manipulate, transform, and clean data; visualize different types of data; and use data to build statistical or machine learning models using IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and other related tools.

## Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython - Wes McKinney

Learn how to manipulate, process, clean, and crunch datasets in Python and how to work with time series data through real-world problems using Jupyter Notebook, NumPy, pandas, and Matplotlib.

## Machine Learning

## The Hundred-Page Machine Learning Book - Andriy Burkov

Learn everything you really need to know in machine learning in a hundred pages.

## Python Machine Learning: Machine Learning and Deep Learning with Python, scikit-learn, and TensorFlow 2 - Sebastian Raschka & Vahid Mirjalili

Learn all the essential machine learning techniques in depth. Learn how to use scikit-learn for machine learning and TensorFlow for deep learning.

## Machine Learning for Algorithmic Trading: Predictive models to extract signals from market and alternative data for systematic trading strategies with Python - Stefan Jansen

Learn end-to-end machine learning for the trading workflow, from the idea and feature engineering to model optimization, strategy design, and backtesting.

## Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems - Aurélien Géron

Learn a range of techniques, starting with simple linear regression and progressing to deep neural networks using concrete examples, minimal theory, and two production-ready Python frameworks—Scikit-Learn and TensorFlow.

## Building Machine Learning Powered Applications: Going from Idea to Product - Emmanuel Ameisen

Learn the skills necessary to design, build, and deploy applications powered by machine learning. Learn the tools, best practices, and challenges involved in building a real-world ML application.

## Machine Learning Yearning - Andrew Ng

Learn how to align on ML strategies in a team setting, as well as how to set up development (dev) sets and test sets.

## Approaching (Almost) Any Machine Learning Problem - Abhishek Thakur

Learn how and what you should use to solve machine learning and deep learning problems. Appropriate for those who have some theoretical knowledge of machine learning and deep learning.

## Machine Learning Engineering - Andriy Burkov

Learn best practices and design patterns for building reliable machine learning solutions that scale.

## Interpretable Machine Learning - Christoph Molnar

Learn the concepts of interpretability, interpretable models, and general methods for interpreting black box models. Learn in depth the strengths and weaknesses of each method and how their outputs can be interpreted.

## Building Machine Learning Pipelines - Hannes Hapke & Catherine Nelson

Learn the steps of automating a machine learning pipeline using the TensorFlow ecosystem.

## Introduction to Machine Learning with Python - Andreas C. Müller & Sarah Guido

Learn to create a successful machine-learning application with Python and the scikit-learn library.

## Time Series

## Introduction to Time Series Forecasting With Python - Jason Brownlee

Learn how to load and prepare data, evaluate model skill, and implement forecasting models for time series data. This book cuts through the math and specialized methods for time series forecasting.

## Practical Time Series Analysis - Aileen Nielsen

Learn to solve the most common data engineering and analysis challenges in time series, using both traditional statistical and modern machine learning techniques.

## Natural Language Processing

## Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit - Steven Bird & Ewan Klein

Learn how to write Python programs that work with large collections of unstructured text, for applications ranging from predictive text and email filtering to automatic summarization and translation.

## Practical Natural Language Processing: A Comprehensive Guide to Building Real-World NLP Systems - Sowmya Vajjala, Bodhisattwa Majumder, Anuj Gupta & Harshit Surana

Learn how to adapt your solutions for different industry verticals such as healthcare, social media, and retail. Understand tasks and solution approaches within NLP and best practices around deployment for NLP systems.

## Natural Language Processing with PyTorch: Build Intelligent Language Applications Using Deep Learning - Delip Rao & Brian McMahan

Learn the basics of PyTorch, traditional NLP concepts and methods, neural networks, embeddings, sequence prediction, and design patterns for building production NLP systems.

## Transformers for Natural Language Processing: Build innovative deep neural network architectures for NLP with Python, PyTorch, TensorFlow, BERT, RoBERTa, and more - Denis Rothman

Learn in detail how to apply deep learning with transformers to machine translation, speech-to-text, text-to-speech, language modeling, question answering, and many more NLP domains.

## Deep Learning

## Deep Learning for Coders with Fastai and PyTorch: AI Applications Without a PhD - Jeremy Howard & Sylvain Gugger

Learn how to train a model on a wide range of tasks in deep learning with little math background and minimal code using fastai and PyTorch. Written by the creators of fastai.

## Deep Learning (Adaptive Computation and Machine Learning series) - Ian Goodfellow, Yoshua Bengio & Aaron Courville

Learn the mathematical and conceptual background of deep learning, the techniques used by practitioners in industry (including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology), and other theoretical topics.

## Deep Learning with PyTorch - Eli Stevens, Luca Antiga & Thomas Viehmann

Learn how to create deep learning and neural network systems with PyTorch and learn best practices for the entire deep learning pipeline for advanced projects.

## Long Short-Term Memory Networks With Python - Jason Brownlee

Learn what LSTMs are, and how to develop a suite of LSTM models using Keras and TensorFlow 2. This book cuts through the math, research papers and patchwork descriptions about LSTMs.

## Practical Deep Learning Book for Cloud, Mobile & Edge: Real-World AI and Computer Vision Projects Using Python, Keras and TensorFlow - Anirudh Koul, Siddha Ganju, & Meher Kasam

Learn how to build practical computer vision based deep learning applications that can be deployed on the cloud, mobile, browsers, or edge devices using a hands-on approach.

## Deep Learning Illustrated - Jon Krohn

Learn essential concepts in deep learning through visualization with little math.

## Code Optimization

## Effective Python: 59 Specific Ways to Write Better Python (Effective Software Development Series) - Brett Slatkin

Learn how to choose the most efficient and effective way to accomplish key tasks when multiple options exist, and how to write Python code that's easier to understand, maintain, and improve.

## Python Tricks: A Buffet of Awesome Python Features - Dan Bader

Learn best practices and little-known tricks to round out your Python knowledge.

## Python High Performance Programming - Gabriele Lanaro

Learn how to identify and solve the bottlenecks in your applications, write efficient numerical code in NumPy and Cython, and adapt your programs to run on multiple processors with parallel programming.

## Python Cookbook - David Beazley & Brian K. Jones

Learn the core Python language as well as tasks common to a wide variety of application domains such as data structures and algorithms, classes and objects, metaprogramming, modules and packages, testing, debugging, and exceptions.

## Scraping

## Web Scraping with Python: Collecting Data from the Modern Web - Ryan Mitchell

Learn how to query web servers, request data, and parse it to extract the information you need using tools such as requests, BeautifulSoup, Scrapy, APIs and how to store, read, and clean the data you scrape.

## Career in Data Science

## Build a Career in Data Science - Emily Robinson & Jacqueline Nolis

Learn everything from how to land your first job to the lifecycle of a data science project, and how to become a manager.

## How to Contribute

Contributions are always welcome! If you know some interesting books or other categories that should be here but are not, feel free to contribute! To contribute, follow the four steps below:

- Fork the repo
- Add new resources using the same markdown format.
- Start the book summary with “Learn…”
- Submit the pull request

That’s it. As soon as I review your pull request, your resources will be added to this page.

Alternatively, you can create an issue with a book recommendation.