The best books on Data Science, Big Data, Data Mining, Machine Learning, Python, R, SQL, NoSQL and more.
Applications and Strategies for Human-in-the-loop Machine Learning.
Intro to Hadoop: an open-source framework for storing and processing big data in a distributed environment across clusters of computers using simple programming models. It is designed to scale from a single server to thousands of machines.
'Hadoop Illuminated' is an open-source book about Apache Hadoop™. It aims to make Hadoop knowledge accessible to a wider audience, not just to the highly technical.
This is a simple book for learning the Python programming language, aimed at programmers who are new to Python.
The School of Data Handbook is a companion text to the School of Data. Its function is something like a traditional textbook – it will provide the detail and background theory to support the School of Data courses and challenges.
The aim of this Wikibook is to be a place where anyone can share knowledge and tricks on R. It is organized by task rather than by discipline, with the goal of being a cross-disciplinary book that anyone can use.