📘 Introduction to Data Science — Textbook
A compact, classroom-ready textbook with theory and by-hand examples.
🎯 What you’ll learn
- Solid foundations of Data Science: concepts, math, and workflow
- Visualization: histograms, scatter, KDE, Q–Q, boxplots, multivariate charts
- Modeling: regression, classification, clustering, evaluation, cross-validation
- Data preparation: cleaning, integration, reduction, transformation, discretization
- Hands-on: Python code to generate figures used in the chapters
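
As a rough illustration of how the preparation, modeling, and evaluation topics above fit together, here is a minimal sketch using only the Python standard library. The toy data and the helper names (`clean`, `fit_line`, `rmse`, `k_fold_rmse`) are assumptions made for this example; they are not taken from the chapters.

```python
import math
import random


def clean(rows):
    """Data preparation: drop rows that contain a missing (None) value."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def fit_line(points):
    """Modeling: closed-form least squares for y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx  # assumes the x values are not all identical
    return mean_y - slope * mean_x, slope


def rmse(points, intercept, slope):
    """Evaluation: root mean squared error of the fitted line."""
    squared = [(y - (intercept + slope * x)) ** 2 for x, y in points]
    return math.sqrt(sum(squared) / len(squared))


def k_fold_rmse(points, k=5, seed=0):
    """Cross-validation: average held-out RMSE over k folds."""
    shuffled = points[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    scores = []
    for i, test_fold in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        intercept, slope = fit_line(train)
        scores.append(rmse(test_fold, intercept, slope))
    return sum(scores) / k


# Hypothetical toy data: y = 2x + 1 exactly, plus two rows with missing values.
raw = [(float(x), 2.0 * x + 1.0) for x in range(20)] + [(None, 3.0), (4.0, None)]
data = clean(raw)
intercept, slope = fit_line(data)
cv_error = k_fold_rmse(data, k=5)
print(f"rows kept: {len(data)}, intercept: {intercept:.3f}, slope: {slope:.3f}")
print(f"5-fold CV RMSE: {cv_error:.6f}")
```

Because the toy data is noise-free, the cross-validated error is essentially zero; real data would give a non-zero score that can be compared across models.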
Data Science Overview
graph TD
A[Data Science] --> B[Prior Knowledge]
A --> C[Data Preparation]
A --> D[Modeling]
A --> E[Evaluation]
A --> F[Deployment]
C --> C1[Cleaning]
C --> C2[Integration]
C --> C3[Reduction]
D --> D1[Regression]
D --> D2[Classification]
D --> D3[Clustering]

🗂️ How to use this site
- Use the left navigation to jump between chapters.
- Each chapter includes formulas (MathJax), diagrams (Mermaid/PNG), and code snippets to reproduce figures.
- Downloadable `.md` and `.png` files are available where relevant for GitHub Pages hosting.
💡 Tip: Use the search box (⌘/Ctrl + K) to find formulas, terms, or figure names instantly.
📚 Chapters
- Foundations
  - Chapter 1 — Introduction
  - Chapter 2 — What is Data Science
  - Chapter 3 — Data Objects and Attribute Types
- Visualization
- Process & Preparation
  - Chapter 9 · Chapter 10 · Chapter 11 · Chapter 12 · Chapter 13
- Modeling & Evaluation
  - Chapter 14 · Chapter 15 · Chapter 16 · Chapter 17
🖼️ Plot Gallery (quick preview)
| Quantile | Histogram | Scatter |
|---|---|---|
| Multi-Scatter | Scatter Matrix | Bubble |
| KDE | Q–Q | Parallel Coordinates |
| Deviation | Andrews Curves | Box |
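
Two of the plots named in the gallery, the histogram and the box plot, are drawn from simple summaries that can be computed by hand. The sketch below shows one way to get those numbers with the standard library. The sample values and the function names (`histogram_counts`, `five_number_summary`) are assumptions for illustration, and the quartile convention (`method="inclusive"`) is one of several in common use.

```python
import statistics


def histogram_counts(values, bins):
    """Count values in equal-width bins; the last bin includes the maximum."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins  # assumes the values are not all identical
    counts = [0] * bins
    for v in values:
        # Clamp so that v == hi falls in the last bin instead of overflowing.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


def five_number_summary(values):
    """Minimum, Q1, median, Q3, maximum: the numbers a box plot displays."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


# Hypothetical sample, chosen only to keep the arithmetic easy to follow.
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
counts = histogram_counts(sample, bins=4)
summary = five_number_summary(sample)
print("bin counts:", counts)
print("five-number summary:", summary)
```

The bin counts give the bar heights of a histogram, and the five-number summary gives the whisker ends, box edges, and median line of a box plot.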











