
📘 Introduction to Data Science — Textbook

A compact, classroom-ready textbook with theory and by-hand examples.



🎯 What you’ll learn

  • Solid foundations of Data Science: concepts, math, and workflow
  • Visualization: histograms, scatter plots, KDE, Q–Q plots, boxplots, multivariate charts
  • Modeling: regression, classification, clustering, evaluation, cross-validation
  • Data preparation: cleaning, integration, reduction, transformation, discretization
  • Hands-on: Python code to generate figures used in the chapters
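
As a taste of the by-hand style, here is a minimal sketch that computes the numbers behind a boxplot and a histogram using only the Python standard library. The function names `five_number_summary` and `histogram_counts` and the sample data are made up for illustration; they are not taken from the chapters. Quartile conventions differ between textbooks, so the chapters may use a different rule than the linear interpolation assumed here.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a boxplot is drawn from.

    Assumption: quartiles use linear interpolation between data points
    (the 'inclusive' method of statistics.quantiles).
    """
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


def histogram_counts(values, low, width, n_bins):
    """Count values in n_bins equal-width bins starting at `low`.

    Bins are half-open [a, b), except the last one, which also
    includes its right edge.
    """
    high = low + width * n_bins
    counts = [0] * n_bins
    for v in values:
        if not low <= v <= high:
            raise ValueError(f"value {v} is outside [{low}, {high}]")
        # Clamp so that v == high falls into the last bin.
        index = min(int((v - low) // width), n_bins - 1)
        counts[index] += 1
    return counts


# Illustrative sample, not data from the book.
data = [2, 4, 4, 5, 7, 8, 9, 10, 12]

print(five_number_summary(data))       # (2, 4.0, 7.0, 9.0, 12)
print(histogram_counts(data, 0, 4, 3)) # [1, 4, 4]
```

The same numbers can be checked by hand: with nine sorted values, the quartiles fall exactly on the 3rd, 5th, and 7th observations, and the bins [0, 4), [4, 8), [8, 12] hold 1, 4, and 4 values.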

Data Science Overview

graph TD
    A[Data Science] --> B[Prior Knowledge]
    A --> C[Data Preparation]
    A --> D[Modeling]
    A --> E[Evaluation]
    A --> F[Deployment]
    C --> C1[Cleaning]
    C --> C2[Integration]
    C --> C3[Reduction]
    D --> D1[Regression]
    D --> D2[Classification]
    D --> D3[Clustering]

🗂️ How to use this site

  • Use the left navigation to jump between chapters.
  • Each chapter includes formulas (MathJax), diagrams (Mermaid/PNG), and code snippets to reproduce figures.
  • Downloadable .md and .png files are available where relevant, for hosting on GitHub Pages.

💡 Tip: Use the search box (⌘/Ctrl + K) to find formulas, terms, or figure names instantly.


📚 Chapters


  • Quantile
  • Histogram
  • Scatter
  • Multi-Scatter
  • Scatter Matrix
  • Bubble
  • KDE
  • Q–Q Plot
  • Parallel Coordinates
  • Deviation
  • Andrews Curves
  • Box