ThePythonBook

Data Science & Machine Learning

NumPy, Pandas, Matplotlib, SciPy, and scikit-learn

20 tutorials
117 exercises
535 minutes
1995 XP
20 tutorials in this category

NumPy Arrays: Create, Index, Slice, and Reshape

The ndarray is the heart of NumPy. Create arrays, index them, slice them, reshape them, and see why they're faster than lists.

intermediate, 30 min, 7 exercises, 105 XP

NumPy Operations: Element-wise Math, Aggregations, and ufuncs

Element-wise math, aggregations like sum/mean/std, universal functions, and vectorized operations that replace slow Python loops.

intermediate, 25 min, 6 exercises, 90 XP

NumPy Broadcasting: Operate on Arrays of Different Shapes

How NumPy operates on arrays of different shapes without copying data. The broadcasting rules, visualized and explained.

advanced, 25 min, 5 exercises, 100 XP
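
As a rough illustration of the "different shapes without copying data" claim, here is a minimal sketch. The arrays `col` and `row` are made-up example data, not taken from the tutorial. It adds a (3, 1) array to a (4,) array, then uses `np.broadcast_to` to show that the stretched axis has a stride of 0, meaning the row is viewed repeatedly rather than duplicated in memory.

```python
import numpy as np

# Hypothetical example data: a column of 3 offsets and a row of 4 values.
col = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3, 4])        # shape (4,)

# Shapes are compared right to left: (3, 1) against (4,) gives (3, 4).
total = col + row

# broadcast_to returns a read-only view of `row` with shape (3, 4).
# The new leading axis has stride 0, so no data is repeated in memory.
view = np.broadcast_to(row, (3, 4))

print(total)
print(view.strides[0])
```

The result of `col + row` is a new array, but the intermediate "stretched" operands are never materialized, which is where the memory saving comes from.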

NumPy Linear Algebra: Dot Products, Inverses, and Eigenvalues

Dot products, matrix multiplication, inverses, determinants, and eigenvalues — linear algebra through NumPy's linalg module.

advanced, 25 min, 5 exercises, 100 XP

Pandas: Create, Load, and Explore DataFrames

Load data into a DataFrame, explore it with head/info/describe, and start manipulating rows and columns right away.

intermediate, 30 min, 7 exercises, 105 XP

Pandas Indexing: loc, iloc, Boolean Indexing, and Selection

Select data precisely with loc, iloc, and boolean indexing. The three selection methods you'll use on every DataFrame.

intermediate, 25 min, 6 exercises, 90 XP
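
To make the difference between the three selection methods concrete, here is a small sketch. The DataFrame, its column names, and its index labels are invented for illustration. It assumes the usual pandas behavior: `loc` selects by label, `iloc` selects by integer position, and a boolean mask keeps the rows where a condition is true.

```python
import pandas as pd

# Hypothetical example data with string index labels.
df = pd.DataFrame(
    {"city": ["Oslo", "Lima", "Pune"], "temp_c": [4, 19, 31]},
    index=["a", "b", "c"],
)

by_label = df.loc["b", "temp_c"]   # label-based: row "b", column "temp_c"
by_position = df.iloc[1, 1]        # position-based: second row, second column
warm = df[df["temp_c"] > 15]       # boolean mask: rows where the condition holds

# Label slices include the end label; position slices exclude the stop index.
label_slice = df.loc["a":"b"]      # rows "a" and "b"
pos_slice = df.iloc[0:1]           # row at position 0 only

print(by_label, by_position)
print(warm)
```

The slicing difference at the end is a common source of off-by-one surprises when switching between `loc` and `iloc`.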

Pandas Data Cleaning: Missing Values, Duplicates, and Outliers

Handle missing values, drop duplicates, detect outliers, and transform messy real-world data into something usable.

intermediate, 30 min, 7 exercises, 105 XP

Pandas merge(), join(), concat(): Combine DataFrames Like SQL

Combine DataFrames like SQL tables — merge(), join(), concat(). Inner, outer, left, and right joins explained with examples.

intermediate, 25 min, 6 exercises, 90 XP

Pandas GroupBy: Split-Apply-Combine for Powerful Aggregations

Split-apply-combine: group rows, run aggregations, and reshape data. The pandas equivalent of SQL GROUP BY, but more flexible.

advanced, 30 min, 7 exercises, 130 XP

Pandas apply(), map(), transform(): Custom Data Transformations

Apply custom functions to DataFrames with apply(), map(), and transform(). When each one is the right choice.

advanced, 25 min, 6 exercises, 110 XP

Pandas String and DateTime Operations for Real-World Data

Work with text columns using the .str accessor and with dates using the .dt accessor. Clean strings, parse timestamps, extract date parts.

intermediate, 25 min, 6 exercises, 85 XP

Pandas Pivot Tables and Cross-Tabulation for Business Analysis

Reshape data with pivot tables and cross-tabulations. Turn raw rows into summary reports for business analysis.

advanced, 25 min, 5 exercises, 95 XP

Matplotlib: Create Line, Bar, Scatter, and Pie Charts

Create line charts, bar charts, scatter plots, and pie charts. The plotting library behind most Python data visualization.

intermediate, 25 min, 6 exercises, 90 XP

Advanced Matplotlib: Subplots, Dual Axes, Styles, Annotations

Subplots, dual axes, custom styles, annotations, and publication-quality figures. Take your Matplotlib plots to the next level.

advanced, 25 min, 5 exercises, 100 XP

Data Visualization: Choose the Right Chart and Tell a Story

Choosing the right chart type, avoiding misleading visuals, and telling a clear story with your data.

advanced, 25 min, 5 exercises, 100 XP

SciPy Statistics: Distributions, Hypothesis Tests, and Correlations

Statistical distributions, hypothesis testing (t-tests, chi-squared), and correlation analysis using SciPy's stats module.

advanced, 30 min, 6 exercises, 120 XP

Build Your First ML Model: Linear Regression with scikit-learn

Build your first machine learning model — linear regression with scikit-learn. Fit, predict, evaluate, and visualize.

intermediate, 30 min, 6 exercises, 90 XP

Python Classification: Build a Classifier with scikit-learn

Train a classifier to categorize data — logistic regression, decision trees, and measuring accuracy with scikit-learn.

intermediate, 30 min, 6 exercises, 90 XP

Evaluating ML Models: Cross-Validation, Metrics, and Overfitting

Cross-validation, precision, recall, F1, ROC curves, and the overfitting trap. Know if your model actually works.

advanced, 25 min, 5 exercises, 100 XP
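
For readers who want the metric definitions at a glance, here is a minimal pure-Python sketch of precision, recall, and F1 for a binary problem. The helper name `precision_recall_f1` and the label lists are hypothetical; in practice the tutorial presumably relies on scikit-learn's own metric functions instead.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return precision, recall, f1


# Hypothetical labels: the model finds only one of the four positives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(precision, recall, f1, accuracy)
```

Here accuracy is 0.5 while recall is only 0.25, which is the kind of gap these metrics are meant to expose.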

K-Means Clustering: Find Patterns in Data Without Labels

Group data without labels using K-Means clustering. The elbow method, silhouette scores, and unsupervised pattern discovery.

advanced, 25 min, 5 exercises, 100 XP