NumPy, Pandas, Matplotlib, SciPy, and scikit-learn
The ndarray is the heart of NumPy. Create arrays, index them, slice them, reshape them, and see why they're faster than lists.
Element-wise math, aggregations like sum/mean/std, universal functions, and vectorized operations that replace slow Python loops.
How NumPy operates on arrays of different shapes without copying data. The broadcasting rules, visualized and explained.
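A minimal sketch of what broadcasting looks like in practice. The array names and values are made up for illustration; only the shapes matter here.

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)   # shape (2, 3): [[0, 1, 2], [3, 4, 5]]
row = np.array([10, 20, 30])          # shape (3,)
column = np.array([[100], [200]])     # shape (2, 1)

# Shapes are compared from the trailing dimension backwards.
# Two dimensions are compatible when they are equal or one of them is 1.
added_row = matrix + row       # (2, 3) + (3,)   -> (2, 3)
added_col = matrix + column    # (2, 3) + (2, 1) -> (2, 3)
grid = row + column            # (3,)   + (2, 1) -> (2, 3)

print(added_row)
print(added_col)
print(grid)
```

In each case the smaller operand is treated as if it were stretched to the larger shape, without a Python loop.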
Dot products, matrix multiplication, inverses, determinants, and eigenvalues — linear algebra through NumPy's linalg module.
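As a rough preview, the operations named above might look like this on a small diagonal matrix. The matrix `A` and vector `v` are illustrative values, chosen so the results are easy to check by hand.

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 3.0]])
v = np.array([1.0, 2.0])

dot = v @ v                          # dot product: 1*1 + 2*2
product = A @ v                      # matrix-vector multiplication
A_inv = np.linalg.inv(A)             # inverse of A
det = np.linalg.det(A)               # determinant of A
eigenvalues = np.linalg.eigvals(A)   # eigenvalues (order is not guaranteed)

print(dot, product, det)
print(A_inv)
print(np.sort(eigenvalues))
```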
Load data into a DataFrame, explore it with head/info/describe, and start manipulating rows and columns right away.
Select data precisely with loc, iloc, and boolean indexing. The three selection methods you'll use on every DataFrame.
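A short sketch of the three selection styles side by side. The DataFrame contents are invented for the example.

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["Ana", "Ben", "Cy"], "age": [31, 25, 47]},
    index=["a", "b", "c"],
)

by_label = df.loc["b", "age"]     # loc: select by index label and column name
by_position = df.iloc[0, 0]       # iloc: select by integer position
over_30 = df[df["age"] > 30]      # boolean indexing: keep rows where the mask is True

print(by_label)
print(by_position)
print(over_30)
```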
Handle missing values, drop duplicates, detect outliers, and transform messy real-world data into something usable.
Combine DataFrames like SQL tables — merge(), join(), concat(). Inner, outer, left, and right joins explained with examples.
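To give a feel for how the join types differ, here is a minimal sketch. The `customers` and `orders` tables are hypothetical sample data.

```python
import pandas as pd

customers = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Ana", "Ben", "Cy"]})
orders = pd.DataFrame({"customer_id": [1, 1, 4], "amount": [50, 20, 70]})

# inner: only keys present in both tables
inner = customers.merge(orders, on="customer_id", how="inner")
# left: every customer, with NaN where no order matches
left = customers.merge(orders, on="customer_id", how="left")
# outer: every key from either table
outer = customers.merge(orders, on="customer_id", how="outer")
# concat: stack rows of same-shaped frames instead of matching on keys
stacked = pd.concat([customers, customers], ignore_index=True)

print(len(inner), len(left), len(outer), len(stacked))
```

Customer 1 has two orders, so that key appears twice in each merged result.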
Split-apply-combine: group rows, run aggregations, and reshape data. The pandas equivalent of SQL GROUP BY, but more flexible.
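Split-apply-combine can be sketched in a few lines. The `sales` data below is made up for illustration.

```python
import pandas as pd

sales = pd.DataFrame({
    "region": ["N", "S", "N", "S", "N"],
    "revenue": [10, 20, 30, 40, 50],
})

# Split rows by region, apply sum to each group, combine into one Series.
totals = sales.groupby("region")["revenue"].sum()

# Several aggregations at once, one column per aggregation.
summary = sales.groupby("region")["revenue"].agg(["mean", "count"])

print(totals)
print(summary)
```

The first call is roughly what `SELECT region, SUM(revenue) ... GROUP BY region` would return in SQL.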
Apply custom functions to DataFrames with apply(), map(), and transform(). When each one is the right choice.
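One way to see the difference between the three methods, using a small invented DataFrame:

```python
import pandas as pd

df = pd.DataFrame({"team": ["x", "x", "y"], "score": [1, 3, 10]})

# map: element-wise lookup or function on a single Series
labels = df["team"].map({"x": "Team X", "y": "Team Y"})

# apply: run a function on each column (or row) of a DataFrame
spread = df[["score"]].apply(lambda col: col.max() - col.min())

# transform: per-group result broadcast back to the original row count
team_mean = df.groupby("team")["score"].transform("mean")

print(labels.tolist())
print(spread)
print(team_mean.tolist())
```

As a rule of thumb, `map` suits value-by-value conversions, `apply` suits whole-column or whole-row computations, and `transform` suits group statistics that need to line up with the original rows.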
Work with text columns using the .str accessor and date columns using the .dt accessor. Clean strings, parse timestamps, extract date parts.
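A brief sketch of both accessors on hypothetical sample data:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["  alice ", "BOB"],
    "joined": ["2023-01-15", "2024-07-04"],
})

# .str exposes string methods on a whole text column.
clean_names = df["name"].str.strip().str.title()

# Parse the text dates first; .dt only works on datetime columns.
joined = pd.to_datetime(df["joined"])
years = joined.dt.year
months = joined.dt.month_name()

print(clean_names.tolist())
print(years.tolist())
print(months.tolist())
```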
Reshape data with pivot tables and cross-tabulations. Turn raw rows into summary reports for business analysis.
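For a sense of what this reshaping produces, a minimal sketch with an invented `orders` table:

```python
import pandas as pd

orders = pd.DataFrame({
    "region": ["N", "N", "S", "S"],
    "product": ["a", "b", "a", "a"],
    "revenue": [10, 20, 30, 40],
})

# Pivot table: one row per region, one column per product, summed revenue in the cells.
report = orders.pivot_table(
    index="region", columns="product", values="revenue",
    aggfunc="sum", fill_value=0,
)

# Cross-tabulation: how many rows fall into each region/product combination.
counts = pd.crosstab(orders["region"], orders["product"])

print(report)
print(counts)
```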
Create line charts, bar charts, scatter plots, and pie charts with Matplotlib, the plotting library behind most Python data visualization.
Subplots, dual axes, custom styles, annotations, and publication-quality figures. Take your Matplotlib plots to the next level.
Choosing the right chart type, avoiding misleading visuals, and telling a clear story with your data.
Statistical distributions, hypothesis testing (t-tests, chi-squared), and correlation analysis using SciPy's stats module.
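A hedged sketch of the three kinds of analysis mentioned above. The samples are synthetic, drawn from a seeded random generator, and the contingency table is invented.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=1.0, size=200)   # synthetic sample, mean near 0
group_b = rng.normal(loc=1.0, scale=1.0, size=200)   # synthetic sample, mean near 1

# Independent two-sample t-test: do the group means differ?
t_stat, p_value = stats.ttest_ind(group_a, group_b)

# Chi-squared test of independence on a 2x2 contingency table.
table = np.array([[10, 20], [20, 10]])
chi2, chi_p, dof, expected = stats.chi2_contingency(table)

# Pearson correlation between a variable and an exact linear function of it.
r, r_p = stats.pearsonr(group_a, 2 * group_a + 1)

print(p_value < 0.05, dof, round(r, 3))
```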
Build your first machine learning model — linear regression with scikit-learn. Fit, predict, evaluate, and visualize.
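The fit, predict, evaluate cycle might look like the following. The data is a noise-free line, y = 2x + 1, chosen so the fitted coefficients are easy to verify; real data will not fit this cleanly.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])   # one feature, four samples
y = 2.0 * X.ravel() + 1.0                    # targets on the line y = 2x + 1

model = LinearRegression().fit(X, y)             # fit
prediction = model.predict(np.array([[5.0]]))    # predict for an unseen x
r2 = model.score(X, y)                           # evaluate: R^2 on the training data

print(model.coef_, model.intercept_, prediction, r2)
```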
Train a classifier to categorize data — logistic regression, decision trees, and measuring accuracy with scikit-learn.
Cross-validation, precision, recall, F1, ROC curves, and the overfitting trap. Know if your model actually works.
Group data without labels using K-Means clustering. The elbow method, silhouette scores, and unsupervised pattern discovery.
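A small sketch of K-Means with a silhouette check. The two blobs are synthetic and deliberately well separated, and `n_clusters=2` is assumed to be known in advance rather than found with the elbow method.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(50, 2))   # synthetic cluster near (0, 0)
blob_b = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(50, 2))   # synthetic cluster near (5, 5)
points = np.vstack([blob_a, blob_b])

# Fit K-Means with no labels; it assigns each point to one of two clusters.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
labels = kmeans.labels_

# Silhouette score ranges from -1 to 1; values near 1 mean tight, well-separated clusters.
score = silhouette_score(points, labels)

print(kmeans.cluster_centers_)
print(score)
```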