Data Science & Machine Learning Assessment

Test your data science fundamentals: NumPy array operations, statistical calculations, data manipulation, and algorithm implementations.

#1 Descriptive Statistics
Write Code

Write a function describe(arr) that takes a 1-D NumPy array and returns a dictionary with keys "mean", "median", "std", and "range".

  • "mean" — arithmetic mean
  • "median" — median value
  • "std" — population standard deviation (ddof=0)
  • "range" — difference between max and min
  • All values should be Python floats (use .item() or float() on NumPy scalars).
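
One possible shape for describe(arr) is sketched below. It is a minimal sketch, not the assessment's reference solution; the sample array at the end is an illustrative assumption.

```python
import numpy as np

def describe(arr):
    """Summarize a 1-D NumPy array using plain Python floats."""
    arr = np.asarray(arr, dtype=float)
    return {
        "mean": float(arr.mean()),
        "median": float(np.median(arr)),
        "std": float(arr.std(ddof=0)),          # population standard deviation
        "range": float(arr.max() - arr.min()),  # max minus min
    }

# Illustrative input (not taken from the assessment)
stats = describe(np.array([2, 4, 4, 4, 5, 5, 7, 9]))
print(stats)
```

Wrapping each NumPy scalar in float() keeps the dictionary free of np.float64 values, which is what the last requirement asks for.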

#2 Fix the Normalization Bug
Fix the Bug

The function normalize(arr) should min-max normalize a 1-D NumPy array so that the smallest value maps to 0 and the largest maps to 1.

The code below has two bugs. Find and fix them.
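
The exercise's buggy code is not reproduced in this text, so the two bugs cannot be shown here. For reference, a minimal sketch of what a correct min-max normalization could look like follows. Returning zeros for a constant array is an assumption; the task statement does not say how that case should behave.

```python
import numpy as np

def normalize(arr):
    """Min-max normalize a 1-D array into the range [0, 1]."""
    arr = np.asarray(arr, dtype=float)
    lo, hi = arr.min(), arr.max()
    span = hi - lo
    if span == 0:
        # Assumption: a constant array maps to all zeros to avoid dividing by zero
        return np.zeros_like(arr)
    return (arr - lo) / span

scaled = normalize(np.array([2, 4, 6]))
print(scaled.tolist())
```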

#3 Predict the Broadcasting Result
Predict Output

What does the following code print?

    import numpy as np
    
    a = np.array([[1, 2], [3, 4]])
    b = np.array([10, 20])
    result = a + b
    print(result.tolist())
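
As background for this question, the sketch below illustrates the broadcasting rule on deliberately different arrays, so it does not give away the answer. NumPy aligns shapes from the trailing dimension: a shape (3,) array is stretched across every row of a (2, 3) array, and a shape (2, 1) array is stretched across every column. All names and values are illustrative assumptions.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)      # [[0, 1, 2], [3, 4, 5]]
row = np.array([100, 200, 300])     # shape (3,): added to each row of m
col = np.array([[10], [20]])        # shape (2, 1): added to each column of m

by_row = m + row
by_col = m + col

print(by_row.tolist())  # [[100, 201, 302], [103, 204, 305]]
print(by_col.tolist())  # [[10, 11, 12], [23, 24, 25]]
```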
#4 Moving Average
Write Code

Write a function moving_average(arr, window) that computes the simple moving average of a 1-D NumPy array.

  • Use a sliding window of size window.
  • Return a NumPy array of length len(arr) - window + 1.
  • Each element is the mean of the current window.
  • For example, moving_average(np.array([1,2,3,4,5]), 3) should return [2.0, 3.0, 4.0].
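
One way to meet these requirements is sketched below using sliding_window_view from numpy.lib.stride_tricks (available in NumPy 1.20 and later). This is a sketch rather than the expected answer; raising ValueError for an out-of-range window is an assumption, since the task does not specify that case.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def moving_average(arr, window):
    """Simple moving average over a 1-D array."""
    arr = np.asarray(arr, dtype=float)
    if not 1 <= window <= len(arr):
        # Assumption: reject windows that do not fit inside the array
        raise ValueError("window must be between 1 and len(arr)")
    # Each row of the view is one window; average across each row
    return sliding_window_view(arr, window).mean(axis=1)

result = moving_average(np.array([1, 2, 3, 4, 5]), 3)
print(result.tolist())
```

The view has shape (len(arr) - window + 1, window), so the output length matches the requirement without any explicit loop.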

#5 Fix the Matrix Multiply
Fix the Bug

The function mat_mul(a, b) should perform matrix multiplication of two 2-D NumPy arrays and return the result as a nested Python list.

There is one bug. Find and fix it.
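
The buggy code for this exercise is not included in this text. As a reference point only, the sketch below shows a correct matrix product next to the element-wise product, a distinction that is easy to get wrong in NumPy. Whether that is the bug in the exercise is not known; the sample matrices are illustrative.

```python
import numpy as np

def mat_mul(a, b):
    """Matrix-multiply two 2-D arrays and return a nested Python list."""
    a = np.asarray(a)
    b = np.asarray(b)
    return (a @ b).tolist()   # @ is matrix multiplication

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

product = mat_mul(a, b)
elementwise = (a * b).tolist()  # * multiplies entry by entry, not row by column

print(product)      # [[19, 22], [43, 50]]
print(elementwise)  # [[5, 12], [21, 32]]
```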

#6 Refactor Z-Score Calculation
Refactor

The function below computes z-scores using a manual Python loop. Refactor it to use vectorized NumPy operations instead. The function must return a Python list of floats.

Requirements:

  • Remove the explicit loop
  • Use NumPy vectorized arithmetic
  • The results must be identical
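
The loop-based function is not reproduced in this text, so the sketch below is only an illustration of the vectorized pattern. The name z_scores and the use of the population standard deviation (ddof=0) are assumptions; the refactored version should use whichever convention the original loop uses.

```python
import numpy as np

def z_scores(values):
    """Vectorized z-scores, returned as a Python list of floats."""
    arr = np.asarray(values, dtype=float)
    # Assumption: population standard deviation (NumPy's default, ddof=0)
    return ((arr - arr.mean()) / arr.std()).tolist()

scores = z_scores([1, 2, 3])
print(scores)
```

This sketch does not guard against a zero standard deviation (all values equal), where the division is undefined.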
#7 Predict Boolean Indexing Output
Predict Output

What does the following code print?

    import numpy as np
    
    data = np.array([15, 22, 8, 42, 3, 37, 19])
    mask = data > 20
    filtered = data[mask]
    print(filtered.tolist())
    print(mask.sum())
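
To illustrate the mechanism without revealing the answer, the sketch below applies the same two steps to different, made-up data. A comparison yields a boolean array, indexing with it keeps the positions marked True, and summing it counts those positions because True counts as 1.

```python
import numpy as np

temps = np.array([12, 31, 25, 8, 40])   # illustrative values
hot = temps > 24                         # [False, True, True, False, True]

kept = temps[hot].tolist()               # elements where the mask is True
count = int(hot.sum())                   # number of True entries

print(kept)   # [31, 25, 40]
print(count)  # 3
```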
#8 Simple Linear Regression
Write Code

Implement linear_regression(x, y) that computes the slope and intercept of the least-squares regression line for 1-D NumPy arrays x and y.

Use the closed-form formulas:

  • slope = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean)^2)
  • intercept = y_mean - slope * x_mean
  • Return a tuple (slope, intercept) with values rounded to 4 decimal places.
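
The closed-form formulas above translate almost directly into NumPy, as the sketch below suggests. Note that ^2 in the formula means squaring, which is ** 2 in Python. The sample data are an illustrative assumption chosen to lie exactly on y = 2x + 1.

```python
import numpy as np

def linear_regression(x, y):
    """Least-squares slope and intercept, rounded to 4 decimal places."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    slope = np.sum(dx * (y - y.mean())) / np.sum(dx ** 2)
    intercept = y.mean() - slope * x.mean()
    return (round(float(slope), 4), round(float(intercept), 4))

# Illustrative points on the line y = 2x + 1
fit = linear_regression(np.array([1, 2, 3, 4]), np.array([3, 5, 7, 9]))
print(fit)  # (2.0, 1.0)
```

The sketch does not handle the degenerate case where all x values are equal, in which the denominator is zero.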

#9 Fix the Distance Calculation
Fix the Bug

The function euclidean_distance(a, b) should compute the Euclidean distance between two 1-D NumPy arrays.

There is one bug. Find and fix it.
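
The buggy code is not included in this text. For comparison while debugging, a minimal sketch of a correct implementation is given below: square the differences, sum them, then take the square root. Returning a Python float is an assumption about the expected return type.

```python
import numpy as np

def euclidean_distance(a, b):
    """Euclidean (L2) distance between two 1-D arrays."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Assumption: return a plain Python float rather than a NumPy scalar
    return float(np.sqrt(np.sum((a - b) ** 2)))

d = euclidean_distance(np.array([0, 0]), np.array([3, 4]))
print(d)  # 5.0
```

np.linalg.norm(a - b) computes the same quantity and can serve as a cross-check.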

#10 Refactor One-Hot Encoding
Refactor

The function below builds a one-hot encoded matrix using nested Python loops. Refactor it to use NumPy operations instead.

Requirements:

  • Remove explicit loops
  • Use NumPy array creation / indexing to build the matrix
  • Return the result as a nested Python list (same output as before)
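
The original nested-loop function is not shown in this text, so its name and signature are unknown. The sketch below assumes a hypothetical one_hot(labels, num_classes) taking integer class labels, and illustrates one loop-free approach: index the rows of an identity matrix by label. Using integer entries is also an assumption; match the dtype of the original output.

```python
import numpy as np

def one_hot(labels, num_classes):
    """One-hot encode integer labels as a nested Python list (hypothetical signature)."""
    labels = np.asarray(labels, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i,
    # so fancy-indexing by the labels picks one row per label.
    return np.eye(num_classes, dtype=int)[labels].tolist()

encoded = one_hot([0, 2, 1], 3)
print(encoded)  # [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
```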