Gridscript

🧠 Supervised Learning

📘 What Is Supervised Learning?

Supervised Learning is a type of Machine Learning where the model learns from labeled data — that is, data with known input (features) and output (target) values.
The goal is to train a model that can predict the output for new, unseen inputs.

Examples:

📈 Linear Regression

Concept

Linear Regression is used to predict a continuous numeric value.
It assumes a linear relationship between input features (X) and the target variable (y).

Equation:

y = m * x + b

Where:

Example in Python

import numpy as np
from sklearn.linear_model import LinearRegression

# Training data
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([2, 4, 6, 8, 10])

# Create and train model
model = LinearRegression()
model.fit(X, y)

# Predict
prediction = model.predict([[6]])
print("Predicted value for x=6:", prediction)

Use case: Predicting prices, sales, or other continuous outcomes.

📊 Logistic Regression

Concept

Despite its name, Logistic Regression is used for classification, not regression.
It predicts probabilities that map input features to categorical outcomes (like yes/no, 0/1).

Equation:

P(y=1) = 1 / (1 + e^-(b0 + b1*x))

If P(y=1) > 0.5, classify as 1; otherwise, classify as 0.

Example in Python

from sklearn.linear_model import LogisticRegression

# Example data
X = [[1], [2], [3], [4], [5]]
y = [0, 0, 0, 1, 1]

model = LogisticRegression()
model.fit(X, y)

# Prediction
print(model.predict([[3.5]]))  # Output: 0 or 1

Use case: Spam detection, credit approval, disease diagnosis.

🌳 Decision Trees

Concept

Decision Trees split data into branches based on feature conditions, forming a tree-like model of decisions.
Each branch represents a rule, and each leaf represents an outcome.

Advantages:

Example in Python

from sklearn.tree import DecisionTreeClassifier

X = [[0, 0], [1, 1]]
y = [0, 1]

model = DecisionTreeClassifier()
model.fit(X, y)

print(model.predict([[0.5, 0.5]]))

Use case: Customer segmentation, loan approval, fraud detection.

🌲 Random Forests

Concept

A Random Forest is an ensemble method that builds multiple decision trees and combines their results (via averaging or voting).
This reduces overfitting and improves accuracy.

Advantages:

Example in Python

from sklearn.ensemble import RandomForestClassifier

X = [[0, 0], [1, 1], [1, 0], [0, 1]]
y = [0, 1, 1, 0]

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X, y)

print(model.predict([[0.2, 0.8]]))

Use case: Credit scoring, medical diagnosis, stock market predictions.

📍 K-Nearest Neighbors (KNN)

Concept

The K-Nearest Neighbors (KNN) algorithm classifies a data point based on how its neighbors are classified.
It looks at the ‘k’ closest data points and assigns the majority label among them.

Steps:

  1. Choose number of neighbors (k).
  2. Measure distance (usually Euclidean).
  3. Predict the most common class among neighbors.

Example in Python

from sklearn.neighbors import KNeighborsClassifier

X = [[0], [1], [2], [3]]
y = [0, 0, 1, 1]

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X, y)

print(model.predict([[1.5]]))  # Predict class for new value

Use case: Image recognition, recommendation systems, pattern classification.

⚙️ Support Vector Machines (SVM)

Concept

Support Vector Machines are powerful models that find the best boundary (hyperplane) separating different classes.
SVMs try to maximize the margin between data points of different classes.

Advantages:

Example in Python

from sklearn import svm

X = [[0, 0], [1, 1]]
y = [0, 1]

model = svm.SVC(kernel='linear')
model.fit(X, y)

print(model.predict([[0.5, 0.5]]))

Use case: Text classification, handwriting recognition, face detection.

🧠 Summary

AlgorithmTypeMain UseKey Idea
Linear RegressionRegressionPredict continuous valuesFits a straight line to data
Logistic RegressionClassificationBinary outcomesUses sigmoid function for probabilities
Decision TreeClassification/RegressionRule-based predictionsSplits data by conditions
Random ForestClassification/RegressionEnsemble learningCombines multiple trees
KNNClassificationBased on neighbor votesUses distance metrics
SVMClassificationSeparates classesMaximizes margin between data points

Supervised learning algorithms form the foundation of predictive analytics.
Choosing the right algorithm depends on the type of data, problem goal, and performance requirements.