Gridscript

📊 Data Visualization in Python

📘 Introduction

Data visualization is the process of representing data graphically to make insights easier to understand and communicate.
In Data Science, visualization helps reveal patterns, trends, and relationships that might not be obvious from raw data.

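Two of the most widely used libraries for visualization in Python are Matplotlib, the foundation library for flexible plotting, and Seaborn, a high-level API for attractive statistical graphics.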

📈 Plotting with Matplotlib

1. Line Plot

Line plots are used to visualize trends over time or continuous data.

import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [10, 20, 25, 30, 40]

plt.plot(x, y, marker='o')
plt.title("Line Plot Example")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.grid(True)
plt.show()

Use case: Displaying trends such as sales over time, temperature changes, or stock prices.
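Trend comparisons often involve more than one series. The following is a minimal sketch, assuming two hypothetical products whose monthly figures are invented purely for illustration. It uses Matplotlib's object-oriented interface (`plt.subplots()`), which makes it explicit which chart each call modifies.

```python
import matplotlib.pyplot as plt

# Hypothetical monthly sales figures, invented for illustration only.
months = ["Jan", "Feb", "Mar", "Apr", "May"]
product_a = [10, 20, 25, 30, 40]
product_b = [15, 18, 22, 21, 28]

# plt.subplots() returns a Figure and an Axes object.
fig, ax = plt.subplots()
line_a, = ax.plot(months, product_a, marker="o", label="Product A")
line_b, = ax.plot(months, product_b, marker="s", label="Product B")

ax.set_title("Monthly Sales by Product")
ax.set_xlabel("Month")
ax.set_ylabel("Units sold")
ax.grid(True)
ax.legend()  # builds the legend from the label= values passed to ax.plot()

plt.show()
```

Giving each series a distinct marker as well as a distinct color keeps the lines distinguishable in grayscale printouts.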

2. Bar Plot

Bar plots are used to compare quantities of different categories.

categories = ["A", "B", "C", "D"]
values = [23, 45, 56, 78]

plt.bar(categories, values, color='skyblue')
plt.title("Bar Chart Example")
plt.xlabel("Category")
plt.ylabel("Value")
plt.show()

Use case: Comparing sales by product type, or counts across different groups.
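When the goal is comparison, sorting the bars and labeling their heights can make the ranking easier to read. The sketch below reuses the category data from the example above; the sorting step and the use of `Axes.bar_label` (available in Matplotlib 3.4 and later) are illustrative choices, not requirements.

```python
import matplotlib.pyplot as plt

categories = ["A", "B", "C", "D"]
values = [23, 45, 56, 78]

# Sort the (category, value) pairs from largest to smallest value.
pairs = sorted(zip(categories, values), key=lambda pair: pair[1], reverse=True)
sorted_categories = [name for name, _ in pairs]
sorted_values = [value for _, value in pairs]

fig, ax = plt.subplots()
bars = ax.bar(sorted_categories, sorted_values, color="skyblue")

# Write each bar's height just above it (requires Matplotlib 3.4+).
ax.bar_label(bars, padding=3)

ax.set_title("Bar Chart Sorted by Value")
ax.set_xlabel("Category")
ax.set_ylabel("Value")

plt.show()
```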

3. Scatter Plot

Scatter plots show the relationship between two numeric variables.

x = [5, 7, 8, 10, 12]
y = [12, 14, 15, 18, 22]

plt.scatter(x, y, color='green')
plt.title("Scatter Plot Example")
plt.xlabel("Variable X")
plt.ylabel("Variable Y")
plt.show()

Use case: Finding correlations (e.g., age vs. income, study hours vs. exam scores).
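A scatter plot suggests a relationship visually; a correlation coefficient puts a number on it. As a rough sketch, the code below reuses the points from the example above, computes the Pearson correlation with `np.corrcoef`, and overlays a least-squares line from `np.polyfit`. The fitted line is an added illustration and is not part of the original example.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array([5, 7, 8, 10, 12])
y = np.array([12, 14, 15, 18, 22])

# np.corrcoef returns a 2x2 correlation matrix; the off-diagonal entry
# is the Pearson correlation between x and y.
r = np.corrcoef(x, y)[0, 1]

# Fit a straight line (degree-1 polynomial) by least squares.
slope, intercept = np.polyfit(x, y, 1)

fig, ax = plt.subplots()
ax.scatter(x, y, color="green", label="Observations")
ax.plot(x, slope * x + intercept, color="gray", linestyle="--", label="Linear fit")
ax.set_title(f"Scatter Plot with Linear Fit (r = {r:.2f})")
ax.set_xlabel("Variable X")
ax.set_ylabel("Variable Y")
ax.legend()

plt.show()
```

A value of r close to 1 indicates a strong positive linear relationship between the two variables, but it does not by itself show that one causes the other.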

4. Histogram

Histograms show the distribution of a single numeric variable.

import numpy as np

data = np.random.randn(1000)

plt.hist(data, bins=30, color='purple', alpha=0.7)
plt.title("Histogram Example")
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.show()

Use case: Understanding data distribution, such as exam scores, salaries, or customer ages.
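The `bins` argument strongly affects how a histogram reads: too few bins hide the shape, too many make it noisy. The sketch below draws the same sample with three bin counts side by side. The specific bin counts and the fixed seed (used only to make the output reproducible) are arbitrary choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Seeded generator so the same 1000 values are drawn on every run.
rng = np.random.default_rng(42)
data = rng.standard_normal(1000)

bin_choices = [5, 30, 200]
totals = {}

fig, axes = plt.subplots(1, 3, figsize=(12, 3))
for ax, bins in zip(axes, bin_choices):
    # ax.hist returns the per-bin counts and the bin edges it used.
    counts, edges, _ = ax.hist(data, bins=bins, color="purple", alpha=0.7)
    totals[bins] = (counts.sum(), len(edges))
    ax.set_title(f"bins={bins}")
    ax.set_xlabel("Value")

axes[0].set_ylabel("Frequency")
fig.tight_layout()
plt.show()
```

Whatever the bin count, the counts always add up to the number of observations, and the number of bin edges is one more than the number of bins.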

🎨 Plotting with Seaborn

Seaborn is built on top of Matplotlib and provides a high-level API for statistical graphics, so attractive, informative plots take less code.

import seaborn as sns
import matplotlib.pyplot as plt

# Example dataset
tips = sns.load_dataset("tips")

# Basic scatter plot
sns.scatterplot(x="total_bill", y="tip", data=tips)
plt.title("Total Bill vs Tip (Seaborn)")
plt.show()

1. Bar Plot (Seaborn)

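# In seaborn 0.13+, passing palette without hue is deprecated, so the x variable
# is also assigned to hue and the redundant legend is turned off.
sns.barplot(x="day", y="total_bill", hue="day", data=tips, palette="coolwarm", legend=False)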
plt.title("Average Total Bill by Day")
plt.show()
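By default, `sns.barplot` does not draw raw values: it aggregates the y variable within each x category (the mean, by default) and draws error bars around that estimate. The sketch below makes the aggregation visible with a pandas `groupby`. To stay self-contained it uses a small hand-made DataFrame (the name `bills` and its values are invented for illustration) rather than the `tips` dataset.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hand-made stand-in for the tips dataset; values invented for illustration.
bills = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Fri", "Sat", "Sat"],
    "total_bill": [10.0, 20.0, 12.0, 18.0, 25.0, 35.0],
})

# The aggregation sns.barplot applies by default: mean of y per x category.
# sort=False keeps the days in order of first appearance.
mean_by_day = bills.groupby("day", sort=False)["total_bill"].mean()
print(mean_by_day)

# Each bar height corresponds to one of the means printed above.
ax = sns.barplot(x="day", y="total_bill", data=bills)
ax.set_title("Average Total Bill by Day")

plt.show()
```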

2. Histogram / Distribution Plot

# kde=True overlays a smoothed kernel density estimate on top of the bars
sns.histplot(tips["total_bill"], bins=20, kde=True, color="orange")
plt.title("Distribution of Total Bill")
plt.show()

3. Box Plot

Box plots show data spread, median, and outliers.

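# In seaborn 0.13+, passing palette without hue is deprecated, so the x variable
# is also assigned to hue and the redundant legend is turned off.
sns.boxplot(x="day", y="total_bill", hue="day", data=tips, palette="Set2", legend=False)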
plt.title("Box Plot of Total Bill by Day")
plt.show()
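The elements of a box plot can be computed by hand, which helps when reading one. The sketch below uses NumPy on a small array of hypothetical bill amounts (invented for illustration) and applies the common 1.5 × IQR rule, which is the default whisker setting in both Matplotlib and Seaborn, to flag outliers.

```python
import numpy as np

# Hypothetical bill amounts, invented for illustration only.
bills = np.array([8, 12, 14, 15, 16, 18, 20, 22, 24, 60])

# The box spans the first to third quartile; the line inside is the median.
q1, median, q3 = np.percentile(bills, [25, 50, 75])
iqr = q3 - q1  # interquartile range: the height of the box

# Points beyond 1.5 * IQR from the box are drawn as individual outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = bills[(bills < lower_fence) | (bills > upper_fence)]

print(q1, median, q3)            # 14.25 17.0 21.5
print(lower_fence, upper_fence)  # 3.375 32.375
print(outliers)                  # [60]
```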

4. Pair Plot

Pair plots show pairwise relationships across multiple variables.

sns.pairplot(tips, hue="sex", palette="husl")
plt.show()

🧭 Understanding How to Tell a Story with Charts

Creating charts isn't just about displaying data; it's about communicating insights clearly.
A good data visualization tells a story that highlights key findings and patterns.

1. Define Your Purpose

Ask: What do I want to show?

2. Keep It Simple

Avoid clutter and unnecessary decorations. Focus on clarity.
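One common way to reduce clutter in Matplotlib is to hide the top and right borders (spines) of the plot and drop the grid when it adds nothing. The sketch below is one possible approach under that assumption, not a fixed rule; the data is the same illustrative series used in the line plot example.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [10, 20, 25, 30, 40]

fig, ax = plt.subplots()
ax.plot(x, y, marker="o")

# Hide the top and right borders; the left and bottom axes remain.
for side in ("top", "right"):
    ax.spines[side].set_visible(False)

# Show only the ticks that correspond to actual data points.
ax.set_xticks(x)

ax.set_title("Line Plot with Reduced Clutter")
ax.set_xlabel("X-axis")
ax.set_ylabel("Y-axis")

plt.show()
```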

3. Highlight Key Insights

Use color or annotations to emphasize important points.

x = [1, 2, 3, 4, 5]
y = [10, 20, 15, 25, 30]

plt.plot(x, y, marker='o', color='blue')
plt.title("Sales Over Time")
plt.xlabel("Month")
plt.ylabel("Sales ($)")

# Highlight the maximum point
max_index = y.index(max(y))
plt.annotate(
    "Peak Sales",
    xy=(x[max_index], y[max_index]),                 # the point being highlighted
    xytext=(x[max_index] - 1.5, y[max_index] - 1),   # label kept inside the axes, left of the peak
    arrowprops=dict(facecolor='red', shrink=0.05),
)

plt.show()
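Color can do the same job as an annotation. As a sketch, the code below draws the same sales figures as bars, colors everything a neutral gray, and gives only the peak month a contrasting color. The color names are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt

months = [1, 2, 3, 4, 5]
sales = [10, 20, 15, 25, 30]

# Neutral color for every bar, then one accent color for the maximum.
max_index = sales.index(max(sales))
colors = ["lightgray"] * len(sales)
colors[max_index] = "tab:red"

fig, ax = plt.subplots()
bars = ax.bar(months, sales, color=colors)

ax.set_title("Sales Over Time (Peak Month Highlighted)")
ax.set_xlabel("Month")
ax.set_ylabel("Sales ($)")

plt.show()
```

Muting everything except the key point directs the reader's eye without adding extra text to the chart.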

4. Choose the Right Chart Type

| Goal | Recommended Chart |
| --- | --- |
| Compare categories | Bar Chart |
| Show trend over time | Line Plot |
| Show data distribution | Histogram |
| Show correlation | Scatter Plot |
| Show part-to-whole | Pie Chart (use sparingly) |

🧠 Summary

| Concept | Description |
| --- | --- |
| Matplotlib | Foundation library for flexible plotting |
| Seaborn | High-level API for attractive statistical graphics |
| Line Plot | Shows trends over time |
| Bar Plot | Compares categories |
| Scatter Plot | Reveals relationships between variables |
| Histogram | Shows data distribution |
| Storytelling | Focus on clear insights, simplicity, and appropriate chart choice |

Effective data visualization transforms complex data into actionable insights.
Use charts to inform, not overwhelm, and always design with your audience in mind.