Data Visualization Techniques

Duration: 5 min

Data visualization turns raw data into charts that make patterns, trends, and outliers easier to see. A well-chosen chart shows the reader what the data says and supports better decision-making.

Scatter Plots

Purpose

Show relationships between two continuous variables and identify correlations, trends, and outliers.

Example: Positive Correlation

Scatter Plot: Height vs Weight

Weight (kg)
    │
 90 │          ● ●
    │        ●  ●
 80 │      ●  ●
    │    ●  ●
 70 │  ●  ●
    │ ●
 60 └─────────────
    160  170  180
    Height (cm)

Interpretation: Taller people tend to weigh more

Example: No Correlation

Scatter Plot: Age vs Shoe Size

Shoe Size
    │
  12│  ●     ●
    │    ●  ●
  10│  ●   ●
    │   ●    ●
   8│●        ●
    │
   6└─────────────
    20   40   60
    Age (years)

Interpretation: No clear relationship
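The two interpretations above can also be checked numerically. The following is a minimal sketch, assuming synthetic data (the variable names and the random seed are illustrative, not part of the lesson), that uses Pearson's correlation coefficient r: values near 1 indicate a strong positive relationship, values near 0 indicate no linear relationship.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is repeatable

# Hypothetical data: y_related depends on x, y_unrelated does not
x = rng.uniform(0, 100, size=200)
y_related = x + rng.normal(0, 10, size=200)
y_unrelated = rng.uniform(0, 100, size=200)

# np.corrcoef returns a 2x2 matrix; the off-diagonal entry is Pearson's r
r_related = np.corrcoef(x, y_related)[0, 1]
r_unrelated = np.corrcoef(x, y_unrelated)[0, 1]

print(f"related:   r = {r_related:.2f}")
print(f"unrelated: r = {r_unrelated:.2f}")
```

Pearson's r only measures linear association, so a scatter plot is still worth drawing: a curved relationship can produce an r close to 0.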

Python Code

import matplotlib.pyplot as plt
import numpy as np

# Generate correlated data
x = np.random.rand(50) * 100
y = x + np.random.randn(50) * 10

# Create scatter plot
plt.scatter(x, y, alpha=0.6, s=100, color='blue')
plt.xlabel('Variable X')
plt.ylabel('Variable Y')
plt.title('Scatter Plot: Relationship Analysis')
plt.grid(True, alpha=0.3)
plt.show()

Histograms

Purpose

Display the distribution of a single variable across bins to identify patterns, skewness, and outliers.

Example: Normal Distribution

Histogram: Test Scores

Frequency
    │
 30 │        ╱╲
    │       ╱  ╲
 20 │      ╱    ╲
    │     ╱      ╲
 10 │    ╱        ╲
    │   ╱          ╲
  0 └──────────────────
    40  50  60  70  80  90
    Score

Interpretation: Bell-shaped, most scores around 65

Example: Skewed Distribution

Histogram: Income Distribution

Frequency
    │
 50 │ ╱
    │ │
 40 │ │
    │ │
 30 │ │
    │ │  ╱
 20 │ │  │
    │ │  │  ╱
 10 │ │  │  │
    │ │  │  │  ╱
  0 └─────────────
    0  20  40  60  80
    Income ($1000s)

Interpretation: Right-skewed. Most people earn less, with a long tail of high earners
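Right skew can also be detected without a plot. The sketch below assumes hypothetical income data drawn from a log-normal distribution (the parameters are illustrative only) and checks two common signs of right skew: the mean sits above the median, and the sample skewness is positive.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is repeatable

# Hypothetical right-skewed "income" data (in $1000s) drawn from a log-normal
incomes = rng.lognormal(mean=3.5, sigma=0.6, size=5000)

mean_income = incomes.mean()
median_income = np.median(incomes)

# Sample skewness: the average cubed z-score (positive means a longer right tail)
z = (incomes - mean_income) / incomes.std()
skewness = np.mean(z ** 3)

print(f"mean = {mean_income:.1f}, median = {median_income:.1f}, skewness = {skewness:.2f}")
```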

Python Code

import matplotlib.pyplot as plt
import numpy as np

# Generate data
data = np.random.normal(loc=70, scale=15, size=1000)

# Create histogram
plt.hist(data, bins=30, alpha=0.7, color='green', edgecolor='black')
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Histogram: Distribution Analysis')
plt.grid(True, alpha=0.3, axis='y')
plt.show()

Line Charts

Purpose

Show how a value changes over time or along another ordered, continuous axis.

Example: Time Series

Line Chart: Stock Price Over Time

Price ($)
    │
 150│        ╱╲
    │       ╱  ╲
 140│      ╱    ╲╱╲
    │     ╱        ╲
 130│    ╱          ╲
    │   ╱            ╲
 120└──────────────────
    Jan Feb Mar Apr May
    Month

Interpretation: Price rises then falls

Python Code

import matplotlib.pyplot as plt
import numpy as np

# Generate time series data
months = np.arange(1, 13)
sales = np.array([100, 120, 115, 140, 160, 155, 180, 190, 175, 200, 210, 220])

# Create line chart
plt.plot(months, sales, marker='o', linewidth=2, markersize=8, color='red')
plt.xlabel('Month')
plt.ylabel('Sales ($1000s)')
plt.title('Line Chart: Sales Trend')
plt.grid(True, alpha=0.3)
plt.xticks(months)
plt.show()

Bar Charts

Purpose

Compare values across different categories.

Example: Category Comparison

Bar Chart: Sales by Region

Sales ($1000s)
    │
 300│ ┌─┐
    │ │ │
 200│ │ │ ┌─┐
    │ │ │ │ │ ┌─┐
 100│ │ │ │ │ │ │
    │ │ │ │ │ │ │
   0└─┴─┴─┴─┴─┴─┴─
       N   S   E
    Region

Interpretation: North region has highest sales

Python Code

import matplotlib.pyplot as plt

# Data
regions = ['North', 'South', 'East', 'West']
sales = [300, 150, 200, 180]

# Create bar chart
plt.bar(regions, sales, color=['red', 'blue', 'green', 'orange'])
plt.xlabel('Region')
plt.ylabel('Sales ($1000s)')
plt.title('Bar Chart: Regional Sales Comparison')
plt.grid(True, alpha=0.3, axis='y')
plt.show()

Box Plots

Purpose

Show distribution, median, quartiles, and outliers.

Example: Box Plot Structure

Box Plot: Data Distribution

Value
    │
 100│    ●     (outlier)
    │   ─┬─    (upper whisker)
  80│ ┌──┴──┐  (Q3)
    │ │     │
  60│ ├─────┤  (median)
    │ │     │
  40│ └──┬──┘  (Q1)
    │   ─┴─    (lower whisker)
  20│    ●     (outlier)
    │
   0└──────────
    Data

Components:
- Box: Q1 to Q3 (middle 50%)
- Line in box: Median
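- Whiskers: Smallest and largest values that are not outliers (matplotlib's default cutoff is 1.5 × IQR beyond the box)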
- Dots: Outliers
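The components listed above can be computed by hand. This is a minimal sketch, assuming a small made-up sample and the 1.5 × IQR rule that matplotlib applies by default (`whis=1.5`); other tools may use a different outlier rule.

```python
import numpy as np

# Small hypothetical sample with one unusually low and one unusually high value
values = np.array([10, 42, 45, 50, 55, 58, 60, 62, 65, 70, 75, 78, 110])

# Box: Q1 to Q3, with the median as the line inside it
q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1

# Points beyond these fences are drawn as individual outlier dots
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

inside = values[(values >= lower_fence) & (values <= upper_fence)]
outliers = values[(values < lower_fence) | (values > upper_fence)]

# Whiskers end at the most extreme data points that are still inside the fences
whisker_low, whisker_high = inside.min(), inside.max()

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}")
print(f"whiskers: {whisker_low} to {whisker_high}, outliers: {outliers}")
```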

Python Code

import matplotlib.pyplot as plt
import numpy as np

# Generate data
data = [np.random.normal(50, 15, 100) for _ in range(4)]

# Create box plot
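plt.boxplot(data)
# Boxes sit at x positions 1..4; labeling them via xticks works on all
# matplotlib versions (the boxplot `labels` argument is deprecated since 3.9)
plt.xticks([1, 2, 3, 4], ['Group A', 'Group B', 'Group C', 'Group D'])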
plt.ylabel('Value')
plt.title('Box Plot: Distribution Comparison')
plt.grid(True, alpha=0.3, axis='y')
plt.show()

Heatmaps

Purpose

Show intensity/magnitude of values in a 2D matrix using colors.

Example: Correlation Heatmap

Heatmap: Correlation Matrix

        X    Y    Z
    ┌─────────────┐
  X │ 1.0  0.8  0.2│
    ├─────────────┤
  Y │ 0.8  1.0  0.5│
    ├─────────────┤
  Z │ 0.2  0.5  1.0│
    └─────────────┘

Color scale: Red (strong positive) → Blue (strong negative)
Interpretation: X and Y are highly correlated
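A heatmap like the one above is easier to read when each cell also shows its value. The sketch below is one way to do that, assuming synthetic variables in which Y is deliberately built from X so that the two correlate at roughly 0.8 (the construction and seed are illustrative assumptions).

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is repeatable

# Hypothetical variables: Y is built from X, Z is independent noise
x = rng.normal(size=200)
y = 0.8 * x + 0.6 * rng.normal(size=200)
z = rng.normal(size=200)
labels = ['X', 'Y', 'Z']
corr = np.corrcoef([x, y, z])  # each row is one variable

fig, ax = plt.subplots()
image = ax.imshow(corr, cmap='RdYlBu_r', vmin=-1, vmax=1)
fig.colorbar(image, ax=ax, label='Correlation')
ax.set_xticks(list(range(len(labels))))
ax.set_xticklabels(labels)
ax.set_yticks(list(range(len(labels))))
ax.set_yticklabels(labels)

# Write each coefficient inside its cell (text x = column, text y = row)
for row in range(corr.shape[0]):
    for col in range(corr.shape[1]):
        ax.text(col, row, f"{corr[row, col]:.2f}", ha='center', va='center')

ax.set_title('Heatmap: Correlation Matrix (annotated)')
```

Call `plt.show()` afterwards to display the figure.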

Python Code

import matplotlib.pyplot as plt
import numpy as np

# Create correlation matrix
data = np.random.randn(100, 3)
corr = np.corrcoef(data.T)

# Create heatmap
plt.imshow(corr, cmap='RdYlBu_r', vmin=-1, vmax=1)
plt.colorbar(label='Correlation')
plt.xticks([0, 1, 2], ['X', 'Y', 'Z'])
plt.yticks([0, 1, 2], ['X', 'Y', 'Z'])
plt.title('Heatmap: Correlation Matrix')
plt.show()

Best Practices

- Choose the right chart type for your data
- Label axes clearly with units
- Use colors meaningfully (not just for aesthetics)
- Avoid clutter: remove unnecessary elements
- Provide context with titles and legends
- Consider your audience: simplify for general viewers
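As an illustration of the practices above, here is a minimal sketch of a single chart that applies several of them. The two product series are made-up data, and the styling choices are one reasonable option rather than a rule.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical monthly sales for two product lines, in $1000s
months = np.arange(1, 13)
product_a = np.array([100, 120, 115, 140, 160, 155, 180, 190, 175, 200, 210, 220])
product_b = np.array([90, 95, 105, 100, 110, 120, 118, 125, 130, 128, 140, 145])

fig, ax = plt.subplots()

# Right chart type: a line chart, because the x-axis is time.
# Meaningful color: it only distinguishes the two series.
ax.plot(months, product_a, marker='o', color='tab:blue', label='Product A')
ax.plot(months, product_b, marker='s', color='tab:orange', label='Product B')

# Clear axis labels with units, plus a title and legend for context
ax.set_xlabel('Month (1 = January)')
ax.set_ylabel('Sales ($1000s)')
ax.set_title('Monthly Sales by Product')
ax.legend()
ax.set_xticks(months)

# Less clutter: hide the top and right borders
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
```

Call `plt.show()` afterwards to display the figure.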

❓ What is the primary purpose of a scatter plot?

- To display the distribution of a single variable
- To show the relationship between two variables
- To compare different categories
- To visualize time-series data

❓ Which chart type is best for showing distribution?

- Histogram
- Bar chart
- Line chart
- Scatter plot

❓ What does a box plot show?

- Only the mean value
- Only the maximum value
- Median, quartiles, and outliers
- Only the minimum value

❓ Which chart type is best for showing trends over time?

- Scatter plot
- Line chart
- Histogram
- Heatmap
