How to create a legend in matplotlib

Understanding Legends in Data Visualization

When you're starting out with programming and data visualization, one of the tools you'll likely encounter is Matplotlib, a plotting library in Python that enables you to create a wide range of static, animated, and interactive visualizations. A key component of these visualizations is the legend—a small area on the plot that explains the symbols, colors, or line types used on the chart.

Think of a legend as a map's key. Just as a map key explains what each symbol or color means (like how a dashed line might represent a boundary or a blue area might signify water), a legend in a plot tells you what each plot element represents. This is especially useful when you have multiple datasets or categories shown in one plot.

Starting with a Simple Plot

Before we add a legend, let's start by creating a simple plot. To do this, we need to use Matplotlib's pyplot module, which we'll import under the alias plt for convenience.

import matplotlib.pyplot as plt

# Sample data
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

# Creating a simple line plot
plt.plot(x, y)

# Display the plot
plt.show()

In this example, we create a basic line plot using the x and y lists as data points. When you run this code as a script, a window opens showing a line graph of the data. In a Jupyter notebook, the plot typically appears inline below the cell instead.

Adding the First Legend

To add a legend, we'll need to label our plot elements when creating them. Then we'll call the legend() function.

# Sample data
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

# Creating a line plot with a label
plt.plot(x, y, label='Prime Numbers')

# Adding the legend
plt.legend()

# Display the plot
plt.show()

Here, we've added label='Prime Numbers' to our plot() function. This label is what will appear in the legend. The plt.legend() line tells Matplotlib to display the legend. You'll now see a box with the label 'Prime Numbers' next to a line that represents the data on the plot.
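Setting label inside plot() is not the only way to name a legend entry. As a minimal sketch of an alternative, legend() also accepts an explicit list of handles (the objects returned by plotting calls) together with a list of label strings. The Agg backend line is an assumption made here so the snippet runs without a display; remove it and call plt.show() to open a window as in the examples above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs anywhere
import matplotlib.pyplot as plt

# Same sample data as above
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

fig, ax = plt.subplots()

# plot() returns a list of Line2D objects; unpack the single line
(line,) = ax.plot(x, y)

# Pass the handle and its label explicitly instead of using label=...
legend = ax.legend([line], ["Prime Numbers"])

# Collect the text that the legend will display
legend_labels = [text.get_text() for text in legend.get_texts()]
print(legend_labels)
```

Labeling at plot time is usually simpler. The explicit form is mainly useful when you want the legend to show only some of the plotted elements.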

Customizing the Legend

Matplotlib allows you to customize the legend's location and appearance. You might want to move your legend to a different part of the plot area because it's blocking some data, or you might want to change its style.

# Customizing the legend's location
plt.legend(loc='upper left')

# Customizing the legend's appearance
# (calling legend() again replaces the legend created above)
plt.legend(loc='best', frameon=False, fontsize='large')

The loc parameter can take values like 'upper left', 'lower right', etc., or 'best', which is the default and places the legend where it overlaps the plotted data the least. frameon=False removes the box around the legend, and fontsize adjusts the text size.
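To see these options applied to an actual plot, here is a small self-contained sketch. It reuses the prime-number data from earlier and reads the settings back from the legend object to confirm they took effect. The Agg backend line is an assumption so the snippet runs without a display; remove it and add plt.show() to view the result.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs anywhere
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

fig, ax = plt.subplots()
ax.plot(x, y, label="Prime Numbers")

# Place the legend in the lower right, without a frame, in larger text
legend = ax.legend(loc="lower right", frameon=False, fontsize="large")

# Read the settings back from the legend object
frame_drawn = legend.get_frame_on()                  # whether the box is drawn
label_size = legend.get_texts()[0].get_fontsize()    # label size in points
default_size = plt.rcParams["font.size"]             # default text size in points
print(frame_drawn, label_size, default_size)
```

The line in this data rises from the lower left to the upper right, so the lower right corner is empty and a fixed location works well. When you are unsure where the data will fall, 'best' is the safer choice.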

Plotting Multiple Lines with Legends

Let's plot multiple lines, each representing a different sequence of numbers.

# Sample data
x = [1, 2, 3, 4, 5]
primes = [2, 3, 5, 7, 11]
fibonacci = [1, 1, 2, 3, 5]

# Creating multiple line plots with labels
plt.plot(x, primes, label='Prime Numbers')
plt.plot(x, fibonacci, label='Fibonacci Sequence')

# Adding the legend
plt.legend()

# Display the plot
plt.show()

Now, our legend contains two items: 'Prime Numbers' and 'Fibonacci Sequence', each corresponding to a line on the plot.
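With several lines on one plot, it can help to control the order of the entries or give the legend a heading. The sketch below is one way to do that, assuming the same primes and fibonacci data: it collects the labeled lines with get_legend_handles_labels(), reverses their order, and adds a title. The title text 'Sequences' and the Agg backend line are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs anywhere
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
primes = [2, 3, 5, 7, 11]
fibonacci = [1, 1, 2, 3, 5]

fig, ax = plt.subplots()
ax.plot(x, primes, label="Prime Numbers")
ax.plot(x, fibonacci, label="Fibonacci Sequence")

# Gather every labeled element, in the order it was plotted
handles, labels = ax.get_legend_handles_labels()

# Reverse the order of the entries and give the legend a heading
legend = ax.legend(handles[::-1], labels[::-1], title="Sequences")

# Collect the entries in the order the legend displays them
shown = [text.get_text() for text in legend.get_texts()]
print(labels)
print(shown)
```

By default the entries follow plotting order, so reordering is only needed when a different order reads better, for example to match the vertical order of the lines at the right edge of the plot.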

Legends with Different Plot Types

Legends are not limited to line plots. They can be used with bar charts, scatter plots, and more. Let's create a scatter plot and add a legend to it.

# Sample data for scatter plot
x = [1, 2, 3, 4, 5]
y = [10, 20, 25, 30, 40]

# Creating a scatter plot with a label
plt.scatter(x, y, label='Data Points')

# Adding the legend
plt.legend()

# Display the plot
plt.show()

The process is similar to adding a legend to a line plot. We use the scatter() function for creating scatter plots, and we still use label to define the legend's text.
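The same label argument also works when different plot types share one set of axes. As a rough sketch, the example below draws a bar chart and a line on the same plot and lets a single legend describe both. The trend values and both label strings are made up for illustration, and the Agg backend line is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs anywhere
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
bar_heights = [10, 20, 25, 30, 40]
trend = [12, 18, 24, 30, 36]  # hypothetical values for illustration

fig, ax = plt.subplots()

# A bar chart gets one legend entry for the whole set of bars
ax.bar(x, bar_heights, label="Bar Values")

# A line drawn over the bars gets its own entry
ax.plot(x, trend, color="black", marker="o", label="Trend Line")

legend = ax.legend()

# Sorted so the result does not depend on the legend's entry order
legend_labels = sorted(text.get_text() for text in legend.get_texts())
print(legend_labels)
```

In the legend, the bar entry is drawn as a filled rectangle and the line entry as a short line with a marker, so each entry matches how the data appears on the plot.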

Handling Overlapping Legends

Sometimes, a legend might overlap with your data, making the plot hard to read. You can make the legend's background partly transparent, so the data underneath stays visible, by using the framealpha parameter.

# Adding a legend with transparency
plt.legend(loc='best', framealpha=0.5)

framealpha controls the transparency of the legend's background, from 0 (fully transparent) to 1 (fully opaque). A value of 0.5 makes the background 50% transparent.
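Here is a small self-contained sketch of that setting in context. It deliberately places the legend in the center of the plot, where it overlaps the line, and then reads the transparency back from the legend's frame. The choice of loc='center' and the Agg backend line are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs anywhere
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

fig, ax = plt.subplots()
ax.plot(x, y, label="Prime Numbers")

# Force an overlap by centering the legend, then make its background see-through
legend = ax.legend(loc="center", framealpha=0.5)

# The frame is the rectangle behind the legend text
frame_alpha = legend.get_frame().get_alpha()
print(frame_alpha)
```

Transparency is a fallback for cases where no free corner exists. If the legend can be moved to an empty area with loc, that usually gives a cleaner result.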

Conclusion

Legends are an essential part of data visualization, as they help to distinguish between different data series or categories. With Matplotlib, creating and customizing legends is straightforward, allowing you to enhance the readability and professionalism of your plots.

By following the examples provided, you can add legends to your plots, customize their appearance, and make sure they complement your data rather than obscure it. As you continue to explore Matplotlib and data visualization, remember that the goal is to communicate information clearly and effectively. A well-placed and well-designed legend is a small but powerful tool in achieving that goal.

As you embark on your programming journey, think of each plot as a story you're telling with data. The legend is the guide that helps your audience follow along. With practice, you'll learn to create legends that not only inform but also contribute to the aesthetic appeal of your visualizations. Happy plotting!