# How to make a scatter plot in matplotlib

## Understanding Scatter Plots

Before diving into code, let's understand what a scatter plot is. Imagine you have a handful of pebbles and you throw them on the ground. Each pebble represents a point on the ground, and together, they form a pattern. In the world of data, a scatter plot is a type of graph that shows the relationship between two variables by displaying points at the intersection of their values. It's like a snapshot of where each pebble (or data point) falls on an imaginary grid based on two characteristics.

To create a scatter plot in Python, we'll use a library called Matplotlib. Think of Matplotlib as a box of crayons that lets you draw different types of graphs. Before we can start drawing, we need to make sure we have these crayons ready to use. If you haven't already, install Matplotlib by running this command in your terminal or command prompt:

```bash
pip install matplotlib
```

Now, let's get our canvas ready by importing the necessary tools from Matplotlib:

```python
import matplotlib.pyplot as plt
```

Here, `plt` is a common shorthand for Matplotlib's plotting module. It's like giving a nickname to your favorite crayon so you can quickly grab it when you need it.
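If you want to confirm that the install and import worked before drawing anything, a quick check along these lines can help. This is only a sketch: the version number and backend name it prints depend on your machine.

```python
import matplotlib
import matplotlib.pyplot as plt

# The installed version; the exact number depends on your environment.
version = matplotlib.__version__
print("Matplotlib version:", version)

# The backend is the part of Matplotlib that draws windows or image files.
backend = matplotlib.get_backend()
print("Active backend:", backend)
```

If both lines print without an error, your crayons are ready.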

Let's start with something simple. We'll create a scatter plot from a small, made-up dataset. Imagine you're plotting the ages and favorite ice cream flavors of a group of people. Each flavor is represented by a number (just for simplicity), and each age by the actual age in years.

Here's how you can create this scatter plot:

```python
# Sample data: ages and favorite ice cream flavors (as numbers)
ages = [25, 36, 47, 58, 22, 34, 49, 28]
flavors = [1, 3, 2, 5, 2, 4, 4, 1]

# Create a scatter plot
plt.scatter(ages, flavors)

# Add a title and labels to the axes
plt.title("Favorite Ice Cream Flavor by Age")
plt.xlabel("Age")
plt.ylabel("Ice Cream Flavor (Number)")

# Show the plot
plt.show()
```

When you run this code, you'll see a graph with dots scattered around. Each dot represents a person's age and their corresponding ice cream flavor preference.
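The same plot can also be built with Matplotlib's object-oriented interface, where you hold on to the figure and axes explicitly instead of letting `plt` track them for you. The sketch below repeats the sample data so it runs on its own; the variable names `fig`, `ax`, and `points` are just conventions chosen here.

```python
import matplotlib.pyplot as plt

# Same sample data as in the basic example
ages = [25, 36, 47, 58, 22, 34, 49, 28]
flavors = [1, 3, 2, 5, 2, 4, 4, 1]

# plt.subplots() returns a Figure (the whole canvas) and an Axes (one plot area)
fig, ax = plt.subplots()

# ax.scatter draws the same dots as plt.scatter, but on an explicit Axes
points = ax.scatter(ages, flavors)

# The Axes methods use a set_ prefix instead of plt.title, plt.xlabel, plt.ylabel
ax.set_title("Favorite Ice Cream Flavor by Age")
ax.set_xlabel("Age")
ax.set_ylabel("Ice Cream Flavor (Number)")

# plt.show()  # uncomment to open the plot window
```

This style tends to pay off once you have more than one plot on the same canvas, because every call says exactly which axes it is changing.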

Now that you've made a basic scatter plot, let's make it prettier and more informative. We can change the color and size of the points, add a grid for easier reading, and more.

### Changing Point Colors and Sizes

Suppose you want to color the points based on another variable, like the number of scoops each person orders. A darker color could mean more scoops. The size of each point could then represent how much that person loves ice cream.

Here's how you can do that:

```python
# Additional data: number of scoops and love for ice cream
scoops = [2, 3, 1, 5, 4, 2, 3, 1]
love_for_ice_cream = [7, 9, 6, 10, 8, 5, 9, 6]

# Create a scatter plot with colors and sizes based on additional data
plt.scatter(ages, flavors, c=scoops, cmap='Greys', s=love_for_ice_cream)

# Add a color bar to show the color scale
plt.colorbar(label='Number of Scoops')

# Add a title and labels to the axes
plt.title("Favorite Ice Cream Flavor by Age with Scoops and Love")
plt.xlabel("Age")
plt.ylabel("Ice Cream Flavor (Number)")

# Show the plot
plt.show()
```

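In this plot, the colors and sizes of the points give us more information at a glance. The `c` parameter supplies the values that decide each point's color, and `cmap` chooses the color palette those values are mapped onto (here, 'Greys' gives shades of gray, with higher values drawn darker). The `s` parameter sets each point's size as an area in points squared, so small numbers like ours produce small dots.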
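Because `s` is an area in points squared (the default marker is 36), scores between 5 and 10 give fairly tiny dots. One way to make the size differences easier to see is to multiply the scores by a constant first. In this sketch, `SCALE = 20` is an arbitrary value picked for illustration, and the black edge color is an optional extra that keeps the palest points visible on a white background.

```python
import matplotlib.pyplot as plt

ages = [25, 36, 47, 58, 22, 34, 49, 28]
flavors = [1, 3, 2, 5, 2, 4, 4, 1]
scoops = [2, 3, 1, 5, 4, 2, 3, 1]
love_for_ice_cream = [7, 9, 6, 10, 8, 5, 9, 6]

# SCALE is a hypothetical factor chosen for illustration; tune it to taste
SCALE = 20
sizes = [SCALE * score for score in love_for_ice_cream]

fig, ax = plt.subplots()

# edgecolors draws an outline so light gray points do not vanish
points = ax.scatter(ages, flavors, c=scoops, cmap="Greys",
                    s=sizes, edgecolors="black")

# The color bar needs to know which set of points it describes
fig.colorbar(points, ax=ax, label="Number of Scoops")

# plt.show()  # uncomment to open the plot window
```

A score of 10 now becomes a marker area of 200, which is large enough to tell apart from a score of 5 at a glance.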

### Adding a Grid and Customizing Axes

Sometimes, a grid can help us read the scatter plot more accurately. Let's add one and play with the axes a bit:

```python
# Create a scatter plot with a grid
plt.scatter(ages, flavors, c=scoops, cmap='Greys', s=love_for_ice_cream)
plt.colorbar(label='Number of Scoops')

# Customize axes limits
plt.xlim(20, 60)  # Set x-axis limits
plt.ylim(0, 6)    # Set y-axis limits

plt.grid(True, which='both', linestyle='--', linewidth=0.5)

# Add a title and labels to the axes
plt.title("Favorite Ice Cream Flavor by Age with Scoops and Love")
plt.xlabel("Age")
plt.ylabel("Ice Cream Flavor (Number)")

# Show the plot
plt.show()
```

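The `xlim` and `ylim` functions set the range of the x-axis and y-axis. The `grid` function draws grid lines: `linestyle='--'` makes them dashed and `linewidth=0.5` keeps them thin. With `which='both'`, the style applies to major and minor grid lines, although minor lines only appear when minor ticks are turned on.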
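Two further axis tweaks are sketched below: turning on minor ticks along the age axis so a finer grid has something to attach to, and swapping the flavor numbers on the y-axis for names. The flavor names in `flavor_names` are invented for this example, since the original data only uses numbers.

```python
import matplotlib.pyplot as plt
from matplotlib.ticker import AutoMinorLocator

ages = [25, 36, 47, 58, 22, 34, 49, 28]
flavors = [1, 3, 2, 5, 2, 4, 4, 1]

# Hypothetical mapping from flavor numbers to names, for illustration only
flavor_names = {1: "Vanilla", 2: "Chocolate", 3: "Strawberry",
                4: "Mint", 5: "Pistachio"}

fig, ax = plt.subplots()
ax.scatter(ages, flavors)
ax.set_xlim(20, 60)
ax.set_ylim(0, 6)

# Put one tick at each flavor number and label it with the flavor name
ax.set_yticks(list(flavor_names.keys()))
ax.set_yticklabels(list(flavor_names.values()))

# Add minor ticks on the age axis so minor grid lines can be drawn there
ax.xaxis.set_minor_locator(AutoMinorLocator())

# Dashed major grid, fainter dotted minor grid on the x-axis only
ax.grid(True, which="major", linestyle="--", linewidth=0.5)
ax.grid(True, which="minor", axis="x", linestyle=":", linewidth=0.3)

# plt.show()  # uncomment to open the plot window
```

Named ticks make it clearer to a reader that the y-axis holds categories rather than quantities.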

## Understanding Your Data Through Scatter Plots

Scatter plots are not just about throwing points on a graph; they're tools for storytelling with data. By looking at how the points are spread out, you can start to see patterns. For example, if most of the points are clustered in one area, it might mean that people of a certain age group prefer a specific ice cream flavor. If the points form a line going upwards, it suggests that the value on the y-axis tends to rise with the value on the x-axis. In our plot, scoops are shown by color rather than position, so darker points toward the right would hint that older people prefer more scoops.
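If you'd like a number to back up what your eyes see, one option is the Pearson correlation coefficient, which measures how closely two variables follow a straight line. The sketch below uses NumPy, which is not part of the original example but is installed alongside Matplotlib. Values near 1 mean a clear upward trend, values near -1 a clear downward trend, and values near 0 mean no obvious straight-line pattern.

```python
import numpy as np

ages = [25, 36, 47, 58, 22, 34, 49, 28]
scoops = [2, 3, 1, 5, 4, 2, 3, 1]

# np.corrcoef returns a 2x2 matrix; the off-diagonal entry is the
# correlation between the two inputs
r = np.corrcoef(ages, scoops)[0, 1]

print(f"Correlation between age and scoops: {r:.2f}")
```

With only eight made-up people, this number is just a rough summary and should not be read as evidence of a real trend.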

Once you're happy with your scatter plot, you might want to save it to share with others or to include in a report. Here's how you can save your plot:

```python
# Create a scatter plot
plt.scatter(ages, flavors, c=scoops, cmap='Greys', s=love_for_ice_cream)
plt.colorbar(label='Number of Scoops')

# Add a title and labels to the axes
plt.title("Favorite Ice Cream Flavor by Age with Scoops and Love")
plt.xlabel("Age")
plt.ylabel("Ice Cream Flavor (Number)")

# Save the plot as a PNG file
plt.savefig('scatter_plot.png', dpi=300)

# Show the plot
plt.show()
```

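The `savefig` function saves the plot as an image file. The `dpi` parameter sets the resolution in dots per inch: a higher value gives a sharper image with more pixels and a larger file. Call `savefig` before `show`, because with some backends the figure is empty once the plot window has been closed.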
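To see what `dpi` does in practice, the sketch below saves the same figure at two resolutions and reads the images back to measure them. The 6 by 4 inch figure size and the temporary folder are choices made for this example only; in your own code you would save straight to the path you want.

```python
import os
import tempfile

import matplotlib.pyplot as plt

ages = [25, 36, 47, 58, 22, 34, 49, 28]
flavors = [1, 3, 2, 5, 2, 4, 4, 1]

# figsize is the width and height of the figure in inches
fig, ax = plt.subplots(figsize=(6, 4))
ax.scatter(ages, flavors)

pixel_sizes = {}
with tempfile.TemporaryDirectory() as tmp_dir:
    for dpi in (100, 300):
        path = os.path.join(tmp_dir, f"scatter_{dpi}dpi.png")
        fig.savefig(path, dpi=dpi)

        # Read the image back to find its size in pixels
        height, width = plt.imread(path).shape[:2]
        pixel_sizes[dpi] = (width, height)

print(pixel_sizes)
```

The pixel dimensions are the figure size in inches multiplied by the dpi, so the 6 by 4 inch figure comes out at 600 by 400 pixels for 100 dpi and 1800 by 1200 pixels for 300 dpi.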

## Conclusion

Creating a scatter plot with Matplotlib is like painting a picture of your data. Each point tells a small part of the story, and together, they reveal insights that might not be obvious at first glance. Whether you're exploring the relationship between age and ice cream preferences or something more complex, scatter plots are a powerful way to visualize and understand the nuances of your data.

As you continue your journey in programming and data visualization, remember that each graph you create is an opportunity to communicate something unique about the data you're working with. With each scatter plot, you're not just plotting points; you're laying out a constellation of information that can illuminate understanding for yourself and others. So keep experimenting with different styles and customizations, and enjoy the process of discovering and sharing the stories hidden within your data.
