Geospatial Data
An easy-to-use library for plotting maps is ggmap. This chapter introduces how to create map visualizations with this library.
Requirements
To use ggmap
, we first must install it in our project:
install.packages("ggmap")
Maps created with ggmap
will show a background image of a map and assign the x- and y-axis longitude and latitude of the map section. To obtain map images, we can choose from the two supported providers, Google Maps or Stamen Maps. Stamen Maps do not require authentication with an API token (as Google Maps does), so in this introduction, we'll go with that.
To get a background image for our map, we need to specify the bounding box in terms of longitude and latitude coordinates. To obtain the background image with minimal overhead and directly from our R-program, we install the osmdata
package. This package accesses the OpenStreetMap service to get the bounding boxes:
install-packages("osmdata")
Creating a map visualization
To create a map, we go through the following steps:
Retrieve the bounding box for our desired background map image using
osmdata::getbb
. We can simply pass a query such as "Osnabrück" to the function. Any query that works on their web interface should also work with the functiongetbb
.Get the background image for our map with the
ggmap::get_map
function, passing the bounding box from step 1.Create the empty map with
ggmap
.Add data layers, such as points or polygons, on top of the map.
1. Retrieve the bounding box
Let's assume we want to visualize our Campusbier orders on a map of Osnabrück. The data set contains the longitude and latitude information about the customer's billing address. Following the steps above, we first get the bounding box (longitude and latitude coordinates for the map's boundaries) for Osnabrück:
library(osmdata)
city <- "Osnabrück"
bb_os <- getbb(city)
2. Get the background image from Stamen Maps
We can now get the image by passing the bounding box to the get_map
function:
os_map <- get_map(location = bb_os,
source = "stamen",
maptype = "terrain",
crop = TRUE,
color = "bw")
Note that the get_map function has several arguments that we can use to specify the type of map we want. Stamen Maps support the different map types "terrain", "toner", and "watercolor", which all product a different look. Play around with it to see what works best for you. With color = "bw"
, we can produce a grayscale version of the map.
3. Create the empty map
We can display the map now using the ggmap
function:
ggmap(os_map)

3. Add the data layer
The final step is adding the data layer on top of the map:
ggmap(os_map)
labs(x= NULL, y = NULL) +
geom_point(data = orders_week,
aes(
x = billing_address_longitude,
y = billing_address_latitude,
size = total_price),
color = "#009ee3",
alpha = 0.7
)
In this case, we draw points on the map for each order. The size of a point corresponds to the turnover (total_price
) for each order. Order with more turnover appear larger:

In the example above, we used points to map coordinates in our data to the map. Additionally, we display some relevant information, in this case the turnover, using a visual property like the size of each point. This type of map visualization is very common and useful when we want to show exact coordinates.
Occasionally, we wish to show more aggregated data on the level of areas, such as countries, states or zip code areas. We can draw these areas as shapes onto the map, and fill the area with a color to visualize certain properties (such as population density). This is called a choropleth map.
Choropleth maps
As an alternative to location-based maps, in which we draw points in certain positions (coordinates) of the map, we can draw shapes that correspond to an area of the map. Zip codes are an example, as every zip code area can be described by a polygon, which is a set of connected coordinates. To work with an example, we can download the zip code polygons for Germany along with other information about the zip code areas from the following link:
We can choose the format to download. A common format for GIS data (GIS = Graphical Information System) is the Shapefile format. This is a standardized format that most GIS systems support. There is also a package in R that supports this format, called sf
.

Loading Shapefiles
The sf
-package
sf
-packageThe abbreviation "sf" is short for "Simple Feature", which is a "set of standards that specify a common storage and access model of geographic feature made of mostly two-dimensional geometries" (Wikipedia). The sf
-library makes working with these standards simple in R.
There are numerous resources to get started with the sf-package. The package offers features for reading spatial data from data sources, such as files or database. But it can also create or manipulate whole new spatial structures. Have a look at the following to get an overview and to learn more:
The st_read
function
st_read
functionIn this example, I have downloaded and unpacked the Shapefile with the German zip code data in my project folder under data/plz_germany
. Then, I can load the Shapefile as follows, using the sf
-library's function st_read
:
library(sf)
zip <- st_read(here("data/plz_germany/"), as_tibble = T)
# Reading layer `georef-germany-postleitzahl' from data source
# `C:\rcode\apal_ws_22_23\data\plz_germany' using driver `ESRI Shapefile'
# Simple feature collection with 8934 features and 8 fields
# Geometry type: POLYGON
# Dimension: XY
# Bounding box: xmin: 5.866315 ymin: 47.27011 xmax: 15.04193 ymax: 55.05868
# Geodetic CRS: WGS 84
The st_read
-function can be configured to return the data as a tibble, which is what we want since we are working with the Tidyverse. When we look at the type of the zip
object, we can see that it has the classes data.frame
, tbl
and tbl_df
, as we would expect for a tibble. In addition, it also has the sf
class:
zip %>% class()
# [1] "sf" "tbl_df" "tbl" "data.frame"
Explore the Shapefile
Since the zip object is a tibble, we can use the well-known functions to explore the data a bit. For example, we can get the list of column names:
zip |> colnames()
# [1] "name" "plz_name" "plz_name_lo" "plz_code" "krs_code"
# [6] "lan_name" "lan_code" "krs_name" "geometry"
Or filter the data based on one of the columns:
zip |> filter(name == "49090")
# Simple feature collection with 1 feature and 8 fields
# Geometry type: POLYGON
# Dimension: XY
# Bounding box: xmin: 7.965821 ymin: 52.28314 xmax: 8.099968 ymax: 52.33743
# Geodetic CRS: WGS 84
# # A tibble: 1 x 9
# name plz_name plz_name_lo plz_code krs_code lan_n~1 lan_c~2 krs_n~3
# * <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
# 1 49090 Osnabrück 49090 Osnabrück 49090 03404 Nieder~ 03 Kreisf~
# # ... with 1 more variable: geometry <POLYGON [°]>, and abbreviated
# # variable names 1: lan_name, 2: lan_code, 3: krs_name
# # i Use `colnames()` to see all variable names
Because sf
brings its version of the R plot
function, we can directly pass the zip
object to see the geometries. To plot the geometry information, we apply the st_geometry
-function:
plot(st_geometry(zip))

For our example of the Campusbier orders, we only want to plot the zip codes of Osnabrück. Tibble-style, we filter the data on the zip codes that start with "490":
zip_os <- zip |>
filter(str_starts(name, "490"))
As a result, we get 9 entries, which correspond to the zip codes in and around the city of Osnabrück.
Adding the zip code polygons to the map
We can now re-use the map of Osnabrück we created above and add the layer with the zip code polygons. The sf
-package extends ggplot2
and comes with its own geometry geom_sf
:
ggmap(os_map) +
geom_sf(data = zip_os, alpha = 0.8, inherit.aes = F)
Note that we pass the filtered data in zip_os
to the geom_sf
function and tell it not to use the aes
-mapping from ggmap
. This important because in our zip_os
object, there are no lon
and lat
fields. Instead, the geom_sf
-function creates its own aes
-mapping to draw the polygons.

The alpha = 0.8
makes sure we can still see the streets under the zip code shapes.
Filling the zip code polygons
We can now use the filling of the polygons to convey information in the visualization. We could choose a different color for each zip code:
ggmap(os_map) +
geom_sf(data = zip_os,
aes(fill = name),
alpha = 0.8, inherit.aes = F)

This does not convey any information about our customers and orders. Instead, we can leverage the data we have and fill the area for each zip code according to the turnover we made there. How can we do that? We need to join the geometry data with the sales data.
Calculate the turnover by zip code
In the first step, we need a summary of our orders that contains the turnover per zip code:
turnover_per_zip <- orders %>%
mutate(zip = as.character(billing_address_zip)) %>%
select(zip, total_price) %>%
group_by(zip) %>%
summarise(turnover = sum(total_price))
# # A tibble: 15 x 2
# zip turnover
# <chr> <dbl>
# 1 49009 36.9
# 2 49032 130.
# 3 49074 5117.
# 4 49075 138.
# 5 49076 23739.
# ...
Join the turnover with geometry data
To join the geometry data with the turnover summary from above, we make sure the column with the zip code is named zip
in both data frames. We can then use the left_join
with the by
attribute to merge the columns together:
zip_os_with_turnover <- zip_os %>%
rename(zip = name) %>%
left_join(turnover_per_zip, by = "zip")
We replace the object zip_os
with the new version with the turnover value included and use the turnover
column for the fill
aesthetic:
ggmap(os_map) +
geom_sf(data = zip_os_with_turnover,
alpha = 0.8,
mapping = aes(fill = turnover),
inherit.aes = F) +
scale_fill_distiller(type = "seq", palette = "Blues", direction = 1)
Note that we also changed the color scale with scale_fill_distiller
to a sequential palette from blue to white, where blue is more turnover and white is less.

Last updated
Was this helpful?