![](/style/images/good.png)
![](/style/images/bad.png)
How to make fancy road trip maps with R and OpenStreetMap
source link: https://www.andrewheiss.com/blog/2023/06/01/geocoding-routing-openstreetmap-r/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
How to make fancy road trip maps with R and OpenStreetMap
How to make fancy road trip maps with R and OpenStreetMap
Thursday, June 1, 2023
In a couple days, I’m going to drive across the country to Utah, my home state. I haven’t been out west with my whole family in four years—not since 2019 when we moved from Spanish Fork, Utah to Atlanta, Georgia. According to Google Maps, it’s a mere 1,900 miles (3,000 kilometers) through the middle of the United States and should take 28 hours, assuming no stops.
![google-maps-ga-to-ut.png](https://www.andrewheiss.com/blog/2023/06/01/geocoding-routing-openstreetmap-r/img/google-maps-ga-to-ut.png)
For bonus fun though, my wife and I decided to make this trip even more adventurous and hit as many exciting stops as possible. So rather than drive through the middle of Kansas, we’re going along the southern border on the way there, visiting New Orleans, Louisiana; San Antonio, Texas; Carlsbad Caverns, New Mexico; the Grand Canyon in Arizona; and then camping for a couple days at Capitol Reef National Park in Southern Utah before finally arriving at my aunt’s house in Spanish Fork, Utah. On the way home, we’re going across the northern part of the United States, visiting my sister in Idaho before heading towards Yellowstone in Wyoming; Devil’s Tower in Wyoming; Mount Rushmore in South Dakota; and Nauvoo, Illinois (Mormons!). It’s going to be an awesome—but way longer—mega road trip.
To help keep six kids entertained beyond just plugging them into iPads, my wife made some road trip journals to make the experience more memorable and educational, and we’re bringing a bunch of books and extra materials about each of the stops we’ll be making so they can do research along the way.
I just finished revamping my online data visualization class this week since the summer semester is starting next week, so visualizing data through maps has been on my mind. In my session on visualizing maps, I added a section on making and visualizing your own geocoded data (i.e. plotting specific cities on a map), so I figured it would be neat to visualize this huge road trip somehow to make some maps to include in the kids’ journals.
Getting the latitude and longitude coordinates for specific cities or addresses is super easy—you can either right click on Google Maps and copy specific coordinates, or you can automate it with the {tidygeocoder} package in R.
Getting route coordinates is a lot trickier though, since it involves all sorts of complex algorithms (accounting for road locations, speed limits, traffic, etc.). This is why Google Maps is so invaluable—it does an excellent job of figuring this out. But Google’s data is proprietary.
There’s an R package named {mapsapi} that makes it really easy to get data from the Google Maps API, and if you get an API key from Google (following these instructions), you can extract geocoded {sf}-compatible route data from Google with the mp_directions()
function. It’s neat and quick and easy, but it’s expensive. When I first started playing with the maps API in R, I racked up $0.50 in charges just with testing a bunch of routing options. The Google Maps API gives you $200 in free credits every month and I was well below that amount, but it still was a little nervewracking to see that it was that expensive in just an hour of tinkering. Also, setting up the API key was fairly complicated (creating a Google Cloud project, setting up billing, enabling specific services, storing the API key securely on my computer, etc.), and I don’t want to go through that hassle in the future.
![google-maps-api-costs.png](https://www.andrewheiss.com/blog/2023/06/01/geocoding-routing-openstreetmap-r/img/google-maps-api-costs.png)
I’m a fan of the OpenStreetMap project, and it’s one of the backends (through Nominatim) for {tidygeocoder} (and I used it in a previous blog post showing some coordinates in interactive Leaflet maps). In searching around the internet for open source alternatives to the Google Maps API, I discovered that OpenStreetMap can also do directions and routing through the Open Source Routing Machine (OSRM) project. You can use OSRM’s public demo server for small-scale routing things, or you can install your own local instance through Docker. And super conveniently, there’s also an R package called {osrm} for accessing the OSRM API. This means that it’s possible to pull geocoded routing and directions data into R in an open source (and free!) way.
So let’s see how to use {osrm} and make some neat road trip maps with {sf} and {ggplot2}!
Getting started
Who this post is for
Here’s what I assume you know:
- You’re familiar with R and the tidyverse (particularly {dplyr} and {ggplot2}).
- You’re somewhat familiar with {sf} for working with geographic data. I have a whole tutorial here and a simplified one here and the {sf} documentation has a ton of helpful vignettes and blog posts, and there are also two free books about it: Spatial Data Science and Geocomputation with R. Also check this fantastic post out to learn more about the anatomy of a
geometry
column with {sf}.
Packages and functions
Before officially getting started, let’s load all the packages we need and create some helpful functions and variables:
library(tidyverse) # ggplot, dplyr, and friends
library(sf) # Handle spatial data in R in a tidy way
library(tigris) # Access geographic data from the US Census
library(tidygeocoder) # Automated geocoding
library(osrm) # Access OSRM through R
library(ggrepel) # Nicer non-overlapping labels
library(glue) # Easier string interpolation
library(scales) # Nicer labeling functions
library(patchwork) # Combine plots nicely
library(ggspatial) # Nicer map features like scale bars
# Custom ggplot theme to make pretty plots
# Get the font at https://fonts.google.com/specimen/Overpass
theme_roadtrip <- function() {
theme_void(base_family = "Overpass Light") +
theme(
plot.title = element_text(family = "Overpass", face = "bold", hjust = 0.5),
strip.text = element_text(
family = "Overpass ExtraBold", face = "plain",
size = rel(1.1), hjust = 0.5)
)
}
# Make labels use Overpass by default
update_geom_defaults("label_repel",
list(family = "Overpass",
fontface = "plain"))
update_geom_defaults("label",
list(family = "Overpass",
fontface = "plain"))
update_geom_defaults("text_repel",
list(family = "Overpass",
fontface = "plain"))
update_geom_defaults("text",
list(family = "Overpass",
fontface = "plain"))
# Yellowstone colors
clrs <- NatParksPalettes::natparks.pals("Yellowstone")
#' Format duration in minutes and hours
#'
#' This function takes a numeric input \code{x} representing a duration in minutes,
#' rounds it to the nearest 15 minutes, and formats the result as a string
#' indicating the number of hours and minutes in the duration.
#'
#' @param x A numeric input representing a duration in minutes.
#' @return A character vector of formatted duration strings.
#' @examples
#' fmt_duration(c(93, 1007, 3056))
fmt_duration <- function(x) {
# Round to the nearest 15 minutes
n_seconds <- round(seconds(x * 60) / (15 * 60)) * (15 * 60)
n_seconds <- seconds_to_period(n_seconds)
out <- map_chr(n_seconds, \(n) {
if (seconds(n) <= 59) {
# If this is less than an hour, don't format anything with hours
glue("{MM} minutes", MM = minute(n))
} else {
# I only want to format this as a number of hours. If the duration is
# longer than 24 hours, seconds_to_period() rolls over into days (i.e.
# seconds_to_period(60 * 60 * 24) returns "1d 0H 0M 0S"), and it shows
# zero hours. So we extract the day part of the period, multiply it by 24,
# and add it to the hour component that we want to display
extra_day_hours <- day(n) * 24
glue("{HH} hour{s} {MM} minutes",
HH = scales::label_comma()(hour(n) + extra_day_hours),
MM = minute(n),
s = ifelse(hour(n) == 1, "", "s")
)
}
})
return(out)
}
fmt_miles <- scales::label_number(accuracy = 10, suffix = " miles", big.mark = ",")
miles_to_meters <- function(x) {
x * 1609.344
}
meters_to_miles <- function(x) {
x / 1609.344
}
km_to_miles <- function(x) {
meters_to_miles(x * 1000)
}
State data
First we need state boundaries. We can get these from the US Census with states()
from {tigris}:
Here’s what they look like (using the Albers projection for the contiguous USA):
ggplot() +
geom_sf(data = lower_48) +
coord_sf(crs = st_crs("ESRI:102003")) + # Albers
theme_roadtrip()
![48 contiguous states](https://www.andrewheiss.com/blog/2023/06/01/geocoding-routing-openstreetmap-r/index_files/figure-html/plot-states-only-1.png)
Super quick and easy.
Stops data
Next we need to geocode and plot all the stops we’ll be making on the trip. I’ll stick them all in a data frame with columns indicating the direction (with a nod to Bilbo Baggins’s “There and Back Again” journey with the dwarves in The Hobbit) and the day of the leg of the trip:
stops_raw <- tribble(
~direction, ~day, ~city,
"There", 1, "Atlanta, Georgia",
"There", 2, "New Orleans, Louisiana",
"There", 3, "San Antonio, Texas",
"There", 4, "Carlsbad, New Mexico",
"There", 5, "Grand Canyon NP, Arizona",
"There", 6, "Grover, Utah",
"Back again", 1, "Spanish Fork, Utah",
"Back again", 1, "Shelley, Idaho",
"Back again", 2, "Yellowstone NP, Wyoming",
"Back again", 2, "Devil's Tower, Wyoming",
"Back again", 3, "Mount Rushmore, South Dakota",
"Back again", 4, "Albert Lea, Minnesota",
"Back again", 5, "Nauvoo, Illinois",
"Back again", 6, "Atlanta, Georgia"
) %>%
mutate(direction = fct_inorder(direction))
stops_raw
## # A tibble: 14 × 3
## direction day city
## <fct> <dbl> <chr>
## 1 There 1 Atlanta, Georgia
## 2 There 2 New Orleans, Louisiana
## 3 There 3 San Antonio, Texas
## 4 There 4 Carlsbad, New Mexico
## 5 There 5 Grand Canyon NP, Arizona
## 6 There 6 Grover, Utah
## 7 Back again 1 Spanish Fork, Utah
## 8 Back again 1 Shelley, Idaho
## 9 Back again 2 Yellowstone NP, Wyoming
## 10 Back again 2 Devil's Tower, Wyoming
## 11 Back again 3 Mount Rushmore, South Dakota
## 12 Back again 4 Albert Lea, Minnesota
## 13 Back again 5 Nauvoo, Illinois
## 14 Back again 6 Atlanta, Georgia
I could go to Google Maps and right click on each of these places and copy the latitude/longitude coordinates, but that’s not very reproducible and involves a lot of clicking around. We can do it programmatically with geocode()
from {tidygeocoder}, but first we need to help the API a little. With most of these places, OpenStreetMap will return the center of the city (like with Atlanta), and that’s fine. With larger places like the Grand Canyon and Yellowstone, though, OpenStreetMap will give coordinates for the middle of the parks, which are literally in the middle of nowhere and will mess up our directions (since it’s not really possible to drive to the middle of the whole Grand Canyon). To fix this we’ll make a little dataset of some more specific addresses and join it to the list of stops. We’ll geocode the more specific address column:
stops_addresses <- tribble(
~city, ~address,
"Grand Canyon NP, Arizona", "Grand Canyon Visitor Center, Arizona",
"Yellowstone NP, Wyoming", "Old Faithful, Teton County, Wyoming",
)
stops_to_geocode <- stops_raw %>%
left_join(stops_addresses, by = join_by(city)) %>%
# Combine the address and city columns, with a preference for address
mutate(address = coalesce(address, city))
stops_to_geocode
## # A tibble: 14 × 4
## direction day city address
## <fct> <dbl> <chr> <chr>
## 1 There 1 Atlanta, Georgia Atlanta, Georgia
## 2 There 2 New Orleans, Louisiana New Orleans, Louisiana
## 3 There 3 San Antonio, Texas San Antonio, Texas
## 4 There 4 Carlsbad, New Mexico Carlsbad, New Mexico
## 5 There 5 Grand Canyon NP, Arizona Grand Canyon Visitor Center, Arizona
## 6 There 6 Grover, Utah Grover, Utah
## 7 Back again 1 Spanish Fork, Utah Spanish Fork, Utah
## 8 Back again 1 Shelley, Idaho Shelley, Idaho
## 9 Back again 2 Yellowstone NP, Wyoming Old Faithful, Teton County, Wyoming
## 10 Back again 2 Devil's Tower, Wyoming Devil's Tower, Wyoming
## 11 Back again 3 Mount Rushmore, South Dakota Mount Rushmore, South Dakota
## 12 Back again 4 Albert Lea, Minnesota Albert Lea, Minnesota
## 13 Back again 5 Nauvoo, Illinois Nauvoo, Illinois
## 14 Back again 6 Atlanta, Georgia Atlanta, Georgia
To get the coordinates, we’ll feed the address column to geocode()
and then convert the resulting numeric long
and lat
into an official sf geometry column (with the WGS 84 (−180 to 180) scale):
stops_geocoded
## Simple feature collection with 14 features and 4 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -112 ymin: 29.4 xmax: -84.4 ymax: 44.6
## Geodetic CRS: WGS 84
## # A tibble: 14 × 5
## direction day city address geometry
## * <fct> <dbl> <chr> <chr> <POINT [°]>
## 1 There 1 Atlanta, Georgia Atlanta, Georgia (-84.4 33.7)
## 2 There 2 New Orleans, Louisiana New Orleans, Louisiana (-90.1 30)
## 3 There 3 San Antonio, Texas San Antonio, Texas (-98.5 29.4)
## 4 There 4 Carlsbad, New Mexico Carlsbad, New Mexico (-104 32.4)
## 5 There 5 Grand Canyon NP, Arizona Grand Canyon Visitor Center, Arizona (-112 36.1)
## 6 There 6 Grover, Utah Grover, Utah (-111 38.2)
## 7 Back again 1 Spanish Fork, Utah Spanish Fork, Utah (-112 40.1)
## 8 Back again 1 Shelley, Idaho Shelley, Idaho (-112 43.4)
## 9 Back again 2 Yellowstone NP, Wyoming Old Faithful, Teton County, Wyoming (-111 44.5)
## 10 Back again 2 Devil's Tower, Wyoming Devil's Tower, Wyoming (-105 44.6)
## 11 Back again 3 Mount Rushmore, South Dakota Mount Rushmore, South Dakota (-103 43.9)
## 12 Back again 4 Albert Lea, Minnesota Albert Lea, Minnesota (-93.4 43.6)
## 13 Back again 5 Nauvoo, Illinois Nauvoo, Illinois (-91.4 40.6)
## 14 Back again 6 Atlanta, Georgia Atlanta, Georgia (-84.4 33.7)
Atlanta is in there twice, so we’ll need to drop the last row so that we don’t plot the points and labels for it twice:
And let’s plot it all:
ggplot() +
geom_sf(data = lower_48) +
geom_sf(data = all_stops_unique) +
geom_label_repel(
data = all_stops_unique,
aes(label = city, geometry = geometry),
stat = "sf_coordinates", seed = 1234,
size = 3, segment.color = clrs[3], min.segment.length = 0
) +
annotation_scale(
location = "bl", bar_cols = c("grey30", "white"),
unit_category = "imperial", text_family = "Overpass"
) +
coord_sf(crs = st_crs("ESRI:102003")) + # Albers
theme_roadtrip()
![All the main stops for our road trip](https://www.andrewheiss.com/blog/2023/06/01/geocoding-routing-openstreetmap-r/index_files/figure-html/plot-stops-only-1.png)
Routing
Next we need to figure out the routes between each of these points. The osrmRoute()
function in {osrm} takes two main arguments: (1) coordinates for the source location (src
) and (2) coordinates for the destination location (dst
). To make it easy to work with routing data in a dataframe, we need to manipulate the data a bit so that we get a row per route, with columns for the origin and destination cities.
We’ll create a copy of the geometry column in stops_geocoded
and shift it forward one row with lead()
and drop the last row because it doesn’t have a destination:
routes_raw <- stops_geocoded %>%
select(-address) %>%
rename(
origin_geometry = geometry,
origin_city = city
) %>%
mutate(
destination_geometry = lead(origin_geometry),
destination_city = lead(origin_city)
) %>%
filter(row_number() != n())
routes_raw
## Simple feature collection with 13 features and 4 fields
## Active geometry column: origin_geometry
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -112 ymin: 29.4 xmax: -84.4 ymax: 44.6
## Geodetic CRS: WGS 84
## # A tibble: 13 × 6
## direction day origin_city origin_geometry destination_geometry destination_city
## * <fct> <dbl> <chr> <POINT [°]> <POINT [°]> <chr>
## 1 There 1 Atlanta, Georgia (-84.4 33.7) (-90.1 30) New Orleans, Louisiana
## 2 There 2 New Orleans, Louisiana (-90.1 30) (-98.5 29.4) San Antonio, Texas
## 3 There 3 San Antonio, Texas (-98.5 29.4) (-104 32.4) Carlsbad, New Mexico
## 4 There 4 Carlsbad, New Mexico (-104 32.4) (-112 36.1) Grand Canyon NP, Arizona
## 5 There 5 Grand Canyon NP, Arizona (-112 36.1) (-111 38.2) Grover, Utah
## 6 There 6 Grover, Utah (-111 38.2) (-112 40.1) Spanish Fork, Utah
## 7 Back again 1 Spanish Fork, Utah (-112 40.1) (-112 43.4) Shelley, Idaho
## 8 Back again 1 Shelley, Idaho (-112 43.4) (-111 44.5) Yellowstone NP, Wyoming
## 9 Back again 2 Yellowstone NP, Wyoming (-111 44.5) (-105 44.6) Devil's Tower, Wyoming
## 10 Back again 2 Devil's Tower, Wyoming (-105 44.6) (-103 43.9) Mount Rushmore, South Dakota
## 11 Back again 3 Mount Rushmore, South Dakota (-103 43.9) (-93.4 43.6) Albert Lea, Minnesota
## 12 Back again 4 Albert Lea, Minnesota (-93.4 43.6) (-91.4 40.6) Nauvoo, Illinois
## 13 Back again 5 Nauvoo, Illinois (-91.4 40.6) (-84.4 33.7) Atlanta, Georgia
With the data like this, we can now feed each row individually to osrmRoute
and get geocoded paths between each origin and destination:
routes_geocoded_raw
## Simple feature collection with 13 features and 5 fields
## Active geometry column: origin_geometry
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -112 ymin: 29.4 xmax: -84.4 ymax: 44.6
## Geodetic CRS: WGS 84
## # A tibble: 13 × 7
## # Rowwise:
## direction day origin_city origin_geometry destination_geometry destination_city route$src $dst $duration $distance $geometry
## * <fct> <dbl> <chr> <POINT [°]> <POINT [°]> <chr> <chr> <chr> <dbl> <dbl> <LINESTRING [°]>
## 1 There 1 Atlanta, Georgia (-84.4 33.7) (-90.1 30) New Orleans, Louisiana src dst 513. 753. (-84.4 33.7, -84.5 33.6, -84.8 33.4, -84.9 33.1, -85.1 32.9, -85.4 32.6, -85.7 32.5, -...
## 2 There 2 New Orleans, Louisiana (-90.1 30) (-98.5 29.4) San Antonio, Texas src dst 596. 870. (-90.1 30, -90.3 30, -90.5 30.1, -90.9 30.2, -91.1 30.4, -91.3 30.5, -92.1 30.2, -93.2...
## 3 There 3 San Antonio, Texas (-98.5 29.4) (-104 32.4) Carlsbad, New Mexico src dst 458. 727. (-98.5 29.4, -98.7 29.7, -98.9 30, -99.5 30.3, -99.8 30.5, -99.9 30.5, -100 30.4, -101...
## 4 There 4 Carlsbad, New Mexico (-104 32.4) (-112 36.1) Grand Canyon NP, Arizona src dst 749. 1100. (-104 32.4, -104 32.5, -104 32.6, -104 32.9, -104 32.9, -104 33.3, -105 33.4, -105 33....
## 5 There 5 Grand Canyon NP, Arizona (-112 36.1) (-111 38.2) Grover, Utah src dst 510. 628. (-112 36.1, -112 36, -112 36, -112 36, -112 36, -112 35.9, -112 35.9, -112 35.9, -111 ...
## 6 There 6 Grover, Utah (-111 38.2) (-112 40.1) Spanish Fork, Utah src dst 197. 273. (-111 38.2, -111 38.2, -111 38.3, -112 38.3, -112 38.4, -112 38.4, -112 38.4, -112 38....
## 7 Back again 1 Spanish Fork, Utah (-112 40.1) (-112 43.4) Shelley, Idaho src dst 262. 411. (-112 40.1, -112 40.2, -112 40.3, -112 40.5, -112 41, -112 41.1, -112 41.2, -112 41.6,...
## 8 Back again 1 Shelley, Idaho (-112 43.4) (-111 44.5) Yellowstone NP, Wyoming src dst 199. 240. (-112 43.4, -112 43.8, -112 43.9, -112 43.9, -112 44, -112 44, -111 44.1, -111 44.2, -...
## 9 Back again 2 Yellowstone NP, Wyoming (-111 44.5) (-105 44.6) Devil's Tower, Wyoming src dst 552. 694. (-111 44.5, -111 44.4, -111 44.5, -111 44.5, -110 44.5, -110 44.6, -110 44.5, -110 44....
## 10 Back again 2 Devil's Tower, Wyoming (-105 44.6) (-103 43.9) Mount Rushmore, South Dakota src dst 161. 215. (-105 44.6, -105 44.6, -105 44.6, -105 44.6, -105 44.5, -105 44.5, -105 44.5, -105 44....
## 11 Back again 3 Mount Rushmore, South Dakota (-103 43.9) (-93.4 43.6) Albert Lea, Minnesota src dst 548. 866. (-103 43.9, -103 43.9, -103 44, -103 44, -103 44.1, -103 44.1, -103 44.1, -102 44.1, -...
## 12 Back again 4 Albert Lea, Minnesota (-93.4 43.6) (-91.4 40.6) Nauvoo, Illinois src dst 359. 487. (-93.4 43.6, -93.3 43.1, -92.7 43.1, -92.7 43, -92.7 43, -92.6 43, -92.5 42.7, -92.5 4...
## 13 Back again 5 Nauvoo, Illinois (-91.4 40.6) (-84.4 33.7) Atlanta, Georgia src dst 847. 1199. (-91.4 40.6, -91.4 40.4, -91.6 40.4, -91.5 39.7, -91.4 39.7, -91.4 39.6, -91 39.2, -90...
The route details are included in a nested data frame list column named route
, so we’ll need to unnest it. The resulting data frame now has three different geometry columns (!) for the origin point, destination point, and the route, so we’ll set the route column as the data frame’s primary geometry column (which makes it so we can just use geom_sf(data = routes_geocoded)
instead of geom_sf(data = routes_geocoded, aes(geometry = whatever))
). The route data also includes some helpful details about the distance (in kilometers) and duration (in minutes), so we’ll create nicely formatted text-based versions of these too:
routes_geocoded <- routes_geocoded_raw %>%
unnest(route, names_sep = "_") %>%
st_set_geometry("route_geometry") %>%
mutate(
distance_miles = km_to_miles(route_distance),
distance_text = fmt_miles(distance_miles),
duration_text = fmt_duration(route_duration)
)
routes_geocoded
## Simple feature collection with 13 features and 11 fields
## Active geometry column: route_geometry
## Geometry type: LINESTRING
## Dimension: XY
## Bounding box: xmin: -113 ymin: 29.4 xmax: -84.4 ymax: 44.9
## Geodetic CRS: WGS 84
## # A tibble: 13 × 14
## direction day origin_city origin_geometry destination_geometry destination_city route_src route_dst route_duration route_distance route_geometry distance_miles distance_text duration_text
## * <fct> <dbl> <chr> <POINT [°]> <POINT [°]> <chr> <chr> <chr> <dbl> <dbl> <LINESTRING [°]> <dbl> <chr> <chr>
## 1 There 1 Atlanta, Georgia (-84.4 33.7) (-90.1 30) New Orleans, Louisiana src dst 513. 753. (-84.4 33.7, -84.5 33.6, -84.8 33.4, -84.9 33.1, -85.1 32.9, -85.4 32.6, -85.7 32.5, -... 468. 470 miles 8 hours 30 minu…
## 2 There 2 New Orleans, Louisiana (-90.1 30) (-98.5 29.4) San Antonio, Texas src dst 596. 870. (-90.1 30, -90.3 30, -90.5 30.1, -90.9 30.2, -91.1 30.4, -91.3 30.5, -92.1 30.2, -93.2... 541. 540 miles 10 hours 0 minu…
## 3 There 3 San Antonio, Texas (-98.5 29.4) (-104 32.4) Carlsbad, New Mexico src dst 458. 727. (-98.5 29.4, -98.7 29.7, -98.9 30, -99.5 30.3, -99.8 30.5, -99.9 30.5, -100 30.4, -101... 452. 450 miles 7 hours 45 minu…
## 4 There 4 Carlsbad, New Mexico (-104 32.4) (-112 36.1) Grand Canyon NP, Arizona src dst 749. 1100. (-104 32.4, -104 32.5, -104 32.6, -104 32.9, -104 32.9, -104 33.3, -105 33.4, -105 33.... 684. 680 miles 12 hours 30 min…
## 5 There 5 Grand Canyon NP, Arizona (-112 36.1) (-111 38.2) Grover, Utah src dst 510. 628. (-112 36.1, -112 36, -112 36, -112 36, -112 36, -112 35.9, -112 35.9, -112 35.9, -111 ... 390. 390 miles 8 hours 30 minu…
## 6 There 6 Grover, Utah (-111 38.2) (-112 40.1) Spanish Fork, Utah src dst 197. 273. (-111 38.2, -111 38.2, -111 38.3, -112 38.3, -112 38.4, -112 38.4, -112 38.4, -112 38.... 169. 170 miles 3 hours 15 minu…
## 7 Back again 1 Spanish Fork, Utah (-112 40.1) (-112 43.4) Shelley, Idaho src dst 262. 411. (-112 40.1, -112 40.2, -112 40.3, -112 40.5, -112 41, -112 41.1, -112 41.2, -112 41.6,... 256. 260 miles 4 hours 15 minu…
## 8 Back again 1 Shelley, Idaho (-112 43.4) (-111 44.5) Yellowstone NP, Wyoming src dst 199. 240. (-112 43.4, -112 43.8, -112 43.9, -112 43.9, -112 44, -112 44, -111 44.1, -111 44.2, -... 149. 150 miles 3 hours 15 minu…
## 9 Back again 2 Yellowstone NP, Wyoming (-111 44.5) (-105 44.6) Devil's Tower, Wyoming src dst 552. 694. (-111 44.5, -111 44.4, -111 44.5, -111 44.5, -110 44.5, -110 44.6, -110 44.5, -110 44.... 431. 430 miles 9 hours 15 minu…
## 10 Back again 2 Devil's Tower, Wyoming (-105 44.6) (-103 43.9) Mount Rushmore, South Dakota src dst 161. 215. (-105 44.6, -105 44.6, -105 44.6, -105 44.6, -105 44.5, -105 44.5, -105 44.5, -105 44.... 133. 130 miles 2 hours 45 minu…
## 11 Back again 3 Mount Rushmore, South Dakota (-103 43.9) (-93.4 43.6) Albert Lea, Minnesota src dst 548. 866. (-103 43.9, -103 43.9, -103 44, -103 44, -103 44.1, -103 44.1, -103 44.1, -102 44.1, -... 538. 540 miles 9 hours 15 minu…
## 12 Back again 4 Albert Lea, Minnesota (-93.4 43.6) (-91.4 40.6) Nauvoo, Illinois src dst 359. 487. (-93.4 43.6, -93.3 43.1, -92.7 43.1, -92.7 43, -92.7 43, -92.6 43, -92.5 42.7, -92.5 4... 303. 300 miles 6 hours 0 minut…
## 13 Back again 5 Nauvoo, Illinois (-91.4 40.6) (-84.4 33.7) Atlanta, Georgia src dst 847. 1199. (-91.4 40.6, -91.4 40.4, -91.6 40.4, -91.5 39.7, -91.4 39.7, -91.4 39.6, -91 39.2, -90... 745. 750 miles 14 hours 0 minu…
Let’s see how this looks! Because we set the default geometry of routes_geocoded
to the route, geom_sf(data = routes_geocoded)
will automatically plot the LINESTRING
geometries for the routes as paths:
ggplot() +
geom_sf(data = lower_48) +
geom_sf(data = routes_geocoded, color = clrs[1]) +
geom_sf(data = all_stops_unique) +
geom_label_repel(
data = all_stops_unique,
aes(label = city, geometry = geometry),
stat = "sf_coordinates", seed = 1234,
size = 3, segment.color = clrs[3], min.segment.length = 0
) +
annotation_scale(
location = "bl", bar_cols = c("grey30", "white"),
unit_category = "imperial", text_family = "Overpass"
) +
coord_sf(crs = st_crs("ESRI:102003")) + # Albers
theme_roadtrip()
![Automatically geocoded routes between all our road trip stops](https://www.andrewheiss.com/blog/2023/06/01/geocoding-routing-openstreetmap-r/index_files/figure-html/plot-routes-stops-1.png)
AHH that’s so cool! It works!
We’re not done yet though—let’s make this even fancier!
States crossed through
We’re going to be driving through a lot of states. Let’s highlight each of the states we’ll cross through to see how many there are. To do this we can use st_intersection()
to find which of the lower_48
states contain a part of the different routes. We’ll then make a copy of lower_48
with a new column indicating if we’ll visit the state:
states_crossed_through <- st_intersection(
st_transform(lower_48, st_crs(routes_geocoded)),
routes_geocoded
)
# There are 32 rows here, but 18 unique states (i.e. one day will end in a state
# and start the next day in the same state, so it gets counted twice)
states_crossed_through %>%
select(STATEFP, NAME, direction, day)
## Simple feature collection with 32 features and 4 fields
## Geometry type: GEOMETRY
## Dimension: XY
## Bounding box: xmin: -113 ymin: 29.4 xmax: -84.4 ymax: 44.9
## Geodetic CRS: WGS 84
## First 10 features:
## STATEFP NAME direction day geometry
## 4 13 Georgia There 1 LINESTRING (-84.4 33.7, -84...
## 10 22 Louisiana There 1 LINESTRING (-89.6 30.3, -89...
## 32 28 Mississippi There 1 LINESTRING (-88.4 30.5, -88...
## 41 01 Alabama There 1 LINESTRING (-85.2 32.9, -85...
## 1 48 Texas There 2 LINESTRING (-93.7 30.1, -93...
## 10.1 22 Louisiana There 2 LINESTRING (-90.1 30, -90.3...
## 1.1 48 Texas There 3 LINESTRING (-98.5 29.4, -98...
## 36 35 New Mexico There 3 LINESTRING (-104 32, -104 3...
## 36.1 35 New Mexico There 4 LINESTRING (-104 32.4, -104...
## 47 04 Arizona There 4 LINESTRING (-109 35.4, -109...
unique(states_crossed_through$NAME)
## [1] "Georgia" "Louisiana" "Mississippi" "Alabama" "Texas" "New Mexico" "Arizona" "Utah" "Idaho" "Montana" "Wyoming" "South Dakota" "Minnesota" "Illinois" "Iowa" "Kentucky" "Missouri" "Tennessee"
# Create a column that flags if the state is cross through
lower_48_highlighted <- lower_48 %>%
mutate(visited = NAME %in% unique(states_crossed_through$NAME))
Now we can fill using the visited
column and make the visited states subtly darker:
ggplot() +
geom_sf(data = lower_48_highlighted, aes(fill = visited)) +
geom_sf(data = routes_geocoded, color = clrs[1]) +
geom_sf(data = all_stops_unique) +
geom_label_repel(
data = all_stops_unique,
aes(label = city, geometry = geometry),
stat = "sf_coordinates", seed = 1234,
size = 3, segment.color = clrs[3], min.segment.length = 0
) +
annotation_scale(
location = "bl", bar_cols = c("grey30", "white"),
unit_category = "imperial", text_family = "Overpass"
) +
scale_fill_manual(values = c("grey90", "grey80"), guide = "none") +
coord_sf(crs = st_crs("ESRI:102003")) + # Albers
theme_roadtrip()
![Map of road trip route with crossed-through states highlighted](https://www.andrewheiss.com/blog/2023/06/01/geocoding-routing-openstreetmap-r/index_files/figure-html/plot-routes-stops-highlighted-1.png)
Summary statistics
Now that we have a list of the states we’ll cross through, let’s play with some quick summary statistics.
n_states <- length(unique(states_crossed_through$NAME))
n_states
## [1] 18
total_distance <- sum(routes_geocoded$route_distance) %>% km_to_miles() %>% fmt_miles()
total_distance
## [1] "5,260 miles"
total_time <- sum(routes_geocoded$route_duration) %>% fmt_duration()
total_time
## [1] "99 hours 15 minutes"
We’re going to drive through 18 states, with a total distance of 5,260 miles over the span of 99 hours 15 minutes (!!)
We can use these distance or duration columns to add some extra detail to the map:
ggplot() +
geom_sf(data = lower_48_highlighted, aes(fill = visited)) +
geom_sf(data = routes_geocoded, color = clrs[1]) +
geom_sf(data = all_stops_unique) +
geom_label_repel(
data = routes_geocoded,
aes(label = distance_text, geometry = route_geometry),
stat = "sf_coordinates", seed = 12345,
size = 3, segment.color = clrs[3], min.segment.length = 0,
fill = colorspace::lighten(clrs[5], 0.8)
) +
annotation_scale(
location = "bl", bar_cols = c("grey30", "white"),
unit_category = "imperial", text_family = "Overpass"
) +
scale_fill_manual(values = c("grey90", "grey80"), guide = "none") +
coord_sf(crs = st_crs("ESRI:102003")) + # Albers
labs(title = glue("{n_states} states; {total_distance}; and {total_time} in the car. Bring it on.")) +
theme_roadtrip()
![Map of road trip showing the distances between each stop](https://www.andrewheiss.com/blog/2023/06/01/geocoding-routing-openstreetmap-r/index_files/figure-html/plot-routes-stops-distances-1.png)
Faceting
We have some nice tidy data here with columns for direction and day, which makes it easy to subset the data. We can make separate maps for the drive there and the drive back, or even separate maps for each day of the trip. Let’s add facet_wrap()
to the plot:
ggplot() +
geom_sf(data = lower_48_highlighted, aes(fill = visited)) +
geom_sf(data = routes_geocoded, color = clrs[1]) +
geom_sf(data = all_stops_unique) +
annotation_scale(
location = "bl", bar_cols = c("grey30", "white"),
unit_category = "imperial", text_family = "Overpass"
) +
scale_fill_manual(values = c("grey90", "grey80"), guide = "none") +
coord_sf(crs = st_crs("ESRI:102003")) + # Albers
theme_roadtrip() +
facet_wrap(vars(direction), ncol = 1)
![Road trip map faceted by direction](https://www.andrewheiss.com/blog/2023/06/01/geocoding-routing-openstreetmap-r/index_files/figure-html/plot-facet-subset-issues-1.png)
There are two problems with this though:
- The highlighted states are the same in both facets
- The final destination point is missing in each facet (Spanish Fork, Utah in the “There” facet; Atlanta, Georgia in the “Back Again” facet)
Both of these issues are fixable though.
Fixing state highlighting
First we need to change how we identify the states we’ll cross through. Before, we kept the unique state names, but this stripped away the details about the direction of the trip, so we’ll find all the distinct combinations of direction and state and then join that little dataset to lower_48
to make the visited
column:
states_crossed_through_better <- st_intersection(
st_transform(lower_48, st_crs(routes_geocoded)),
routes_geocoded
) %>%
distinct(direction, NAME)
states_crossed_through_better
## direction NAME
## 4 There Georgia
## 10 There Louisiana
## 32 There Mississippi
## 41 There Alabama
## 1 There Texas
## 36 There New Mexico
## 47 There Arizona
## 35 There Utah
## 12 Back again Idaho
## 35.2 Back again Utah
## 15 Back again Montana
## 25 Back again Wyoming
## 23 Back again South Dakota
## 16 Back again Minnesota
## 14 Back again Illinois
## 18 Back again Iowa
## 3 Back again Kentucky
## 4.1 Back again Georgia
## 7 Back again Missouri
## 9 Back again Tennessee
lower_48_highlighted_better <- lower_48 %>%
left_join(states_crossed_through_better, by = join_by(NAME)) %>%
mutate(visited = !is.na(direction))
as_tibble(lower_48_highlighted_better)
## # A tibble: 51 × 12
## STATEFP STATENS AFFGEOID GEOID STUSPS NAME LSAD ALAND AWATER direction geometry visited
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <fct> <MULTIPOLYGON [°]> <lgl>
## 1 48 01779801 0400000US48 48 TX Texas 00 676685555821 18974391187 There (((-107 31.9, -107 32, -107 32, -107 32, -106 32, -106 32, -106 32, -106 32, -105 32... TRUE
## 2 06 01779778 0400000US06 06 CA California 00 403673617862 20291712025 <NA> (((-119 33.5, -118 33.5, -118 33.4, -118 33.4, -118 33.3, -118 33.3, -118 33.3, -118... FALSE
## 3 21 01779786 0400000US21 21 KY Kentucky 00 102266581101 2384240769 Back again (((-89.5 36.6, -89.5 36.6, -89.4 36.6, -89.4 36.6, -89.3 36.6, -89.3 36.6, -89.3 36.... TRUE
## 4 13 01705317 0400000US13 13 GA Georgia 00 149486268417 4418716153 There (((-85.6 35, -85.5 35, -85.4 35, -85.4 35, -85.3 35, -85.3 35, -85 35, -85 35, -85 3... TRUE
## 5 13 01705317 0400000US13 13 GA Georgia 00 149486268417 4418716153 Back again (((-85.6 35, -85.5 35, -85.4 35, -85.4 35, -85.3 35, -85.3 35, -85 35, -85 35, -85 3... TRUE
## 6 55 01779806 0400000US55 55 WI Wisconsin 00 140292518676 29343193162 <NA> (((-86.9 45.4, -86.8 45.5, -86.8 45.4, -86.9 45.4, -86.9 45.3, -87 45.4, -86.9 45.4)... FALSE
## 7 41 01155107 0400000US41 41 OR Oregon 00 248630319014 6169061220 <NA> (((-125 42.8, -124 43, -124 43, -124 43.1, -124 43.2, -124 43.2, -124 43.3, -124 43.... FALSE
## 8 29 01779791 0400000US29 29 MO Missouri 00 178052253239 2487526202 Back again (((-95.8 40.6, -95.5 40.6, -95.4 40.6, -95.3 40.6, -95.2 40.6, -95.1 40.6, -94.9 40.... TRUE
## 9 51 01779803 0400000US51 51 VA Virginia 00 102258178227 8528072639 <NA> (((-76 37.3, -76 37.4, -76 37.4, -75.9 37.5, -75.9 37.6, -75.9 37.6, -75.9 37.7, -75... FALSE
## 10 47 01325873 0400000US47 47 TN Tennessee 00 106792368794 2322190840 Back again (((-90.3 35, -90.3 35, -90.2 35.1, -90.2 35.1, -90.2 35.1, -90.1 35.1, -90.1 35.2, -... TRUE
## # ℹ 41 more rows
Let’s see how it works by adding a new geom_sf()
layer for the highlighted states. We need to double up here with both lower_48
and lower_48_highlighted_better
because if we only plot the highlighted states, each facet will only contain those states and not the underlying US map, which we don’t want.
ggplot() +
geom_sf(data = lower_48) +
geom_sf(data = na.omit(lower_48_highlighted_better), aes(fill = visited)) +
geom_sf(data = routes_geocoded, color = clrs[1]) +
geom_sf(data = all_stops_unique) +
annotation_scale(
location = "bl", bar_cols = c("grey30", "white"),
unit_category = "imperial", text_family = "Overpass"
) +
scale_fill_manual(values = c("grey80"), guide = "none") +
coord_sf(crs = st_crs("ESRI:102003")) + # Albers
theme_roadtrip() +
facet_wrap(vars(direction), ncol = 1)
![Each facet now only shows the states corresponding to each direction](https://www.andrewheiss.com/blog/2023/06/01/geocoding-routing-openstreetmap-r/index_files/figure-html/plot-facet-subset-issues-fixed1-1.png)
There, we fixed issue #1. ✅
Issue #2—the missing endpoint in each facet—is a little trickier to deal with, but still doable. Before messing with the real data, let’s look at a simpler toy example first. Here’s a little dataset with three different “routes” between cities. The coordinates here aren’t actually coordinates—they’re just some arbitrary numbers. But that’s okay—this basically mirrors what we have in routes_geocoded
Fixing missing destination cities
example <- tribble(
~day, ~origin_city, ~destination_city, ~origin_coords, ~destination_coords,
1, "Atlanta", "Boston", 1234, 5678,
2, "Boston", "Chicago", 5678, 91011,
3, "Chicago", "Denver", 91011, 131415
)
example
## # A tibble: 3 × 5
## day origin_city destination_city origin_coords destination_coords
## <dbl> <chr> <chr> <dbl> <dbl>
## 1 1 Atlanta Boston 1234 5678
## 2 2 Boston Chicago 5678 91011
## 3 3 Chicago Denver 91011 131415
The reason we’re not getting the final point in each facet or subset is because there are only three rows here, but four cities. If we plotted the origin_coords
column, Atlanta would be missing; if we plotted the destination_coords
column, Denver would be missing. We need to manipulate this data so that it has 4 rows (Atlanta, Boston, Chicago, Denver), with the correct coordinates for each city.
There’s an easy way to do this with pivot_longer()
from {tidyr}. We can pivot with multiple columns that follow a pattern in their names. Here, all four of the columns we care about are are prefixed destination
or origin
, followed by a _
. If we specify these name settings in pivot_longer()
, it’ll automatically create a column named type
for destination and origin and it’ll put the rest of the data in corresponding columns:
example %>%
pivot_longer(
cols = c(origin_city, destination_city, origin_coords, destination_coords),
names_to = c("type", ".value"),
names_sep = "_"
)
## # A tibble: 6 × 4
## day type city coords
## <dbl> <chr> <chr> <dbl>
## 1 1 origin Atlanta 1234
## 2 1 destination Boston 5678
## 3 2 origin Boston 5678
## 4 2 destination Chicago 91011
## 5 3 origin Chicago 91011
## 6 3 destination Denver 131415
Now we have one row per city, which is close to what we need. Boston and Chicago are duplicated (since they’re both destination and origin cities), so we need to filter out duplicate cities. There are lots of ways to do this—my preferred way is to group by city and keep the first row in each group using slice()
:
example %>%
pivot_longer(
cols = c(origin_city, destination_city, origin_coords, destination_coords),
names_to = c("type", ".value"),
names_sep = "_"
) %>%
group_by(city) %>%
slice(1)
## # A tibble: 4 × 4
## # Groups: city [4]
## day type city coords
## <dbl> <chr> <chr> <dbl>
## 1 1 origin Atlanta 1234
## 2 1 destination Boston 5678
## 3 2 destination Chicago 91011
## 4 3 destination Denver 131415
Woohoo! A dataset with four rows and the correct coordinates for each city. If we could plot this, we’d get both endpoints (Atlanta and Denver).
This same double pivot approach (magically!!) works with the special geometry columns in our actual routes data:
all_stops_endpoints <- routes_geocoded %>%
pivot_longer(
cols = c(origin_city, destination_city, origin_geometry, destination_geometry),
names_to = c("type", ".value"),
names_sep = "_"
) %>%
group_by(direction, city) %>%
slice(1) %>%
arrange(direction, day, desc(type)) %>%
# Use "geometry" as the default geometry column instead of the routes
st_set_geometry("geometry")
all_stops_endpoints
## Simple feature collection with 15 features and 11 fields
## Active geometry column: geometry
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -112 ymin: 29.4 xmax: -84.4 ymax: 44.6
## Geodetic CRS: WGS 84
## # A tibble: 15 × 13
## # Groups: direction, city [15]
## direction day route_src route_dst route_duration route_distance route_geometry distance_miles distance_text duration_text type city geometry
## <fct> <dbl> <chr> <chr> <dbl> <dbl> <LINESTRING [°]> <dbl> <chr> <chr> <chr> <chr> <POINT [°]>
## 1 There 1 src dst 513. 753. (-84.4 33.7, -84.5 33.6, -84.8 33.4, -84.9 33.1, -85.1 32.9, -85.4 32.6, -85.7 32.5, -... 468. 470 miles 8 hours 30 minutes origin Atlanta, Georgia (-84.4 33.7)
## 2 There 1 src dst 513. 753. (-84.4 33.7, -84.5 33.6, -84.8 33.4, -84.9 33.1, -85.1 32.9, -85.4 32.6, -85.7 32.5, -... 468. 470 miles 8 hours 30 minutes destination New Orleans, Louisiana (-90.1 30)
## 3 There 2 src dst 596. 870. (-90.1 30, -90.3 30, -90.5 30.1, -90.9 30.2, -91.1 30.4, -91.3 30.5, -92.1 30.2, -93.2... 541. 540 miles 10 hours 0 minutes destination San Antonio, Texas (-98.5 29.4)
## 4 There 3 src dst 458. 727. (-98.5 29.4, -98.7 29.7, -98.9 30, -99.5 30.3, -99.8 30.5, -99.9 30.5, -100 30.4, -101... 452. 450 miles 7 hours 45 minutes destination Carlsbad, New Mexico (-104 32.4)
## 5 There 4 src dst 749. 1100. (-104 32.4, -104 32.5, -104 32.6, -104 32.9, -104 32.9, -104 33.3, -105 33.4, -105 33.... 684. 680 miles 12 hours 30 minutes destination Grand Canyon NP, Arizona (-112 36.1)
## 6 There 5 src dst 510. 628. (-112 36.1, -112 36, -112 36, -112 36, -112 36, -112 35.9, -112 35.9, -112 35.9, -111 ... 390. 390 miles 8 hours 30 minutes destination Grover, Utah (-111 38.2)
## 7 There 6 src dst 197. 273. (-111 38.2, -111 38.2, -111 38.3, -112 38.3, -112 38.4, -112 38.4, -112 38.4, -112 38.... 169. 170 miles 3 hours 15 minutes destination Spanish Fork, Utah (-112 40.1)
## 8 Back again 1 src dst 262. 411. (-112 40.1, -112 40.2, -112 40.3, -112 40.5, -112 41, -112 41.1, -112 41.2, -112 41.6,... 256. 260 miles 4 hours 15 minutes origin Spanish Fork, Utah (-112 40.1)
## 9 Back again 1 src dst 262. 411. (-112 40.1, -112 40.2, -112 40.3, -112 40.5, -112 41, -112 41.1, -112 41.2, -112 41.6,... 256. 260 miles 4 hours 15 minutes destination Shelley, Idaho (-112 43.4)
## 10 Back again 1 src dst 199. 240. (-112 43.4, -112 43.8, -112 43.9, -112 43.9, -112 44, -112 44, -111 44.1, -111 44.2, -... 149. 150 miles 3 hours 15 minutes destination Yellowstone NP, Wyoming (-111 44.5)
## 11 Back again 2 src dst 552. 694. (-111 44.5, -111 44.4, -111 44.5, -111 44.5, -110 44.5, -110 44.6, -110 44.5, -110 44.... 431. 430 miles 9 hours 15 minutes destination Devil's Tower, Wyoming (-105 44.6)
## 12 Back again 2 src dst 161. 215. (-105 44.6, -105 44.6, -105 44.6, -105 44.6, -105 44.5, -105 44.5, -105 44.5, -105 44.... 133. 130 miles 2 hours 45 minutes destination Mount Rushmore, South Dakota (-103 43.9)
## 13 Back again 3 src dst 548. 866. (-103 43.9, -103 43.9, -103 44, -103 44, -103 44.1, -103 44.1, -103 44.1, -102 44.1, -... 538. 540 miles 9 hours 15 minutes destination Albert Lea, Minnesota (-93.4 43.6)
## 14 Back again 4 src dst 359. 487. (-93.4 43.6, -93.3 43.1, -92.7 43.1, -92.7 43, -92.7 43, -92.6 43, -92.5 42.7, -92.5 4... 303. 300 miles 6 hours 0 minutes destination Nauvoo, Illinois (-91.4 40.6)
## 15 Back again 5 src dst 847. 1199. (-91.4 40.6, -91.4 40.4, -91.6 40.4, -91.5 39.7, -91.4 39.7, -91.4 39.6, -91 39.2, -90... 745. 750 miles 14 hours 0 minutes destination Atlanta, Georgia (-84.4 33.7)
And plotted:
ggplot() +
geom_sf(data = lower_48) +
geom_sf(data = na.omit(lower_48_highlighted_better), aes(fill = visited)) +
geom_sf(data = routes_geocoded, color = clrs[1]) +
geom_sf(data = all_stops_endpoints) +
geom_label_repel(
data = all_stops_endpoints,
aes(label = city, geometry = geometry),
stat = "sf_coordinates", seed = 1234,
size = 3, segment.color = clrs[3], min.segment.length = 0
) +
annotation_scale(
location = "bl", bar_cols = c("grey30", "white"),
unit_category = "imperial", text_family = "Overpass"
) +
scale_fill_manual(values = c("grey80"), guide = "none") +
coord_sf(crs = st_crs("ESRI:102003")) + # Albers
theme_roadtrip() +
facet_wrap(vars(direction), ncol = 1)
![Each facet now shows the destination city](https://www.andrewheiss.com/blog/2023/06/01/geocoding-routing-openstreetmap-r/index_files/figure-html/plot-facet-subset-issues-fixed2-1.png)
Issue #2 fixed. ✅
Zooming in
We can also make separate maps that are zoomed in on each day of the trip in addition these bigger overarching ones. This will allow us to add more details like driving time and distances without getting too crowded on the map.
We’ll create a mini map of the first day by itself, and then scale up the process to make all the mini maps programmatically.
One day of the trip
First we’ll extract the route, cities, and states for the first day of the trip out there:
route_day1 <- routes_geocoded %>%
filter(direction == "There", day == 1)
stops_day1 <- all_stops_endpoints %>%
filter(direction == "There", day == 1)
states_crossed_day1 <- st_intersection(
st_transform(lower_48, st_crs(route_day1)),
route_day1
) %>%
distinct(NAME)
lower_48_highlighted_day1 <- lower_48 %>%
mutate(visited = NAME %in% states_crossed_day1$NAME)
Let’s plot these to make sure it worked:
ggplot() +
geom_sf(data = lower_48_highlighted_day1, aes(fill = visited)) +
geom_sf(data = route_day1, color = clrs[1]) +
geom_sf(data = stops_day1) +
geom_sf_label(
data = stops_day1,
aes(label = city),
nudge_y = miles_to_meters(80)
) +
scale_fill_manual(values = c("grey90", "grey80"), guide = "none") +
coord_sf(crs = st_crs("ESRI:102003")) + # Albers
theme_roadtrip()
![Drive for the first day on the full US map](https://www.andrewheiss.com/blog/2023/06/01/geocoding-routing-openstreetmap-r/index_files/figure-html/plot-day1-preliminary-1.png)
Yep, it worked. But we don’t need to show the whole map—we want to zoom in on just the area around the route for the day. To do this, we can use st_bbox()
to extract a rectangle of coordinates around the route, creating a bounding box for the map that we can then use with coord_sf()
to zoom in.
st_bbox()
returns a named vector of numbers with the x- and y-axis limits:
bbox <- st_bbox(route_day1)
bbox
## xmin ymin xmax ymax
## -90.1 30.0 -84.4 33.7
Since the elements are all named, we can use them to specify the different values in coord_sf()
. First we’ll temporarily stop using the Albers projection, since the numbers we’ve extracted with st_bbox()
are on the WGS 84 (−180 to 180) scale and Albers uses meters, which will make the map not work. (But we’ll fix that in a minute!)
ggplot() +
geom_sf(data = lower_48_highlighted_day1, aes(fill = visited)) +
geom_sf(data = route_day1, color = clrs[1]) +
geom_sf(data = stops_day1) +
geom_sf_label(
data = stops_day1,
aes(label = city),
# We're not using Albers, so this isn't measured in meters; it's decimal degrees
nudge_y = 0.15
) +
annotation_scale(
location = "bl", bar_cols = c("grey30", "white"),
unit_category = "imperial", text_family = "Overpass"
) +
scale_fill_manual(values = c("grey90", "grey80"), guide = "none") +
coord_sf(
xlim = c(bbox["xmin"], bbox["xmax"]),
ylim = c(bbox["ymin"], bbox["ymax"])
) +
theme_roadtrip()
![Trip for the first day, but zoomed in a little too far](https://www.andrewheiss.com/blog/2023/06/01/geocoding-routing-openstreetmap-r/index_files/figure-html/plot-day1-bbox-basic-1.png)
We’re zoomed in (yay) but the edges of the bounding box window are too close to the points and the route, and the labels are cropped funny. Also, to shift the labels up we had to think in decimal degrees instead of meters, which I can’t do naturally. And also, we can’t change projections—because we’re specifying the bounding box coordinates in decimal degrees, we’re stuck with WGS 84 instead of Albers or any other projection. These are all fixable issues, fortunately.
To fix all this, we’ll expand the bounding box window by 150 miles on all sides using st_buffer()
and then convert the coordinates to Albers before extracting the coordinates of the window for plotting:
bbox_nice <- route_day1 %>%
st_bbox() %>% # Extract the bounding box of the coordinates
st_as_sfc() %>% # Convert the bounding box matrix back to an sf object
st_buffer(miles_to_meters(150)) %>% # Add 150 miles to all sides
st_transform("ESRI:102003") %>% # Switch to Albers
st_bbox() # Extract the bounding box of the expanded box
bbox_nice
## xmin ymin xmax ymax
## 301851 -1071395 1366321 -105313
These are no longer in decimal degrees, so we can use them with the Albers projection. We’ve also added a 150-mile buffer around the window, so we should have room for all the labels now. Let’s check it:
ggplot() +
geom_sf(data = lower_48_highlighted_day1, aes(fill = visited)) +
geom_sf(data = route_day1, color = clrs[1]) +
geom_sf(data = stops_day1) +
geom_sf_label(
data = stops_day1,
aes(label = city),
# We're in Albers again, so we can work with meters (or miles)
nudge_y = miles_to_meters(30)
) +
annotation_scale(
location = "bl", bar_cols = c("grey30", "white"),
unit_category = "imperial", text_family = "Overpass",
width_hint = 0.4
) +
scale_fill_manual(values = c("grey90", "grey80"), guide = "none") +
coord_sf(
xlim = c(bbox_nice["xmin"], bbox_nice["xmax"]),
ylim = c(bbox_nice["ymin"], bbox_nice["ymax"]),
crs = st_crs("ESRI:102003")
) +
theme_roadtrip()
![Trip for the first day, correctly zoomed and using the Albers projection](https://www.andrewheiss.com/blog/2023/06/01/geocoding-routing-openstreetmap-r/index_files/figure-html/plot-day1-bbox-buffer-albers-1.png)
Heck yeah.
However, the shapes look a little curved and distorted because we’re zoomed in so much. Albers is great for big countries, but on a small scale like this, WGS 84 is probably fine. That’s what Google Maps does—it only starts getting weird and curvy along horizontal latitude lines when you zoom out really far. Let’s do the first day map one last time without the Albers conversion:
bbox_nice <- route_day1 %>%
st_bbox() %>% # Extract the bounding box of the coordinates
st_as_sfc() %>% # Convert the bounding box matrix back to an sf object
st_buffer(miles_to_meters(150)) %>% # Add 150 miles to all sides
st_bbox() # Extract the bounding box of the expanded box
ggplot() +
geom_sf(data = lower_48_highlighted_day1, aes(fill = visited)) +
geom_sf(data = route_day1, color = clrs[1]) +
geom_sf(data = stops_day1) +
geom_sf_label(
data = stops_day1,
aes(label = city),
# We're in WGS 84, so these are decimal degrees
nudge_y = 0.5
) +
geom_label_repel(
data = route_day1,
aes(label = distance_text, geometry = route_geometry),
stat = "sf_coordinates", seed = 12345,
family = "Overpass ExtraBold", fontface = "plain",
size = 3.5, fill = colorspace::lighten(clrs[5], 0.8), color = clrs[1],
segment.color = "grey50", min.segment.length = 0
) +
annotation_scale(
location = "bl", bar_cols = c("grey30", "white"),
unit_category = "imperial", text_family = "Overpass",
width_hint = 0.3
) +
scale_fill_manual(values = c("grey90", "grey80"), guide = "none") +
coord_sf(
xlim = c(bbox_nice["xmin"], bbox_nice["xmax"]),
ylim = c(bbox_nice["ymin"], bbox_nice["ymax"]),
crs = st_crs("EPSG:4326")
) +
theme_roadtrip()
![Trip for the first day, correctly zoomed and using the WGS 84 projection](https://www.andrewheiss.com/blog/2023/06/01/geocoding-routing-openstreetmap-r/index_files/figure-html/plot-day1-bbox-buffer-wgs84-1.png)
Neat. The northern borders of Georgia, Alabama, and Mississippi are all flat here, which is great.
All the days of the trip
We successfully plotted one day of the trip. But we’ll be driving for 11 days (!) and need 11 plots. I don’t want to copy/paste this all code 11 times, so we’ll stick each major step into a function:
# Take a set of routes and do some pivoting to create a dataset that includes
# the start and end points
expand_endpoints <- function(routes) {
routes %>%
pivot_longer(
cols = c(origin_city, destination_city, origin_geometry, destination_geometry),
names_to = c("type", ".value"),
names_sep = "_"
) %>%
group_by(city) %>%
slice(1) %>%
arrange(desc(type)) %>%
st_set_geometry("geometry")
}
# Take a set of routes and return a dataset with a flag marking which states they cross
find_states <- function(routes) {
# st_intersection() complains about innocuous things so suppress the warnings
suppressWarnings({
states_crossed <- st_intersection(
st_transform(lower_48, st_crs(routes)),
routes
) %>%
distinct(NAME)
})
lower_48 %>%
mutate(visited = NAME %in% states_crossed$NAME)
}
# Use routes, stopes, and states to build a plot
build_day_map <- function(routes, stops, states, direction, day) {
bbox_nice <- routes %>%
st_bbox() %>% # Extract the bounding box of the coordinates
st_as_sfc() %>% # Convert the bounding box matrix back to an sf object
st_buffer(miles_to_meters(150)) %>% # Add 150 miles to all sides
st_bbox() # Extract the bounding box of the expanded box
ggplot() +
geom_sf(data = states, aes(fill = visited)) +
geom_sf(data = routes, color = clrs[1]) +
geom_sf(data = stops) +
geom_sf_label(
data = stops,
aes(label = city),
# We're in WGS 84, so these are decimal degrees
nudge_y = 0.4
) +
geom_label_repel(
data = routes,
aes(label = distance_text, geometry = route_geometry),
stat = "sf_coordinates", seed = 12345,
family = "Overpass ExtraBold", fontface = "plain",
size = 3.5, fill = colorspace::lighten(clrs[5], 0.8), color = clrs[1],
segment.color = "grey50", min.segment.length = 0
) +
annotation_scale(
location = "bl", bar_cols = c("grey30", "white"),
unit_category = "imperial", text_family = "Overpass",
width_hint = 0.4
) +
scale_fill_manual(values = c("grey90", "grey80"), guide = "none") +
coord_sf(
xlim = c(bbox_nice["xmin"], bbox_nice["xmax"]),
ylim = c(bbox_nice["ymin"], bbox_nice["ymax"]),
crs = st_crs("EPSG:4326")
) +
labs(title = glue("{direction}, day {day}")) +
theme_roadtrip()
}
With everything functionalized, we can do some super wild {purrr} magic. If we take routes_geocoded
and group it by direction and day (so there’s a group per driving day), we’ll get a list column that contains the geocded coordinates for each day:
daily_plots <- routes_geocoded %>%
group_by(direction, day) %>%
nest(.key = "routes")
daily_plots
## # A tibble: 11 × 3
## # Groups: direction, day [11]
## direction day routes
## <fct> <dbl> <list>
## 1 There 1 <sf [1 × 12]>
## 2 There 2 <sf [1 × 12]>
## 3 There 3 <sf [1 × 12]>
## 4 There 4 <sf [1 × 12]>
## 5 There 5 <sf [1 × 12]>
## 6 There 6 <sf [1 × 12]>
## 7 Back again 1 <sf [2 × 12]>
## 8 Back again 2 <sf [2 × 12]>
## 9 Back again 3 <sf [1 × 12]>
## 10 Back again 4 <sf [1 × 12]>
## 11 Back again 5 <sf [1 × 12]>
# Check the first day
daily_plots$routes[[1]]
## Simple feature collection with 1 feature and 9 fields
## Active geometry column: route_geometry
## Geometry type: LINESTRING
## Dimension: XY
## Bounding box: xmin: -90.1 ymin: 30 xmax: -84.4 ymax: 33.7
## Geodetic CRS: WGS 84
## # A tibble: 1 × 12
## origin_city origin_geometry destination_geometry destination_city route_src route_dst route_duration route_distance route_geometry distance_miles distance_text duration_text
## <chr> <POINT [°]> <POINT [°]> <chr> <chr> <chr> <dbl> <dbl> <LINESTRING [°]> <dbl> <chr> <chr>
## 1 Atlanta, Georgia (-84.4 33.7) (-90.1 30) New Orleans, Louisiana src dst 513. 753. (-84.4 33.7, -84.5 33.6, -84.8 33.4, -84.9 33.1, -85.1 32.9, -85.4 32.6, -85.7 32.5, -... 468. 470 miles 8 hours 30 minutes
# Check the first day of the return trip
daily_plots$routes[[7]]
## Simple feature collection with 2 features and 9 fields
## Active geometry column: route_geometry
## Geometry type: LINESTRING
## Dimension: XY
## Bounding box: xmin: -112 ymin: 40.1 xmax: -111 ymax: 44.7
## Geodetic CRS: WGS 84
## # A tibble: 2 × 12
## origin_city origin_geometry destination_geometry destination_city route_src route_dst route_duration route_distance route_geometry distance_miles distance_text duration_text
## <chr> <POINT [°]> <POINT [°]> <chr> <chr> <chr> <dbl> <dbl> <LINESTRING [°]> <dbl> <chr> <chr>
## 1 Spanish Fork, Utah (-112 40.1) (-112 43.4) Shelley, Idaho src dst 262. 411. (-112 40.1, -112 40.2, -112 40.3, -112 40.5, -112 41, -112 41.1, -112 41.2, -112 41.6,... 256. 260 miles 4 hours 15 minutes
## 2 Shelley, Idaho (-112 43.4) (-111 44.5) Yellowstone NP, Wyoming src dst 199. 240. (-112 43.4, -112 43.8, -112 43.9, -112 43.9, -112 44, -112 44, -111 44.1, -111 44.2, -... 149. 150 miles 3 hours 15 minutes
We can then use purrr::map()
and purrr::pmap()
to feed that nested list-column data frame through the different functions to expand endpoints, find crossed-through states, and build the daily plot.
Before looking at the plots, check out all these nested datasets! Each day has a dataset for the routes, stops, states, and final plot, and everything is contained here in a data frame. MAGICAL.
daily_plots
## # A tibble: 11 × 6
## # Groups: direction, day [11]
## direction day routes stops states plot
## <fct> <dbl> <list> <list> <list> <list>
## 1 There 1 <sf [1 × 12]> <sf [2 × 11]> <sf [49 × 11]> <gg>
## 2 There 2 <sf [1 × 12]> <sf [2 × 11]> <sf [49 × 11]> <gg>
## 3 There 3 <sf [1 × 12]> <sf [2 × 11]> <sf [49 × 11]> <gg>
## 4 There 4 <sf [1 × 12]> <sf [2 × 11]> <sf [49 × 11]> <gg>
## 5 There 5 <sf [1 × 12]> <sf [2 × 11]> <sf [49 × 11]> <gg>
## 6 There 6 <sf [1 × 12]> <sf [2 × 11]> <sf [49 × 11]> <gg>
## 7 Back again 1 <sf [2 × 12]> <sf [3 × 11]> <sf [49 × 11]> <gg>
## 8 Back again 2 <sf [2 × 12]> <sf [3 × 11]> <sf [49 × 11]> <gg>
## 9 Back again 3 <sf [1 × 12]> <sf [2 × 11]> <sf [49 × 11]> <gg>
## 10 Back again 4 <sf [1 × 12]> <sf [2 × 11]> <sf [49 × 11]> <gg>
## 11 Back again 5 <sf [1 × 12]> <sf [2 × 11]> <sf [49 × 11]> <gg>
We can extract the plot for any single day with indexing:
daily_plots$plot[[10]]
![Trip for the day 4 of the return trip](https://www.andrewheiss.com/blog/2023/06/01/geocoding-routing-openstreetmap-r/index_files/figure-html/plot-day10-indexed-1.png)
Or we can filter, pull, and pluck like good denizens of the tidyverse:
![Trip for the day 5 of the return trip](https://www.andrewheiss.com/blog/2023/06/01/geocoding-routing-openstreetmap-r/index_files/figure-html/plot-day5-verbose-1.png)
Here’s all of them simultaneously inside a Quarto tabset, just for fun:
Or, instead of extracting each plot individually like this, we can use wrap_plots()
from {patchwork} to combine a whole list of plots automatically, though we have a lot less control over the plots this way—some of the maps use a landscape orientation and some use a portrait orientation, so they’re a big mess when combined like this:
![Ugly massive plot of all the days of the trip](https://www.andrewheiss.com/blog/2023/06/01/geocoding-routing-openstreetmap-r/index_files/figure-html/patchwork-everything-ugly-1.png)
Time to add some maps to road trip journals and finish packing!
Creative Commons CC BY-SA 4.0
4AA2 FA83 A8B2 05A4 E30F
610D 1382 6216 9178 36AB
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK