Well folks, another New Year’s resolution down the drain. I was initially shooting for a post each month in 2014. More projects came. Plates were full. Plates were emptied. More plates were filled again. I think I will just alter my resolution to 12 posts this year. That’s a fair compromise with myself, right? That’s what we Americans do. Needless to say, it will likely be a busy last week of December for me.
I’m taking a short break from the previous series to share a great data visualization platform I stumbled upon called plotly. There is even an R package that allows you to feed data directly to their site for further analysis and manipulation. Blew my mind and I had to share. Anyway, check out their site for some mesmerizing graphics and data visualization capabilities!
This post is based on a guest blog post by Matt Sundquist of plotly on Corey Chivers’ blog bayesianbiologist. I tweaked the code only slightly to accommodate my data and I added a geocoding section. Other than that, they are the masterminds.
Alright, so with the obvious boom of craft breweries here in Virginia (and well, across the country), I thought a post on two of my favorite things would be well received: geographic data visualization and booze.
First off, in order to harness the great powers of plotly, you must register at https://plot.ly/ for your own account. Next, we install the package that will allow us to connect from R to our fresh, new plotly account.
install.packages("devtools")
library("devtools")
devtools::install_github("R-api","plotly")
After loading the packages, we can log in to our plotly account straight from R by typing in our respective username and API key. To obtain your API key, log in to plot.ly via your web browser and click Profile > Edit Profile; your API key is shown there.
library(plotly)
library(maps)
p <- plotly(username="bobdole", key="abcbaseonme")
For my data set of craft brewery locations in Virginia, I queried a data set of current brewery licensees in the state from the Virginia Department of Alcoholic Beverage Control website. I then removed the 'big guys' (sorry, this bud is not for you) and aggregated the count of breweries by city/town and saved as a .csv file. Now we read in our data:
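The aggregation step described above might look something like the sketch below. The licensee records here are made up purely for illustration (they are not the Virginia ABC data), and the output path is a temporary placeholder; the only things taken from this post are the column names City and No.

```r
# Hypothetical licensee records, one row per brewery.
# Names and cities are invented for illustration only.
licensees <- data.frame(
  Brewery = c("Brewery A", "Brewery B", "Brewery C", "Brewery D"),
  City    = c("Richmond, VA", "Richmond, VA", "Norfolk, VA", "Roanoke, VA"),
  stringsAsFactors = FALSE
)

# Count the rows for each city; aggregate() returns one row per city.
counts <- aggregate(Brewery ~ City, data = licensees, FUN = length)

# Rename the columns to match the names used later in the post.
names(counts) <- c("City", "No")

# Save in the same shape that read.csv() expects (placeholder path).
out_file <- file.path(tempdir(), "breww.csv")
write.csv(counts, out_file, row.names = FALSE)
```

Reading out_file back in with read.csv() should give a data frame with one row per city and the brewery count in the No column.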
data = read.csv("C:/breww.csv", header=TRUE)
Matt's data already had location coordinates. Since mine only has the respective city/state, I need to geocode it so R will understand how to plot locations on the map. For this I am using the ever-faithful ggmap package.
We named the data set "data" when we read it in, and the column that has the city/state of each brewery is called "City". We can now batch geocode each city. The function geocode() returns an m x 2 data frame, where m is the number of rows of data (cities) and the 2 columns are the longitude (default column name is lon) and latitude (default column name is lat) of each respective city. We store that result as loc, then create two new columns in our data set and set them equal to the two columns of loc.
library(ggmap)
loc <- geocode(as.character(data$City))
data$lon <- loc$lon
data$lat <- loc$lat
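Geocoding lookups can fail for some place names, in which case the coordinates come back as NA. Here is a rough sketch of how one might check for that before plotting. To keep it runnable without a network call, loc is a hand-written stand-in for what geocode() would return, the coordinates are approximate, and "Nowhereville, VA" is a made-up city that is assumed to fail.

```r
# Stand-in for the brewery data; the last city is invented.
data <- data.frame(
  City = c("Richmond, VA", "Norfolk, VA", "Nowhereville, VA"),
  No   = c(2, 1, 1),
  stringsAsFactors = FALSE
)

# Stand-in for the geocode() result: lon first, then lat,
# with NA where the lookup is assumed to have failed.
loc <- data.frame(
  lon = c(-77.44, -76.29, NA),
  lat = c(37.54, 36.85, NA)
)

data$lon <- loc$lon
data$lat <- loc$lat

# List any cities that did not get coordinates.
failed <- data$City[is.na(data$lon) | is.na(data$lat)]
if (length(failed) > 0) {
  message("Could not geocode: ", paste(failed, collapse = "; "))
}

# Keep only the rows that can actually be placed on the map.
plot_data <- data[complete.cases(data[, c("lon", "lat")]), ]
```

Any city listed in failed can then be fixed by hand (for example, by spelling out the state) and geocoded again.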
We call the state outlines using the map() function, take its xy coordinates, and assign this as the first trace for plotting the map.
states <- map("state", plot=FALSE)  # fetch the outlines once, without drawing them locally
trace1 <- list(x=states$x, y=states$y)
We then create the second trace by extracting the longitude and latitude from our data (assigning them as the x and y coordinates, respectively). We specify that the size of the bubbles on the map is based on data$No, the column containing the number of breweries in each respective city (i.e. bigger bubble, more breweries).
trace2 <- list(x=data$lon,
               y=data$lat,
               text=data$City,
               type="scatter",
               mode="markers",
               marker=list("size"=sqrt(data$No/max(data$No))*100,
                           "opacity"=0.5))
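To see what the square root in that size formula is doing, here is a small worked example with hypothetical counts. It assumes the marker size is treated as a diameter, so a bubble's area grows with the square of its size; taking the square root first keeps the area, rather than the diameter, proportional to the number of breweries.

```r
# Hypothetical brewery counts for four cities.
No <- c(1, 4, 9, 16)

# Same formula as in trace2: the largest city gets size 100.
size <- sqrt(No / max(No)) * 100
print(size)  # 25 50 75 100

# Area scales with size squared, so relative areas match relative counts.
relative_area  <- size^2 / max(size^2)
relative_count <- No / max(No)
print(all.equal(relative_area, relative_count))  # TRUE
```

Without the square root, a city with 16 times the breweries would get a bubble with 256 times the area, which would badly exaggerate the difference.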
Finally, we combine the two traces and send our data to our plotly profile.
response <- p$plotly(trace1, trace2)
url <- response$url
filename <- response$filename
browseURL(response$url)
Like magic, running that last block of code will open your browser and load your fancy new map in the plot.ly interface, ready for you to zoom, crop, and manipulate to your heart's content!
Static shots can also be exported at very high resolutions from the plotly site:
Maps like this can easily mislead folks. They often just reflect population, not a higher propensity to consume craft beer - more people in an area (e.g. Richmond, DC, Virginia Beach) means the capacity and demand for more breweries overall. A 'craft breweries per capita' map would arguably tell a more interesting story. Thanks for reading!