Hadley Wickham’s ggplot2 package is a very powerful and (once you’ve got used to it) intuitive R graphics framework, based on the Grammar of Graphics, that most R users will come across at some point. One of its most useful features is facetting: splitting data up between multiple plots, in the same window (or device), based on some aspect of the data – usually a factor.
Facetting is great but, as with many aspects of ggplot2, it lacks some flexibility. The reasons for this are often sensible, but the restriction can still get in the way. We can use facets to split data, but there’s no option to vary the geoms used to plot data between facets: all facets must share the same geom, e.g. bars, or lines. This is frustrating when some aspects of your data would be better represented using a different geom to the one you’ve already chosen.
Yesterday, however, I came across a work-around for this via this Stat Bandit blog post. The work-around uses a subset argument within each geom to control which facet each geom is plotted on. I’ve included an example below, which illustrates plotting monthly counts of blog views alongside a cumulative count. My code is based on that in the aforementioned post, so do check that out too.
# Load libraries
library(ggplot2)
library(reshape)

# Load data
BLOGVIEWS = read.table("blogviews.txt", header = TRUE, sep = "\t")

# We have time series data, with one observation per month
# Convert into Date class, specifying "1" as the day of the month
BLOGVIEWS$DATE = as.Date(paste("1", BLOGVIEWS$MON, BLOGVIEWS$YEAR), format = "%d %b %Y")

# Check that our data look as we expect
str(BLOGVIEWS)

# We want to replace NAs (representing zero views) with 0
BLOGVIEWS$VIEWS[is.na(BLOGVIEWS$VIEWS)] = 0

# Next we calculate cumulative site views by month
BLOGVIEWS$CVIEWS = cumsum(BLOGVIEWS$VIEWS)

# Check the results
BLOGVIEWS

# To plot the data using facets, we need to reshape the
# data into 'long' format using melt
(BVIEWS.MELT = melt(BLOGVIEWS, id.vars = c("DATE", "MON", "YEAR")))

# Change the levels of the 'variable' factor so that our
# facets have sensible names
levels(BVIEWS.MELT$variable) = c("Monthly views", "Cumulative views")

# The first plot sets up the axes and facets, but we
# use geom_blank to draw a blank plot, which we'll add
# geoms to next
g1 = ggplot(BVIEWS.MELT, aes(DATE, value)) +
  geom_blank() +
  facet_wrap(~ variable, nrow = 2, scales = "free_y") +
  labs(x = "Year", y = "Number of views")

# Update the first plot, adding bars to display monthly counts
# The subset operation ensures that we only add to the facet
# corresponding to 'Monthly views'
g2 = g1 + geom_bar(subset = .(variable == "Monthly views"), stat = "identity")

# Do the same for the 'Cumulative views' facet. It makes
# more sense to display these data using geom_line
g3 = g2 + geom_line(subset = .(variable == "Cumulative views"), colour = "blue", size = 1)

# Finally, print the plot and save it to a .png image file
print(g3)
ggsave(filename = "g3.png", plot = g3)
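A note for anyone on a newer version of ggplot2: as far as I'm aware, the subset argument used above is no longer accepted by geoms in recent releases. What follows is a minimal sketch of the same idea, passing a subsetted data frame to each geom through its data argument instead. It is not the code from the Stat Bandit post. The data are made up to stand in for blogviews.txt, the names views, long and p are my own, and I've reshaped by hand in base R so that the sketch needs only ggplot2.

```r
library(ggplot2)

# Hypothetical data standing in for blogviews.txt: 12 months of made-up counts
views <- data.frame(
  DATE  = seq(as.Date("2012-01-01"), by = "month", length.out = 12),
  VIEWS = c(0, 5, 12, 30, 28, 41, 55, 60, 48, 72, 90, 85)
)
views$CVIEWS <- cumsum(views$VIEWS)

# Reshape to 'long' format by hand: one row per date and series
long <- rbind(
  data.frame(DATE = views$DATE, variable = "Monthly views", value = views$VIEWS),
  data.frame(DATE = views$DATE, variable = "Cumulative views", value = views$CVIEWS)
)
# Fix the factor level order so that the monthly facet comes first
long$variable <- factor(long$variable,
                        levels = c("Monthly views", "Cumulative views"))

# Each geom receives only the rows for the facet it should appear in.
# Because the subsets keep the 'variable' column, facet_wrap sends each
# layer to the matching panel.
p <- ggplot(long, aes(DATE, value)) +
  facet_wrap(~ variable, nrow = 2, scales = "free_y") +
  geom_bar(data = subset(long, variable == "Monthly views"),
           stat = "identity") +
  geom_line(data = subset(long, variable == "Cumulative views"),
            colour = "blue") +
  labs(x = "Year", y = "Number of views")

print(p)
```

The design choice is the same as before: the facets are defined once on the full data, and each layer is restricted to a single level of the facetting variable, so bars are drawn only in the first panel and the line only in the second.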