12. Graphics

With R you can make beautiful plots. You have a lot of control over what you want your plots to look like. But all the control is via code, and this does make things pretty complicated at times.

Moreover, there are entirely different approaches to make plots. Here we only discuss scatter-plots with the “base” package. The next chapter shows other basic plot types. The chapter thereafter shows how you can make plots with additional packages levelplot and ggplot. It is very useful to learn about “base” plotting first before you get into the more complicated (and sometimes, but not always more fancy) approaches. There are many websites with cool examples.

Here we use the cars data set that comes with R. It has two variables: the speed of cars and the distances taken to stop (data recorded in the 1920s), see ?cars

Scatter plots

data(cars)
head(cars)
##   speed dist
## 1     4    2
## 2     4   10
## 3     7    4
## 4     7   22
## 5     8   16
## 6     9   10

As we only have two variables, we can simply do

plot(cars)

But to be more explicit:

plot(cars[,1], cars[,2])

And now to embellish, add axes labels.

plot(cars[,1], cars[,2], xlab='Speed of car (miles/hr)', ylab='Stopping distance (feet)')

Different symbols (pch is the symbol type, cex is the size).

plot(cars, xlab='Speed of car (miles/hr)', ylab='Stopping distances (feet)', pch=20, cex=2, col='red')

Let’s change some things about the axes. Use xlim and ylim to set the start and end of an axis. las=1 changes the orientation of the y-axis labels to horizontal.

plot(cars, xlab='Speed', ylab='Time', pch=20, cex=2, col='red', xlim = c(0,25), las=1)

Here we do not draw axes at first, and then add the lower (1) and left (2) axis, to avoid drawing the clutter from the unnecessary “upper” and “right” axis. Arguments xaxs="i" and yaxs="i" force the axis to touch at (0,0).

plot(cars, xlab='Speed', ylab='', pch=20, cex=2, col='red', xlim = c(0,27), ylim=c(0,125), axes=FALSE, xaxs="i", yaxs="i")
axis(1)
axis(2, las=1)
text(5, 100, 'Cars!', cex=2, col='blue')
par(xpd=NA)
text(-1, 133, 'Distance\n(feet)')

We can change the symbols using another variable. Let’s say we have three car brands and that we want to vary the symbol type, color, and size by brand (typically one of these changes should suffice to distinguish them!).

set.seed(0)
brands <- c('Buick', 'Chevrolet', 'Ford')
b <- sample(brands, nrow(cars), replace=TRUE)
i <- match(b, brands)
plot(cars, pch=i+1, cex=i, col=rainbow(3)[i])
j <- 1:length(brands)
legend(5, 120, brands, pch=(j+1), pt.cex=j, col=rainbow(3), cex=1.5)

The important step is the use of match, that creates for each character string a matching number that can be used to set the character type desired.

As you have seen above, plot takes many variables. Several other parameters influencing your plot, can be set with par. See ?par for details. Here I use it to create 4 subplots (mfrow=c(2,2) with non-default margins (mar=c(2,3,1.5,1.5)).

par(mfrow=c(2,2), mar=c(2,3,1.5,1.5))
for (i in 1:4) {
    plot(sample(cars[,1]), sample(cars[,2]), xlab='', ylab='', las=1)
}

Some other base plots

Consider the InsectSprays dataset

head(InsectSprays)
##   count spray
## 1    10     A
## 2     7     A
## 3    20     A
## 4    14     A
## 5    14     A
## 6    12     A
hist(InsectSprays[,1])
x <- aggregate(InsectSprays[,1,drop=F], InsectSprays[,2,drop=F], sum)
barplot(x[,2], names=x[,1], horiz=T, las=1)
boxplot(count ~ spray, data = InsectSprays, col = "lightgray")