3. Vector data

3.1 Introduction

Package sp is the central package supporting spatial data analysis in R. sp defines a set of classes to represent spatial data. A class defines a particular data type. The data.frame is an example of a class. Any particular data.frame you create is an object (instantiation) of that class.

The main reason for defining classes is to create a standard representation of a particular data type to make it easier to write functions (also known as ‘methods’) for them. In fact, the sp package does not provide many functions to modify or analyse spatial data; but the classes it defines are used in more than 100 other R packages that provide specific functionality. See Hadley Wickham’s Advanced R or John Chambers’ Software for data analysis for a detailed discussion of the use of classes in R).

Package sp introduces a number of classes with names that start with Spatial. For vector data, the basic types are the SpatialPoints, SpatialLines, and SpatialPolygons. These classes only represent geometries. To also store attributes, classes are available with these names plus DataFrame, for example, SpatialPolygonsDataFrame and SpatialPointsDataFrame. When referring to any object with a name that starts with Spatial, it is common to write Spatial*. When referring to a SpatialPolygons or SpatialPolygonsDataFrame object it is common to write SpatialPolygons*. The Spatial classes (and their use) are described in detail by Bivand, Pebesma and Gómez-Rubio.

It is possible to create Spatial* objects from scratch with R code. That can be very useful to create small self contained example to illustrate something, for example to ask a question about how to do a particular operation without needing to give access to the real data you are using (which is always cumbersom). But in real life you will read these from a file or database, for example from a shapefile see Chapter 5.

To get started, let’s make some Spatial objects from scratch anyway, using the same data as were used in the previous chapter.

3.2 SpatialPoints

longitude <- c(-116.7, -120.4, -116.7, -113.5, -115.5, -120.8, -119.5, -113.7, -113.7, -110.7)
latitude <- c(45.3, 42.6, 38.9, 42.1, 35.7, 38.9, 36.2, 39, 41.6, 36.9)
lonlat <- cbind(longitude, latitude)

Now create a SpatialPoints object

library(sp)
pts <- SpatialPoints(lonlat)

Let’s check what kind of object pts is.

class (pts)
## [1] "SpatialPoints"
## attr(,"package")
## [1] "sp"

And what is inside of it

showDefault(pts)
## An object of class "SpatialPoints"
## Slot "coords":
##       longitude latitude
##  [1,]    -116.7     45.3
##  [2,]    -120.4     42.6
##  [3,]    -116.7     38.9
##  [4,]    -113.5     42.1
##  [5,]    -115.5     35.7
##  [6,]    -120.8     38.9
##  [7,]    -119.5     36.2
##  [8,]    -113.7     39.0
##  [9,]    -113.7     41.6
## [10,]    -110.7     36.9
##
## Slot "bbox":
##              min    max
## longitude -120.8 -110.7
## latitude    35.7   45.3
##
## Slot "proj4string":
## CRS arguments: NA

So we see that the object has the coordinates we supplied, but also a bbox. This is a ‘bounding box’, or the ‘spatial extent’ that was computed from the coordinates. There is also a “proj4string”. This stores the coordinate reference system (“crs”, discussed in more detail later). We did not provide the crs so it is unknown (NA). That is not good, so let’s recreate the object, and now provide a crs.

crdref <- CRS('+proj=longlat +datum=WGS84')
pts <- SpatialPoints(lonlat, proj4string=crdref)

I load to raster package to improve how Spatial objects are printed.

library(raster)
pts
## class       : SpatialPoints
## features    : 10
## extent      : -120.8, -110.7, 35.7, 45.3  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0

We can use the SpatialPoints object to create a SpatialPointsDataFrame object. First we need a data.frame with the same number of rows as there are geometries.

df <- data.frame(ID=1:nrow(lonlat), precip=(latitude-30)^3)

Combine the SpatialPoints with the data.frame.

ptsdf <- SpatialPointsDataFrame(pts, data=df)
ptsdf
## class       : SpatialPointsDataFrame
## features    : 10
## extent      : -120.8, -110.7, 35.7, 45.3  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
## variables   : 2
## names       : ID,   precip
## min values  :  1,  185.193
## max values  : 10, 3581.577

To see what is inside:

str(ptsdf)
## Formal class 'SpatialPointsDataFrame' [package "sp"] with 5 slots
##   ..@ data       :'data.frame':  10 obs. of  2 variables:
##   .. ..$ ID    : int [1:10] 1 2 3 4 5 6 7 8 9 10
##   .. ..$ precip: num [1:10] 3582 2000 705 1772 185 ...
##   ..@ coords.nrs : num(0)
##   ..@ coords     : num [1:10, 1:2] -117 -120 -117 -114 -116 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : NULL
##   .. .. ..$ : chr [1:2] "longitude" "latitude"
##   ..@ bbox       : num [1:2, 1:2] -120.8 35.7 -110.7 45.3
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : chr [1:2] "longitude" "latitude"
##   .. .. ..$ : chr [1:2] "min" "max"
##   ..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot
##   .. .. ..@ projargs: chr "+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"

Or

showDefault(ptsdf)
## An object of class "SpatialPointsDataFrame"
## Slot "data":
##    ID   precip
## 1   1 3581.577
## 2   2 2000.376
## 3   3  704.969
## 4   4 1771.561
## 5   5  185.193
## 6   6  704.969
## 7   7  238.328
## 8   8  729.000
## 9   9 1560.896
## 10 10  328.509
##
## Slot "coords.nrs":
## numeric(0)
##
## Slot "coords":
##       longitude latitude
##  [1,]    -116.7     45.3
##  [2,]    -120.4     42.6
##  [3,]    -116.7     38.9
##  [4,]    -113.5     42.1
##  [5,]    -115.5     35.7
##  [6,]    -120.8     38.9
##  [7,]    -119.5     36.2
##  [8,]    -113.7     39.0
##  [9,]    -113.7     41.6
## [10,]    -110.7     36.9
##
## Slot "bbox":
##              min    max
## longitude -120.8 -110.7
## latitude    35.7   45.3
##
## Slot "proj4string":
## CRS arguments:
##  +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0

3.3 SpatialLines and SpatialPolygons

Making a SpatialPoints object was easy. Making a SpatialLines and SpatialPolygons object is a bit harder, but stil relatively straightforward with the spLines and spPolygons functions (from the raster package).

lon <- c(-116.8, -114.2, -112.9, -111.9, -114.2, -115.4, -117.7)
lat <- c(41.3, 42.9, 42.4, 39.8, 37.6, 38.3, 37.6)
lonlat <- cbind(lon, lat)
lns <- spLines(lonlat, crs=crdref)
lns
## class       : SpatialLines
## features    : 1
## extent      : -117.7, -111.9, 37.6, 42.9  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
pols <- spPolygons(lonlat, crs=crdref)
pols
## class       : SpatialPolygons
## features    : 1
## extent      : -117.7, -111.9, 37.6, 42.9  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0

The structure of the SpatialPolygons class is somewhat complex as it needs to accommodate the possibility of multiple polygons, each consisting of multiple sub-polygons, some of which may be “holes”.

str(pols)
## Formal class 'SpatialPolygons' [package "sp"] with 4 slots
##   ..@ polygons   :List of 1
##   .. ..$ :Formal class 'Polygons' [package "sp"] with 5 slots
##   .. .. .. ..@ Polygons :List of 1
##   .. .. .. .. ..$ :Formal class 'Polygon' [package "sp"] with 5 slots
##   .. .. .. .. .. .. ..@ labpt  : num [1:2] -114.7 40.1
##   .. .. .. .. .. .. ..@ area   : num 19.7
##   .. .. .. .. .. .. ..@ hole   : logi FALSE
##   .. .. .. .. .. .. ..@ ringDir: int 1
##   .. .. .. .. .. .. ..@ coords : num [1:8, 1:2] -117 -114 -113 -112 -114 ...
##   .. .. .. ..@ plotOrder: int 1
##   .. .. .. ..@ labpt    : num [1:2] -114.7 40.1
##   .. .. .. ..@ ID       : chr "1"
##   .. .. .. ..@ area     : num 19.7
##   ..@ plotOrder  : int 1
##   ..@ bbox       : num [1:2, 1:2] -117.7 37.6 -111.9 42.9
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : chr [1:2] "x" "y"
##   .. .. ..$ : chr [1:2] "min" "max"
##   ..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot
##   .. .. ..@ projargs: chr "+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"

Fortunately, you do not need to understand how these structures are organized. The main take home message is that they store geometries (coordinates), the name of the coordinate reference system, and attributes.

We can make use generic R function plot to make a map.

plot(pols, axes=TRUE, las=1)
plot(pols, border='blue', col='yellow', lwd=3, add=TRUE)
points(pts, col='red', pch=20, cex=3)

We’ll make more fancy maps later.