Vector data¶
Introduction¶
Package sp
is the central package supporting spatial data analysis
in R. sp
defines a set of classes to represent spatial data. A
class defines a particular data type. The data.frame
is an example
of a class. Any particular data.frame
you create is an object
(instantiation) of that class.
The main reason for defining classes is to create a standard
representation of a particular data type to make it easier to write
functions (also known as ‘methods’) for them. In fact, the sp
package does not provide many functions to modify or analyze spatial
data; but the classes it defines are used in more than 100 other R
packages that provide specific functionality. See Hadley Wickham’s
Advanced R or John Chambers’ Software for
data analysis for a
detailed discussion of the use of classes in R).
We will be using the sp
package here. Note that this package will
eventually be replaced by the newer sf
package — but sp
is still
more commonly used.
Package sp
introduces a number of classes with names that start with
Spatial
. For vector data, the basic types are the SpatialPoints
,
SpatialLines
, and SpatialPolygons
. These classes only represent
geometries. To also store attributes, classes are available with these
names plus DataFrame
, for example, SpatialPolygonsDataFrame
and
SpatialPointsDataFrame
. When referring to any object with a name
that starts with Spatial
, it is common to write Spatial*
. When
referring to a SpatialPolygons
or SpatialPolygonsDataFrame
object it is common to write SpatialPolygons*
. The Spatial
classes (and their use) are described in detail by Bivand, Pebesma and
Gómez-Rubio.
It is possible to create Spatial*
objects from scratch with R
code. It can be very useful to create small self contained example to
illustrate something, for example to ask a question about how to do a
particular operation without needing to give access to the real data you
are using (which is always cumbersome). But in real life you will read
these from a file or database, for example from a “shapefile” see
Chapter 5.
To get started, let’s make some Spatial objects from scratch anyway, using the same data as were used in the previous chapter.
SpatialPoints¶
longitude <- c(-116.7, -120.4, -116.7, -113.5, -115.5, -120.8, -119.5, -113.7, -113.7, -110.7)
latitude <- c(45.3, 42.6, 38.9, 42.1, 35.7, 38.9, 36.2, 39, 41.6, 36.9)
lonlat <- cbind(longitude, latitude)
Now create a SpatialPoints
object. First load the sp
package
from the library. If this command fails with
Error in library(sp) : there is no package called ‘sp’
, then you
need to install the package first, with install.packages("sp")
library(sp)
pts <- SpatialPoints(lonlat)
Let’s check what kind of object pts
is.
class (pts)
## [1] "SpatialPoints"
## attr(,"package")
## [1] "sp"
And what is inside of it
showDefault(pts)
## An object of class "SpatialPoints"
## Slot "coords":
## longitude latitude
## [1,] -116.7 45.3
## [2,] -120.4 42.6
## [3,] -116.7 38.9
## [4,] -113.5 42.1
## [5,] -115.5 35.7
## [6,] -120.8 38.9
## [7,] -119.5 36.2
## [8,] -113.7 39.0
## [9,] -113.7 41.6
## [10,] -110.7 36.9
##
## Slot "bbox":
## min max
## longitude -120.8 -110.7
## latitude 35.7 45.3
##
## Slot "proj4string":
## Coordinate Reference System:
## Deprecated Proj.4 representation: NA
So we see that the object has the coordinates we supplied, but also a
bbox
. This is a ‘bounding box’, or the ‘spatial extent’ that was
computed from the coordinates. There is also a “proj4string”. This
stores the coordinate reference system (“crs”, discussed in more detail
later). We did not provide the crs so it is unknown (NA
). That is
not good, so let’s recreate the object, and now provide a crs.
crdref <- CRS('+proj=longlat +datum=WGS84')
pts <- SpatialPoints(lonlat, proj4string=crdref)
I load to raster
package to improve how Spatial objects are printed.
library(raster)
pts
## class : SpatialPoints
## features : 10
## extent : -120.8, -110.7, 35.7, 45.3 (xmin, xmax, ymin, ymax)
## crs : +proj=longlat +datum=WGS84 +no_defs
We can use the SpatialPoints
object to create a
SpatialPointsDataFrame
object. First we need a data.frame
with
the same number of rows as there are geometries.
# Generate random precipitation values, same quantity as points
precipvalue <- runif(nrow(lonlat), min=0, max=100)
df <- data.frame(ID=1:nrow(lonlat), precip=precipvalue)
Combine the SpatialPoints
with the data.frame
.
ptsdf <- SpatialPointsDataFrame(pts, data=df)
ptsdf
## class : SpatialPointsDataFrame
## features : 10
## extent : -120.8, -110.7, 35.7, 45.3 (xmin, xmax, ymin, ymax)
## crs : +proj=longlat +datum=WGS84 +no_defs
## variables : 2
## names : ID, precip
## min values : 1, 6.17862704675645
## max values : 10, 99.1906094830483
To see what is inside:
str(ptsdf)
## Formal class 'SpatialPointsDataFrame' [package "sp"] with 5 slots
## ..@ data :'data.frame': 10 obs. of 2 variables:
## .. ..$ ID : int [1:10] 1 2 3 4 5 6 7 8 9 10
## .. ..$ precip: num [1:10] 6.18 20.6 17.66 68.7 38.41 ...
## ..@ coords.nrs : num(0)
## ..@ coords : num [1:10, 1:2] -117 -120 -117 -114 -116 ...
## .. ..- attr(*, "dimnames")=List of 2
## .. .. ..$ : NULL
## .. .. ..$ : chr [1:2] "longitude" "latitude"
## ..@ bbox : num [1:2, 1:2] -120.8 35.7 -110.7 45.3
## .. ..- attr(*, "dimnames")=List of 2
## .. .. ..$ : chr [1:2] "longitude" "latitude"
## .. .. ..$ : chr [1:2] "min" "max"
## ..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot
## .. .. ..@ projargs: chr "+proj=longlat +datum=WGS84 +no_defs"
## .. .. ..$ comment: chr "GEOGCRS[\"unknown\",\n DATUM[\"World Geodetic System 1984\",\n ELLIPSOID[\"WGS 84\",6378137,298.25722"| __truncated__
Or
showDefault(ptsdf)
## An object of class "SpatialPointsDataFrame"
## Slot "data":
## ID precip
## 1 1 6.178627
## 2 2 20.597457
## 3 3 17.655675
## 4 4 68.702285
## 5 5 38.410372
## 6 6 76.984142
## 7 7 49.769924
## 8 8 71.761851
## 9 9 99.190609
## 10 10 38.003518
##
## Slot "coords.nrs":
## numeric(0)
##
## Slot "coords":
## longitude latitude
## [1,] -116.7 45.3
## [2,] -120.4 42.6
## [3,] -116.7 38.9
## [4,] -113.5 42.1
## [5,] -115.5 35.7
## [6,] -120.8 38.9
## [7,] -119.5 36.2
## [8,] -113.7 39.0
## [9,] -113.7 41.6
## [10,] -110.7 36.9
##
## Slot "bbox":
## min max
## longitude -120.8 -110.7
## latitude 35.7 45.3
##
## Slot "proj4string":
## Coordinate Reference System:
## Deprecated Proj.4 representation: +proj=longlat +datum=WGS84 +no_defs
## WKT2 2019 representation:
## GEOGCRS["unknown",
## DATUM["World Geodetic System 1984",
## ELLIPSOID["WGS 84",6378137,298.257223563,
## LENGTHUNIT["metre",1]],
## ID["EPSG",6326]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8901]],
## CS[ellipsoidal,2],
## AXIS["longitude",east,
## ORDER[1],
## ANGLEUNIT["degree",0.0174532925199433,
## ID["EPSG",9122]]],
## AXIS["latitude",north,
## ORDER[2],
## ANGLEUNIT["degree",0.0174532925199433,
## ID["EPSG",9122]]]]
SpatialLines and SpatialPolygons¶
Making a SpatialPoints
object was easy. Making a SpatialLines
and SpatialPolygons
object is a bit harder, but stil relatively
straightforward with the spLines
and spPolygons
functions (from
the raster
package).
lon <- c(-116.8, -114.2, -112.9, -111.9, -114.2, -115.4, -117.7)
lat <- c(41.3, 42.9, 42.4, 39.8, 37.6, 38.3, 37.6)
lonlat <- cbind(lon, lat)
lns <- spLines(lonlat, crs=crdref)
lns
## class : SpatialLines
## features : 1
## extent : -117.7, -111.9, 37.6, 42.9 (xmin, xmax, ymin, ymax)
## crs : +proj=longlat +datum=WGS84 +no_defs
pols <- spPolygons(lonlat, crs=crdref)
pols
## class : SpatialPolygons
## features : 1
## extent : -117.7, -111.9, 37.6, 42.9 (xmin, xmax, ymin, ymax)
## crs : +proj=longlat +datum=WGS84 +no_defs
The structure of the SpatialPolygons class is somewhat complex as it needs to accommodate the possibility of multiple polygons, each consisting of multiple sub-polygons, some of which may be “holes”.
str(pols)
## Formal class 'SpatialPolygons' [package "sp"] with 4 slots
## ..@ polygons :List of 1
## .. ..$ :Formal class 'Polygons' [package "sp"] with 5 slots
## .. .. .. ..@ Polygons :List of 1
## .. .. .. .. ..$ :Formal class 'Polygon' [package "sp"] with 5 slots
## .. .. .. .. .. .. ..@ labpt : num [1:2] -114.7 40.1
## .. .. .. .. .. .. ..@ area : num 19.7
## .. .. .. .. .. .. ..@ hole : logi FALSE
## .. .. .. .. .. .. ..@ ringDir: int 1
## .. .. .. .. .. .. ..@ coords : num [1:8, 1:2] -117 -114 -113 -112 -114 ...
## .. .. .. ..@ plotOrder: int 1
## .. .. .. ..@ labpt : num [1:2] -114.7 40.1
## .. .. .. ..@ ID : chr "1"
## .. .. .. ..@ area : num 19.7
## ..@ plotOrder : int 1
## ..@ bbox : num [1:2, 1:2] -117.7 37.6 -111.9 42.9
## .. ..- attr(*, "dimnames")=List of 2
## .. .. ..$ : chr [1:2] "x" "y"
## .. .. ..$ : chr [1:2] "min" "max"
## ..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot
## .. .. ..@ projargs: chr "+proj=longlat +datum=WGS84 +no_defs"
## .. .. ..$ comment: chr "GEOGCRS[\"unknown\",\n DATUM[\"World Geodetic System 1984\",\n ELLIPSOID[\"WGS 84\",6378137,298.25722"| __truncated__
## ..$ comment: chr "FALSE"
Fortunately, you do not need to understand how these structures are organized. The main take home message is that they store geometries (coordinates), the name of the coordinate reference system, and attributes.
We can make use plot
to make a map.
plot(pols, axes=TRUE, las=1)
plot(pols, border='blue', col='yellow', lwd=3, add=TRUE)
points(pts, col='red', pch=20, cex=3)
We’ll make more fancy maps later.