Exploration¶

In this chapter we describe how to access and explore satellite remote sensing data with R. We also show how to use them to make maps.

We will primarily use a spatial subset of a Landsat 8 scene collected on June 14, 2017. The subset covers the area between Concord and Stockton, in California, USA.

All Landsat image scenes have a unique product ID and metadata. You can find the information on Landsat sensor, satellite, location on Earth (WRS path, WRS row) and acquisition date from the product ID. For example, the product identifier of the data we will use is ‘LC08_044034_20170614’. Based on this guide, you can see that the Sensor-Satellite is OLI/TIRS combined Landsat 8, WRS Path 44, WRS Row 34 and collected on June 14, 2017. Landsat scenes are most commonly delivered as zipped file, which contains separate files for each band.

We will start by exploring and visualizing the data (See the instructions in Chapter 1 for data downloading instructions if you have not already done so).

Image properties¶

Create RasterLayer objects for single Landsat layers (bands)

library(raster)
# Blue
b2 <- raster('data/rs/LC08_044034_20170614_B2.tif')
# Green
b3 <- raster('data/rs/LC08_044034_20170614_B3.tif')
# Red
b4 <- raster('data/rs/LC08_044034_20170614_B4.tif')
# Near Infrared (NIR)
b5 <- raster('data/rs/LC08_044034_20170614_B5.tif')

Print the variables to check. E.g.

b2
## class      : RasterLayer
## dimensions : 1245, 1497, 1863765  (nrow, ncol, ncell)
## resolution : 30, 30  (x, y)
## extent     : 594090, 639000, 4190190, 4227540  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=10 +datum=WGS84 +units=m +no_defs
## source     : LC08_044034_20170614_B2.tif
## names      : LC08_044034_20170614_B2
## values     : 0.0748399, 0.7177562  (min, max)

You can see the spatial resolution, extent, number of layers, coordinate reference system and more.

Image information and statistics¶

The below shows how you can access various properties from a Raster* object (this is the same for any raster data set).

# coordinate reference system (CRS)
crs(b2)
## Coordinate Reference System:
## Deprecated Proj.4 representation:
##  +proj=utm +zone=10 +datum=WGS84 +units=m +no_defs
## WKT2 2019 representation:
## PROJCRS["WGS 84 / UTM zone 10N",
##     BASEGEOGCRS["WGS 84",
##         DATUM["World Geodetic System 1984",
##             ELLIPSOID["WGS 84",6378137,298.257223563,
##                 LENGTHUNIT["metre",1]]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["degree",0.0174532925199433]],
##         ID["EPSG",4326]],
##     CONVERSION["UTM zone 10N",
##         METHOD["Transverse Mercator",
##             ID["EPSG",9807]],
##         PARAMETER["Latitude of natural origin",0,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8801]],
##         PARAMETER["Longitude of natural origin",-123,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8802]],
##         PARAMETER["Scale factor at natural origin",0.9996,
##             SCALEUNIT["unity",1],
##             ID["EPSG",8805]],
##         PARAMETER["False easting",500000,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8806]],
##         PARAMETER["False northing",0,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8807]]],
##     CS[Cartesian,2],
##         AXIS["(E)",east,
##             ORDER[1],
##             LENGTHUNIT["metre",1]],
##         AXIS["(N)",north,
##             ORDER[2],
##             LENGTHUNIT["metre",1]],
##     USAGE[
##         SCOPE["Engineering survey, topographic mapping."],
##         AREA["Between 126Â°W and 120Â°W, northern hemisphere between equator and 84Â°N, onshore and offshore. Canada - British Columbia (BC); Northwest Territories (NWT); Nunavut; Yukon. United States (USA) - Alaska (AK)."],
##         BBOX[0,-126,84,-120]],
##     ID["EPSG",32610]]
# Number of cells, rows, columns
ncell(b2)
## [1] 1863765
dim(b2)
## [1] 1245 1497    1
# spatial resolution
res(b2)
## [1] 30 30
# Number of bands
nlayers(b2)
## [1] 1
# Do the bands have the same extent, number of rows and columns, projection, resolution, and origin
compareRaster(b2,b3)
## [1] TRUE

You can create a RasterStack (an object with multiple layers) from the existing RasterLayer (single band) objects.

s <- stack(b5, b4, b3)
# Check the properties of the RasterStack
s
## class      : RasterStack
## dimensions : 1245, 1497, 1863765, 3  (nrow, ncol, ncell, nlayers)
## resolution : 30, 30  (x, y)
## extent     : 594090, 639000, 4190190, 4227540  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=10 +datum=WGS84 +units=m +no_defs
## names      : LC08_044034_20170614_B5, LC08_044034_20170614_B4, LC08_044034_20170614_B3
## min values :            0.0008457669,            0.0208406653,            0.0425921641
## max values :               1.0124315,               0.7861769,               0.6924697

You can also create the RasterStack using the filenames.

# first create a list of raster layers to use
filenames <- paste0('data/rs/LC08_044034_20170614_B', 1:11, ".tif")
filenames
##  [1] "data/rs/LC08_044034_20170614_B1.tif"
##  [2] "data/rs/LC08_044034_20170614_B2.tif"
##  [3] "data/rs/LC08_044034_20170614_B3.tif"
##  [4] "data/rs/LC08_044034_20170614_B4.tif"
##  [5] "data/rs/LC08_044034_20170614_B5.tif"
##  [6] "data/rs/LC08_044034_20170614_B6.tif"
##  [7] "data/rs/LC08_044034_20170614_B7.tif"
##  [8] "data/rs/LC08_044034_20170614_B8.tif"
##  [9] "data/rs/LC08_044034_20170614_B9.tif"
## [10] "data/rs/LC08_044034_20170614_B10.tif"
## [11] "data/rs/LC08_044034_20170614_B11.tif"
landsat <- stack(filenames)
landsat
## class      : RasterStack
## dimensions : 1245, 1497, 1863765, 11  (nrow, ncol, ncell, nlayers)
## resolution : 30, 30  (x, y)
## extent     : 594090, 639000, 4190190, 4227540  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=10 +datum=WGS84 +units=m +no_defs
## names      : LC08_044034_20170614_B1, LC08_044034_20170614_B2, LC08_044034_20170614_B3, LC08_044034_20170614_B4, LC08_044034_20170614_B5, LC08_044034_20170614_B6, LC08_044034_20170614_B7, LC08_044034_20170614_B8, LC08_044034_20170614_B9, LC08_044034_20170614_B10, LC08_044034_20170614_B11
## min values :            9.641791e-02,            7.483990e-02,            4.259216e-02,            2.084067e-02,            8.457669e-04,           -7.872183e-03,           -5.052945e-03,            3.931751e-02,           -4.337332e-04,             2.897978e+02,             2.885000e+02
## max values :              0.73462820,              0.71775615,              0.69246972,              0.78617686,              1.01243150,              1.04320455,              1.11793602,              0.82673049,              0.03547901,             322.43139648,             317.99530029

Above we created a RasterStack with 11 layers. The layers represent reflection intensity in the following wavelengths: Ultra Blue, Blue, Green, Red, Near Infrared (NIR), Shortwave Infrared (SWIR) 1, Shortwave Infrared (SWIR) 2, Panchromatic, Cirrus, Thermal Infrared (TIRS) 1, Thermal Infrared (TIRS) 2. We won’t use the last four layers and you will see how to remove those in following sections.

Single band and composite maps¶

You can plot individual layers of a RasterStack of a multi-spectral image.

par(mfrow = c(2,2))
plot(b2, main = "Blue", col = gray(0:100 / 100))
plot(b3, main = "Green", col = gray(0:100 / 100))
plot(b4, main = "Red", col = gray(0:100 / 100))
plot(b5, main = "NIR", col = gray(0:100 / 100))

Have a look at the legends of the maps created above. They can range between 0 and 1. Notice the difference in shading and range of legends between the different bands. This is because different surface features reflect the incident solar radiation differently. Each layer represent how much incident solar radiation is reflected for a particular wavelength range. For example, vegetation reflects more energy in NIR than other wavelengths and thus appears brighter. In contrast, water absorbs most of the energy in the NIR wavelength and it appears dark.

We do not gain that much information from these grey-scale plots; they are often combined into a “composite” to create more interesting plots. You can learn more about color composites in remote sensing here and also in the section below.

To make a “true (or natural) color” image, that is, something that looks like a normal photograph (vegetation in green, water blue etc), we need bands in the red, green and blue regions. For this Landsat image, band 4 (red), 3 (green), and 2 (blue) can be used. The plotRGB method can be used to combine them into a single composite. You can also supply additional arguments to plotRGB to improve the visualization (e.g. a linear stretch of the values, using strecth = "lin").

landsatRGB <- stack(b4, b3, b2)
plotRGB(landsatRGB, axes = TRUE, stretch = "lin", main = "Landsat True Color Composite")

The true-color composite reveals much more about the landscape than the earlier gray images. Another popular image visualization method in remote sensing is known “false color” image in which NIR, red, and green bands are combined. This representation is popular as it makes it easy to see the vegetation (in red).

par(mfrow = c(1,2))
plotRGB(landsatRGB, axes=TRUE, stretch="lin", main="Landsat True Color Composite")
landsatFCC <- stack(b5, b4, b3)
plotRGB(landsatFCC, axes=TRUE, stretch="lin", main="Landsat False Color Composite")

Note: Always check for package documentation (help(plotRGB)) for other arguments that can be added (like scale) to improve or modify the image.

Question 1: Use the plotRGB function with RasterStack ``landsat`` to create a true and false color composite (hint remember the position of the bands in the stack).

Subset and rename bands¶

You can select specific layers (bands) using subset function, or via indexing.

# select first 3 bands only
landsatsub1 <- subset(landsat, 1:3)
# same
landsatsub2 <- landsat[[1:3]]
# Number of bands in the original and new data
nlayers(landsat)
## [1] 11
nlayers(landsatsub1)
## [1] 3
nlayers(landsatsub2)
## [1] 3

We won’t use the last four bands in landsat. You can remove those using

landsat <- subset(landsat, 1:7)

For clarity, it is useful to set the names of the bands.

names(landsat)
## [1] "LC08_044034_20170614_B1" "LC08_044034_20170614_B2"
## [3] "LC08_044034_20170614_B3" "LC08_044034_20170614_B4"
## [5] "LC08_044034_20170614_B5" "LC08_044034_20170614_B6"
## [7] "LC08_044034_20170614_B7"
names(landsat) <- c('ultra-blue', 'blue', 'green', 'red', 'NIR', 'SWIR1', 'SWIR2')
names(landsat)
## [1] "ultra.blue" "blue"       "green"      "red"        "NIR"
## [6] "SWIR1"      "SWIR2"

Spatial subset or crop¶

Spatial subsetting can be used to limit analysis to a geographic subset of the image. Spatial subsets can be created with the crop function, using an extent object, or another spatial object from which an Extent can be extracted.

# Using extent
extent(landsat)
## class      : Extent
## xmin       : 594090
## xmax       : 639000
## ymin       : 4190190
## ymax       : 4227540
e <- extent(624387, 635752, 4200047, 4210939)
# crop landsat by the extent
landsatcrop <- crop(landsat, e)

Question 2: Interactive selection from the image is also possible. Use ``drawExtent`` and ``drawPoly`` to select an area of interest

Question 3: Use the RasterStack ``landsatcrop`` to create a true and false color composite

Saving results to disk¶

At this stage we may want to save the raster to disk using the function writeRaster. Multiple file types are supported. We will use the commonly used GeoTiff format. While the layer order is preserved, layer names are unfortunately lost in the GeoTiff format.

x <- writeRaster(landsatcrop, filename="cropped-landsat.tif", overwrite=TRUE)

Alternatively you can used the ‘raster-grd’ format.

writeRaster(landsatcrop, filename="cropped-landsat.grd", overwrite=TRUE)

An advantage of this format is that it saves the layer names. The disadvantage of this format is that not many other programs can read the data, in contrast to the GeoTiff format.

Note: Check for package documentation (help(writeRaster)) for additional helpful arguments that can be added.

Relation between bands¶

A scatterplot matrix can be helpful in exploring relationships between raster layers. This can be done with the pairs() function of the raster package.

Plot of reflection in the ultra-blue wavelength against reflection in the blue wavelength.

pairs(landsatcrop[[1:2]], main = "Ultra-blue versus Blue")

Plot of reflection in the red wavelength against reflection in the NIR wavelength.

pairs(landsatcrop[[4:5]], main = "Red versus NIR")

The first plot reveals high correlations between the blue wavelength regions. Because of the high correlation, we can just use one of the blue bands without losing much information.

This distribution of points in second plot (between NIR and red) is unique due to its triangular shape. Vegetation reflects very highly in the NIR range than red and creates the upper corner close to NIR (y) axis. Water absorbs energy from all the bands and occupies the location close to origin. The furthest corner is created due to highly reflecting surface features like bright soil or concrete.

Extract pixel values¶

Often we want to get the values of raster cells for specific geographic locations or area. The extract function is used to get raster values at the locations of other spatial data. You can use points, lines, polygons or an Extent (rectangle) object. You can also use cell numbers to extract values. When using points, extract returns the values of a Raster* object for the cells in which a set of points fall.

# load the polygons with land use land cover information
samp <- readRDS('data/rs/samples.rds')
# generate 300 point samples from the polygons
ptsamp <- spsample(samp, 300, type='regular')
# add the land cover class to the points
ptsamp$class <- over(ptsamp, samp)$class
# extract values with points
df <- extract(landsat, ptsamp)
# To see some of the reflectance values
head(df)
##      ultra.blue      blue     green       red       NIR     SWIR1     SWIR2
## [1,]  0.1410052 0.1248705 0.1113816 0.1148080 0.1793469 0.2322618 0.1928575
## [2,]  0.1395523 0.1239814 0.1177140 0.1251308 0.1995803 0.2612348 0.2238691
## [3,]  0.1372535 0.1197959 0.1036178 0.1024034 0.1818842 0.2088187 0.1686338
## [4,]  0.1277982 0.1195357 0.1204682 0.1599375 0.3001404 0.3075571 0.1824047
## [5,]  0.1289693 0.1201212 0.1211838 0.1598074 0.2862176 0.3166437 0.1924021
## [6,]  0.1480100 0.1449739 0.1573568 0.2008598 0.3047379 0.3821368 0.2652685

Spectral profiles¶

A plot of the spectrum (all bands) for pixels representing a certain earth surface features (e.g. water) is known as a spectral profile. Such profiles demonstrate the differences in spectral properties of various earth surface features and constitute the basis for image analysis. Spectral values can be extracted from any multispectral data set using extract function. In the above example, we extracted values of Landsat data for the samples. These samples include: cropland, water, fallow, built and open. First we compute the mean reflectance values for each class and each band.

ms <- aggregate(df, list(ptsamp$class), mean)
# instead of the first column, we use row names
rownames(ms) <- ms[,1]
ms <- ms[,-1]
ms
##          ultra.blue       blue      green        red        NIR      SWIR1
## built     0.1791182 0.17209275 0.17608798 0.19163815 0.24237949 0.24805444
## cropland  0.1117013 0.08931219 0.08450462 0.05233991 0.48128346 0.15164212
## fallow    0.1346741 0.11946550 0.10820704 0.11711890 0.18396671 0.23850747
## open      0.1390056 0.13789844 0.15295334 0.20732350 0.34415621 0.35790739
## water     0.1333515 0.11632029 0.09899055 0.07830379 0.04851571 0.03296793
##               SWIR2
## built    0.20686696
## cropland 0.06555377
## fallow   0.19976847
## open     0.21372804
## water    0.02668551

Now we plot the mean spectra of these features.

# Create a vector of color for the land cover classes for use in plotting
mycolor <- c('darkred', 'yellow', 'burlywood', 'cyan', 'blue')
#transform ms from a data.frame to a matrix
ms <- as.matrix(ms)
# First create an empty plot
plot(0, ylim=c(0,0.6), xlim = c(1,7), type='n', xlab="Bands", ylab = "Reflectance")
# add the different classes
for (i in 1:nrow(ms)){
  lines(ms[i,], type = "l", lwd = 3, lty = 1, col = mycolor[i])
}
# Title
title(main="Spectral Profile from Landsat", font.main = 2)
# Legend
legend("topleft", rownames(ms),
       cex=0.8, col=mycolor, lty = 1, lwd =3, bty = "n")

The spectral profile shows (dis)similarity in the reflectance of different features on the earth’s surface (or above it). ‘Water’ shows relatively low reflection in all wavelengths, and ‘built’, ‘fallow’ and ‘open’ have relatively high reflectance in the longer wavelengts.