Recap last session

  • Hierarchical clustering
    • Dissimilation / distance measures (dist())
    • Agglomeration (hclust())
    • Dendrogram (as.dendrogram())
  • Partitional clustering
    • K-means example (kmeans())

Today

  • GIS vector data in R
  • Overlaying vector and raster data
  • Descriptive point pattern analysis

Required packages

Today we need several packages:

library(sp)         # Managing and manipulating vector data
library(rgdal)      # Reading and writing spatial data
library(raster)     # Reading and manipulating raster data
library(mapview)    # Map visualization
library(spatialEco) # Spatial analysis and modeling utilities
library(ggplot2)    # Visualization

Vector data in R

The simple features package

‘Simple features’ or ‘simple feature access’ refers to a formal standard. R’s sp class has so far lacked a complete implementation of simple features, making conversions at times convoluted, inefficient or incomplete. The package sf tries to fill this gap, and aims at succeeding sp in the long term.

library(sf)        # Managing and manipulating vector data

Vector data: Spatial point class

The sp package provides a number of hierarchically-nested spatial classes to store and represent spatial points, polygons, and grids.

Bivand et al. (2013): Applied Spatial Data Analysis with R.

Bivand et al. (2013): Applied Spatial Data Analysis with R.

Create spatial points data.frame

We can create a spatial points data.frame from a data.frame if it contains coordinates. Lets try this with the following tree species dataset that contains the occurrence of Abies alba (silver fir) in Europe:

abies <- read.csv("data/abiesalba.csv")
head(abies)
##          long      lat
## 1 -0.01377021 48.78623
## 2 -0.01423346 42.83379
## 3 -0.02027734 42.79684
## 4 -0.02039473 43.01494
## 5 -0.02171301 48.48641
## 6 -0.02336757 42.92377
abies_sp <- SpatialPoints(abies, CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"))

Visualize spatial points data.frame

mapview(abies_sp)

Reproject spatial points

Using spTransform, we can reproject the spatial points to another projection. Here we use the projection of a raster image.

soil <- stack("data/soil_europe_laea_10x10km.tif")
soil.layernames <- read.csv("data/soil_europe_laea_10x10km_layer.csv", stringsAsFactors = F)
names(soil) <- soil.layernames$code

soil_projection <- projection(soil)
soil_projection
## [1] "+proj=laea +lat_0=52 +lon_0=10 +x_0=4321000 +y_0=3210000 +ellps=GRS80 +units=m +no_defs"
abies_sp <- spTransform(abies_sp, CRS(soil_projection))

Extract raster values from point locations

soil.df <- raster::extract(soil, abies_sp)
head(soil.df)
##             awc       bd      cec       cn       ph
## [1,] 0.09869616 1.320587 21.19460 10.91162 6.695996
## [2,] 0.12140623 1.019247 17.94526 13.70917 6.254736
## [3,] 0.12228899 1.015588 16.78173 13.67870 6.229464
## [4,] 0.11559993 1.112967 21.84254 13.19514 6.153059
## [5,] 0.10142094 1.287644 12.03737 13.18615 5.613811
## [6,] 0.12025109 1.015924 18.30894 13.88025 6.241406

Descriptive point pattern

The centroid can easily be calculated by averaging the coordinates:

library(ggplot2)
coords <- abies_sp@coords
coords <- as.data.frame(coords)

xmean <- mean(coords[, 1])
ymean <- mean(coords[, 2])

ggplot(coords, aes(x = long, y = lat)) + 
  geom_point(alpha = 0.3) +
  geom_point(aes(x = xmean, y = ymean), col = "red", shape = 2)

We can also calculate the median of the coordinates:

xmedian <- median(coords[, 1])
ymedian <- median(coords[, 2])

ggplot(coords, aes(x = long, y = lat)) + 
  geom_point(alpha = 0.3) +
  geom_point(aes(x = xmean, y = ymean), col = "red", shape = 2) +
  geom_point(aes(x = xmedian, y = ymedian), col = "blue", shape = 2)

We can calculate the average/minimum/maximum distance between points:

dist <- raster::pointDistance(abies_sp, longlat = FALSE)
dist[1:3, 1:3]
##          [,1]       [,2]       [,3]
## [1,]      0.0 661483.862 665579.331
## [2,] 661483.9      0.000   4123.066
## [3,] 665579.3   4123.066      0.000
dist <- dist[lower.tri(dist)]

c(min(dist), mean(dist), max(dist)) / 10000
## [1]   0.04996316  55.87651901 268.64010711

Point counts

We can generate a grid from the points extent.

ext <- extent(abies_sp)
ext
## class      : Extent 
## xmin       : 3185500 
## xmax       : 5621500 
## ymin       : 1690500 
## ymax       : 3669500
ras <- raster(ext)
ras
## class      : RasterLayer 
## dimensions : 10, 10, 100  (nrow, ncol, ncell)
## resolution : 243600, 197900  (x, y)
## extent     : 3185500, 5621500, 1690500, 3669500  (xmin, xmax, ymin, ymax)
## crs        : NA

And set the resolution to 5000*5000 meters while keeping the projection:

res(ras) <- c(5000, 5000)

projection(ras) <- projection(abies_sp)
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj = prefer_proj):
## Discarded datum Unknown based on GRS80 ellipsoid in Proj4 definition
ras
## class      : RasterLayer 
## dimensions : 396, 487, 192852  (nrow, ncol, ncell)
## resolution : 5000, 5000  (x, y)
## extent     : 3185500, 5620500, 1689500, 3669500  (xmin, xmax, ymin, ymax)
## crs        : +proj=laea +lat_0=52 +lon_0=10 +x_0=4321000 +y_0=3210000 +ellps=GRS80 +units=m +no_defs

Following we can use this raster to rasterize the points and count the number of points in each cell:

abies_ras <- rasterize(abies_sp, ras, fun = "count", background = 0)

mapview(abies_ras)

Kernel estimates

Kernel density estimates do not simply count the occurrences, but ‘smooth’ by fitting a function (e.g., multivariate-normal) to the data. We can calculate Kernel densities as raster:

kernel_ras <- spatialEco::sp.kde(abies_sp, bw = 10000, newdata = abies_ras, standardize = T)
mapview(kernel_ras)

Distances as raster

We can easily calculate the distances to cells with Abies. Following I set all cells containing no Abies to NA and calculate the distance:

abies_ras[abies_ras == 0] <- NA

distance_abies <- raster::distance(abies_ras)

mapview(distance_abies)

Kernel maps with ggplot2

ggplot2 can also estimate kernels and nicely plot them atop of other GIS data.

map <- map_data("world")

ggplot() + xlim (-20, 59) + ylim(35, 71) +
  geom_density2d(data = abies, aes(x = long, y = lat), size = 0.3) + 
  geom_polygon(data = map, aes(x=long, y = lat, group = group)) + 
  stat_density2d(data = abies, 
                 aes(x = long, y = lat, fill = ..level.., alpha = ..level..), size = 0.01, 
                 bins = 50, geom = "polygon") + 
  scale_fill_gradient(low = "green", high = "red") + 
  scale_alpha(range = c(0.1, 0.5), guide = FALSE)

Reading

  • Bivand, R. S., Pebesma, E., Gómez-Rubio, V. (2013): Applied Spatial Data Analysis with R. Second Edition. Springer

Copyright © 2020 Humboldt-Universität zu Berlin. Department of Geography.