--- title: "Spatial Weight Matrices" author: "Jeremy Gelb" date: "`r Sys.Date()`" output: rmarkdown::html_vignette: fig_width: 5 fig_height: 5 toc: true toc_depth: 2 df_print: "tibble" vignette: > %\VignetteIndexEntry{Spatial Weight Matrices} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) backup_option <- options() base_wd <- getwd() ``` ```{r include=FALSE} load(system.file("extdata", "results_vignette_wmat.rda", package = "spNetwork", mustWork = TRUE)) library(spdep) ``` This vignette is a short introduction to the spatial weight matrices feature of the **spNetwork** package. # Quick introduction to spatial weight matrices A vast number of spatial analysis methods are based on a spatial matrix Wij of size *n* x *n* with *i* an observation and *j* the neighbours of that observation. Wij represents the degree of spatial relationship between *i* and *j*. Classically one can define : * neighbouring matrix (Wij = 1 if *j* is a neighbour of *i*, 0 otherwise) * distance matrix (Wij = the distance between *i* and *j*, modified by a function like $\frac{1}{distance}$ or $\frac{1}{distance^2}$) * interaction matrix (Wij = the degree of interaction between *i* and *j*, the measure of the interaction depends on the subject of the analysis) In R, the classical package to deal with such matrices is the package 'spdep' which defines objects like neighbour lists and spatial weight lists, and offers the possibility to convert these objects into regular matrices. When one works with data constrained on a network, using Euclidean distance to estimate proximity between observations leads to an underestimation of the real distances between them **spNetwork** makes it possible to create `listw` objects based on network distance. Let us give an example here: calculating the Moran autocorrelation index for the number of bike accidents recorded on the Montreal road network in 2016. First, we want to load data: ```{r message=FALSE, warning=FALSE, eval = FALSE, message=FALSE} options("rgdal_show_exportToProj4_warnings"="none") library(spNetwork) library(sf) library(dplyr) library(spdep) data(mtl_network) data(bike_accidents) ``` then we want to split lines into lixels and calculate for each lixel the number of events on that lixel. ```{r message=FALSE, warning=FALSE, eval = FALSE} lixels <- lixelize_lines(mtl_network,200,mindist = 50) # defining and oid for the lixels lixels$oid <- 1:nrow(lixels) # snapping the points on the lines and counting snapped_acc <- snapPointsToLines2(bike_accidents,lixels, idField ="oid") counts <- table(snapped_acc$nearest_line_id) counts_df <- data.frame("oid" = as.numeric(as.character(names(counts))), "count" = as.numeric(counts)) # merging the results lixels$nbAccident <- left_join(lixels,counts_df, by="oid")$count nbAccident <- ifelse(is.na(lixels$nbAccident),0,lixels$nbAccident) ``` We use here the function *network_listw* the create a `listw` object representing the spatial weight matrix. The distances can be calculated from the centroids of the lixels, from the extremities of the lixels or from evenly spaced points on the lixels. We use here the second approach and specify it with the parameter `method = "ends"`. Let us consider that above 300 metres two segments are not neighbours anymore, and convert the distances between the observations into spatial weights by using the inverse of the squared distance. ```{r message=FALSE, warning=FALSE, eval = FALSE} netlistw <- network_listw(lixels,mtl_network, method = "ends", mindist = 10, maxdistance = 300, dist_func = "squared inverse", line_weight = 'length', matrice_type = 'W', grid_shape = c(1,1), verbose=FALSE) ``` With that matrix, we can calculate the Moran I for the number of accidents on a lixel. ```{r message=FALSE, warning=FALSE} Test <- moran.test(nbAccident, netlistw, zero.policy = T) print(round(Test$estimate,4)) ``` The autocorrelation is weak, this is due to the large distance used and the fact that the analyzed variable is a counting variable (number of accidents). Indeed, the Moran I is supposed to be used on continuous variables. One could go further and define its own function to convert distances into spatial weights ```{r message=FALSE, warning=FALSE, eval = FALSE} my_conv_func <- function(x){ if (x>=300){ return(0) }else{ return(1/x**3) } } netlistw2 <- network_listw(lixels,mtl_network, method = "ends", mindist = 10, maxdistance = 300, dist_func = my_conv_func, line_weight = 'length', matrice_type = 'W', grid_shape = c(1,1), verbose=FALSE) ``` To speed up calculation, one could use a multiprocessing plan defined with the package 'future'. To do so, the study area is divided into rectangles (according to the parameter *grid_shape*), and each rectangle is treated separately. A buffer is applied around the rectangles to avoid edge effects. ```{r message=FALSE, warning=FALSE, eval=FALSE} # setting the multiprocess plan future::plan(future::multisession(workers=2)) netlistw3 <- network_listw.mc(lixels,lixels, method = "ends", mindist = 10, maxdistance = 300, dist_func = my_conv_func, line_weight = 'length', matrice_type = 'W', grid_shape = c(2,2), verbose=FALSE) if (!inherits(future::plan(), "sequential")) future::plan(future::sequential) ``` # Other features * A spatial matrix could be calculated for every type of geometries not only lines. Points can directly be used. For polygons, it is possible to use the centres of the geometries as starting points. One can also choose a distance and equally spaced starting points will be defined on the border of the polygons according to that distance. * The cost of travelling on an edge could be set to something else than "length". * It is possible to specify directions on the network. ```{r include = FALSE} # reset all the user parameters options(backup_option) setwd(base_wd) oldpar <- par(mfrow = c(1,2)) par(oldpar) ```