--- title: 'Getting started' date: "`r Sys.Date()`" output: rmarkdown::html_vignette abstract: "`gtfs2emis` is an R package to estimate public transport emissions using data in the General Transit Feed Specification (GTFS) format. The package allows users to estimate the emission levels of several types of pollutants for each public transport vehicle/trip/route at high spatial and temporal resolutions simply using a GTFS feed and some information on fleet characteristics." urlcolor: blue vignette: > %\VignetteIndexEntry{Getting started} %\VignetteEngine{knitr::rmarkdown} \usepackage[utf8]{inputenc} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>") ``` # 1. Introduction `gtfs2emis` is an R package to estimate hot exhaust emissions of public transport vehicles using data in the General Transit Feed Specification (GTFS) format. The package allows users to estimate the emission levels of several types of pollutants for each public transport vehicle/trip/route at high spatial and temporal resolutions simply using a GTFS feed and some information on fleet characteristics. This vignette introduces the main functions of the `gtfs2emis` package and shows a step-by-step reproducible example of how to use the package. # 2. Installation One can install the development version `gtfs2emis` from Github: ```{r, eval = FALSE, message = FALSE} # From CRAN install.packages("gtfs2emis") # Dev. version with latest features # install.packages("remotes") remotes::install_github("ipeaGIT/gtfs2emis") ``` A few samples of GTFS and fleet files are included in the package: ```{r} data_path <- system.file("extdata", package = "gtfs2emis") list.files(data_path) ``` - Subset of Curitiba (Brazil): `bra_cur_gtfs.zip` and `bra_cur_fleet.txt` - Subset of Dublin (Ireland): `irl_dub_gtfs.zip` and `irl_dub_fleet.txt` - Subset of Detroit (US): `usa_det_gtfs.zip` and `usa_det_fleet.txt` # 3. Data requirements To estimate the emission levels from a given public transport system, users need to: 1. Input a `GTFS.zip` file 2. Input `data.frame` with a few characteristics of the public transport fleet (such as age, vehicle type, and fuel) 3. Select an emission factor model from the models provided by the package, which currently includes models from the US, Europe, and Brazil. 4. Select which pollutants should be estimated from a list of over 15 pollutants provided by the package. # 4. Package Overview Before we start, let's load a few packages we'll be using in this vignette: ```{r, message = FALSE} library(gtfs2emis) library(gtfstools) library(progressr) library(data.table) library(ggplot2) library(units) library(sf) ``` The `gtfs2emis` package has two core functions: 1. `transport_model()`. This function converts GTFS data into a GPS-like `data.table` with the space-time positions and speeds of public transport vehicles. 2. `emission_model()`. This function estimates hot-exhaust emissions based on four inputs: a) the result from the `transport_model()`; b) a `string` indicating which emission factor model should be considered; c) a `string` indicating which pollutants should be estimated; and d) a `data.frame` with info on fleet characteristics passed by the user. The function returns a `list` with the estimated amount of each pollutant emitted by public transport vehicles. To help users analyze the output from `emission_model()`, the `gtfs2emis` package has few functions: - `emis_summary()` to aggregate emission estimates by the time of the day, vehicle type, or road segment. - `emis_grid()` to spatially aggregate emission estimates using any custom spatial grid or polygons. - `emis_to_dt()` to convert the output of `emission_model()` from `list` to `data.table`. # 5. Demonstration of sample data In this introductory vignette, we show a very simple case study using default parameters of the `gtfs2emis` package and where we assume that fleet characteristics are homogeneously distributed across the public transport routes. [For advanced users, we have written another vignette](https://ipeagit.github.io/gtfs2emis/articles/gtfs2emis_emission_factor.html) that specifies the usage of different emissions factors. We have also written a separate vignette to help users build the `data.frame` with information on [fleet characteristics](https://ipeagit.github.io/gtfs2emis/articles/gtfs2emis_fleet_data.html). To demonstrate how the `gtfs2emis` package works, we will be using a small sample of data for the city of Dublin, Ireland. In this example, we'll be estimating `NOx` and `PM10` emissions of bus services on business days. ## 5.1 Transport model The first step is to generate the transport model using `transport_model{gtfs2emis}`. This function converts GTFS data into a GPS-like `data.table`, transforming the output into a `sf-linestring`, which is the required input for emissions estimates. The user can input either a string with the file path where the a `gtfs.zip` file is stored or an object of class `"gtfs" "list"`, generated with `gtfstools::read_gtfs()`. First, let's read the GTFS data and filter only the transport services that run on ```{r, message = FALSE} # path to GTFS.zip file gtfs_file <- system.file("extdata/irl_dub_gtfs.zip", package = "gtfs2emis") # read GTFS gtfs <- gtfstools::read_gtfs(gtfs_file) # Keep Monday services GTFS gtfs <- gtfstools::filter_by_weekday(gtfs, weekday = c('saturday', 'sunday'), keep = FALSE) ``` Now let's generate the transport model. This is the most time-consuming part. If you want to, you can set a progress bar by calling the `transport_model()` function within `progressr::with_progress()`, as shown below. ```{r, message = FALSE} # generate transport model progressr::with_progress( tp_model <- transport_model(gtfs_data = gtfs, min_speed = 2, max_speed = 80, spatial_resolution = 100, parallel = TRUE) ) head(tp_model) ``` Here is what the output of the transport model looks like. In essence, it's a trajectory `data.table sf linestring` with the space-time position and speed of trip segments for every single vehicle of the public transport system. ```{r, message = FALSE, fig.width=5, fig.height=5} ggplot(data = tp_model) + geom_sf(aes(color= as.numeric(speed))) + scale_color_continuous(type = "viridis")+ labs(color = "Speed (km/h)")+ theme_void() ``` ## 5.2 Fleet data The next step is to prepare a `data.frame` with some characteristics of the public transport fleet. This can be either: - A simple table with the overall composition of the fleet. In this case, the `gtfs2emis` will assume that fleet is homogeneously distributed across all routes; OR - A detailed table that (1) brings info on the characteristics of each vehicle and, (2) tells the probability with which each vehicle type is allocated to each transport route. In this introductory vignette, we'll be working with a simple table that tells us the proportion of buses in Dublin according to their vehicle type, Euro standard, technology, and fuel. The table looks like this: ```{r, message = FALSE} fleet_file <- system.file("extdata/irl_dub_fleet.txt", package = "gtfs2emis") fleet_df <- read.csv(fleet_file) head(fleet_df) ``` Please note different emission factor models may require different information and that the fleet `data.frame` needs to be organized accordingly. In our current example for the city of Dublin, the fleet data must include certain columns with the fleet characteristics that are used in the EMEP-EEA emission factor model: vehicle type, Euro standard, technology, and fuel. To check which columns and sets of vehicle characteristics are required by a given emission factor model, check the [fleet data vignette](https://ipeagit.github.io/gtfs2emis/articles/gtfs2emis_fleet_data.html). ## 5.3 Emission model In the final step, we use the `emission_model{gtfs2emis}` function to estimate the hot exhaust emissions of our public transport system. Here, the user needs to pass the results from `transport_model()`, some fleet data as described above, and select which emission factor model and pollutants should be considered. The `emission_model()` outputs a `list` with several `vectors` and `data.frames` with emission estimates and related information such as vehicle variables (`fuel`, `age`, `tech`, `euro`, `fleet_composition`), travel variables (`slope`, `load`, `gps`) or pollution (`EF`, `emi`). ```{r, message = FALSE} emi_list <- emission_model(tp_model = tp_model, ef_model = "ef_europe_emep", fleet_data = fleet_df, pollutant = c("CO2","PM10"), reference_year = 2020 ) names(emi_list) ``` # 6 Analyzing emission estimates Sometimes, working with data stored in a long list can be tricky. For convenience, `gtfs2emis` package has a few functions to help users analyze the results of emission estimates. For example, you can use the `emis_to_dt()` function to convert the output of `emission_model()` from a `list` to a `data.table` format: ```{r,eval = TRUE} emi_dt <- emis_to_dt(emi_list = emi_list, veh_vars = c("veh_type", "euro", "fuel"), pol_vars = "pollutant" ) head(emi_dt) ``` Alternatively, the `emis_summary()` function helps users summarize emission estimates, aggregating emissions by pollutant, time of the day, vehicle or road segment (spatial). ## 6.1 Total emissions by pollutant ```{r, message = FALSE} emi_by_pol <- emis_summary(emi_list = emi_list, by = "pollutant") emi_by_pol ``` ## 6.2 Total emissions by vehicle type The `emis_summary()` function help users summarize emission estimates by either `veh_type`, `pollutant`, or `time`. Using this function, users can easily: ```{r, fig.height=5, fig.width=8} emi_by_veh <- emis_summary(emi_list = emi_list, by = "vehicle") head(emi_by_veh) ``` It is also possible to summarize emission estimates by other vehicle characteristics AND pollutants. ```{r, message = FALSE, fig.width=5, fig.height=6} emi_by_veh <- emis_summary(emi_list = emi_list, by = "vehicle", veh_vars = c("veh_type","euro")) head(emi_by_veh) # plot ggplot(data = emi_by_veh) + geom_col(aes(x = euro, y = as.numeric(emi/1000), fill = pollutant), color=NA, show.legend = FALSE) + labs(y="Total emissions (Kg)", x="Euro standard") + facet_wrap(~pollutant, scales = "free", nrow = 2) + theme_minimal() ``` ## 6.3 Total emissions by time of the day ```{r, fig.width=5, fig.height=6} emi_by_time <- emis_summary(emi_list = emi_list, by = "time") head(emi_by_time) # plot ggplot(data = emi_by_time) + geom_col(aes(x = factor(timestamp_hour), y = as.numeric(emi/1000), fill = pollutant), color=NA, show.legend = FALSE) + labs(y="Total emissions (Kg)", x="Hour of the day") + facet_wrap(~pollutant, scales = "free", nrow = 2) + theme_minimal() ``` ## 6.4 Spatial distribution of emissions Finally, users can analyze how public transport emissions are spatially distributed. To do this, the `emis_grid()` function helps aggregate emission estimates over any custom spatial vector data (`sf POLYGON`). Let's create a regular hexagonal grid for this example. ```{r, message = FALSE, fig.width=5, fig.height=5} # create spatial grid mygrid <- sf::st_make_grid( x = sf::st_make_valid(emi_list$tp_model) , cellsize = 0.25 / 200 , crs= 4326 , what = "polygons" , square = FALSE) ggplot() + geom_sf(data=mygrid) + theme_void() ``` ```{r, message=FALSE, fig.width=5, fig.height=5} mygrid_emi <- emis_grid(emi_list, mygrid,time_resolution = "day" ,quiet = FALSE) ggplot() + geom_sf(data = mygrid_emi, aes(fill= as.numeric(CO2_Euro_III)), color=NA) + geom_sf(data = emi_list$tp_model$geometry,color = "black")+ scale_fill_continuous(type = "viridis")+ labs(fill = "CO2 (g)")+ theme_void() ``` ## Learn more Check out our extra guides: - [Defining Fleet data](https://ipeagit.github.io/gtfs2emis/articles/gtfs2emis_fleet_data.html) - [Exploring Emission Factors](https://ipeagit.github.io/gtfs2emis/articles/gtfs2emis_emission_factor.html) ## Report a bug If you have any suggestions or want to report an error, please visit [the package GitHub page](https://github.com/ipeaGIT/gtfs2emis/issues).