Package 'PVplr' reference manual

Title:	Performance Loss Rate Analysis Pipeline
Description:	The pipeline contained in this package provides tools used in the Solar Durability and Lifetime Extension Center (SDLE) for the analysis of Performance Loss Rates (PLR) in real world photovoltaic systems. Functions included allow for data cleaning, feature correction, power predictive modeling, PLR determination, and uncertainty bootstrapping through various methods <doi:10.1109/PVSC40753.2019.8980928>. The vignette "Pipeline Walkthrough" gives an explicit run through of typical package usage. This material is based upon work supported by the U.S. Department of Energy's Office of Energy Efficiency and Renewable Energy (EERE) under Solar Energy Technologies Office (SETO) Agreement Number DE-EE-0008172. This work made use of the High Performance Computing Resource in the Core Facility for Advanced Research Computing at Case Western Reserve University.
Authors:	Alan Curran [aut] , Tyler Burleyson [aut] , William Oltjen [aut] , Sascha Lindig [aut] , David Moser [aut] , Roger French [aut, cre] , Solar Durability and Lifetime Extension research center [cph, fnd]
Maintainer:	Roger French <[email protected]>
License:	MIT + file LICENSE
Version:	0.1.2
Built:	2025-01-12 06:28:52 UTC
Source:	https://github.com/cranhaven/cranhaven.r-universe.dev

function to test if an entire column is NA

Description

This function tests for completely NA columns

Usage

all_na(x)

Arguments

`x`	any column in a dataframe

Value

Returns boolean TRUE if column is all NA, FALSE if not

Examples

test <- all_na(c(NA, "a", NA))
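
As a minimal base-R sketch of the check described above, assuming it reduces to `all(is.na(x))`: the helper name `all_na_sketch` is hypothetical and this is not the package's own implementation.

```r
# Hypothetical helper: TRUE only when every element of x is NA.
all_na_sketch <- function(x) {
  all(is.na(x))
}

mixed_col <- c(NA, "a", NA)  # contains one non-missing value
empty_col <- c(NA, NA, NA)   # entirely missing

res_mixed <- all_na_sketch(mixed_col)
res_empty <- all_na_sketch(empty_col)
```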

Fixes the anomalies

Description

This function takes the data, finds the anomalies on weekends and weekdays, and returns a dataframe with the anomalies flagged in additional columns

Usage

anomalies(df)

Arguments

`df`	structured dataframe

Value

df with two columns of cleaned_energy and anom_flag

Author(s)

Arash Khalilnejad

Detects the anomalies and returns a dataframe with cleaned and anom_flag columns

Description

Detects the anomalies and returns a dataframe with cleaned and anom_flag columns

Usage

anomaly_detector(df, batch_days = 90)

Arguments

`df`	the structured data
`batch_days`	the number of days per batch to which anomaly detection is applied. Because time series decomposition is used, a single seasonality would otherwise be applied to the whole data set, which is inefficient. If NA, the whole data set is passed at once

Value

data with anomalies

Author(s)

Arash Khalilnejad

checks the quality of the data after and before cleaning

Description

Calculates the percentage of anomalies, the percentage of missing values plus zeros, the gaps, and the length of the data, and reports the quality of the data before and after cleaning.

Usage

data_quality_check(
  energy_data,
  col = "elec_cons",
  id = "pv_df",
  batch_days = 90
)

Arguments

`energy_data`	structured energy dataframe
`col`	Input column
`id`	PV system ID
`batch_days`	the number of days per batch to which anomaly detection is applied. Because time series decomposition is used, a single seasonality would otherwise be applied to the whole data set, which is inefficient. If NA, the whole data set is passed at once

Details

The quality grading criteria are as follows:
anomalies: A: less than 10
missing percentage: A: less than 10
largest gap: A: less than 120 hours, B: 120 to 164 hours, C: 164 to 240 hours, D: more than 240 hours
length: P: more than 2 years, F: less than 2 years
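
The gap and length grades can be sketched as two small lookup functions. This is an illustration only: the helper names `grade_gap` and `grade_length` are hypothetical, and the handling of values that fall exactly on a boundary (164 hours, 240 hours, 2 years) is an assumption, since the criteria above do not specify it.

```r
# Hypothetical helper: grade the largest gap, given in hours.
grade_gap <- function(gap_hours) {
  if (gap_hours < 120) {
    "A"
  } else if (gap_hours <= 164) {
    "B"
  } else if (gap_hours <= 240) {  # boundary handling is an assumption
    "C"
  } else {
    "D"
  }
}

# Hypothetical helper: pass/fail grade for the record length, given in years.
grade_length <- function(years) {
  if (years > 2) "P" else "F"
}

gap_grades <- vapply(c(100, 150, 200, 300), grade_gap, character(1))
length_grades <- vapply(c(3, 1), grade_length, character(1))
```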

Value

a table with grading of the quality after and before cleaning

Author(s)

Arash Khalilnejad

Reads jci files obtained in budget period 2

Description

Reads the jci file, modifies the timestamp intervals, adjusts the timezone based on location using googleapi, and then generates the useful columns

Usage

data_structure(df, col = "elec_cons", timestamp_col = "timestamp")

Arguments

`df`	dataframe containing at least the timestamp column and the variable to be plotted with the heatmap
`col`	the character name of the column to be plotted
`timestamp_col`	the character name of the timestamp column

Value

a dataframe with fixed timestamps and useful columns

Author(s)

Arash

finds median start and end time of PV operation

Description

finds median start and end time of PV operation

Usage

day_time_start_end(df)

Arguments

`df`	dataframe with a num_time column

Value

dataframe with start and end time

Author(s)

Arash Khalilnejad

data with PV on time flag.

Description

Returns a dataframe of PV data with the approximate operating period, based on the median start and end times.

Usage

df_With_on_time(df)

Arguments

`df`	df with num_time

Value

input data with one more column of on_time

Author(s)

Arash Khalilnejad

returns quality information of time series data of PV

Description

returns quality information of time series data of PV

Usage

grade_pv(
  df,
  col = "poay",
  id = "pv_id",
  timestamp_col = "tmst",
  timestamp_format = "%Y-%m-%d %H:%M:%S",
  batch_days = 90
)

Arguments

`df`	the PV time series data. It can be the direct output of read.csv(file_name, stringsAsFactors = F)
`col`	column of the grading, default 'poay'
`id`	The name of the pv data
`timestamp_col`	the character name of the timestamp column
`timestamp_format`	the POSIXct format of the timestamp if conversion is needed
`batch_days`	the number of days per batch to which anomaly detection is applied. Because time series decomposition is used, a single seasonality would otherwise be applied to the whole data set, which is inefficient. If NA, the whole data set is passed at once

Author(s)

Arash Khalilnejad

Largest Intervals

Description

Largest Intervals

Usage

Int(df)

Arguments

`df`	Dataframe

Value

Intervals

Author(s)

Arash Khalilnejad

Numerical time interim predictor.

Description

Convert the hour and minute component of each timestamp to a numerical representation.

Usage

ip_num_time(data, ts_col = "timestamp")

Arguments

`data`	A dataframe with a timestamp column.
`ts_col`	The timestamp column name in `data`. Default value is 'timestamp'.

Value

data with a num_time column added.
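
One plausible numerical representation is the hour of day plus the minutes as a fraction of an hour. The sketch below assumes that encoding and UTC timestamps; the manual does not state the exact formula, and `num_time_sketch` is a hypothetical name.

```r
# Hypothetical sketch: encode time of day as hour + minute / 60.
num_time_sketch <- function(data, ts_col = "timestamp") {
  ts <- as.POSIXlt(data[[ts_col]], tz = "UTC")
  data$num_time <- ts$hour + ts$min / 60
  data
}

demo <- data.frame(
  timestamp = as.POSIXct(c("2020-06-01 06:15:00", "2020-06-01 13:30:00"),
                         tz = "UTC")
)
demo <- num_time_sketch(demo)
```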

Author(s)

Arash Khalilnejad

Linearly interpolate hourly data to 15 min data.

Description

Many weather data sets are hourly and we need values for every 15 minutes.

Usage

lin_inter_hrly_to_fifteen(data, data_ts)

Arguments

`data`	A data frame with hourly data.
`data_ts`	The column name for the `data` timestamp.

Details

Any value that can not be linearly interpolated such as a string will remain the same.

Value

The resulting fifteen minute data frame.
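
A rough base-R sketch of this kind of interpolation, using `approx()` for numeric columns. How non-numeric columns are carried over is an assumption here (the most recent hourly value is reused), and `interp_to_fifteen_sketch` is a hypothetical name, not the package function.

```r
# Hypothetical sketch: expand hourly rows to a 15 minute grid.
interp_to_fifteen_sketch <- function(data, data_ts) {
  ts <- data[[data_ts]]
  new_ts <- seq(min(ts), max(ts), by = "15 min")
  out <- data.frame(new_ts)
  names(out) <- data_ts
  for (nm in setdiff(names(data), data_ts)) {
    if (is.numeric(data[[nm]])) {
      # Linear interpolation between neighbouring hourly values.
      out[[nm]] <- approx(as.numeric(ts), data[[nm]],
                          xout = as.numeric(new_ts))$y
    } else {
      # Assumption: non-numeric values repeat the latest hourly value.
      idx <- findInterval(as.numeric(new_ts), as.numeric(ts))
      out[[nm]] <- data[[nm]][idx]
    }
  }
  out
}

hourly <- data.frame(
  timestamp = as.POSIXct("2020-06-01 10:00:00", tz = "UTC") + 3600 * 0:2,
  temp = c(10, 14, 18),
  site = c("a", "a", "a"),
  stringsAsFactors = FALSE
)
fifteen <- interp_to_fifteen_sketch(hourly, "timestamp")
```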

Author(s)

Arash Khalilnejad

Linearly interpolate missing energy values.

Description

If there are fewer than four consecutive missing values, represented by NA values, fill them with linearly interpolated values.

Usage

lin_inter_missing_energy(data, threshold = 4)

Arguments

`data`	A data frame with an 'elec_cons' column.
`threshold`	The maximum number of consecutive values that may be filled with interpolated values. By default four.

Value

The data frame with 'missing values' filled in.

Examples

## Not run: 
lin_inter_missing_energy(data)

## End(Not run)
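
The gap-filling idea can be sketched in base R with `rle()` and `approx()`. This sketch works on a plain numeric vector and fills only NA runs shorter than `threshold`, following the Description; whether a run of exactly `threshold` values is also filled is not clear from the manual, so that cutoff is an assumption. `fill_short_gaps_sketch` is a hypothetical name.

```r
# Hypothetical sketch: fill short NA runs by linear interpolation.
fill_short_gaps_sketch <- function(x, threshold = 4) {
  runs <- rle(is.na(x))
  # Length of the NA run each element belongs to (0 for observed values).
  run_len <- rep(ifelse(runs$values, runs$lengths, 0), runs$lengths)
  idx <- seq_along(x)
  interp <- approx(idx[!is.na(x)], x[!is.na(x)], xout = idx)$y
  fill <- is.na(x) & run_len < threshold
  x[fill] <- interp[fill]
  x
}

elec_cons <- c(1, NA, NA, 4, NA, NA, NA, NA, NA, 10)
filled <- fill_short_gaps_sketch(elec_cons)
# The run of two NAs is filled; the run of five NAs is left missing.
```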

Dataframe resample function

Description

This function resamples data from a given dataframe. Dataframe must have columns created through plr_cleaning to denote time segments

Usage

mbm_resample(df, fraction, by)

Arguments

`df`	dataframe
`fraction`	fraction of data to resample from dataframe
`by`	timescale over which to resample, day, week, or month

Value

Returns randomly resampled dataframe

Examples

# build var_list
var_list <- plr_build_var_list(time_var = "timestamp",
                               power_var = "power",
                               irrad_var = "g_poa",
                               temp_var = "mod_temp",
                               wind_var = NA)
# Clean Data
test_dfc <- plr_cleaning(test_df, var_list, irrad_thresh = 100,
                         low_power_thresh = 0.01, high_power_cutoff = NA)
                         
dfc_resampled <- mbm_resample(test_dfc, fraction = 0.65, by = "week")
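
To make the resampling idea concrete, the sketch below draws a fixed fraction of rows, without replacement, inside each time segment. Both the within-segment sampling scheme and the `week` column name are assumptions for illustration; `resample_by_segment_sketch` is a hypothetical helper and may differ from what `mbm_resample` does internally.

```r
# Hypothetical sketch: sample a fraction of rows within each time segment.
resample_by_segment_sketch <- function(df, fraction, by) {
  groups <- split(seq_len(nrow(df)), df[[by]])
  keep <- unlist(lapply(groups, function(rows) {
    n_keep <- max(1, round(fraction * length(rows)))
    rows[sample.int(length(rows), n_keep)]
  }))
  df[sort(keep), , drop = FALSE]
}

set.seed(1)
demo <- data.frame(week = rep(1:3, each = 20), power = runif(60))
resampled <- resample_by_segment_sketch(demo, fraction = 0.65, by = "week")
```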

function to convert to character then numeric

Description

The function is a shorthand for converting factors to numeric

Usage

nc(x)

Arguments

`x`	any factor to convert to numeric

Value

Returns supplied parameter as numeric

Examples

num <- nc(test_df$power)
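
The reason for converting through character first is that `as.numeric()` applied directly to a factor returns the internal level codes, not the printed values. A small base-R illustration follows; `nc_sketch` is a hypothetical stand-in for the shorthand.

```r
# Hypothetical stand-in: convert to character, then to numeric.
nc_sketch <- function(x) as.numeric(as.character(x))

f <- factor(c("10.5", "3", "7.25"))

codes <- as.numeric(f)   # level codes, not the original numbers
values <- nc_sketch(f)   # the numbers the factor labels represent
```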

function to test if the values in a column should be numeric

Description

This function tests a column to see if it should be numeric

Usage

num_test(col)

Arguments

`col`	any column in a dataframe

Value

Returns boolean TRUE if column should be numeric, FALSE if not

Examples

test <- num_test(test_df$power)

Export variables to a cluster.

Description

Ghost cluster export call to make sure testCoverage's trace function and environment are available.

Usage

parallel_cluster_export(cluster, varlist, envir = .GlobalEnv)

Arguments

`cluster`	Cluster
`varlist`	Character vector of names of objects to export.
`envir`	Environment from which to export variables

6k Method for PLR Determination

Description

This function groups data by the specified time interval and performs a linear regression using the formula: power_var ~ irrad_var/istc * (nameplate_power + a*log(irrad_var/istc) + b*log(irrad_var/istc)^2 + c*(temp_var - tref) + d*(temp_var - tref)*log(irrad_var/istc) + e*(temp_var - tref)*log(irrad_var/istc)^2 + f*(temp_var - tref)^2). Predicted values of irradiance, temperature, and wind speed (if applicable) are added for reference. These values are the lowest daily high irradiance reading (over 300W/m^2), the average temperature over all data, and the average wind speed over all data.

Usage

plr_6k_model(
  df,
  var_list,
  nameplate_power,
  by = "month",
  data_cutoff = 30,
  predict_data = NULL
)

Arguments

`df`	A dataframe containing pv data.
`var_list`	A list of the dataframe's standard variable names, obtained from the output of `plr_variable_check`.
`nameplate_power`	The rated power capability of the system, in watts.
`by`	String, either "day", "week", or "month". The time periods over which to group data for regression.
`data_cutoff`	The number of data points needed to keep a value in the final table. Regressions over less than this number and their data will be discarded.
`predict_data`	optional; Dataframe; If you have preferred estimations of irradiance, temperature, and wind speed, include them here to skip automatic generation. Format: Irradiance, Temperature, Wind (optional).

Value

Returns dataframe of results per passed time scale from 6K modeling
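
The regression formula in the Description can be illustrated on synthetic data for a single time period. The sketch assumes `istc = 1000` and `tref = 25`, and treats `nameplate_power` as known so that the remaining terms are linear in the coefficients a to f; these choices, the synthetic data, and all variable names are assumptions, and this is not the package's fitting code.

```r
set.seed(42)
n <- 200
istc <- 1000             # assumed reference irradiance
tref <- 25               # assumed reference temperature
nameplate_power <- 5000  # assumed known rating, in watts

irrad <- runif(n, 300, 1100)
temp <- runif(n, 10, 60)

g <- irrad / istc
lg <- log(g)
dt <- temp - tref

# Coefficients used to generate noise-free synthetic power.
true_coef <- c(a = 100, b = -50, c = -20, d = 5, e = -2, f = 0.1)
X <- cbind(a = lg, b = lg^2, c = dt, d = dt * lg, e = dt * lg^2, f = dt^2)
power <- g * (nameplate_power + drop(X %*% true_coef))

# Dividing by g and subtracting the nameplate term leaves a linear model
# without intercept in the six coefficients.
fit <- lm(I(power / g - nameplate_power) ~ 0 + X)
est <- unname(coef(fit))
```

Because the synthetic data contain no noise, `est` should match `true_coef` up to numerical precision; real data would also need the grouping by day, week, or month described above.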

Bootstrap: Resampling from individual Models

Description

This function determines the uncertainty of a PLR measurement by sampling results from individual models. Specify the model you would like to find the uncertainty of, and the function will put the dataframe through the selected model and return the uncertainties of the model's results.

Usage

plr_bootstrap_output(
  df,
  var_list,
  model,
  by = "month",
  fraction = 0.65,
  n = 1000,
  predict_data = NULL,
  np = NA,
  power_var = "power_var",
  time_var = "time_var",
  ref_irrad = 900,
  irrad_range = 10
)

Arguments

`df`	A dataframe containing pv data.
`var_list`	A list of the dataframe's standard variable names, obtained from the plr_variable_check output.
`model`	The model you would like to calculate the uncertainty of. Use "xbx", "xbx+utc", "pvusa", or "6k".
`by`	String indicating time step count per year for the regression. Use "day", "month", or "year". See `plr_weighted_regression`.
`fraction`	The size of each sample relative to the total dataset.
`n`	Number of samples to take.
`predict_data`	passed to predict_data in model call. See `plr_xbx_model` for example.
`np`	The system's reported name plate power. See `plr_6k_model`.
`power_var`	The name of the power variable after being put through a Performance Loss Rate (PLR) determining test. Typically "power_var".
`time_var`	The name of the time variable after being put through a PLR determining test. Typically "time_var".
`ref_irrad`	The irradiance value at which to calculate the universal temperature coefficient. Since irradiance is a much stronger influencer on power generation than temperature, it is important to specify a small range of irradiance data from which to estimate the effect of temperature.
`irrad_range`	The range of the subset used to calculate the universal temperature coefficient. See above.

Value

Returns PLR value and uncertainty calculated with bootstrap of data from power correction models

Examples

# build var_list

var_list <- plr_build_var_list(time_var = "timestamp",
                               power_var = "power",
                               irrad_var = "g_poa",
                               temp_var = "mod_temp",
                               wind_var = NA)
# Clean Data
test_dfc <- plr_cleaning(test_df, var_list, irrad_thresh = 100,
                         low_power_thresh = 0.01, high_power_cutoff = NA)

xbx_mbm_plr_output_uncertainty <- plr_bootstrap_output(test_dfc, var_list,
                                                       model = "xbx", fraction = 0.65,
                                                       n = 10, power_var = 'power_var',
                                                       time_var = 'time_var', ref_irrad = 900,
                                                       irrad_range = 10, by = "month",
                                                       np = NA, predict_data = NULL)


Bootstrap: Resample from individual Models

Description

The function samples and bootstraps data that has already been put through a power predictive model. The PLR and Uncertainty are returned in a dataframe.

Usage

plr_bootstrap_output_from_results(
  data,
  power_var,
  time_var,
  weight_var,
  by = "month",
  model,
  fraction = 0.65,
  n = 1000
)

Arguments

`data`	Result of modeling data with a PLR determining model, i.e. plr_xbx_model, plr_6k_model, etc.
`power_var`	Variable name of power in the dataframe. Typically power_var
`time_var`	Variable name of time in the dataframe. Typically time_var
`weight_var`	Variable name of weightings in the dataframe. Typically sigma
`by`	String, either "day", "month", or "year". Time over which to perform `plr_yoy_regression` and `plr_weighted_regression`.
`model`	The name of the model the data has been put through. This option is only included for the user's benefit in keeping bootstrap outputs consistent.
`fraction`	The fractional size of the data to be sampled each time.
`n`	The number of resamples to take.

Value

Returns PLR value and uncertainty calculated with bootstrap of data going into power correction models

Examples

# build var_list


var_list <- plr_build_var_list(time_var = "timestamp",
                               power_var = "power",
                               irrad_var = "g_poa",
                               temp_var = "mod_temp",
                               wind_var = NA)
# Clean Data
test_dfc <- plr_cleaning(test_df, var_list, irrad_thresh = 100,
                         low_power_thresh = 0.01, high_power_cutoff = NA)
                         
# Perform the power predictive modeling step
test_xbx_wbw_res <- plr_xbx_model(test_dfc, var_list, by = "week",
                                  data_cutoff = 30, predict_data = NULL)
                                  
xbx_mbm_plr_result_uncertainty <- plr_bootstrap_output_from_results(test_xbx_wbw_res, 
                                                                    power_var = 'power_var',
                                                                    time_var = 'time_var',
                                                                    weight_var = 'sigma',
                                                                    by = "month", model = 'xbx',
                                                                    fraction = 0.65, n = 10)
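
For intuition, the bootstrap over modeled results can be sketched in base R: repeatedly draw a fraction of the modeled points, fit a straight line, and summarize the spread of the resulting rates. The rate formula (slope divided by intercept, scaled to percent per year), the unweighted fit, sampling without replacement, and the name `bootstrap_plr_sketch` are all assumptions made for this illustration, not a description of the package internals.

```r
# Hypothetical sketch: bootstrap a loss rate from already-modeled power values.
bootstrap_plr_sketch <- function(data, power_var, time_var,
                                 fraction = 0.65, n = 1000, per_year = 12) {
  form <- reformulate(time_var, response = power_var)
  plr <- numeric(n)
  for (i in seq_len(n)) {
    rows <- sample.int(nrow(data), size = round(fraction * nrow(data)))
    b <- coef(lm(form, data = data[rows, , drop = FALSE]))
    # Assumed rate: yearly change relative to the intercept, in percent.
    plr[i] <- 100 * per_year * b[[2]] / b[[1]]
  }
  data.frame(plr = mean(plr), plr_sd = sd(plr))
}

# Synthetic monthly results losing 1 percent per year, plus small noise.
set.seed(7)
modeled <- data.frame(time_var = 1:60)
modeled$power_var <- 1000 * (1 - 0.01 * modeled$time_var / 12) +
  rnorm(60, sd = 2)

res <- bootstrap_plr_sketch(modeled, "power_var", "time_var", n = 200)
```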


Bootstrap: Resampling data going into each Model

Description

This function determines the uncertainty of a PLR measurement through resampling data for each model, prior to putting the data through the model.

Usage

plr_bootstrap_uncertainty(
  df,
  n,
  fraction = 0.65,
  var_list,
  model,
  by = "month",
  power_var = "power_var",
  time_var = "time_var",
  data_cutoff = 100,
  np = NA,
  pred = NULL
)

Arguments

`df`	A dataframe containing pv data.
`n`	(numeric) Number of samples to take. The higher the n value, the longer it takes to complete, but the results become more accurate as well.
`fraction`	The fraction of data that constitutes a resample for the bootstrap.
`var_list`	A list of variables obtained through `plr_variable_check`.
`model`	the String name of the model to bootstrap. Select from: "xbx" (`plr_xbx_model`), "correction" (`plr_xbx_utc_model`), "pvusa" (`plr_pvusa_model`), or "6k" (`plr_6k_model`).
`by`	String, either "day", "week", or "month". Time over which to perform `plr_yoy_regression`.
`power_var`	Variable name of power in the dataframe. This must be the variable's name after being put through your selected model. Typically power_var
`time_var`	Variable name of time in the dataframe. This must be the variable's name after being put through your selected model. Typically time_var
`data_cutoff`	The number of data points needed to keep a value in the final table. Regressions over less than this number and their data will be discarded.
`np`	The system's reported name plate power. See `plr_6k_model`.
`pred`	passed to predict_data in model call. See `plr_xbx_model` for an example.

Value

Returns PLR value and uncertainty calculated with bootstrap of data going into power correction models

Examples

# build var_list


var_list <- plr_build_var_list(time_var = "timestamp",
                               power_var = "power",
                               irrad_var = "g_poa",
                               temp_var = "mod_temp",
                               wind_var = NA)
# Clean Data
test_dfc <- plr_cleaning(test_df, var_list, irrad_thresh = 100,
                         low_power_thresh = 0.01, high_power_cutoff = NA)
                         
xbx_mbm_plr_uncertainty <- plr_bootstrap_uncertainty(test_dfc, n = 2, 
                                                     fraction = 0.65, by = 'month',
                                                     power_var = 'power_var', time_var = 'time_var',
                                                     var_list = var_list, model = "xbx",
                                                     data_cutoff = 10, np = NA,
                                                     pred = NULL)


Build a Custom Variable List

Description

The default var_list generator, plr_variable_check, assumes data comes from SDLE's sources. If you are using this package with your own data, the format may not line up appropriately. Use this function to create a variable list to be passed to other functions so they can keep track of what column names mean.

Usage

plr_build_var_list(time_var, power_var, irrad_var, temp_var, wind_var)

Arguments

`time_var`	The variable representing time. Typically, a timestamp.
`power_var`	The variable representing power. Typically, in watts.
`irrad_var`	The variable representing irradiance. Typically, either poa or ghi irradiance.
`temp_var`	The variable representing temperature. Package functions assume Celsius.
`wind_var`	optional; The variable representing wind speed.

Value

Returns dataframe of variable names for the given photovoltaic data for use with later functions

Examples

var_list <- plr_build_var_list(time_var = "timestamp",
                               power_var = "power",
                               irrad_var = "g_poa",
                               temp_var = "mod_temp",
                               wind_var = NA)

Basic Data Cleaning

Description

Removes entries with irradiance and power readings outside cutoffs, fixes timestamps to your specified format, and converts columns to numeric when appropriate - see plr_convert_columns. Also, adds columns for days/weeks/years of operation that are used by other functions.

Usage

plr_cleaning(
  df,
  var_list,
  irrad_thresh = 100,
  low_power_thresh = 0.05,
  high_power_cutoff = NA,
  tmst_format = "%Y-%m-%d %H:%M:%S"
)

Arguments

`df`	A dataframe containing pv data.
`var_list`	A list of the dataframe's standard variable names, obtained from the output of `plr_variable_check`.
`irrad_thresh`	The lowest meaningful irradiance value. Values below are filtered.
`low_power_thresh`	The lowest meaningful power output. Values below are filtered.
`high_power_cutoff`	The highest meaningful power output. Values above are filtered.
`tmst_format`	The desired timestamp format.

Value

Returns dataframe with rows filtered out based on passed cleaning parameters

Examples

var_list <- plr_build_var_list(time_var = "timestamp",
                               power_var = "power",
                               irrad_var = "g_poa",
                               temp_var = "mod_temp",
                               wind_var = NA)
                               
test_dfc <- plr_cleaning(test_df, var_list, irrad_thresh = 100,
                         low_power_thresh = 0.01, high_power_cutoff = NA)

Fix Column Typings

Description

Converts appropriate columns to numeric without specifying the name of the column. All columns from hbase are read as factors. Columns are tested to see if they should be numeric by forcing conversion to numeric. Columns that contain NAs after the forced conversion are treated as non-numeric and left unchanged; all other columns are set to numeric.

Usage

plr_convert_columns(df)

Arguments

`df`	A dataframe containing pv data.

Value

Returns original dataframe with columns corrected to proper classes

Examples

df <- PVplr::plr_convert_columns(test_df)
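
The forced-conversion test described above can be sketched in base R as follows. A column is converted only when coercion to numeric introduces no new NA values; `convert_columns_sketch` is a hypothetical name and the details may differ from the package function.

```r
# Hypothetical sketch: convert columns that survive numeric coercion.
convert_columns_sketch <- function(df) {
  for (nm in names(df)) {
    as_chr <- as.character(df[[nm]])
    as_num <- suppressWarnings(as.numeric(as_chr))
    # Keep the column as-is if coercion turned any real value into NA.
    if (!any(is.na(as_num) & !is.na(as_chr))) {
      df[[nm]] <- as_num
    }
  }
  df
}

raw <- data.frame(
  power = factor(c("1.5", "2.5", "3.5")),
  site = factor(c("a", "b", "c"))
)
fixed <- convert_columns_sketch(raw)
```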

Decompose Seasonality from Data

Description

Decomposes seasonality from a dataframe that has already passed through a PLR Determination test, e.g. plr_xbx_model. This method has the option of creating plot and data files.

Usage

plr_decomposition(
  data,
  freq,
  power_var,
  time_var,
  plot = FALSE,
  plot_file = NULL,
  title = NULL,
  data_file = NULL
)

Arguments

`data`	a dataframe containing PV data that has undergone a power predictive model, e.g. `plr_xbx_model`.
`freq`	the frequency of seasonality. This is typically 4 but depends on the location of the system.
`power_var`	name of the power variable, e.g. iacp
`time_var`	name of the time variable, e.g. tvar
`plot`	boolean indicating if you wish to save a plot.
`plot_file`	location to save the plot, if the plot param is given TRUE.
`title`	the title of the plot created if the plot param is given TRUE.
`data_file`	location to save data. Currently non-functional.

Value

Dataframe containing decomposed time series features

Examples

# build var_list
var_list <- plr_build_var_list(time_var = "timestamp",
                               power_var = "power",
                               irrad_var = "g_poa",
                               temp_var = "mod_temp",
                               wind_var = NA)
# Clean Data
test_dfc <- plr_cleaning(test_df, var_list, irrad_thresh = 100,
                         low_power_thresh = 0.01, high_power_cutoff = NA)
# Perform power modeling step
test_xbx_wbw_res <- plr_xbx_model(test_dfc, var_list, by = "week",
                                  data_cutoff = 30, predict_data = NULL)
                                  
test_xbx_wbw_decomp <- plr_decomposition(test_xbx_wbw_res, freq = 4,
                                         power_var = 'power_var', time_var = 'time_var',
                                         plot = FALSE, plot_file = NULL, title = NULL, 
                                         data_file = NULL)

Statistical k-means Test

Description

The method builds linear models by day, identifies outliers, and performs 2-means clustering by slopes. If the lower identified cluster is significantly less than the higher mean, and constitutes less than 25% of the data, it is identified as soiled and returned. Otherwise, the outlier points are identified as soiled and returned.

Usage

plr_kmeans_test(
  df,
  var_list,
  mean_ratio = 0.7,
  plot = FALSE,
  file_path,
  file_name,
  set_cutoff = FALSE
)

Arguments

`df`	A df containing pv data. Should be 'cleaned' by `plr_cleaning`.
`var_list`	A list of the dataframe's standard variable names, obtained from the output of `plr_variable_check`.
`mean_ratio`	This scales the higher identified cluster's mean for comparison. Higher values will be more likely to identify the second mean as soiled, and vice versa. Values should range from 0 to 1.
`plot`	optional; Boolean; whether to return the box plot generated by the method to identify outliers.
`file_path`	optional; location to store the boxplot if plot is set TRUE. Note this is not necessary if you select to plot - only if you wish to save it.
`file_name`	optional; name of file to save boxplot if plot is set to TRUE.
`set_cutoff`	Defaults to FALSE; pass a numeric value to cut off all slopes less than the cutoff value. This entirely bypasses the outlier and clustering calculations and removes slope values you believe to be soiled.

Value

The method returns a dataframe containing the values that should be removed. If you want to discard them, try using dplyr::filter().
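
The clustering decision described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the slopes are synthetic, "significantly less" is read as "below mean_ratio times the higher cluster mean", and the names `slopes`, `soiled_idx`, and `kept` are hypothetical.

```r
# Hypothetical daily slopes: mostly near 1.0, with a small low group near 0.5
set.seed(1)
slopes <- c(rnorm(40, mean = 1.0, sd = 0.02), rnorm(8, mean = 0.5, sd = 0.02))

# Two-cluster k-means on the one-dimensional slopes
km <- kmeans(slopes, centers = 2, nstart = 10)
low <- which.min(km$centers)
high <- which.max(km$centers)

mean_ratio <- 0.7
low_frac <- mean(km$cluster == low)   # share of days in the lower cluster

# Flag the lower cluster only if its mean is below the scaled higher mean
# and it makes up less than 25% of the days (assumed reading of the rule)
is_soiled <- km$centers[low] < mean_ratio * km$centers[high] && low_frac < 0.25
soiled_idx <- if (is_soiled) which(km$cluster == low) else integer(0)

# Discard the flagged days; setdiff() is safe when nothing was flagged
kept <- slopes[setdiff(seq_along(slopes), soiled_idx)]
```

With a data frame of flagged rows, the same discard step could be done with dplyr::filter() or dplyr::anti_join() on a shared key column.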

Heatmap generation for PV data

Description

Builds a ggplot heatmap of the specified column of a PV data frame.

Usage

plr_pvheatmap(
  df,
  col,
  timestamp_col,
  timestamp_format = "%Y-%m-%d %H:%M:%S",
  upper_threshold = 1,
  lower_threshold = 0,
  font_size = 12
)

Arguments

`df`	dataframe containing at least the timestamp column and the variable to be plotted with the heatmap
`col`	the character name of the column to be plotted
`timestamp_col`	the character name of the timestamp column
`timestamp_format`	the POSIXct format of the timestamp if conversion is needed
`upper_threshold`	the fraction of the data, counted from the bottom, to keep; 1 removes no data, 0.99 removes the top 1 percent, etc.
`lower_threshold`	the fraction of lower data to remove; 0 removes no data, 0.01 removes the bottom 1 percent, etc.
`font_size`	font size of the output plot

Value

returns a ggplot object heatmap of the specified column

Examples

# build heatmap
heat <- plr_pvheatmap(test_df, col = "g_poa", timestamp_col = "timestamp", 
                      upper_threshold = 0.99, lower_threshold = 0)
# display heatmap
plot(heat)
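
How the two thresholds might translate into a filter can be sketched with quantiles. This is an assumption about the behavior described in the arguments, not the function's actual code; the vector `x` and the names `bounds` and `x_trimmed` are hypothetical.

```r
# Hypothetical irradiance readings with one extreme spike
x <- c(seq(0, 990, by = 10), 5000)

upper_threshold <- 0.99
lower_threshold <- 0

# Quantile bounds implied by the two fractions
bounds <- quantile(x, probs = c(lower_threshold, upper_threshold), names = FALSE)

# Keep only readings inside the bounds; the spike falls above the upper bound
x_trimmed <- x[x >= bounds[1] & x <= bounds[2]]
```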

PVUSA Method for PLR Determination

Description

This function groups data by the specified time interval and performs a linear regression using the formula: $P = G_{POA} \cdot (\beta_{0} + \beta_{1} G_{POA} + \beta_{2} T_{amb} + \beta_{3} W)$, where $G_{POA}$ is the plane-of-array irradiance, $T_{amb}$ is the ambient temperature, and $W$ is the wind speed. Predicted values of irradiance, temperature, and wind speed (if applicable) are added for reference. These values are the lowest daily high irradiance reading (over 300), the average temperature over all data, and the average wind speed over all data.

Usage

plr_pvusa_model(
  df,
  var_list,
  by = "month",
  data_cutoff = 30,
  predict_data = NULL
)

Arguments

`df`	A dataframe containing pv data.
`var_list`	A list of the dataframe's standard variable names, obtained from the output of `plr_variable_check`.
`by`	String, either "day", "week", or "month". The time periods over which to group data for regression.
`data_cutoff`	The number of data points needed to keep a value in the final table. Regressions over fewer points than this, and their data, will be discarded.
`predict_data`	optional; Dataframe; If you have preferred estimations of irradiance, temperature, and wind speed, include them here to skip automatic generation. Format: Irradiance, Temperature, Wind (optional).

Value

Returns dataframe of results per passed time scale from PVUSA modeling

Examples

# build var_list
var_list <- plr_build_var_list(time_var = "timestamp",
                               power_var = "power",
                               irrad_var = "g_poa",
                               temp_var = "mod_temp",
                               wind_var = NA)
# Clean Data
test_dfc <- plr_cleaning(test_df, var_list, irrad_thresh = 100,
                         low_power_thresh = 0.01, high_power_cutoff = NA)
                         
# Perform the power predictive modeling step
test_pvusa_wbw_res <- plr_pvusa_model(test_dfc, var_list, by = "week",
                                      data_cutoff = 30, predict_data = NULL)
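
For orientation, the PVUSA formula can be fitted with lm() by expanding the product into terms without an intercept. The sketch below uses synthetic data and omits wind; the coefficient values, the column names, and the representative conditions (800, 20) are assumptions chosen for illustration only.

```r
set.seed(42)
n <- 200
g <- runif(n, 300, 1000)    # plane-of-array irradiance
t_amb <- runif(n, 5, 35)    # ambient temperature

# Synthetic power following P = G * (b0 + b1 * G + b2 * T), plus noise
p <- g * (0.25 - 5e-5 * g - 1e-3 * t_amb) + rnorm(n, sd = 0.5)
dat <- data.frame(p = p, g = g, t_amb = t_amb)

# Expanding the product gives three terms and no intercept
fit <- lm(p ~ 0 + g + I(g^2) + I(g * t_amb), data = dat)

# Predicted power at one fixed set of representative conditions
p_rep <- predict(fit, newdata = data.frame(g = 800, t_amb = 20))
```

Evaluating every period's model at the same fixed conditions is what makes the per-period predictions comparable over time.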

Filter outliers from Power Predicted Data

Description

This function is used to remove outliers (if desired) after putting data through a power predictive model, e.g. plr_xbx_model.

Usage

plr_remove_outliers(data)

Arguments

`data`	A resulting dataframe from a power predictive model.

Value

Returns dataframe with outliers flagged by other functions removed

Examples

# build var_list
var_list <- plr_build_var_list(time_var = "timestamp",
                               power_var = "power",
                               irrad_var = "g_poa",
                               temp_var = "mod_temp",
                               wind_var = NA)
# Clean Data
test_dfc <- plr_cleaning(test_df, var_list, irrad_thresh = 100,
                         low_power_thresh = 0.01, high_power_cutoff = NA)
                         
# Perform the power predictive modeling step
test_xbx_wbw_res <- plr_xbx_model(test_dfc, var_list, by = "week",
                                  data_cutoff = 30, predict_data = NULL)

# Remove outliers from the modeled data
test_xbx_wbw_res_no_outliers <- plr_remove_outliers(test_xbx_wbw_res)
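
Conceptually, the removal is a filter on a logical flag column. A minimal base R sketch follows; the column name `outlier` and the data values are hypothetical, and the actual column produced by the modeling functions may be named differently.

```r
# Hypothetical model output with a TRUE/FALSE outlier flag
res <- data.frame(time_var = 1:5,
                  power_var = c(100, 99, 150, 98, 97),
                  outlier = c(FALSE, FALSE, TRUE, FALSE, FALSE))

# Keep only the rows that were not flagged
res_clean <- res[!res$outlier, ]
```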

Removing Saturated Data

Description

Tests for readings which may indicate saturation of the system. Removes values above the power saturation limit (calculated by multiplying sat_limit and power_thresh).

Usage

plr_saturation_removal(df, var_list, sat_limit, power_thresh = 0.99)

Arguments

`df`	A dataframe containing pv data.
`var_list`	A list of the dataframe's standard variable names, obtained from the output of `plr_variable_check`.
`sat_limit`	An upper limit on power saturation. This is multiplied by the power threshold, and power values above this point are filtered from the dataframe. The value depends on the system's inverter.
`power_thresh`	The multiplier applied to `sat_limit` to obtain the power saturation limit. Defaults to 0.99.

Value

Returns passed data frame with rows removed which contain power values above the specified threshold

Examples

# build var_list
var_list <- plr_build_var_list(time_var = "timestamp",
                               power_var = "power",
                               irrad_var = "g_poa",
                               temp_var = "mod_temp",
                               wind_var = NA)
# Clean Data
test_dfc <- plr_cleaning(test_df, var_list, irrad_thresh = 100,
                         low_power_thresh = 0.01, high_power_cutoff = NA)
                         
test_dfc_removed_saturation <- plr_saturation_removal(test_dfc, var_list,
                                                      sat_limit = 3000, power_thresh = 0.99)
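
The cutoff arithmetic from the description can be written out in a few lines of base R. This is a sketch with hypothetical readings, not the function's code; `cutoff` and `df_unsat` are illustrative names.

```r
# Hypothetical power readings from an inverter limited to 3000
df <- data.frame(power = c(1200, 2500, 2969, 2971, 3000, 3005))

sat_limit <- 3000
power_thresh <- 0.99

# Saturation limit is the product of the two arguments (3000 * 0.99)
cutoff <- sat_limit * power_thresh

# Remove rows with power above the saturation limit
df_unsat <- df[df$power <= cutoff, , drop = FALSE]
```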

Segmented linear PLR extraction function

Description

Segmented linear PLR extraction function

Usage

plr_seg_extract(
  df,
  per_year,
  psi = NA,
  n_breakpoints,
  power_var,
  time_var,
  return_model = FALSE
)

Arguments

`df`	data frame of corrected power measurements, typically the output of a weather correction model
`per_year`	number of data points defining one seasonal year (365 for days, 52 for weeks, etc.)
`psi`	vector of one or more breakpoint estimates for the model. If not given, breakpoints are spaced evenly across the time series
`n_breakpoints`	number of desired breakpoints. Determines number of linear models
`power_var`	character name of the power variable
`time_var`	character name of the time variable
`return_model`	logical to return model object. If FALSE returns PLR results from model

Value

if return_model is FALSE it returns PLR results from model, otherwise returns segmented linear model object

Examples

# build var_list
var_list <- plr_build_var_list(time_var = "timestamp",
                               power_var = "power",
                               irrad_var = "g_poa",
                               temp_var = "mod_temp",
                               wind_var = NA)
# Clean Data
test_dfc <- plr_cleaning(test_df, var_list, irrad_thresh = 100,
                         low_power_thresh = 0.01, high_power_cutoff = NA)
                         
# Perform power modeling step
test_xbx_wbw_res <- plr_xbx_model(test_dfc, var_list, by = "week",
                                  data_cutoff = 30, predict_data = NULL)

decomp <- plr_decomposition(test_xbx_wbw_res, freq = 4,
                            power_var = 'power_var', time_var = 'time_var',
                            plot = FALSE, plot_file = NULL, title = NULL,
                            data_file = NULL)

# evaluate segmented PLR results
seg_plr_result <- PVplr::plr_seg_extract(df = decomp, per_year = 365,
                                         n_breakpoints = 1, power_var = "trend",
                                         time_var = "age")

# return segmented model instead of PLR result
model <- PVplr::plr_seg_extract(df = decomp, per_year = 365, n_breakpoints = 1,
                                power_var = "trend", time_var = "age", return_model = TRUE)

# predict data along time-series with piecewise model for plotting
pred <- data.frame(age = seq(1, max(decomp$age, na.rm = TRUE), length.out = 10000))
pred$seg <- predict(model, newdata = pred)

PLR linear model uncertainty

Description

This function returns the standard deviation of a PLR calculated from a linear model

Usage

plr_var(mod, per_year)

Arguments

`mod`	linear model
`per_year`	number of data points in a given year based on which time scale was selected

Value

Returns standard deviation of PLR value

Examples

# build var_list
var_list <- plr_build_var_list(time_var = "timestamp",
                               power_var = "power",
                               irrad_var = "g_poa",
                               temp_var = "mod_temp",
                               wind_var = NA)
# Clean Data
test_dfc <- plr_cleaning(test_df, var_list, irrad_thresh = 100,
                         low_power_thresh = 0.01, high_power_cutoff = NA)
                         
# Perform the power predictive modeling step
test_xbx_wbw_res <- plr_xbx_model(test_dfc, var_list, by = "week",
                                  data_cutoff = 30, predict_data = NULL)

# obtain standard deviation from model
mod <- lm(power_var ~ time_var, data = test_xbx_wbw_res)
plr_sd <- plr_var(mod, per_year = 52)

Define Standard Variable Names

Description

The method determines the variable names used by the input dataframe. It looks for the following labels:

power_var <- iacp; if not, sets to idcp
time_var <- tmst; if not, sets to tutc
irrad_var <- poay; if not, sets to ghir
temp_var <- temp; if not, sets to modt
wind_var <- wspa; if applicable, else NULL

This function assumes data is in a standard HBase format. If you are using other data (as you most likely are) you should use the companion function, plr_build_var_list.

Usage

plr_variable_check(df)

Arguments

`df`	A dataframe containing pv data.

Value

Returns a dataframe containing standard variable names (no data). It will not include windspeed if the variable was not already included. This is frequently an input of other functions.

Examples

var_list <- plr_variable_check(test_df)
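
The fallback rules listed in the description amount to "take the first label that exists". A rough base R sketch is shown below; the helper `pick_var` and the sample data frame are hypothetical, and the result is held in a plain list here rather than the dataframe the function returns.

```r
# Return the first candidate name that exists in the data frame, else NULL
pick_var <- function(df, candidates) {
  found <- candidates[candidates %in% names(df)]
  if (length(found) > 0) found[1] else NULL
}

# Hypothetical data frame using some of the HBase-style labels
df <- data.frame(tmst = 1:3, idcp = c(5, 6, 7),
                 poay = c(800, 820, 790), modt = c(30, 31, 29))

var_list <- list(
  time_var  = pick_var(df, c("tmst", "tutc")),
  power_var = pick_var(df, c("iacp", "idcp")),
  irrad_var = pick_var(df, c("poay", "ghir")),
  temp_var  = pick_var(df, c("temp", "modt")),
  wind_var  = pick_var(df, "wspa")   # NULL when no wind column is present
)
```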
 
Weighted Regression

Description

Automatically calculates Performance Loss Rate (PLR) using weighted linear regression. Note that it needs data from a power predictive model.

Usage

plr_weighted_regression(
  data,
  power_var,
  time_var,
  model,
  per_year = 12,
  weight_var = NA
)

Arguments

`data`	The result of a power predictive model
`power_var`	String name of the variable used as power
`time_var`	String name of the variable used as time
`model`	String name of the model that the data was passed through
`per_year`	the time step count per year based on the model - 12 for month-by-month, 52 for week-by-week, and 365 for day-by-day
`weight_var`	Used to weight regression, typically sigma.

Value

Returns PLR value and error evaluated with linear regression

Examples

# build var_list
var_list <- plr_build_var_list(time_var = "timestamp",
                               power_var = "power",
                               irrad_var = "g_poa",
                               temp_var = "mod_temp",
                               wind_var = NA)
# Clean Data
test_dfc <- plr_cleaning(test_df, var_list, irrad_thresh = 100,
                         low_power_thresh = 0.01, high_power_cutoff = NA)
                         
# Perform the power predictive modeling step
test_xbx_wbw_res <- plr_xbx_model(test_dfc, var_list, by = "week",
                                  data_cutoff = 30, predict_data = NULL)
                                  
# Calculate Performance Loss Rate
xbx_wbw_plr <- plr_weighted_regression(test_xbx_wbw_res, 
                                       power_var = 'power_var', 
                                       time_var = 'time_var',
                                       model = "xbx", 
                                       per_year = 52, 
                                       weight_var = 'sigma')
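
To make the calculation concrete, the sketch below fits a weighted line to synthetic weekly data and converts the slope into a yearly percentage, with a first-order (delta method) uncertainty. Several points are assumptions rather than documented package behavior: the inverse-variance weights, the PLR definition as slope over intercept scaled by per_year, and the error propagation. The data are invented.

```r
set.seed(7)
per_year <- 52
res <- data.frame(time_var = 1:156)     # three years of weekly steps
res$sigma <- runif(156, 3, 8)           # hypothetical per-step model error
res$power_var <- 1000 - 0.2 * res$time_var + rnorm(156, sd = res$sigma)

# Weighted fit; inverse-variance weights are an assumption of this sketch
mod <- lm(power_var ~ time_var, data = res, weights = 1 / sigma^2)
b <- coef(mod)[[1]]   # intercept: fitted starting power
m <- coef(mod)[[2]]   # slope: change in power per time step

# PLR in percent per year, relative to the fitted starting power
plr <- 100 * per_year * m / b

# Delta-method standard deviation, including the slope-intercept covariance
grad <- 100 * per_year * c(-m / b^2, 1 / b)
plr_sd <- sqrt(drop(t(grad) %*% vcov(mod) %*% grad))
```

The synthetic series loses 0.2 units per week from a start of 1000, so the recovered value should sit near -1.04 percent per year.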

XbX Method for PLR Determination

Description

This function groups data by the specified time interval and performs a linear regression using the formula: $P_{pred.} = \beta_0 + \beta_1 G + \beta_2 T + \epsilon$. This is the simplest of the PLR determining methods. Predicted values of irradiance, temperature, and wind speed (if applicable) are added to the output for reference. These values are the lowest daily high irradiance reading (over 300), the average temperature over all data, and the average wind speed over all data. Outliers are detected and labeled in a column as TRUE or FALSE.

Usage

plr_xbx_model(
  df,
  var_list,
  by = "month",
  data_cutoff = 30,
  predict_data = NULL
)

Arguments

`df`	A dataframe containing pv data.
`var_list`	A list of the dataframe's standard variable names, obtained from the plr_variable_check output.
`by`	String, either "day", "week", or "month". The time periods over which to group data for regression.
`data_cutoff`	The number of data points needed to keep a value in the final table. Regressions over fewer points than this, and their data, will be discarded.
`predict_data`	optional; Dataframe; If you have preferred estimations of irradiance, temperature, and wind speed, include them here to skip automatic generation. Format: Irradiance, Temperature, Wind (optional).

Value

Returns dataframe of results per passed time scale from XbX modeling

Examples

# build var_list
var_list <- plr_build_var_list(time_var = "timestamp",
                               power_var = "power",
                               irrad_var = "g_poa",
                               temp_var = "mod_temp",
                               wind_var = NA)
# Clean Data
test_dfc <- plr_cleaning(test_df, var_list, irrad_thresh = 100,
                         low_power_thresh = 0.01, high_power_cutoff = NA)
                         
# Perform the power predictive modeling step
test_xbx_wbw_res <- plr_xbx_model(test_dfc, var_list, by = "week",
                                  data_cutoff = 30, predict_data = NULL)
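
The per-period regression and the data_cutoff rule can be sketched as follows. This is an illustration on synthetic data, not the package's code: the column names, the helper `fit_week`, and the representative conditions in `rep_cond` are hypothetical.

```r
set.seed(3)
n <- 300
dat <- data.frame(
  week  = rep(1:3, times = c(140, 140, 20)),  # week 3 has too few points
  irrad = runif(n, 300, 1000),
  temp  = runif(n, 10, 40)
)
dat$power <- 5 + 0.9 * dat$irrad - 0.5 * dat$temp + rnorm(n, sd = 2)

data_cutoff <- 30
rep_cond <- data.frame(irrad = 800, temp = 25)  # fixed representative conditions

# Fit one model per week and predict power at the representative conditions
fit_week <- function(d) {
  if (nrow(d) < data_cutoff) return(NULL)       # drop sparse periods
  mod <- lm(power ~ irrad + temp, data = d)
  data.frame(week = d$week[1],
             power_pred = unname(predict(mod, newdata = rep_cond)))
}
res <- do.call(rbind, lapply(split(dat, dat$week), fit_week))
```

Only weeks 1 and 2 survive the cutoff, and each contributes one predicted power value to the series that later feeds the PLR regression.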

UTC Method for PLR Determination

Description

This function groups data by the specified time interval and performs a linear regression using the formula: power_corr ~ irrad_var - 1. Predicted values of irradiance, temperature, and wind speed (if applicable) are added for reference. The function uses a universal temperature correction, rather than the monthly regression correction done in other PLR determining methods.

Usage

plr_xbx_utc_model(
  df,
  var_list,
  by = "month",
  data_cutoff = 30,
  predict_data = NULL,
  ref_irrad = 900,
  irrad_range = 10
)

Arguments

`df`	A dataframe containing pv data.
`var_list`	A list of the dataframe's standard variable names, obtained from the output of `plr_variable_check`.
`by`	String, either "day", "week", or "month". The time periods over which to group data for regression.
`data_cutoff`	The number of data points needed to keep a value in the final table. Regressions over fewer points than this, and their data, will be discarded.
`predict_data`	optional; Dataframe; If you have preferred estimations of irradiance, temperature, and wind speed, include them here to skip automatic generation. Format: Irradiance, Temperature, Wind (optional).
`ref_irrad`	The irradiance value at which to calculate the universal temperature coefficient. Since irradiance influences power generation much more strongly than temperature does, it is important to specify a small range of irradiance data from which to estimate the effect of temperature.
`irrad_range`	The range of the subset used to calculate the universal temperature coefficient. See above.

Value

Returns dataframe of results per passed time scale from XbX with universal temperature correction modeling

Examples

# build var_list
var_list <- plr_build_var_list(time_var = "timestamp",
                               power_var = "power",
                               irrad_var = "g_poa",
                               temp_var = "mod_temp",
                               wind_var = NA)
# Clean Data
test_dfc <- plr_cleaning(test_df, var_list, irrad_thresh = 100,
                         low_power_thresh = 0.01, high_power_cutoff = NA)
                         
# Perform the power predictive modeling step
test_xbx_wbw_res <- plr_xbx_utc_model(test_dfc, var_list, by = "week",
                                      data_cutoff = 30, predict_data = NULL,
                                      ref_irrad = 900, irrad_range = 10)
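
The idea of a universal temperature correction can be sketched in three steps: estimate one temperature coefficient from a narrow irradiance band, correct all power readings with it, then regress corrected power on irradiance through the origin. The code below is an assumption-laden illustration on synthetic data. In particular, the 25 degree reference temperature, the relative form of the coefficient, and the reading of irrad_range as a half-width are choices made for this sketch and are not confirmed by the documentation.

```r
set.seed(11)
n <- 2000
irrad <- runif(n, 200, 1000)
temp  <- runif(n, 10, 50)
# Synthetic power: proportional to irradiance, losing 0.4% per degree above 25
power <- 3 * irrad * (1 - 0.004 * (temp - 25)) + rnorm(n, sd = 5)

ref_irrad <- 900
irrad_range <- 10

# Estimate the temperature effect only from points close to ref_irrad
near <- abs(irrad - ref_irrad) <= irrad_range
tfit <- lm(power[near] ~ temp[near])

# Relative coefficient: slope divided by the fitted power at 25 degrees
p25 <- coef(tfit)[[1]] + 25 * coef(tfit)[[2]]
temp_coef <- coef(tfit)[[2]] / p25

# Apply the single coefficient to all data, then regress through the origin
power_corr <- power / (1 + temp_coef * (temp - 25))
fit <- lm(power_corr ~ irrad - 1)
```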

Year-on-Year Regression

Description

Automatically calculates Performance Loss Rate (PLR) using year on year regression. Note that it needs data from a power predictive model.

Usage

plr_yoy_regression(
  data,
  power_var,
  time_var,
  model,
  per_year = 12,
  return_PLR = TRUE
)

Arguments

`data`	Result of a power predictive model
`power_var`	String name of the variable used as power
`time_var`	String name of the variable used as time
`model`	String name of the model the data was passed through
`per_year`	Time step count per year based on the model: 12 for month-by-month, 52 for week-by-week, and 365 for day-by-day.
`return_PLR`	Boolean; if TRUE, returns the PLR value rather than the raw regression data.

Value

Returns PLR value and error evaluated with YoY regression. If return_PLR is FALSE, it returns the individual YoY calculations instead.

Examples

# build var_list
var_list <- plr_build_var_list(time_var = "timestamp",
                               power_var = "power",
                               irrad_var = "g_poa",
                               temp_var = "mod_temp",
                               wind_var = NA)
# Clean Data
test_dfc <- plr_cleaning(test_df, var_list, irrad_thresh = 100,
                         low_power_thresh = 0.01, high_power_cutoff = NA)
                         
# Perform the power predictive modeling step
test_xbx_wbw_res <- plr_xbx_model(test_dfc, var_list, by = "week",
                                  data_cutoff = 30, predict_data = NULL)
                                  
# Calculate Performance Loss Rate
xbx_wbw_plr <- plr_yoy_regression(test_xbx_wbw_res,
                                  power_var = 'power_var',
                                  time_var = 'time_var',
                                  model = "xbx",
                                  per_year = 52,
                                  return_PLR = TRUE)
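
The year-on-year idea is to compare each point with the point one year later, so that seasonal effects cancel. A minimal sketch on synthetic weekly data follows. Taking the median of the individual rates is a common convention and an assumption here; the package may pair points and summarise them differently.

```r
per_year <- 52
n <- 4 * per_year                       # four years of weekly points
step <- seq_len(n)
# Hypothetical predicted power: 1% per year decline plus a seasonal cycle
power <- 1000 * (1 - 0.01 * step / per_year) + 20 * sin(2 * pi * step / per_year)

# Pair each point with the one exactly one year later
early <- power[1:(n - per_year)]
late  <- power[(per_year + 1):n]

# Percent change per year for every pair; the seasonal term cancels
yoy <- 100 * (late - early) / early

# Summarise the distribution of individual rates with its median
plr <- median(yoy)
```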


Spline columns to match timestamps.

Description

Often the timestamps of two data frames will be mismatched. To produce matching timestamps, columns that can be interpolated are splined, and the interpolated values at the 'correct' timestamps are used.

Usage

spline_timestamp_sync(
  data,
  data_ts = "timestamp",
  merge_data,
  merge_ts = "timestamp"
)

Arguments

`data`	A data frame with a correct timestamp column.
`data_ts`	The column name for the `data` timestamp. Defaults to 'timestamp'.
`merge_data`	A data frame that will be linearly interpolated and merged with `data`.
`merge_ts`	The column name for the `merge_data` timestamp. Defaults to 'timestamp'.

Details

Any value that cannot be linearly interpolated, such as a string, will remain the same.

Value

The resulting merged data frame.

Author(s)

Arash Khalilnejad
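
The interpolation step can be illustrated with stats::approx(). This sketch shows the general technique for one numeric column on invented data; it is not the function's implementation, and the column `temp` is hypothetical.

```r
# Reference timestamps every 15 minutes
data <- data.frame(
  timestamp = as.POSIXct("2020-06-01 12:00:00", tz = "UTC") + 900 * 0:3
)
# Second source sampled every 20 minutes on a shifted grid
merge_data <- data.frame(
  timestamp = as.POSIXct("2020-06-01 11:50:00", tz = "UTC") + 1200 * 0:3,
  temp = c(20, 22, 24, 26)
)

# Linearly interpolate temp onto the reference timestamps
data$temp <- approx(
  x = as.numeric(merge_data$timestamp),
  y = merge_data$temp,
  xout = as.numeric(data$timestamp)
)$y
```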

DOE RTC Sample PV System Data

Description

A dataset containing a small, randomly taken sample of PV data from SDLE's data collection. It is included for the purposes of unit tests and vignettes, serving as an example of how the package's functions work.

Usage

test_df

Format

A .csv file that can be read as a dataframe. 16265 rows and 22 variables.

Determines the minutes between data points in a time-series

Description

Determines the minutes between data points in a time-series

Usage

time_frequency(data)

Arguments

`data`	A time-series dataframe containing a column named 'timestamp'.

Value

a numeric value of the minutes between data points

Author(s)

Arash Khalilnejad
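
One plausible way to obtain this value is the median gap between consecutive timestamps. The sketch below assumes that approach, which the documentation does not confirm, and uses invented timestamps.

```r
# Hypothetical 15-minute series with one missing reading (a 30-minute gap)
ts <- as.POSIXct("2020-06-01 00:00:00", tz = "UTC") + 60 * c(0, 15, 30, 45, 75, 90)

# Gaps between consecutive readings, in minutes
gaps <- as.numeric(diff(ts), units = "mins")

# The median is robust to the occasional missing reading
freq_mins <- median(gaps)
```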

Inflate a time series data set.

Description

Shifts known values to the nearest equidistant timestamp and fills in any missing timestamps with NA values. An additional binary column named <column to impute>_imp is added, where 1 represents an unknown value and 0 represents a known value.

Usage

ts_inflate(data, ts_col, col_to_imp, dt)

Arguments

`data`	A data frame containing columns `ts_col` and `col_to_imp`.
`ts_col`	The name of the timestamp column.
`col_to_imp`	The name of the column to impute.
`dt`	The expected time between consecutive timestamps, in minutes.
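
The inflation described above can be sketched in base R: snap timestamps to the grid, build the full grid, and flag the gaps. This is an illustration on invented data under the assumption that "nearest equidistant timestamp" means rounding to the nearest multiple of dt; the column names are hypothetical.

```r
dt <- 15  # expected minutes between readings
data <- data.frame(
  timestamp = as.POSIXct("2020-06-01 00:00:00", tz = "UTC") + 60 * c(1, 14, 46),
  power = c(10, 12, 15)
)

# Snap each timestamp to the nearest multiple of dt
step <- dt * 60
snapped <- as.POSIXct(round(as.numeric(data$timestamp) / step) * step,
                      origin = "1970-01-01", tz = "UTC")

# Build the complete equidistant grid and place known values on it
grid <- data.frame(timestamp = seq(min(snapped), max(snapped), by = step))
grid$power <- data$power[match(as.numeric(grid$timestamp), as.numeric(snapped))]

# 1 marks a timestamp that had no reading, 0 a known value
grid$power_imp <- as.integer(is.na(grid$power))
```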
