
This function estimates models on a time series dataset and produces forecasts, including subset forecasts within the length of the trend. The recognized models are lm, smooth spline, polynomial splines with or without knots, quadratic polynomial, and ARIMA. The output includes the models' estimates, time-varying forecasts, and plots based on ggplot themes. The main attraction of this function is its use of a newly introduced equal number of trend units (days, months, years) to estimate forecasts from the model. For now, the function accepts daily, monthly, and yearly datasets.

Usage

DynamicForecast(date, series, Trend, Type, MaximumDate, x = 0, BREAKS = 0,
 ORIGIN = origin, Length = 0, ...)

Arguments

date

A vector containing the dates on which the data were collected. Must be the same length as series. Dates must be in 'YYYY-MM-DD' format. For a monthly series, the recognized date is the last day of each month, e.g. 2021-02-28. For a yearly series, the recognized date is the last day of each year, e.g. 2020-12-31. Quarterly data are not supported for now.
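As an illustration of the month-end and year-end convention described above, the following base R sketch builds such date vectors. The variable names are hypothetical and the snippet is not part of the package; it only shows one way to prepare dates in the expected form.

```r
# Month-end dates: take the first day of each following month and step back one day.
month_starts <- seq(as.Date("2021-02-01"), by = "month", length.out = 3)
month_ends <- month_starts - 1
format(month_ends)  # "2021-01-31" "2021-02-28" "2021-03-31"

# Year-end dates for a yearly series.
year_ends <- as.Date(paste0(2018:2020, "-12-31"))
format(year_ends)   # "2018-12-31" "2019-12-31" "2020-12-31"
```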

series

A vector containing the data for estimation and forecasting. Must be the same length as date.

x

An optional vector of additional data to be added to the model for forecasting. Modeling and forecasting still proceed if it is not provided. Must be the same length as series.

BREAKS

A vector of numbers indicating points of breaks for estimation of the spline models.

MaximumDate

The maximum (last) date in the data frame; forecasting starts on the date immediately following it. The date must be in a recognized date format. Note that for forecasting, the date origin is set to 1970-01-01.
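To make the "forecasting starts on the next date" rule concrete, here is a minimal base R sketch of the date arithmetic for a daily series. The horizon of 5 and the variable names are assumptions for illustration; this is not the package's internal code.

```r
maximum_date <- as.Date("2021-02-10")  # last observed date
horizon <- 5                           # hypothetical forecast length

# Forecast dates begin the day after the maximum date.
forecast_dates <- seq(maximum_date + 1, by = "day", length.out = horizon)
format(forecast_dates)
# "2021-02-11" "2021-02-12" "2021-02-13" "2021-02-14" "2021-02-15"
```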

Trend

The type of trend. There are three options: Day, Month, and Year.

Type

The type of response variable. There are two options: Continuous and Integer. For an integer variable, the forecasts are constrained to lie between the minimum and maximum values of the response variable.

Length

The length of the forecast. If not given, it defaults to the length of the dataset, i.e. the sample size.

ORIGIN

The date origin. If different from 1970-01-01, it must be in the format "YYYY-MM-DD". It is used to position the dates of the data so that the forecasts are dated correctly.
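The role of a date origin can be seen in base R, where a numeric day count only becomes a calendar date relative to some origin. The sketch below is an illustration under that general behaviour, not a description of how the function uses ORIGIN internally.

```r
# A Date is stored as the number of days since 1970-01-01.
day_count <- as.numeric(as.Date("2021-02-10"))
day_count  # 18668

# The same calendar date recovered from a day count under two different origins.
default_origin <- as.Date(day_count, origin = "1970-01-01")
custom_origin <- as.Date(347, origin = "2020-02-29")
format(c(default_origin, custom_origin))  # "2021-02-10" "2021-02-10"
```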

...

Additional arguments that may be passed to the function. If the maximum date is NULL, which is the default, it is set to the last date of the series.

Value

A list with the following components:

Spline without knots

The estimated spline model without the breaks (knots).

Spline with knots

The estimated spline model with the breaks (knots).

Smooth Spline

The smooth spline estimates.

ARIMA

Estimated Auto Regressive Integrated Moving Average model.

Quadratic

The estimated quadratic polynomial model.

Ensembled with equal weight

Estimated ensemble model with equal weight given to each of the models. To obtain it, the fitted values of each model are divided by the number of models and then summed.
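The equal-weight calculation described above can be sketched in base R as follows. The fitted values and column names are made-up numbers for illustration, not output from the function.

```r
# Hypothetical fitted values from three models (one column per model).
fits <- cbind(
  spline    = c(10, 12, 14),
  arima     = c(11, 13, 15),
  quadratic = c(12, 11, 16)
)

# Divide each model's fitted values by the number of models, then sum across models.
equal_weight <- rowSums(fits / ncol(fits))
equal_weight  # approximately 11 12 15, the same as rowMeans(fits)
```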

Ensembled based on weight

Estimated ensemble model based on the weight of each model. To obtain it, the fitted values of each model serve as independent variables and are regressed against the trend, with interactions among the variables.

Ensembled based on summed weight

Estimated ensemble model based on the summed weight of each model. To obtain it, the fitted values of each model serve as independent variables and are regressed against the trend, without interactions.

Ensembled based on weight of fit

Estimated ensemble model based on the fit of each model, where fit is measured by the RMSE.

Unconstrained Forecast

The forecast if the response variable is continuous. The number of forecasts is equivalent to the length of the dataset (equal days forecast).

Constrained Forecast

The forecast if the response variable is integer. The number of forecasts is equivalent to the length of the dataset (equal days forecast).

RMSE

Root Mean Square Error (RMSE) for each forecast.
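For reference, the standard RMSE definition can be written in a few lines of base R. The helper name `rmse` and the sample values are assumptions for illustration; the function may compute the statistic through other means.

```r
# Root mean square error: square root of the mean squared difference.
rmse <- function(observed, predicted) {
  sqrt(mean((observed - predicted)^2))
}

observed <- c(10, 12, 14)
predicted <- c(11, 12, 16)
error <- rmse(observed, predicted)  # squared errors 1, 0, 4 give sqrt(5 / 3)
```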

Unconstrained forecast Plot

The combined plots of the unconstrained forecasts using ggplot.

Constrained forecast Plot

The combined plots of the constrained forecasts using ggplot.

Date

This is the date range for the forecast.

Fitted plot

This is the plot of the fitted models.

Estimated coefficients

These are the estimated coefficients of the various models in the forecast.

Examples

# Uncomment to run.
# COVID19$Date <- zoo::as.Date(COVID19$Date, format = '%m/%d/%Y')
# # The date is formatted to R format
# LEN <- length(COVID19$Case)
# Dss <- seq(COVID19$Date[1], by = "day", length.out = LEN)
# # data length for forecast
# ORIGIN <- "2020-02-29"
# lastdayfo21 <- Dss[length(Dss)] # The maximum length
# Data <- COVID19[COVID19$Date <= lastdayfo21 - 28, ]
# # desired length of forecast
# BREAKS <- c(70, 131, 173, 228, 274) # The default breaks for the data
# DynamicForecast(date = Data$Date, series = Data$Case,
#   BREAKS = BREAKS, MaximumDate = "2021-02-10",
#   Trend = "Day", Length = 0, Type = "Integer")
#
# lastdayfo21 <- Dss[length(Dss)]
# Data <- COVID19[COVID19$Date <= lastdayfo21 - 14, ]
# BREAKS <- c(70, 131, 173, 228, 274)
# DynamicForecast(date = Data$Date, series = Data$Case,
#   BREAKS = BREAKS, MaximumDate = "2021-02-10",
#   Trend = "Day", Length = 0, Type = "Integer")