The function estimates and predicts models using a time series dataset and provides subset forecasts within the length of the trend. The recognized models are lm, smooth spline, polynomial splines with or without knots, quadratic polynomial, and ARIMA. The robust output includes the models' estimates, time-varying forecasts, and plots based on ggplot themes. The main attraction of this function is its use of the newly introduced equal number of trend periods (days, months, years) to produce forecasts from the model. For now, the function accepts daily, monthly and yearly datasets.
Usage
DynamicForecast(date, series, Trend, Type, MaximumDate, x = 0, BREAKS = 0,
ORIGIN = origin, Length = 0, ...)
Arguments
- date
A vector containing the dates for which the data is collected. Must be the same length as series. The date must be in 'YYYY-MM-DD' format. If the data is a monthly series, the recognized date format is the last day of the month of the dataset, e.g. 2021-02-28. If the data is a yearly series, the recognized date format is the last day of the year of the dataset, e.g. 2020-12-31. There is no format for quarterly data for now.
- series
A vector containing data for estimation and forecasting. Must be the same length as date.
- x
An optional vector of additional data to be added to the model for forecasting. Modeling and forecasting are still done if it is not provided. Must be the same length as series.
- BREAKS
A vector of numbers indicating the break points (knots) for estimation of the spline models.
- MaximumDate
The maximum (last) date in the data frame; forecasting starts on the date immediately following it. The date must be in a recognized date format. Note that for forecasting, the date origin is set to 1970-01-01.
- Trend
The type of trend. There are three options: Day, Month and Year.
- Type
The type of response variable. There are two options: Continuous and Integer. For an integer variable, the forecasts are constrained between the minimum and maximum values of the response variable.
- Length
The length for which the forecast is made. If not given, it defaults to the length of the dataset, i.e. the sample size.
- ORIGIN
The date origin. If different from 1970-01-01, it must be in the format "YYYY-MM-DD". This is used to position the dates of the data in order to properly date the forecasts.
- ...
Additional arguments that may be passed to the function. If the maximum date is NULL, which is the default, it is set to the last date of the series.
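As a rough illustration of the date conventions described above, the base R sketch below builds month-end dates for a monthly series and the dates on which forecasts would fall when they start the day after MaximumDate. The object names (month_starts, month_ends, maximum_date, forecast_dates) are hypothetical and not part of the package, and the sketch mirrors only the documented conventions, not the function's internals.

```r
# Hypothetical objects for illustration; none of these names come from DynamicForecast.

# Monthly series: each observation is dated on the last day of its month.
# Take the first day of the following months and step back one day.
month_starts <- seq(as.Date("2021-01-01"), by = "month", length.out = 4)
month_ends <- month_starts[-1] - 1
print(format(month_ends, "%Y-%m-%d"))

# Daily series: forecasts begin on the date after the maximum (last) date.
# Assumption: the number of forecasts equals the sample size n,
# as documented for the case where Length is not given.
n <- 5
maximum_date <- as.Date("2021-02-10")
forecast_dates <- seq(maximum_date + 1, by = "day", length.out = n)
print(forecast_dates)
```

Building the dates with seq() on Date objects keeps them in the 'YYYY-MM-DD' form that the date argument expects.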
Value
A list with the following components:
Spline without knots
The estimated spline model without the breaks (knots).
Spline with knots
The estimated spline model with the breaks (knots).
Smooth Spline
The smooth spline estimates.
ARIMA
The estimated autoregressive integrated moving average (ARIMA) model.
Quadratic
The estimated quadratic polynomial model.
Ensembled with equal weight
Estimated ensemble model with equal weight given to each model. The fitted values of each model are divided by the number of models and then summed.
Ensembled based on weight
Estimated ensemble model based on the weight of each model. The fitted values of each model serve as independent variables and are regressed against the trend, with interactions among the variables.
Ensembled based on summed weight
Estimated ensemble model based on the summed weight of each model. The fitted values of each model serve as independent variables and are regressed against the trend, without interactions.
Ensembled based on weight of fit
Estimated ensemble model based on the fit of each model, where fit is measured by the root mean square error (RMSE).
Unconstrained Forecast
The forecast when the response variable is continuous. The number of forecasts equals the length of the dataset (equal days forecast).
Constrained Forecast
The forecast when the response variable is integer. The number of forecasts equals the length of the dataset (equal days forecast).
RMSE
Root mean square error (RMSE) for each forecast.
Unconstrained forecast Plot
The combined plots of the unconstrained forecasts using ggplot.
Constrained forecast Plot
The combined plots of the constrained forecasts using ggplot.
Date
The date range for the forecast.
Fitted plot
The plot of the fitted models.
Estimated coefficients
The estimated coefficients of the various models in the forecast.
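The equal-weight ensemble and the RMSE components listed above can be sketched in base R. This is only an illustration under stated assumptions: the data are synthetic, the object names (trend, fits, equal_weight, rmse) are hypothetical, and only three of the documented model families are fitted, so it should not be read as the package's actual implementation.

```r
# Illustrative only: synthetic data and object names are assumptions,
# not the internals of DynamicForecast.
set.seed(1)
trend <- 1:60
y <- 10 + 0.5 * trend + 0.01 * trend^2 + rnorm(60)

# Three of the documented model families, fitted with base R (stats).
fit_lm     <- lm(y ~ trend)
fit_quad   <- lm(y ~ trend + I(trend^2))
fit_smooth <- smooth.spline(trend, y)

# One column of fitted values per model.
fits <- cbind(
  Linear    = fitted(fit_lm),
  Quadratic = fitted(fit_quad),
  Smooth    = predict(fit_smooth, trend)$y
)

# Equal-weight ensemble: divide each model's fitted values by the
# number of models, then sum across models.
equal_weight <- rowSums(fits / ncol(fits))

# Root mean square error of a set of fitted values against the data.
rmse <- function(actual, fitted_values) {
  sqrt(mean((actual - fitted_values)^2))
}

rmse_by_model <- c(
  apply(fits, 2, rmse, actual = y),
  Ensemble = rmse(y, equal_weight)
)
print(round(rmse_by_model, 3))
```

Dividing by the number of models before summing is the same as taking the row-wise mean of the fitted values, so rowMeans(fits) would give an identical ensemble.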
Examples
# # Uncomment to run.
# # Convert the date column to R's Date class
# COVID19$Date <- zoo::as.Date(COVID19$Date, format = '%m/%d/%Y')
# LEN <- length(COVID19$Case) # data length for forecast
# Dss <- seq(COVID19$Date[1], by = "day", length.out = LEN)
# ORIGIN <- "2020-02-29"
# lastdayfo21 <- Dss[length(Dss)] # the maximum (last) date
# # Drop the last 28 days (the desired length of forecast)
# Data <- COVID19[COVID19$Date <= lastdayfo21 - 28, ]
# BREAKS <- c(70, 131, 173, 228, 274) # the default breaks for the data
# DynamicForecast(date = Data$Date, series = Data$Case,
# BREAKS = BREAKS, MaximumDate = "2021-02-10",
# Trend = "Day", Length = 0, Type = "Integer")
#
# # Same call with the last 14 days dropped instead
# lastdayfo21 <- Dss[length(Dss)]
# Data <- COVID19[COVID19$Date <= lastdayfo21 - 14, ]
# BREAKS <- c(70, 131, 173, 228, 274)
# DynamicForecast(date = Data$Date, series = Data$Case,
# BREAKS = BREAKS, MaximumDate = "2021-02-10",
# Trend = "Day", Length = 0, Type = "Integer")