AutoARIMA

class cuml.tsa.auto_arima.AutoARIMA(endog, *, simple_differencing=True, verbose=False, output_type=None, convert_dtype=True)

Implements a batched auto-ARIMA model for in- and out-of-sample time-series prediction.

This interface offers a highly customizable search, with functionality similar to the forecast and fable packages in R. It provides an abstraction around the underlying ARIMA models to predict and forecast as if using a single model.

Parameters:
endog: dataframe or array-like (device or host)

The time series data, assumed to have each time series in columns. Acceptable formats: cuDF DataFrame, cuDF Series, NumPy ndarray, Numba device ndarray, or any CUDA array interface compliant array such as CuPy.

simple_differencing: bool or int, default=True

If True, the data is differenced before being passed to the Kalman filter. If False, differencing is part of the state-space model. See the additional notes in the ARIMA documentation.

verbose: int or boolean, default=False

Sets logging level. It must be one of cuml.common.logger.level_*. See Verbosity Levels for more info.

output_type: {‘input’, ‘array’, ‘dataframe’, ‘series’, ‘df_obj’, ‘numba’, ‘cupy’, ‘numpy’, ‘cudf’, ‘pandas’}, default=None

Return results and set estimator attributes to the indicated output type. If None, the output type set at the module level (cuml.global_settings.output_type) will be used. See Output Data Type Configuration for more info.

convert_dtype: boolean, default=True

When set to True, the model will automatically convert the inputs to np.float64.
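
To make the differencing that simple_differencing refers to concrete, the following is a minimal pure-Python sketch of simple differencing (order d) and seasonal differencing (order D with period s). The helper name `difference` and the sample series are hypothetical, and this is not cuML's implementation; it only illustrates the transformation applied to one column of endog.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag (hypothetical helper)."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# One series with an upward trend and a pattern that repeats every 4 steps.
y = [10, 12, 11, 13, 14, 16, 15, 17, 18, 20, 19, 21]

# Simple differencing (d=1): consecutive changes, one observation is lost.
dy = difference(y, lag=1)

# Seasonal differencing (D=1, s=4): change versus the same position
# in the previous season, s observations are lost.
sdy = difference(y, lag=4)

print(dy)   # [2, -1, 2, 1, 2, -1, 2, 1, 2, -1, 2]
print(sdy)  # [4, 4, 4, 4, 4, 4, 4, 4]
```

Each differencing pass shortens the series, which is one reason the choice between differencing up front and differencing inside the state-space model can matter for short series.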

Attributes:
d_y

Methods

fit(self[, h, maxiter, method, truncate])

Fits the selected models for their respective series.

forecast(self, nsteps[, level])

Forecast nsteps into the future.

predict(self[, start, end, level])

Compute in-sample and/or out-of-sample predictions for each series.

search(self[, s, d, D, p, q, P, Q, ...])

Searches through the specified model space and associates each series to the most appropriate model.

summary(self)

Display a quick summary of the models selected by search.

Notes

The interface was influenced by the R fable package; see https://fable.tidyverts.org/reference/ARIMA.html

References

A useful (though outdated) reference is the paper:

[1] Rob J. Hyndman and Yeasmin Khandakar, 2008. “Automatic Time Series Forecasting: The ‘forecast’ Package for R”, Journal of Statistical Software 27.

Examples

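from cuml.tsa.auto_arima import AutoARIMA

# y holds the batch of time series, one series per column (see endog above)
model = AutoARIMA(y)
# Search the model space using the fast CSS approximation
model.search(s=12, d=(0, 1), D=(0, 1), p=(0, 2, 4), q=(0, 2, 4),
             P=range(2), Q=range(2), method="css", truncate=100)
# Fit the selected models, starting with CSS and refining with ML
model.fit(method="css-ml")
# Forecast 20 steps beyond the end of each series
fc = model.forecast(20)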

fit(self, h=1e-8, maxiter=1000, method='ml', truncate=0)

Fits the selected models for their respective series.

Parameters:
h: float

Finite-differencing step size used to compute gradients in ARIMA.

maxiter: int

Maximum number of iterations of L-BFGS-B.

method: str

Estimation method - “css”, “css-ml” or “ml”. CSS uses a fast sum-of-squares approximation. ML estimates the log-likelihood with statespace methods. CSS-ML starts with CSS and refines with ML.

truncate: int

When using CSS, start the sum of squares after a given number of observations for better performance (but often a worse fit).
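
The effect of truncate on a conditional sum of squares can be sketched on a toy AR(1) model. This is an illustration under stated assumptions, not cuML's code: the function `css_ar1`, its arguments, and the exact start index are hypothetical, and a real CSS objective also handles MA and seasonal terms.

```python
def css_ar1(y, phi, intercept=0.0, truncate=0):
    """Conditional sum of squares for an AR(1) model (illustrative only).

    Residuals are e[t] = y[t] - intercept - phi * y[t - 1]. The sum starts
    at index max(1, truncate), so the first observations are skipped.
    """
    start = max(1, truncate)
    return sum((y[t] - intercept - phi * y[t - 1]) ** 2
               for t in range(start, len(y)))


# Noise-free AR(1) series generated with phi = 0.5.
y = [8.0]
for _ in range(9):
    y.append(0.5 * y[-1])

exact = css_ar1(y, phi=0.5)               # true coefficient: zero residuals
full = css_ar1(y, phi=0.4)                # residuals for t = 1..9
short = css_ar1(y, phi=0.4, truncate=5)   # residuals for t = 5..9 only

print(exact, full > short > 0)
```

Skipping early residuals means fewer terms to evaluate, but the discarded observations no longer influence the estimate, which is the trade-off the truncate description refers to.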

forecast(self, nsteps, level=None) -> Union[CumlArray, Tuple[CumlArray, CumlArray, CumlArray]]

Forecast nsteps into the future.

Parameters:
nsteps: int

The number of steps to forecast beyond the end of the given series.

level: float or None (default = None)

Confidence level for prediction intervals, or None to return only the point forecasts. Must satisfy 0 < level < 1.

Returns:
y_fc: array-like

Forecasts. Shape = (nsteps, batch_size)

lower: array-like (device) (optional)

Lower limit of the prediction interval if level is not None. Shape = (nsteps, batch_size)

upper: array-like (device) (optional)

Upper limit of the prediction interval if level is not None. Shape = (nsteps, batch_size)
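
As a rough illustration of how a confidence level maps to lower and upper limits, the sketch below builds a two-sided interval around a point forecast assuming Gaussian forecast errors. The function `gaussian_interval` and the standard error passed to it are hypothetical; the source does not state how cuML computes its intervals, so treat this only as the textbook construction.

```python
from statistics import NormalDist


def gaussian_interval(point, std_err, level):
    """Two-sided interval point +/- z * std_err, for 0 < level < 1."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must satisfy 0 < level < 1")
    # Quantile that leaves (1 - level) / 2 probability in each tail.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return point - z * std_err, point + z * std_err


lower, upper = gaussian_interval(point=100.0, std_err=2.0, level=0.95)
print(round(lower, 2), round(upper, 2))  # 96.08 103.92
```

A higher level gives a wider interval, and level=None in forecast skips the interval entirely and returns only the point forecasts.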

predict(self, start=0, end=None, level=None) -> Union[CumlArray, Tuple[CumlArray, CumlArray, CumlArray]]

Compute in-sample and/or out-of-sample predictions for each series.

Parameters:
start: int

Index where to start the predictions (0 <= start <= num_samples).

end: int or None (default = None)

Index where to end the predictions, excluded (end > start).

level: float or None (default = None)

Confidence level for prediction intervals, or None to return only the point forecasts. Must satisfy 0 < level < 1.

Returns:
y_p: array-like (device)

Predictions. Shape = (end - start, batch_size)

lower: array-like (device) (optional)

Lower limit of the prediction interval if level is not None. Shape = (end - start, batch_size)

upper: array-like (device) (optional)

Upper limit of the prediction interval if level is not None. Shape = (end - start, batch_size)
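
The start and end indices above can be read as a half-open range [start, end) that may straddle the end of the observed data. The sketch below splits such a range into its in-sample and out-of-sample parts, assuming zero-based indexing where index num_samples is the first out-of-sample step. The helper `split_prediction_range` is hypothetical and only mirrors the constraints stated in the parameter descriptions.

```python
def split_prediction_range(start, end, num_samples):
    """Split [start, end) into in-sample and out-of-sample index ranges."""
    if not 0 <= start <= num_samples:
        raise ValueError("start must satisfy 0 <= start <= num_samples")
    if end <= start:
        raise ValueError("end must be greater than start")
    in_sample = range(start, min(end, num_samples))
    out_of_sample = range(max(start, num_samples), end)
    return in_sample, out_of_sample


# 100 observed points; predict indices 90..109 (end=110 is excluded).
ins, outs = split_prediction_range(start=90, end=110, num_samples=100)
print(len(ins), len(outs), len(ins) + len(outs))  # 10 10 20
```

The total number of returned rows is end - start, matching the documented output shape, regardless of how the range divides between observed and future steps.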

search(self, s=None, d=range(3), D=range(2), p=range(1, 4), q=range(1, 4), P=range(3), Q=range(3), fit_intercept='auto', ic='aicc', test='kpss', seasonal_test='seas', h=1e-8, maxiter=1000, method='auto', truncate=0)

Searches through the specified model space and associates each series with the most appropriate model.

Parameters:
s: int

Seasonal period. None or 0 for non-seasonal time series.

d: int, sequence or generator

Possible values for d (simple difference).

D: int, sequence or generator

Possible values for D (seasonal difference).

p: int, sequence or generator

Possible values for p (AR order).

q: int, sequence or generator

Possible values for q (MA order).

P: int, sequence or generator

Possible values for P (seasonal AR order).

Q: int, sequence or generator

Possible values for Q (seasonal MA order).

fit_intercept: int, sequence, generator or “auto”

Whether to fit an intercept. “auto” chooses based on the model parameters: it uses an intercept if and only if d + D <= 1.

ic: str

Which information criterion to use for the model selection. Currently supported: AIC, AICc, BIC.

test: str

Which stationarity test to use to choose d. Currently supported: KPSS.

seasonal_test: str

Which seasonality test to use to choose D. Currently supported: seas.

h: float

Finite-differencing step size used to compute gradients in ARIMA.

maxiter: int

Maximum number of iterations of L-BFGS-B.

method: str

Estimation method - “auto”, “css”, “css-ml” or “ml”. CSS uses a fast sum-of-squares approximation. ML estimates the log-likelihood with statespace methods. CSS-ML starts with CSS and refines with ML. “auto” will use CSS for long seasonal time series, ML otherwise.

truncate: int

When using CSS, start the sum of squares after a given number of observations for better performance. Recommended for long time series when truncating doesn’t lose too much information.
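
The information criteria accepted by ic follow standard textbook definitions, sketched below for reference. The function names, the candidate models, and their log-likelihood values are made up for illustration, and how cuML counts the number of parameters k is an assumption not confirmed by this page.

```python
import math


def aic(loglike, k):
    """Akaike information criterion for k estimated parameters."""
    return 2 * k - 2 * loglike


def aicc(loglike, k, n):
    """AIC with small-sample correction; requires n > k + 1."""
    return aic(loglike, k) + (2 * k * (k + 1)) / (n - k - 1)


def bic(loglike, k, n):
    """Bayesian information criterion for n observations."""
    return k * math.log(n) - 2 * loglike


# Hypothetical candidates: name -> (log-likelihood, number of parameters).
candidates = {"ARIMA(1,1,1)": (-140.0, 3), "ARIMA(3,1,3)": (-138.5, 7)}
n = 100

scores = {name: aicc(ll, k, n) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)  # lower criterion value is better
print(best, scores[best])  # ARIMA(1,1,1) 286.25
```

In this made-up case the larger model has a slightly higher log-likelihood, but the penalty on its extra parameters outweighs the gain, so the smaller model is selected.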

summary(self)

Display a quick summary of the models selected by search.