AutoARIMA
- class cuml.tsa.auto_arima.AutoARIMA(endog, *, simple_differencing=True, verbose=False, output_type=None, convert_dtype=True)
Implements a batched auto-ARIMA model for in- and out-of-sample time-series prediction.
This interface offers a highly customizable search, with functionality similar to the
forecast and fable packages in R. It provides an abstraction around the underlying ARIMA models to predict and forecast as if using a single model.
- Parameters:
- endog: dataframe or array-like (device or host)
The time series data, assumed to have each time series in columns. Acceptable formats: cuDF DataFrame, cuDF Series, NumPy ndarray, Numba device ndarray, cuda array interface compliant array like CuPy.
- simple_differencing: bool or int, default=True
If True, the data is differenced before being passed to the Kalman filter. If False, differencing is part of the state-space model. See additional notes in the ARIMA docs.
- verbose: int or boolean, default=False
Sets logging level. It must be one of cuml.common.logger.level_*. See Verbosity Levels for more info.
- output_type: {‘input’, ‘array’, ‘dataframe’, ‘series’, ‘df_obj’, ‘numba’, ‘cupy’, ‘numpy’, ‘cudf’, ‘pandas’}, default=None
Return results and set estimator attributes to the indicated output type. If None, the output type set at the module level (cuml.global_settings.output_type) will be used. See Output Data Type Configuration for more info.
- convert_dtype: boolean, default=True
When set to True, the model will automatically convert the inputs to np.float64.
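As a rough illustration of what differencing the data means for simple_differencing, the sketch below applies d simple differences and D seasonal differences of period s to a single series in plain Python. The helper names (difference, apply_differencing) and the sample data are hypothetical and are not part of cuML, which handles this step internally for the whole batch.

```python
def difference(y, lag=1):
    """Return the lagged difference y[t] - y[t - lag] for t >= lag."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]


def apply_differencing(y, d=0, D=0, s=0):
    """Apply d simple differences, then D seasonal differences of period s."""
    if D > 0 and s < 1:
        raise ValueError("seasonal differencing needs a period s >= 1")
    for _ in range(d):
        y = difference(y, lag=1)
    for _ in range(D):
        y = difference(y, lag=s)
    return y


# Hypothetical data: a quadratic trend becomes constant after two differences.
quadratic = [1, 4, 9, 16, 25]
once = apply_differencing(quadratic, d=1)
twice = apply_differencing(quadratic, d=2)

# Hypothetical data: a period-4 pattern that rises by 1 every cycle.
seasonal = [1, 2, 3, 4, 2, 3, 4, 5]
deseasonalized = apply_differencing(seasonal, D=1, s=4)
```

Each simple difference shortens the series by one observation and each seasonal difference by s observations, so the differenced series has n - d - s * D values.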
- Attributes:
- d_y
Methods
- fit(self, double h, int maxiter[, method]): Fits the selected models for their respective series
- forecast(self, int nsteps[, level]): Forecast nsteps into the future.
- predict(self[, start, end, level]): Compute in-sample and/or out-of-sample prediction for each series
- search(self[, s, d, D, p, q, P, Q, ...]): Searches through the specified model space and associates each series to the most appropriate model.
- summary(self): Display a quick summary of the models selected by search
Notes
The interface was influenced by the R fable package. See https://fable.tidyverts.org/reference/ARIMA.html
References
A useful (though outdated) reference is the paper:
[1] Rob J. Hyndman, Yeasmin Khandakar, 2008. “Automatic Time Series Forecasting: The ‘forecast’ Package for R”, Journal of Statistical Software 27
Examples
from cuml.tsa.auto_arima import AutoARIMA

model = AutoARIMA(y)
model.search(s=12, d=(0, 1), D=(0, 1), p=(0, 2, 4), q=(0, 2, 4),
             P=range(2), Q=range(2), method="css", truncate=100)
model.fit(method="css-ml")
fc = model.forecast(20)
- fit(self, double h: float = 1e-8, int maxiter: int = 1000, method='ml', int truncate: int = 0)
Fits the selected models for their respective series
- Parameters:
- h: float
Finite-differencing step size used to compute gradients in ARIMA
- maxiter: int
Maximum number of iterations of L-BFGS-B
- method: str
Estimation method - “css”, “css-ml” or “ml”. CSS uses a fast sum-of-squares approximation. ML estimates the log-likelihood with state-space methods. CSS-ML starts with CSS and refines with ML.
- truncate: int
When using CSS, start the sum of squares after a given number of observations for better performance (but often a worse fit)
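To make the h and truncate parameters more concrete, here is a minimal sketch of a conditional sum of squares for a zero-mean AR(1) model, together with a finite-difference estimate of its gradient. It is an illustration only: css_ar1 and forward_difference are hypothetical helpers, the AR(1) restriction is a simplification, and whether cuML uses forward or central differences is an assumption that this page does not confirm.

```python
def css_ar1(y, phi, truncate=0):
    """Conditional sum of squares of a zero-mean AR(1) model.

    Residuals e_t = y[t] - phi * y[t - 1] exist for t >= 1. The sum skips
    the first `truncate` observations (and always the very first one).
    """
    first = max(1, truncate)
    return sum((y[t] - phi * y[t - 1]) ** 2 for t in range(first, len(y)))


def forward_difference(f, x, h=1e-8):
    """Approximate df/dx at x with a forward finite difference of step h."""
    return (f(x + h) - f(x)) / h


# Hypothetical series that halves at every step.
y = [1.0, 0.5, 0.25, 0.125, 0.0625]

full = css_ar1(y, phi=0.4)                   # uses residuals t = 1..4
truncated = css_ar1(y, phi=0.4, truncate=3)  # uses residuals t = 3..4

# Slope of the objective with respect to phi, as an optimizer would see it.
slope = forward_difference(lambda phi: css_ar1(y, phi), 0.4)
```

Truncating drops the earliest residuals from the objective, which makes it cheaper to evaluate but ignores part of the data.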
- forecast(self, int nsteps: int, level=None) -> Union[CumlArray, Tuple[CumlArray, CumlArray, CumlArray]]
Forecast nsteps into the future.
- Parameters:
- nsteps: int
The number of steps to forecast beyond the end of the given series
- level: float or None (default = None)
Confidence level for prediction intervals, or None to return only the point forecasts. 0 < level < 1
- Returns:
- y_fc: array-like
Forecasts. Shape = (nsteps, batch_size)
- lower: array-like (device) (optional)
Lower limit of the prediction interval if level != None. Shape = (nsteps, batch_size)
- upper: array-like (device) (optional)
Upper limit of the prediction interval if level != None. Shape = (nsteps, batch_size)
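The level argument can be read as the coverage of a two-sided interval around each point forecast. The sketch below shows that relationship for a single forecast, assuming Gaussian forecast errors. The helper gaussian_interval and the standard error value are hypothetical, and this page does not spell out how cuML computes its bounds.

```python
from statistics import NormalDist


def gaussian_interval(point, std_err, level):
    """Two-sided interval of the given coverage around a point forecast."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must satisfy 0 < level < 1")
    # Quantile that leaves (1 - level) / 2 probability in each tail.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return point - z * std_err, point + z * std_err


# Hypothetical point forecast and standard error for one series and one step.
lower, upper = gaussian_interval(point=100.0, std_err=5.0, level=0.95)
wider_lower, wider_upper = gaussian_interval(point=100.0, std_err=5.0, level=0.99)
```

A higher level widens the interval. With level=None only the point forecasts are returned, while a numeric level yields the three arrays listed above.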
- predict(self, start=0, end=None, level=None) -> Union[CumlArray, Tuple[CumlArray, CumlArray, CumlArray]]
Compute in-sample and/or out-of-sample prediction for each series
- Parameters:
- start: int
Index where to start the predictions (0 <= start <= num_samples)
- end: int or None (default = None)
Index where to end the predictions, excluded (end > start)
- level: float or None (default = None)
Confidence level for prediction intervals, or None to return only the point forecasts. 0 < level < 1
- Returns:
- y_p: array-like (device)
Predictions. Shape = (end - start, batch_size)
- lower: array-like (device) (optional)
Lower limit of the prediction interval if level != None. Shape = (end - start, batch_size)
- upper: array-like (device) (optional)
Upper limit of the prediction interval if level != None. Shape = (end - start, batch_size)
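The index convention (start included, end excluded) determines how many of the returned rows are in-sample and how many are out-of-sample. The following sketch counts both for a series of num_samples observations. The helper prediction_split is hypothetical, and treating end=None as "up to the last observation" is an assumption that this page does not state.

```python
def prediction_split(start, end, num_samples):
    """Count in-sample and out-of-sample rows for predict(start, end)."""
    if end is None:
        # Assumption: a missing end means "up to the last observation".
        end = num_samples
    if not 0 <= start <= num_samples:
        raise ValueError("start must satisfy 0 <= start <= num_samples")
    if end <= start:
        raise ValueError("end must be greater than start")
    in_sample = max(0, min(end, num_samples) - start)
    out_of_sample = max(0, end - num_samples)
    return in_sample, out_of_sample


# With 100 observations, predicting indices 90..109 covers the last
# 10 observed points and 10 future points.
n_in, n_out = prediction_split(start=90, end=110, num_samples=100)
```

The two counts always add up to end - start, which is the number of rows in each returned array.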
- search(self, s=None, d=range(3), D=range(2), p=range(1, 4), q=range(1, 4), P=range(3), Q=range(3), fit_intercept='auto', ic='aicc', test='kpss', seasonal_test='seas', double h: float = 1e-8, int maxiter: int = 1000, method='auto', int truncate: int = 0)
Searches through the specified model space and associates each series to the most appropriate model.
- Parameters:
- s: int
Seasonal period. None or 0 for non-seasonal time series
- d: int, sequence or generator
Possible values for d (simple difference)
- D: int, sequence or generator
Possible values for D (seasonal difference)
- p: int, sequence or generator
Possible values for p (AR order)
- q: int, sequence or generator
Possible values for q (MA order)
- P: int, sequence or generator
Possible values for P (seasonal AR order)
- Q: int, sequence or generator
Possible values for Q (seasonal MA order)
- fit_intercept: int, sequence, generator or “auto”
Whether to fit an intercept. “auto” chooses based on the model parameters: it uses an intercept if and only if d + D <= 1
- ic: str
Which information criterion to use for the model selection. Currently supported: AIC, AICc, BIC
- test: str
Which stationarity test to use to choose d. Currently supported: KPSS
- seasonal_test: str
Which seasonality test to use to choose D. Currently supported: seas
- h: float
Finite-differencing step size used to compute gradients in ARIMA
- maxiter: int
Maximum number of iterations of L-BFGS-B
- method: str
Estimation method - “auto”, “css”, “css-ml” or “ml”. CSS uses a fast sum-of-squares approximation. ML estimates the log-likelihood with state-space methods. CSS-ML starts with CSS and refines with ML. “auto” will use CSS for long seasonal time series, ML otherwise.
- truncate: int
When using CSS, start the sum of squares after a given number of observations for better performance. Recommended for long time series when truncating doesn’t lose too much information.
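The parameters above describe a grid of candidate orders that is ranked with an information criterion. The sketch below enumerates the default (p, q, P, Q) grid, encodes the "auto" intercept rule, and ranks a few candidates with the textbook AIC, AICc and BIC formulas. All helper names, log-likelihoods and parameter counts are hypothetical, and how cuML counts parameters or prunes the grid is not described on this page.

```python
import math
from itertools import product


def as_options(value):
    """Normalize an int or an iterable of ints to a list of candidates."""
    return [value] if isinstance(value, int) else list(value)


def auto_intercept(d, D):
    """The "auto" rule: fit an intercept if and only if d + D <= 1."""
    return d + D <= 1


def information_criterion(loglike, n_params, n_obs, ic="aicc"):
    """Textbook AIC, AICc and BIC formulas (lower is better)."""
    aic = 2 * n_params - 2 * loglike
    if ic == "aic":
        return aic
    if ic == "aicc":
        return aic + 2 * n_params * (n_params + 1) / (n_obs - n_params - 1)
    if ic == "bic":
        return n_params * math.log(n_obs) - 2 * loglike
    raise ValueError(f"unsupported criterion: {ic}")


# Candidate (p, q, P, Q) orders for the default arguments of search.
candidates = list(product(as_options(range(1, 4)), as_options(range(1, 4)),
                          as_options(range(3)), as_options(range(3))))

# Hypothetical fit results: order -> (log-likelihood, number of parameters).
fits = {
    (1, 1, 0, 0): (-120.0, 3),
    (2, 1, 0, 0): (-119.5, 4),
    (2, 2, 0, 0): (-119.4, 5),
}
n_obs = 100
scores = {order: information_criterion(loglike, k, n_obs)
          for order, (loglike, k) in fits.items()}
best = min(scores, key=scores.get)
```

In this made-up comparison the larger models gain too little likelihood to offset their extra parameters, so the smallest order is selected.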