A comprehensive implementation of the Theta Model for time series forecasting. Includes model estimation, sensitivity analysis, diagnostics, and extensions. Uses statsmodels, pandas, and matplotlib.

esosetrov/the_theta_model

the_theta_model

Overview

This notebook implements and analyzes the Theta forecasting model of Assimakopoulos & Nikolopoulos (2000). It walks through a complete time series forecasting workflow with Python's statsmodels library, including detailed explanations, parameter analysis, and diagnostic evaluation.

Key Features

1. Complete Theta Model Implementation

  • Default Estimation (SES + OLS): alpha from a simple exponential smoothing fit, b0 from an OLS regression on a constant and a time trend
  • Maximum Likelihood Estimation (MLE): Both parameters estimated by fitting an IMA(1,1) model with drift
  • Multi-theta Forecasting: Sensitivity analysis across different theta values
  • Seasonal Decomposition: Automatic detection and handling of seasonality

2. Comprehensive Case Studies

  • Housing Starts Analysis: US housing market data (FRED: HOUST)
  • Personal Consumption Expenditure: Macroeconomic forecasting (FRED: PCEC)
  • Comparative Analysis: Different estimation methods and their implications

3. Advanced Diagnostic Tools

  • Residual Analysis: Distribution, autocorrelation, and normality tests
  • Forecast Evaluation: MAE, RMSE, MAPE metrics and interpretation
  • Model Diagnostics: Jarque-Bera, Ljung-Box tests for model validation
  • Component Analysis: Trend, SES, and seasonal component decomposition

4. Visualization Suite

  • Hedgehog Plots: Rolling origin forecasts for model stability assessment
  • Parameter Sensitivity: Theta value impact on forecast trajectories
  • Prediction Intervals: Uncertainty quantification in forecasts
  • Residual Diagnostics: Comprehensive error analysis visualizations

Installation & Requirements

pip install statsmodels pandas matplotlib seaborn numpy pandas-datareader scipy

Core Dependencies

  • statsmodels (≥0.13.0): Theta model implementation
  • pandas (≥1.3.0): Data manipulation and time series handling
  • numpy (≥1.21.0): Numerical computations
  • matplotlib (≥3.5.0): Data visualization
  • pandas-datareader (≥0.10.0): FRED data access
  • scipy (≥1.7.0): Statistical functions

Quick Start

# Basic Theta Model Implementation
from statsmodels.tsa.forecasting.theta import ThetaModel
import pandas_datareader as pdr

# Load data
reader = pdr.fred.FredReader(['HOUST'], start="1980-01-01", end="2020-04-01")
data = reader.read()
series = data['HOUST']

# Fit Theta Model
tm = ThetaModel(series)
results = tm.fit()

# Generate forecasts
forecast = results.forecast(12)
print(forecast)

Model Parameters & Interpretation

Core Parameters

  • b0 (Drift/Trend): Linear trend component

    • Positive: Upward trend
    • Negative: Downward trend
    • Near-zero: No systematic trend (forecasts stay close to the SES level)
  • alpha (Smoothing): Exponential smoothing parameter

    • Range: (0, 1)
    • High values: More weight to recent observations
    • Low values: More weight to historical data
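  • theta: Curvature adjustment parameter; in the forecast it sets the weight (θ - 1)/θ given to the trend relative to the SES forecast

    • θ = 0: Straight line (linear trend) in the original theta-line decomposition; the statsmodels forecast methods appear to expect θ ≥ 1
    • θ = 1: Trend weight is zero, leaving the SES forecast only
    • θ = 2: Standard Theta model
    • θ → ∞: IMA(1,1) with drift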
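To make the role of theta concrete, here is a minimal sketch of the drift term in the h-step forecast, following the forecast equation given in the statsmodels ThetaModel notes as best understood here. The function name `theta_trend_term` and its arguments are illustrative, not part of any library; the full forecast would add the SES forecast (and any seasonal adjustment) to this term.

```python
import numpy as np


def theta_trend_term(b0, alpha, theta, h, n_obs):
    """Drift contribution to the h-step-ahead forecast (illustrative helper).

    Assumed formula: (theta - 1) / theta * b0 * [h - 1 + 1/alpha - (1 - alpha)**n_obs / alpha]
    """
    # Written as 1 - 1/theta so that theta = np.inf gives a weight of exactly 1
    weight = 1.0 - 1.0 / theta
    # Start-up correction; it vanishes as the sample size grows
    startup = (1.0 - alpha) ** n_obs / alpha
    return weight * b0 * (h - 1 + 1.0 / alpha - startup)


# Parameter values reported for the housing starts case study in this README
for theta in (1.0, 2.0, 3.0, np.inf):
    term = theta_trend_term(b0=-0.9186, alpha=0.6165, theta=theta, h=12, n_obs=484)
    print(f"theta={theta}: drift term at h=12 is {term:.3f}")
```

With theta = 1 the drift term disappears, and as theta grows the term approaches the undamped drift, which is why large theta values behave like an IMA(1,1) with drift.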

Detailed Workflow

1. Data Preparation

# Automatic frequency detection and seasonal adjustment
series.index.freq = series.index.inferred_freq
tm = ThetaModel(series, method='additive')

2. Model Estimation Methods

  • Default: fit() estimates alpha by SES and b0 by OLS on a time trend
  • MLE: fit(use_mle=True) estimates both parameters by maximum likelihood on an IMA(1,1) model with drift
  • Comparative analysis of estimation approaches

3. Forecasting Components

# Access individual forecast components
components = results.forecast_components(12)
# Returns: trend, SES, and seasonal components

4. Sensitivity Analysis

import numpy as np

# Compare different theta values
forecasts = {
    'theta=1.2': results.forecast(12, theta=1.2),
    'theta=2': results.forecast(12),
    'theta=3': results.forecast(12, theta=3),
    'theta=inf': results.forecast(12, theta=np.inf)  # no damping of the drift
}

Case Study Results

Housing Starts (HOUST)

  • Data: 484 monthly observations (1980-2020)
  • Trend: Negative drift (b0 = -0.9186)
  • Smoothing: Moderate persistence (alpha = 0.6165)
  • Accuracy: MAPE = 8.28% (Good performance)

Personal Consumption Expenditure (PCEC)

  • Data: 162 quarterly observations (1980-2020)
  • Trend: Positive growth (b0 = 0.0130)
  • Smoothing: High persistence (alpha = 0.9999)
  • Pattern: Strong upward trajectory with COVID-19 impact visible

Model Diagnostics

Residual Analysis

  • Normality: Jarque-Bera test for distribution assessment
  • Autocorrelation: Ljung-Box test for residual independence
  • Heteroscedasticity: Variance stability checks
  • Error Metrics: MAE, RMSE, MAPE calculation
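The error metrics listed above can be computed with a few lines of NumPy. This is a minimal sketch; the helper name `forecast_errors` is hypothetical, and the example numbers are made up purely to show the call.

```python
import numpy as np


def forecast_errors(actual, predicted):
    """Return MAE, RMSE, and MAPE (in percent) for two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = actual - predicted
    return {
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        # MAPE is undefined when any actual value is zero
        "MAPE": float(100.0 * np.mean(np.abs(err / actual))),
    }


# Toy values for illustration only
metrics = forecast_errors([100.0, 200.0], [110.0, 180.0])
print(metrics)
```

MAE and RMSE are in the units of the series, so they are only comparable across models fitted to the same data; MAPE is scale-free but unstable for values near zero.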

Forecast Evaluation

  • Hedgehog plots: Model stability over time
  • Prediction intervals: Uncertainty quantification
  • Out-of-sample testing: Model generalization assessment
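A hedgehog plot is built from rolling-origin forecasts: the model is refit on a growing sample and a short forecast path is produced from each origin. The sketch below is one possible way to collect those paths. Both helper names are hypothetical, and `theta_fit_forecast` assumes monthly data with seasonal adjustment switched off.

```python
import numpy as np
import pandas as pd


def rolling_origin_forecasts(series, fit_forecast, first_origin, steps, stride):
    """Collect one forecast path per origin (the spines of a hedgehog plot)."""
    paths = {}
    # Stop early enough that every path has `steps` actual values to compare against
    for origin in range(first_origin, len(series) - steps + 1, stride):
        train = series.iloc[:origin]
        # Key each path by the label of the last observation used for fitting
        paths[series.index[origin - 1]] = np.asarray(fit_forecast(train, steps))
    return paths


def theta_fit_forecast(train, steps):
    """Fit a Theta model on `train` and forecast `steps` ahead (assumes monthly data)."""
    # Imported here so the generic helper above does not require statsmodels
    from statsmodels.tsa.forecasting.theta import ThetaModel

    return ThetaModel(train, period=12, deseasonalize=False).fit().forecast(steps)


# Demonstration with a naive "repeat the last value" forecaster on toy data
toy = pd.Series(np.arange(30.0))
naive_paths = rolling_origin_forecasts(
    toy, lambda train, h: np.repeat(train.iloc[-1], h), first_origin=20, steps=5, stride=5
)
print(len(naive_paths))
```

Passing `theta_fit_forecast` in place of the lambda yields Theta forecasts from each origin; plotting every path over the actual series gives the hedgehog view.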

Advanced Topics

1. Multiple Theta-Lines

Extension to support multiple theta values with weighted combinations:

class ExtendedThetaModel:
    """Support for multiple theta values with custom weighting"""
    def __init__(self, series, thetas=(0, 2, 3)):
        self.series = series
        # Copy into a list so instances never share one mutable default
        self.thetas = list(thetas)

2. State Space Representation

The Theta model can be reformulated as a state space system, which makes it possible to estimate it with the Kalman filter. The state space representation decomposes the series into unobserved components (level, trend, seasonal) with explicit transition equations, and the Kalman filter estimates these hidden states recursively. Under the usual linear Gaussian assumptions, this yields minimum mean-squared-error state estimates, handles missing observations naturally, and gives prediction intervals that follow directly from the model. The components are thus estimated within a statistical model rather than heuristically, while the interpretable Theta decomposition is preserved.
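As a small illustration of the idea, the sketch below runs the SES-with-drift recursion that Hyndman & Billah (2003) show to be equivalent to the Theta method. It uses the simpler innovations form with fixed parameters rather than a full Kalman filter, and it omits the seasonal state; the function names and the toy inputs are assumptions for illustration.

```python
import numpy as np


def ses_with_drift_filter(y, alpha, b, level0):
    """Filter y with the recursion l_t = l_{t-1} + b + alpha * e_t.

    e_t = y_t - (l_{t-1} + b) is the one-step-ahead forecast error.
    Returns the one-step predictions and the final level.
    """
    level = float(level0)
    one_step = np.empty(len(y))
    for t, obs in enumerate(y):
        one_step[t] = level + b                       # prediction of y_t from time t-1
        level = one_step[t] + alpha * (obs - one_step[t])  # update the level state
    return one_step, level


def drift_forecast(level, b, steps):
    """h-step forecasts: last level plus h times the drift."""
    return level + b * np.arange(1, steps + 1)


# Toy check on a perfectly linear series: every prediction error is zero
preds, last_level = ses_with_drift_filter([1.0, 2.0, 3.0], alpha=0.5, b=1.0, level0=0.0)
print(preds, drift_forecast(last_level, b=1.0, steps=2))
```

A full state space treatment would also estimate alpha, b, and the initial level by maximizing the likelihood built from the one-step errors.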

3. Parameter Optimization

Grid search for optimal theta values based on validation performance.
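A grid search of this kind can be sketched as follows: fit once on the training window, forecast over the validation window for each candidate theta, and keep the value with the lowest error. The helper `grid_search_theta` is hypothetical, MAE is an arbitrary choice of score, and `results` is assumed to be any fitted object exposing `forecast(steps, theta=...)`.

```python
import numpy as np


def grid_search_theta(results, validation, thetas):
    """Return the theta with the lowest validation MAE, plus all scores."""
    validation = np.asarray(validation, dtype=float)
    scores = {}
    for theta in thetas:
        path = np.asarray(results.forecast(len(validation), theta=theta), dtype=float)
        scores[theta] = float(np.mean(np.abs(validation - path)))
    best = min(scores, key=scores.get)
    return best, scores


class _ConstantResults:
    """Stand-in for a fitted model so the sketch runs without data."""

    def forecast(self, steps, theta=2):
        return np.full(steps, float(theta))


best_theta, all_scores = grid_search_theta(
    _ConstantResults(), validation=[2.0, 2.0, 2.0], thetas=[1.0, 2.0, 3.0]
)
print(best_theta, all_scores)
```

Because the selected theta is tuned to one validation window, its accuracy should be confirmed on a separate hold-out period or with rolling origins.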

Performance Benchmarks

| Metric | Housing Starts | PCE |
|---|---|---|
| MAPE | 8.28% | 5-7% (estimated) |
| MAE | 103.90 units | Scale-dependent |
| RMSE | 141.03 units | Scale-dependent |
| Autocorrelation | None detected | None detected |

Key Insights

Model Strengths

  1. Simplicity: Easy to implement and interpret
  2. Flexibility: Theta parameter allows trend/smoothness tuning
  3. Performance: Strong results in the M3 competition, particularly on monthly series and microeconomic data
  4. Decomposition: Clear separation of trend, seasonal, and irregular components

Limitations & Considerations

  1. Linear Trend Assumption: Long-term forecasts assume linearity
  2. Parameter Sensitivity: Theta choice significantly impacts results
  3. Error Distribution: Residuals often non-normal in practice
  4. Seasonal Adjustment: Method choice (additive/multiplicative) affects results

Best Practices

  1. Always validate with out-of-sample testing
  2. Compare multiple theta values for optimal performance
  3. Check residuals for model adequacy
  4. Consider transformations (log, etc.) for series with multiplicative growth or non-constant variance
  5. Use hedgehog plots to assess forecast stability

References

  1. Assimakopoulos, V., & Nikolopoulos, K. (2000). The theta model: a decomposition approach to forecasting. International Journal of Forecasting, 16(4), 521-530.

  2. Hyndman, R. J., & Billah, B. (2003). Unmasking the Theta method. International Journal of Forecasting, 19(2), 287-290.

  3. Fioruci, J. A., Pellegrini, T. R., Louzada, F., & Petropoulos, F. (2015). The optimized theta method. arXiv preprint arXiv:1503.03529.

  4. Statsmodels Documentation: Theta Model Implementation

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

Contributions, issues, and feature requests are welcome. Check the issues page if you want to contribute.

Support

For questions and support:

  1. Check the Statsmodels documentation
  2. Review existing issues
  3. Submit detailed questions with reproducible examples

This implementation is based on statsmodels v0.13.0+. Some features may not be available in earlier versions. Always check compatibility with your environment.

Note

This notebook is designed for educational and research purposes. Real-world applications may require additional considerations and validation.
