the_theta_model

A comprehensive implementation of the Theta Model for time series forecasting. Includes model estimation, sensitivity analysis, diagnostics, and extensions. Uses statsmodels, pandas, and matplotlib.

Overview

This notebook implements and analyzes the Theta forecasting model of Assimakopoulos & Nikolopoulos (2000). It walks through a complete time series forecasting workflow with Python's statsmodels library, covering estimation, parameter sensitivity, and diagnostic evaluation.

Key Features

1. Complete Theta Model Implementation

Default Estimation (SES + OLS): Traditional parameter estimation approach
Maximum Likelihood Estimation (MLE): Statistical estimation using IMA(1,1) model
Multi-theta Forecasting: Sensitivity analysis across different theta values
Seasonal Decomposition: Automatic detection and handling of seasonality

2. Comprehensive Case Studies

Housing Starts Analysis: US housing market data (FRED: HOUST)
Personal Consumption Expenditure: Macroeconomic forecasting (FRED: PCEC)
Comparative Analysis: Different estimation methods and their implications

3. Advanced Diagnostic Tools

Residual Analysis: Distribution, autocorrelation, and normality tests
Forecast Evaluation: MAE, RMSE, MAPE metrics and interpretation
Model Diagnostics: Jarque-Bera, Ljung-Box tests for model validation
Component Analysis: Trend, SES, and seasonal component decomposition

4. Visualization Suite

Hedgehog Plots: Rolling origin forecasts for model stability assessment
Parameter Sensitivity: Theta value impact on forecast trajectories
Prediction Intervals: Uncertainty quantification in forecasts
Residual Diagnostics: Comprehensive error analysis visualizations

Installation & Requirements

pip install statsmodels pandas matplotlib seaborn numpy pandas-datareader scipy

Core Dependencies

statsmodels (≥0.13.0): Theta model implementation
pandas (≥1.3.0): Data manipulation and time series handling
numpy (≥1.21.0): Numerical computations
matplotlib (≥3.5.0): Data visualization
pandas-datareader (≥0.10.0): FRED data access
scipy (≥1.7.0): Statistical functions

Quick Start

# Basic Theta Model Implementation
from statsmodels.tsa.forecasting.theta import ThetaModel
import pandas_datareader as pdr

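# Load data (requires network access to FRED)
reader = pdr.fred.FredReader(['HOUST'], start="1980-01-01", end="2020-04-01")
data = reader.read()
series = data['HOUST']
# Attach the inferred (monthly) frequency so the seasonal period and forecast index are well defined
series.index.freq = series.index.inferred_freq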

# Fit Theta Model
tm = ThetaModel(series)
results = tm.fit()

# Generate forecasts
forecast = results.forecast(12)
print(forecast)

Model Parameters & Interpretation

Core Parameters

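b0 (Drift/Trend): Slope of the linear trend component
- Positive: Upward trend
- Negative: Downward trend
- Near-zero: No systematic trend
alpha (Smoothing): Exponential smoothing parameter
- Range: (0, 1)
- High values: More weight on recent observations
- Low values: More weight on older observations (smoother level)
theta: Controls how much of the trend enters the forecast (the curvature of the theta-line in the original decomposition)
- θ = 0: Theta-line reduces to the linear regression line (original decomposition only; statsmodels' forecast() requires θ ≥ 1)
- θ = 1: SES forecast only, no trend contribution
- θ = 2: Standard Theta model
- θ → ∞: IMA(1,1) with drift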
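
The effect of theta can be made concrete with a little arithmetic. The sketch below assumes the forecast equation given in the statsmodels documentation, in which the SES forecast is adjusted by a trend term b0 * [h - 1 + 1/alpha - (1 - alpha)^T / alpha] weighted by (θ - 1)/θ. The helper names `trend_weight` and `trend_term` are hypothetical and are not part of statsmodels; the b0, alpha, and T values are the Housing Starts figures reported later in this README.

```python
import math


def trend_weight(theta: float) -> float:
    """Weight (theta - 1) / theta on the trend term; approaches 1 as theta grows."""
    if theta < 1:
        raise ValueError("theta must be >= 1")
    # Written as 1 - 1/theta so that math.inf yields exactly 1.0 instead of nan.
    return 1.0 - 1.0 / theta


def trend_term(b0: float, alpha: float, nobs: int, h: int) -> float:
    """Unweighted trend contribution at horizon h."""
    return b0 * (h - 1 + 1.0 / alpha - (1.0 - alpha) ** nobs / alpha)


# Trend adjustment added to the SES forecast 12 steps ahead, for several thetas.
for theta in (1, 2, 3, math.inf):
    w = trend_weight(theta)
    adj = w * trend_term(b0=-0.9186, alpha=0.6165, nobs=484, h=12)
    print(f"theta={theta}: weight={w:.3f}, trend adjustment={adj:.3f}")
```

With θ = 1 the adjustment vanishes and the forecast is flat at the SES level; larger θ values let more of the estimated drift through.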

Detailed Workflow

1. Data Preparation

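# Set the index frequency so the seasonal period can be inferred
series.index.freq = series.index.inferred_freq
# Deseasonalize additively (the default method='auto' chooses additive or multiplicative from the data)
tm = ThetaModel(series, method='additive')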
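
Frequency inference only works on a regular index, so it can help to check the result before fitting. The following is a small sketch on a made-up monthly series (the values are placeholders, not FRED data) showing what `inferred_freq` returns and how to fail early when the dates are irregular.

```python
import pandas as pd

# Hypothetical stand-in for a downloaded series: monthly dates with no freq attached.
dates = pd.DatetimeIndex(list(pd.date_range("1980-01-01", periods=36, freq="MS")))
series = pd.Series(range(36), index=dates, name="HOUST", dtype=float)

print(series.index.freq)           # no frequency is set on an index built from a list
print(series.index.inferred_freq)  # frequency pandas infers from the spacing of the dates

if series.index.inferred_freq is None:
    # Gaps or duplicated dates prevent inference; fix the index before modelling.
    raise ValueError("Irregular index; resample or reindex before fitting")

series.index.freq = series.index.inferred_freq
print(series.index.freqstr)
```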

2. Model Estimation Methods

Default (fit()): SES for the smoothing parameter, OLS for the trend
MLE (fit(use_mle=True)): Maximum likelihood via an IMA(1,1) model with drift
Comparative analysis of the two estimation approaches

3. Forecasting Components

# Access individual forecast components
components = results.forecast_components(12)
# Returns: trend, SES, and seasonal components

4. Sensitivity Analysis

import numpy as np

# Compare different theta values
forecasts = {
    'theta=1.2': results.forecast(12, theta=1.2),
    'theta=2': results.forecast(12),
    'theta=3': results.forecast(12, theta=3),
    'theta=inf': results.forecast(12, theta=np.inf)
}

Case Study Results

Housing Starts (HOUST)

Data: 484 monthly observations (1980-2020)
Trend: Negative drift (b0 = -0.9186)
Smoothing: Moderate persistence (alpha = 0.6165)
Accuracy: MAPE = 8.28% (Good performance)

Personal Consumption Expenditure (PCEC)

Data: 162 quarterly observations (1980-2020)
Trend: Positive growth (b0 = 0.0130)
Smoothing: High persistence (alpha = 0.9999)
Pattern: Strong upward trajectory with COVID-19 impact visible

Model Diagnostics

Residual Analysis

Normality: Jarque-Bera test for distribution assessment
Autocorrelation: Ljung-Box test for residual independence
Heteroscedasticity: Variance stability checks
Error Metrics: MAE, RMSE, MAPE calculation

Forecast Evaluation

Hedgehog plots: Model stability over time
Prediction intervals: Uncertainty quantification
Out-of-sample testing: Model generalization assessment

Advanced Topics

1. Multiple Theta-Lines

Extension to support multiple theta values with weighted combinations:

class ExtendedThetaModel:
    """Support for multiple theta values with custom weighting"""
    def __init__(self, series, thetas=(0, 2, 3)):
        self.series = series
        self.thetas = list(thetas)  # copy so instances never share a mutable default
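
How theta-lines are built and combined can be sketched in a few lines of NumPy. This uses the common closed form in which a theta-line is θ times the series plus (1 - θ) times its fitted OLS linear trend, so that θ = 0 gives the regression line and θ = 1 returns the data. The functions `theta_line` and `combine` are hypothetical helpers, not part of the class above or of statsmodels.

```python
import numpy as np


def theta_line(x, theta):
    """Theta-line: theta * x + (1 - theta) * (OLS linear trend of x)."""
    x = np.asarray(x, dtype=float)
    t = np.arange(len(x))
    slope, intercept = np.polyfit(t, x, 1)
    trend = intercept + slope * t
    return theta * x + (1.0 - theta) * trend


def combine(lines, weights=None):
    """Weighted average of theta-lines; equal weights when none are given."""
    return np.average(np.asarray(lines, dtype=float), axis=0, weights=weights)


x = np.array([3.0, 5.0, 4.0, 8.0, 7.0, 11.0, 10.0, 14.0])
lines = [theta_line(x, theta) for theta in (0, 2)]

# Averaging the theta=0 and theta=2 lines with equal weights reproduces the data.
recombined = combine(lines)
print(np.allclose(recombined, x))
```

In a full extension each theta-line would be extrapolated separately (the θ = 0 line linearly, the others by exponential smoothing) before the weighted combination is taken.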

2. State Space Representation

Reformulating the Theta model as a state space system makes it possible to estimate it with the Kalman filter. The state space form decomposes the series into unobserved components (level, trend, seasonal), each with an explicit transition equation, and the Kalman filter estimates these hidden states recursively.

Under the model's assumptions, this approach gives minimum mean-squared-error estimates of the states, handles missing observations naturally, and yields prediction intervals derived from the model itself. The components are estimated by a statistical criterion rather than heuristically, while the model keeps its interpretable structure.
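
As a simplified illustration of the idea, Hyndman & Billah (2003) show that the standard Theta method is equivalent to simple exponential smoothing with drift, which can be written as a one-state system: x_t = l_{t-1} + b + e_t and l_t = l_{t-1} + b + alpha * e_t. The sketch below runs that recursion with fixed, user-supplied parameters. It is not the full Kalman filter setup described above and it does not estimate anything; the function names are hypothetical.

```python
import numpy as np


def ses_drift_filter(x, alpha, b, level0):
    """Run the recursion x_t = l_{t-1} + b + e_t, l_t = l_{t-1} + b + alpha * e_t."""
    level = float(level0)
    errors = []
    for obs in x:
        pred = level + b            # one-step-ahead prediction
        err = obs - pred            # innovation e_t
        level = pred + alpha * err  # state update
        errors.append(err)
    return level, np.array(errors)


def ses_drift_forecast(level, b, steps):
    """h-step forecasts: last level plus h times the drift."""
    return level + b * np.arange(1, steps + 1)


# A noiseless line with slope 2: the filter tracks it with zero innovations.
x = 10.0 + 2.0 * np.arange(1, 21)
level, errors = ses_drift_filter(x, alpha=0.5, b=2.0, level0=10.0)
forecast = ses_drift_forecast(level, b=2.0, steps=3)
print(forecast)
```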

3. Parameter Optimization

Grid search for optimal theta values based on validation performance.

Performance Benchmarks

| Metric | Housing Starts (HOUST) | PCE (PCEC) |
| --- | --- | --- |
| MAPE | 8.28% | 5-7% (estimated) |
| MAE | 103.90 units | Scale-dependent |
| RMSE | 141.03 units | Scale-dependent |
| Autocorrelation | None detected | None detected |
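
For reference, the three error metrics in the table follow the usual definitions, sketched below. The function name `forecast_metrics` and the sample numbers are hypothetical and are not taken from the notebook.

```python
import numpy as np


def forecast_metrics(actual, predicted):
    """Return MAE, RMSE, and MAPE (in percent) for two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    errors = actual - predicted
    return {
        "MAE": float(np.mean(np.abs(errors))),
        "RMSE": float(np.sqrt(np.mean(errors ** 2))),
        # MAPE is undefined when any actual value is zero.
        "MAPE": float(np.mean(np.abs(errors / actual)) * 100),
    }


metrics = forecast_metrics(actual=[100, 200, 400], predicted=[110, 190, 400])
print(metrics)
```

MAE and RMSE are expressed in the units of the series, which is why they are not comparable across the two case studies, whereas MAPE is scale-free.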

Key Insights

Model Strengths

Simplicity: Easy to implement and interpret
Flexibility: Theta parameter allows trend/smoothness tuning
Performance: Particularly effective for monthly economic data
Decomposition: Clear separation of trend, SES (level), and seasonal components

Limitations & Considerations

Linear Trend Assumption: Long-term forecasts assume linearity
Parameter Sensitivity: Theta choice significantly impacts results
Error Distribution: Residuals often non-normal in practice
Seasonal Adjustment: Method choice (additive/multiplicative) affects results

Best Practices

Always validate with out-of-sample testing
Compare multiple theta values for optimal performance
Check residuals for model adequacy
Consider transformations (e.g. log) for series with exponential growth or level-dependent variance
Use hedgehog plots to assess forecast stability

References

Assimakopoulos, V., & Nikolopoulos, K. (2000). The theta model: a decomposition approach to forecasting. International Journal of Forecasting, 16(4), 521-530.
Hyndman, R. J., & Billah, B. (2003). Unmasking the Theta method. International Journal of Forecasting, 19(2), 287-290.
Fioruci, J. A., Pellegrini, T. R., Louzada, F., & Petropoulos, F. (2015). The optimized theta method. arXiv preprint arXiv:1503.03529.
Statsmodels Documentation: Theta Model Implementation

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

Contributions, issues, and feature requests are welcome. Check the issues page if you want to contribute.

Support

For questions and support:

Check the Statsmodels documentation
Review existing issues
Submit detailed questions with reproducible examples

This implementation is based on statsmodels v0.13.0+. Some features may not be available in earlier versions. Always check compatibility with your environment.

Note

This notebook is designed for educational and research purposes. Real-world applications may require additional considerations and validation.
