(Scattered) mindsets for time series analysis

Eve Law
2 min read · Aug 25, 2021

Learned something from an experienced modeler today; let’s call him C. Gonna write down the lessons before I forget.

That fancy estimator may be overkill…for a boring problem

When he was much younger, he tried to solve a forecasting problem with a neural network. It worked quite well, until a colleague pointed out that the neural network captured seasonality and not much else. In other words, it was overkill for what it actually achieved.

Upon reflection, he said he should have removed seasonality from the data first, and then forecast the residuals with a non-linear model, e.g., a neural network. In his seasoned opinion, once trend and seasonality are accounted for, the residuals are the tough nut to crack, which is where non-linear models and judicious feature engineering come in.

It takes some work to separate out different components of the data and build separate models for them, sure. But it might be worthwhile after all.
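
To make the idea concrete, here is a minimal sketch of that separation on made-up data. It assumes a straight-line trend and a fixed 12-step seasonal cycle, both of which are my simplifications and not something C prescribed; in practice a library routine such as STL in statsmodels would likely be the better starting point. The function name and the toy series are hypothetical.

```python
import math
import random
from statistics import mean, pstdev

def decompose(series, period):
    """Split a series into a linear trend, a periodic seasonal part, and residuals."""
    n = len(series)
    t = list(range(n))
    # Trend: least-squares straight line against the time index (an assumption).
    t_bar, y_bar = mean(t), mean(series)
    slope = (sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, series))
             / sum((ti - t_bar) ** 2 for ti in t))
    intercept = y_bar - slope * t_bar
    trend = [intercept + slope * ti for ti in t]
    # Seasonality: average detrended value at each position within the cycle.
    detrended = [yi - tr for yi, tr in zip(series, trend)]
    seasonal_means = [mean(detrended[k::period]) for k in range(period)]
    seasonal = [seasonal_means[i % period] for i in range(n)]
    # Residuals: whatever trend and seasonality leave unexplained.
    residuals = [d - s for d, s in zip(detrended, seasonal)]
    return trend, seasonal, residuals

# Toy series: upward trend + yearly-style cycle + noise (all values made up).
rng = random.Random(0)
series = [0.5 * i + 10 * math.sin(2 * math.pi * i / 12) + rng.gauss(0, 1)
          for i in range(120)]

trend, seasonal, residuals = decompose(series, period=12)
print("std of raw series:", round(pstdev(series), 2))
print("std of residuals: ", round(pstdev(residuals), 2))
```

The residuals are what a non-linear model would then be trained on; the final forecast is the sum of the trend, seasonal, and residual forecasts.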

Handling skewness in the forecast target (or endogenous/response/dependent variable)

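In C’s sharing, he also mentioned Box-Cox transformations and how they help push the forecast target towards homoskedasticity and normality. (Homoskedasticity is just a fancy word for constant variance, across time in this case; heteroskedasticity is its opposite.) Certain models, such as ordinary least squares (OLS), are the best linear unbiased estimator, i.e., the lowest-variance one among linear unbiased estimators, when the errors have constant variance and the other Gauss-Markov conditions hold. Normality is not required for that result, but the usual t-tests and confidence intervals lean on it. (https://statisticsbyjim.com/regression/gauss-markov-theorem-ols-blue/)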
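
For reference, the transform itself is tiny. The sketch below applies it to a made-up, right-skewed target and compares skewness before and after. The helper names and the data are hypothetical, and the lambda is fixed by hand here; as far as I know, `scipy.stats.boxcox` can instead pick lambda by maximum likelihood.

```python
import math
from statistics import mean, pstdev

def box_cox(values, lam):
    """Box-Cox transform: (y**lam - 1) / lam, or log(y) when lam is 0."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]

def skewness(values):
    """Population skewness: mean of the cubed z-scores."""
    m, s = mean(values), pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])

# Toy right-skewed target: exponentials of evenly spaced numbers (made up).
target = [math.exp(0.1 * k) for k in range(-30, 31)]

raw_skew = skewness(target)
log_skew = skewness(box_cox(target, lam=0))
print("skewness before:", round(raw_skew, 3))
print("skewness after (lambda = 0):", round(log_skew, 3))
```

Note that forecasts made on the transformed scale have to be mapped back to the original scale, which is part of why interpretability suffers.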

Is Box-Cox a good idea? After doing some research myself, here’s what I think.

Box-Cox is OK, if you 1) only care about predictive performance and not interpretability; 2) are stuck with OLS.

Otherwise, there are so many ways to handle skewness! Generalized linear models (GLMs) account for some common parametric forms, non-linear models try to do away with the parametric stuff…and econometrics offers even more tools, e.g., generalized least squares and quantile regression.

Adding categorical features may also help, because the skewness may be the result of several densities mixing together. The categorical features may allow the model to distinguish between the mixing components.
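
A toy illustration of the mixing argument, with made-up numbers: two groups that are each perfectly symmetric look heavily skewed once pooled. The "weekday" and "holiday" labels are hypothetical stand-ins for whatever categorical feature separates the components in a real dataset.

```python
from statistics import mean, pstdev

def skewness(values):
    """Population skewness: mean of the cubed z-scores."""
    m, s = mean(values), pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])

offsets = [-2, -1, 0, 1, 2]
# Two symmetric components with different centers and sizes (made-up data).
weekday = [10 + d for d in offsets] * 16   # 80 observations around 10
holiday = [50 + d for d in offsets] * 4    # 20 observations around 50
pooled = weekday + holiday

pooled_skew = skewness(pooled)
weekday_skew = skewness(weekday)
holiday_skew = skewness(holiday)
print("pooled skewness: ", round(pooled_skew, 3))
print("weekday skewness:", round(weekday_skew, 3))
print("holiday skewness:", round(holiday_skew, 3))
```

Once the model knows which group an observation belongs to, there is no skewness left for a transformation to fix in this example.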

All in all, there are more reasons to avoid Box-Cox than to consider using it. Credits to the awesome Quora response below.



