Seemingly Virtuous Complexity in Return Prediction
Return prediction with Random Fourier Features (RFF), which are a very large number P of nonlinear transformations of a small number K of predictor variables, has become popular recently. Surprisingly, this approach appears to yield a successful out-of-sample stock market index timing strategy even when trained in rolling windows as small as T = 12 months with P in the thousands. However, when P ≫ T, the RFF-based forecast becomes a weighted average of the T training sample returns, with weights determined by the similarity between the predictor vectors in the training data and the current predictor vector. In short training windows, similarity primarily reflects temporal proximity, so the forecast reduces to a recency-weighted average of the T return observations in the training data, which is essentially a momentum strategy. Moreover, because similarity declines with predictor volatility, the result is a volatility-timed momentum strategy. The strong performance of the RFF-based strategy thus stems not from its ability to extract predictive signals from the training data, but from the fact that a volatility-timed momentum strategy happened to perform well in historical data. This point becomes clear when applying the same method to artificial data in which returns exhibit reversals rather than momentum: the RFF approach still constructs the same volatility-timed momentum strategy, which then performs poorly.
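The claim that the forecast is a weighted combination of training returns when P ≫ T can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes a sine/cosine RFF construction with Gaussian frequencies, a bandwidth `gamma = 2.0`, a minimum-norm ("ridgeless") least-squares fit, and simulated AR(1) predictors with pure-noise returns. The helper `rff_features` and all parameter values are hypothetical choices made for illustration. With training features S (T × P) and current features s, the minimum-norm coefficients are β = S'(SS')⁻¹R, so the forecast s'β equals w'R with w = (SS')⁻¹Ss, and the weights w depend only on the predictors, not on the returns.

```python
import numpy as np

def rff_features(X, omega, gamma=2.0):
    """Map predictors X (n x K) to sine/cosine random features (n x P).

    gamma is an assumed bandwidth; omega (K x P/2) holds the random frequencies.
    """
    proj = gamma * X @ omega
    return np.hstack([np.sin(proj), np.cos(proj)])

rng = np.random.default_rng(0)
T, K, P = 12, 5, 4000          # illustrative sizes with P >> T
rho = 0.9                      # assumed predictor persistence

# Simulated persistent predictors: T training rows plus one "current" row.
X = np.zeros((T + 1, K))
X[0] = rng.standard_normal(K)
for t in range(1, T + 1):
    X[t] = rho * X[t - 1] + np.sqrt(1 - rho**2) * rng.standard_normal(K)

# Simulated training returns (pure noise; only the algebra matters here).
R = rng.standard_normal(T)

omega = rng.standard_normal((K, P // 2))
S_train = rff_features(X[:T], omega)      # T x P training features
s_now = rff_features(X[T:], omega)[0]     # length-P current feature vector

# Primal view: minimum-norm least-squares coefficients, then forecast.
beta, *_ = np.linalg.lstsq(S_train, R, rcond=None)
forecast_primal = s_now @ beta

# Dual view: weights on the T training returns, built from feature
# similarity between each training row and the current row.
weights = np.linalg.solve(S_train @ S_train.T, S_train @ s_now)
forecast_dual = weights @ R

print("weights on training returns (oldest to newest):")
print(np.round(weights, 3))
print("primal forecast:", forecast_primal)
print("dual forecast:  ", forecast_dual)
```

Under these assumptions the two forecasts agree up to numerical error. Because the simulated predictors are persistent, the printed weights can be inspected to see how much they favor the most recent training observations; the exact pattern depends on the assumed `rho`, `gamma`, and random seed.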