T-fold sequential-validation technique for out-of-distribution generalization with financial time series data

Date

2021-06

Authors

Muñoz-Elguezábal, Juan F.
Sánchez-Torres, Juan D.

Publisher

International Conference on Econometrics and Statistics

Description

The temporal structure of financial time series (FTS) data demands non-trivial considerations when using cross-validation (CV). This frequently used technique is based on statistical learning theory, which rests on the assumption that training samples are independent and identically distributed (i.i.d.). Although there has been progress in studying fundamental phenomena in certain learning methods, such as feature selection imbalance during the learning stage, it is widely accepted that without this strong assumption there is no reason to expect good out-of-sample results from a learning process. In FTS, there are conditions under which sub-sampling the data overshadows the effect of non-deterministic relationships between the features and the target variable across different samples. This effect goes unnoticed because the objective function of the learning process is decomposed using the additivity property. Moreover, it reduces the relationship among samples to a particular operation, without information attribution. We present a technique that controls information leakage and decomposes the global probability distribution into local probability distributions. It identifies the contribution of each sample to the learning process and maintains information sparsity, thereby relaxing the effects of the i.i.d. assumption. As a result, parametric stability is presented for exchange rate prediction using different predictive models.
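
To make the idea of validating on time-ordered data while controlling information leakage more concrete, the following is a minimal sketch of a generic sequential (walk-forward) split. It is not the authors' T-fold procedure, whose details are not given in this record. The function name `sequential_folds` and the parameters `n_folds` and `embargo` are hypothetical, chosen only for illustration.

```python
from typing import Iterator, List, Tuple


def sequential_folds(
    n_samples: int, n_folds: int, embargo: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, val_idx) pairs over contiguous, time-ordered folds.

    Hypothetical helper for illustration only. Each fold after the first is
    used once for validation, and only samples that precede it in time are
    used for training. The last `embargo` samples before the validation fold
    are dropped from training to limit leakage across the boundary.
    """
    if n_folds < 2 or n_samples < n_folds:
        raise ValueError("need n_folds >= 2 and n_samples >= n_folds")
    if embargo < 0:
        raise ValueError("embargo must be non-negative")

    # Boundaries of near-equal contiguous blocks; no shuffling is performed.
    bounds = [k * n_samples // n_folds for k in range(n_folds + 1)]

    for k in range(1, n_folds):
        val_start, val_end = bounds[k], bounds[k + 1]
        train_end = max(0, val_start - embargo)
        train_idx = list(range(train_end))
        val_idx = list(range(val_start, val_end))
        if train_idx:  # skip a split whose training set was emptied by the embargo
            yield train_idx, val_idx


# Usage: 12 observations in time order, 4 folds, 1-sample embargo.
for train_idx, val_idx in sequential_folds(12, 4, embargo=1):
    print(f"train={train_idx} validate={val_idx}")
```

Because every validation fold lies strictly after its training samples, each fold can be scored separately, which is one simple way to see how much each period contributes to the overall result. Unlike a shuffled K-fold split, this sketch never trains on observations that come after the validation period.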

Keywords

Financial Machine Learning, Cross-Validation, Time Series Forecasting, Learning Theory

Citation

Muñoz-Elguezábal, J. F. & Sánchez-Torres, J. D. (2021). T-fold sequential-validation technique for out-of-distribution generalization with financial time series data. 4th International Conference on Econometrics and Statistics.