T-fold sequential-validation technique for out-of-distribution generalization with financial time series data
Date
2021-06
Authors
Muñoz-Elguezábal, Juan F.
Sánchez-Torres, Juan D.
Publisher
International Conference on Econometrics and Statistics
Abstract
The temporal structure of financial time series (FTS) data demands non-trivial considerations in the use of cross-validation (CV). This frequently used technique is based on statistical learning theory, which is founded on the assumption that training samples are i.i.d. Although there is progress in studying fundamental phenomena in certain learning methods, such as feature selection imbalance during the learning stage, it is currently widely accepted that there is no reason to expect good out-of-sample results from a learning process without such a strong assumption. In FTS, there are conditions under which sub-sampling data overshadows the effect of non-deterministic relationships between features and the target variable among different samples. This effect remains unnoticed because of the use of the additivity property in the decomposition of objective functions for the learning process. Moreover, it reduces the relationship among samples to a particular operation without information attribution. We present a technique that controls information leakage and decomposes the global probability distribution into local probability distributions, which identifies each sample's contribution to the learning process and maintains information sparsity, thereby relaxing the effects of the i.i.d. assumption. As a result, parametric stability is presented for exchange rate prediction using different predictive models.
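As a rough illustration of the general idea of validating on contiguous, time-ordered blocks instead of shuffled samples, the sketch below splits a series into T sequential folds and summarizes each fold on its own. It is only a minimal sketch under stated assumptions and not the authors' technique: the function names `sequential_folds` and `fold_means`, the near-equal fold sizes, and the per-fold mean used as a stand-in for fitting a local model are all hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple


def sequential_folds(n_samples: int, n_folds: int) -> List[Tuple[int, int]]:
    """Return (start, stop) index pairs for contiguous, non-overlapping folds.

    Hypothetical helper: samples keep their time order and are never shuffled,
    so no fold contains observations that are interleaved with another fold.
    """
    if not 1 <= n_folds <= n_samples:
        raise ValueError("n_folds must be between 1 and n_samples")
    base, extra = divmod(n_samples, n_folds)
    bounds = []
    start = 0
    for k in range(n_folds):
        # The first `extra` folds receive one additional sample each.
        stop = start + base + (1 if k < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def fold_means(series: Sequence[float], n_folds: int) -> List[float]:
    """Summarize each fold separately (a stand-in for a per-fold model fit)."""
    return [
        sum(series[a:b]) / (b - a)
        for a, b in sequential_folds(len(series), n_folds)
    ]
```

Comparing per-fold summaries (or per-fold fitted parameters, in a fuller version) is one simple way to see how much each time block differs from the others, which is the kind of local view a shuffled split would hide.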
Keywords
Financial Machine Learning, Cross-Validation, Time Series Forecasting, Learning Theory
Citation
Muñoz-Elguezábal, J. F. & Sánchez-Torres, J. D. (2021). T-fold sequential-validation technique for out-of-distribution generalization with financial time series data. 4th International Conference on Econometrics and Statistics.