Speakers
Alan De Genaro (FGV – EAESP)
ST2
Title: Product Complexity, Investor Experience, and Returns
Abstract
This paper examines how financial sophistication influences household investment outcomes in complex financial markets. Using regulatory microdata from Brazil’s structured products market, we analyze how returns vary with investor sophistication and product complexity. We introduce a novel measure of financial sophistication: prior trading experience in other securities markets. Investors with established trading histories systematically outperform inexperienced counterparts when trading complex products, exhibiting both persistent skill and greater learning capabilities. This experience-based measure is more predictive than conventional proxies such as wealth, age, and education. Our findings suggest that complexity can effectively obscure risk from unsophisticated investors, enabling strategic rent extraction through product design. These results question the efficacy of current “qualified investor” standards based on wealth thresholds, suggesting that direct measures of market participation provide more effective criteria for accessing complex financial instruments in increasingly sophisticated retail markets. (Co-authored with Jason Sturgess – Queen Mary University of London, Jose Liberti – Northwestern University, and Pedro Saffi – University of Cambridge)
Audrone Virbickaite (CUNEF Universidad)
CP2
Title: Intraday Crude Oil Volatility: Assessing the Impact of Economic Announcements and Mixed-Frequency Data
Abstract
This paper investigates the real-time impact of economic announcements and policy decisions on intraday crude oil volatility. Our model employs a high-frequency mixed data sampling (MIDAS) approach to capture the interaction between market volatility and economic uncertainty, paired with spike-and-slab priors on the announcement coefficients to perform efficient variable selection in a high-dimensional setting. Our findings demonstrate that these announcements significantly influence short-term oil price volatility, while the evolution of the volatility level depends on economic indicators sampled at lower frequencies. The proposed model improves the accuracy of short-term volatility forecasts, offering valuable insights for market participants and policymakers. The empirical results highlight the importance of timely economic information in forecasting oil market dynamics.
Boaz Nadler (Weizmann Institute of Science)
CP1
Title: Finding structure in high-dimensional data: Statistical and Computational Challenges
Abstract
A fundamental task in the statistical analysis of data is to detect and estimate interesting “structures” hidden in it. In this talk I’ll focus on aspects of this problem in a high-dimensional regime, where each observed sample has many coordinates and the number of samples is limited. We will show how in such cases: (i) standard methods to detect structure in high dimensions, such as principal component analysis, may not work well; (ii) sparsity can come to the rescue, although it brings with it significant statistical and computational challenges; and (iii) some interesting phenomena may occur in semi-supervised learning settings where, for a few of the samples, we are also given their underlying labels. Specifically, merging labeled and unlabeled data may have significant computational benefits in high dimensions.
Boaz Nadler (Weizmann Institute of Science)
ST1
Title: Completing large low-rank matrices from only a few observed entries: A one-line algorithm with provable guarantees
Abstract
Suppose you observe very few entries of a large matrix. Can we predict the missing entries, say, assuming the matrix is (approximately) low rank? We describe a very simple method to solve this matrix completion problem. We show that our method is able to recover matrices from very few entries and/or ill-conditioned matrices where many other popular methods fail. Furthermore, due to its simplicity, it is easy to extend our method to incorporate additional knowledge about the underlying matrix, for example to solve the inductive matrix completion problem. On the theoretical front, we prove that our method enjoys some of the strongest available theoretical recovery guarantees. Finally, for inductive matrix completion, we prove that under suitable conditions the problem has a benign optimization landscape with no bad local minima.
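The abstract does not spell out the one-line algorithm itself, but the matrix completion task it addresses can be illustrated with a generic alternating-projection ("hard impute") scheme; the function name and the rank-2 example below are illustrative assumptions, not the speaker's method.

```python
import numpy as np

def svd_impute(M, mask, rank, n_iter=500):
    """Fill missing entries by alternating between a rank-r SVD
    projection and re-imposing the observed entries ("hard impute")."""
    X = np.where(mask, M, 0.0)                       # start with zeros in the holes
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        X = np.where(mask, M, low_rank)              # keep observed entries fixed
    return low_rank

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 30))  # rank-2 ground truth
mask = rng.random(A.shape) < 0.5                                 # observe ~50% of entries
A_hat = svd_impute(A, mask, rank=2)
print(np.max(np.abs(A_hat - A)))  # reconstruction error on the full matrix
```

With this many observed entries the rank-2 matrix is recovered essentially exactly; the talk's point is that simpler and better-behaved schemes exist for much sparser and more ill-conditioned regimes.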
Chengchun Shi (London School of Economics)
CP7
Title: ARMA-Design: Optimal Treatment Allocation Strategies for A/B Testing in Partially Observable Time Series Experiments
Abstract
Chengchun Shi (London School of Economics)
ST1
Title: Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning
Abstract
Reinforcement learning from human feedback (RLHF) has emerged as a key technique for aligning the output of large language models (LLMs) with human preferences. To learn the reward function, most existing RLHF algorithms use the Bradley-Terry model, which relies on assumptions about human preferences that may not reflect the complexity and variability of real-world judgments. In this paper, we propose a robust algorithm to enhance the performance of existing approaches under such reward model misspecifications. Theoretically, our algorithm reduces the variance of reward and policy estimators, leading to improved regret bounds. Empirical evaluations on LLM benchmark datasets demonstrate that the proposed algorithm consistently outperforms existing methods, with 77-81% of responses being favored over baselines on the Anthropic Helpful and Harmless dataset.
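The Bradley-Terry model mentioned above maps a difference in scalar rewards to a preference probability through a logistic link; a minimal sketch with hypothetical rewards and comparison data (not from the paper):

```python
import numpy as np

def bt_prob(r_a, r_b):
    """Bradley-Terry: probability that response A is preferred over B,
    given scalar rewards r_a and r_b."""
    return 1.0 / (1.0 + np.exp(-(r_a - r_b)))

def bt_neg_log_likelihood(rewards, comparisons):
    """Negative log-likelihood of observed preferences;
    comparisons is a list of (winner_idx, loser_idx) pairs."""
    return -sum(np.log(bt_prob(rewards[w], rewards[l])) for w, l in comparisons)

rewards = np.array([1.2, 0.3, -0.5])    # hypothetical rewards for three responses
comparisons = [(0, 1), (0, 2), (1, 2)]  # 0 beat 1, 0 beat 2, 1 beat 2
print(bt_prob(rewards[0], rewards[1]))  # ≈ 0.71
print(bt_neg_log_likelihood(rewards, comparisons))
```

Reward learning fits `rewards` (in practice, a reward network) by minimizing this likelihood; the paper's robustness concern is precisely that this logistic form may be misspecified for real human judgments.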
Eduardo Fonseca Mendes (Fundação Getúlio Vargas)
CP3
Title: Estimation risk in conditional expectiles
Abstract
We establish the consistency and asymptotic normality of a two-step estimator of conditional expectiles in the context of location-scale models. We first estimate the parameters of the conditional mean and variance by quasi-maximum likelihood and then compute the unconditional expectile of the innovations using the empirical quantiles of the standardized residuals. We show how replacing true innovations with standardized residuals affects the asymptotic variance of the expectile estimator. In addition, we obtain asymptotically valid bootstrap-based confidence intervals. Finally, our empirical analysis reveals that conditional expectiles are attractive alternatives for assessing tail risk in crypto markets, relative to traditional quantile-based risk measures such as value at risk and expected shortfall. Joint work with Marcelo Fernandes (FGV/EESP) and Víctor Henriques (Argus Media).
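A sample τ-expectile can be computed by asymmetric least squares, weighting observations above the current estimate by τ and those below by 1 − τ; a minimal numpy sketch (the paper's two-step estimator additionally works on standardized residuals from a fitted location-scale model):

```python
import numpy as np

def expectile(x, tau, n_iter=100):
    """Sample tau-expectile via iteratively reweighted least squares:
    each point gets weight tau above the current estimate, 1-tau below."""
    mu = np.mean(x)
    for _ in range(n_iter):
        w = np.where(x > mu, tau, 1.0 - tau)
        mu = np.sum(w * x) / np.sum(w)
    return mu

x = np.random.default_rng(1).standard_normal(100_000)
print(expectile(x, 0.5))   # equals the sample mean
print(expectile(x, 0.95))  # above the mean: an upper-tail risk measure
```

For τ = 0.5 the expectile reduces to the mean, just as the 0.5-quantile reduces to the median; larger τ pushes the estimate into the upper tail.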
Guilherme Pumi (UFRGS)
ST3
Title: Parameter Estimation in Observation-Driven Models With Missing Data
Abstract
Handling missing data in time series is a complex problem due to the presence of temporal dependence. General-purpose imputation methods, while widely used, often distort key statistical properties of the data, such as variance and dependence structure, leading to biased estimation and misleading inference. These issues become more pronounced in models that explicitly rely on capturing serial dependence, as standard imputation techniques fail to preserve the underlying dynamics. In this presentation we will discuss estimation of observation-driven models (ODM) in the presence of missing data. We propose a novel multiple imputation method specifically designed for the context of ODM, by taking advantage of the iterative nature of the systematic component in ODM to propagate the dependence structure through missing data, minimizing its impact on estimation. Unlike traditional imputation techniques, the proposed method accommodates continuous, discrete, and mixed-type data while preserving key distributional and dependence properties. The proposed method will be illustrated in the context of GARMA models.
João Caldeira (UFSC)
ST2
Title: Decomposition of the Nominal and Real Yield Curve, Term Premium Dynamics, and Inflation Forecasting
Abstract
In this paper, we use dynamic arbitrage-free term structure models (AFTSMs) to jointly model nominal and real interest rates. The approach allows us to decompose interest rates into expectations of future rates and the risk premium that investors demand as compensation for holding long-term bonds. In addition, we analyze its ability to capture risk-adjusted inflation expectations, using it for inflation forecasting. The results suggest that real and nominal term premia are time-varying and increase with maturity, and the models also generate high-frequency estimates of expected inflation.
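The decomposition described above can be reduced to simple arithmetic: an observed long-maturity yield equals the average expected short rate over the bond's life plus a term premium. A stylized numerical example (the figures are invented, not from the paper):

```python
# stylized numbers: a 3-period yield decomposed into the average
# expected short rate plus a term premium
expected_short_rates = [0.10, 0.11, 0.12]   # hypothetical 1-period rate expectations
observed_3y_yield = 0.125                   # hypothetical observed 3-period yield
term_premium = observed_3y_yield - sum(expected_short_rates) / len(expected_short_rates)
print(round(term_premium, 4))  # 0.015
```

The hard part, which the affine arbitrage-free models supply, is estimating the unobservable expected short-rate path so that the residual premium is identified.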
Lucas Lúcio Godeiro (UFERSA)
JD4
Title: Forecasting Brazilian Stock Market Using Sentiment Indices from Textual Data, Chat-GPT-Based and Technical Indicators
Abstract
The rapid advancement of artificial intelligence, exemplified by tools such as Chat-GPT, has significantly transformed the landscape of stock market analysis. This paper aims to leverage these technological developments to predict the daily returns of the Ibovespa, using predictors derived from technical indicators, sentiment indices extracted from textual data, and Chat-GPT-generated sentiment indices. Our findings reveal that the Chat-GPT-based sentiment index does not enhance the out-of-sample prediction of Ibovespa returns. Conversely, the sentiment index derived from financial news data, using a time-varying dictionary, demonstrates improved out-of-sample predictive accuracy for the Ibovespa. Notably, the predictor based on the technical indicator Accumulation–Distribution (AD) outperforms the historical average benchmark, establishing itself as the superior forecasting model. This study contributes to the ongoing discourse on the integration of artificial intelligence and traditional financial analysis, offering insights into the efficacy of sentiment indices and technical indicators for forecasting stock market returns in the Brazilian context.
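The Accumulation–Distribution (AD) indicator singled out above follows a standard textbook construction: a per-period money-flow multiplier scaled by volume and accumulated over time. A minimal sketch with three hypothetical trading days:

```python
import numpy as np

def accumulation_distribution(high, low, close, volume):
    """Accumulation-Distribution (AD) line: cumulative money-flow volume."""
    mfm = ((close - low) - (high - close)) / (high - low)  # money-flow multiplier in [-1, 1]
    return np.cumsum(mfm * volume)

# three hypothetical trading days
high = np.array([10.0, 10.5, 10.2])
low = np.array([9.0, 9.8, 9.6])
close = np.array([9.8, 10.4, 9.7])
volume = np.array([1000.0, 1200.0, 900.0])
ad = accumulation_distribution(high, low, close, volume)
print(ad)
```

A close near the high gives a multiplier near +1 (accumulation), a close near the low gives one near −1 (distribution); the paper uses the resulting series as a return predictor.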
Luís Antonio Fantozzi Alvarez (USP)
JD3
Title: Quantile Mixture Models: Estimation and Inference
Abstract
Nonparametric density mixture models are popular in Statistics and Econometrics but suffer from computational and inferential hurdles. This paper introduces nonparametric quantile mixture models as a convenient counterpart, discusses several applications, and proposes a computationally efficient sieve estimator based on a generalized method of L-moments approach. We develop a full inferential theory for our proposed estimator. In doing so, we make several contributions to statistical theory that allow us to extend a numerical bootstrap method to high-dimensional settings. We further show that, as a direct byproduct of our theory, we can provide an inference method for the distributional synthetic controls of Gunsilius (2023), a novel approach to counterfactual analysis for which formal inference methods were not yet available. As an empirical application of the latter, we apply our proposed approach to assessing the effects of a large-scale environmental disaster, the Brumadinho dam rupture, on the local wage distribution. Our results uncover a range of effects across percentiles, which we argue are consistent with displacement effects, whereby median-earning jobs are replaced by low-paying contracts.
Marcelo Moreira (FGV – EPGE)
CP8
Title: Inference based on the Continuously Updating Estimator
Abstract
This paper highlights the importance of finding all roots for the Continuously Updating Generalized Method of Moments (CU-GMM) estimator and Likelihood Ratio (LR)-based tests. Traditional numerical optimization methods often fail to locate global minima due to the non-convexity of objective functions, leading to inaccurate estimates and test results. The leading example is the Instrumental Variables model. Numerical comparisons show that our method outperforms traditional techniques, especially with many weak instruments.
Marcelo Moreira (FGV – EPGE)
ST4
Title: Efficiency Loss of Asymptotically Efficient Tests in an Instrumental Variables Regression
Abstract
In an instrumental variable model, the score statistic can be bounded for any alternative in parts of the parameter space. These regions involve a constraint on the first-stage regression coefficients and the reduced-form covariance matrix. Consequently, the Lagrange Multiplier test can have power close to size, despite being efficient under standard asymptotics. This information loss limits the power of conditional tests which use only the Anderson-Rubin and the score statistic. The conditional quasi-likelihood ratio test also suffers severe losses because it can be bounded for any alternative. A necessary condition for drastic power loss to occur is that the Hermitian of the reduced-form covariance matrix has eigenvalues of opposite signs. These cases are denoted impossibility designs (ID). We show this happens in practice, by applying our theory to the problem of inference on the intertemporal elasticity of substitution (IES). Of eleven countries studied by Yogo (2004) and Andrews (2016), nine are consistent with ID at the 95% level.
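The necessary condition stated above is a simple spectral check: form the Hermitian (symmetric) part of the relevant matrix and inspect the signs of its eigenvalues. A minimal numerical sketch (the 2×2 matrix below is arbitrary, not taken from the paper):

```python
import numpy as np

def hermitian_part_eigs(A):
    """Eigenvalues (ascending) of the Hermitian part (A + A') / 2."""
    return np.linalg.eigvalsh((A + A.T) / 2)

# arbitrary illustrative matrix whose symmetric part is indefinite
Omega = np.array([[1.0, 2.0],
                  [0.5, -0.3]])
eigs = hermitian_part_eigs(Omega)
print(eigs, bool(eigs[0] < 0 < eigs[-1]))  # opposite signs: the necessary condition holds
```

When the smallest eigenvalue is negative and the largest positive, the design is a candidate for the "impossibility" region the talk describes; if both share a sign, the drastic power loss cannot occur.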
Mateus Gonzalez de Freitas Pinto (Banco Pan/ USP)
JD2
Title: Analyzing and modeling long-memory time series using fractional spline wavelets
Abstract
Fractional splines extend Schoenberg’s B-splines to fractional orders and have been shown to fulfill all the requirements to form wavelet bases. Moreover, some of these fractional spline wavelets act as fractional difference operators for signals with essentially low-pass behavior and a pole around the origin, making them useful in the analysis of series with fractal behavior. Using the fact that this family of wavelets acts approximately as a fractional difference operator in the Fourier domain, we propose two novel estimators for the long-memory parameter of a time series based on the fractional spline discrete wavelet transform, one heuristic and the other based on maximum likelihood. We demonstrate the fractional differentiation properties of fractional spline wavelets, as well as a theorem that allows the construction of a procedure for whitening fractional noises. Simulations and examples illustrate the proposed methods, verifying their competitiveness with other proposals in the literature. Finally, we present the behavior of the proposed estimators on real data, where they dominate other widely employed methods in the time series literature.
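The fractional difference operator (1 − B)^d that these wavelets approximate has an explicit binomial expansion, which can be applied directly as a truncated filter; a minimal sketch (the truncation length and example are illustrative choices):

```python
import numpy as np

def frac_diff(x, d, n_weights=100):
    """Apply the fractional difference operator (1 - B)^d through its
    binomial expansion, truncated at n_weights lags."""
    w = np.zeros(n_weights)
    w[0] = 1.0
    for k in range(1, n_weights):
        w[k] = w[k - 1] * (k - 1 - d) / k   # binomial recursion for the filter weights
    return np.convolve(x, w, mode="valid")  # y_t = sum_k w_k x_{t-k}

x = np.cumsum(np.random.default_rng(5).standard_normal(500))  # random walk (d = 1)
y = frac_diff(x, d=0.4)  # partially differenced series; memory of order ~0.6 remains
print(y[:3])
```

For d = 1 the weights collapse to (1, −1, 0, …) and the filter reduces to an ordinary first difference, which is a convenient sanity check.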
Rodney Fonseca (UFBA)
JD1
Title: Weighted U-statistics for dependent data with applications to environmental time series
Abstract
U-statistics are a classical tool in nonparametric inference and are commonly used to establish large-sample properties of various estimators. This presentation will focus on weighted U-statistics for time series and discuss different aspects of this type of estimator, such as the level of data dependence, the type of weights, and kernel degeneracy. We will discuss a new central limit theorem for weighted U-statistics with non-degenerate kernels, which extends a classic result proved by Ken-ichi Yoshihara in 1976. Applications of these results will be presented for time series of temperature and river flow data, and we will discuss interesting problems related to these topics. Joint work with Aluísio Pinheiro.
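A degree-2 weighted U-statistic is just a normalized weighted average of a symmetric kernel over all pairs of observations; a minimal sketch (the variance kernel and the distance-decaying weights are illustrative choices, not from the talk):

```python
import numpy as np
from itertools import combinations

def weighted_u_stat(x, kernel, weight):
    """Degree-2 weighted U-statistic: normalized weighted average of
    kernel(x_i, x_j) over all pairs i < j."""
    num = den = 0.0
    for i, j in combinations(range(len(x)), 2):
        w = weight(i, j)
        num += w * kernel(x[i], x[j])
        den += w
    return num / den

x = np.random.default_rng(3).standard_normal(500)
# kernel for the variance, h(a, b) = (a - b)^2 / 2, downweighting pairs far apart in time
est = weighted_u_stat(x, lambda a, b: 0.5 * (a - b) ** 2,
                      lambda i, j: 1.0 / (1 + abs(i - j)))
print(est)  # close to Var(x) = 1 for i.i.d. N(0, 1) data
```

With constant weights this reduces to the classical (unweighted) U-statistic; the talk's theory covers the dependent-data, non-constant-weight case.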
Thaís Fonseca (UFRJ)
CP5
Title: Graphical Models for High-Dimensional Time Series: A Dynamic Approach to Forecasting and Decision-Making
Abstract
In complex dynamic systems, it is increasingly difficult for decision-makers to effectively account for all the variables within the system that may influence the evaluation of candidate policies. Each of these variables tends to be a dynamic subsystem with areas of knowledge supported by sophisticated probabilistic models. This work proposes modeling high-dimensional multivariate time series (large p) using graphical models that decompose a multivariate system into lower-dimensional subsystems and then reconstruct the original system for forecasting and decision-making. This approach captures correlations between series, temporal dependencies, nonlinearities, and regime shifts. I present the Gaussian model known as Dynamical Multiregression. I then extend the static Bayesian Network model introduced by Heckerman et al. (1995) to accommodate temporal dependence in discrete time series. In particular, expert systems are considered for estimating the network structure, and the Dirichlet evolution process (Fonseca & Ferreira, 2017) is employed for efficient Bayesian learning over time. Analyses of food security in England and Brazil are presented to illustrate the flexibility of the proposed model.
Guilherme Pumi and Taiane Prass (UFRGS)
Short Course
Title: Fundamentals of Machine Learning and Time Series
Abstract
This short course introduces the central concepts of Machine Learning (ML), covering supervised and unsupervised learning, strategies to avoid overfitting, and methods for model selection and validation, highlighting their connections with traditional techniques for time series analysis. We discuss the particularities, limitations, and care required when applying these methods. In addition, we address the specific challenges of this context, including serial dependence, non-stationarity, and seasonality, and how these factors can affect classical ML models. The course promotes a critical discussion of the use of ML in the time series context, providing a grounded view of the trade-off between flexibility and interpretability in models. On the computational side, we discuss simple implementations in R exploring applications found in the literature.
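Model validation with serial dependence cannot shuffle observations the way i.i.d. cross-validation does; a standard alternative is rolling-origin evaluation, sketched below in Python for brevity (the course itself uses R, and the split sizes here are illustrative):

```python
import numpy as np

def rolling_origin_splits(n, initial, horizon):
    """Yield (train, test) index arrays that respect temporal order:
    train on everything up to the origin, test on the next `horizon` points."""
    for end in range(initial, n - horizon + 1):
        yield np.arange(end), np.arange(end, end + horizon)

splits = list(rolling_origin_splits(10, initial=6, horizon=2))
for train, test in splits:
    print(train[-1], test.tolist())  # last training index, then the test window
```

Each split trains only on the past and tests on the immediate future, so the evaluation never leaks information backward in time.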
Alex Rodrigo dos Santos Sousa (Unicamp)
Tutorial
Title: Wavelet Smoothing
Abstract
This tutorial explores, from a practical perspective, the application of thresholding and shrinkage techniques for estimating wavelet coefficients in nonparametric regression models under Gaussian noise. It is structured in two parts: a brief introduction to the techniques, including classical approaches such as universal thresholding and SureShrink, as well as Bayesian shrinkage methods, followed by practical applications to real data sets using the R programming language.
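Universal thresholding, one of the classical approaches the tutorial covers, can be sketched with a single-level Haar transform in plain numpy (the tutorial uses R; here sigma is assumed known, whereas in practice it is estimated, e.g., from the median absolute deviation of the finest-level detail coefficients):

```python
import numpy as np

def haar_dwt(x):
    """One level of the orthonormal Haar discrete wavelet transform."""
    x = x.reshape(-1, 2)
    approx = (x[:, 0] + x[:, 1]) / np.sqrt(2)
    detail = (x[:, 0] - x[:, 1]) / np.sqrt(2)
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of one Haar level."""
    out = np.empty(2 * len(approx))
    out[0::2] = (approx + detail) / np.sqrt(2)
    out[1::2] = (approx - detail) / np.sqrt(2)
    return out

rng = np.random.default_rng(4)
n, sigma = 1024, 0.3
t = np.linspace(0, 1, n)
signal = np.sin(4 * np.pi * t)
noisy = signal + sigma * rng.standard_normal(n)

approx, detail = haar_dwt(noisy)
lam = sigma * np.sqrt(2 * np.log(n))  # universal threshold of Donoho and Johnstone
detail_soft = np.sign(detail) * np.maximum(np.abs(detail) - lam, 0.0)  # soft thresholding
denoised = haar_idwt(approx, detail_soft)
print(np.mean((denoised - signal) ** 2), np.mean((noisy - signal) ** 2))
```

Because the smooth signal contributes almost nothing to the detail coefficients while the noise spreads evenly across them, shrinking the details toward zero lowers the mean squared error relative to the raw noisy observations.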