Palestrantes
Alan De Genaro (FGV–EAESP)
ST2
Título: Product Complexity, Investor Experience, and Returns
Resumo
This paper examines how financial sophistication influences household investment outcomes in complex financial markets. Using regulatory microdata from Brazil’s structured products market, we analyze how returns vary with investor sophistication and product complexity. We introduce a novel measure of financial sophistication: prior trading experience in other securities markets. Investors with established trading histories systematically outperform inexperienced counterparts when trading complex products, exhibiting both persistent skill and greater learning capabilities. This experience-based measure is more predictive than conventional proxies such as wealth, age, and education. Our findings suggest that complexity can effectively obscure risk from unsophisticated investors, enabling strategic rent extraction through product design. These results question the efficacy of current “qualified investor” standards based on wealth thresholds, suggesting that direct measures of market participation provide more effective criteria for accessing complex financial instruments in increasingly sophisticated retail markets. (coautoria com Jason Sturgess – Queen Mary University of London, Jose Liberti – Northwestern University e Pedro Saffi – University of Cambridge)
Alex Rodrigo dos Santos Sousa (Unicamp)
Tutorial
Título: Suavização por Ondaletas
Resumo
Este tutorial explora, de uma perspectiva prática, a aplicação de técnicas de limiarização e encolhimento para a estimação de coeficientes de ondaletas em modelos de regressão não paramétrica sob ruído Gaussiano. Ele é estruturado em duas partes: uma breve introdução às técnicas, incluindo abordagens clássicas, como a limiarização universal e SureShrink, além de métodos de encolhimento Bayesiano, seguida de aplicações práticas em conjuntos de dados reais utilizando a linguagem de programação R.
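Um esboço mínimo, puramente ilustrativo, da limiarização universal mencionada acima (em Python, embora o tutorial utilize R): os coeficientes de detalhe de uma transformada de Haar de um nível são encolhidos por limiar suave lambda = sigma*sqrt(2 log n), com sigma estimado pelo MAD dos detalhes (regra de Donoho-Johnstone). O sinal senoidal e o nível de ruído são escolhas hipotéticas do exemplo.

```python
import numpy as np

def haar_dwt(x):
    """Um nível da transformada de ondaletas de Haar (ortonormal)."""
    x = np.asarray(x, dtype=float)
    aprox = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detalhe = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return aprox, detalhe

def haar_idwt(aprox, detalhe):
    """Inversa de um nível da transformada de Haar."""
    x = np.empty(2 * len(aprox))
    x[0::2] = (aprox + detalhe) / np.sqrt(2.0)
    x[1::2] = (aprox - detalhe) / np.sqrt(2.0)
    return x

def limiar_universal(x):
    """Limiarizacao suave dos detalhes com lambda = sigma*sqrt(2 log n);
    sigma estimado pelo MAD dos detalhes na escala mais fina."""
    aprox, detalhe = haar_dwt(x)
    sigma = np.median(np.abs(detalhe)) / 0.6745      # estimador robusto do ruido
    lam = sigma * np.sqrt(2.0 * np.log(len(x)))      # limiar universal
    detalhe = np.sign(detalhe) * np.maximum(np.abs(detalhe) - lam, 0.0)
    return haar_idwt(aprox, detalhe)

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 256)
sinal = np.sin(2 * np.pi * t)
ruidoso = sinal + 0.3 * rng.standard_normal(256)
suavizado = limiar_universal(ruidoso)
```

No tutorial, o mesmo procedimento (e variantes como SureShrink e encolhimento Bayesiano) é aplicado em R a dados reais.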
Audrone Virbickaite (CUNEF Universidad)
CP2
Título: Intraday Crude Oil Volatility: Assessing the Impact of Economic Announcements and Mixed-frequency Data
Resumo
This paper investigates the real-time impact of economic announcements and policy decisions on intraday crude oil volatility. Our model employs a high-frequency mixed data sampling (MIDAS) approach to capture the interaction between market volatility and economic uncertainty, paired with spike-and-slab priors on the announcement coefficients to perform efficient variable selection in a high-dimensional setting. Our findings demonstrate that these announcements significantly influence short-term oil price volatility, while the evolution of the volatility level depends on economic indicators sampled at lower frequencies. The proposed model improves the accuracy of short-term volatility forecasts, offering valuable insights for market participants and policymakers. The empirical results highlight the importance of timely economic information in forecasting oil market dynamics.
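For intuition, the low-frequency component of a MIDAS specification can be sketched with the Beta-polynomial lag weights common in the MIDAS literature; the parameter values and the toy monthly series below are illustrative, not the paper's specification.

```python
import numpy as np

def beta_midas_weights(n_lags, theta1=1.0, theta2=5.0):
    """Beta-polynomial MIDAS lag weights: a parsimonious declining scheme
    that turns n_lags low-frequency observations into one regressor;
    the weights are nonnegative and sum to one."""
    k = np.arange(n_lags) / (n_lags - 1)            # grid on [0, 1]
    k = np.clip(k, 1e-8, 1 - 1e-8)                  # avoid 0 ** negative
    w = k ** (theta1 - 1.0) * (1.0 - k) ** (theta2 - 1.0)
    return w / w.sum()

def midas_aggregate(x, theta1=1.0, theta2=5.0):
    """Weighted sum of the most recent low-frequency values, newest first,
    as would enter the long-run component of a MIDAS volatility model."""
    w = beta_midas_weights(len(x), theta1, theta2)
    return float(w @ np.asarray(x, dtype=float))

monthly = np.array([2.1, 1.8, 1.5, 1.2, 1.0, 0.9])  # toy indicator, newest first
agg = midas_aggregate(monthly)
```

With theta1 = 1 and theta2 > 1 the weights decline monotonically, so recent low-frequency information dominates the aggregate.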
Boaz Nadler (Weizmann Institute of Science)
CP1
Título: Finding structure in high dimensional data: Statistical and Computational Challenges
Resumo
A fundamental task in the statistical analysis of data is to detect and estimate interesting “structures” hidden in it. In this talk I’ll focus on aspects of this problem in a high-dimensional regime, where each observed sample has many coordinates and the number of samples is limited. We will show how in such cases: (i) standard methods to detect structure in high dimensions, such as principal component analysis, may not work well; (ii) sparsity can come to the rescue, albeit bringing with it significant statistical and computational challenges; and (iii) some interesting phenomena may occur in semi-supervised learning settings where for a few of the samples we are also given their underlying labels. Specifically, merging labeled and unlabeled data may have significant computational benefits in high dimensions.
Boaz Nadler (Weizmann Institute of Science)
ST1
Título: Completing large low-rank matrices from only a few observed entries: A one-line algorithm with provable guarantees
Resumo
Suppose you observe very few entries from a large matrix. Can we predict the missing entries, say assuming the matrix is (approximately) low rank? We describe a very simple method to solve this matrix completion problem. We show our method is able to recover matrices from very few entries and/or with ill conditioned matrices, where many other popular methods fail. Furthermore, due to its simplicity, it is easy to extend our method to incorporate additional knowledge on the underlying matrix, for example to solve the inductive matrix completion problem. On the theoretical front, we prove that our method enjoys some of the strongest available theoretical recovery guarantees. Finally, for inductive matrix completion, we prove that under suitable conditions the problem has a benign optimization landscape with no bad local minima.
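For intuition only, here is a minimal sketch of a classic matrix-completion baseline ("hard impute": iterative SVD projection), not necessarily the one-line algorithm of the talk; the rank, sampling density, and matrix sizes are illustrative.

```python
import numpy as np

def svd_impute(M_obs, mask, rank, n_iter=200):
    """'Hard impute': alternate between filling the missing entries with
    the current low-rank estimate and projecting onto rank-r matrices,
    keeping the observed entries fixed."""
    X = np.where(mask, M_obs, 0.0)                  # start missing entries at 0
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]   # rank-r projection
        X = np.where(mask, M_obs, low_rank)               # refill missing entries
    return low_rank

rng = np.random.default_rng(1)
A = rng.standard_normal((30, 3)) @ rng.standard_normal((3, 30))  # rank-3 target
mask = rng.random((30, 30)) < 0.5                                # ~50% observed
A_hat = svd_impute(np.where(mask, A, 0.0), mask, rank=3)
rel_err = np.linalg.norm(A_hat - A) / np.linalg.norm(A)
```

On this easy, well-conditioned instance the iteration recovers the missing half of the entries accurately; the talk's interest is precisely the harder regimes (very few entries, ill conditioning) where such baselines fail.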
Chengchun Shi (London School of Economics)
CP7
Título: ARMA-Design: Optimal Treatment Allocation Strategies for A/B Testing in Partially Observable Time Series Experiments
Resumo
Chengchun Shi (London School of Economics)
ST1
Título: Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning
Resumo
Reinforcement learning from human feedback (RLHF) has emerged as a key technique for aligning the output of large language models (LLMs) with human preferences. To learn the reward function, most existing RLHF algorithms use the Bradley-Terry model, which relies on assumptions about human preferences that may not reflect the complexity and variability of real-world judgments. In this paper, we propose a robust algorithm to enhance the performance of existing approaches under such reward model misspecifications. Theoretically, our algorithm reduces the variance of reward and policy estimators, leading to improved regret bounds. Empirical evaluations on LLM benchmark datasets demonstrate that the proposed algorithm consistently outperforms existing methods, with 77-81% of responses being favored over baselines on the Anthropic Helpful and Harmless dataset.
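The Bradley-Terry reward-learning step that the paper robustifies can be sketched in miniature; a linear reward on simulated features stands in for an LLM reward head, and all names and data below are hypothetical.

```python
import numpy as np

def bradley_terry_nll(theta, feats_win, feats_lose):
    """Negative log-likelihood of pairwise preferences under Bradley-Terry
    with a linear reward r(x) = theta @ x."""
    margin = (feats_win - feats_lose) @ theta
    return float(np.mean(np.log1p(np.exp(-margin))))

def fit_bt_reward(feats_win, feats_lose, lr=0.5, n_steps=1000):
    """Fit the linear reward by gradient descent on the BT loss."""
    diff = feats_win - feats_lose
    theta = np.zeros(diff.shape[1])
    for _ in range(n_steps):
        grad = -diff.T @ (1.0 / (1.0 + np.exp(diff @ theta))) / len(diff)
        theta -= lr * grad
    return theta

rng = np.random.default_rng(2)
true_theta = np.array([1.0, -2.0, 0.5])                # hypothetical reward weights
X_a = rng.standard_normal((400, 3))
X_b = rng.standard_normal((400, 3))
p_a = 1.0 / (1.0 + np.exp(-(X_a - X_b) @ true_theta))  # BT win probability
a_wins = rng.random(400) < p_a
feats_win = np.where(a_wins[:, None], X_a, X_b)
feats_lose = np.where(a_wins[:, None], X_b, X_a)
theta_hat = fit_bt_reward(feats_win, feats_lose)
```

When the preference data deviate from this model (the misspecification the paper targets), the fitted reward can be badly biased, which motivates the robust variant.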
Cristine Campos (Insper)
CP6
Título: Difference-in-Discontinuities: Theory and Applications
Resumo
This project sets out to explore the econometric theory behind the newly developed difference-in-discontinuities design. Despite its increasing use in causal inference research, there are currently limited studies about its underlying principles and properties. The method combines elements of regression discontinuity and difference-in-differences, allowing researchers to eliminate the effects of potential threshold discontinuities and account for changes in the larger environment.
Eduardo Fonseca Mendes (FGV-EESP)
CP3
Título: Estimation risk in conditional expectiles
Resumo
We establish the consistency and asymptotic normality of a two-step estimator of conditional expectiles in the context of location-scale models. We first estimate the parameters of the conditional mean and variance by quasi-maximum likelihood and then compute the unconditional expectile of the innovations using the empirical quantiles of the standardized residuals. We show how replacing the true innovations with standardized residuals affects the asymptotic variance of the expectile estimator. We also obtain asymptotically valid bootstrap-based confidence intervals. Finally, our empirical analysis reveals that conditional expectiles are compelling alternatives for assessing tail risk in crypto markets, relative to traditional quantile-based risk measures such as value at risk and expected shortfall. Joint work with Marcelo Fernandes (FGV/EESP) and Víctor Henriques (Argus Media).
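The two-step idea can be illustrated in miniature; here a constant mean and variance stand in for the paper's conditional location-scale (e.g., GARCH-type) filter, purely for exposition.

```python
import numpy as np

def expectile(x, tau=0.95, n_iter=100):
    """Sample expectile at level tau via the fixed-point iteration for the
    asymmetric-least-squares first-order condition."""
    x = np.asarray(x, dtype=float)
    e = x.mean()
    for _ in range(n_iter):
        w = np.where(x > e, tau, 1.0 - tau)   # asymmetric weights
        e = np.sum(w * x) / np.sum(w)
    return float(e)

def two_step_expectile(returns, tau=0.95):
    """Two-step scheme in miniature: (1) standardize by a fitted location
    and scale (here a constant mean/variance, standing in for a conditional
    location-scale model); (2) take the empirical expectile of the
    standardized residuals and map it back to the original scale."""
    mu = returns.mean()
    sigma = returns.std(ddof=0)
    z = (returns - mu) / sigma                # standardized residuals
    return mu + sigma * expectile(z, tau)

rng = np.random.default_rng(3)
r = rng.standard_normal(10_000)               # simulated returns
risk = two_step_expectile(r, tau=0.95)
```

The paper's contribution is precisely how the first-step estimation error (standardized residuals instead of true innovations) propagates into the asymptotic variance of the second-step expectile.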
Flávio Ziegelmann (UFRGS)
ST1
Título: Improving Copula-GARCH Risk Forecasting by Learning from Factor Functional Time Series
Resumo
Accurate forecasting of risk measures, such as Value at Risk and Expected Shortfall, is an essential task for asset market managers. For most approaches, an essential step when calculating risk measures is to estimate the probability density function of asset returns. A daily sequence of intraday return densities of p assets, denoted by Y_t, t=1,…,n, can be seen as a p-dimensional functional time series. If p is large (Y_t is high dimensional), then one has to perform a two-way dimension reduction: in the high-dimensional vector and in the infinite-dimensional curves. Here we propose combining a Functional Factor Model with a univariate Dynamic Functional Principal Components Analysis as a two-way dimension reduction approach, which, allied to a copula model, feeds the error term of a high-frequency ARMA-GARCH model aiming to forecast future daily risk measures.
Guilherme Pumi (UFRGS)
ST3
Título: Parameter Estimation in Observation-Driven Models With Missing Data
Resumo
Handling missing data in time series is a complex problem due to the presence of temporal dependence. General-purpose imputation methods, while widely used, often distort key statistical properties of the data, such as variance and dependence structure, leading to biased estimation and misleading inference. These issues become more pronounced in models that explicitly rely on capturing serial dependence, as standard imputation techniques fail to preserve the underlying dynamics. In this presentation we will discuss estimation of observation-driven models (ODM) in the presence of missing data. We propose a novel multiple imputation method specifically designed for the context of ODM, by taking advantage of the iterative nature of the systematic component in ODM to propagate the dependence structure through missing data, minimizing its impact on estimation. Unlike traditional imputation techniques, the proposed method accommodates continuous, discrete, and mixed-type data while preserving key distributional and dependence properties. The proposed method will be illustrated in the context of GARMA models.
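The flavor of dependence-propagating imputation can be illustrated with the simplest observation-driven recursion, a Gaussian AR(1). This forward-only sketch conditions on the past only and assumes the parameters known; the proposed method is far more general (GARMA-type models, multiple data types).

```python
import numpy as np

def ar1_multiple_impute(x, phi, sigma, n_draws=5, seed=0):
    """Multiple imputation for a Gaussian AR(1): each missing value is
    drawn from its conditional law given the previous (possibly imputed)
    value, so serial dependence is propagated rather than broken."""
    rng = np.random.default_rng(seed)
    draws = []
    for _ in range(n_draws):
        z = np.array(x, dtype=float)
        for t in range(1, len(z)):
            if np.isnan(z[t]):
                z[t] = phi * z[t - 1] + sigma * rng.standard_normal()
        draws.append(z)
    return np.stack(draws)

rng = np.random.default_rng(6)
n, phi = 2000, 0.8
true = np.zeros(n)
for t in range(1, n):                         # simulate the AR(1) path
    true[t] = phi * true[t - 1] + rng.standard_normal()
x = true.copy()
miss = rng.random(n) < 0.3                    # 30% missing completely at random
miss[0] = False                               # keep the starting value observed
x[miss] = np.nan
draws = ar1_multiple_impute(x, phi=phi, sigma=1.0)

def lag1(z):
    return float(np.corrcoef(z[:-1], z[1:])[0, 1])

mean_filled = np.where(np.isnan(x), np.nanmean(x), x)   # naive mean imputation
```

Mean imputation visibly flattens the lag-1 autocorrelation, while the model-based draws preserve much more of it, which is the distortion the abstract warns about.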
João Caldeira (UFSC)
ST2
Título: Decomposição da Curva de Juros Nominal e Real, Dinâmica do Prêmio a Termo e Previsão da Inflação
Resumo
Neste artigo, usamos os modelos dinâmicos e livres de arbitragem para a estrutura a termo das taxas de juros (AFTSMs) para modelar conjuntamente as taxas de juros nominais e reais. A abordagem permite decompor as taxas de juros em expectativas para taxas de juros futuras e o prêmio pelo risco que os investidores exigem como compensação pela compra de títulos de longo prazo. Além disso, analisamos sua capacidade de capturar expectativas de inflação ajustadas ao risco, usando-a para previsão de inflação. Os resultados sugerem que os prêmios a termo real e nominal variam com o tempo e aumentam com as maturidades, e também geram estimativas de alta frequência da inflação esperada.
Lucas Finamor (FGV-EESP)
ST4
Título: There must be an error here! Experimental evidence on coding errors’ biases
Resumo
Economics research relies heavily on computational activities. Nonetheless, coding errors are widely present, even in papers that have gone through peer review. In this paper, we investigate whether researchers have differential probabilities of debugging their code, depending on the results they face. We test this hypothesis in a randomized experiment in which common coding errors would lead to either expected or unexpected results. If researchers are less likely to look for coding errors when encountering unfavorable results, this implies a bias in scientific inquiry.
Lucas Lúcio Godeiro (UFERSA)
JD4
Título: Forecasting Brazilian Stock Market Using Sentiment Indices from Textual Data, Chat-GPT-Based and Technical Indicators
Resumo
The rapid advancement of artificial intelligence, exemplified by tools such as Chat-GPT, has significantly transformed the landscape of stock market analysis. This paper aims to leverage these technological developments to predict the daily returns of the Ibovespa by utilizing predictors derived from technical indicators, sentiment indices extracted from textual data, and Chat-GPT-generated sentiment indices. Our findings reveal that the Chat-GPT-based sentiment index does not enhance the out-of-sample prediction of Ibovespa returns. Conversely, the sentiment index derived from financial news data, utilizing a time-varying dictionary, demonstrates improved out-of-sample predictive accuracy for the Ibovespa. Notably, the predictor based on the technical indicator Accumulation–Distribution (AD) outperforms the historical average benchmark, establishing itself as the superior forecasting model. This study contributes to the ongoing discourse on the integration of artificial intelligence and traditional financial analysis, offering insights into the efficacy of sentiment indices and technical indicators for forecasting stock market returns in the Brazilian context.
Luís Antonio Fantozzi Alvarez (USP)
JD3
Título: Quantile Mixture Models: Estimation and Inference
Resumo
Nonparametric density mixture models are popular in Statistics and Econometrics but suffer from computational and inferential hurdles. This paper introduces nonparametric quantile mixture models as a convenient counterpart, discusses several applications, and proposes a computationally efficient sieve estimator based on a generalized method of L-moments approach. We develop a full inferential theory for our proposed estimator. In doing so, we make several contributions to statistical theory that allow us to extend a numerical bootstrap method to high-dimensional settings. We further show that, as a direct byproduct of our theory, we can provide an inference method for the distributional synthetic controls of Gunsilius (2023), a novel approach to counterfactual analysis for which formal inference methods were not yet available. As an empirical application of the latter, we apply our proposed approach to assess the effects of a large-scale environmental disaster, the Brumadinho barrage rupture, on the local wage distribution. Our results uncover a range of effects across percentiles, which we argue are consistent with displacement effects, whereby median-earning jobs are replaced by low-paying contracts.
Marcelo Moreira (FGV–EPGE)
CP8
Título: Inference based on the Continuously Updating Estimator
Resumo
This paper highlights the importance of finding all roots for the Continuously Updating Generalized Method of Moments (CU-GMM) estimator and Likelihood Ratio (LR)-based tests. Traditional numerical optimization methods often fail to locate global minima due to the non-convexity of objective functions, leading to inaccurate estimates and test results. The leading example is the Instrumental Variables model. Numerical comparisons show that our method outperforms traditional techniques, especially with many weak instruments.
Marcelo Moreira (FGV–EPGE)
ST4
Título: Efficiency Loss of Asymptotically Efficient Tests in An Instrumental Variables Regression
Resumo
In an instrumental variable model, the score statistic can be bounded for any alternative in parts of the parameter space. These regions involve a constraint on the first-stage regression coefficients and the reduced-form covariance matrix. Consequently, the Lagrange Multiplier test can have power close to size, despite being efficient under standard asymptotics. This information loss limits the power of conditional tests which use only the Anderson-Rubin and the score statistic. The conditional quasi-likelihood ratio test also suffers severe losses because it can be bounded for any alternative. A necessary condition for drastic power loss to occur is that the Hermitian of the reduced-form covariance matrix has eigenvalues of opposite signs. These cases are denoted impossibility designs (ID). We show this happens in practice, by applying our theory to the problem of inference on the intertemporal elasticity of substitution (IES). Of eleven countries studied by Yogo (2004) and Andrews (2016), nine are consistent with ID at the 95% level.
Mateus Gonzalez de Freitas Pinto (Banco Pan/ USP)
JD2
Título: Analyzing and modeling long-memory time series using fractional spline wavelets
Resumo
Fractional splines extend Schoenberg’s B-splines to fractional orders, which have been shown to fulfill all the requirements to form wavelet bases. Nevertheless, some of these fractional spline wavelets act as fractional difference operators for signals with essentially low-pass behavior and with a pole around the origin, making them useful in the analysis of series with fractal behavior. Using the fact that this family of wavelets acts approximately as a fractional difference operator in the Fourier domain, we propose two novel estimators for the long-memory parameter of a time series based on the fractional spline discrete wavelet transform, one heuristic and the other based on maximum likelihood. We demonstrate the fractional differentiation properties of fractional spline wavelets, as well as a theorem that allows for the construction of a procedure for whitening fractional noises. Simulations and examples are provided to illustrate the proposed methods, verifying their competitiveness with other proposals in the literature. Finally, we present the behavior of the proposed estimator on real data, verifying its dominance over other widely employed methods in the time series literature.
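The heuristic log-variance regression can be sketched with a plain Haar transform standing in for the fractional spline wavelets of the paper; the scales, sample size, and white-noise check below are illustrative choices, not the authors' setup.

```python
import numpy as np

def haar_details(x, n_levels):
    """Detail coefficients of a multilevel orthonormal Haar DWT."""
    details = []
    a = np.asarray(x, dtype=float)
    for _ in range(n_levels):
        details.append((a[0::2] - a[1::2]) / np.sqrt(2.0))
        a = (a[0::2] + a[1::2]) / np.sqrt(2.0)
    return details

def wavelet_hurst(x, n_levels=6):
    """Heuristic log-variance regression: for stationary long-memory noise
    the wavelet variance at level j behaves like 2^{j(2H-1)}, so the slope
    of log2(variance) on j recovers the Hurst exponent H."""
    details = haar_details(x, n_levels)
    j = np.arange(1, n_levels + 1)
    logvar = np.log2([np.mean(d ** 2) for d in details])
    slope = np.polyfit(j, logvar, 1)[0]
    return float((slope + 1.0) / 2.0)

rng = np.random.default_rng(4)
white = rng.standard_normal(4096)   # short memory: estimate should be near H = 0.5
h_hat = wavelet_hurst(white)
```

The paper's estimators exploit the sharper fractional-differencing property of fractional spline wavelets instead of this generic scaling regression, and add a likelihood-based variant.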
Pedro Morettin (USP)
ST3
Título: Robust Semiparametric Nonlinear Time Series Models Using Reproducing Kernel Hilbert Spaces
Resumo
Reproducing Kernel Hilbert Spaces (RKHS) are considered in order to analyze nonlinear heteroscedastic regressions of stochastic processes under very general settings and subject to heavy-tailed errors. An estimation procedure using the EM algorithm is proposed and, to make the estimators more resistant to outliers than those based only on the Normal distribution, the model random errors are assumed to follow a scale mixture of Normal distributions. This class of distributions includes, among others, the Student’s t and symmetric stable distributions. The methodology is extended to multivariate models. We present the underlying theory and develop the estimation algorithm; its applicability and performance are assessed via a simulation study.
Rafael Araújo (FGV-EESP)
ST4
Título: Potato Potahto in the FAO-GAEZ Productivity Measures? Nonclassical Measurement Error with Multiple Proxies
Resumo
The FAO-GAEZ crop productivity data are widely used in Economics. However, the existence of measurement error is rarely recognized in the empirical literature. We propose a novel method to partially identify the effect of agricultural productivity, deriving bounds that allow for nonclassical measurement error by leveraging two proxies. These bounds exhaust all the information contained in the first two moments of the data. We reevaluate three influential studies, documenting that measurement error matters and that the impact of agricultural productivity on economic outcomes may be smaller than previously reported. Our methodology has broad applications in empirical research involving mismeasured variables.
Rodney Fonseca (UFBA)
JD1
Título: Wavelet Feature Screening
Resumo
An initial screening of which covariates are relevant is a common practice in high-dimensional regression models. The classic feature screening selects only a subset of covariates correlated with the response variable. However, many important features might have a relevant albeit highly nonlinear relation with the response. One screening approach that handles nonlinearity is to compute the correlation between the response and nonparametric functions of each covariate. Wavelets are powerful tools for nonparametric and functional data analysis but are still seldom used in the feature screening literature. In this talk, we introduce a wavelet feature screening method that can be easily implemented. Theoretical and simulation results show that the proposed method can capture true covariates with high probability, even in highly nonlinear models. We also present an example with real data in a high-dimensional setting.
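A minimal caricature of nonlinear marginal screening, with a cubic polynomial basis standing in for the wavelet basis of the talk; the data-generating design is hypothetical.

```python
import numpy as np

def marginal_screening_scores(X, y, degree=3):
    """For each covariate, regress y on a small polynomial basis of that
    covariate alone and record the marginal R^2; the covariates with the
    largest scores survive the screening step."""
    n, p = X.shape
    sst = np.sum((y - y.mean()) ** 2)
    scores = np.empty(p)
    for j in range(p):
        B = np.vander(X[:, j], degree + 1)          # columns x^3, x^2, x, 1
        coef, *_ = np.linalg.lstsq(B, y, rcond=None)
        scores[j] = 1.0 - np.sum((y - B @ coef) ** 2) / sst
    return scores

rng = np.random.default_rng(5)
n, p = 300, 50
X = rng.standard_normal((n, p))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + 0.3 * rng.standard_normal(n)  # only 2 active
scores = marginal_screening_scores(X, y)
top2 = set(np.argsort(scores)[-2:].tolist())
```

Note that plain linear correlation would miss the purely quadratic covariate (X[:, 1]) entirely, which is the nonlinearity argument the abstract makes; wavelet bases extend this idea to far rougher relationships.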
Taiane Prass e Guilherme Pumi (UFRGS)
Minicurso
Título: Fundamentos de Machine Learning e Séries Temporais
Resumo
Este mini-curso apresenta uma introdução aos conceitos centrais de Machine Learning (ML), abordando aprendizagem supervisionada e não supervisionada, estratégias para evitar superajuste e métodos de seleção e validação de modelos, destacando suas relações com técnicas tradicionais para análise de séries temporais. Discutimos as particularidades, limitações e cuidados necessários na aplicação desses métodos. Adicionalmente, abordamos os desafios específicos deste contexto, incluindo dependência serial, não estacionariedade e sazonalidade, e como esses fatores potencialmente impactam os modelos clássicos de ML. O curso promove uma discussão crítica sobre a utilização de ML no contexto de séries temporais, fornecendo uma visão fundamentada sobre o balanço entre flexibilidade e interpretabilidade nos modelos. Computacionalmente, serão discutidas algumas implementações simples em R explorando aplicações encontradas na literatura.
Thaís Fonseca (UFRJ)
CP5
Título: Graphical Models for High-Dimensional Time Series: A Dynamic Approach to Forecasting and Decision-Making
Resumo
In complex dynamic systems, it is increasingly difficult for decision-makers to effectively account for all the variables within the system that may influence the evaluation of candidate policies. Each of these variables tends to be a dynamic subsystem with areas of knowledge supported by sophisticated probabilistic models. This work proposes modeling high-dimensional multivariate time series (large p) using graphical models that decompose a multivariate system into lower-dimensional subsystems and then reconstruct the original system for forecasting and decision-making. This approach captures correlations between series, temporal dependencies, nonlinearities, and regime shifts. I present the Gaussian model known as Dynamical Multiregression. Next, I extend the static Bayesian Network model introduced by Heckerman et al. (1995) to accommodate temporal dependence in discrete time series. In particular, expert systems are considered for estimating the network structure, and the Dirichlet evolution process (Fonseca & Ferreira, 2017) is employed for efficient Bayesian learning over time. Analyses of food security in England and Brazil are presented to illustrate the flexibility of our proposed model.
Silvia Lopes (UFRGS)
CP4
Título: Hellinger Integral Properties in Testing Parameters in CIR and Vasicek Models
Resumo
Proper representation formulas for the Hellinger integral of the distributions of the Cox-Ingersoll-Ross (here denoted CIR) and Vasicek models are presented, together with their likelihood and least squares functions. Under the assumption that only the mean level parameter is unknown, both a maximum likelihood-type and a least squares-type estimator are derived for this parameter, for both models. Large deviation properties for the Vasicek and CIR models are illustrated as possible applications of these well-known stochastic differential equations in Finance. We also present simulations of both the maximum likelihood and the least squares estimators, for both models. This is joint work with A.V. Medino (Mathematics Department, University of Brasília) and F.S. Quintino (Statistics Department, University of Brasília).