Expert Systems with Applications, Elsevier, 2024, p. 123-134.

The potential to study and improve different aspects of our lives keeps growing thanks to the abundance of data now available. Scientists and researchers often need to analyze data from different sources; the observations, which share only a subset of the variables, cannot always be paired to identify common individuals. This is the case, for example, when the information required to study a certain phenomenon comes from different sample surveys. Statistical matching is a common practice for combining such data sets. In this paper, we investigate and extend to statistical matching two methods based on Kernel Canonical Correlation Analysis (KCCA) and Super-Organizing Map (Super-OM). These methods are designed to deal with various variable types, sample weights and incompatibilities among categorical variables. In the first case, we use KCCA, a non-linear extension of CCA, to create canonical variables that can be compared across the two data sets. In the second case, Super-OM uses organizing maps to create subgroups of individuals who share the same characteristics. We use the 2017 Belgian Statistics on Income and Living Conditions (SILC) and compare the performance of the proposed statistical matching methods by means of a cross-validation technique, as if the data were available from two separate sources. The results indicate that the proposed methods outperform existing methods because they preserve the distribution of the generated variables while also providing good predictions; existing methods typically achieve only one or the other. These new techniques open the door to improving statistical matching in other applications such as medicine, economics, …
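To make the matching task concrete, here is a minimal sketch of nearest-neighbour distance hot deck, a simple baseline for statistical matching. It is not the KCCA or Super-OM method of the paper. The function name `hot_deck_match`, the variable names and the toy records are all hypothetical, chosen only for illustration.

```python
def hot_deck_match(recipient, donor, common, target):
    """Copy `target` from the nearest donor record onto each recipient record.

    Distance is squared Euclidean on the shared variables in `common`.
    Assumption: the variables are on comparable scales; real data would
    normally be standardized first, and sample weights are ignored here.
    """
    matched = []
    for r in recipient:
        nearest = min(
            donor,
            key=lambda d: sum((r[c] - d[c]) ** 2 for c in common),
        )
        matched.append({**r, target: nearest[target]})
    return matched


# Hypothetical toy data: the donor file observes income, the recipient file does not.
donor = [
    {"age": 25, "hours": 20, "income": 1500},
    {"age": 40, "hours": 38, "income": 3200},
    {"age": 60, "hours": 10, "income": 1800},
]
recipient = [
    {"age": 27, "hours": 22},
    {"age": 58, "hours": 12},
]

matched = hot_deck_match(recipient, donor, common=["age", "hours"], target="income")
print([m["income"] for m in matched])  # prints [1500, 1800]
```

In a cross-validation set-up like the one described above, the target variable would be hidden in part of a single survey, imputed this way, and then compared with its true values.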

Journal of Applied Statistics, (2022).

Several commercial banks in the United States disappeared during the last decades due to failure or acquisition by another entity. From a survival analysis perspective, however, the high censoring rate suggests that some institutions are likely to be immune to failure and/or acquisition. In this study, we use a competing risks proportional-hazards cure model in order to measure the impact of bank-specific and macroeconomic variables on the probabilities of being susceptible to these events (i.e. incidence) and on the survival time of susceptible banks (i.e. latency). Moreover, we propose to model the incidence distribution using Generalized Extreme Value regression and compare the results with the ones obtained by the usual logistic regression model. The proposed methodology is evaluated by means of a simulation study and then applied to a dataset of more than 4000 United States commercial banks spanning the period 1993–2018.

International Journal of Microsimulation, Vol. 14, no.1, p. 43-72 (2021).

Following the example of other countries, Belgium has implemented in-work benefit policies since the early 2000s, with the objective of increasing employment rates and fighting poverty. Belgian in-work benefits differ from most other in-work benefits in that eligibility requires low hourly earnings. We study the effects that extending those benefits would have on both labour supply and welfare, using a random-utility - random-opportunity model estimated on cross-sectional SILC datasets. Results show that further increasing the benefits would slightly increase labour supply and the welfare of low-to-middle income deciles, but at a very high net cost per job created. We compare our results with existing research and explain some mechanisms that possibly led to an underestimation of negative intensive-margin labour supply responses in previous simulations.

The R Journal - Vol. 31, no.1, p. 116-129 (2021).

We describe the penPHcure R package, which implements the semiparametric proportional-hazards (PH) cure model of Sy and Taylor (2000) extended to time-varying covariates and the variable selection technique based on its SCAD-penalized likelihood proposed by Beretta and Heuchenne (2019a). In survival analysis, cure models are a useful tool when a fraction of the population is likely to be immune from the event of interest. They can separate the effects of certain factors on the probability of being susceptible and on the time until the occurrence of the event. Moreover, the penPHcure package allows the user to simulate data from a PH cure model, where the event-times are generated on a continuous scale from a piecewise exponential distribution conditional on time-varying covariates, with a method similar to Hendry (2014). We present the results of a simulation study to assess the finite sample performance of the methodology and illustrate the functionalities of the penPHcure package using criminal recidivism data.
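As a rough illustration of the incidence/latency split, the sketch below simulates data from a much simplified mixture cure model in Python. It does not use or reproduce the penPHcure package. It assumes a single time-fixed covariate, a logistic incidence model, a constant baseline hazard (the one-piece special case of a piecewise exponential) and administrative censoring at a fixed time. All parameter names and values are hypothetical.

```python
import math
import random


def simulate_cure_data(n, b0=0.5, b1=1.0, beta=0.7, base_hazard=0.2,
                       censor_time=10.0, seed=0):
    """Simulate (time, event, x) triples from a simple mixture cure model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        # Incidence: logistic probability of being susceptible to the event.
        p_susceptible = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        if rng.random() < p_susceptible:
            # Latency: proportional hazards with a constant baseline hazard.
            t = rng.expovariate(base_hazard * math.exp(beta * x))
        else:
            t = math.inf  # cured (immune) subjects never experience the event
        # Administrative censoring at a fixed time (an assumption of this sketch).
        time = min(t, censor_time)
        event = 1 if t <= censor_time else 0
        rows.append((time, event, x))
    return rows


data = simulate_cure_data(1000)
censoring_rate = sum(1 for _, event, _ in data if event == 0) / len(data)
print(f"censoring rate: {censoring_rate:.2f}")
```

Because cured subjects are always censored, the censoring rate stays well above zero however long the follow-up, which is the pattern that motivates a cure model.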

Revue économique - Vol. 72, no. 3, p. 443-458 (2021).

We study a model with independent private values in which the ranking of all bids is revealed to an outside observer, information that allows the observer to best estimate the participants' types. In this setting, revenue equivalence still holds for sealed-bid auctions, and the optimal static auction requires a specific reserve price and a continuous entry fee in order to extract all the surplus from the bidders with the lowest valuations. An illustrative answer is given to the question of the optimal size of the bid ranking to reveal. Finally, the non-equivalence between the ascending auction and the second-price auction is established. Applications are as diverse as the sale of works of art and the financing of charitable causes.

Empirical Economics, (2021).

This paper provides causal evidence on the effects of parental involvement on student outcomes in a financial education course based on two randomised controlled trials with a total of 2,779 students from grades 8 and 9 in Flanders. Using an experimental design with three treatment groups, the impact of parental involvement in homework is distinguished from the standalone impact of the classroom intervention and homework itself. Intention-to-treat analysis reveals that access to the intervention effectively improves students’ financial literacy in the two dimensions of knowledge and behaviour. The classroom intervention combined with a homework assignment to be completed with the parents increases financial literacy by 0.38 standard deviations. On average, the added value of prompting parental involvement in homework is not statistically significant. Yet, stimulating parental involvement has significant positive effects on behaviour for disadvantaged students.

