My research focuses on the intersection of statistics, machine learning, and data science. I am particularly interested in developing and applying statistical models to address complex problems in the social sciences, finance, and health data.
Below is a selection of my research work and academic publications. Each entry includes the authors, publication date, a short abstract, and a link to the full text or a related resource.
Restricted Boltzmann Machines (RBMs) are undirected probabilistic graphical models that can be interpreted as stochastic neural networks. In their basic formulation, Boltzmann Machines (BMs) involve binary variables. They have found applications in a wide range of learning problems. However, RBMs are generally used as a preliminary phase for other algorithms, providing either data preprocessing or an initialization method for feedforward neural network classifiers; they are not typically considered a standalone solution for classification problems. In this paper, we proposed several variants of the RBM in which the observed variables are assumed to be continuous and bounded. We demonstrated that the resulting models can solve high-dimensional data classification problems without relying on a separate classification algorithm. The experiments showed that, in addition to having good characteristics as generative data models, the proposed models possess excellent predictive power, outperforming certain standard models commonly used for classification tasks.
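The "stochastic neural network" view mentioned in the entry above can be illustrated with a standard binary RBM (not the paper's continuous bounded variant): hidden and visible units are sampled from each other through the weight matrix. This is a minimal sketch with made-up dimensions and random weights, not the paper's model.

```python
import numpy as np

# Illustrative binary RBM (not the continuous bounded variant from the
# paper): units are sampled stochastically through the weight matrix W.
rng = np.random.default_rng(42)
n_v, n_h = 6, 4                              # toy layer sizes
W = 0.1 * rng.standard_normal((n_v, n_h))    # visible-hidden weights
b = np.zeros(n_v)                            # visible biases
c = np.zeros(n_h)                            # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step(v):
    """One alternating Gibbs sweep: sample h given v, then v given h."""
    p_h = sigmoid(v @ W + c)
    h = (rng.random(n_h) < p_h).astype(float)
    p_v = sigmoid(h @ W.T + b)
    v_new = (rng.random(n_v) < p_v).astype(float)
    return v_new, h

v0 = rng.integers(0, 2, n_v).astype(float)   # random binary start state
v1, h1 = gibbs_step(v0)
print(v1, h1)
```

Chaining such sweeps yields samples from the joint distribution, which is what makes the RBM usable as a generative model.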
[Read PDF]

The Boltzmann machine and its variants have demonstrated excellent performance in several object classification and recognition tasks. However, these models involve a very large number of parameters, which complicates training and limits their use on devices with low memory capacity. An important research problem is how to compress the parameters to reduce storage requirements and, ultimately, the training time of the machine. This work focuses on storage compression for efficient inference in the restricted Gaussian Boltzmann machine. Different tensor formats for modeling the interaction terms, or weights, of a restricted Gaussian Boltzmann machine are proposed and investigated. The objective is to reduce the number of model parameters to save time and memory during training and prediction. It is shown that the weight matrix of the Boltzmann machine, as with several other neural network modules, is highly redundant, and that by limiting its matrix rank it is possible to reduce the number of parameters substantially without significantly decreasing the model's prediction rate. Particularly high performance is achieved with the so-called "matrix product operator" tensor format.
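The rank-limiting idea in the entry above can be sketched in a few lines: a rank-r factorization of an n_v × n_h weight matrix stores r·(n_v + n_h) numbers instead of n_v·n_h. This is a generic truncated-SVD illustration with made-up dimensions, not the paper's tensor formats.

```python
import numpy as np

# Generic illustration of low-rank weight compression (not the paper's
# tensor formats): a rank-r factorization W ~ U V shrinks the parameter
# count from n_v * n_h down to r * (n_v + n_h).
rng = np.random.default_rng(0)
n_v, n_h, r = 784, 512, 16

# Build a matrix that is exactly rank r, mimicking a redundant weight matrix
W = rng.standard_normal((n_v, r)) @ rng.standard_normal((r, n_h))

# Truncated SVD recovers the rank-r approximation
U, s, Vt = np.linalg.svd(W, full_matrices=False)
W_r = (U[:, :r] * s[:r]) @ Vt[:r, :]

dense_params = n_v * n_h            # 401408
lowrank_params = r * (n_v + n_h)    # 20736
print(dense_params, lowrank_params)
print(np.allclose(W, W_r))  # True: W was exactly rank r
```

The matrix product operator format mentioned in the abstract generalizes this idea by factorizing the weight tensor into a chain of small cores rather than a single two-factor product.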
[Read PDF]

Boltzmann Machines (BMs) and their variants have traditionally been used as initialization tools for supervised learning models, notably feedforward neural networks. In such settings, they mainly serve as feature extractors or pre-training mechanisms, without being directly employed for classification or clustering tasks. In previous work, we proposed a supervised model based on the Compact Interval Restricted Boltzmann Machine (CIRBM) to address this limitation partially. Despite its potential, a major obstacle remains: the intractability of the normalization constant (partition function) in the Boltzmann distribution, which hinders both inference and parameter estimation. This paper tackles the intractability of the partition function in the CIRBM framework by introducing and evaluating several estimation techniques. We assess the accuracy and computational efficiency of each method through extensive empirical studies. Among the proposed approaches, the Normalization Function Ratio (NFR) estimator demonstrates the best trade-off between precision and computational cost, offering narrow confidence intervals and consistent performance across various settings. Building upon these results, we develop a novel mixture model of Compact Interval Restricted Boltzmann Machines (MixCIRBM) for clustering applications. The model is trained via the Expectation-Maximization (EM) algorithm, combined with gradient-based optimization of the model parameters. Experimental results indicate that the proposed MixCIRBM model outperforms the classical Gaussian Mixture Model (GMM) and the K-means algorithm, especially in high-dimensional scenarios, demonstrating its effectiveness as a robust clustering tool.
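The intractability discussed in the entry above is easy to make concrete: even for a standard binary RBM (used here for illustration, not the CIRBM), computing the partition function exactly requires summing over every joint configuration of visible and hidden units, a count that grows as 2^(n_v + n_h). This toy brute-force sketch uses deliberately tiny, made-up layer sizes.

```python
import itertools
import numpy as np

# Toy illustration of why the partition function Z is intractable
# (standard binary RBM, not the paper's CIRBM): Z sums exp(-E(v, h))
# over all 2**(n_v + n_h) joint configurations.
rng = np.random.default_rng(1)
n_v, n_h = 4, 3  # deliberately tiny so the exact sum is feasible
W = 0.1 * rng.standard_normal((n_v, n_h))
b = 0.1 * rng.standard_normal(n_v)
c = 0.1 * rng.standard_normal(n_h)

def energy(v, h):
    """Standard RBM energy: E(v, h) = -v'Wh - b'v - c'h."""
    return -(v @ W @ h + b @ v + c @ h)

Z = sum(
    np.exp(-energy(np.array(v, dtype=float), np.array(h, dtype=float)))
    for v in itertools.product([0, 1], repeat=n_v)
    for h in itertools.product([0, 1], repeat=n_h)
)
n_terms = 2 ** (n_v + n_h)
print(f"Z = {Z:.3f} computed from {n_terms} terms")
```

At realistic sizes (say 784 visible and 500 hidden units) the sum has about 2^1284 terms, which is why estimators such as the NFR approach studied in the paper are needed.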
[Read PDF]

In insurance risk modeling, selecting relevant categorical variables is a crucial step in developing accurate, parsimonious, and interpretable predictive models. Categorical features present distinct challenges, including high cardinality, multicollinearity, and complex interactions with continuous variables. This study examines statistical and machine learning-based approaches for selecting categorical predictors in insurance risk models. Traditional methods such as chi-square testing, information value, and weight of evidence are evaluated alongside modern regularization techniques and tree-based importance metrics. Special attention is given to encoding strategies, feature engineering, and the handling of rare categories. Using real-world insurance datasets, we demonstrate how informed categorical variable selection improves both the performance and the interpretability of the resulting models. Our findings offer practical guidelines for actuaries, data scientists, and risk professionals seeking to enhance predictive modeling practices in finance.
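Two of the traditional screening tools named in the entry above, weight of evidence (WoE) and information value (IV), can be sketched briefly. The dataset below is entirely made up for illustration, and the smoothing constant is an assumption, not the study's exact procedure.

```python
import numpy as np
import pandas as pd

# Hypothetical sketch of weight-of-evidence (WoE) and information value
# (IV) screening for one categorical predictor, on a made-up claims dataset.
df = pd.DataFrame({
    "vehicle_type": ["sedan", "suv", "sedan", "truck", "suv", "truck",
                     "sedan", "suv", "truck", "sedan"],
    "claim":        [0, 1, 0, 1, 1, 0, 0, 1, 1, 0],  # 1 = claim filed
})

events = df.groupby("vehicle_type")["claim"].agg(["sum", "count"])
events["non_events"] = events["count"] - events["sum"]

# Per-category share of events / non-events, with a small smoothing
# constant to guard against empty cells
eps = 0.5
pct_event = (events["sum"] + eps) / (events["sum"].sum() + eps)
pct_non = (events["non_events"] + eps) / (events["non_events"].sum() + eps)

# WoE per category; IV aggregates the WoE into one predictor-level score
woe = np.log(pct_event / pct_non)
iv = ((pct_event - pct_non) * woe).sum()
print(woe.round(3))
print(f"IV = {iv:.3f}")  # higher IV => stronger predictor
```

Ranking predictors by IV and inspecting their WoE profiles is one of the classical actuarial screening steps the study compares against regularization- and tree-based alternatives.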
[View draft]