19–22 May 2025
Rimske Terme, Slovenia
Europe/Ljubljana timezone

When data are big in the "wrong" direction: identifying compact and informative distance measures in high-dimensional feature spaces

22 May 2025, 09:00
45m
Rimske Terme, Slovenia

Speaker

Alessandro Laio (Theoretical and Scientific Data Science Group, International School for Advanced Studies (SISSA), Trieste, Italy)

Description

Real-world data typically contain a large number of features that are often heterogeneous in nature, relevance, and units of measure. When assessing the similarity between data points, one can build various distance measures using subsets of these features. Finding a small set of features that still retains sufficient information about the dataset is important for the successful application of many machine learning approaches. We introduce an approach that can assess the relative information retained when using two different distance measures, and determine whether they are equivalent, independent, or whether one is more informative than the other. This test can be used to identify the most informative distance measure out of a pool of candidates. We will discuss applications of this approach to feature selection in molecular modeling, to the analysis of the representations of deep neural networks, and to causal inference in high-dimensional dynamic processes and time series.
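
The abstract does not spell out the statistic behind this comparison, so the following is only a minimal sketch under stated assumptions. It uses one possible rank-based measure: for each point, find its nearest neighbor under distance A and look up how that neighbor ranks under distance B. A small average rank suggests that A retains the information in B; comparing both directions hints at whether two distances are equivalent, independent, or one-sidedly informative. The function names (`pairwise_distances`, `neighbor_ranks`, `rank_imbalance`), the Euclidean metric, and the 2/N scaling are illustrative choices, not necessarily the speaker's method.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix between the rows of X (illustrative metric choice)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def neighbor_ranks(D):
    """ranks[i, j] = rank of point j among the neighbors of point i (self has rank 0)."""
    D = D.copy()
    np.fill_diagonal(D, -1.0)  # force each point to be its own rank-0 neighbor
    order = np.argsort(D, axis=1)
    ranks = np.empty_like(order)
    rows = np.arange(D.shape[0])[:, None]
    ranks[rows, order] = np.arange(D.shape[1])[None, :]
    return ranks

def rank_imbalance(X_a, X_b):
    """Mean rank in space B of each point's nearest neighbor in space A, scaled by 2/N.

    Values near 0 suggest A predicts B; values near 1 suggest A says little about B.
    """
    ranks_a = neighbor_ranks(pairwise_distances(X_a))
    ranks_b = neighbor_ranks(pairwise_distances(X_b))
    n = ranks_a.shape[0]
    nn_a = np.argmax(ranks_a == 1, axis=1)  # index of the nearest neighbor in A
    return 2.0 / n * ranks_b[np.arange(n), nn_a].mean()

# Synthetic demo (hypothetical data): two independent features, x and y.
rng = np.random.default_rng(0)
n = 500
full = rng.normal(size=(n, 2))  # distance built on both features
only_x = full[:, :1]            # distance built on the first feature alone

full_to_x = rank_imbalance(full, only_x)  # expected to be small
x_to_full = rank_imbalance(only_x, full)  # expected to be clearly larger
print(f"full -> x: {full_to_x:.3f}   x -> full: {x_to_full:.3f}")
```

In this toy setting the distance on both features should predict the single-feature distance much better than the reverse, which is the asymmetric case the abstract describes as one measure being more informative than the other.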

Alessandro Laio is a Full Professor in the Department of Physics at the International School for Advanced Studies (SISSA), Trieste, and a consultant to the Condensed Matter and Statistical Physics group at ICTP. His recent research has focused on algorithmic developments in unsupervised learning, data clustering, metric learning, and dimensionality reduction. He has also made pioneering contributions to improving the ability of computer simulations to make predictions for complex systems. His name is usually associated with groundbreaking algorithmic solutions for extracting essential features from complex data.

Primary author

Alessandro Laio (Theoretical and Scientific Data Science Group, International School for Advanced Studies (SISSA), Trieste, Italy)
