19–22 May 2025
Rimske Terme, Slovenia
Europe/Ljubljana timezone

Compressing AI Models at GPT Scale

21 May 2025, 09:00
45m
Rimske Terme, Slovenia

Speaker

Dan Alistarh (Institute of Science and Technology Austria (ISTA), Klosterneuburg, Austria)

Description

A key barrier to the wide deployment of highly-accurate machine learning models, whether for language or vision, is their high computational and memory overhead. Although we possess the mathematical tools for highly-accurate compression of such models, these elegant techniques require second-order information about the model’s loss function, which is hard to even approximate efficiently at the scale of billion-parameter models.
In this talk, I will describe our work on bridging this computational divide, which enables accurate second-order pruning and quantization of models at truly massive scale. Compressed using our techniques, models with billions and even trillions of parameters can be executed efficiently on GPUs or even CPUs, with significant speedups and negligible accuracy loss.
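As background for the second-order compression the abstract refers to, the classic Optimal Brain Surgeon rule scores each weight by the loss increase its removal would cause, using the inverse Hessian of the loss. A minimal sketch on a toy linear layer (illustrative only; this is not the speaker's actual method, which scales these ideas to billion-parameter models):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear layer: y = X @ w with squared loss, so the Hessian
# of the loss with respect to w is simply X^T X.
X = rng.normal(size=(200, 8))
w = rng.normal(size=8)

H = X.T @ X                                   # second-order information
H_inv = np.linalg.inv(H + 1e-6 * np.eye(8))   # damped inverse for stability

# OBS saliency: estimated loss increase from zeroing weight i.
saliency = w**2 / (2.0 * np.diag(H_inv))

# Prune the least salient weight and apply the OBS correction,
# which updates the remaining weights to compensate.
i = int(np.argmin(saliency))
w_pruned = w - (w[i] / H_inv[i, i]) * H_inv[:, i]
```

After the correction, the pruned weight is exactly zero while the remaining weights absorb part of the error, which is why second-order methods lose far less accuracy than naive magnitude pruning. The computational barrier mentioned in the abstract is that forming and inverting this Hessian is infeasible at billion-parameter scale without further approximation.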

Dan Alistarh is a Professor at the Institute of Science and Technology Austria (ISTA), near Vienna. Previously, he was a Visiting Professor at MIT and a Researcher at Microsoft, and he received his PhD from EPFL. His research is on algorithms for efficient machine learning and high-performance computing, with a focus on scalable DNN inference and training, for which he was awarded an ERC Starting Grant in 2018. In his spare time, he works with the ML research team at Neural Magic, a startup based in Boston, on making compression faster, more accurate, and more accessible to practitioners.

Primary author

Dan Alistarh (Institute of Science and Technology Austria (ISTA), Klosterneuburg, Austria)

Presentation materials

There are no materials yet.