"False"
Skip to content
printicon
Main menu hidden.

Mathematical Foundations of AI Seminar

The Seminars in Mathematical Foundations of Artificial Intelligence are open to employees and students of Umeå University.

Subscribe to the seminar e-mail list

Subscribe to mathfoundations-ai-seminar@lists.umu.se for notification of future seminars.

Send an email to sympa@lists.umu.se and in the subject/heading of the email write: subscribe mathfoundations-ai-seminar

Leave the body of the email blank.

Workshop

14 May 2025, 13:00 - 16:45

13:00 - 13:30: Aron Persson, Uppsala University

Title: A Glimpse of Intrinsic Statistics on Manifolds with an Application to Signal Processing on the Sphere

Abstract: This talk provides an introduction and overview of intrinsic statistics on manifolds, concluding with an application to signal processing on the sphere.
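
As a rough illustration of what "intrinsic" means here (not part of the talk itself), the sketch below estimates the Fréchet mean of points on the unit sphere by Riemannian gradient descent on the sum of squared geodesic distances; all function names and parameter choices are illustrative.

import numpy as np

def geodesic_distance(p, q):
    # Great-circle distance between unit vectors p and q on the sphere.
    return np.arccos(np.clip(np.dot(p, q), -1.0, 1.0))

def log_map(p, q):
    # Riemannian logarithm at p: tangent vector at p pointing towards q.
    d = geodesic_distance(p, q)
    if d < 1e-12:
        return np.zeros(3)
    v = q - np.dot(p, q) * p          # project q onto the tangent plane at p
    return d * v / np.linalg.norm(v)

def exp_map(p, v):
    # Riemannian exponential at p: follow the geodesic with initial velocity v.
    n = np.linalg.norm(v)
    if n < 1e-12:
        return p
    return np.cos(n) * p + np.sin(n) * v / n

def frechet_mean(points, steps=100, lr=0.5):
    # Minimise sum_i d(p, x_i)^2 by stepping along the mean log-map direction.
    p = points[0]
    for _ in range(steps):
        grad = np.mean([log_map(p, q) for q in points], axis=0)
        p = exp_map(p, lr * grad)
    return p

rng = np.random.default_rng(0)
pts = rng.normal(size=(20, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
print(frechet_mean(pts))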

13:45 - 14:15: Fredrik Ohlsson, Umeå University

Title: Neural ODEs Beyond Manifolds

Abstract: An important class of models in geometric deep learning is based on the flows generated by vector fields. The ODEs corresponding to these flows describe the dynamics of information propagating through the model, and training amounts to learning the vector field that minimizes the loss in some machine learning application. Neural ODEs have been used to construct powerful generative models that can account for both the geometry and the symmetries of a problem, e.g., to generate 3D protein structures. In this talk I will discuss the current NODE framework on manifolds and in particular some open problems that prompt us to revisit the role of geometry in these deep learning models: adding more geometric structure to the manifolds or even moving beyond the manifold concept itself.
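
For readers unfamiliar with the NODE setup, a minimal sketch (illustrative only, plain NumPy with forward Euler integration) of a flow generated by a parameterised vector field might look as follows; the network architecture and step size are arbitrary choices, not those of the talk.

import numpy as np

# A small parameterised vector field f_theta : R^d -> R^d (one hidden layer, tanh).
def vector_field(x, theta):
    W1, b1, W2, b2 = theta
    return W2 @ np.tanh(W1 @ x + b1) + b2

# The neural ODE maps x(0) to x(T) by integrating dx/dt = f_theta(x);
# here with a simple forward Euler scheme.
def node_forward(x0, theta, T=1.0, n_steps=50):
    x, dt = x0.copy(), T / n_steps
    for _ in range(n_steps):
        x = x + dt * vector_field(x, theta)
    return x

d, h = 2, 16
rng = np.random.default_rng(0)
theta = (rng.normal(size=(h, d)) * 0.1, np.zeros(h),
         rng.normal(size=(d, h)) * 0.1, np.zeros(d))
print(node_forward(np.array([1.0, 0.0]), theta))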

14:30 - 15:00: Andreas Granath, Umeå University

Title: Low Rank Methods for Wave Dominated Problems

Abstract: Many problems of interest in engineering, ranging from electromagnetic to elastic and acoustic phenomena, can be modelled by wave equations. It is well known that solving wave equations accurately requires a high resolution in the spatial discretization. To decrease the overall computational cost of the discretization, we propose a low rank framework utilizing a singular value decomposition (SVD) to represent the numerical solution in 2D and tensor trains (TT) in 3D. Specifically, we focus on underwater acoustics, modelled by the Helmholtz equation in the frequency domain. We demonstrate how the SVD and TT representations can be combined with an iterative finite difference solver, decreasing the computational cost compared to a full rank solver, while maintaining the desired convergence and accuracy of the underlying numerical method.
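
As a minimal illustration of the low-rank idea (not the speaker's actual solver), the snippet below compresses a smooth 2D field with a truncated SVD and reports the error; the grid size and rank are arbitrary.

import numpy as np

# A smooth 2D "solution" on a fine grid.
n = 512
x = np.linspace(0.0, 1.0, n)
X, Y = np.meshgrid(x, x, indexing="ij")
U_full = np.sin(4 * np.pi * X) * np.cos(3 * np.pi * Y) + np.exp(-(X - 0.5) ** 2)

# Truncated SVD: keep only the r largest singular values.
r = 10
U, s, Vt = np.linalg.svd(U_full, full_matrices=False)
U_lowrank = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

storage_full = n * n
storage_lr = r * (2 * n + 1)
err = np.linalg.norm(U_full - U_lowrank) / np.linalg.norm(U_full)
print(f"relative error {err:.2e}, storage {storage_lr}/{storage_full} entries")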

15:00 - 15:30: FIKA

15:30 - 16:00: Oskar Nylén Nordenfors, Umeå University

Title: Group Equivariant Models and Training with Data Augmentation

Abstract: When training a neural network to perform a task with known symmetries, one would like the neural network to respect those symmetries, that is, to be equivariant. We compare two strategies for achieving equivariance: 1) restricting the network architecture to be equivariant by design; 2) training the neural network using data augmentation. We find the same set of equivariant stationary points for both strategies, and we see that the space of equivariant models is an invariant set of the augmented gradient flow. Analogously, for ensembles of neural networks we see that if the networks are invariantly initialized, then the ensemble mean is equivariant. In this talk I will highlight some of the techniques we used to obtain these results. The talk is based on joint work with Fredrik Ohlsson and Axel Flinth.
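
A toy version of the two strategies (illustrative only, not taken from the paper): for a finite group acting on the input, strategy 2 averages the loss over the group orbit of each sample, while an equivariant model can be obtained, for example, by symmetrising an arbitrary one.

import numpy as np

# Group: cyclic shifts of a length-n vector (acting by np.roll).
n = 6
group = range(n)

def model(x, theta):
    # Generic (non-equivariant) linear model.
    return theta @ x

def augmented_loss(theta, x, y):
    # Strategy 2: average the loss over the group orbit of (x, y).
    losses = [np.sum((model(np.roll(x, g), theta) - np.roll(y, g)) ** 2)
              for g in group]
    return np.mean(losses)

def symmetrised_model(x, theta):
    # Strategy 1 (one way to enforce it): average conjugated copies of the model,
    # which yields a shift-equivariant map by construction.
    outs = [np.roll(model(np.roll(x, g), theta), -g) for g in group]
    return np.mean(outs, axis=0)

rng = np.random.default_rng(0)
theta = rng.normal(size=(n, n))
x = rng.normal(size=n)
# Equivariance check: shifting the input shifts the output.
print(np.allclose(symmetrised_model(np.roll(x, 2), theta),
                  np.roll(symmetrised_model(x, theta), 2)))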

16:00 - 16:45: Emma Andersdotter Svensson, Umeå University

Title: A Bundle Formalism for Equivariant Neural ODEs

Abstract: Previous work by, e.g., M. Weiler et al. has shown that using a bundle formalism to describe equivariant CNNs is a useful way to capture the underlying mathematical structures. There, feature maps are described as sections of an associated bundle of a homogeneous vector bundle. In my talk, I will introduce a way to use the bundle formalism for equivariant manifold neural ODEs – a neural network model in which the network is defined by a vector field describing how the data evolves continuously in time. By considering neural ODEs on a homogeneous space, we can define a lift of the solution curve to an associated bundle, making it possible to extend previous formulations in terms of parallel transport. I will also describe how this formulation may generalize our previous formulation, in which vector fields are transformed by neural ODEs through the pushforward.

28 March 2025, 15:15-16:15

On the Geometry and Optimization of Polynomial Convolutional Networks

Speaker: Vahid Shahverdi, KTH

Abstract: In this talk, I will explore the rich interplay between algebraic geometry and convolutional neural networks (CNNs) with polynomial activation functions. At the heart of this study is the parameterization map, which translates network parameters into functions. We show that this map is a regular morphism and an isomorphism almost everywhere, up to scaling symmetries. The image of this map, which we call the “neuromanifold,” exhibits intricate geometric properties. I will discuss its dimension, degree, and singularities, and their implications for the learning process. Beyond structural insights, I will highlight a connection between the geometry of the neuromanifold and optimization: for large generic datasets, we compute the number of critical points that arise during training with a quadratic loss function, using tools from metric algebraic geometry. This is joint work with Giovanni Luca Marchetti and Kathlén Kohn.
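
To make the parameterisation map concrete (a toy construction, not the one analysed in the talk), consider a two-layer 1D convolutional network with a quadratic activation: each choice of filter weights determines a polynomial function of the input, and rescalings of the parameters leave that function unchanged, so the map is an isomorphism only up to such scaling symmetries.

import numpy as np

def conv1d(x, w):
    # Valid 1D cross-correlation (no padding).
    k = len(w)
    return np.array([np.dot(w, x[i:i + k]) for i in range(len(x) - k + 1)])

def poly_cnn(x, w, v):
    # Two-layer CNN with quadratic activation:
    # convolve with w, square componentwise, convolve with v.
    return conv1d(conv1d(x, w) ** 2, v)

rng = np.random.default_rng(0)
x = rng.normal(size=10)
w, v = rng.normal(size=3), rng.normal(size=2)

# The parameterisation map sends (w, v) to the polynomial function poly_cnn(., w, v).
# A scaling symmetry in parameter space: (c*w, v/c^2) defines the same function.
c = 1.7
print(np.allclose(poly_cnn(x, w, v), poly_cnn(x, c * w, v / c**2)))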

13 March 2025, 16:00-16:45

Equivariant Neural Tangent Kernels

Speaker: Philipp Misof, Chalmers University of Technology

Abstract: In recent years, the neural tangent kernel (NTK) has proven to be a valuable tool to study training dynamics of neural networks (NN) analytically. In this talk, I will present how this NTK framework can be extended to equivariant NNs based on group convolutional NNs (GCNNs). Not only does this enable the analytic study of the influence of hyperparameters, training biases, etc. in equivariant NNs, but it also allows us to draw an interesting connection between data augmentation and manifestly equivariant architectures. In particular, we show that the mean predictions of an ensemble of data-augmented non-equivariant networks coincide with the mean predictions of an ensemble of specific GCNNs at all training times in the infinite-width limit. We further provide explicit implementations of the equivariant NTK for roto-translations in the plane and 3D rotations. To evaluate the performance of the equivariant infinite-width solution, we benchmark the models on quantum mechanical property prediction and medical image classification. This talk is based on joint work with Jan Gerken and Pan Kessel.
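
For reference (a generic sketch, not the equivariant construction of the talk), the empirical NTK of a network f_theta is the Gram matrix of parameter gradients, Theta(x, x') = <grad_theta f_theta(x), grad_theta f_theta(x')>; below it is approximated with finite-difference gradients for a tiny scalar-output network, with all sizes chosen arbitrarily.

import numpy as np

def net(x, theta):
    # Tiny scalar-output network; theta is a flat parameter vector of length 12.
    W1 = theta[:8].reshape(4, 2)
    w2 = theta[8:12]
    return np.dot(w2, np.tanh(W1 @ x))

def param_grad(x, theta, eps=1e-6):
    # Finite-difference gradient of net(x, .) with respect to theta.
    g = np.zeros_like(theta)
    for i in range(len(theta)):
        e = np.zeros_like(theta); e[i] = eps
        g[i] = (net(x, theta + e) - net(x, theta - e)) / (2 * eps)
    return g

def empirical_ntk(xs, theta):
    # NTK entry (i, j) = inner product of parameter gradients at x_i and x_j.
    grads = [param_grad(x, theta) for x in xs]
    return np.array([[np.dot(gi, gj) for gj in grads] for gi in grads])

rng = np.random.default_rng(0)
theta = rng.normal(size=12) * 0.5
xs = [rng.normal(size=2) for _ in range(3)]
print(empirical_ntk(xs, theta))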

Place: MIT.A.356

6 March 2025, 14:15-14:55

Learning-Based Surrogate Models for the Fluid Dynamics of a Pharmaceutical Bioreactor

Speaker: Umut Kaya, Daiichi Sankyo Europe GmbH / University of Ghent

Abstract: We developed learning-based surrogate models to predict fluid dynamics in pharmaceutical bioreactors. Traditional CFD simulations take too long to run, making real-time process optimization impractical. By using machine learning, including graph neural networks and reduced-order modeling, we built models that provide fast and accurate predictions of hydrodynamic stress and mixing behavior. These surrogate models significantly cut down computational costs while maintaining reliability, making them valuable for biopharmaceutical process development. Our work bridges the gap between physics-based modeling and data-driven approaches, helping improve bioprocess design, monitoring, and control.

Place: MIT.A.356

13 February 2025, 15:15-16:15

Communication-efficient distributed optimization algorithms

Speaker: Laurent Condat, King Abdullah University of Science and Technology (KAUST)

Abstract: In distributed optimization and machine learning, a large number of machines perform computations in parallel and communicate back and forth with a distant server. Communication can be costly and slow, in particular in federated learning. To address the communication bottleneck, two strategies are popular: 1) communicate less frequently; 2) compress the communicated vectors. In addition, a robust algorithm should allow for partial participation. I will present several randomized algorithms we developed recently in this area, with proven convergence guarantees and state-of-the-art communication complexity.
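
As a small illustration of strategy 2 (a generic sketch, not one of the speaker's algorithms), the snippet below implements an unbiased rand-k sparsifier of the kind commonly used to compress gradients before communication; the vector size and k are arbitrary.

import numpy as np

def rand_k(v, k, rng):
    # Unbiased rand-k compressor: keep k random coordinates, rescale by d/k
    # so that E[rand_k(v)] = v; only k values (plus indices) are transmitted.
    d = len(v)
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(v)
    out[idx] = v[idx] * (d / k)
    return out

rng = np.random.default_rng(0)
g = rng.normal(size=1000)          # e.g. a local gradient on one machine
compressed = rand_k(g, k=100, rng=rng)

# Unbiasedness check: the average of many compressed copies approaches g.
avg = np.mean([rand_k(g, 100, rng) for _ in range(2000)], axis=0)
print(np.linalg.norm(avg - g) / np.linalg.norm(g))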

Place: MIT.A.346
