Machine Learning Day 2023 on 17 May
Date and time: 17 May 2023, 10:00 – 16:00 CEST (UTC +2)
Title: Machine Learning Day 2023
Where: F2, Lindstedtsvägen 26 & 28, floor 2, Sing-Sing, KTH Campus
Directions: https://goo.gl/maps/hUxXm65trj7hzacY7
Link to registration: https://www.kth.se/form/643951c3fb06ea3cb92b5c50
The main objective behind this event is to map the rich landscape of machine learning research at KTH, Stockholm University and RISE and inform each other about collaborative opportunities. We invite Digital Futures faculty, their colleagues, industrial and societal partners, postdocs, and PhD students who want to share their machine learning research interests and network.
Invitation to share a poster presentation! Digital Futures faculty, their colleagues, postdocs and PhD students who would like to share their research via a poster presentation are invited to submit an abstract via the following form: https://www.kth.se/form/643951c3fb06ea3cb92b5c50. The submission deadline is 5 May. Notifications of acceptance will be communicated by 10 May.
Please register for the event using the link above to secure your spot.
The event programme features four keynote lectures and a poster session. The poster session will host posters representing the landscape of Machine Learning research within Digital Futures.
PROGRAMME
10:15 – 10:20 Welcome and introduction by Aristides Gionis, Alexandre Proutiere and Martina Scolamiero
10:20 – 11:10 KEYNOTE 1: Understanding Linear Convolutional Neural Networks via Sparse Factorizations of Real Polynomials by Kathlén Kohn, KTH
11:10 – 12:00 KEYNOTE 2: Is it easier to count communities than find them? by Fiona Skerman, Uppsala University
12:00 – 13:00 LUNCH at Syster & Bror (lunch coupons will be distributed)
13:00 – 13:50 KEYNOTE 3: Repurpose, Reuse, Recycle the building blocks of Machine Learning by Gianmarco De Francisci Morales, CENTAI, Turin
13:50 – 14:40 KEYNOTE 4: Similarity-based Link Prediction from Modular Compression of Network Flows by Martin Rosvall, Umeå University
14:40 – 16:00 POSTER SESSION
End of Day
KEYNOTE TALKS
Kathlén Kohn – Understanding Linear Convolutional Neural Networks via Sparse Factorizations of Real Polynomials
Abstract: This talk will explain that Convolutional Neural Networks without activation parametrize semialgebraic sets of real homogeneous polynomials that admit a certain sparse factorization. We will investigate how the geometry of these semialgebraic sets (e.g., their singularities and relative boundary) changes with the network architecture. Moreover, we will explore how these geometric properties affect the optimization of a loss function for given training data. This talk is based on joint work with Guido Montúfar, Vahid Shahverdi, and Matthew Trager.
Bio: Kathlén Kohn has been a WASP-AI/Math assistant professor at KTH since September 2019 and a docent in Mathematics since 2021. Her research investigates the intrinsic geometric structures in data science and AI with algebraic methods. Kathlén obtained her PhD from TU Berlin in 2018. Afterwards, she was a postdoctoral researcher at the Institute for Computational and Experimental Research in Mathematics (ICERM) at Brown University and at the University of Oslo.
She has received various national and international awards and scholarships, such as the Best Student Paper Award at the IEEE/CVF International Conference on Computer Vision 2019, a Marie Skłodowska-Curie Fellowship in 2019, the Small Göran Gustafsson Prize for Young Researchers at UU/KTH in 2021, and the L’Oréal-Unesco For Women in Science Sweden Prize in 2023. She is co-PI of the WASP NEST “3D Scene Perception, Embeddings, and Neural Rendering”.
Fiona Skerman – Is it easier to count communities than find them?
Abstract: Random graph models with community structure have been extensively studied. For both the problems of detecting and recovering community structure, an interesting landscape of statistical and computational phase transitions has emerged. A natural unanswered question is: might it be possible to infer properties of the community structure (for instance, the number and sizes of communities) even in situations where actually finding those communities is believed to be computationally hard? We show the answer is no. In particular, we consider certain hypothesis-testing problems between models with different community structures. We show in the low-degree polynomial framework that testing between two options is as hard as finding the communities. Our methods give the first computational lower bounds for testing between two different “planted” distributions, whereas previous results have considered testing between a planted distribution and an i.i.d. “null” distribution. Joint work with Cynthia Rush, Alex Wein and Dana Yang.
Bio: Fiona Skerman is an Assistant Professor at Uppsala University. She is interested in probability and combinatorics, in particular random graphs and questions concerning community structure in networks. Fiona completed doctoral studies at Oxford, supervised by Colin McDiarmid, where her thesis ‘Modularity of Networks’ was awarded the Corcoran prize for best thesis in the Statistics Department. Since then, she has been at Bristol University, Masaryk University, the Simons Institute, UC Berkeley and Uppsala.
Gianmarco De Francisci Morales – Repurpose, Reuse, Recycle the building blocks of Machine Learning
Abstract: The field of Machine Learning has made remarkable progress and has achieved capabilities that were once unimaginable. This progress has resulted in the development of a vast knowledge base and a multitude of valuable tools. In this presentation, we will explore how the basic building blocks of machine learning can be repurposed and reused for entirely different objectives. Specifically, we will examine two examples: the VC dimension, a theoretical metric for model complexity, and automatic differentiation, a practical cornerstone of deep learning. I will demonstrate how these artifacts can be applied to achieve objectives that are unrelated to their original intended use. For instance, the VC dimension can be employed to create approximation algorithms, while automatic differentiation can be used for learning agent-based models. By presenting these concrete examples, I hope to inspire individuals to experiment with the fundamental components of machine learning and devise innovative ways to repurpose them.
Bio: Gianmarco De Francisci Morales is a Principal Researcher at CENTAI, a private research institute focusing on Artificial Intelligence and Complex Systems sciences, where he leads the Social Algorithmics Team (SALT). Previously, he worked as a Senior Researcher at ISI Foundation in Turin, as a Scientist at Qatar Computing Research Institute in Doha, as a Visiting Scientist at Aalto University in Helsinki, as a Research Scientist at Yahoo Labs in Barcelona, and as a Research Associate at ISTI-CNR in Pisa. He received his PhD in Computer Science and Engineering from the IMT Institute for Advanced Studies of Lucca in 2012. His research focuses on computational social science and scalable data mining, with an emphasis on polarization on social media and Web mining. He is a member of the open source community of the Apache Software Foundation, has worked on the Hadoop ecosystem, and has been a committer for the Apache Pig project. He was one of the lead developers of Apache SAMOA, an open-source platform for mining big data streams. He regularly serves on the program committees of several major conferences in the area of data mining, including WSDM, WWW, KDD, and ICWSM. He co-organized the workshop series on Social News on the Web (SNOW), co-located with the WWW conference. He has published more than 90 scientific articles and won best paper awards at WSDM, CHI, WebSci, and SocInfo.
Martin Rosvall – Similarity-based Link Prediction from Modular Compression of Network Flows
Abstract: Node similarity scores are a foundation for machine learning in graphs for clustering, node classification, anomaly detection, and link prediction with applications in biological systems, information networks, and recommender systems. Recent works on link prediction use vector space embeddings to calculate node similarities in undirected networks with good performance. Still, they have several disadvantages: limited interpretability, need for hyperparameter tuning, manual model fitting through dimensionality reduction, and poor performance from symmetric similarities in directed link prediction. We propose MapSim, an information-theoretic measure to assess node similarities based on modular compression of network flows. Unlike vector space embeddings, MapSim represents nodes in a discrete, non-metric space of communities and yields asymmetric similarities in an unsupervised fashion. We compare MapSim on a link prediction task to popular embedding-based algorithms across 47 networks and find that MapSim’s average performance across all networks is more than 7 percent higher than its closest competitor, outperforming all embedding methods in 11 of the 47 networks. Our method demonstrates the potential of compression-based approaches in graph representation learning, with promising applications in other graph-learning tasks.
Bio: Martin Rosvall was born in 1978 in Uppsala, Sweden, and grew up in a small village north of Umeå, Sweden. He studied Engineering Physics at Umeå University and earned his PhD from the Niels Bohr Institute in Copenhagen in 2006 with a thesis on modeling information flows in complex systems. He then did a postdoc in the biology department at the University of Washington in Seattle. In 2009, he returned to Umeå and established his research group. In 2011, Martin became an associate professor, and in 2019, a professor of physics with a focus on computational science. He heads the Integrated Science Lab (IceLab), an interdisciplinary hub at Umeå University with researchers from fields as diverse as mathematics, physics, ecology, biology, and computer science.
Contact persons:
- Aristides Gionis: argioni@kth.se
- Alexandre Proutiere: alepro@kth.se
- Martina Scolamiero: scola@kth.se