Federated Data Preparation, Learning, and Debugging in Apache SystemDS
Sebastian Baunsgaard, Matthias Boehm, Kevin Innerebner, and 7 more authors
In Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022
Federated learning allows training machine learning (ML) models without central consolidation of the raw data. Variants of such federated learning systems enable privacy-preserving ML and address data ownership and/or sharing constraints. However, existing work mostly adopts data-parallel parameter-server architectures for mini-batch training, requires manual construction of federated runtime plans, and largely ignores the broad variety of data preparation, ML algorithms, and model debugging. Over the last few years, we extended Apache SystemDS with an additional federated runtime backend for federated linear-algebra programs, federated parameter servers, and federated data preparation. In this paper, we describe the system-level compiler and runtime integration, new features such as multi-tenant federated learning, selected federated primitives, multi-key homomorphic encryption, and our monitoring infrastructure. Our demonstrator showcases how composite ML pipelines can be compiled into federated runtime plans with low overhead.
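To illustrate the idea of federated linear-algebra programs, here is a minimal sketch using the systemds Python API, assuming the sds.federated(addresses, ranges) entry point and a federated worker already running at the listed address and serving the given file; the address, file path, and dimensions below are placeholders, and exact signatures may differ across SystemDS versions.

```python
from systemds.context import SystemDSContext

# Placeholder worker address and file; a SystemDS federated worker must
# already be running there and serving this CSV file.
addresses = ["localhost:8001/temp/X1.csv"]
# One coordinate range per worker: ([row_begin, col_begin], [row_end, col_end]).
ranges = [([0, 0], [100, 10])]

with SystemDSContext() as sds:
    # Declare a federated matrix; the raw data never leaves the worker.
    X = sds.federated(addresses, ranges)
    # Ordinary linear algebra on X is compiled into a federated runtime plan;
    # only the aggregate (here, the sum) is returned to the coordinator.
    print(X.sum().compute())
```

The point of the demonstrator is that the script itself stays declarative: the same operations work on local or federated data, and the compiler generates the federated runtime plan transparently.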