Keynote Speakers

Manzoor Mohammed

Talk title: How the cloud made performance on board agenda!

Cloud spending is skyrocketing! Gartner predicts a 20% surge to $678.8 billion in 2024, making it a top expense after personnel for many organisations. In fact, 78% of US businesses and 54% in EMEA already leverage the cloud for diverse needs, from infrastructure and storage to development. However, a crucial question lingers: are we maximising our cloud investments? This year, boards are demanding answers. Thankfully, by delving deeper into cloud performance characteristics, we can unlock valuable insights. This keynote will explore how understanding performance empowers organisations to extract the full potential of their cloud, transforming cost into strategic advantage.

Short Bio

Manzoor Mohammed, a cloud veteran with over 25 years of experience, co-founded Capacitas and serves as its Chief Innovation Officer. He has a proven track record of delivering millions in annual cloud spend reductions and optimising the performance of complex systems, and he has helped companies achieve their financial goals through strategic cloud management.

Manzoor recognised the transformative power of cloud computing early on, understanding how it strengthens the link between performance and cost. This realisation led him to create Capacitas' unique methodology, which has empowered numerous clients, including easyJet, Skype, JAGGAER, Ancestry, Cegid, and BMC Software, to achieve significant cost reductions and performance improvements.

Manzoor is a trusted advisor and thought leader in the cloud computing space, passionate about helping businesses leverage the cloud to achieve their full potential.

Jane Hillston

Talk title: What does performance mean for large language models?

In the last decade there has been a significant leap in the capability of foundation AI models, largely driven by the introduction and refinement of transformer-based machine learning architectures. The most visible consequence of this has been the explosion of interest in, and application of, large language models such as ChatGPT. This is one exemplar of how a foundation model trained on a huge amount of data can be specialised for a particular task, often by a phase of reinforcement learning with human feedback.

Within the AI community “performance” of such systems is generally taken to mean how well they respond to their users on characteristics such as accuracy, verifiability, and bias. Performance analysis, by contrast, usually considers both the responsiveness of a system to its user and the efficiency and equity of resource use. These foundation models rely on massive amounts of resource, but there appears to have been little work considering how to understand the resource use or the trade-offs that exist between how the system responds to users and the amount of resource used.

In this talk I will present initial ideas of what it could mean to develop a framework of performance evaluation for foundation models such as large language models. Such a framework would need to take into consideration the distinct phases of operation for these models, which broadly speaking can be categorised as training, generating and fine-tuning. Evaluating the trade-off between user interests and resource management will require the identification of suitable metrics. Resources in such systems are more than simply compute, storage and bandwidth; data and even human resources also play crucial roles in training and fine-tuning. I will discuss all these topics.

Short Bio

Jane Hillston is Professor of Quantitative Modelling in the School of Informatics and Dean of Research Culture in the College of Science and Engineering at the University of Edinburgh. Her research is concerned with formal approaches to modelling dynamic behaviour of discrete event systems. This includes everything from cloud computing, to biological processes, to transport systems in smart cities. Her research has been recognised by a number of awards including the RSE Lord Kelvin Medal, the BCS Lovelace Award and Fellowship of the Royal Society. She was Head of the School of Informatics from 2018 to 2023 and Deputy Vice Principal Research from 2020 to 2022. Professor Hillston was awarded an MBE in the King's Birthday Honours List in 2023 in recognition of her contribution to computer science and women in science.

Giuliano Casale

Talk title: Optimizing Edge AI: Performance Engineering in Resource-Constrained Environments

Recent years have witnessed the growth of Edge AI, a transformative paradigm that integrates neural networks with edge computing, bringing computational intelligence closer to end users. However, this innovation is not without its challenges, especially in environments with limited computing, network, and memory resources, where resource-hungry AI models often need to be partitioned for distributed execution. This issue becomes even more acute in scenarios where post-deployment updates are infeasible or costly, creating a need to reason accurately about the interplay between resource constraints and Quality-of-Service (QoS) in Edge AI systems, so as to design and operate them optimally.

In this keynote talk, I will focus on these challenges, discussing QoS management and deployment problems arising in Edge AI systems. I will review mechanisms such as early exits and DNN partitioning that are distinctive of this problem space, explaining how they could be accounted for and leveraged in system performance and reliability tuning. I will then illustrate how design decisions and the definition of novel runtime control algorithms can be guided by approaches based on both traditional analytical models and emerging data-driven methods based on machine learning models.

Short Bio

Giuliano Casale is a Reader in the Department of Computing at Imperial College London. He does research in Quality-of-Service engineering and cloud computing, topics on which he has published more than 150 refereed papers. He has served as program co-chair for several conferences in the area of performance and reliability engineering, such as ACM SIGMETRICS/Performance and IEEE/IFIP DSN. His research work has received best paper awards at ACM SIGMETRICS, IEEE/IFIP DSN and IEEE INFOCOM. From 2019 to 2023, he served as ACM SIGMETRICS chair. He serves on the editorial board of ACM TOMPECS and as the Editor-in-Chief of Elsevier Performance Evaluation.