The Summer Research Institute (SuRI) is a premier
venue to discuss recent research results. Each year we invite the top faculty in
a given topic to present their research in front of an interested crowd of
students, faculty, researchers, fellows from EPFL, the Vaud area, Switzerland,
and abroad.
This year, our focus was security, systems, and formal methods. This distinct
system stint allowed us to explore upcoming problems in consensus, reliability,
distributed systems, verification, and cyber defense. Over two days we heard
from ten amazing speakers complemented by mentoring and poster sessions and a
neat walking tour through Lausanne.

Natacha Crooks: Scaling Decentralized Trust with Databases
Natacha started with a background on byzantine fault tolerance and how it
applies to distributed and shared ledgers.
Basil, the fault tolerance system from her group consists of three core
components: concurrency control to detect concurrent conflicting operations on
replicas, the atomic commit component that ensures consistency across replicas and shards
and the fallback component that ensures byzantine participants cannot stall
honest clients.
Following the description of Basil, Natacha introduced Pesto which runs a query
engine on top of Basil that runs complex SQL benchmarks without code
modifications at 3-5x better performance than previous approaches.
Interestingly, they demonstrated that the system is only 10-35% slower than
classic systems proving great performance.
Fault tolerance matters and append-only ledgers are just databases.
Aurojit Panda: Runtime Verification of Distributed Systems
We have protocol specifications but how do we prove the correctness of a given
implementation? Key is that one needs to specify the protocol in a way that aids
the search.
A key finding is that static analysis can enable lightweight runtime
verification that does not affect failure properties and has low runtime
overhead.
The system cannot detect bugs outside of the specification or performance bugs,
detectable bugs require visibility of the bug at runtime and instantiatability.
The biggest challenge remains the lack of specification and corresponding
unawareness of what it means for a program to be correct.
Peter Pietzuch: Improving Cloud Security with Hardware Memory Capabilities
Clouds have performance/isolation trade-offs and require strong isolation
between tenants. Workloads require efficient sharing between isolation domains
for performance and to limit overhead. There's redundant memory contents at all
levels. The goal of Peter's research therefore is to rethink the cloud stack
with new hardware or custom-tailored hardware extensions.
One goal is to break down isolation barriers by enabling memory capabilities.
That way, systems can avoid the MMU tax and reduce the cost of address
translation. Another angle is the focus on de-privileging system software and
small TCBs by breaking systems into smaller parts. The underlying goal is to
minimize disruption to existing applications by accepting changes to the
privileged systems.
The two papers that Peter presented were CAP-VMs from OSDI'22 where they go into
details to enable strong isolation with fine-grained data sharing and ORC from
OSDI'23 where they look at concrete memory deduplication with semantic object
reuse.
Lesly-Ann Daniel: Hardware-software co-designs for microarchitectural security
Mitigations are facing a hard time under micro-architectural side channels. The
complexity of modern processors opens up a large area of potential attack
vectors. Lesly-Ann focused on recent works that address these side channels.
She presented ProSpeCT from Usenix SEC'23, a defense against SPECTRE that
hardens constant code against SPECTRE attacks. The challenge of SPECTRE
mitigations is that one must enumerate all possible branches that could be
vulnerable to SPECTRE and harden them individually. For constant code, the
software should not be responsible for tracking potential speculation. The goal
is to prove that software is constant time and then use hardware-software
contracts to ensure underlying guarantees.
After lessons learned about protecting against SPECTRE, Lesly-Ann focused on
securing constant time execution through different compiler extensions. At a
high level, Libra from CCS'24 balances executions for complex processors. The
key idea is to balance execution that will leak the same set of secrets by going
through the code in a carefully balanced way that orchestrates
micro-architectural state carefully.
Vyas Sekar: Enabling AI-based Autonomous Security Operations
Due to the complexity, it is extremely challenging to evaluate the security of
autonomous systems. As a case study, Vyas and his team implemented a mock
environment for the Equifax system. Key challenge was to model the different
defenses and detection systems which resulted in a complex configuration of
mangled systems.
After discussing the design and MHBench system to evaluate the cyber range, Vyas
talked about deception insights, novel LLM-based attackers, robust and stealthy
attackers, and opportunities for LLM-based defenders.
In systems, there's only three novel ideas: use an abstraction layer, use a
level of indirection, or create a cache.
Vyas discussed MHBench as a general platform to implement different strategies
for attackers and deception based on static and dynamic properties and a variety
of capabilities such as decoy hosts or files along with various attack
strategies that mimic APTs and replicate real-world settings.
To evaluate the MHBench system, they used LLMs and existing pen testing queries
to try to exfiltrate data from the network. Interestingly, LLMs were unable to
do end-to-end multistage attacks. This was mostly due to the LLMs having to
generate lots of coherent low level code and its inability to keep track of
state at that level. But using high-level abstractions, they were able to
guide the LLM to produce an end-to-end attack. Their tool Incalmo is open source and available to try.
Going forward, a key question will be how to leverage these results to scale up
the evaluation and how to leverage this to test your systems.
Bozidar Radunovic: The future of mobile cellular connectivity is programmable
Cellular networks have become complex and run multiple layers of software on
diverse hardware. One key difference is that for wireless, you can communicate
with several clients at the same time due to separation in the frequency domain.
Bozidar went into details on how to build and deploy such a network, including
running large parts on an open-source stack. Even though the systems are
open-source, one of the challenges is that the code is also complex which makes
it difficult to change individual aspects without breaking other tangential
features.
To tackle this complexity, Bozidar introduced eBPF-based hooks that allow him to
measure diverse properties of the network. Similar to the Linux kernel, these
hooks allow introspection without requiring heavy changes of the underlying
system. Using customizable hooks (codelets), they managed to analyze diverse
interference attacks using these lightweight probes as a sample application. But
many other use-cases are doable.
Looking at data, interestingly many base stations have very low utilization,
including during peak times. There's an immense opportunity at saving energy
when switching off idle cells and ensuring coverage through nearby cells.
Challenge here is that RAN software has real-time deadlines (with 10s of ms),
CPU load can experience spikes at ms granularity, and RAN software is a black
box. This setup makes it challenging to optimize in this environment.
Key message: cellular networks are exciting, it's easy to build a network
yourself. Small changes can have a huge impact and network programmability can
enable next-gen apps!
Ivan De Oliveira Nunes: Provable Execution in Real-Time Embedded Systems
Embedded systems are used in diverse contexts, interact with the environment
through sensors and signal status to other systems. The security properties on
deeply embedded systems is challenging as code often runs bare-metal without any
security mitigations. Ivan's key question is if it's possible to guarantee
some level of service for embedded systems even if part of the embedded system
is compromised.
Protecting embedded systems is hard as there is generally no virtual memory
abstraction and all applications run, along with the OS, in the same address
space. Newer embedded systems may allow some secure enclaves that share the same
CPU but allows doing security critical operations.
In a nutshell, they allow potentially dangerous interferences to go through but
log everything using a monitor in the trusted enclave. It only works because of
the passive checks guaranteeing integrity (but not availability). A trusted
verifier can then assess if the execution was correct.
Tianyin Xu: Software Reliability in Emerging Cloud Computing Paradigms
Tianyin gave a great overview and deep dive into software reliability. I sadly
missed this talk due to the organization of the walking tour.
Ana Klimovic: Rethinking Cloud System Software and Abstractions for True Elasticity in the Cloud-Native Era
Last but not least, Ana talked about performance optimizations for cloud
applications. She especially highlighted performance optimizations in her
Dandelion system and how to further improve cloud workloads and systems.
Similarly as for Tianyin, I sadly missed parts of her talk due to admin duties.
Concluding thoughts
Overall SuRI was a great success. We had over 100 attendees, 10 speakers, and 5
organizers attend amazing talks and ask lots of questions. We do hope that the
discussions at SuRI will trigger lots of new collaborations and look forward to
the next years!
