Texas Systems Seminar

Welcome to the Texas Systems Seminar series. This is a series of technical talks on software systems and infrastructure, given by companies or researchers based in Austin. The talks center on the technical challenges the speaker is facing or solving (these are not marketing talks). Speakers are usually senior technical folks, such as staff engineers. Coffee will be provided at the event.

Aim

The aim of this talk series is to bring together and connect the technical community in Austin, and to strengthen the connections between companies in Austin and the UT Computer Science department.

Intended Audience

The talks will be open to the public, and will be attended by a mix of undergraduate, master's, and PhD students. A significant portion of the audience will be from the computer science department.

Mailing List

If you would like to know about new talks, please subscribe to the mailing list texas-systems-seminar@utlists.utexas.edu. You can subscribe here.

Indicating Interest

Are you interested in presenting at this talk series? Please fill out this form. We will get in touch with you.

Where

The talks will take place at the Gates Dell Complex, the home of the UT Computer Science department. Please check the talk details to find the room for the talk.

When

The talks will be held on a monthly basis, usually on a Friday. Check this space to know when the talks are finalized!

Parking

Guests can park at the San Jacinto parking garage. It is about a 10-minute walk to the department.

Schedule

TBD Orchestrating Agent Communication: Scalable State Management for Long-Running Processes via the Actor Model and Saga Pattern

Speaker: David B. Shabat, Vice President - Research and Development, Quali

Time: 2 pm

Room: GDC 6.302

Abstract.

Modern intelligent agent systems rely on efficient, robust coordination to handle complex, asynchronous interactions and long-running processes. Traditional synchronous communication methods and centralized databases often create bottlenecks and points of failure, hindering system scalability and agent communication quality.
This overview introduces a system architecture that leverages the Actor Model and the Saga design pattern to build highly resilient, high-throughput agent-to-agent communication platforms. We address the challenges of managing state and ensuring data consistency across multiple, independent services during extended operations (e.g., multi-step negotiations, task handoffs). The Actor Model serves as the foundational computational paradigm, providing isolated, concurrent units of behavior and state that communicate exclusively via asynchronous message passing. This inherently distributed approach eliminates shared memory conflicts and enables massive parallelization of agent interactions.
To manage the complexity of distributed transactions within this model, we implement the Saga pattern. A saga is modeled as a sequence of local transactions, coordinated by a dedicated “orchestrator” actor. This orchestrator manages the global state of the long-running process and utilizes compensating transactions to ensure eventual consistency and robust failure recovery without requiring cumbersome two-phase commits.
This session will provide an industry-oriented abstract view of:
* The Actor Model in practice: How asynchronous messaging and actor isolation improve communication efficiency and system responsiveness.
* Saga implementation: Using orchestrator actors to define and manage complex, multi-agent workflows.
* Resilience and state management: Techniques for persisting actor state to survive failures and guarantee process completion.

Speaker Bio.

David B. Shabat is the Vice President - Research and Development at Quali. He has several decades of experience in industry across companies such as Intel, Sisense, BitDam, and Quali. He received his Bachelor's and Master's degrees from Reichman University.

Jan 16, 2026 MegaSort – Tuning A Billion+ Value Sorting Algorithm on Modern Hardware

Speaker: Conor Cunningham, Microsoft

Time: 2 pm

Room: GDC 6.302

Abstract.

While most students learn sorting algorithms in introductory computer science classes, many of these algorithms were created decades ago before the evolution of CPUs into multi-core, multi-level cache systems with super-scalar execution pipelines including complex branch prediction and memory prefetching implementations to hide latency. While traditional Big-O analysis can identify algorithms that will perform sorting faster or slower at a logical level, there is a different set of considerations and challenges when trying to get algorithms to perform optimally in practice on a current server CPU. We created MegaSort, an AVX-512-based quicksort/bitonic sort based on the Brahmas quicksort, as an intellectual exercise to validate next-generation chips from processor vendors with a focus on measuring and understanding the microarchitectural details of each vendor’s SIMD implementation so we can plan future work in Microsoft’s database engines. It can sort 1 billion signed 64-bit integers in under 30 seconds and can often go far faster than that (depending on data distribution). This talk covers how to build a sort to run on modern hardware, the challenges that must be overcome in dealing with current-generation microarchitecture behaviors, and the tools (e.g., VTune, uProf) that modern processor vendors make available so that anyone can do evaluations of any implementation where performance is a goal.

Speaker Bio.

Conor completed his B.S. in Computer Science at UT Austin in 1996. Later, he completed a Master's degree, also in Computer Science, at the University of Washington in Seattle. He has worked at Microsoft for 27 years building database engines with specializations in query optimization/search algorithms, distributed systems, and more recently low-level query performance on modern hardware. He has led large-scale projects such as the releases of SQL Server 2016, 2017, and 2019, and he now works on the analytics engine inside of SQL’s engine, which is used in SQL Server, Azure SQL Database, and in Fabric Data Warehouse (a scale-out analytics engine).

Feb 20, 2026 Disaggregated Shared Memory is Coming

Speaker: John Groves, Technical Director, Micron

Time: 2 pm

Room: GDC 6.516

Abstract.

The Compute Express Link (CXL) standard enables disaggregated memory, both for composable capacity-on-demand and memory shared by multiple servers. Now DRAM capacity can be provisioned within a rack or cluster via a fabric manager, either private or shared. This has led to interesting work in memory tiering, as well as memory sharing. Adding memory for non-shared use is straightforward – it is brought online as if hot-plugged – but disaggregated shared memory requires some new abstractions. One such abstraction is the Fabric-Attached Memory File System (famfs). Famfs is open source software that organizes shared memory as a file system; reading or writing a famfs file is just a ‘memcpy()’, and a memory-mapped famfs file is byte addressable and accessed at cache line granularity – just like conventional DRAM.
Brief outline
* Brief introduction to CXL
* Overview of disaggregated memory topologies
* Comments on cache coherency
* Introduction to the Fabric-Attached Memory File System (famfs) and shared memory use cases

Speaker Bio.

John Groves has been a kernel and system software developer for decades, working on memory management, file systems and data storage in several Unix variants prior to Linux. John serves the CXL Consortium as co-chair of the Software and Systems Working Group (SSWG), and is a contributor to the CXL specification - particularly in the areas of fabric management and sharable memory devices. John is also the creator and primary author of famfs, which is progressing toward inclusion in the Linux kernel. John has spoken on famfs at the last three Linux Plumbers Conferences (2023-2025) as well as the Linux Storage, Filesystem and Memory Management (LSFMM) summits in 2024 and 2025, USENIX FAST in 2025 (famfs poster), the SNIA Developers Conference (2025) and the Massive Storage Systems Technology (MSST) conference in 2025. John received a BS degree in Physics in 1985 from Northern Illinois University.

March 13, 2026 Architecting Resilience at Scale: From Research to Practice

Speaker: Sudhanva Gurumurthi, AMD

Time: 2 pm

Room: GDC 6.516

Abstract.

Computing must be reliable. From a computer architecture perspective, achieving this goal begins with understanding the root causes of faults and applying systematic, quantitative methods to improve the resilience of hardware components. This talk will illustrate this approach through two case studies. The first describes research that led to a new resilience architecture for die-stacked DRAM that was adopted into the third generation of the JEDEC High-Bandwidth Memory standard (HBM3) and incorporated in GPUs and AI accelerators deployed at scale today in data centers. The second focuses on techniques for designing and testing high-performance CPUs to improve resilience to faults arising from silicon defects. Together, these examples highlight how principled reliability research can translate into practical impact.

Speaker Bio.

Sudhanva Gurumurthi is a Fellow at AMD, where he is responsible for research and advanced development in Reliability, Availability, and Serviceability (RAS). His work has impacted numerous AMD products, multiple industry standards, and external research in the field. Before joining industry, he was an Associate Professor with tenure in the Computer Science Department at the University of Virginia. Sudhanva is the recipient of an NSF CAREER Award and a Google Focused Research Award, and has been named to the ISCA Hall of Fame. He currently serves as the Editor-in-Chief of IEEE Computer Architecture Letters. Sudhanva received his PhD in Computer Science and Engineering from Penn State in 2005.