The Sched app lets you build your schedule, but it is not a substitute for event registration. You must be registered for Open Source in Finance Forum London 2026 to participate in the sessions. If you have not registered but would like to join us, please visit the event registration page to purchase a ticket.
The FINOS AI Governance Framework (AIGF) identifies non-deterministic behavior as a core operational risk but stops short of prescribing how to fix it. Two research papers offer contradictory answers: one says setting temperature to zero is enough, the other says you need specialized GPU kernels. They cannot both be right, and for workflows governed by SR 11-7, the Federal Reserve's model risk management guidance, the answer determines whether LLMs can be deployed at all.
This talk follows our 16,000-call reproduction study across consumer and enterprise GPUs to resolve these claims. We started with a 2025 study claiming that small models achieve perfect determinism and found that its results did not replicate. Moving to H100s with vLLM gave different answers, until we toggled a single infrastructure setting.
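To make the idea of a reproduction study concrete, here is a minimal sketch of how repeated-call agreement might be measured. It is not the harness used in the study: `measure_determinism` and `fake_model` are hypothetical names, and `fake_model` is a stand-in for a real inference call (for example, a request to an inference server with temperature set to zero).

```python
import hashlib
from collections import Counter
from typing import Callable


def measure_determinism(generate: Callable[[str], str], prompt: str, runs: int) -> dict:
    """Call `generate` repeatedly on one prompt and summarize output agreement."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    # Hash each output so long completions can be compared and counted cheaply.
    digests = Counter(
        hashlib.sha256(generate(prompt).encode("utf-8")).hexdigest()
        for _ in range(runs)
    )
    most_common_count = digests.most_common(1)[0][1]
    return {
        "runs": runs,
        "distinct_outputs": len(digests),
        # Share of runs that produced the most frequent output.
        "agreement_rate": most_common_count / runs,
        "deterministic": len(digests) == 1,
    }


def fake_model(prompt: str) -> str:
    # Hypothetical stand-in: a real study would call an LLM endpoint here.
    return prompt.upper()


report = measure_determinism(fake_model, "tier 1 capital ratio", runs=5)
print(report)
```

Under this sketch, a configuration would count as deterministic for a prompt only if every run hashes to the same digest; anything less shows up as more than one distinct output and an agreement rate below 1.0.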
We present our findings as a reference architecture for deterministic LLM inference, mapped to the AIGF's risk taxonomy. Using Basel III Pillar 3 document extraction as our test case, we show which infrastructure choices (batch-invariant kernels, inference engine configuration, hardware selection) mitigate the non-determinism risk that the AIGF catalogs but does not yet solve.