Full Implementation Paths, Physics, and Procedures
sequenceDiagram
participant GT as Game Thread
participant PL as Plugin Layer
participant K as Kernel (DLL)
GT->>PL: Trigger LoadModel (Multiple Paths)
Note right of PL: Entering Background AsyncTask
PL->>K: Map model file with mmap
K->>K: Execute GPU VRAM Pre-allocation
K-->>PL: Return Engine Handle
PL-->>GT: Broadcast OnModelLoaded Signal
For performance purists: manage the loading thread and the model lifecycle yourself.
// Manual Async Setup
AsyncTask(ENamedThreads::AnyBackgroundThreadNormalTask, [Config]() {
FLiteRtLmUnrealApi::LoadModel(Config);
});
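The snippet above starts the load in the background but does not show how the result gets back to the caller, which the diagram describes as a returned engine handle followed by an OnModelLoaded broadcast. As a rough illustration of that round trip, here is a minimal sketch in standard C++ rather than Unreal code. FDemoConfig, FDemoHandle, BlockingLoad, LoadAsync, and PollLoaded are all hypothetical names invented for this example and are not part of the plugin.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-ins for the plugin's config and engine handle.
struct FDemoConfig { std::string ModelPath; };
struct FDemoHandle { bool bValid = false; };

// Stand-in for the blocking load call; here it only checks the path is set.
inline FDemoHandle BlockingLoad(const FDemoConfig& Config) {
    return FDemoHandle{!Config.ModelPath.empty()};
}

// Start the load on a worker thread and return to the caller immediately.
inline std::future<FDemoHandle> LoadAsync(FDemoConfig Config) {
    return std::async(std::launch::async,
                      [Config = std::move(Config)] { return BlockingLoad(Config); });
}

// Meant to be called once per frame on the calling thread; it never blocks.
// Returns true exactly once, when the handle has been moved into Out.
inline bool PollLoaded(std::future<FDemoHandle>& Pending, FDemoHandle& Out) {
    if (!Pending.valid()) return false;
    if (Pending.wait_for(std::chrono::seconds(0)) != std::future_status::ready) {
        return false;
    }
    Out = Pending.get();  // get() invalidates the future
    return true;
}
```

In Unreal, the hop back is commonly done with a second AsyncTask that targets ENamedThreads::GameThread instead of polling a future. How the plugin itself raises OnModelLoaded is not shown here.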
Use ULiteRtLmSubsystem for global lifecycle management.
// Load via the engine-wide subsystem, which maintains model state automatically
if (ULiteRtLmSubsystem* Subsystem = GEngine ? GEngine->GetEngineSubsystem<ULiteRtLmSubsystem>() : nullptr)
{
    Subsystem->LoadModel(MyConfig);
}
Attach a ULiteRtLmComponent to an actor and enable bAutoInit.
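Why does loading a 2 GB model appear instant? Because mmap copies nothing up front. It only tells the OS where the file is and maps it into the process's virtual address space. Pages are read from disk on demand, the first time computation touches the corresponding weights.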
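To make the lazy-mapping idea concrete, here is a minimal POSIX sketch of a read-only file mapping. It is an illustration only and not the plugin's loader. MappedFile, MapFileReadOnly, and Unmap are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper type: a read-only view of a file on disk.
struct MappedFile {
    const unsigned char* Data = nullptr;
    std::size_t Size = 0;
};

// Map a file read-only. mmap returns once the virtual address range is set
// up; file contents are paged in later, when the bytes are first accessed.
inline MappedFile MapFileReadOnly(const char* Path) {
    assert(Path != nullptr);
    MappedFile Result;
    const int Fd = open(Path, O_RDONLY);
    if (Fd < 0) return Result;

    struct stat Info;
    if (fstat(Fd, &Info) == 0 && Info.st_size > 0) {
        const std::size_t Size = static_cast<std::size_t>(Info.st_size);
        void* Addr = mmap(nullptr, Size, PROT_READ, MAP_PRIVATE, Fd, 0);
        if (Addr != MAP_FAILED) {
            Result.Data = static_cast<const unsigned char*>(Addr);
            Result.Size = Size;
        }
    }
    close(Fd);  // the mapping remains valid after the descriptor is closed
    return Result;
}

// Release the mapping and reset the view to empty.
inline void Unmap(MappedFile& File) {
    if (File.Data != nullptr) {
        munmap(const_cast<unsigned char*>(File.Data), File.Size);
    }
    File = MappedFile{};
}
```

The cost of the mapping call does not depend on file size, which is why the load step returns quickly. The I/O cost is paid later, as page faults on first access. On Windows the equivalent calls are CreateFileMapping and MapViewOfFile.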
Once started, the plugin reserves MaxNumTokens * stride of VRAM up front, which keeps memory use stable during long dialogues.
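The size of that reservation is a single multiplication. The sketch below works through it under one loud assumption: that stride is the number of bytes reserved per token, which the text does not define. ReservedBytes and the example numbers are hypothetical and are not plugin defaults.

```cpp
#include <cstdint>

// Assumption: StrideBytes is the per-token reservation in bytes.
constexpr std::uint64_t ReservedBytes(std::uint64_t MaxNumTokens,
                                      std::uint64_t StrideBytes) {
    return MaxNumTokens * StrideBytes;
}

// Hypothetical numbers: 4096 tokens at 128 KiB per token reserve 512 MiB.
static_assert(ReservedBytes(4096, 128 * 1024) == 512ull * 1024 * 1024,
              "4096 tokens * 128 KiB should equal 512 MiB");
```

Because the reservation grows linearly with MaxNumTokens, doubling the token budget doubles the VRAM set aside at startup.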