Process: Persistent Inference

01 / Background Compute Handshake

            sequenceDiagram
                participant GT as Game Thread
                participant K as Plugin Kernel (Thread)
                participant GPU as GPU hardware
                
                K->>K: Entering Blocking Loop (WaitUntilDone)
                loop Token Generation
                    K->>GPU: Dispatch Compute Instructions
                    GPU-->>K: Return Logits
                    K->>GT: Trigger OnChunk (Async Marshalling)
                end
                GT->>GT: Concurrent Game Logic & Rendering

02 / Monitoring & Actions

Observation: Hardware Telemetry

Game code should poll hardware status regularly so it can back off defensively before resources run out.

// Standard polling pattern
const float AvailableVramMB = ULiteRtLmSubsystem::QueryAvailableVramMB();
if (AvailableVramMB < 512.0f) // Safety break: less than 512 MB of VRAM available
{
    InterruptInference();
}
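
The polling pattern above can be wrapped so that it runs at a fixed interval and fires the safety break only once. The following is a minimal sketch in standard C++, not the plugin's implementation: `FVramGuard`, its poll interval, and the two callbacks are hypothetical names introduced for illustration. In a real project the query callback would forward to `ULiteRtLmSubsystem::QueryAvailableVramMB()` and the interrupt callback to `InterruptInference()`.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical helper (not part of the plugin): polls a VRAM query at a fixed
// interval and fires an interrupt callback once when headroom gets too low.
class FVramGuard
{
public:
    FVramGuard(std::function<float()> InQueryMB,
               std::function<void()> InInterrupt,
               float InMinFreeMB = 512.0f,
               float InPollIntervalSec = 0.25f)
        : QueryMB(std::move(InQueryMB))
        , Interrupt(std::move(InInterrupt))
        , MinFreeMB(InMinFreeMB)
        , PollIntervalSec(InPollIntervalSec)
    {
        assert(QueryMB && Interrupt); // both callbacks are required
    }

    // Call once per frame with the frame's delta time in seconds.
    void Tick(float DeltaSec)
    {
        Accumulated += DeltaSec;
        if (bTripped || Accumulated < PollIntervalSec) { return; }
        Accumulated = 0.0f;
        if (QueryMB() < MinFreeMB)
        {
            bTripped = true; // fire the safety break only once
            Interrupt();
        }
    }

    bool HasTripped() const { return bTripped; }

    // Re-arm the guard, for example before starting a new inference run.
    void Reset() { bTripped = false; Accumulated = 0.0f; }

private:
    std::function<float()> QueryMB;
    std::function<void()> Interrupt;
    float MinFreeMB;
    float PollIntervalSec;
    float Accumulated = 0.0f;
    bool bTripped = false;
};
```

Throttling the poll keeps the query off the per-frame hot path, and the one-shot latch avoids interrupting an inference run that has already been stopped.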

Drive: Internal Async Pump

The plugin manages the background tasks itself. Game code mainly manages the handles.

// Internal: AsyncTask(ENamedThreads::AnyBackgroundThreadNormalTask, ...) wraps the WaitUntilDone loop.
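
To make the pump idea concrete, here is a minimal sketch in standard C++ rather than Unreal's task graph. It is an illustration under stated assumptions, not the plugin's code: `FInferencePump`, `GenerateNext`, and `DrainOnGameThread` are hypothetical names, `std::thread` stands in for the `AsyncTask` wrapper, and the blocking `GenerateNext` callback stands in for the `WaitUntilDone` loop.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the plugin's pump: a worker thread runs the
// blocking generation loop and buffers chunks for the game thread.
class FInferencePump
{
public:
    // GenerateNext blocks until the next chunk is ready and returns false
    // once generation has finished.
    explicit FInferencePump(std::function<bool(std::string&)> GenerateNext)
    {
        assert(GenerateNext);
        Worker = std::thread([this, Gen = std::move(GenerateNext)]
        {
            std::string Chunk;
            while (Gen(Chunk))
            {
                std::lock_guard<std::mutex> Lock(Mutex);
                Pending.push_back(std::move(Chunk));
                Chunk.clear(); // leave the moved-from string in a known state
            }
        });
    }

    ~FInferencePump() { Join(); }

    // Wait for the worker to finish generating.
    void Join()
    {
        if (Worker.joinable()) { Worker.join(); }
    }

    // Called on the game thread once per frame. It only swaps a buffer under
    // a short lock, so it never waits on the GPU.
    std::vector<std::string> DrainOnGameThread()
    {
        std::vector<std::string> Local;
        std::lock_guard<std::mutex> Lock(Mutex);
        Local.swap(Pending);
        return Local;
    }

private:
    std::mutex Mutex;
    std::vector<std::string> Pending;
    std::thread Worker; // declared last so the buffer outlives the thread
};
```

The game thread would call `DrainOnGameThread()` each frame and hand every returned chunk to its `OnChunk` handler. The only shared state is the pending buffer, so the lock is held just long enough to push or swap.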

03 / Related Concepts

ASYNC: Async Pumping

Why are there no frame drops? The plugin drives the kernel from a background pump, so the blocking WaitUntilDone loop never runs on the game thread. The game thread only receives pushed results, which decouples compute from rendering.

TELEMETRY: DXGI Passthrough

Bypass the RHI and query DXGI directly to get the true GPU memory budget. This is the engineering basis for keeping local-AI games stable.
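
The source does not say how the available-VRAM figure is derived from the DXGI budget. One plausible reading, shown here purely as an assumption, is budget minus current usage, using the `Budget` and `CurrentUsage` fields that `IDXGIAdapter3::QueryVideoMemoryInfo` reports in `DXGI_QUERY_VIDEO_MEMORY_INFO`. The struct below mirrors only those two fields so the arithmetic compiles without Windows headers. `FVideoMemoryInfo` and `AvailableVramMB` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the two DXGI_QUERY_VIDEO_MEMORY_INFO fields used below.
// Both values are in bytes.
struct FVideoMemoryInfo
{
    std::uint64_t Budget = 0;       // what the OS currently grants this process
    std::uint64_t CurrentUsage = 0; // what the process is using right now
};

// Hypothetical helper: remaining headroom in MB, clamped to zero when the
// process is already at or over its budget.
inline float AvailableVramMB(const FVideoMemoryInfo& Info)
{
    if (Info.CurrentUsage >= Info.Budget) { return 0.0f; }
    const std::uint64_t FreeBytes = Info.Budget - Info.CurrentUsage;
    return static_cast<float>(static_cast<double>(FreeBytes) / (1024.0 * 1024.0));
}
```

Clamping matters because the OS can lower the budget at runtime, which can leave current usage above it. An unsigned subtraction would otherwise wrap around to a huge value.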