Message Normalization, Incremental Sync, and Multi-path Input
sequenceDiagram
participant GT as Game Thread
participant PL as Plugin Layer
participant K as Kernel (VRAM)
GT->>PL: Send Message Array (Messages)
PL->>PL: Role Normalization (Gemma Support)
PL->>PL: Delta Check (Incremental)
PL->>K: Push Physical Token Delta
Note right of K: Background Inference Starts
PL-->>GT: Immediate Return (Non-blocking)
Direct API path: for scenarios that require custom history management and sampling.
// Physical Pointer Binding + Delta Push
FLiteRtLmUnrealApi::SendChatRequest(AgentPtr, Messages, ...);
Component path: encapsulated via ULiteRtLmComponent, which handles AgentPtr hashing automatically.
// Simple BP or C++ call
MyBrainComponent->SendChatMessage(UserMessageText);
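Only tokens that have not already been pushed to the kernel are sent on each call. Because the KV cache persists between calls, earlier turns are not prefilled again, which significantly reduces prefill latency in long conversations.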
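The delta check can be pictured as a prefix comparison between what the kernel already holds and the newly tokenized conversation. The following is a minimal sketch in plain C++, not the plugin's actual code: `TokenId`, `Delta`, and `ComputeTokenDelta` are hypothetical names, and the reset-on-mismatch behavior is an assumption about what a plugin would need to do when earlier history has been edited.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical token type; the plugin's real token representation is not shown in the source.
using TokenId = int;

// Result of comparing the cached token sequence with the new full sequence.
struct Delta {
    std::vector<TokenId> tokens; // tokens that still need to be pushed
    bool needsReset;             // true if the cache no longer matches the history
};

// Assumed logic: if `cached` is a prefix of `full`, only the suffix is pushed.
// Otherwise the history diverged, so the whole sequence must be prefilled again.
Delta ComputeTokenDelta(const std::vector<TokenId>& cached,
                        const std::vector<TokenId>& full) {
    const std::size_t limit = std::min(cached.size(), full.size());
    std::size_t common = 0;
    while (common < limit && cached[common] == full[common]) {
        ++common;
    }
    if (common == cached.size()) {
        // Cache is fully reusable: push only the new suffix.
        const auto start = full.begin() + static_cast<std::ptrdiff_t>(common);
        return Delta{std::vector<TokenId>(start, full.end()), false};
    }
    // Cache diverged from the new history: discard it and push everything.
    return Delta{full, true};
}
```

With a cache holding `{1, 2, 3}` and a new sequence `{1, 2, 3, 4, 5}`, this sketch returns only `{4, 5}`, so the prefill cost tracks the length of the new turn rather than the length of the whole conversation.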
The plugin automatically merges system prompts into the first user message. This role normalization keeps the message array compatible with lightweight models such as Gemma that do not handle a separate system role.
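As an illustration of this role normalization, here is a minimal sketch in plain C++. It is not the plugin's implementation: `ChatMessage` and `MergeSystemIntoFirstUser` are hypothetical names, and both the blank-line separator and the fallback for a conversation with no user message are assumptions.

```cpp
#include <string>
#include <vector>

// Hypothetical message type; the plugin's real message struct is not shown in the source.
struct ChatMessage {
    std::string role;    // "system", "user", or "assistant"
    std::string content;
};

// Folds every system message into the first user message and drops the system role.
std::vector<ChatMessage> MergeSystemIntoFirstUser(const std::vector<ChatMessage>& messages) {
    std::string systemText;
    std::vector<ChatMessage> out;
    for (const ChatMessage& m : messages) {
        if (m.role == "system") {
            // Collect system prompts in order, separated by a blank line (assumed separator).
            if (!systemText.empty()) {
                systemText += "\n\n";
            }
            systemText += m.content;
        } else {
            out.push_back(m);
        }
    }
    if (systemText.empty()) {
        return out; // nothing to merge
    }
    for (ChatMessage& m : out) {
        if (m.role == "user") {
            m.content = systemText + "\n\n" + m.content;
            return out;
        }
    }
    // Assumed fallback: no user message exists, so the system text becomes a user turn.
    out.insert(out.begin(), ChatMessage{"user", systemText});
    return out;
}
```

Returning a new array instead of editing in place leaves the caller's history untouched, which matters if the game thread keeps its own copy of the conversation.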