
LiteRtLm_SamplingParams

Sampling and constraint parameter structure. Controls the creativity of text generation (Temperature/Top-P) and hard constraints on output format (JSON/Regex).

01. Role Context in Generation Loop

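LiteRtLm_SamplingParams plays two roles in the generation loop. It sets how varied the output is (temperature and top-p), which determines the "personality" of the AI, and, through the llguidance integration layer, it forces the output to conform to a specific schema (such as JSON).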

02. Member Variable Details

Member	Type	Description & Constraints
`temperature`	`float`	Sampling temperature. Range [0.0, 2.0]. 0.0 selects greedy decoding, which gives deterministic results; 1.0 is standard creativity (the model's distribution is used unscaled).
`top_p`	`float`	Nucleus sampling. Range [0.0, 1.0]. Samples only from the smallest set of highest-probability candidates whose cumulative probability reaches `top_p`. Usually set to 0.9.
`max_tokens`	`int`	Generation limit. Generation stops after this many tokens, regardless of whether an end-of-sequence (EOS) token was generated.
`constraint_type`	`int`	Hard constraint type. 0: no constraint | 1: Regex | 2: JSON Schema | 3: Lark grammar.
`constraint_string`	`const char*`	Constraint description string. Holds the regex, JSON Schema, or Lark grammar that corresponds to `constraint_type`.
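
To make the `top_p` row concrete, the following is a minimal sketch of how a nucleus cutoff can be computed. It is not LiteRT-LM code: `nucleus_size` and `compare_desc` are hypothetical helpers written for this illustration, and the runtime's actual sampler may differ in its details.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Orders floats from largest to smallest for qsort. */
static int compare_desc(const void *a, const void *b) {
    float fa = *(const float *)a;
    float fb = *(const float *)b;
    return (fa < fb) - (fa > fb);
}

/*
 * Hypothetical helper (not part of the LiteRT-LM API).
 * Given n token probabilities that sum to 1, returns how many of the
 * most likely tokens are needed before their cumulative probability
 * reaches top_p. Sampling would then be restricted to those tokens.
 * Returns 0 if n is 0 or the scratch buffer cannot be allocated.
 */
static size_t nucleus_size(const float *probs, size_t n, float top_p) {
    if (n == 0) {
        return 0;
    }
    float *sorted = malloc(n * sizeof *sorted);
    if (sorted == NULL) {
        return 0;
    }
    memcpy(sorted, probs, n * sizeof *sorted);
    qsort(sorted, n, sizeof *sorted, compare_desc);

    size_t count = n; /* fall back to keeping every candidate */
    float cumulative = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        cumulative += sorted[i];
        if (cumulative >= top_p) {
            count = i + 1;
            break;
        }
    }
    free(sorted);
    return count;
}
```

For the probabilities {0.5, 0.3, 0.15, 0.05}, a `top_p` of 0.9 keeps the three most likely candidates (cumulative probability 0.95) and excludes the last one.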

Note: When constraint_type > 0, the time to first token may increase slightly, because the llguidance grammar state machine has to be initialized before generation starts.
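
The documented ranges and constraint codes can be checked before a request is issued. The sketch below is one possible way to do that, under stated assumptions: `ExampleSamplingParams` is a stand-in that mirrors the member table (real code would use LiteRtLm_SamplingParams from the library header), the `EXAMPLE_CONSTRAINT_*` names and `example_validate_params` are hypothetical, and the rule that a constraint requires a non-empty string is an assumption rather than documented behavior.

```c
#include <stddef.h>

/* Hypothetical names for the documented constraint_type codes. */
enum {
    EXAMPLE_CONSTRAINT_NONE = 0,
    EXAMPLE_CONSTRAINT_REGEX = 1,
    EXAMPLE_CONSTRAINT_JSON_SCHEMA = 2,
    EXAMPLE_CONSTRAINT_LARK = 3
};

/* Stand-in that mirrors the member table; not the library's own type. */
typedef struct {
    float temperature;
    float top_p;
    int max_tokens;
    int constraint_type;
    const char *constraint_string;
} ExampleSamplingParams;

/*
 * Returns NULL when the parameters fall within the documented ranges,
 * otherwise a short description of the first problem found.
 */
static const char *example_validate_params(const ExampleSamplingParams *p) {
    if (p == NULL) {
        return "params is NULL";
    }
    /* The negated comparisons also reject NaN. */
    if (!(p->temperature >= 0.0f && p->temperature <= 2.0f)) {
        return "temperature must be in [0.0, 2.0]";
    }
    if (!(p->top_p >= 0.0f && p->top_p <= 1.0f)) {
        return "top_p must be in [0.0, 1.0]";
    }
    if (p->constraint_type < EXAMPLE_CONSTRAINT_NONE ||
        p->constraint_type > EXAMPLE_CONSTRAINT_LARK) {
        return "constraint_type must be 0, 1, 2, or 3";
    }
    /* Assumption: an active constraint needs a non-empty description. */
    if (p->constraint_type != EXAMPLE_CONSTRAINT_NONE &&
        (p->constraint_string == NULL || p->constraint_string[0] == '\0')) {
        return "constraint_string is required when constraint_type > 0";
    }
    return NULL;
}
```

Returning a message instead of a bare error code keeps the caller's logging simple; a production version might prefer an enum so callers can branch on the failure.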

03. C Language Usage Example (JSON Constraint)

LiteRtLm_SamplingParams params = {0};
params.temperature = 0.0f; // 0 temperature is recommended for structured output
params.max_tokens = 512;

// Enable hard JSON Schema constraint
params.constraint_type = 2; 
params.constraint_string = "{\"type\": \"object\", \"properties\": {\"name\": {\"type\": \"string\"}}}";

// Apply to inference (conversation and MyCallback are assumed to be defined elsewhere)
LiteRtLm_RunInference(conversation, params, MyCallback, NULL);