abortSignal (optional)
Optional abort signal to stop the generation. This can also be used to stop during prompt processing (with a bit of delay).

nPredict (optional)

sampling (optional)

stopTokens (optional)
List of custom token IDs for stopping the generation. Note: to convert from text to a token ID, use lookupToken().

stream (optional)
If true, return an AsyncIterable instead of a string.

useCache (optional)
Equivalent to the cache_prompt option in the llama.cpp server. Useful for chat, because it skips evaluating the history part of the conversation.
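A minimal sketch of how these options combine in a call. createCompletion() and lookupToken() stand in for the completion and token-lookup functions the library exposes; their exact names and signatures here are assumptions (declared inline rather than imported), and '</s>' is a placeholder stop piece.

```typescript
// Sketch only: createCompletion() and lookupToken() are assumed stand-ins for the
// library's real API; only the option names come from the list above.
declare function createCompletion(
  prompt: string,
  options: {
    nPredict?: number;          // no description given above; assumed to cap generated tokens
    sampling?: object;
    stopTokens?: number[];
    stream?: boolean;
    useCache?: boolean;
    abortSignal?: AbortSignal;
  }
): Promise<string>;
declare function lookupToken(piece: string): number;

const controller = new AbortController();

const output = await createCompletion('Why is the sky blue?', {
  nPredict: 128,
  stopTokens: [lookupToken('</s>')],   // stop once this token ID is generated
  useCache: true,                      // skip re-evaluating the shared history prefix
  abortSignal: controller.signal,      // controller.abort() stops generation (and prompt processing, with a delay)
});
console.log(output);
```

With stream: true, the same call would instead return an AsyncIterable, which can be consumed with for await...of rather than awaiting a single string.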