Optional abortSignal
Optional abort signal to stop the generation. This can also be used to stop during prompt processing (with a bit of delay).

Optional nPredict
Maximum number of tokens to predict.

Optional sampling
Sampling configuration (temperature, top-k, top-p, and so on).

Optional stopTokens
List of custom token IDs for stopping the generation. Note: to convert from text to a token ID, use lookupToken().

Optional stream
If true, return an AsyncIterable instead of a string.

Optional useCache
Equivalent to the cache_prompt option in the llama.cpp server. Useful for chat, because it skips evaluating the history part of the conversation.
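A minimal sketch of passing these options to a completion call, assuming the wllama API; the Wllama constructor, the WASM path mapping, the model URL, and loadModelFromUrl() / createCompletion() come from the library's README rather than this section, and the prompt and sampling values are placeholders:

```ts
import { Wllama } from '@wllama/wllama';

// Placeholder mapping from WASM names to their URLs; adjust for your bundler.
const CONFIG_PATHS = {
  'single-thread/wllama.wasm': './esm/single-thread/wllama.wasm',
  'multi-thread/wllama.wasm': './esm/multi-thread/wllama.wasm',
};

const wllama = new Wllama(CONFIG_PATHS);
await wllama.loadModelFromUrl('https://example.com/model.gguf'); // placeholder model

// lookupToken() converts a text piece to its token ID (per the stopTokens note).
const newlineToken = await wllama.lookupToken('\n');

// An AbortController lets outside code (e.g. a "Stop" button) cancel generation,
// including during prompt processing (with a bit of delay).
const controller = new AbortController();

// With stream: true the call returns an AsyncIterable instead of a string.
const stream = await wllama.createCompletion('Once upon a time', {
  nPredict: 128,
  sampling: { temp: 0.7, top_k: 40, top_p: 0.9 },
  stopTokens: [newlineToken], // stop when this custom token ID is generated
  useCache: true,             // like llama.cpp's cache_prompt: skip re-evaluating history
  stream: true,
  abortSignal: controller.signal,
});

for await (const chunk of stream) {
  console.log(chunk.currentText);
}
```

Calling controller.abort() from elsewhere (for example a button handler) ends the stream early; on a follow-up call with useCache, only the new part of the prompt is evaluated.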