vLLM is an open source library that provides an efficient and convenient model server for LLMs. You can use chat_vllm() to connect to endpoints powered by vLLM.

It uses the OpenAI-compatible API via chat_openai_compatible().

Usage

chat_vllm(
  base_url,
  system_prompt = NULL,
  model,
  params = NULL,
  api_args = list(),
  api_key = NULL,
  credentials = NULL,
  echo = NULL,
  api_headers = character()
)

models_vllm(base_url, api_key = NULL, credentials = NULL)

Arguments

base_url

The base URL of the vLLM endpoint. Unlike most chat providers, there is no default: you must supply the URL of your own vLLM server.

system_prompt

A system prompt to set the behavior of the assistant.

model

The model to use for the chat. Use models_vllm() to see all options.

params

Common model parameters, usually created by params().

api_args

Named list of arbitrary extra arguments appended to the body of every chat API call. These are combined with the body object generated by ellmer using modifyList().
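As an illustration of the merge behaviour, here is a sketch of how extra body fields combine with a request body via modifyList(). The field names below (top_k, repetition_penalty) are vLLM sampling options used purely as examples, and the body list is a stand-in for what ellmer would generate:

```r
# Stand-in for the request body ellmer would build internally.
body <- list(model = "my-model", temperature = 0.7)

# User-supplied extras; later values win on name collisions.
api_args <- list(top_k = 40, repetition_penalty = 1.1)

# modifyList() merges the two, as chat_vllm() does for api_args.
merged <- modifyList(body, api_args)
str(merged)
```

Fields in api_args that share a name with a generated field override it, so api_args can also be used to tweak values ellmer sets itself.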

api_key

[Deprecated] Use credentials instead.

credentials

Override the default credentials. You generally should not need this argument; instead set the VLLM_API_KEY environment variable. The best place to set this is in .Renviron, which you can easily edit by calling usethis::edit_r_environ().

If you do need additional control, this argument takes a zero-argument function that returns either a string (the API key), or a named list (added as additional headers to every request).
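A minimal sketch of both forms of credentials function (the function names are hypothetical, and the fallback values exist only so the example is self-contained):

```r
# Zero-argument function returning the API key as a single string.
# Reading from an environment variable avoids hard-coding secrets.
my_credentials <- function() {
  Sys.getenv("VLLM_API_KEY", unset = "dummy-key-for-illustration")
}

# Alternatively, return a named list, which is added as extra headers
# to every request (useful for non-standard auth schemes).
my_header_credentials <- function() {
  list(
    Authorization = paste("Bearer", Sys.getenv("VLLM_API_KEY", unset = "dummy"))
  )
}
```

You would then pass one of these as chat_vllm(..., credentials = my_credentials), passing the function itself rather than calling it.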

echo

One of the following options:

  • none: don't emit any output (default when running in a function).

  • output: echo text and tool-calling output as it streams in (default when running at the console).

  • all: echo all input and output.

Note this only affects the chat() method.

api_headers

Named character vector of arbitrary extra headers appended to every chat API call.
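For example, a named character vector of headers might look like the following (the header names are purely illustrative, not headers vLLM requires):

```r
# Each element becomes one extra HTTP header on every chat API call.
extra_headers <- c(
  "X-Request-Source" = "ellmer-docs-example",
  "X-Trace-Id"       = "abc123"
)
```

This would be passed as chat_vllm(..., api_headers = extra_headers).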

Value

A Chat object.

Examples

if (FALSE) { # \dontrun{
chat <- chat_vllm("http://my-vllm.com")
chat$chat("Tell me three jokes about statisticians")
} # }
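A slightly fuller sketch, still assuming a vLLM server at a hypothetical URL; the model id is illustrative, and models_vllm() is the companion function documented above for listing what the server actually offers:

```r
if (FALSE) { # \dontrun{
# List the models the server offers before picking one.
models_vllm("http://my-vllm.com")

# Start a chat with an explicit model and system prompt.
chat <- chat_vllm(
  "http://my-vllm.com",
  model = "meta-llama/Llama-3.1-8B-Instruct",  # illustrative model id
  system_prompt = "You are a terse assistant."
)
chat$chat("Summarise vLLM in one sentence.")
} # }
```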