vLLM is an open source library that provides an efficient and convenient model server for LLMs. You can use chat_vllm() to connect to endpoints powered by vLLM.

It uses the OpenAI-compatible API via chat_openai_compatible().

Usage

chat_vllm(
  base_url,
  system_prompt = NULL,
  model,
  params = NULL,
  api_args = list(),
  api_key = NULL,
  credentials = NULL,
  echo = NULL,
  api_headers = character()
)

models_vllm(base_url, api_key = NULL, credentials = NULL)

Arguments

base_url

The base URL of the vLLM endpoint. Unlike most chat providers, there is no default: you must supply the URL of your own vLLM server.

system_prompt

A system prompt to set the behavior of the assistant.

model

The model to use for the chat. Use models_vllm() to see all options.

params

Common model parameters, usually created by params().

api_args

Named list of arbitrary extra arguments appended to the body of every chat API call. These are combined with the body object generated by ellmer using modifyList().
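As an illustration of the merge behaviour, here is a sketch of how extra body fields combine with a request body via modifyList(). The field names below (top_k, repetition_penalty) are vLLM sampling options used purely as examples, and the body list is a stand-in for what ellmer would generate:

```r
# Stand-in for the request body ellmer would build internally.
body <- list(model = "my-model", temperature = 0.7)

# User-supplied extras; later values win on name collisions.
api_args <- list(top_k = 40, repetition_penalty = 1.1)

# modifyList() merges the two, as chat_vllm() does for api_args.
merged <- modifyList(body, api_args)
str(merged)
```

Fields in api_args that share a name with a generated field override it, so api_args can also be used to tweak values ellmer sets itself.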

api_key

[Deprecated] Use credentials instead.

credentials

Override the default credentials. You generally should not need this argument; instead set the VLLM_API_KEY environment variable. The best place to set this is in .Renviron, which you can easily edit by calling usethis::edit_r_environ().

If you do need additional control, this argument takes a zero-argument function that returns either a string (the API key), or a named list (added as additional headers to every request).
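A minimal sketch of both forms of credentials function (the function names are hypothetical, and the fallback values exist only so the example is self-contained):

```r
# Zero-argument function returning the API key as a single string.
# Reading from an environment variable avoids hard-coding secrets.
my_credentials <- function() {
  Sys.getenv("VLLM_API_KEY", unset = "dummy-key-for-illustration")
}

# Alternatively, return a named list, which is added as extra headers
# to every request (useful for non-standard auth schemes).
my_header_credentials <- function() {
  list(
    Authorization = paste("Bearer", Sys.getenv("VLLM_API_KEY", unset = "dummy"))
  )
}
```

You would then pass one of these as chat_vllm(..., credentials = my_credentials), passing the function itself rather than calling it.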

echo

One of the following options:

  • none: don't emit any output (default when running in a function).

  • output: echo text and tool-calling output as it streams in (default when running at the console).

  • all: echo all input and output.

Note this only affects the chat() method.

api_headers

Named character vector of arbitrary extra headers appended to every chat API call.
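For example, a named character vector of headers might look like the following (the header names are purely illustrative, not headers vLLM requires):

```r
# Each element becomes one extra HTTP header on every chat API call.
extra_headers <- c(
  "X-Request-Source" = "ellmer-docs-example",
  "X-Trace-Id"       = "abc123"
)
```

This would be passed as chat_vllm(..., api_headers = extra_headers).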

Value

A Chat object.

Examples

if (FALSE) { # \dontrun{
chat <- chat_vllm("http://my-vllm.com")
chat$chat("Tell me three jokes about statisticians")
} # }
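A slightly fuller sketch, still assuming a vLLM server at a hypothetical URL; the model id is illustrative, and models_vllm() is the companion function documented above for listing what the server actually offers:

```r
if (FALSE) { # \dontrun{
# List the models the server offers before picking one.
models_vllm("http://my-vllm.com")

# Start a chat with an explicit model and system prompt.
chat <- chat_vllm(
  "http://my-vllm.com",
  model = "meta-llama/Llama-3.1-8B-Instruct",  # illustrative model id
  system_prompt = "You are a terse assistant."
)
chat$chat("Summarise vLLM in one sentence.")
} # }
```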