vLLM Support¶
ToolUniverse supports vLLM for self-hosted LLM inference. Use vLLM to run models on your own infrastructure for better privacy, cost control, and performance.
Quick Start¶
Start a vLLM server:
pip install vllm
vllm serve meta-llama/Llama-3.1-8B-Instruct
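Optionally, verify the server outside ToolUniverse first. A minimal sketch using the openai client against vLLM's OpenAI-compatible endpoint (api_key="EMPTY" is just a placeholder for the client; vLLM ignores it unless the server was started with an API key):
from openai import OpenAI

# Point the client at the vLLM server's OpenAI-compatible /v1 endpoint.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Say hello"}],
)
print(response.choices[0].message.content)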
Set environment variables:
export VLLM_SERVER_URL="http://localhost:8000"
export TOOLUNIVERSE_LLM_DEFAULT_PROVIDER="VLLM"
export TOOLUNIVERSE_LLM_MODEL_DEFAULT="meta-llama/Llama-3.1-8B-Instruct"
Use with AgenticTool:
from tooluniverse import ToolUniverse
tool_config = {
    "name": "Summarizer",
    "type": "AgenticTool",
    "prompt": "Summarize: {text}",
    "input_arguments": ["text"],
    "parameter": {
        "type": "object",
        "properties": {"text": {"type": "string"}},
        "required": ["text"]
    }
}
tu = ToolUniverse()
tu.register_tool_from_config(tool_config)
result = tu.execute_tool("Summarizer", {"text": "Your text here"})
Configuration¶
Environment Variables¶
Required:
- VLLM_SERVER_URL: Your vLLM server URL (e.g., http://localhost:8000)
Optional:
- TOOLUNIVERSE_LLM_DEFAULT_PROVIDER="VLLM": Set vLLM as the default provider
- TOOLUNIVERSE_LLM_MODEL_DEFAULT="model-name": Default model (must match a model loaded on your vLLM server)
- TOOLUNIVERSE_LLM_CONFIG_MODE="env_override": Make env vars override tool configs
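If you are working in a notebook or script, the same variables can be set programmatically. A sketch, assuming they are set before the ToolUniverse instance (or tool) that should read them is created:
import os

# Programmatic equivalent of the shell exports above.
os.environ["VLLM_SERVER_URL"] = "http://localhost:8000"
os.environ["TOOLUNIVERSE_LLM_DEFAULT_PROVIDER"] = "VLLM"
os.environ["TOOLUNIVERSE_LLM_MODEL_DEFAULT"] = "meta-llama/Llama-3.1-8B-Instruct"
os.environ["TOOLUNIVERSE_LLM_CONFIG_MODE"] = "env_override"  # optional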
Tool Configuration¶
You can also configure vLLM directly in tool configs:
tool_config = {
    "name": "MyTool",
    "type": "AgenticTool",
    # ... prompt and parameters ...
    "configs": {
        "api_type": "VLLM",
        "model_id": "meta-llama/Llama-3.1-8B-Instruct",
        "temperature": 0.7
    }
}
Note: the VLLM_SERVER_URL environment variable is still required.
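As a sketch (reusing the registration call from Quick Start), the config above is registered the same way; only the server URL comes from the environment:
import os
from tooluniverse import ToolUniverse

os.environ["VLLM_SERVER_URL"] = "http://localhost:8000"  # still read from the environment

tu = ToolUniverse()
tu.register_tool_from_config(tool_config)  # tool_config as defined above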
Using with Space Configurations¶
In Space YAML files:
llm_config:
  mode: "env_override"
  default_provider: "VLLM"
  models:
    default: "meta-llama/Llama-3.1-8B-Instruct"
Then set: export VLLM_SERVER_URL="http://localhost:8000"
Configuration Priority¶
With env_override mode:
1. Environment variables (highest)
2. Tool configuration
3. Space configuration
4. Built-in defaults
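For example (hypothetical model names), with env_override set, a model given in the environment takes precedence over a model_id in a tool's configs:
import os

os.environ["TOOLUNIVERSE_LLM_CONFIG_MODE"] = "env_override"
os.environ["TOOLUNIVERSE_LLM_MODEL_DEFAULT"] = "meta-llama/Llama-3.1-70B-Instruct"

tool_config = {
    # ... name, prompt, parameters ...
    "configs": {
        "api_type": "VLLM",
        "model_id": "meta-llama/Llama-3.1-8B-Instruct"  # overridden by the env var above
    }
}
# Under env_override, requests from this tool go to the model named in the environment.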
Troubleshooting¶
- “VLLM_SERVER_URL environment variable not set”
  Set the environment variable:
  export VLLM_SERVER_URL="http://localhost:8000"
- “Model not found”
  Ensure model_id matches the model name loaded on your vLLM server.
- Connection failed
  Verify the vLLM server is running:
  curl http://localhost:8000/health (if available)
- URL format: Use the base URL (e.g., http://localhost:8000). ToolUniverse automatically appends /v1.
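For a quick connectivity check from Python, a sketch using the requests package against the endpoints mentioned above (the /health route may not be exposed on every deployment):
import requests

base_url = "http://localhost:8000"  # base URL only; ToolUniverse appends /v1 itself

try:
    # Recent vLLM servers expose /health; a 404 just means it is not enabled.
    print("health:", requests.get(f"{base_url}/health", timeout=5).status_code)
    # The OpenAI-compatible API lists loaded models; the "id" values are what model_id must match.
    models = requests.get(f"{base_url}/v1/models", timeout=5).json()
    print("models:", [m["id"] for m in models.get("data", [])])
except requests.ConnectionError:
    print("Could not reach the vLLM server at", base_url)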
Test Your Setup¶
from tooluniverse.agentic_tool import AgenticTool
import os
os.environ["VLLM_SERVER_URL"] = "http://localhost:8000"
tool = AgenticTool({
    "name": "test",
    "prompt": "Say hello",
    "input_arguments": [],
    "parameter": {"type": "object", "properties": {}, "required": []},
    "configs": {
        "api_type": "VLLM",
        "model_id": "meta-llama/Llama-3.1-8B-Instruct"
    }
})

if tool.is_available():
    print("✅ vLLM connection successful!")
else:
    print(f"❌ Failed: {tool.get_availability_status()}")
See Also¶
OpenRouter Support - Using OpenRouter as an LLM provider
Space Configuration System - Space configurations with LLM settings
Agentic Tools Tutorial - Complete guide to creating agentic tools