Python Guide

Complete guide for using ToolUniverse with Python

Welcome to the Python developer path! This guide covers everything you need to build scientific workflows with ToolUniverse’s Python API.

Installation

Choose your preferred installation method:

Standard installation with pip:

pip install tooluniverse

Fast, modern package manager:

uv pip install tooluniverse

For contributors and custom modifications:

git clone https://github.com/mims-harvard/ToolUniverse.git
cd ToolUniverse
uv sync  # or: pip install -e .[dev]

Tip

Use uv for faster installations and better dependency management. Install it with:

curl -LsSf https://astral.sh/uv/install.sh | sh

Verify Installation

Check that ToolUniverse is installed correctly:

import tooluniverse
print(f"ToolUniverse version: {tooluniverse.__version__}")
print("✅ Installation successful!")

Quick Start

Get your first scientific query running in 5 minutes:

Step 1: Initialize ToolUniverse

Create a ToolUniverse instance:

from tooluniverse import ToolUniverse

# Initialize ToolUniverse
tu = ToolUniverse()

Step 2: Load Tools

Load the scientific tools ecosystem:

# Load all 1000+ tools
tu.load_tools()

print(f"✅ Loaded {len(tu.all_tools)} scientific tools!")

💡 Advanced: Load Specific Tools

For faster loading, specify tool categories:

# Load only specific tool categories
tu.load_tools(tool_type=['uniprot', 'ChEMBL', 'opentarget'])

Step 3: Execute Your First Tool

Query scientific databases:

# Get protein function from UniProt
result = tu.run({
    "name": "UniProt_get_function_by_accession",
    "arguments": {"accession": "P05067"}
})

print(result)

Important

Success!

You now have access to 1000+ scientific tools for drug discovery, protein analysis, literature search, and more!

Tool Execution

All tools follow a consistent structure:

# Standardized query format
query = {
    "name": "tool_name",           # Tool identifier
    "arguments": {                 # Tool parameters
        "parameter1": "value1",
        "parameter2": "value2"
    }
}

result = tu.run(query)
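
Because every query has the same two keys, a small helper can build the dictionary from keyword arguments. The sketch below is only an illustration: `build_query` is a hypothetical helper, not part of the ToolUniverse API, and it does not check that the tool or its parameters exist.

```python
from typing import Any


def build_query(name: str, **arguments: Any) -> dict[str, Any]:
    """Return a query dict in the {"name": ..., "arguments": {...}} shape."""
    if not name:
        raise ValueError("tool name must be a non-empty string")
    return {"name": name, "arguments": dict(arguments)}


# Builds the same dictionary used in the Quick Start example
query = build_query("UniProt_get_function_by_accession", accession="P05067")
print(query)
```

The resulting dictionary can be passed to tu.run(query) exactly like a hand-written one.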

Two execution methods:

Dictionary API, explicit and clear:

# Method 1: Dictionary API
result = tu.run({
    "name": "OpenTargets_get_associated_targets_by_disease_efoId",
    "arguments": {"efoId": "EFO_0000537"}
})

Direct import, a convenient shorthand:

# Method 2: Direct Import
from tooluniverse.opentarget_tool import OpenTargets_get_associated_targets_by_disease_efoId

# Call directly
result = OpenTargets_get_associated_targets_by_disease_efoId(
    efoId="EFO_0000537"
)

Tool Finders

ToolUniverse has three search-based Tool Finders (keyword, LLM, and semantic), plus a category listing for browsing. Don’t browse 1000+ tools manually; use a Tool Finder!

🔍 Keyword Search

Fast text matching

Best for: Exact terms you know

tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {
        "description": "protein structure",
        "limit": 5
    }
})

🤖 LLM Search

Natural language (LLM API required)

Best for: Descriptive queries

tools = tu.run({
    "name": "Tool_Finder_LLM",
    "arguments": {
        "description": "find tools for analyzing gene expression",
        "limit": 5
    }
})

🧠 Semantic Search

Embedding-based (GPU required)

Best for: Conceptual matches

tools = tu.run({
    "name": "Tool_Finder",
    "arguments": {
        "description": "drug safety analysis",
        "limit": 5
    }
})

📋 Browse by Category

Organized view

Best for: Exploring tool types

# List by configuration file
stats = tu.list_built_in_tools(mode='config')

# List by tool type
stats = tu.list_built_in_tools(mode='type')

See also

For a detailed guide to finding tools, see the Tool Finder Tutorial

Common Examples

Protein & Gene Information

# Get protein function
result = tu.run({
    "name": "UniProt_get_function_by_accession",
    "arguments": {"accession": "P05067"}
})

Drug Safety Analysis

# Check adverse events
result = tu.run({
    "name": "FAERS_count_reactions_by_drug_event",
    "arguments": {"medicinalproduct": "aspirin"}
})

Disease-Target Relationships

# Find therapeutic targets
result = tu.run({
    "name": "OpenTargets_get_associated_targets_by_disease_efoId",
    "arguments": {"efoId": "EFO_0000685"}  # Rheumatoid arthritis
})

Literature Search

# Search scientific papers
result = tu.run({
    "name": "PubTator_search_publications",
    "arguments": {
        "query": "CRISPR cancer therapy",
        "limit": 10
    }
})

Tool Specifications

Inspect tool details before execution:

# Get single tool specification
spec = tu.tool_specification("UniProt_get_function_by_accession")

print(f"Name: {spec['name']}")
print(f"Description: {spec['description']}")
print("Parameters:")
for param_name, param_info in spec['parameters']['properties'].items():
    print(f"  - {param_name}: {param_info['type']} - {param_info['description']}")

# Get multiple specifications
specs = tu.get_tool_specification_by_names([
    "FAERS_count_reactions_by_drug_event",
    "OpenTargets_get_associated_targets_by_disease_efoId"
])
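
A specification can also be used to check arguments before a call is made. The sketch below assumes the spec follows the JSON-Schema-like layout printed above (`parameters.properties`) and additionally assumes a `required` list under `parameters`, which this guide does not confirm. The helper names and the example spec are hypothetical and hand-written, not real tool output.

```python
def missing_arguments(spec: dict, arguments: dict) -> list[str]:
    """List required parameters that are absent from `arguments`."""
    # Assumption: required parameter names live under spec["parameters"]["required"]
    required = spec.get("parameters", {}).get("required", [])
    return [name for name in required if name not in arguments]


def unknown_arguments(spec: dict, arguments: dict) -> list[str]:
    """List argument names that the spec does not declare (often typos)."""
    properties = spec.get("parameters", {}).get("properties", {})
    return [name for name in arguments if name not in properties]


# Hand-written spec for illustration only
example_spec = {
    "name": "example_tool",
    "description": "Hypothetical tool used only for this illustration.",
    "parameters": {
        "properties": {
            "accession": {"type": "string", "description": "Protein accession"}
        },
        "required": ["accession"],
    },
}

print(missing_arguments(example_spec, {}))                      # ['accession']
print(unknown_arguments(example_spec, {"acession": "P05067"}))  # ['acession']
```

With a loaded instance, the same helpers could be applied to the dictionary returned by tu.tool_specification(...), provided it has the assumed layout.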

See also

For details on the protocol, see AI-Tool Interaction Protocol

Building Workflows

Chain tools for complex research tasks:

Multi-Step Pipeline

from tooluniverse import ToolUniverse

tu = ToolUniverse()
tu.load_tools()

# Step 1: Find tools for drug discovery
tools = tu.run({
    "name": "Tool_Finder_Keyword",
    "arguments": {"description": "drug target", "limit": 3}
})

# Step 2: Get disease targets
targets = tu.run({
    "name": "OpenTargets_get_associated_targets_by_disease_efoId",
    "arguments": {"efoId": "EFO_0000685"}
})

# Step 3: For each target, get protein info
for target in targets[:3]:  # First 3 targets
    protein_info = tu.run({
        "name": "UniProt_get_entry_by_accession",
        "arguments": {"accession": target.get("target_id")}
    })
    print(f"Target: {target.get('target_name')}")
    print(f"Protein: {protein_info}")

Batch Execution

Run several queries in one pass. The list comprehension below executes them one after another:

# Prepare multiple queries
queries = [
    {"name": "UniProt_get_function_by_accession", "arguments": {"accession": "P05067"}},
    {"name": "UniProt_get_function_by_accession", "arguments": {"accession": "P04637"}},
    {"name": "UniProt_get_function_by_accession", "arguments": {"accession": "P01112"}},
]

# Execute in batch
results = [tu.run(query) for query in queries]
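
If the queries are independent and mostly wait on remote services, a thread pool is one way to overlap them. The sketch below uses only the standard library. It assumes that tu.run is safe to call from several threads at once, which this guide does not state, so `fake_run` stands in for it here. `run_batch` is a hypothetical helper name.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable


def run_batch(run: Callable[[dict], Any], queries: list[dict], max_workers: int = 4) -> list[Any]:
    """Run each query through `run` on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the order of `queries`, not completion order
        return list(pool.map(run, queries))


# Stand-in for tu.run so the sketch runs without ToolUniverse installed
def fake_run(query: dict) -> str:
    return f"ran {query['name']} with {query['arguments']}"


demo_queries = [
    {"name": "UniProt_get_function_by_accession", "arguments": {"accession": acc}}
    for acc in ("P05067", "P04637", "P01112")
]
results = run_batch(fake_run, demo_queries)
print(results[0])
```

With a loaded instance the call would be run_batch(tu.run, queries). An exception raised by any query propagates when the results are collected, so wrap the call in try/except if partial results matter.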

Configuration

API Keys

Some tools require an API key to work at all, and others run faster when one is provided:

🔑 Setting Up API Keys

Environment Variables (Recommended)

# Essential for specific features
export NVIDIA_API_KEY=your_nvidia_key_here        # Structure prediction
export HF_TOKEN=your_huggingface_token_here       # Model hosting

# Recommended for better performance
export NCBI_API_KEY=your_ncbi_key_here            # 3x faster queries
export SEMANTIC_SCHOLAR_API_KEY=your_key_here     # 100x faster literature
export FDA_API_KEY=your_fda_key_here              # 6x faster safety data

Using .env File

# Copy template
cp docs/.env.template .env

# Edit with your keys
nano .env

See detailed guide: API Keys and Authentication
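
Before starting a long workflow, it can help to check which of these variables are actually set. The sketch below reads the environment with the standard library only. The variable names come from the export lines above; `missing_api_keys` is a hypothetical helper, not a ToolUniverse function, and it does not verify that a key is valid.

```python
import os

# Variable names taken from the export lines in this section
API_KEY_VARS = [
    "NVIDIA_API_KEY",
    "HF_TOKEN",
    "NCBI_API_KEY",
    "SEMANTIC_SCHOLAR_API_KEY",
    "FDA_API_KEY",
]


def missing_api_keys(environ=None) -> list[str]:
    """Return the names of API key variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in API_KEY_VARS if not env.get(name)]


print("Missing API keys:", missing_api_keys() or "none")
```

Note that this reads the process environment only. Values stored in a .env file appear here only if something has already loaded that file into the environment.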

Tool Loading Options

# Load all tools (default)
tu.load_tools()

# Load specific categories
tu.load_tools(tool_type=['uniprot', 'ChEMBL', 'opentarget'])

# Load with custom cache
tu.load_tools(use_cache=True, cache_dir="./custom_cache")

Logging

Configure logging for debugging:

import logging

from tooluniverse import ToolUniverse

# Enable detailed logging
logging.basicConfig(level=logging.INFO)

# ToolUniverse operations will now log details
tu = ToolUniverse()
tu.load_tools()
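
To see more detail from one library without turning on verbose output everywhere, the standard logging module lets you set a level per named logger. The sketch below assumes ToolUniverse emits its records under a logger named "tooluniverse" (the usual convention of naming loggers after the package); this guide does not confirm that name, so adjust it if the records do not appear.

```python
import logging

# Keep third-party libraries quiet by default and include the logger name in each line
logging.basicConfig(
    level=logging.WARNING,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

# Assumption: ToolUniverse logs under the "tooluniverse" logger name
tu_logger = logging.getLogger("tooluniverse")
tu_logger.setLevel(logging.DEBUG)
```

Records from that logger propagate to the root handler, so they are printed at DEBUG level while other loggers stay at WARNING.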

See also

For comprehensive logging configuration, see Logging Tutorial

Advanced Features

🔗 Tool Composition

Chain multiple tools into scientific workflows

Tool Composition Tutorial

🎣 Hooks System

Intelligent output processing and summarization

Post-processing Tool Outputs

💾 Cache System

Optimize performance with smart caching

Result Caching

🌐 HTTP API

Deploy ToolUniverse as a remote service

HTTP API - Remote Access

📚 Complete Case Study: Drug discovery workflow with Gemini 2.5 Pro

🔌 API Reference: Detailed Python API documentation

Need Help?