Glossary

Technical terms and concepts used throughout ToolUniverse documentation.

AI Scientist

An AI system (LLM, agent, or reasoning model) enhanced with ToolUniverse capabilities to autonomously perform scientific research tasks including literature search, data analysis, and experimental design.

AI-Tool Interaction Protocol

Standardized interface governing how AI scientists issue tool requests and receive structured results. Ensures compatibility across different LLM providers and tool implementations.

Agentic Tool

A tool that uses LLM capabilities internally to perform complex tasks like summarization, extraction, or intelligent search. Examples: Tool_Finder_LLM, summarization hooks.

ChEMBL

European Molecular Biology Laboratory’s database of bioactive drug-like small molecules. ChEMBL IDs are identifiers like “CHEMBL25” for aspirin.

ChEMBL ID

Unique identifier for chemical compounds in the ChEMBL database (e.g., CHEMBL25 for aspirin).

Compact Mode

Context window optimization mode that exposes only 4-5 core discovery tools instead of 1000+ tools. Reduces context usage by 99% while maintaining full functionality through the execute_tool tool.

EFO
EFO ID

Experimental Factor Ontology - A standardized disease classification system used in biomedical research. EFO IDs are identifiers like “EFO_0000537” for hypertension.

Embedding

Vector-based semantic search where text is converted to numerical vectors. Similar concepts have similar vectors, enabling “meaning-based” search beyond keyword matching. Used in Tool_Finder for intelligent tool discovery.

Hook

Output processing function that transforms or enhances tool results. Examples: SummarizationHook (AI-powered summaries), FileSaveHook (automatic file saving).

MCP
Model Context Protocol

Open standard protocol enabling AI assistants to securely connect to external tools and data sources. Think of it as “USB for AI tools” - a standardized connection interface.

SMCP
Scientific Model Context Protocol

ToolUniverse’s implementation of MCP, optimized for scientific workflows. Extends standard MCP with scientific domain features, intelligent tool discovery, and hooks system.

Profile
Toolspace

YAML configuration file (profile.yaml) declaring which tools to load, cache settings, LLM config, and hooks. Profiles can be loaded from GitHub, local files, or URLs using --load parameter.

STDIO Transport

Communication method where data flows through standard input/output streams. Used by desktop applications like Claude Desktop to communicate with MCP servers.

Tool Composition

Chaining multiple tools together in sequential or parallel workflows. Example: Search PubMed → Extract abstracts → Summarize findings.

Tool Finder

Family of specialized search tools for discovering relevant ToolUniverse tools:

  • Tool_Finder_Keyword: Fast keyword search

  • Tool_Finder_LLM: Natural language search using LLM

  • Tool_Finder: Semantic embedding search (most powerful)

Tool Specification

JSON definition describing a tool’s purpose, input parameters, output format, and execution details. Think of it as the “instruction manual” for each tool.

ToolSpace

See Profile.

PubChem CID

PubChem Compound Identifier - unique numerical ID for chemical compounds in PubChem database (e.g., CID 2244 for aspirin).

UniProt Accession

Unique identifier for protein sequences in UniProt database. Format: 1-6 characters (e.g., “P05067” for amyloid precursor protein).

FAERS

FDA Adverse Event Reporting System - Database of adverse drug reactions reported to FDA. Used for pharmacovigilance and drug safety monitoring.

SSE

Server-Sent Events - HTTP-based protocol for servers to push updates to clients. One of the transport methods supported by ToolUniverse MCP servers.

TTL

Time To Live - Duration (in seconds) that cached results remain valid before expiration. Set via TOOLUNIVERSE_CACHE_DEFAULT_TTL.

LRU Cache

Least Recently Used cache - Memory cache that evicts oldest unused entries when full. ToolUniverse uses LRU for in-memory caching.

Rate Limit

Maximum number of API requests allowed per time period. Example: NCBI allows 3 requests/second without API key, 10 requests/second with key.

Tool Category
Tool Type

Classification of tools by data source or functionality. Examples: uniprot, chembl, pubmed, fda, nvidia_nim. Used for selective tool loading.

Fingerprint
Cache Fingerprint

Unique hash computed from tool’s source code and parameter schema. When tool implementation changes, fingerprint changes, invalidating old cache entries.

Workspace Configuration

YAML or JSON file defining tool selection, hooks, and settings for a specific research workflow or domain. See Profile.

JSON-RPC

Remote Procedure Call protocol using JSON for data encoding. Some MCP servers use JSON-RPC for communication.

OpenTargets

Open Targets Platform - Database connecting genes to diseases with evidence scores. Provides target-disease associations for drug discovery.

AlphaFold
AlphaFold2

AI system for predicting 3D protein structures from amino acid sequences. Accessible through ToolUniverse via NVIDIA NIM tools.

NCBI

National Center for Biotechnology Information - US government agency providing biomedical databases including PubMed, GenBank, and protein databases.

Ontology

Structured vocabulary defining terms and relationships in a domain. Examples: EFO (diseases), Gene Ontology (biological processes), ChEBI (chemicals).

API Endpoint

URL path for making API requests to web services. Example: /api/search or https://rest.uniprot.org/uniprotkb/search.

See Also