# Source code for tooluniverse.smcp
"""
Scientific Model Context Protocol (SMCP) - Enhanced MCP Server with ToolUniverse Integration
SMCP is a sophisticated MCP (Model Context Protocol) server that bridges the gap between
AI agents and scientific tools. It seamlessly integrates ToolUniverse's extensive
collection of 350+ scientific tools with the MCP protocol, enabling AI systems to
access scientific databases, perform complex analyses, and execute scientific workflows.
The SMCP module provides a complete solution for exposing scientific computational
resources through the standardized MCP protocol, making it easy for AI agents to
discover, understand, and execute scientific tools in a unified manner.
Usage Patterns:
===============
Quick Start:
```python
# High-performance server with custom configuration
server = SMCP(
name="Production Scientific API",
tool_categories=["uniprot", "ChEMBL", "opentarget", "hpa"],
max_workers=20,
search_enabled=True
)
server.run_simple(
transport="http",
host="0.0.0.0",
port=7000
)
```
Client Integration:
```python
# Using MCP client to discover and use tools
import json
# Discover protein analysis tools
response = await client.call_tool("find_tools", {
"query": "protein structure analysis",
"limit": 5
})
# Use discovered tool
result = await client.call_tool("UniProt_get_entry_by_accession", {
"arguments": json.dumps({"accession": "P05067"})
})
```
Architecture:
=============
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ MCP Client │◄──►│ SMCP │◄──►│ ToolUniverse │
│ (AI Agent) │ │ Server │ │ (350+ Tools) │
└─────────────────┘ └──────────────────┘ └─────────────────┘
│
▼
┌──────────────────┐
│ Scientific │
│ Databases & │
│ Services │
└──────────────────┘
The SMCP server acts as an intelligent middleware layer that:
1. Receives MCP requests from AI agents/clients
2. Translates requests to ToolUniverse tool calls
3. Executes tools against scientific databases/services
4. Returns formatted results via MCP protocol
5. Provides intelligent tool discovery and recommendation
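In miniature, steps 1-4 reduce to a dispatch-and-wrap loop. The sketch below is
hypothetical (the registry and `echo_tool` are invented for illustration), not the
actual SMCP implementation:
```python
import json

# Hypothetical registry standing in for ToolUniverse's loaded tools
TOOLS = {"echo_tool": lambda args: {"echoed": args}}

def handle_request(request):
    # Steps 1-2: receive the MCP request and look up the named tool
    params = request.get("params", {})
    tool = TOOLS.get(params.get("name"))
    if tool is None:
        # JSON-RPC 2.0 error envelope, mirroring SMCP's -32601 handling
        return {
            "jsonrpc": "2.0",
            "id": request.get("id"),
            "error": {"code": -32601, "message": "Tool not found"},
        }
    # Steps 3-4: execute the tool and return the result in a JSON-RPC envelope
    return {
        "jsonrpc": "2.0",
        "id": request.get("id"),
        "result": tool(params.get("arguments", {})),
    }

response = handle_request({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "echo_tool", "arguments": {"x": 42}},
})
print(json.dumps(response))
```
Step 5, tool discovery, is handled by the custom tools/find method described under
Integration Points.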
Integration Points:
==================
MCP Protocol Layer:
- Standard MCP methods (tools/list, tools/call, etc.)
- Custom scientific methods (tools/find, tools/search)
- Transport-agnostic communication (stdio, HTTP, SSE)
- Proper error codes and JSON-RPC 2.0 compliance
ToolUniverse Integration:
- Dynamic tool loading and configuration
- Schema transformation and validation
- Execution wrapper with error handling
- Category-based tool organization
AI Agent Interface:
- Natural language tool discovery
- Contextual tool recommendations
- Structured parameter schemas
- Comprehensive tool documentation
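For instance, a tool's structured parameter schema reaches agents in the standard
MCP inputSchema form (the description text below is illustrative, not the actual
registered description):
```json
{
  "name": "UniProt_get_entry_by_accession",
  "description": "Retrieve a UniProt entry by accession",
  "inputSchema": {
    "type": "object",
    "properties": {"accession": {"type": "string"}},
    "required": ["accession"]
  }
}
```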
"""
import asyncio
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, List, Optional, Union, Callable, Literal
try:
from fastmcp import FastMCP
FASTMCP_AVAILABLE = True
except ImportError:
    # Use a simple print here since logging isn't available yet
    print(
        "FastMCP is not available. SMCP is built on top of FastMCP, which is a required dependency."
    )
    FASTMCP_AVAILABLE = False
    FastMCP = object  # Placeholder base so the SMCP class definition still parses
from .execute_function import ToolUniverse
from .logging_config import (
get_logger,
)
class SMCP(FastMCP):
"""
Scientific Model Context Protocol (SMCP) Server
SMCP is an enhanced MCP (Model Context Protocol) server that seamlessly integrates
ToolUniverse's extensive collection of scientific and research tools with the
FastMCP framework. It provides a unified, AI-accessible interface for scientific
computing, data analysis, and research workflows.
The SMCP server extends standard MCP capabilities with scientific domain expertise,
intelligent tool discovery, and optimized configurations for research applications.
It automatically handles the complex task of exposing hundreds of specialized tools
through a consistent, well-documented interface.
Key Features:
============
🔬 **Scientific Tool Integration**: Native access to 350+ specialized tools covering
scientific databases, literature search, clinical data, genomics, proteomics,
chemical informatics, and AI-powered analysis capabilities.
🧠 **AI-Powered Tool Discovery**: Multi-tiered intelligent search system using:
- ToolFinderLLM: Cost-optimized LLM-based semantic understanding with pre-filtering
- Tool_RAG: Embedding-based similarity search
- Keyword Search: Simple text matching as reliable fallback
📡 **Full MCP Protocol Support**: Complete implementation of MCP specification with:
- Standard methods (tools/list, tools/call, resources/*, prompts/*)
- Custom scientific methods (tools/find, tools/search)
- Multi-transport support (stdio, HTTP, SSE)
- JSON-RPC 2.0 compliance with proper error handling
⚡ **High-Performance Architecture**: Production-ready features including:
- Configurable thread pools for concurrent tool execution
- Intelligent tool loading and caching
- Resource management and graceful degradation
- Comprehensive error handling and recovery
🔧 **Developer-Friendly**: Simplified configuration and deployment with:
- Sensible defaults for scientific computing
- Flexible customization options
- Comprehensive documentation and examples
- Built-in diagnostic and monitoring tools
Custom MCP Methods:
==================
tools/find:
AI-powered tool discovery using natural language queries. Supports semantic
search, category filtering, and flexible response formats.
tools/search:
Alternative endpoint for tool discovery with identical functionality to
tools/find, provided for compatibility and convenience.
Parameters:
===========
name : str, optional
Human-readable server name used in logs and identification.
Default: "SMCP Server"
Examples: "Scientific Research API", "Drug Discovery Server"
tooluniverse_config : ToolUniverse or dict, optional
Either a pre-configured ToolUniverse instance or configuration dict.
If None, creates a new ToolUniverse with default settings.
Allows reuse of existing tool configurations and customizations.
tool_categories : list of str, optional
Specific ToolUniverse categories to load. If None and auto_expose_tools=True,
loads all available tools. Common combinations:
- Scientific: ["ChEMBL", "uniprot", "opentarget", "pubchem", "hpa"]
- Literature: ["EuropePMC", "semantic_scholar", "pubtator", "agents"]
- Clinical: ["fda_drug_label", "clinical_trials", "adverse_events"]
exclude_tools : list of str, optional
Specific tool names to exclude from loading. These tools will not be
exposed via the MCP interface even if they are in the loaded categories.
Useful for removing specific problematic or unwanted tools.
exclude_categories : list of str, optional
Tool categories to exclude from loading. These entire categories will
be skipped during tool loading. Can be combined with tool_categories
to first select categories and then exclude specific ones.
include_tools : list of str, optional
Specific tool names to include. If provided, only these tools will be
loaded regardless of categories. Overrides category-based selection.
tools_file : str, optional
Path to a text file containing tool names to include (one per line).
Alternative to include_tools parameter. Comments (lines starting with #)
and empty lines are ignored.
tool_config_files : dict of str, optional
Additional tool configuration files to load. Format:
{"category_name": "/path/to/config.json"}. These files will be loaded
in addition to the default tool files.
include_tool_types : list of str, optional
Specific tool types to include. If provided, only tools of these types
will be loaded. Available types include: 'OpenTarget', 'ToolFinderEmbedding',
'ToolFinderKeyword', 'ToolFinderLLM', etc.
exclude_tool_types : list of str, optional
Tool types to exclude from loading. These tool types will be skipped
during tool loading. Useful for excluding entire categories of tools
(e.g., all ToolFinder types or all OpenTarget tools).
auto_expose_tools : bool, default True
Whether to automatically expose ToolUniverse tools as MCP tools.
When True, all loaded tools become available via the MCP interface
with automatic schema conversion and execution wrapping.
search_enabled : bool, default True
Enable AI-powered tool search functionality via tools/find method.
Includes ToolFinderLLM (cost-optimized LLM-based), Tool_RAG (embedding-based),
and simple keyword search capabilities with intelligent fallback.
max_workers : int, default 5
Maximum number of concurrent worker threads for tool execution.
Higher values allow more parallel tool calls but use more resources.
Recommended: 5-20 depending on server capacity and expected load.
hooks_enabled : bool, default False
Whether to enable output processing hooks for intelligent post-processing
of tool outputs. When True, hooks can automatically summarize long outputs,
save results to files, or apply other transformations.
hook_config : dict, optional
Custom hook configuration dictionary. If provided, overrides default
hook settings. Should contain 'hooks' list with hook definitions.
Example: {"hooks": [{"name": "summarization_hook", "type": "SummarizationHook", ...}]}
hook_type : str, optional
Simple hook type selection. Can be 'SummarizationHook', 'FileSaveHook',
or a list of both. Provides an easy way to enable hooks without full configuration.
Takes precedence over hooks_enabled when specified.
**kwargs
Additional arguments passed to the underlying FastMCP server instance.
Supports all FastMCP configuration options for advanced customization.
Raises:
=======
ImportError
If FastMCP is not installed. FastMCP is a required dependency for SMCP.
Install with: pip install fastmcp
Notes:
======
- SMCP automatically handles ToolUniverse tool loading and MCP conversion
- Tool search uses ToolFinderLLM (optimized for cost) when available, gracefully falls back to simpler methods
- All tools support JSON argument passing for maximum flexibility
- Server supports graceful shutdown and comprehensive resource cleanup
- Thread pool execution ensures non-blocking operation for concurrent requests
- Built-in error handling provides informative debugging information
"""
def __init__(
self,
name: Optional[str] = None,
tooluniverse_config: Optional[Union[ToolUniverse, Dict[str, Any]]] = None,
tool_categories: Optional[List[str]] = None,
exclude_tools: Optional[List[str]] = None,
exclude_categories: Optional[List[str]] = None,
include_tools: Optional[List[str]] = None,
tools_file: Optional[str] = None,
tool_config_files: Optional[Dict[str, str]] = None,
include_tool_types: Optional[List[str]] = None,
exclude_tool_types: Optional[List[str]] = None,
auto_expose_tools: bool = True,
search_enabled: bool = True,
max_workers: int = 5,
hooks_enabled: bool = False,
hook_config: Optional[Dict[str, Any]] = None,
hook_type: Optional[Union[str, List[str]]] = None,
**kwargs,
):
if not FASTMCP_AVAILABLE:
raise ImportError(
"FastMCP is required for SMCP. Install it with: pip install fastmcp"
)
# Filter out SMCP-specific kwargs before passing to FastMCP
fastmcp_kwargs = kwargs.copy()
fastmcp_kwargs.pop("tooluniverse", None) # Remove if accidentally passed
# Initialize FastMCP with default settings optimized for scientific use
super().__init__(name=name or "SMCP Server", **fastmcp_kwargs)
# Get logger for this class
self.logger = get_logger("SMCP")
# Initialize ToolUniverse with hook support
if isinstance(tooluniverse_config, ToolUniverse):
self.tooluniverse = tooluniverse_config
else:
self.tooluniverse = ToolUniverse(
tool_files=tooluniverse_config,
keep_default_tools=True,
hooks_enabled=hooks_enabled,
hook_config=hook_config,
hook_type=hook_type,
)
# Configuration
self.tool_categories = tool_categories
self.exclude_tools = exclude_tools or []
self.exclude_categories = exclude_categories or []
self.include_tools = include_tools or []
self.tools_file = tools_file
self.tool_config_files = tool_config_files or {}
self.include_tool_types = include_tool_types or []
self.exclude_tool_types = exclude_tool_types or []
self.auto_expose_tools = auto_expose_tools
self.search_enabled = search_enabled
self.max_workers = max_workers
self.hooks_enabled = hooks_enabled
self.hook_config = hook_config
self.hook_type = hook_type
# Thread pool for concurrent tool execution
self.executor = ThreadPoolExecutor(max_workers=max_workers)
# Track exposed tools to avoid duplicates
self._exposed_tools = set()
# Initialize SMCP-specific features
self._setup_smcp_tools()
# Register custom MCP methods
self._register_custom_mcp_methods()
def _register_custom_mcp_methods(self):
"""
Register custom MCP protocol methods for enhanced functionality.
This method extends the standard MCP protocol by registering custom handlers
for scientific tool discovery and search operations. It safely patches the
FastMCP request handler to support additional methods while maintaining
compatibility with standard MCP operations.
Custom Methods Registered:
=========================
- tools/find: AI-powered tool discovery using natural language queries
- tools/search: Alternative endpoint for tool search (alias for tools/find)
Implementation Details:
======================
- Preserves original FastMCP request handler for standard methods
- Uses method interception pattern to handle custom methods first
- Falls back to original handler for unrecognized methods
- Implements proper error handling and JSON-RPC 2.0 compliance
Error Handling:
==============
- Gracefully handles missing request handlers
- Logs warnings for debugging when handler patching fails
- Ensures server continues to function even if custom methods fail to register
Notes:
======
This method is called automatically during SMCP initialization and should
not be called manually. It uses a guard to prevent double-patching.
"""
try:
# Override the default request handler to support custom methods
if hasattr(self, "_original_handle_request"):
return # Already patched
# Store original handler
self._original_handle_request = getattr(self, "_handle_request", None)
# Replace with custom handler
if hasattr(self, "_handle_request"):
self._handle_request = self._custom_handle_request
elif hasattr(self, "handle_request"):
self._original_handle_request = self.handle_request
self.handle_request = self._custom_handle_request
else:
self.logger.warning("Could not find request handler to override")
except Exception as e:
self.logger.error(f"Error registering custom MCP methods: {e}")
def _get_valid_categories(self):
"""
Get valid tool categories from ToolUniverse.
Returns:
Set[str]: Set of valid tool category names
"""
try:
# Use the existing ToolUniverse instance if available
if hasattr(self.tooluniverse, "get_tool_types"):
return set(self.tooluniverse.get_tool_types())
else:
# Create a temporary instance to get categories
temp_tu = ToolUniverse()
return set(temp_tu.get_tool_types())
except Exception as e:
self.logger.error(f"❌ Error getting valid categories: {e}")
return set()
async def _custom_handle_request(self, request: Dict[str, Any]) -> Dict[str, Any]:
"""
Custom MCP request handler that supports enhanced scientific tool operations.
This handler intercepts MCP requests and provides specialized handling for
scientific tool discovery methods while maintaining full compatibility with
standard MCP protocol operations.
Parameters:
===========
request : dict
JSON-RPC 2.0 request object containing:
- method: The MCP method being called
- id: Request identifier for response correlation
- params: Method-specific parameters
Returns:
========
dict
JSON-RPC 2.0 response object with either:
- result: Successful operation result
- error: Error information with code and message
Supported Custom Methods:
========================
tools/find:
Search for tools using natural language queries with AI-powered recommendations.
Parameters:
- query (required): Natural language description of desired functionality
- categories (optional): List of tool categories to filter by
- limit (optional): Maximum number of results (default: 10)
- use_advanced_search (optional): Use AI vs keyword search (default: True)
- search_method (optional): Specific search method - 'auto', 'llm', 'embedding', 'keyword' (default: 'auto')
- format (optional): Response format - 'detailed' or 'mcp_standard'
tools/search:
Alias for tools/find method with identical parameters and behavior.
Standard MCP Methods:
All other methods are forwarded to the original FastMCP handler,
ensuring full compatibility with MCP specification.
Error Codes:
============
- -32601: Method not found (unknown method)
- -32602: Invalid params (missing required parameters)
- -32603: Internal error (server-side failures)
Examples:
=========
Request for tool discovery:
```json
{
"jsonrpc": "2.0",
"id": "search_123",
"method": "tools/find",
"params": {
"query": "protein structure analysis",
"limit": 5,
"format": "mcp_standard"
}
}
```
Successful response:
```json
{
"jsonrpc": "2.0",
"id": "search_123",
"result": {
"tools": [...],
"_meta": {
"search_query": "protein structure analysis",
"search_method": "AI-powered (ToolFinderLLM)",
"total_matches": 5
}
}
}
```
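And the corresponding error response when the required query parameter is omitted
(code and message follow the validation in the handler below):
```json
{
  "jsonrpc": "2.0",
  "id": "search_123",
  "error": {
    "code": -32602,
    "message": "Invalid params: 'query' is required"
  }
}
```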
"""
try:
method = request.get("method")
request_id = request.get("id")
params = request.get("params", {})
# Handle custom methods
if method == "tools/find":
return await self._handle_tools_find(request_id, params)
elif method == "tools/search": # Alternative endpoint name
return await self._handle_tools_find(request_id, params)
# For all other methods, use the original handler
if self._original_handle_request:
if asyncio.iscoroutinefunction(self._original_handle_request):
return await self._original_handle_request(request)
else:
return self._original_handle_request(request)
else:
# Fallback: return method not found error
return {
"jsonrpc": "2.0",
"id": request_id,
"error": {"code": -32601, "message": f"Method not found: {method}"},
}
except Exception as e:
return {
"jsonrpc": "2.0",
"id": request.get("id"),
"error": {"code": -32603, "message": f"Internal error: {str(e)}"},
}
async def _handle_tools_find(
self, request_id: str, params: Dict[str, Any]
) -> Dict[str, Any]:
"""
Handle the tools/find MCP method for AI-powered tool discovery.
This method implements the core functionality for the custom tools/find MCP method,
enabling clients to discover relevant scientific tools using natural language
queries. It supports both AI-powered semantic search and simple keyword matching.
Parameters:
===========
request_id : str
Unique identifier for this request, used in the JSON-RPC response
params : dict
Request parameters containing:
- query (required): Natural language description of desired functionality
- categories (optional): List of tool categories to filter results
- limit (optional): Maximum number of tools to return (default: 10)
- use_advanced_search (optional): Whether to use AI search (default: True)
- search_method (optional): Specific search method - 'auto', 'llm', 'embedding', 'keyword' (default: 'auto')
- format (optional): Response format - 'detailed' or 'mcp_standard' (default: 'detailed')
Returns:
========
dict
JSON-RPC 2.0 response containing either:
- Success: Result with discovered tools and metadata
- Error: Error object with appropriate code and message
Response Formats:
================
Detailed Format (default):
Returns comprehensive tool information including:
- Tool names, descriptions, types
- Parameter schemas with detailed property information
- Search metadata (query, method used, match count)
MCP Standard Format:
Returns tools in standard MCP tools/list format:
- Simplified tool schema compatible with MCP clients
- inputSchema formatted for direct MCP consumption
- Metadata included in separate _meta field
Search Methods:
==============
AI-Powered Search (ToolFinderLLM):
- Uses Large Language Model to understand query semantics
- Analyzes tool descriptions for intelligent matching
- Provides relevance scoring and reasoning
- Automatically used when available and use_advanced_search=True
Simple Keyword Search:
- Basic text matching against tool names and descriptions
- Case-insensitive substring matching
- Used as fallback or when use_advanced_search=False
Error Handling:
==============
- Validates required parameters (query must be provided)
- Handles search failures gracefully with informative messages
- Provides detailed error context for debugging
Examples:
=========
Basic protein analysis search:
```python
params = {
"query": "protein structure analysis",
"limit": 3
}
```
Category-filtered drug search:
```python
params = {
"query": "drug interactions",
"categories": ["ChEMBL", "fda_drug_label"],
"limit": 5,
"format": "mcp_standard"
}
```
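The keyword fallback described above can be approximated by a simple filter
(a sketch only; the actual Tool_Finder_Keyword implementation may differ, and the
tool descriptions here are invented):
```python
# Case-insensitive substring match over tool names and descriptions
def keyword_search(query, tools, limit=10):
    q = query.lower()
    matches = [
        t for t in tools
        if q in t.get("name", "").lower() or q in t.get("description", "").lower()
    ]
    return matches[:limit]

tools = [
    {"name": "UniProt_get_entry_by_accession", "description": "Protein entry lookup"},
    {"name": "ChEMBL_search", "description": "Chemical compound search"},
]
print(keyword_search("protein", tools))
```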
"""
try:
# Extract parameters
query = params.get("query", "")
categories = params.get("categories")
limit = params.get("limit", 10)
use_advanced_search = params.get("use_advanced_search", True)
search_method = params.get(
"search_method", "auto"
) # 'auto', 'llm', 'embedding', 'keyword'
format_type = params.get(
"format", "detailed"
) # 'detailed' or 'mcp_standard'
if not query:
return {
"jsonrpc": "2.0",
"id": request_id,
"error": {
"code": -32602,
"message": "Invalid params: 'query' is required",
},
}
# Perform the search using existing search functionality
search_result = await self._perform_tool_search(
query=query,
categories=categories,
limit=limit,
use_advanced_search=use_advanced_search,
search_method=search_method,
)
# Parse the search result
search_data = json.loads(search_result)
# Format response based on requested format
if format_type == "mcp_standard":
# Format as standard MCP tools/list style response
tools_list = []
for tool in search_data.get("tools", []):
mcp_tool = {
"name": tool.get("name"),
"description": tool.get("description", ""),
"inputSchema": {
"type": "object",
"properties": tool.get("parameters", {}),
"required": tool.get("required", []),
},
}
tools_list.append(mcp_tool)
result = {
"tools": tools_list,
"_meta": {
"search_query": query,
"search_method": search_data.get("search_method"),
"total_matches": search_data.get("total_matches"),
"categories_filtered": categories,
},
}
else:
# Return detailed format (default)
result = search_data
return {"jsonrpc": "2.0", "id": request_id, "result": result}
except json.JSONDecodeError as e:
return {
"jsonrpc": "2.0",
"id": request_id,
"error": {
"code": -32603,
"message": f"Search result parsing error: {str(e)}",
},
}
except Exception as e:
return {
"jsonrpc": "2.0",
"id": request_id,
"error": {
"code": -32603,
"message": f"Internal error in tools/find: {str(e)}",
},
}
async def _perform_tool_search(
self,
query: str,
categories: Optional[List[str]],
limit: int,
use_advanced_search: bool,
search_method: str = "auto",
) -> str:
"""
Execute tool search using the most appropriate search method available.
Simplified unified interface that leverages the consistent tool interfaces.
All search tools now return JSON format directly.
Parameters:
===========
query : str
Natural language query describing the desired tool functionality
categories : list of str, optional
Tool categories to filter results by
limit : int
Maximum number of tools to return
use_advanced_search : bool
Whether to prefer AI-powered search when available
search_method : str, default 'auto'
Specific search method: 'auto', 'llm', 'embedding', 'keyword'
Returns:
========
str
JSON string containing search results
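The executor offload used below can be seen in isolation in this self-contained
sketch (blocking_search is a stand-in for the synchronous
ToolUniverse.run_one_function call):
```python
import asyncio
import json
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a synchronous, potentially slow search call
def blocking_search(query):
    return json.dumps({"query": query, "tools": []})

async def search(query):
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=2) as executor:
        # Offload the blocking call so the event loop stays responsive
        return await loop.run_in_executor(executor, blocking_search, query)

print(asyncio.run(search("protein")))
```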
"""
try:
# Determine which tool to use based on method and availability
tool_name = self._select_search_tool(search_method, use_advanced_search)
# Prepare unified function call - all search tools now use same interface
function_call = {
"name": tool_name,
"arguments": {"description": query, "limit": limit},
}
# Add categories only if provided to avoid validation issues
if categories is not None:
function_call["arguments"]["categories"] = categories
# Execute the search tool
            loop = asyncio.get_running_loop()
result = await loop.run_in_executor(
self.executor, self.tooluniverse.run_one_function, function_call
)
# All search tools now return JSON format directly
if isinstance(result, str):
return result
            elif isinstance(result, (dict, list)):
return json.dumps(result, indent=2)
else:
return str(result)
except Exception as e:
return json.dumps(
{
"error": f"Search error: {str(e)}",
"query": query,
"fallback_used": True,
"tools": [],
},
indent=2,
)
def _select_search_tool(self, search_method: str, use_advanced_search: bool) -> str:
"""
Select the appropriate search tool based on method and availability.
Returns:
str: Tool name to use for search
"""
# Get available tools
all_tools = self.tooluniverse.return_all_loaded_tools()
available_tool_names = [tool.get("name", "") for tool in all_tools]
        # Handle specific method requests
        if search_method == "keyword":
            return "Tool_Finder_Keyword"
        elif search_method == "llm" and "Tool_Finder_LLM" in available_tool_names:
            return "Tool_Finder_LLM"
        elif search_method == "embedding" and "Tool_Finder" in available_tool_names:
            return "Tool_Finder"
        elif search_method == "auto" and use_advanced_search:
            # Auto-selection priority: Keyword > RAG > LLM
            if "Tool_Finder_Keyword" in available_tool_names:
                return "Tool_Finder_Keyword"
            elif "Tool_Finder" in available_tool_names:
                return "Tool_Finder"
            elif "Tool_Finder_LLM" in available_tool_names:
                return "Tool_Finder_LLM"
        # Unknown method, advanced search disabled, or requested finder not
        # loaded: fall back to keyword search, which is always available
        return "Tool_Finder_Keyword"
def _setup_smcp_tools(self):
"""
Initialize and configure SMCP-specific tools and features.
This method orchestrates the complete setup of SMCP functionality including
ToolUniverse tool loading, validation, automatic tool exposure to the MCP
interface, search functionality initialization, and utility tool registration.
The setup process is designed to be robust, handle various edge cases gracefully,
and provide informative feedback about the configuration process. It implements
intelligent fallback strategies to ensure functionality even when specific
components are unavailable.
Setup Process Overview:
=====================
1. **Tool Loading Assessment**: Check if ToolUniverse already has tools loaded
to avoid unnecessary reloading and potential conflicts
2. **Category Validation**: If specific categories are requested, validate them
against available categories and provide helpful feedback for invalid ones
3. **Tool Loading Strategy**: Load tools using the most appropriate method:
- Category-specific loading for focused deployments
- Full loading for comprehensive access
- Graceful fallback when category loading fails
4. **Tool Exposure**: Convert loaded ToolUniverse tools to MCP format with
proper schema transformation and execution wrapping
5. **Search Setup**: Initialize multi-tiered search capabilities including
AI-powered and fallback methods
6. **Utility Registration**: Add server management and diagnostic tools
Tool Loading Strategy:
=====================
**Already Loaded Check**:
If ToolUniverse already contains loaded tools (len(all_tools) > 0), skip
the loading phase to prevent duplication and preserve existing configuration.
This supports scenarios where users pre-configure ToolUniverse instances.
**Category-Specific Loading**:
When tool_categories is specified:
- Validate each category against available tool categories
- Log warnings for invalid categories with suggestions
- Load only valid categories to optimize performance
- Fall back to full loading if no valid categories remain
**Full Loading (Default)**:
When auto_expose_tools=True and no specific categories are requested,
load all available tools to provide comprehensive functionality.
**Graceful Fallback**:
If category-specific loading fails for any reason, automatically
fall back to loading all tools to ensure basic functionality.
Tool Exposure Process:
=====================
**Schema Transformation**:
- Convert ToolUniverse parameter schemas to MCP-compatible format
- Handle complex parameter types and validation rules
- Preserve documentation and examples where available
**Execution Wrapping**:
- Create async wrappers for synchronous ToolUniverse tools
- Implement proper error handling and result formatting
- Use thread pool execution to prevent blocking
**Safety Mechanisms**:
- Skip meta-tools (MCPAutoLoaderTool, MCPClientTool) that shouldn't be exposed
- Track exposed tools to prevent duplicates
- Handle tool conversion failures gracefully without stopping entire process
Search Setup:
============
**Multi-Tiered Search Architecture**:
1. **ToolFinderLLM** (Primary): Cost-optimized AI-powered semantic understanding using LLM
2. **Tool_RAG** (Secondary): Embedding-based similarity search
3. **Keyword Search** (Fallback): Simple text matching, always available
**Initialization Process**:
- Check for availability of advanced search tools in loaded tools
- Attempt to load search tools if not already present
- Configure search capabilities based on what's available
- Provide clear feedback about search capabilities
**Search Tool Loading**:
Attempts to load tool_finder_llm and tool_finder categories which include:
- ToolFinderLLM: Cost-optimized LLM-based intelligent tool discovery
- Tool_RAG: Embedding-based semantic search
- Supporting utilities and configuration tools
Error Handling:
==============
**Category Validation Errors**:
- Log specific invalid categories with available alternatives
- Continue with valid categories only
- Fall back to full loading if no valid categories
**Tool Loading Errors**:
- Log detailed error information for debugging
- Continue setup process with already loaded tools
- Ensure server remains functional even with partial failures
**Search Setup Errors**:
- Gracefully handle missing search tool dependencies
- Fall back to simpler search methods automatically
- Log informative messages about search capabilities
**Tool Exposure Errors**:
- Handle individual tool conversion failures without stopping process
- Log specific tool errors for debugging
- Continue with remaining tools to maximize functionality
Performance Considerations:
==========================
- **Lazy Loading**: Only load tools when needed to minimize startup time
- **Efficient Validation**: Quick category checks before expensive operations
- **Parallel Processing**: Use thread pools for tool conversion where possible
- **Memory Management**: Efficient tool representation and storage
Diagnostic Output:
=================
Provides informative logging throughout the setup process:
```
Tools already loaded in ToolUniverse (356 tools), skipping reload
Exposing 356 tools from ToolUniverse
✅ ToolFinderLLM (cost-optimized) available for advanced search
Exposed tool: UniProt_get_entry_by_accession (type: uniprot)
```
Notes:
======
- This method is called automatically during SMCP initialization
- Should not be called manually after server initialization
- Setup is idempotent - can be called multiple times safely
- All setup phases include comprehensive error handling
- Performance scales with the number of tools being loaded and exposed
"""
# Always ensure full tool set is loaded (hooks may have preloaded a minimal set)
# Deduplication in ToolUniverse.load_tools prevents duplicates, so reloading is safe
if self.tool_categories:
try:
# Validate categories first
valid_categories = self._get_valid_categories()
invalid_categories = [
cat for cat in self.tool_categories if cat not in valid_categories
]
if invalid_categories:
available_categories = list(valid_categories)
self.logger.warning(
f"Invalid categories {invalid_categories}. Available categories: {available_categories}"
)
# Filter to valid categories only
valid_only = [
cat for cat in self.tool_categories if cat in valid_categories
]
if valid_only:
self.logger.info(f"Loading valid categories: {valid_only}")
self.tooluniverse.load_tools(
tool_type=valid_only,
exclude_tools=self.exclude_tools,
exclude_categories=self.exclude_categories,
include_tools=self.include_tools,
tools_file=self.tools_file,
tool_config_files=self.tool_config_files,
include_tool_types=self.include_tool_types,
exclude_tool_types=self.exclude_tool_types,
)
else:
self.logger.warning(
"No valid categories found, loading all tools instead"
)
self.tooluniverse.load_tools(
exclude_tools=self.exclude_tools,
exclude_categories=self.exclude_categories,
include_tools=self.include_tools,
tools_file=self.tools_file,
tool_config_files=self.tool_config_files,
include_tool_types=self.include_tool_types,
exclude_tool_types=self.exclude_tool_types,
)
else:
self.tooluniverse.load_tools(
tool_type=self.tool_categories,
exclude_tools=self.exclude_tools,
exclude_categories=self.exclude_categories,
include_tools=self.include_tools,
tools_file=self.tools_file,
tool_config_files=self.tool_config_files,
include_tool_types=self.include_tool_types,
exclude_tool_types=self.exclude_tool_types,
)
except Exception as e:
self.logger.error(f"Error loading specified categories: {e}")
self.logger.info("Falling back to loading all tools")
self.tooluniverse.load_tools(
exclude_tools=self.exclude_tools,
exclude_categories=self.exclude_categories,
include_tools=self.include_tools,
tools_file=self.tools_file,
tool_config_files=self.tool_config_files,
include_tool_types=self.include_tool_types,
exclude_tool_types=self.exclude_tool_types,
)
elif self.auto_expose_tools:
# Load all tools by default
self.tooluniverse.load_tools(
exclude_tools=self.exclude_tools,
exclude_categories=self.exclude_categories,
include_tools=self.include_tools,
tools_file=self.tools_file,
tool_config_files=self.tool_config_files,
include_tool_types=self.include_tool_types,
exclude_tool_types=self.exclude_tool_types,
)
# Auto-expose ToolUniverse tools as MCP tools
if self.auto_expose_tools:
self._expose_tooluniverse_tools()
# Add search functionality if enabled
if self.search_enabled:
self._add_search_tools()
# Add utility tools
self._add_utility_tools()
def _expose_tooluniverse_tools(self):
"""
Automatically expose ToolUniverse tools as MCP-compatible tools.
This method performs the critical task of converting ToolUniverse's tool
definitions into FastMCP-compatible tools that can be called via the MCP
protocol. It handles the complex mapping between different tool formats
while ensuring compatibility and usability.
Process Overview:
================
1. **Tool Inventory**: Enumerate all loaded ToolUniverse tools
2. **Type Filtering**: Skip meta-tools that shouldn't be exposed
3. **Schema Conversion**: Transform ToolUniverse schemas to MCP format
4. **Function Wrapping**: Create async wrappers for tool execution
5. **Registration**: Register tools with FastMCP framework
Tool Type Filtering:
===================
Skips these internal tool types:
- MCPAutoLoaderTool: Used for loading other MCP servers
- MCPClientTool: Used for connecting to external MCP servers
These are meta-tools that manage other tools rather than providing
end-user functionality, so they're excluded from the MCP interface.
Schema Transformation:
=====================
ToolUniverse Tool Format:
```json
{
"name": "tool_name",
"parameter": {
"type": "object",
"properties": {...},
"required": [...]
}
}
```
MCP Tool Format:
```python
async def tool_function(arguments: str = "{}") -> str:
# Tool execution logic
```
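The conversion between the two formats can be sketched as follows (a minimal illustration; `make_mcp_wrapper` and the stand-in `run_one_function` are hypothetical names, not part of this module):
```python
import asyncio
import json
from concurrent.futures import ThreadPoolExecutor

def make_mcp_wrapper(tool_config, run_one_function, executor):
    # Wrap a ToolUniverse-style config as an MCP-style async callable.
    name = tool_config["name"]

    async def tool_function(arguments: str = "{}") -> str:
        parsed = json.loads(arguments)        # JSON-encoded tool parameters
        loop = asyncio.get_running_loop()
        # Run the (potentially blocking) tool in a thread pool
        result = await loop.run_in_executor(
            executor, run_one_function, {"name": name, "arguments": parsed}
        )
        return str(result)

    tool_function.__name__ = name
    tool_function.__doc__ = tool_config.get("description", "")
    return tool_function
```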
Execution Model:
===============
- **JSON Arguments**: All tools accept a single 'arguments' parameter
containing JSON-encoded tool parameters
- **Async Execution**: Tools run in thread pool to prevent blocking
- **Error Handling**: Comprehensive error catching and reporting
- **Type Safety**: Proper argument parsing and validation
Duplicate Prevention:
====================
- Tracks exposed tools in self._exposed_tools set
- Prevents re-registration of already exposed tools
- Handles tool reloading scenarios gracefully
Error Recovery:
==============
- Individual tool failures don't stop the entire process
- Detailed error logging for debugging
- Continues with remaining tools if some fail to convert
Performance Optimization:
========================
- Lazy evaluation of tool schemas
- Minimal memory footprint per tool
- Efficient tool lookup and execution
- Thread pool reuse for all tool executions
Examples:
=========
Original ToolUniverse tool call:
```python
tu.run_one_function({
"name": "UniProt_get_entry_by_accession",
"arguments": {"accession": "P05067"}
})
```
Equivalent MCP tool call:
```python
await tool_function('{"accession": "P05067"}')
```
"""
if not hasattr(self.tooluniverse, "all_tools"):
self.logger.warning("No all_tools attribute in tooluniverse")
return
self.logger.info(
f"Exposing {len(self.tooluniverse.all_tools)} tools from ToolUniverse"
)
# Define tool types that should not be exposed as MCP tools
# These are internal/meta tools that are used for loading other tools
skip_tool_types = {"MCPAutoLoaderTool", "MCPClientTool"}
for i, tool_config in enumerate(self.tooluniverse.all_tools):
try:
# Debug: Check the type of tool_config
if not isinstance(tool_config, dict):
self.logger.warning(
f"tool_config at index {i} is not a dict, it's {type(tool_config)}: {tool_config}"
)
continue
tool_name = tool_config.get("name")
tool_type = tool_config.get("type")
# Skip internal/meta tools that are used for loading other tools
if tool_type in skip_tool_types:
self.logger.debug(
f"Skipping exposure of meta tool: {tool_name} (type: {tool_type})"
)
continue
if tool_name and tool_name not in self._exposed_tools:
self._create_mcp_tool_from_tooluniverse(tool_config)
self._exposed_tools.add(tool_name)
self.logger.debug(f"Exposed tool: {tool_name} (type: {tool_type})")
except Exception as e:
self.logger.error(f"Error processing tool at index {i}: {e}")
self.logger.debug(f"Tool config: {tool_config}")
continue
exposed_count = len(self._exposed_tools)
self.logger.info(f"Successfully exposed {exposed_count} tools to MCP interface")
def _add_search_tools(self):
"""
Register AI-powered tool search and discovery functionality.
This method adds sophisticated tool discovery capabilities to the SMCP server,
enabling clients to find relevant tools using natural language queries.
It provides both programmatic (MCP tool) and protocol-level (tools/find method)
interfaces for tool discovery.
Registered Tools:
================
find_tools:
Primary tool discovery interface with AI-powered search capabilities.
Parameters:
- query (str): Natural language description of desired functionality
- categories (list, optional): Tool categories to filter by
- limit (int, default=10): Maximum number of results
- use_advanced_search (bool, default=True): Use AI vs keyword search
Returns: JSON string with discovered tools and search metadata
search_tools:
Former backward-compatible alias for find_tools with identical behavior.
Its registration is currently commented out in the code below; re-enable
it only if existing integrations still depend on the old name.
Search Capabilities:
===================
AI-Powered Search (ToolFinderLLM):
- Uses Large Language Model to understand query semantics with optimized context
- Pre-filters tools using keyword matching to reduce LLM context cost
- Analyzes only essential tool information (name + description) for cost efficiency
- Provides relevance scoring and reasoning
- Handles complex queries like "analyze protein interactions in cancer"
Embedding-Based Search (Tool_RAG):
- Uses vector embeddings for semantic similarity matching
- Fast approximate matching for large tool collections
- Good balance between speed and semantic understanding
Keyword Search (Fallback):
- Simple text matching against tool names and descriptions
- Always available regardless of AI tool availability
- Provides basic but reliable tool discovery
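The keyword fallback amounts to a simple term-overlap scorer. A minimal sketch (the `keyword_search` helper is illustrative, not the module's actual implementation):
```python
def keyword_search(tools, query, limit=10):
    # Score each tool by how many query terms appear in its name/description.
    terms = query.lower().split()
    scored = []
    for tool in tools:
        text = f"{tool.get('name', '')} {tool.get('description', '')}".lower()
        score = sum(term in text for term in terms)
        if score:
            scored.append((score, tool))
    # Sort by score only, so equal-score dicts are never compared directly
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:limit]]
```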
Search Strategy:
===============
1. **Preference**: ToolFinderLLM (most intelligent, cost-optimized)
2. **Fallback**: Tool_RAG (semantic similarity)
3. **Final**: Simple keyword matching (always works)
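This preference order reduces to a small selection function (a sketch; the registered tool names follow the ones checked in _init_tool_finder):
```python
def select_search_method(available_tool_names):
    # Pick the best available finder, falling back to keyword matching.
    for candidate in ("Tool_Finder_LLM", "Tool_RAG"):
        if candidate in available_tool_names:
            return candidate
    return "keyword"  # simple matching needs no extra tools, so it always works
```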
Integration Details:
===================
- Automatically initializes available search tools during setup
- Shares search logic with tools/find MCP method
- Provides consistent results across different interfaces
- Handles tool loading and availability detection
Error Handling:
==============
- Graceful degradation when AI tools unavailable
- Informative error messages for debugging
- Fallback mechanisms ensure search always works
- Detailed logging of search method selection
Usage Examples:
==============
Via MCP tool interface:
```python
result = await find_tools(
query="protein structure prediction",
categories=["uniprot", "hpa"],
limit=5
)
```
Via tools/find MCP method:
```json
{
"method": "tools/find",
"params": {
"query": "drug interaction analysis",
"limit": 3
}
}
```
"""
# Initialize tool finder (prefer LLM-based if available, fallback to embedding-based)
self._init_tool_finder()
@self.tool()
async def find_tools(
query: str,
categories: Optional[List[str]] = None,
limit: int = 10,
use_advanced_search: bool = True,
search_method: str = "auto",
) -> str:
"""
Find and search available ToolUniverse tools using AI-powered search.
This tool provides the same functionality as the tools/find MCP method.
Args:
query: Search query describing the desired functionality
categories: Optional list of categories to filter by
limit: Maximum number of results to return (default: 10)
use_advanced_search: Use AI-powered search if available (default: True)
search_method: Specific search method - 'auto', 'llm', 'embedding', 'keyword' (default: 'auto')
Returns:
JSON string containing matching tools with detailed information
"""
return await self._perform_tool_search(
query, categories, limit, use_advanced_search, search_method
)
# # Keep the original search_tools as an alias for backward compatibility
# @self.tool()
# async def search_tools(
# query: str,
# categories: Optional[List[str]] = None,
# limit: int = 10,
# use_advanced_search: bool = True,
# search_method: str = 'auto'
# ) -> str:
# """
# Search available ToolUniverse tools (alias for find_tools).
# Args:
# query: Search query string describing the desired functionality
# categories: Optional list of categories to filter by
# limit: Maximum number of results to return
# use_advanced_search: Whether to use AI-powered tool finder
# search_method: Specific search method - 'auto', 'llm', 'embedding', 'keyword' (default: 'auto')
# Returns:
# JSON string containing matching tools information
# """
# return await self._perform_tool_search(query, categories, limit, use_advanced_search, search_method)
def _init_tool_finder(self):
"""
Initialize intelligent tool discovery system with automatic fallback.
This method sets up the tool finder infrastructure that powers AI-driven
tool discovery. It implements a tiered approach, trying the most advanced
search methods first and falling back to simpler methods if needed.
Initialization Strategy:
=======================
Phase 1 - Detection:
Scans loaded ToolUniverse tools to identify available search tools:
- ToolFinderLLM: Advanced LLM-based semantic search
- Tool_RAG: Embedding-based similarity search
Phase 2 - Loading (if needed):
If no search tools are found, attempts to load them:
- Loads the 'tool_finder' category, which bundles ToolFinderLLM,
Tool_RAG, and ToolFinderKeyword
- Re-scans for available tools after loading
Phase 3 - Selection:
Selects the best available search method:
1. ToolFinderLLM (preferred - most intelligent)
2. Tool_RAG (fallback - good semantic understanding)
3. Simple keyword search (always available)
Tool Finder Capabilities:
========================
ToolFinderLLM:
- Uses GPT-4 or similar LLM for query understanding
- Analyzes tool descriptions for semantic matching
- Provides relevance scoring and selection reasoning
- Handles complex, multi-faceted queries effectively
- Best for: "Find tools to analyze protein-drug interactions in cancer research"
Tool_RAG:
- Uses pre-computed embeddings for fast similarity search
- Good semantic understanding without LLM overhead
- Faster than LLM-based search for simple queries
- Best for: "protein analysis", "drug discovery"
Simple Search:
- Basic keyword matching against names and descriptions
- No dependencies, always available
- Fast and reliable for exact term matches
- Best for: "chembl", "uniprot", "fda"
Configuration Management:
========================
Sets instance attributes:
- tool_finder_available (bool): Whether advanced search is available
- tool_finder_type (str): Name of the search tool selected ("Tool_Finder_LLM" | "Tool_RAG")
Error Handling:
==============
- Handles missing dependencies gracefully
- Provides informative console output about search capabilities
- Ensures search functionality always works (via simple fallback)
- Logs detailed information for debugging
Performance Considerations:
==========================
- Tool loading only happens if search tools aren't already available
- Search tool detection is cached to avoid repeated scans
- ToolFinderLLM requires network access and API keys
- Tool_RAG requires embedding files but works offline
Dependencies:
============
- ToolFinderLLM: Requires OpenAI API access or compatible endpoint
- Tool_RAG: Requires sentence-transformers and embedding data
- Simple search: No external dependencies
"""
self.tool_finder_available = False
self.tool_finder_type = None
# Check if ToolFinderLLM is available in loaded tools
try:
all_tools = self.tooluniverse.return_all_loaded_tools()
available_tool_names = [tool.get("name", "") for tool in all_tools]
# Try ToolFinderLLM first (more advanced)
if "Tool_Finder_LLM" in available_tool_names:
self.tool_finder_available = True
self.tool_finder_type = "Tool_Finder_LLM"
self.logger.info(
"✅ Tool_Finder_LLM (cost-optimized) available for advanced search"
)
return
# Fallback to Tool_RAG (embedding-based)
if "Tool_RAG" in available_tool_names:
self.tool_finder_available = True
self.tool_finder_type = "Tool_RAG"
self.logger.info(
"✅ Tool_RAG (embedding-based) available for advanced search"
)
return
# Check if ToolFinderKeyword is available for simple search
if "Tool_Finder_Keyword" in available_tool_names:
self.logger.info("✅ ToolFinderKeyword available for simple search")
self.logger.warning("⚠️ No advanced tool finders available in loaded tools")
self.logger.debug(
f"Available tools: {available_tool_names[:5]}..."
) # Show first 5 tools
except Exception as e:
self.logger.warning(f"⚠️ Failed to check for tool finders: {e}")
# Try to load tool finder tools if not already loaded
try:
self.logger.debug("🔄 Attempting to load tool finder tools...")
# Load tool_finder category which includes ToolFinderLLM, Tool_RAG, and ToolFinderKeyword
self.tooluniverse.load_tools(tool_type=["tool_finder"])
# Re-check availability
all_tools = self.tooluniverse.return_all_loaded_tools()
available_tool_names = [tool.get("name", "") for tool in all_tools]
if "Tool_Finder_LLM" in available_tool_names:
self.tool_finder_available = True
self.tool_finder_type = "Tool_Finder_LLM"
self.logger.info(
"✅ Successfully loaded Tool_Finder_LLM for advanced search"
)
elif "Tool_RAG" in available_tool_names:
self.tool_finder_available = True
self.tool_finder_type = "Tool_RAG"
self.logger.info("✅ Successfully loaded Tool_RAG for advanced search")
else:
self.logger.warning("⚠️ Failed to load any advanced tool finder tools")
# Check if ToolFinderKeyword is available for simple search
if "Tool_Finder_Keyword" in available_tool_names:
self.logger.info("✅ Tool_Finder_Keyword available for simple search")
else:
self.logger.warning(
"⚠️ ToolFinderKeyword not available, using fallback search"
)
except Exception as e:
self.logger.warning(f"⚠️ Failed to load tool finder tools: {e}")
self.logger.info(
"📝 Advanced search will not be available, using simple keyword search only"
)
def _add_utility_tools(self):
"""
Register essential server management and diagnostic tools.
This method adds a suite of utility tools that provide server introspection,
tool management, and direct execution capabilities. These tools are essential
for monitoring server health, understanding available capabilities, and
providing administrative functionality.
Registered Utility Tools:
========================
get_server_info:
Comprehensive server status and capability reporting.
Returns detailed JSON with:
- Server identification (name, type, version info)
- Tool statistics (total tools, exposed tools, categories)
- Feature flags (search enabled, FastMCP status)
- Resource usage (max workers, thread pool status)
Use cases:
- Health checks and monitoring
- Capability discovery by clients
- Debugging server configuration issues
execute_tooluniverse_function:
Direct interface for executing ToolUniverse functions with custom parameters.
Parameters:
- function_name (str): Name of the ToolUniverse tool to execute
- arguments (str): JSON string containing tool parameters
Features:
- Bypasses MCP tool wrappers for direct execution
- Supports any loaded ToolUniverse tool
- Provides detailed error reporting
- Uses thread pool for non-blocking execution
Use cases:
- Administrative tool execution
- Debugging tool behavior
- Custom automation scripts
list_available_tooluniverse_tools:
Comprehensive inventory of all available ToolUniverse tools.
Returns:
- Complete tool catalog with names, descriptions, types
- Parameter schemas and requirements for each tool
- Tool statistics and categorization
Use cases:
- Tool discovery and exploration
- Documentation generation
- Client capability mapping
- Integration planning
Implementation Details:
======================
Error Handling:
- Each tool includes comprehensive try-catch blocks
- Detailed error messages with context information
- Graceful degradation when tools or data unavailable
- JSON-formatted error responses for consistency
Thread Safety:
- All tools use async execution patterns
- Thread pool executor for CPU-intensive operations
- Proper resource cleanup and management
- Non-blocking I/O for network operations
Security Considerations:
- execute_tooluniverse_function provides direct tool access
- JSON parsing with proper validation
- No file system access beyond ToolUniverse scope
- Appropriate error message sanitization
Performance Optimization:
- Lazy loading of tool information
- Caching where appropriate
- Minimal memory footprint
- Efficient JSON serialization
Examples:
=========
Server health check:
```python
info = await get_server_info()
status = json.loads(info)
if status['total_tooluniverse_tools'] > 0:
# Server healthy
pass
```
Direct tool execution:
```python
result = await execute_tooluniverse_function(
function_name="UniProt_get_entry_by_accession",
arguments='{"accession": "P05067"}'
)
```
Tool inventory:
```python
tools = await list_available_tooluniverse_tools()
catalog = json.loads(tools)
print(f"Available: {catalog['total_tools']} tools")
```
"""
@self.tool()
async def get_server_info() -> str:
"""
Get information about the SMCP server and its capabilities.
Returns:
JSON string containing server information
"""
try:
info = {
"server_name": self.name,
"server_type": "SMCP (Scientific Model Context Protocol)",
"fastmcp_available": FASTMCP_AVAILABLE,
"tooluniverse_loaded": hasattr(self, "tooluniverse"),
"total_exposed_tools": len(self._exposed_tools),
"search_enabled": self.search_enabled,
"max_workers": self.max_workers,
"tool_categories_loaded": len(
getattr(self.tooluniverse, "tool_category_dicts", {})
),
"total_tooluniverse_tools": len(
getattr(self.tooluniverse, "all_tools", [])
),
}
return json.dumps(info, indent=2)
except Exception as e:
return f"Error getting server info: {str(e)}"
@self.tool()
async def execute_tooluniverse_function(
function_name: str, arguments: str
) -> str:
"""
Execute a ToolUniverse function directly with custom arguments.
Args:
function_name: Name of the ToolUniverse function to execute
arguments: JSON string of arguments to pass to the function
Returns:
Function execution result
"""
try:
# Parse arguments from JSON string (json is imported at module level)
if isinstance(arguments, str):
parsed_args = json.loads(arguments)
else:
parsed_args = arguments
function_call = {"name": function_name, "arguments": parsed_args}
# Execute in thread pool without blocking the event loop
loop = asyncio.get_running_loop()
result = await loop.run_in_executor(
self.executor, self.tooluniverse.run_one_function, function_call
)
return str(result)
except Exception as e:
return f"Error executing {function_name}: {str(e)}"
@self.tool()
async def list_available_tooluniverse_tools() -> str:
"""
List all available ToolUniverse tools that can be executed.
Returns:
JSON string containing available tools and their descriptions
"""
try:
# Check if ToolUniverse has loaded tools
if (
not hasattr(self.tooluniverse, "all_tools")
or not self.tooluniverse.all_tools
):
return json.dumps({"error": "No ToolUniverse tools loaded"})
tools_info = []
for tool_config in self.tooluniverse.all_tools:
if isinstance(tool_config, dict):
tool_info = {
"name": tool_config.get("name", "Unknown"),
"description": tool_config.get(
"description", "No description available"
),
"type": tool_config.get("type", "Unknown"),
"parameter_schema": tool_config.get("parameter", {}),
}
tools_info.append(tool_info)
return json.dumps(
{"total_tools": len(tools_info), "tools": tools_info}, indent=2
)
except Exception as e:
return json.dumps({"error": f"Error listing tools: {str(e)}"}, indent=2)
def add_custom_tool(
self, name: str, function: Callable, description: Optional[str] = None, **kwargs
):
"""
Add a custom Python function as an MCP tool to the SMCP server.
This method provides a convenient way to extend SMCP functionality with
custom tools beyond those provided by ToolUniverse. Custom tools are
automatically integrated into the MCP interface and can be discovered
and used by clients alongside existing tools.
Parameters:
===========
name : str
Unique name for the tool in the MCP interface. Should be descriptive
and follow naming conventions (lowercase with underscores preferred).
Examples: "analyze_protein_sequence", "custom_data_processor"
function : Callable
Python function to execute when the tool is called. The function:
- Can be synchronous or asynchronous
- Should have proper type annotations for parameters
- Should include a comprehensive docstring
- Will be automatically wrapped for MCP compatibility
description : str, optional
Human-readable description of the tool's functionality. If provided,
this will be set as the function's __doc__ attribute. If None, the
function's existing docstring will be used.
**kwargs
Additional FastMCP tool configuration options:
- parameter_schema: Custom JSON schema for parameters
- return_schema: Schema for return values
- examples: Usage examples for the tool
- tags: Categorization tags
Returns:
========
Callable
The decorated function registered with FastMCP framework.
Usage Examples:
==============
Simple synchronous function:
```python
def analyze_text(text: str, max_length: int = 100) -> str:
'''Analyze text and return summary.'''
return text[:max_length] + "..." if len(text) > max_length else text
server.add_custom_tool(
name="text_analyzer",
function=analyze_text,
description="Analyze and summarize text content"
)
```
Asynchronous function with complex parameters:
```python
async def process_data(
data: List[Dict[str, Any]],
processing_type: str = "standard"
) -> Dict[str, Any]:
'''Process scientific data with specified method.'''
# Custom processing logic here
return {"processed_items": len(data), "type": processing_type}
server.add_custom_tool(
name="data_processor",
function=process_data
)
```
Function with custom schema:
```python
def calculate_score(values: List[float]) -> float:
'''Calculate composite score from values.'''
return sum(values) / len(values) if values else 0.0
server.add_custom_tool(
name="score_calculator",
function=calculate_score,
parameter_schema={
"type": "object",
"properties": {
"values": {
"type": "array",
"items": {"type": "number"},
"description": "List of numeric values to process"
}
},
"required": ["values"]
}
)
```
Integration with ToolUniverse:
=============================
Custom tools work seamlessly alongside ToolUniverse tools:
- Appear in tool discovery searches
- Follow same calling conventions
- Include in server diagnostics and listings
- Support all MCP client interaction patterns
Best Practices:
==============
- Use descriptive, unique tool names
- Include comprehensive docstrings
- Add proper type annotations for parameters
- Handle errors gracefully within the function
- Consider async functions for I/O-bound operations
- Test tools thoroughly before deployment
Notes:
======
- Custom tools are registered immediately upon addition
- Tools can be added before or after server startup
- Function signature determines parameter schema automatically
- Custom tools support all FastMCP features and conventions
"""
if description:
function.__doc__ = description
# Use FastMCP's tool decorator
decorated_function = self.tool(name=name, **kwargs)(function)
return decorated_function
async def close(self):
"""
Perform comprehensive cleanup and resource management during server shutdown.
This method ensures graceful shutdown of the SMCP server by properly cleaning
up all resources, stopping background tasks, and releasing system resources.
It's designed to be safe to call multiple times and handles errors gracefully.
Cleanup Operations:
==================
**Thread Pool Shutdown:**
- Gracefully stops the ThreadPoolExecutor used for tool execution
- Blocks until currently running tasks complete (shutdown uses wait=True)
- Prevents new tasks from being submitted
**Resource Cleanup:**
- Worker threads and their resources are released with the pool
- Remaining data (caches, tool configurations) is reclaimed by normal
garbage collection once the server object is dropped
**Error Handling:**
- Continues cleanup even if individual operations fail
- Logs cleanup errors for debugging without raising exceptions
- Ensures critical resources are always released
Usage Patterns:
==============
**Automatic Cleanup (Recommended):**
```python
server = SMCP("My Server")
try:
server.run_simple() # Cleanup happens automatically on exit
except KeyboardInterrupt:
pass # run_simple() handles cleanup
```
**Manual Cleanup:**
```python
server = SMCP("My Server")
try:
# Custom server logic here
pass
finally:
await server.close() # Explicit cleanup
```
**Context Manager Pattern:**
```python
async with SMCP("My Server") as server:
# Server operations
pass
# Cleanup happens automatically
```
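The context-manager pattern assumes async-context support wired to close(). A minimal sketch of how such support can look (ManagedServer is an illustrative stand-in, not this class):
```python
import asyncio

class ManagedServer:
    # Minimal stand-in showing how close() can back an async context manager.

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True  # release thread pools, connections, etc.

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        await self.close()
        return False  # never suppress exceptions from the with-body
```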
Performance Considerations:
==========================
- Cleanup operations are typically fast (< 1 second)
- Thread pool shutdown may take longer if tasks are running
- Network connections are closed immediately
- Memory cleanup depends on garbage collection
Error Recovery:
==============
- Individual cleanup failures don't stop the overall process
- Critical errors are logged but don't raise exceptions
- Cleanup is idempotent - safe to call multiple times
- System resources are guaranteed to be released
Notes:
======
- This method is called automatically by run_simple() on shutdown
- Can be called manually for custom server lifecycle management
- Async method to properly handle async resource cleanup
- Safe to call even if server hasn't been fully initialized
"""
try:
# Shutdown thread pool
self.executor.shutdown(wait=True)
except Exception:
pass
def run_simple(
self,
transport: Literal["stdio", "http", "sse"] = "http",
host: str = "0.0.0.0",
port: int = 7000,
**kwargs,
):
"""
Start the SMCP server with simplified configuration and automatic setup.
This method provides a convenient way to launch the SMCP server with sensible
defaults for different deployment scenarios. It handles transport configuration,
logging setup, and graceful shutdown automatically.
Parameters:
===========
transport : {"stdio", "http", "sse"}, default "http"
Communication transport protocol:
- "stdio": Standard input/output communication
* Best for: Command-line tools, subprocess integration
* Pros: Low overhead, simple integration
* Cons: Single client, no network access
- "http": HTTP-based communication (streamable-http)
* Best for: Web applications, REST API integration
* Pros: Wide compatibility, stateless, scalable
* Cons: Higher overhead than stdio
- "sse": Server-Sent Events over HTTP
* Best for: Real-time applications, streaming responses
* Pros: Real-time communication, web-compatible
* Cons: Browser limitations, more complex
host : str, default "0.0.0.0"
Server bind address for HTTP/SSE transports:
- "0.0.0.0": Listen on all network interfaces (default)
- "127.0.0.1": localhost only (more secure)
- Specific IP: Bind to particular interface
port : int, default 7000
Server port for HTTP/SSE transports. Choose ports:
- 7000-7999: Recommended range for SMCP servers
- Above 1024: No root privileges required
- Check availability: Ensure port isn't already in use
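Availability can be verified with the standard library before startup (a small sketch; port_is_free is an illustrative helper, not part of this module):
```python
import socket

def port_is_free(host: str, port: int) -> bool:
    # Return True if a TCP socket can be bound to (host, port).
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            sock.bind((host, port))
            return True
        except OSError:
            return False
```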
**kwargs
Additional arguments passed to FastMCP's run() method:
- debug (bool): Enable debug logging
- access_log (bool): Log client requests
- workers (int): Number of worker processes (HTTP only)
Server Startup Process:
======================
1. **Initialization Summary**: Displays server configuration and capabilities
2. **Transport Setup**: Configures selected communication method
3. **Service Start**: Begins listening for client connections
4. **Graceful Shutdown**: Handles interrupts and cleanup
Deployment Scenarios:
====================
Development & Testing:
```python
server = SMCP(name="Dev Server")
server.run_simple(transport="stdio") # For CLI testing
```
Local Web Service:
```python
server = SMCP(name="Local API")
server.run_simple(transport="http", host="127.0.0.1", port=8000)
```
Production Service:
```python
server = SMCP(
name="Production SMCP",
tool_categories=["ChEMBL", "uniprot", "opentarget"],
max_workers=20
)
server.run_simple(
transport="http",
host="0.0.0.0",
port=7000,
workers=4
)
```
Real-time Applications:
```python
server = SMCP(name="Streaming API")
server.run_simple(transport="sse", port=7001)
```
Error Handling:
==============
- **KeyboardInterrupt**: Graceful shutdown on Ctrl+C
- **Port in Use**: Clear error message with suggestions
- **Transport Errors**: Detailed debugging information
- **Cleanup**: Automatic resource cleanup on exit
Logging Output:
==============
Provides informative startup messages:
```
🚀 Starting SMCP server 'My Server'...
📊 Loaded 356 tools from ToolUniverse
🔍 Search enabled: True
🌐 Server running on http://0.0.0.0:7000
```
Security Considerations:
=======================
- Use host="127.0.0.1" for local-only access
- Configure firewall rules for production deployment
- Consider HTTPS termination with reverse proxy
- Validate all client inputs through MCP protocol
Performance Notes:
=================
- HTTP transport supports multiple concurrent clients
- stdio transport is single-client but lower latency
- SSE transport streams events server-to-client in real time (client requests still arrive over HTTP)
- Thread pool size affects concurrent tool execution capacity
"""
self.logger.info(f"🚀 Starting SMCP server '{self.name}'...")
self.logger.info(
f"📊 Loaded {len(self._exposed_tools)} tools from ToolUniverse"
)
self.logger.info(f"🔍 Search enabled: {self.search_enabled}")
# Log hook configuration
if self.hooks_enabled or self.hook_type:
if self.hook_type:
self.logger.info(f"🔗 Hooks enabled: {self.hook_type}")
elif self.hook_config:
hook_count = len(self.hook_config.get("hooks", []))
self.logger.info(f"🔗 Hooks enabled: {hook_count} custom hooks")
else:
self.logger.info("🔗 Hooks enabled: default configuration")
else:
self.logger.info("🔗 Hooks disabled")
try:
if transport == "stdio":
self.run(transport="stdio", **kwargs)
elif transport == "http":
self.run(transport="streamable-http", host=host, port=port, **kwargs)
elif transport == "sse":
self.run(transport="sse", host=host, port=port, **kwargs)
else:
raise ValueError(f"Unsupported transport: {transport}")
except KeyboardInterrupt:
self.logger.info("\n🛑 Server stopped by user")
except Exception as e:
self.logger.error(f"❌ Server error: {e}")
finally:
# Cleanup
asyncio.run(self.close())
def _create_mcp_tool_from_tooluniverse(self, tool_config: Dict[str, Any]):
"""Create an MCP tool from a ToolUniverse tool configuration.
This method creates a function with proper parameter signatures that match
the ToolUniverse tool schema, enabling FastMCP's automatic parameter validation.
"""
try:
# Debug: Ensure tool_config is a dictionary
if not isinstance(tool_config, dict):
raise ValueError(
f"tool_config must be a dictionary, got {type(tool_config)}: {tool_config}"
)
tool_name = tool_config["name"]
description = tool_config.get(
"description", f"ToolUniverse tool: {tool_name}"
)
parameters = tool_config.get("parameter", {})
# Extract parameter information from the schema
# Handle case where properties might be None (like in Finish tool)
properties = parameters.get("properties")
if properties is None:
properties = {}
required_params = parameters.get("required", [])
# Handle non-standard schema format where 'required' is set on individual properties
# instead of at the object level (common in ToolUniverse schemas)
if not required_params and properties:
required_params = [
param_name
for param_name, param_info in properties.items()
if param_info.get("required", False)
]
# Build function signature dynamically with Pydantic Field support
import inspect
from typing import Annotated
from pydantic import Field
# Create parameter signature for the function
func_params = []
param_annotations = {}
for param_name, param_info in properties.items():
param_type = param_info.get("type", "string")
param_description = param_info.get(
"description", f"{param_name} parameter"
)
is_required = param_name in required_params
# Map JSON schema types to Python types and create appropriate Field
field_kwargs = {"description": param_description}
# Scalar JSON Schema types map directly to Python types; Pydantic infers
# their schemas, so no json_schema_extra is needed
if param_type == "string":
python_type = str
elif param_type == "integer":
python_type = int
elif param_type == "number":
python_type = float
elif param_type == "boolean":
python_type = bool
elif param_type == "array":
python_type = list
# Add array-specific schema information only for complex cases
items_info = param_info.get("items", {})
if items_info:
# Clean up items definition - remove invalid fields
cleaned_items = items_info.copy()
# Remove 'required' field from items (not valid in JSON Schema for array items)
if "required" in cleaned_items:
cleaned_items.pop("required")
field_kwargs["json_schema_extra"] = {
"type": "array",
"items": cleaned_items,
}
else:
# If no items specified, default to string items
field_kwargs["json_schema_extra"] = {
"type": "array",
"items": {"type": "string"},
}
elif param_type == "object":
python_type = dict
# Add object-specific schema information
object_props = param_info.get("properties", {})
if object_props:
# Clean up the nested object properties - fix common schema issues
cleaned_props = {}
nested_required = []
for prop_name, prop_info in object_props.items():
cleaned_prop = prop_info.copy()
# Fix string "True"/"False" in required field (common ToolUniverse issue)
if "required" in cleaned_prop:
req_value = cleaned_prop.pop("required")
if req_value in ["True", "true", True]:
nested_required.append(prop_name)
# Remove the individual required field as it should be at object level
cleaned_props[prop_name] = cleaned_prop
# Create proper JSON schema for nested object
object_schema = {"type": "object", "properties": cleaned_props}
# Add required array at object level if there are required fields
if nested_required:
object_schema["required"] = nested_required
field_kwargs["json_schema_extra"] = object_schema
else:
# Non-standard type: default to str and preserve the declared type in the
# schema (all standard JSON Schema types are handled in the branches above)
python_type = str
field_kwargs["json_schema_extra"] = {"type": param_type}
# Create Pydantic Field with enhanced schema information
pydantic_field = Field(**field_kwargs)
if is_required:
# Required parameter with description and schema info
annotated_type = Annotated[python_type, pydantic_field]
param_annotations[param_name] = annotated_type
func_params.append(
inspect.Parameter(
param_name,
inspect.Parameter.POSITIONAL_OR_KEYWORD,
annotation=annotated_type,
)
)
else:
# Optional parameter with description, schema info and default value
annotated_type = Annotated[
Union[python_type, type(None)], pydantic_field
]
param_annotations[param_name] = annotated_type
func_params.append(
inspect.Parameter(
param_name,
inspect.Parameter.POSITIONAL_OR_KEYWORD,
default=None,
annotation=annotated_type,
)
)
# Create the async function with dynamic signature
if not properties:
# Tool has no parameters - create simple function
async def dynamic_tool_function() -> str:
"""Execute ToolUniverse tool with no arguments."""
try:
# Prepare function call with empty arguments
function_call = {"name": tool_name, "arguments": {}}
# Execute in thread pool to avoid blocking
loop = asyncio.get_running_loop()
result = await loop.run_in_executor(
self.executor,
self.tooluniverse.run_one_function,
function_call,
)
# Format the result
if isinstance(result, str):
return result
else:
return json.dumps(result, indent=2, default=str)
except Exception as e:
error_msg = f"Error executing {tool_name}: {str(e)}"
self.logger.error(error_msg)
return json.dumps({"error": error_msg}, indent=2)
# Set function metadata
dynamic_tool_function.__name__ = tool_name
dynamic_tool_function.__signature__ = inspect.Signature([])
dynamic_tool_function.__annotations__ = {"return": str}
else:
# Tool has parameters - create function with dynamic signature
async def dynamic_tool_function(**kwargs) -> str:
"""Execute ToolUniverse tool with provided arguments."""
try:
# Filter out None values for optional parameters
args_dict = {k: v for k, v in kwargs.items() if v is not None}
# Validate required parameters
missing_required = [
param for param in required_params if param not in args_dict
]
if missing_required:
return json.dumps(
{
"error": f"Missing required parameters: {missing_required}",
"required": required_params,
"provided": list(args_dict.keys()),
},
indent=2,
)
# Prepare function call
function_call = {"name": tool_name, "arguments": args_dict}
# Execute in thread pool to avoid blocking
loop = asyncio.get_running_loop()
result = await loop.run_in_executor(
self.executor,
self.tooluniverse.run_one_function,
function_call,
)
# Format the result
if isinstance(result, str):
return result
else:
return json.dumps(result, indent=2, default=str)
except Exception as e:
error_msg = f"Error executing {tool_name}: {str(e)}"
self.logger.error(error_msg)
return json.dumps({"error": error_msg}, indent=2)
# Set function metadata
dynamic_tool_function.__name__ = tool_name
# Set function signature dynamically for tools with parameters
if func_params:
dynamic_tool_function.__signature__ = inspect.Signature(func_params)
# Set annotations for type hints
dynamic_tool_function.__annotations__ = param_annotations.copy()
dynamic_tool_function.__annotations__["return"] = str
# Create detailed docstring for internal use, but use clean description for FastMCP
param_docs = []
for param_name, param_info in properties.items():
param_desc = param_info.get("description", f"{param_name} parameter")
param_type = param_info.get("type", "string")
is_required = param_name in required_params
required_text = "required" if is_required else "optional"
param_docs.append(
f" {param_name} ({param_type}, {required_text}): {param_desc}"
)
# Attach a docstring that includes the parameter documentation (internal use)
param_section = "\n".join(param_docs) if param_docs else "    (none)"
dynamic_tool_function.__doc__ = (
f"{description}\n\nParameters:\n{param_section}\n\n"
f"Returns:\n    str: Tool execution result\n"
)
# Register with FastMCP using explicit description (clean, without parameter list)
self.tool(description=description)(dynamic_tool_function)
except Exception as e:
self.logger.error(f"Error creating MCP tool from config: {e}")
self.logger.debug(f"Tool config: {tool_config}")
# Don't raise - continue with other tools
return
# Convenience function for quick server creation
def create_smcp_server(
name: str = "SMCP Server",
tool_categories: Optional[List[str]] = None,
search_enabled: bool = True,
**kwargs,
) -> SMCP:
"""
Create a configured SMCP server with common defaults and best practices.
This convenience function simplifies SMCP server creation by providing
sensible defaults for common use cases while still allowing full customization
through additional parameters.
Parameters:
===========
name : str, default "SMCP Server"
Human-readable server name used in logs and server identification.
Choose descriptive names like:
- "Scientific Research API"
- "Drug Discovery Server"
- "Proteomics Analysis Service"
tool_categories : list of str, optional
Specific ToolUniverse categories to load. If None, loads all available
tools (350+ tools). Common category combinations:
Scientific Research:
["ChEMBL", "uniprot", "opentarget", "pubchem", "hpa"]
Drug Discovery:
["ChEMBL", "fda_drug_label", "clinical_trials", "pubchem"]
Literature Analysis:
["EuropePMC", "semantic_scholar", "pubtator", "agents"]
Minimal Setup:
["tool_finder_llm", "special_tools"]
search_enabled : bool, default True
Enable AI-powered tool discovery via tools/find method.
Recommended to keep enabled unless you have specific performance
requirements or want to minimize dependencies.
**kwargs
Additional SMCP configuration options:
- tooluniverse_config: Pre-configured ToolUniverse instance
- auto_expose_tools (bool, default True): Auto-expose ToolUniverse tools
- max_workers (int, default 5): Thread pool size for tool execution
- Any FastMCP server options (debug, logging, etc.)
Returns:
========
SMCP
Fully configured SMCP server instance ready to run.
Usage Examples:
==============
Quick Start (all tools):
```python
server = create_smcp_server("Research Server")
server.run_simple()
```
Focused Server (specific domains):
```python
server = create_smcp_server(
name="Drug Discovery API",
tool_categories=["ChEMBL", "fda_drug_label", "clinical_trials"],
max_workers=10
)
server.run_simple(port=8000)
```
Custom Configuration:
```python
server = create_smcp_server(
name="High-Performance Server",
search_enabled=True,
max_workers=20,
debug=True
)
server.run_simple(transport="http", host="0.0.0.0", port=7000)
```
Pre-configured ToolUniverse:
```python
tu = ToolUniverse()
tu.load_tools(tool_type=["uniprot", "ChEMBL"])
server = create_smcp_server(
name="Protein-Drug Server",
tooluniverse_config=tu,
search_enabled=True
)
```
Benefits of Using This Function:
===============================
- **Simplified Setup**: Reduces boilerplate code for common configurations
- **Best Practices**: Applies recommended settings automatically
- **Consistent Naming**: Encourages good server naming conventions
- **Future-Proof**: Will include new recommended defaults in future versions
- **Documentation**: Provides clear examples and guidance
Equivalent Manual Configuration:
===============================
This function is equivalent to:
```python
server = SMCP(
name=name,
tool_categories=tool_categories,
search_enabled=search_enabled,
auto_expose_tools=True,
max_workers=5,
**kwargs
)
```
When to Use Manual Configuration:
================================
- Need precise control over all initialization parameters
- Using custom ToolUniverse configurations
- Implementing custom MCP methods or tools
- Advanced deployment scenarios with specific requirements
"""
return SMCP(
name=name,
tool_categories=tool_categories,
search_enabled=search_enabled,
**kwargs,
)