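Tool Composition Tutorial
Combine ToolUniverse's 600+ tools into powerful scientific workflows
Overview
Tool composition is the practice of combining individual scientific tools into sophisticated research workflows. ToolUniverse's Tool Composer integrates tools with heterogeneous backends to build end-to-end workflows. Using the Tool Caller for direct in-code execution, the Tool Composer generates a container function that exposes both the Tool Caller and ToolUniverse as inline, executable primitives.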
Individual Tools → Composed Workflows → Research Solutions
Example: Literature Search & Summary Tool
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│EuropePMC │ │ Literature │ │ Research │
│OpenAlex │ → │ Search & │ → │ Summary │
│PubTator │ │ Summary │ │ Generated │
│AI Reviewer │ │ Tool │ │ │
└─────────────┘ └─────────────┘ └─────────────┘
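Benefits of tool composition:
- Complex research: solve multi-step problems that no single tool can handle
- Workflow reuse: build repeatable research pipelines for common tasks
- Automation: reduce manual coordination between tools
- Quality control: embed validation and expert review at key steps
- Heterogeneous integration: combine tools that run on different backends
- Agentic loops: support adaptive, multi-step experimental analysis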
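Tool Composer Architecture
The Tool Composer generates a container function that serves as the execution backbone for complex workflows. The container function is implemented as compose(arguments, tooluniverse, call_tool) and holds the logic that coordinates different types of tools so they can work together in a single workflow.
Container function components:
arguments: the tool call arguments, which follow the ToolUniverse interaction protocol schema
tooluniverse: an instance of ToolUniverse that gives access to everything ToolUniverse supports
call_tool: the callable Tool Caller interface, which abstracts the invocation of individual tools in ToolUniverse
Execution patterns:
- Chaining: feed the output of one tool into the next tool as input
- Broadcasting: call multiple tools with a single query
- Agentic loops: use agentic tools to generate function calls, execute tools, and incorporate tool feedback for multi-step experimental analysis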
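How Tool Composition Works
ToolUniverse's ComposeTool system is configuration-driven:
Configuration file: define compose tools in a JSON file such as compose_tools.json
Implementation script: write a Python script in the compose environment: def compose(arguments, tooluniverse, call_tool): …
Automatic loading: ComposeTool loads dependencies and executes the workflow automatically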
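The three pieces above can be pictured as follows. This is a minimal sketch, not ToolUniverse's actual loader: load_compose_function and run_composed are hypothetical names, and the only thing taken from the text is that a configuration entry names a script (composition_file) and a function (composition_function).

```python
import importlib.util
from pathlib import Path


def load_compose_function(script_dir, config):
    """Import the script named in the config and return its compose function.

    Hypothetical helper for illustration; ToolUniverse's own loader may differ.
    """
    script_path = Path(script_dir) / config["composition_file"]
    spec = importlib.util.spec_from_file_location(script_path.stem, str(script_path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the script so its functions exist
    return getattr(module, config.get("composition_function", "compose"))


def run_composed(script_dir, config, arguments, tooluniverse, call_tool):
    """Resolve the compose function and call it with the three standard arguments."""
    compose_fn = load_compose_function(script_dir, config)
    return compose_fn(arguments, tooluniverse, call_tool)
```

The point of the sketch is the division of labor: the JSON entry says where the logic lives, and the script only needs to expose a function with the three-argument signature.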
Creating Your First Compose Tool
Let's create a literature search and summary tool as an example:
Step 1: Create the Implementation Script
Create the file src/tooluniverse/compose_scripts/literature_tool.py:
"""
Literature Search & Summary Tool
Minimal compose tool perfect for paper screenshots
"""

def compose(arguments, tooluniverse, call_tool):
    """Search literature and generate summary"""
    topic = arguments['research_topic']
    literature = {}
    literature['pmc'] = call_tool('EuropePMC_search_articles', {'query': topic, 'limit': 5})
    literature['openalex'] = call_tool('openalex_literature_search', {'search_keywords': topic, 'max_results': 5})
    literature['pubtator'] = call_tool('PubTator3_LiteratureSearch', {'text': topic, 'page_size': 5})
    summary = call_tool('MedicalLiteratureReviewer', {
        'research_topic': topic, 'literature_content': str(literature),
        'focus_area': 'key findings', 'study_types': 'all studies',
        'quality_level': 'all evidence', 'review_scope': 'rapid review'
    })
    return summary
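Because compose receives call_tool as a parameter, the script can be tried out without network access by passing in a stand-in. The sketch below repeats the compose function so that it runs on its own; fake_call_tool and its canned responses are hypothetical and do not reflect the real tools' output formats.

```python
def compose(arguments, tooluniverse, call_tool):
    """Same logic as literature_tool.py, repeated so this sketch is self-contained."""
    topic = arguments['research_topic']
    literature = {}
    literature['pmc'] = call_tool('EuropePMC_search_articles', {'query': topic, 'limit': 5})
    literature['openalex'] = call_tool('openalex_literature_search', {'search_keywords': topic, 'max_results': 5})
    literature['pubtator'] = call_tool('PubTator3_LiteratureSearch', {'text': topic, 'page_size': 5})
    summary = call_tool('MedicalLiteratureReviewer', {
        'research_topic': topic, 'literature_content': str(literature),
        'focus_area': 'key findings', 'study_types': 'all studies',
        'quality_level': 'all evidence', 'review_scope': 'rapid review'
    })
    return summary


calls = []  # every (tool_name, arguments) pair the stub receives


def fake_call_tool(name, tool_args):
    """Hypothetical stand-in for the Tool Caller: records the call, returns canned data."""
    calls.append((name, tool_args))
    if name == 'MedicalLiteratureReviewer':
        return {'summary': f"stub review of {tool_args['research_topic']}"}
    return {'results': [f"{name} placeholder hit"]}


result = compose({'research_topic': 'CRISPR off-target effects'}, None, fake_call_tool)
print([name for name, _ in calls])  # order in which tools were invoked
print(result)
```

A dry run like this checks the wiring (tool names, argument keys, call order) before any real service is contacted.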
Step 2: Add the Configuration to compose_tools.json
Add the following configuration to src/tooluniverse/data/compose_tools.json:
{
    "type": "ComposeTool",
    "name": "LiteratureSearchTool",
    "description": "Comprehensive literature search and summary tool that searches multiple databases (EuropePMC, OpenAlex, PubTator) and generates AI-powered summaries of research findings",
    "parameter": {
        "type": "object",
        "properties": {
            "research_topic": {
                "type": "string",
                "description": "The research topic or query to search for in the literature"
            }
        },
        "required": ["research_topic"]
    },
    "auto_load_dependencies": true,
    "fail_on_missing_tools": false,
    "required_tools": [
        "EuropePMC_search_articles",
        "openalex_literature_search",
        "PubTator3_LiteratureSearch",
        "MedicalLiteratureReviewer"
    ],
    "composition_file": "literature_tool.py",
    "composition_function": "compose"
}
Step 3: Use the Compose Tool
Once configured, your compose tool can be used like any other ToolUniverse tool:
from tooluniverse import ToolUniverse
# Initialize ToolUniverse
tu = ToolUniverse()
# Load compose tools
tu.load_tools(['compose_tools'])
# Use your literature search tool
result = tu.run({"name": "LiteratureSearchTool", "arguments": {'research_topic': 'COVID-19 vaccine efficacy'}})
print(result)
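Compose Tool Configuration Reference
Required Fields
type: must be "ComposeTool"
name: a unique name for your compose tool
description: a human-readable description of what the tool does
parameter: a JSON Schema defining the input parameters
composition_file: the Python file in the compose_scripts/ directory
composition_function: the name of the function to call (usually "compose")
Optional Fields
auto_load_dependencies: whether to load required tools automatically (default: true)
fail_on_missing_tools: whether to fail when required tools are missing (default: false)
required_tools: list of tool names the compose tool depends on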
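A configuration entry can be checked against the field list above before it is registered. The following is a minimal sketch under stated assumptions: check_compose_config is a hypothetical helper, not part of ToolUniverse, and the empty-list default for required_tools is an assumption made here for illustration (the reference above gives defaults only for the two boolean fields).

```python
REQUIRED_FIELDS = (
    "type", "name", "description", "parameter",
    "composition_file", "composition_function",
)


def check_compose_config(config):
    """Return (problems, normalized) for one compose tool entry.

    problems is a list of human-readable messages; normalized is the entry
    with the documented defaults filled in for missing optional fields.
    """
    defaults = {
        "auto_load_dependencies": True,   # documented default
        "fail_on_missing_tools": False,   # documented default
        "required_tools": [],             # assumed default, for illustration only
    }
    problems = [f"missing required field: {name}"
                for name in REQUIRED_FIELDS if name not in config]
    if "type" in config and config["type"] != "ComposeTool":
        problems.append('type must be "ComposeTool"')
    normalized = {**defaults, **config}
    return problems, normalized
```

Returning a list of problems rather than raising on the first one makes it easier to fix a new entry in a single pass.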
Compose Function Signature
Your compose function must follow this exact signature:
def compose(arguments, tooluniverse, call_tool):
    """
    Compose function signature

    Args:
        arguments (dict): Input parameters from the tool call
        tooluniverse (ToolUniverse): Reference to the ToolUniverse instance
        call_tool (function): Function to call other tools

    Returns:
        Any: The result of your composition
    """
    # Your composition logic here
    pass
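A quick way to catch signature mistakes early is to inspect the function before registering it. This sketch uses the standard library's inspect module; has_compose_signature is a hypothetical helper, and treating the parameter names themselves as mandatory is an assumption based on the "exact signature" wording above.

```python
import inspect

EXPECTED_PARAMS = ["arguments", "tooluniverse", "call_tool"]


def has_compose_signature(func):
    """True if func takes exactly the three expected positional parameters, in order."""
    params = list(inspect.signature(func).parameters.values())
    names = [p.name for p in params]
    positional_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    )
    return names == EXPECTED_PARAMS and all(p.kind in positional_kinds for p in params)


def compose(arguments, tooluniverse, call_tool):
    return None


print(has_compose_signature(compose))
```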
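Heterogeneous Workflow Construction
As shown in the ToolUniverse paper, a compose tool can run several literature search tools at once and then call a summarization agent to synthesize the results. This illustrates how heterogeneous workflows are built, with every step driven by tool execution. The approach provides:
Multi-backend integration: combine tools from different scientific databases and APIs
Concurrent execution: run multiple tools at the same time for efficiency
Intelligent synthesis: use AI agents to synthesize results from heterogeneous sources
Adaptive analysis: build workflows that adjust based on intermediate results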
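Core Composition Patterns
1. Sequential Chaining
Use case: linear workflows in which each step depends on the previous one
Pattern: chain the output of one tool into the input of the next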
def compose(arguments, tooluniverse, call_tool):
    """Sequential pipeline: Disease → Targets → Drugs → Safety Assessment"""
    disease_id = arguments['disease_efo_id']

    # Step 1: Find disease-associated targets
    targets_result = call_tool('OpenTargets_get_associated_targets_by_disease_efoId', {
        'efoId': disease_id
    })
    top_targets = targets_result["data"]["disease"]["associatedTargets"]["rows"][:5]

    # Step 2: Find known drugs for this disease
    drugs_result = call_tool('OpenTargets_get_associated_drugs_by_disease_efoId', {
        'efoId': disease_id,
        'size': 20
    })
    drug_rows = drugs_result["data"]["disease"]["knownDrugs"]["rows"]

    # Step 3: Extract SMILES and assess safety
    safety_assessments = []
    processed_drugs = set()
    for drug in drug_rows[:5]:  # Limit for demo
        drug_name = drug["drug"]["name"]
        if drug_name in processed_drugs:
            continue
        processed_drugs.add(drug_name)

        # Get SMILES from drug name
        cid_result = call_tool('PubChem_get_CID_by_compound_name', {
            'name': drug_name
        })
        if cid_result and 'IdentifierList' in cid_result:
            cids = cid_result['IdentifierList']['CID']
            if cids:
                smiles_result = call_tool('PubChem_get_compound_properties_by_CID', {
                    'cid': cids[0]
                })
                if smiles_result and 'PropertyTable' in smiles_result:
                    properties = smiles_result['PropertyTable']['Properties'][0]
                    smiles = properties.get('CanonicalSMILES') or properties.get('ConnectivitySMILES')
                    if smiles:
                        # Assess safety properties
                        bbb_result = call_tool('ADMETAI_predict_BBB_penetrance', {
                            'smiles': [smiles]
                        })
                        safety_assessments.append({
                            'drug_name': drug_name,
                            'smiles': smiles,
                            'bbb_penetrance': bbb_result
                        })

    return {
        'disease': disease_id,
        'targets_found': len(top_targets),
        'drugs_analyzed': len(safety_assessments),
        'safety_results': safety_assessments
    }
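The pipeline above indexes several levels deep into tool responses, so an error payload or an unexpected response shape would raise KeyError or TypeError partway through the chain. One way to soften this is a small lookup helper. The sketch below is illustrative only: dig is a hypothetical name, and the sample response is made-up data shaped like the indexing used above, not real OpenTargets output.

```python
def dig(data, *path, default=None):
    """Walk nested dicts/lists along path; return default if any step is missing."""
    current = data
    for key in path:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return default
    return current


# Made-up payload for illustration, mirroring the nesting used in the pipeline
response = {"data": {"disease": {"knownDrugs": {"rows": [{"drug": {"name": "example-drug"}}]}}}}

rows = dig(response, "data", "disease", "knownDrugs", "rows", default=[])
first_name = dig(rows, 0, "drug", "name")
missing = dig({"error": "service unavailable"}, "data", "disease", "knownDrugs", "rows", default=[])
print(first_name, missing)
```

With a helper like this, a failed upstream call yields an empty list and the loop simply does no work, instead of aborting the whole workflow.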
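2. Broadcasting (Parallel Execution)
Use case: independent operations that can run at the same time
Pattern: call multiple tools with a single query (broadcasting)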
def compose(arguments, tooluniverse, call_tool):
    """Broadcast one query across multiple literature databases"""
    research_topic = arguments['research_topic']

    # Broadcast the same query to each database. The searches are independent
    # of each other; as written here, the calls are issued one after another.
    literature = {}
    literature['pmc'] = call_tool('EuropePMC_search_articles', {
        'query': research_topic, 'limit': 50
    })
    literature['openalex'] = call_tool('openalex_literature_search', {
        'search_keywords': research_topic, 'max_results': 50
    })
    literature['pubtator'] = call_tool('PubTator3_LiteratureSearch', {
        'text': research_topic, 'page_size': 50
    })

    # Synthesize findings using AI agent
    synthesis = call_tool('MedicalLiteratureReviewer', {
        'research_topic': research_topic,
        'literature_content': str(literature),
        'focus_area': 'key findings',
        'study_types': 'all studies',
        'quality_level': 'all evidence',
        'review_scope': 'comprehensive review'
    })

    # Count papers only for dict-shaped results; other shapes are skipped
    total_papers = sum(
        len(r.get('documents', r.get('papers', [])))
        for r in literature.values()
        if isinstance(r, dict)
    )

    return {
        'topic': research_topic,
        'sources_searched': len(literature),
        'total_papers': total_papers,
        'synthesis': synthesis,
        'detailed_results': literature
    }
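In the pattern above the three searches do not depend on one another, but plain consecutive call_tool invocations still run one at a time. If true concurrency is wanted, one option is a thread pool from the standard library. This is a sketch under the assumption that call_tool is safe to invoke from several threads, which the text here does not confirm; broadcast is a hypothetical helper name.

```python
from concurrent.futures import ThreadPoolExecutor


def broadcast(call_tool, requests):
    """Run several tool calls concurrently and collect the results by label.

    requests maps a label to a (tool_name, tool_arguments) pair.
    Assumes call_tool may be called from multiple threads at once.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(requests))) as pool:
        futures = {
            label: pool.submit(call_tool, name, tool_args)
            for label, (name, tool_args) in requests.items()
        }
        for label, future in futures.items():
            try:
                results[label] = future.result()
            except Exception as exc:  # keep the other sources if one fails
                results[label] = {"error": str(exc)}
    return results


def compose(arguments, tooluniverse, call_tool):
    """Broadcast one query to three literature tools concurrently."""
    topic = arguments['research_topic']
    return broadcast(call_tool, {
        'pmc': ('EuropePMC_search_articles', {'query': topic, 'limit': 50}),
        'openalex': ('openalex_literature_search', {'search_keywords': topic, 'max_results': 50}),
        'pubtator': ('PubTator3_LiteratureSearch', {'text': topic, 'page_size': 50}),
    })
```

Threads suit this case because the work is dominated by waiting on remote services; recording a per-source error instead of re-raising lets the synthesis step proceed with whatever sources did respond.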
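3. Agentic Loops
Use case: iterative refinement guided by AI and tool feedback
Pattern: build an agentic loop that uses agentic tools to generate function calls, execute tools, and incorporate tool feedback for multi-step experimental analysis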
def compose(arguments, tooluniverse, call_tool):
    """Iterative compound optimization with AI-guided feedback loops"""
    initial_smiles = arguments['initial_smiles']
    target_protein = arguments['target_protein']
    current_compound = initial_smiles
    optimization_history = []
    max_iterations = 5
    target_affinity = -8.0  # Strong binding threshold
    target_achieved = False

    for iteration in range(max_iterations):
        # Step 1: Predict binding affinity using molecular docking
        binding_result = call_tool('boltz2_docking', {
            'protein_id': target_protein,
            'ligand_smiles': current_compound
        })
        affinity = binding_result.get('binding_affinity')

        # Step 2: Predict ADMET properties
        bbb_result = call_tool('ADMETAI_predict_BBB_penetrance', {
            'smiles': [current_compound]
        })
        bio_result = call_tool('ADMETAI_predict_bioavailability', {
            'smiles': [current_compound]
        })
        tox_result = call_tool('ADMETAI_predict_toxicity', {
            'smiles': [current_compound]
        })

        # Step 3: Record iteration data
        iteration_data = {
            'iteration': iteration,
            'compound': current_compound,
            'binding_affinity': affinity,
            'binding_probability': binding_result.get('binding_probability'),
            'bbb_penetrance': bbb_result,
            'bioavailability': bio_result,
            'toxicity': tox_result
        }
        optimization_history.append(iteration_data)

        # Step 4: Check if target achieved (a missing affinity never counts as a hit)
        if affinity is not None and affinity <= target_affinity:
            target_achieved = True
            break

        # Step 5: AI-guided compound optimization
        # Use an agentic tool to analyze current results and suggest improvements
        optimization_suggestion = call_tool('ChemicalOptimizationAgent', {
            'current_compound': current_compound,
            'current_properties': iteration_data,
            'optimization_goals': ['binding_affinity', 'oral_bioavailability'],
            'target_protein': target_protein
        })

        # Step 6: Generate next compound based on AI feedback
        next_compound = call_tool('CompoundGenerator', {
            'base_compound': current_compound,
            'optimization_suggestions': optimization_suggestion,
            'modification_type': 'targeted_improvement'
        })
        current_compound = next_compound.get('new_compound', current_compound)

    return {
        'initial_compound': initial_smiles,
        'final_compound': current_compound,
        'iterations': len(optimization_history),
        'optimization_history': optimization_history,
        'target_achieved': target_achieved
    }
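Stripped of the chemistry, the loop above has three moves: evaluate the current candidate, stop if it is good enough, otherwise ask an agentic tool for the next candidate. The sketch below isolates that control flow so it can be run offline. Everything in it is hypothetical: optimize, the tool names stub_scorer and stub_proposer, and the scoring rule exist only to make the loop observable.

```python
def optimize(initial, call_tool, score_tool, propose_tool, threshold, max_iterations=5):
    """Evaluate, stop when the score reaches the threshold, otherwise request a new candidate."""
    candidate, history = initial, []
    for iteration in range(max_iterations):
        score = call_tool(score_tool, {'candidate': candidate}).get('score')
        history.append({'iteration': iteration, 'candidate': candidate, 'score': score})
        if score is not None and score <= threshold:  # lower is better, as with the affinity above
            return {'final': candidate, 'achieved': True, 'history': history}
        proposal = call_tool(propose_tool, {'candidate': candidate, 'history': history})
        candidate = proposal.get('next_candidate', candidate)
    return {'final': candidate, 'achieved': False, 'history': history}


def stub_call_tool(name, tool_args):
    """Hypothetical tools: the score is -3.0 per character; the proposer appends one 'C'."""
    if name == 'stub_scorer':
        return {'score': -3.0 * len(tool_args['candidate'])}
    return {'next_candidate': tool_args['candidate'] + 'C'}


outcome = optimize('C', stub_call_tool, 'stub_scorer', 'stub_proposer', threshold=-8.0)
print(outcome['final'], outcome['achieved'], len(outcome['history']))
```

Capping the loop with max_iterations matters as much as the stop condition: an agentic proposer gives no guarantee of convergence.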
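4. Error Handling and Fallbacks
Use case: robust workflows that handle failures gracefully
Pattern: implement fallback mechanisms and graceful degradation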
def compose(arguments, tooluniverse, call_tool):
    """Workflow with comprehensive error handling and fallbacks"""
    results = {"status": "running", "completed_steps": []}

    try:
        # Step 1: Critical initial step
        step1_result = call_tool('critical_analysis_tool', arguments)
        results["step1"] = step1_result
        results["completed_steps"].append("step1")
    except Exception as e:
        results["status"] = "failed"
        results["error"] = f"Step 1 failed: {str(e)}"
        return results

    try:
        # Step 2: Optional enhancement step
        step2_result = call_tool('enhancement_tool', {"data": step1_result})
        results["step2"] = step2_result
        results["completed_steps"].append("step2")
    except Exception as e:
        # Continue without this step
        results["step2_warning"] = f"Enhancement step failed: {str(e)}"

    # Step 3: Alternative approaches with fallback
    try:
        step3_result = call_tool('primary_validation_tool', {"data": step1_result})
        results["validation"] = step3_result
    except Exception:
        # Fallback validation method
        try:
            fallback_result = call_tool('alternative_validation_tool', {"data": step1_result})
            results["validation"] = fallback_result
            results["validation_method"] = "fallback"
        except Exception as e:
            results["validation_error"] = str(e)

    results["status"] = "completed"
    return results
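The nested try/except in Step 3 above grows awkward once there are more than two alternatives. The same idea can be expressed as an ordered list of candidate tools. This is a sketch, with call_with_fallback as a hypothetical helper; it assumes a failing tool raises an exception, whereas some tools may instead return an error payload that would need an explicit check.

```python
def call_with_fallback(call_tool, candidates, tool_args):
    """Try each tool name in order; return (tool_name, result) for the first success.

    Raises RuntimeError listing every failure if all candidates fail.
    """
    errors = []
    for name in candidates:
        try:
            return name, call_tool(name, tool_args)
        except Exception as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all candidates failed: " + "; ".join(errors))


def stub_call_tool(name, tool_args):
    """Hypothetical Tool Caller in which the primary validator is unavailable."""
    if name == 'primary_validation_tool':
        raise ValueError("unavailable")
    return {"validated_by": name}


used, validation = call_with_fallback(
    stub_call_tool,
    ['primary_validation_tool', 'alternative_validation_tool'],
    {"data": {"example": True}},
)
print(used, validation)
```

Returning the name of the tool that succeeded preserves the information the original stores in validation_method.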
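Real-World Composition Examples
For complete examples of compose tools in practice, see the Scientific Workflows tutorial, which includes:
Comprehensive drug discovery pipeline: an end-to-end workflow from target identification to safety assessment
Biomarker discovery workflow: multi-step biomarker validation using literature, expression data, and pathway analysis
Advanced literature review: AI-powered systematic review with citation analysis
Autonomous research workflows: adaptive workflows that use AI feedback for multi-step analysis
These examples show how compose tools coordinate complex scientific workflows, combining tools from different backends to solve real research problems.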
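Tool Caller Interface
The Tool Caller provides a callable interface that abstracts the invocation of individual tools in ToolUniverse. This abstraction provides:
Unified tool access: every tool is reached through the same call_tool interface
Protocol compliance: tool calls follow the ToolUniverse interaction protocol schema
Error handling: consistent error handling across different tool types
Dependency management: automatic loading and management of tool dependencies
Tool Caller usage pattern: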
def compose(arguments, tooluniverse, call_tool):
    # Direct tool invocation through the Tool Caller interface
    result = call_tool('tool_name', {'param1': 'value1', 'param2': 'value2'})

    # The call_tool function handles:
    # - Tool loading and instantiation
    # - Parameter validation
    # - Execution and error handling
    # - Result formatting
    return result
Troubleshooting
Common Issues and Solutions
Tool Not Found Error
Check that the tool name is correct in your compose script
Ensure the tool is loaded in ToolUniverse
Verify the tool is in the required_tools list
Use auto_load_dependencies: true to automatically load missing tools
Import Errors
Make sure your compose script is in the compose_scripts/ directory
Check that the function name matches composition_function
Verify the function signature is correct: def compose(arguments, tooluniverse, call_tool):
Parameter Errors
Validate your parameter schema in the JSON configuration
Check that required parameters are provided
Ensure parameter types match the schema
Follow the interaction protocol schema of ToolUniverse
Performance Issues
Limit the number of tools called in sequence
Use auto_load_dependencies: true for automatic loading
Consider caching results for repeated calls
Implement proper error handling to avoid cascading failures
Heterogeneous Backend Issues
Ensure all required tools are available across different backends
Use fail_on_missing_tools: false for graceful degradation
Implement fallback mechanisms for critical workflow steps
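The "Consider caching results for repeated calls" suggestion under Performance Issues can be implemented inside a compose script by wrapping call_tool. The sketch below is one possible approach, not a ToolUniverse feature: cached is a hypothetical helper, and it assumes the tool arguments can be serialized to JSON well enough to serve as a cache key and that repeating a call with identical arguments would return the same result.

```python
import json


def cached(call_tool):
    """Wrap call_tool so identical (name, arguments) calls reuse the first result."""
    cache = {}

    def wrapper(name, tool_args):
        # sort_keys makes the key independent of dict ordering;
        # default=str covers values json cannot encode natively
        key = (name, json.dumps(tool_args, sort_keys=True, default=str))
        if key not in cache:
            cache[key] = call_tool(name, tool_args)
        return cache[key]

    return wrapper


def compose(arguments, tooluniverse, call_tool):
    """Example use: the second identical lookup is served from the cache."""
    call_tool = cached(call_tool)
    first = call_tool('tool_name', {'param1': 'value1'})
    second = call_tool('tool_name', {'param1': 'value1'})
    return {'first': first, 'second': second}
```

Calls that raise are not cached, so a transient failure can be retried. Cached results are shared objects, so a workflow that mutates a result in place should copy it first.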
Available Compose Tools
ToolUniverse currently ships several pre-built compose tools that demonstrate different workflow patterns:
Working Compose Tools:
LiteratureSearchTool - Literature research and synthesis
Searches EuropePMC, OpenAlex, and PubTator databases
Uses AI agent for literature summarization
Demonstrates broadcasting pattern
ComprehensiveDrugDiscoveryPipeline - End-to-end drug discovery
Target identification using OpenTargets
Lead discovery from known drugs
Safety assessment using ADMETAI tools
Literature validation
Demonstrates sequential chaining with tool integration
BiomarkerDiscoveryWorkflow - Biomarker discovery and validation
Literature-based biomarker discovery
Multi-strategy gene search using HPA
Comprehensive pathway analysis using HPA tools
Clinical validation using FDA data
Demonstrates multi-strategy fallbacks and error handling
DrugSafetyAnalyzer - Drug safety assessment
PubChem compound information retrieval
EuropePMC literature search
Demonstrates safety-focused workflows
ToolDescriptionOptimizer - Tool optimization
AI-powered tool description improvement
Test case generation and quality evaluation
Demonstrates agentic optimization loops
ToolDiscover - Tool discovery and generation
AI-powered tool creation from descriptions
Iterative code improvement
Demonstrates advanced agentic workflows
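Key features:
- All tools are tested and working, and process real data
- Comprehensive error handling with graceful degradation
- Tool chaining for complex multi-step workflows
- Dynamic data extraction (for example, deriving SMILES from a drug name)
- Multi-strategy approaches for robust data retrieval
See Also
ToolUniverse Architecture - ToolUniverse architecture
Set Up ToolUniverse - Connect to AI agents
Scientific Workflows - Real composition examples
Tip
Start simple: begin with a sequential workflow like the LiteratureSearchTool example, then move on to more complex patterns as you become comfortable with tool composition.
Note
Compose tool location: all compose scripts must be placed in the src/tooluniverse/compose_scripts/ directory and registered in src/tooluniverse/data/compose_tools.json.
Important
Tool Composer architecture: the Tool Composer generates container functions that expose ToolUniverse and the Tool Caller as inline, executable primitives, enabling flexible multi-tool execution patterns in complex scientific workflows.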