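Tool Composition Tutorial

Combine ToolUniverse's 600+ tools into powerful scientific workflows

Overview

Tool composition is the art of combining individual scientific tools into complex research workflows. ToolUniverse's Tool Composer integrates tools with heterogeneous backends to build end-to-end workflows. Building on the Tool Caller for direct in-code execution, the Tool Composer generates a container function that exposes both the Tool Caller and ToolUniverse as inline, executable primitives.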

Individual Tools → Composed Workflows → Research Solutions

Example: Literature Search & Summary Tool

┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│EuropePMC    │    │ Literature  │    │ Research    │
│OpenAlex     │ →  │ Search &    │ →  │ Summary     │
│PubTator     │    │ Summary     │    │ Generated   │
│AI Reviewer  │    │ Tool        │    │             │
└─────────────┘    └─────────────┘    └─────────────┘

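Benefits of tool composition:

  • Complex research: solve multi-step problems that no single tool can handle

  • Workflow reuse: create reusable research pipelines for common tasks

  • Automation: reduce manual coordination between different tools

  • Quality control: embed validation and expert review at key steps

  • Heterogeneous integration: combine tools with different backends seamlessly

  • Agentic loops: support adaptive, multi-step experimental analysis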

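Tool Composer Architecture

The Tool Composer generates a container function that serves as the execution backbone of a complex workflow. The container function is implemented as compose(arguments, tooluniverse, call_tool) and holds the logic that coordinates different types of tools so that they can work together in a single workflow.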

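Container function components:

  1. arguments: the tool-call arguments, which follow the schema of the ToolUniverse interaction protocol

  2. tooluniverse: an instance of ToolUniverse, giving access to every function ToolUniverse supports

  3. call_tool: the Tool Caller's callable interface, which abstracts the invocation of individual tools in ToolUniverse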

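Execution patterns:

  • Chaining: feed the output of one tool into the next tool as its input

  • Broadcasting: call multiple tools with a single query

  • Agentic loops: use an agentic tool to generate function calls, execute the tools, and incorporate tool feedback for multi-step experimental analysis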
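To make the container function concrete, here is a minimal sketch that exercises chaining and broadcasting against a stub call_tool. The tool names (find_ids, fetch_details, db_a, db_b) and the make_stub_call_tool helper are hypothetical and exist only for this illustration; in ToolUniverse, call_tool is supplied by the Tool Caller.

```python
def make_stub_call_tool(fake_tools):
    """Return a call_tool-like function backed by plain callables (illustration only)."""
    def call_tool(name, args):
        return fake_tools[name](args)
    return call_tool


def compose(arguments, tooluniverse, call_tool):
    query = arguments["query"]
    # Chaining: the output of one tool feeds the next.
    ids = call_tool("find_ids", {"query": query})
    details = call_tool("fetch_details", {"ids": ids})
    # Broadcasting: the same query goes to several tools.
    hits = {name: call_tool(name, {"query": query}) for name in ("db_a", "db_b")}
    return {"details": details, "hits": hits}


# Hypothetical stand-ins for real tools.
fake_tools = {
    "find_ids": lambda a: [1, 2],
    "fetch_details": lambda a: [f"record-{i}" for i in a["ids"]],
    "db_a": lambda a: [a["query"] + " (a)"],
    "db_b": lambda a: [a["query"] + " (b)"],
}

result = compose({"query": "aspirin"}, None, make_stub_call_tool(fake_tools))
print(result)
```

Because compose only depends on the three arguments it receives, a stub like this is one way to try out workflow logic before any real tool is loaded.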

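How Tool Composition Works

ToolUniverse's ComposeTool system takes a configuration-driven approach:

  1. Configuration file: define compose tools in a JSON file such as compose_tools.json

  2. Implementation script: write a Python script in the compose environment that defines def compose(arguments, tooluniverse, call_tool): …

  3. Automatic loading: ComposeTool loads the dependencies and executes the workflow automatically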
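The link between a configuration entry and its script can be pictured as "import the file, then look up the named function". The sketch below shows that idea using only the Python standard library. It is not ToolUniverse's actual loader; load_compose_function, the throwaway demo_tool.py script, and the echo tool are all assumptions made for illustration.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_compose_function(script_path, function_name="compose"):
    """Import a Python file by path and return the named function."""
    spec = importlib.util.spec_from_file_location("compose_script", str(script_path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, function_name)


# Demo with a throwaway script; the 'echo' tool is a stand-in, not a real tool.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "demo_tool.py"
    script.write_text(
        "def compose(arguments, tooluniverse, call_tool):\n"
        "    return call_tool('echo', arguments)\n"
    )
    compose_fn = load_compose_function(script, "compose")
    demo_output = compose_fn(
        {"x": 1}, None, lambda name, args: {"tool": name, "args": args}
    )

print(demo_output)
```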

Create Your First Compose Tool

Let's create a literature search and summary tool as an example:

Step 1: Create the implementation script

Create the file src/tooluniverse/compose_scripts/literature_tool.py

"""
Literature Search & Summary Tool
Minimal compose tool perfect for paper screenshots
"""

def compose(arguments, tooluniverse, call_tool):
    """Search literature and generate summary"""
    topic = arguments['research_topic']

    literature = {}
    literature['pmc'] = call_tool('EuropePMC_search_articles', {'query': topic, 'limit': 5})
    literature['openalex'] = call_tool('openalex_literature_search', {'search_keywords': topic, 'max_results': 5})
    literature['pubtator'] = call_tool('PubTator3_LiteratureSearch', {'text': topic, 'page_size': 5})

    summary = call_tool('MedicalLiteratureReviewer', {
        'research_topic': topic, 'literature_content': str(literature),
        'focus_area': 'key findings', 'study_types': 'all studies',
        'quality_level': 'all evidence', 'review_scope': 'rapid review'
    })

    return summary

Step 2: Add the configuration to compose_tools.json

Add the following configuration to src/tooluniverse/data/compose_tools.json:

{
  "type": "ComposeTool",
  "name": "LiteratureSearchTool",
  "description": "Comprehensive literature search and summary tool that searches multiple databases (EuropePMC, OpenAlex, PubTator) and generates AI-powered summaries of research findings",
  "parameter": {
    "type": "object",
    "properties": {
      "research_topic": {
        "type": "string",
        "description": "The research topic or query to search for in the literature"
      }
    },
    "required": ["research_topic"]
  },
  "auto_load_dependencies": true,
  "fail_on_missing_tools": false,
  "required_tools": [
    "EuropePMC_search_articles",
    "openalex_literature_search",
    "PubTator3_LiteratureSearch",
    "MedicalLiteratureReviewer"
  ],
  "composition_file": "literature_tool.py",
  "composition_function": "compose"
}

Step 3: Use the compose tool

Once configured, your compose tool can be used like any other ToolUniverse tool:

from tooluniverse import ToolUniverse

# Initialize ToolUniverse
tu = ToolUniverse()

# Load compose tools
tu.load_tools(['compose_tools'])

# Use your literature search tool
result = tu.run({"name": "LiteratureSearchTool", "arguments": {'research_topic': 'COVID-19 vaccine efficacy'}})

print(result)

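Compose Tool Configuration Reference

Required Fields

  • type: must be "ComposeTool"

  • name: a unique name for your compose tool

  • description: a human-readable description of what the tool does

  • parameter: a JSON schema defining the input parameters

  • composition_file: the Python file in the compose_scripts/ directory

  • composition_function: the name of the function to call (usually "compose")

Optional Fields

  • auto_load_dependencies: whether to load required tools automatically (default: true)

  • fail_on_missing_tools: whether to fail when required tools are missing (default: false)

  • required_tools: the list of tool names the compose tool depends on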
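The required fields listed above can be checked before a configuration is registered. The following is a minimal sketch of such a check; check_compose_config is a hypothetical helper, not part of ToolUniverse, and it only tests for the presence of the fields and the value of type.

```python
REQUIRED_FIELDS = (
    "type", "name", "description", "parameter",
    "composition_file", "composition_function",
)


def check_compose_config(config):
    """Return a list of problems found in a compose tool config dict."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in config]
    # A missing 'type' is already reported above; only flag a wrong value here.
    if config.get("type") not in (None, "ComposeTool"):
        problems.append('type must be "ComposeTool"')
    return problems


# Example: a config that forgets composition_function.
config = {
    "type": "ComposeTool",
    "name": "LiteratureSearchTool",
    "description": "Search and summarize literature",
    "parameter": {"type": "object", "properties": {}, "required": []},
    "composition_file": "literature_tool.py",
}
problems = check_compose_config(config)
print(problems)
```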

Compose Function Signature

Your compose function must follow this exact signature:

def compose(arguments, tooluniverse, call_tool):
    """
    Compose function signature

    Args:
        arguments (dict): Input parameters from the tool call
        tooluniverse (ToolUniverse): Reference to the ToolUniverse instance
        call_tool (function): Function to call other tools

    Returns:
        Any: The result of your composition
    """
    # Your composition logic here
    pass
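A quick way to catch a mismatched signature before registering a script is to inspect the function's parameter names. This is a small sketch using the standard inspect module; has_compose_signature is a hypothetical helper, and it checks parameter names and order only.

```python
import inspect

EXPECTED_PARAMS = ["arguments", "tooluniverse", "call_tool"]


def has_compose_signature(func):
    """True if func's parameters are exactly (arguments, tooluniverse, call_tool)."""
    return list(inspect.signature(func).parameters) == EXPECTED_PARAMS


def good(arguments, tooluniverse, call_tool):
    return None


def bad(arguments, call_tool):
    return None


print(has_compose_signature(good), has_compose_signature(bad))
```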

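Heterogeneous Workflow Construction

As shown in the ToolUniverse paper, a composed tool can run multiple literature search tools concurrently and then call a summarization agent to synthesize the results. This illustrates how heterogeneous workflows are built, with every step driven by tool execution. The approach enables:

  • Multi-backend integration: combine tools from different scientific databases and APIs

  • Concurrent execution: run multiple tools at the same time for efficiency

  • Intelligent synthesis: use AI agents to synthesize results from heterogeneous sources

  • Adaptive analysis: build workflows that adjust based on intermediate results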

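Core Composition Patterns

1. Sequential Chaining

Use case: linear workflows in which each step depends on the previous one

Pattern: chain the output of one tool into the input of the next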

def compose(arguments, tooluniverse, call_tool):
    """Sequential pipeline: Disease → Targets → Drugs → Safety Assessment"""

    disease_id = arguments['disease_efo_id']

    # Step 1: Find disease-associated targets
    targets_result = call_tool('OpenTargets_get_associated_targets_by_disease_efoId', {
        'efoId': disease_id
    })

    top_targets = targets_result["data"]["disease"]["associatedTargets"]["rows"][:5]

    # Step 2: Find known drugs for this disease
    drugs_result = call_tool('OpenTargets_get_associated_drugs_by_disease_efoId', {
        'efoId': disease_id,
        'size': 20
    })

    drug_rows = drugs_result["data"]["disease"]["knownDrugs"]["rows"]

    # Step 3: Extract SMILES and assess safety
    safety_assessments = []
    processed_drugs = set()

    for drug in drug_rows[:5]:  # Limit for demo
        drug_name = drug["drug"]["name"]
        if drug_name in processed_drugs:
            continue
        processed_drugs.add(drug_name)

        # Get SMILES from drug name
        cid_result = call_tool('PubChem_get_CID_by_compound_name', {
            'name': drug_name
        })

        if cid_result and 'IdentifierList' in cid_result:
            cids = cid_result['IdentifierList']['CID']
            if cids:
                smiles_result = call_tool('PubChem_get_compound_properties_by_CID', {
                    'cid': cids[0]
                })

                if smiles_result and 'PropertyTable' in smiles_result:
                    properties = smiles_result['PropertyTable']['Properties'][0]
                    smiles = properties.get('CanonicalSMILES') or properties.get('ConnectivitySMILES')

                    if smiles:
                        # Assess safety properties
                        bbb_result = call_tool('ADMETAI_predict_BBB_penetrance', {
                            'smiles': [smiles]
                        })

                        safety_assessments.append({
                            'drug_name': drug_name,
                            'smiles': smiles,
                            'bbb_penetrance': bbb_result
                        })

    return {
        'disease': disease_id,
        'targets_found': len(top_targets),
        'drugs_analyzed': len(safety_assessments),
        'safety_results': safety_assessments
    }

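2. Broadcasting (Parallel Execution)

Use case: independent operations that can run at the same time

Pattern: call multiple tools with a single query (broadcasting)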

def compose(arguments, tooluniverse, call_tool):
    """Parallel search across multiple literature databases"""

    research_topic = arguments['research_topic']

    # Broadcast the same query to each database (these calls run one after another)
    literature = {}
    literature['pmc'] = call_tool('EuropePMC_search_articles', {
        'query': research_topic, 'limit': 50
    })
    literature['openalex'] = call_tool('openalex_literature_search', {
        'search_keywords': research_topic, 'max_results': 50
    })
    literature['pubtator'] = call_tool('PubTator3_LiteratureSearch', {
        'text': research_topic, 'page_size': 50
    })

    # Synthesize findings using AI agent
    synthesis = call_tool('MedicalLiteratureReviewer', {
        'research_topic': research_topic,
        'literature_content': str(literature),
        'focus_area': 'key findings',
        'study_types': 'all studies',
        'quality_level': 'all evidence',
        'review_scope': 'comprehensive review'
    })

    return {
        'topic': research_topic,
        'sources_searched': len(literature),
        'total_papers': sum(len(r.get('documents', r.get('papers', [])))
                           for r in literature.values()),
        'synthesis': synthesis,
        'detailed_results': literature
    }
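The calls in the example above are written one after another, so each search waits for the previous one to finish. A minimal sketch of running them concurrently with Python's standard ThreadPoolExecutor follows. The broadcast helper and the stub call_tool are hypothetical, and the sketch assumes call_tool is safe to call from several threads, which this tutorial does not confirm.

```python
from concurrent.futures import ThreadPoolExecutor


def broadcast(call_tool, requests):
    """Run several tool calls concurrently.

    requests maps a label to a (tool_name, arguments) pair.
    Assumes call_tool is safe to call from multiple threads.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(requests))) as pool:
        futures = {
            label: pool.submit(call_tool, name, args)
            for label, (name, args) in requests.items()
        }
        # result() blocks until that call finishes and re-raises its exception, if any.
        return {label: future.result() for label, future in futures.items()}


def compose(arguments, tooluniverse, call_tool):
    topic = arguments["research_topic"]
    return broadcast(call_tool, {
        "pmc": ("EuropePMC_search_articles", {"query": topic, "limit": 5}),
        "openalex": ("openalex_literature_search",
                     {"search_keywords": topic, "max_results": 5}),
        "pubtator": ("PubTator3_LiteratureSearch", {"text": topic, "page_size": 5}),
    })


# Stub call_tool for demonstration only: it echoes the tool name.
demo = compose({"research_topic": "CRISPR"}, None, lambda name, args: {"tool": name})
print(demo)
```

Threads suit this case because tool calls that query remote services spend most of their time waiting on I/O.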

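3. Agentic Loops

Use case: iterative refinement guided by AI and tool feedback

Pattern: build an agentic loop that uses an agentic tool to generate function calls, executes the tools, and incorporates tool feedback for multi-step experimental analysis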

def compose(arguments, tooluniverse, call_tool):
    """Iterative compound optimization with AI-guided feedback loops"""

    initial_smiles = arguments['initial_smiles']
    target_protein = arguments['target_protein']

    current_compound = initial_smiles
    optimization_history = []
    max_iterations = 5
    target_affinity = -8.0  # Strong binding threshold

    for iteration in range(max_iterations):
        # Step 1: Predict binding affinity using molecular docking
        binding_result = call_tool('boltz2_docking', {
            'protein_id': target_protein,
            'ligand_smiles': current_compound
        })

        # Step 2: Predict ADMET properties
        bbb_result = call_tool('ADMETAI_predict_BBB_penetrance', {
            'smiles': [current_compound]
        })

        bio_result = call_tool('ADMETAI_predict_bioavailability', {
            'smiles': [current_compound]
        })

        tox_result = call_tool('ADMETAI_predict_toxicity', {
            'smiles': [current_compound]
        })

        # Step 3: Record iteration data
        iteration_data = {
            'iteration': iteration,
            'compound': current_compound,
            'binding_affinity': binding_result.get('binding_affinity'),
            'binding_probability': binding_result.get('binding_probability'),
            'bbb_penetrance': bbb_result,
            'bioavailability': bio_result,
            'toxicity': tox_result
        }
        optimization_history.append(iteration_data)

        # Step 4: Check if target achieved
        if binding_result.get('binding_affinity', 0) <= target_affinity:
            break

        # Step 5: AI-guided compound optimization
        # Use an agentic tool to analyze current results and suggest improvements
        optimization_suggestion = call_tool('ChemicalOptimizationAgent', {
            'current_compound': current_compound,
            'current_properties': iteration_data,
            'optimization_goals': ['binding_affinity', 'oral_bioavailability'],
            'target_protein': target_protein
        })

        # Step 6: Generate next compound based on AI feedback
        next_compound = call_tool('CompoundGenerator', {
            'base_compound': current_compound,
            'optimization_suggestions': optimization_suggestion,
            'modification_type': 'targeted_improvement'
        })

        current_compound = next_compound.get('new_compound', current_compound)

    return {
        'initial_compound': initial_smiles,
        'final_compound': current_compound,
        'iterations': len(optimization_history),
        'optimization_history': optimization_history,
        'target_achieved': binding_result.get('binding_affinity', 0) <= target_affinity
    }
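The loop above compares binding_result.get('binding_affinity', 0) with the threshold. In Python 3 that comparison raises a TypeError if the key is present but holds None, so a defensive check may be worth adding. The reached_target helper below is a hypothetical sketch; whether the docking tool can actually return a missing or non-numeric affinity is an assumption.

```python
def reached_target(binding_result, target_affinity):
    """True only when a numeric affinity is present and at or below the target."""
    if not isinstance(binding_result, dict):
        return False  # e.g. an error string instead of a result dict
    affinity = binding_result.get("binding_affinity")
    if not isinstance(affinity, (int, float)):
        return False  # missing or non-numeric prediction: treat as not reached
    return affinity <= target_affinity


print(reached_target({"binding_affinity": -9.1}, -8.0))
print(reached_target({"binding_affinity": None}, -8.0))
print(reached_target({}, -8.0))
```

The same helper could be used both for the early exit inside the loop and for the target_achieved field in the returned summary.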

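4. Error Handling and Fallbacks

Use case: robust workflows that handle failures gracefully

Pattern: implement fallback mechanisms and graceful degradation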

def compose(arguments, tooluniverse, call_tool):
    """Workflow with comprehensive error handling and fallbacks"""

    results = {"status": "running", "completed_steps": []}

    try:
        # Step 1: Critical initial step
        step1_result = call_tool('critical_analysis_tool', arguments)
        results["step1"] = step1_result
        results["completed_steps"].append("step1")

    except Exception as e:
        results["status"] = "failed"
        results["error"] = f"Step 1 failed: {str(e)}"
        return results

    try:
        # Step 2: Optional enhancement step
        step2_result = call_tool('enhancement_tool', {"data": step1_result})
        results["step2"] = step2_result
        results["completed_steps"].append("step2")

    except Exception as e:
        # Continue without this step
        results["step2_warning"] = f"Enhancement step failed: {str(e)}"

    # Step 3: Alternative approaches with fallback
    try:
        step3_result = call_tool('primary_validation_tool', {"data": step1_result})
        results["validation"] = step3_result

    except Exception:
        # Fallback validation method
        try:
            fallback_result = call_tool('alternative_validation_tool', {"data": step1_result})
            results["validation"] = fallback_result
            results["validation_method"] = "fallback"

        except Exception as e:
            results["validation_error"] = str(e)

    results["status"] = "completed"
    return results
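The nested try/except used for validation above can be factored into a small reusable helper that tries a list of tools in order. This is a sketch, not part of ToolUniverse: call_with_fallback and the stub call_tool are hypothetical, and the tool names are the placeholder names from the example above.

```python
def call_with_fallback(call_tool, tool_names, arguments):
    """Try each tool in order; return (tool_name, result) for the first success.

    Raises RuntimeError listing every failure if all tools fail.
    """
    errors = []
    for name in tool_names:
        try:
            return name, call_tool(name, arguments)
        except Exception as exc:  # broad on purpose: any failure triggers the fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all tools failed: " + "; ".join(errors))


# Stub call_tool for demonstration: the primary tool always fails.
def stub_call_tool(name, arguments):
    if name == "primary_validation_tool":
        raise ValueError("service unavailable")
    return {"validated": True}


used, outcome = call_with_fallback(
    stub_call_tool,
    ["primary_validation_tool", "alternative_validation_tool"],
    {"data": "example"},
)
print(used, outcome)
```

Returning the name of the tool that succeeded lets the workflow record which method produced the result, as the validation_method field does above.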

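Real-World Composition Examples

For complete examples of compose tools in practice, see the Scientific Workflows tutorial, which includes:

  • Comprehensive drug discovery pipeline: an end-to-end workflow from target identification to safety assessment

  • Biomarker discovery workflow: multi-step biomarker validation using literature, expression data, and pathway analysis

  • Advanced literature review: AI-based systematic review and citation analysis

  • Autonomous research workflow: an adaptive workflow that uses AI feedback for multi-step analysis

These examples show how compose tools coordinate complex scientific workflows, combining tools from different backends to solve real research problems.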

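Tool Caller Interface

The Tool Caller provides a callable interface that abstracts the invocation of individual tools in ToolUniverse. This abstraction provides:

  • Unified tool access: every tool is reached through the same call_tool interface

  • Protocol compliance: tool calls follow ToolUniverse's interaction protocol schema

  • Error handling: consistent error handling across different tool types

  • Dependency management: automatic loading and management of tool dependencies

Tool Caller Usage Pattern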

def compose(arguments, tooluniverse, call_tool):
    # Direct tool invocation through the Tool Caller interface
    result = call_tool('tool_name', {'param1': 'value1', 'param2': 'value2'})

    # The call_tool function handles:
    # - Tool loading and instantiation
    # - Parameter validation
    # - Execution and error handling
    # - Result formatting

    return result

Troubleshooting

Common Issues and Solutions

  1. Tool Not Found Error

  • Check that the tool name is correct in your compose script

  • Ensure the tool is loaded in ToolUniverse

  • Verify the tool is in the required_tools list

  • Use auto_load_dependencies: true to automatically load missing tools

  2. Import Errors

  • Make sure your compose script is in the compose_scripts/ directory

  • Check that the function name matches composition_function

  • Verify the function signature is correct: def compose(arguments, tooluniverse, call_tool):

  3. Parameter Errors

  • Validate your parameter schema in the JSON configuration

  • Check that required parameters are provided

  • Ensure parameter types match the schema

  • Follow the interaction protocol schema of ToolUniverse

  4. Performance Issues

  • Limit the number of tools called in sequence

  • Use auto_load_dependencies: true for automatic loading

  • Consider caching results for repeated calls

  • Implement proper error handling to avoid cascading failures

  5. Heterogeneous Backend Issues

  • Ensure all required tools are available across different backends

  • Use fail_on_missing_tools: false for graceful degradation

  • Implement fallback mechanisms for critical workflow steps
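For the caching suggestion under Performance Issues, one option is to wrap call_tool so that identical calls within a single workflow reuse the first result. The sketch below is hypothetical (with_cache is not a ToolUniverse function) and assumes the tool arguments are JSON-serializable and that repeated calls are expected to return the same result.

```python
import json


def with_cache(call_tool):
    """Wrap call_tool so identical (name, arguments) calls reuse the first result."""
    cache = {}

    def cached_call_tool(name, arguments):
        # sort_keys makes the key independent of dict insertion order.
        key = (name, json.dumps(arguments, sort_keys=True))
        if key not in cache:
            cache[key] = call_tool(name, arguments)
        return cache[key]

    return cached_call_tool


# Stub call_tool that records how often it is really invoked.
calls = []


def stub_call_tool(name, arguments):
    calls.append(name)
    return {"echo": arguments}


cached = with_cache(stub_call_tool)
first = cached("lookup", {"id": 1})
second = cached("lookup", {"id": 1})
print(len(calls))
```

Inside a compose function, the wrapper would be applied once at the top, for example call_tool = with_cache(call_tool), so that later steps pick it up without further changes.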

Available Compose Tools

ToolUniverse currently ships several ready-made compose tools that demonstrate different workflow patterns:

Working Compose Tools:

  1. LiteratureSearchTool - Literature research and synthesis

  • Searches EuropePMC, OpenAlex, and PubTator databases

  • Uses AI agent for literature summarization

  • Demonstrates broadcasting pattern

  2. ComprehensiveDrugDiscoveryPipeline - End-to-end drug discovery

  • Target identification using OpenTargets

  • Lead discovery from known drugs

  • Safety assessment using ADMETAI tools

  • Literature validation

  • Demonstrates sequential chaining with tool integration

  3. BiomarkerDiscoveryWorkflow - Biomarker discovery and validation

  • Literature-based biomarker discovery

  • Multi-strategy gene search using HPA

  • Comprehensive pathway analysis using HPA tools

  • Clinical validation using FDA data

  • Demonstrates multi-strategy fallbacks and error handling

  4. DrugSafetyAnalyzer - Drug safety assessment

  • PubChem compound information retrieval

  • EuropePMC literature search

  • Demonstrates safety-focused workflows

  5. ToolDescriptionOptimizer - Tool optimization

  • AI-powered tool description improvement

  • Test case generation and quality evaluation

  • Demonstrates agentic optimization loops

  6. ToolDiscover - Tool discovery and generation

  • AI-powered tool creation from descriptions

  • Iterative code improvement

  • Demonstrates advanced agentic workflows

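Key features:

  • All tools are tested and working, with real data processing

  • Comprehensive error handling with graceful degradation

  • Tool chaining for complex multi-step workflows

  • Dynamic data extraction (for example, SMILES from drug names)

  • Multi-strategy approaches for robust data retrieval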

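Tip

Start simple: begin with a sequential workflow like the LiteratureSearchTool example, then move on to more complex patterns as you become comfortable with tool composition.

Note

Compose tool location: all compose scripts must be placed in the src/tooluniverse/compose_scripts/ directory and registered in src/tooluniverse/data/compose_tools.json.

Important

Tool Composer architecture: the Tool Composer generates a container function that exposes ToolUniverse and the Tool Caller as inline, executable primitives, enabling flexible multi-tool execution patterns in complex scientific workflows.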