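Summarization Hook

AI-powered summarization of long tool outputs

SummarizationHook uses an AI model to automatically summarize lengthy tool outputs, preserving key findings and results while reducing content length.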

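Overview

What it does:
- Analyzes tool outputs and identifies key information
- Generates concise summaries using an AI model
- Preserves important technical details and findings
- Reduces output length while maintaining relevance

When to use it:
- Large datasets from scientific databases
- Complex research results and literature
- Long-form tool outputs that exceed memory limits
- When you need quick insights from detailed data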

Quick Start

Simple Usage

from tooluniverse import ToolUniverse

# Enable SummarizationHook (default)
tu = ToolUniverse(hooks_enabled=True)

# Or explicitly specify
tu = ToolUniverse(hooks_enabled=True, hook_type='SummarizationHook')

tu.load_tools(['uniprot'])

result = tu.run({
    "name": "UniProt_get_entry_by_accession",
    "arguments": {"accession": "P05067"}
})

# Result is automatically summarized
print(f"Summary length: {len(str(result))} characters")

Advanced Configuration

hook_config = {
    "exclude_tools": [
        "Tool_RAG",
        "ToolFinderEmbedding"
    ],
    "hooks": [{
        "name": "protein_summarization",
        "type": "SummarizationHook",
        "enabled": True,
        "conditions": {
            "output_length": {
                "operator": ">",
                "threshold": 8000
            }
        },
        "hook_config": {
            "chunk_size": 30000,
            "focus_areas": "protein_function_and_structure",
            "max_summary_length": 3500
        }
    }]
}

tu = ToolUniverse(hooks_enabled=True, hook_config=hook_config)
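To make the conditions block more concrete, the sketch below shows one way an output_length condition could be evaluated. It is not ToolUniverse's implementation: the helper name should_summarize and the set of supported operators are assumptions for illustration (the configuration above only shows ">").

```python
import operator

# Hypothetical operator table; the configuration above only uses ">".
_OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def should_summarize(output, conditions):
    """Return True if the output satisfies the output_length condition."""
    rule = conditions.get("output_length")
    if rule is None:
        # Assumption: with no length condition, the hook always applies.
        return True
    compare = _OPERATORS[rule["operator"]]
    # Length is measured on the string form, as in the Quick Start example.
    return compare(len(str(output)), rule["threshold"])


conditions = {"output_length": {"operator": ">", "threshold": 8000}}
print(should_summarize("x" * 9000, conditions))  # True
print(should_summarize("x" * 8000, conditions))  # False
```

With a strict ">" operator, an output of exactly 8000 characters is left untouched, which is worth keeping in mind when picking a threshold.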

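Configuration Options

Chunk Size (chunk_size)
- Controls the size of the chunks used for processing
- Default: 30000 characters
- Recommended range: 10000-50000 characters

Focus Areas (focus_areas)
- Specifies what the summary should focus on
- Default: "key_findings_and_results"

Max Summary Length (max_summary_length)
- Limits the length of the final summary
- Default: 3000 characters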
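As a rough illustration of how chunk size and max summary length interact with a long output, the sketch below splits text into fixed-size chunks and caps a summary's length. The function names are hypothetical, and the real hook may split on different boundaries or shorten summaries in another way; only the default values are taken from the options above.

```python
def split_into_chunks(text, chunk_size=30000):
    """Split text into consecutive pieces of at most chunk_size characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def cap_length(summary, max_summary_length=3000):
    """Trim a summary so it never exceeds max_summary_length characters."""
    return summary[:max_summary_length]


output = "A" * 70000
chunks = split_into_chunks(output, chunk_size=30000)
print([len(c) for c in chunks])     # [30000, 30000, 10000]
print(len(cap_length("B" * 5000)))  # 3000
```

A 70000-character output yields three chunks at the default size, so a smaller chunk size means more summarization calls per output.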

Excluding Tools

Use exclude_tools to prevent specific tools from being summarized:

hook_config = {
    "exclude_tools": [
        "Tool_RAG",           # Exact match
        "ToolFinderEmbedding", # Exact match
        "CustomTool_*"         # Wildcard pattern
    ],
    "hooks": [...]
}

This is particularly useful for excluding tool discovery tools that shouldn’t be processed by hooks.
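The exact-match and wildcard behavior described above can be approximated with Python's standard fnmatch module. This is a sketch under the assumption that the patterns follow shell-style globbing; the helper name is_excluded is hypothetical, and ToolUniverse's own matching rules may differ.

```python
from fnmatch import fnmatchcase


def is_excluded(tool_name, exclude_tools):
    """Return True if tool_name matches any exact name or wildcard pattern."""
    return any(fnmatchcase(tool_name, pattern) for pattern in exclude_tools)


exclude_tools = ["Tool_RAG", "ToolFinderEmbedding", "CustomTool_*"]
print(is_excluded("Tool_RAG", exclude_tools))                        # True
print(is_excluded("CustomTool_search", exclude_tools))               # True
print(is_excluded("UniProt_get_entry_by_accession", exclude_tools))  # False
```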

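Focus Area Options

General focus areas:
- key_findings_and_results: main findings and results
- consolidate_and_prioritize: merges multiple summaries and prioritizes them
- technical_details: technical specifications and details
- main_conclusions: main conclusions and outcomes

Domain-specific focus areas:
- protein_function_and_structure: protein-related information
- compound_properties_and_activity: compound properties and activity data
- key_findings_and_relevance: literature search results
- clinical_significance_and_drug_interactions: clinical significance and drug interactions
- methodology_and_results: research methods and results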
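One way to choose among these values is to derive the focus area from the tool name. The sketch below is only an illustration: the prefix mapping and the helper pick_focus_area are hypothetical and are based solely on the tool names that appear in this document.

```python
DEFAULT_FOCUS = "key_findings_and_results"

# Hypothetical prefix-to-focus mapping built from the tools used in this document.
FOCUS_BY_PREFIX = {
    "UniProt_": "protein_function_and_structure",
    "ChEMBL_": "compound_properties_and_activity",
    "EuropePMC_": "key_findings_and_relevance",
}


def pick_focus_area(tool_name):
    """Return a domain-specific focus area, or the default if none applies."""
    for prefix, focus in FOCUS_BY_PREFIX.items():
        if tool_name.startswith(prefix):
            return focus
    return DEFAULT_FOCUS


print(pick_focus_area("UniProt_get_entry_by_accession"))  # protein_function_and_structure
print(pick_focus_area("SomeOtherTool"))                   # key_findings_and_results
```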

Examples

Scientific Literature Analysis

from tooluniverse import ToolUniverse

# Summarize literature search results
tu = ToolUniverse(hooks_enabled=True)
tu.load_tools(['europepmc'])

result = tu.run({
    "name": "EuropePMC_search_publications",
    "arguments": {
        "query": "CRISPR gene editing therapeutic applications",
        "resultType": "core"
    }
})

# Get AI-powered summary of research findings
print("Research Summary:")
print(result)

Protein Data Summarization

from tooluniverse import ToolUniverse

# Configure for protein data
protein_config = {
    'tool_specific_hooks': {
        'UniProt_get_entry_by_accession': {
            'enabled': True,
            'hooks': [{
                'name': 'protein_summarization',
                'type': 'SummarizationHook',
                'enabled': True,
                'conditions': {
                    'output_length': {
                        'operator': '>',
                        'threshold': 8000
                    }
                },
                'hook_config': {
                    'focus_areas': 'protein_function_and_structure',
                    'max_summary_length': 3500
                }
            }]
        }
    }
}

tu = ToolUniverse(hooks_enabled=True, hook_config=protein_config)

# Execute protein tool
result = tu.run({
    "name": "UniProt_get_entry_by_accession",
    "arguments": {"accession": "P05067"}
})

# Result will be summarized focusing on protein function and structure

Compound Analysis Summarization

from tooluniverse import ToolUniverse

# Configure for compound analysis
compound_config = {
    'tool_specific_hooks': {
        'ChEMBL_search_compounds': {
            'enabled': True,
            'hooks': [{
                'name': 'compound_summarization',
                'type': 'SummarizationHook',
                'enabled': True,
                'conditions': {
                    'output_length': {
                        'operator': '>',
                        'threshold': 7000
                    }
                },
                'hook_config': {
                    'focus_areas': 'compound_properties_and_activity',
                    'max_summary_length': 3000
                }
            }]
        }
    }
}

tu = ToolUniverse(hooks_enabled=True, hook_config=compound_config)

# Execute compound search
result = tu.run({
    "name": "ChEMBL_search_compounds",
    "arguments": {
        "compound_name": "aspirin",
        "limit": 100
    }
})

# Result will be summarized focusing on compound properties and activity
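The protein and compound configurations share the same shape and differ only in a few values. A small builder can remove the duplication. The function make_tool_hook_config is a hypothetical helper, not part of ToolUniverse; it simply returns a dict with the same structure as the examples above.

```python
def make_tool_hook_config(tool_name, hook_name, focus_areas,
                          threshold, max_summary_length):
    """Build a tool_specific_hooks dict shaped like the examples above."""
    return {
        "tool_specific_hooks": {
            tool_name: {
                "enabled": True,
                "hooks": [{
                    "name": hook_name,
                    "type": "SummarizationHook",
                    "enabled": True,
                    "conditions": {
                        "output_length": {"operator": ">", "threshold": threshold}
                    },
                    "hook_config": {
                        "focus_areas": focus_areas,
                        "max_summary_length": max_summary_length,
                    },
                }],
            }
        }
    }


protein_config = make_tool_hook_config(
    "UniProt_get_entry_by_accession", "protein_summarization",
    "protein_function_and_structure", threshold=8000, max_summary_length=3500,
)
compound_config = make_tool_hook_config(
    "ChEMBL_search_compounds", "compound_summarization",
    "compound_properties_and_activity", threshold=7000, max_summary_length=3000,
)
```

Either dict can then be passed as hook_config in the same way as the hand-written versions.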

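Troubleshooting

Summary not triggered
- Check the threshold setting: make sure the output length exceeds the threshold
- Verify the hook is enabled: check the enabled field
- Confirm the tool name matches: tool names must match exactly
- Review the conditions: check all condition parameters

Poor summary quality
- Adjust the focus area: use a more specific focus area
- Change the chunk size: smaller chunks may provide better context
- Increase the max summary length: allows a more detailed summary
- Check the query context: make sure the original query is captured

Performance issues
- Raise the threshold: reduces the number of outputs processed
- Tune the chunk size: balance processing time against quality
- Use tool-specific hooks: more efficient than global hooks
- Enable caching: reduces repeated processing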
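This document does not show how caching is configured, so the sketch below only illustrates the general idea behind it: identical outputs are summarized once and reused. The SummaryCache class and the stand-in summarizer are hypothetical and independent of ToolUniverse.

```python
import hashlib


class SummaryCache:
    """Tiny in-memory cache keyed by a hash of the raw tool output."""

    def __init__(self, summarize):
        self._summarize = summarize  # any callable taking and returning a str
        self._store = {}
        self.misses = 0

    def get(self, output):
        key = hashlib.sha256(output.encode("utf-8")).hexdigest()
        if key not in self._store:
            # Only the first occurrence of an output triggers summarization.
            self.misses += 1
            self._store[key] = self._summarize(output)
        return self._store[key]


def first_100_chars(text):
    """Stand-in summarizer for the demo; a real one would call an AI model."""
    return text[:100]


cache = SummaryCache(first_100_chars)
first = cache.get("long output " * 1000)
second = cache.get("long output " * 1000)
print(cache.misses)  # 1
```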

Debugging

Enable detailed logging of hook operations:

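import logging

from tooluniverse import ToolUniverse

logging.basicConfig(level=logging.DEBUG)

# Hook operations will be logged in detail.
# hook_config is a configuration dict such as the ones defined earlier on this page.
tu = ToolUniverse(hooks_enabled=True, hook_config=hook_config)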

Validation

Validate the hook configuration:

# Check hook configuration
hook_manager = tu.hook_manager
for hook in hook_manager.hooks:
    print(f"Hook: {hook.name}")
    print(f"Enabled: {hook.enabled}")
    print(f"Type: {hook.config.get('type')}")
    print(f"Conditions: {hook.config.get('conditions')}")
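A configuration dict can also be checked before it is handed to ToolUniverse. The sketch below is a plain-Python sanity check that mirrors the troubleshooting items (hook type, enabled flag, threshold). The function find_config_problems is hypothetical, and it assumes the global "hooks" layout shown in this document.

```python
def find_config_problems(hook_config):
    """Return a list of human-readable problems found in a hook config dict."""
    problems = []
    for i, hook in enumerate(hook_config.get("hooks", [])):
        label = hook.get("name", f"hooks[{i}]")
        if hook.get("type") != "SummarizationHook":
            problems.append(f"{label}: unexpected type {hook.get('type')!r}")
        if hook.get("enabled") is False:
            problems.append(f"{label}: hook is disabled")
        rule = hook.get("conditions", {}).get("output_length", {})
        if not isinstance(rule.get("threshold"), int):
            problems.append(f"{label}: output_length threshold is missing or not an integer")
    return problems


good_config = {
    "hooks": [{
        "name": "protein_summarization",
        "type": "SummarizationHook",
        "enabled": True,
        "conditions": {"output_length": {"operator": ">", "threshold": 8000}},
    }]
}
print(find_config_problems(good_config))  # []
```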

Next Steps

See Also