Tool Description Optimizer - Complete Tutorial¶
🚀 Quick Start¶
Basic Usage (3 steps)¶
from tooluniverse import ToolUniverse
# 1. Initialize and load tools
tu = ToolUniverse()
tu.load_tools()
# 2. Get tool configuration and run optimizer
tool_config = tu.get_tool_description("your_tool_name")
result = tu.run({
    "name": "ToolDescriptionOptimizer",
    "arguments": {"tool_config": tool_config}
})
# 3. Check results
print(f"✅ Optimized in {result['total_iterations']} rounds, score: {result['final_quality_score']}/10")
Example Results¶
Before: "Fetch a list of active ingredients in a specific drug product."
After: "Provides detailed information about the active ingredients in drug products, including their properties and quantities, as found in various formulations and brands."
Overview¶
The ToolDescriptionOptimizer is an AI-powered tool that automatically improves tool documentation through iterative optimization. It analyzes actual test results to generate more accurate, concise, and user-friendly descriptions.
Core Functions¶
Test Case Generation: Creates diverse test scenarios for comprehensive analysis
Multi-Round Optimization: Iteratively improves through feedback-driven enhancement
Quality Assessment: Evaluates 6 key criteria with objective scoring
Comprehensive Reporting: Documents the entire optimization process
What Gets Optimized¶
Tool Description: Overall purpose and high-level functionality
Parameter Descriptions: Input requirements, constraints, and examples
Language Quality: Clarity, conciseness, and user-friendliness
Accuracy: Alignment with actual tool behavior
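For orientation, here is a minimal, hypothetical tool configuration of the kind the optimizer works on. The field names follow common ToolUniverse conventions but may differ for your tools; the description strings are the parts that get rewritten:

# Hypothetical tool_config for illustration only; real configs come
# from tu.get_tool_description() and may carry additional fields.
tool_config = {
    "name": "example_drug_lookup",
    "description": "Fetch a list of active ingredients in a specific drug product.",
    "parameter": {
        "type": "object",
        "properties": {
            "drug_name": {
                "type": "string",
                "description": "Name of the drug product to look up."
            }
        },
        "required": ["drug_name"]
    }
}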
Key Features¶
🎯 Intelligent Analysis¶
Tests tools with real inputs to verify description accuracy
Identifies and eliminates redundancy between tool and parameter descriptions
Removes filler language and meaningless corporate-speak
📊 Quality Evaluation System¶
Multi-criteria scoring (1-10 scale):
Clarity & Understandability: How easy descriptions are to understand
Accuracy: How well descriptions match actual behavior
Completeness: Whether all necessary information is included
Conciseness: Absence of redundancy and filler language
User Friendliness: How helpful descriptions are for users
Redundancy Avoidance: Proper separation of concerns
🔄 Adaptive Optimization¶
Configurable iteration limits and quality thresholds
Feedback-driven test case generation in later rounds
Automatic convergence when quality targets are met
Configuration Parameters¶
Parameter | Type | Default | Description
---|---|---|---
`tool_config` | dict | Required | The tool configuration to optimize
`max_iterations` | int | 3 | Maximum number of optimization rounds
`satisfaction_threshold` | float | 8.0 | Quality score threshold (1-10) for satisfaction
`output_file` | string | auto-generated | Custom path for the optimization report
 | bool | true | Whether to save optimization results
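Putting these together, a fully specified call looks like this (values mirror the defaults in the table above):

result = tu.run({
    "name": "ToolDescriptionOptimizer",
    "arguments": {
        "tool_config": tool_config,           # required
        "max_iterations": 3,                  # default: 3
        "satisfaction_threshold": 8.0,        # default: 8.0
        "output_file": "my_tool_report.txt"   # default: auto-generated
    }
})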
Common Configuration Patterns¶
# Quick improvement (development phase)
{"max_iterations": 2, "satisfaction_threshold": 7.5}
# Standard optimization (most use cases)
{"max_iterations": 3, "satisfaction_threshold": 8.0}
# Thorough optimization (production tools)
{"max_iterations": 5, "satisfaction_threshold": 8.5}
Understanding the Output¶
Optimization Report Structure¶
Final Optimized Descriptions: Improved tool and parameter descriptions
Optimization History: Round-by-round progression with scores and feedback
Detailed Rationale: Explanations for each change made
Test Results: Actual tool execution data used for analysis
Complete Configurations: Before and after JSON configurations
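Alongside the written report, the returned result exposes summary fields you can inspect programmatically (these three fields also appear in the batch script later in this tutorial):

print(result['total_iterations'])       # rounds actually run
print(result['final_quality_score'])    # last quality score (out of 10)
print(result['achieved_satisfaction'])  # True if the threshold was met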
Reading Quality Scores¶
7.0-7.9: Good improvement, suitable for development
8.0-8.4: High quality, ready for production use
8.5-8.9: Excellent quality, meets strict standards
9.0+: Outstanding quality, rarely achieved
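A simple gate based on these bands might look like this:

score = result['final_quality_score']
if score >= 8.0:
    print("Ready for production use")
elif score >= 7.0:
    print("Good enough for development; consider another pass")
else:
    print("Review manually before applying")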
Real Example: FDA Tool Optimization¶
Step-by-Step Process¶
Original Description:
"Fetch a list of active ingredients in a specific drug product."
Round 1: The optimizer identifies that the description is too simplistic, based on test results showing detailed ingredient information with quantities and properties.
Round 2: Adds detail about quantities and properties, but feedback indicates redundancy with parameter descriptions.
Round 3: Eliminates redundancy and focuses on core functionality.
Final Result:
"Provides detailed information about the active ingredients in drug products, including their properties and quantities, as found in various formulations and brands."
Key Improvements Achieved¶
✅ More accurate representation of actual functionality
✅ Eliminated redundancy with parameter descriptions
✅ Clearer scope indication (multiple formulations and brands)
✅ More precise action verb (“provides” vs “fetch”)
✅ Quality score improved from ~5.0 to 7.5/10
Best Practices¶
1. Tool Selection Strategy¶
Prioritize: Complex tools, frequently used tools, unclear descriptions
Avoid: Simple tools, recently optimized tools, jargon-heavy domain-specific tools
2. Configuration Guidelines¶
Scenario | max_iterations | satisfaction_threshold | Use Case
---|---|---|---
Development | 2-3 | 7.0-7.5 | Quick improvements
Production | 3-4 | 8.0-8.5 | Standard quality
Critical Tools | 4-5 | 8.5-9.0 | Highest quality
3. Review Process¶
Always review optimized descriptions before applying them (see the sketch after this list)
Test functionality with new descriptions
Consider domain expertise that AI might miss
Iterate if needed with different thresholds
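A lightweight review step can be as simple as printing the current description next to the report path so a human signs off before anything is applied. This is a minimal sketch, assuming tu is initialized as in Quick Start and that the config dict carries a description field:

def review_optimization(tool_name, report_path):
    """Show the current description and where to find the proposed one."""
    original = tu.get_tool_description(tool_name)
    # Assumes the config dict exposes a 'description' key
    print(f"Current description: {original['description']}")
    print(f"Proposed changes: see {report_path}")
    return input("Apply changes? [y/N] ").lower() == "y"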
Advanced Features¶
Feedback-Driven Enhancement¶
Later rounds generate targeted test cases based on previous feedback
Ensures comprehensive coverage of edge cases and usage patterns
Addresses specific quality issues identified in earlier rounds
Robust Parsing Technology¶
Handles various LLM output formats with whitespace/formatting issues
Uses regex patterns and fallback strategies for JSON extraction (illustrated after this list)
Resolves “No additional test cases generated” issues automatically
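For illustration only, a fallback JSON extractor in this spirit might look like the sketch below; this is not the optimizer's actual implementation:

import json
import re

def extract_json(llm_output: str):
    """Best-effort JSON extraction from free-form LLM output."""
    # First try: the whole response is already valid JSON.
    try:
        return json.loads(llm_output.strip())
    except json.JSONDecodeError:
        pass
    # Fallback: pull the outermost {...} span out of surrounding prose.
    match = re.search(r"\{.*\}", llm_output, re.DOTALL)
    if match:
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            pass
    return None  # caller decides how to handle unparseable output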
Multi-Criteria Balancing¶
Maintains accuracy while improving conciseness
Preserves user-friendliness during clarity improvements
Eliminates redundancy without losing important information
Troubleshooting¶
Common Issues¶
Issue | Cause | Solution
---|---|---
Quality score plateaus | Tool already well-optimized | Lower the threshold or try a different tool
Optimization takes too long | Complex tool or high iteration count | Reduce max_iterations to 2-3
Descriptions too technical | Domain-specific complexity | Manual review and adjustment needed
No test cases generated | Parsing bug in older versions | Update to the latest version
Performance Tips¶
Start with lower thresholds (7.5) and iterate up (see the sketch after this list)
Use fewer iterations for initial testing
Focus on one tool at a time for detailed analysis
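The "start low, iterate up" tip translates to a small loop. A sketch, assuming tu is initialized as in Quick Start; adjust the thresholds to taste:

def optimize_incrementally(tool_config, thresholds=(7.5, 8.0, 8.5)):
    """Re-run with progressively stricter thresholds, stopping early."""
    result = None
    for threshold in thresholds:
        result = tu.run({
            "name": "ToolDescriptionOptimizer",
            "arguments": {
                "tool_config": tool_config,
                "max_iterations": 2,  # keep rounds low for initial testing
                "satisfaction_threshold": threshold
            }
        })
        if not result['achieved_satisfaction']:
            break  # stop raising the bar once a threshold is missed
    return result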
Integration Tutorial¶
Development Workflow Integration¶
# Development phase - quick feedback
def dev_optimize(tool_config):
    return tu.run({
        "name": "ToolDescriptionOptimizer",
        "arguments": {
            "tool_config": tool_config,
            "max_iterations": 2,
            "satisfaction_threshold": 7.0
        }
    })

# Pre-production - thorough review
def production_optimize(tool_config):
    return tu.run({
        "name": "ToolDescriptionOptimizer",
        "arguments": {
            "tool_config": tool_config,
            "max_iterations": 4,
            "satisfaction_threshold": 8.5
        }
    })
Maintenance Strategy¶
Regular audits: Quarterly review of tool descriptions
Update triggers: When tool functionality changes
User feedback: Incorporate real user confusion reports
Version control: Track optimization history for rollbacks
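For the version-control point, one lightweight approach is to persist each run's summary with a timestamp so regressions are easy to trace. A sketch using only the result fields shown earlier:

import json
from datetime import datetime

def log_optimization(tool_name, result, log_file="optimization_history.json"):
    """Append a summary record for later audits and rollbacks."""
    record = {
        "tool": tool_name,
        "timestamp": datetime.now().isoformat(),
        "iterations": result['total_iterations'],
        "final_score": result['final_quality_score'],
        "satisfied": result['achieved_satisfaction']
    }
    try:
        with open(log_file) as f:
            history = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        history = []
    history.append(record)
    with open(log_file, "w") as f:
        json.dump(history, f, indent=2)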
Complete Example Scripts¶
Single Tool Optimization¶
#!/usr/bin/env python3
"""Single Tool Optimization Script"""
from tooluniverse import ToolUniverse

def optimize_single_tool(tool_name, quality_target=8.0):
    """Optimize one tool with a custom quality target."""
    tu = ToolUniverse()
    tu.load_tools()
    tool_config = tu.get_tool_description(tool_name)
    if not tool_config:
        print(f"❌ Tool '{tool_name}' not found!")
        return None
    print(f"🔧 Optimizing {tool_name}...")
    result = tu.run({
        "name": "ToolDescriptionOptimizer",
        "arguments": {
            "tool_config": tool_config,
            "max_iterations": 3,
            "satisfaction_threshold": quality_target,
            "output_file": f"{tool_name}_optimization.txt"
        }
    })
    print(f"✅ Completed: {result['total_iterations']} rounds, score: {result['final_quality_score']}/10")
    return result

# Usage
optimize_single_tool("FDA_get_active_ingredient_info_by_drug_name", 8.0)
Batch Tool Optimization¶
#!/usr/bin/env python3
"""Batch Optimization Script"""
from tooluniverse import ToolUniverse
import os
from datetime import datetime

def batch_optimize_tools(tool_names, output_dir="batch_optimizations"):
    """Optimize multiple tools and save results."""
    tu = ToolUniverse()
    tu.load_tools()
    os.makedirs(output_dir, exist_ok=True)
    results = {}
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    for tool_name in tool_names:
        print(f"\n🔧 Processing {tool_name}...")
        tool_config = tu.get_tool_description(tool_name)
        if not tool_config:
            print(f"❌ Tool '{tool_name}' not found!")
            continue
        try:
            result = tu.run({
                "name": "ToolDescriptionOptimizer",
                "arguments": {
                    "tool_config": tool_config,
                    "max_iterations": 3,
                    "satisfaction_threshold": 8.0,
                    "output_file": f"{output_dir}/{tool_name}_{timestamp}.txt"
                }
            })
            results[tool_name] = {
                'success': True,
                'iterations': result['total_iterations'],
                'final_score': result['final_quality_score'],
                'satisfied': result['achieved_satisfaction']
            }
            print(f"✅ {tool_name}: {result['final_quality_score']}/10 in {result['total_iterations']} rounds")
        except Exception as e:
            results[tool_name] = {'success': False, 'error': str(e)}
            print(f"❌ {tool_name}: Failed - {e}")
    # Summary report
    print("\n📊 Batch Optimization Summary:")
    successful = [name for name, result in results.items() if result.get('success')]
    print(f"✅ Successful: {len(successful)}/{len(tool_names)}")
    if successful:
        avg_score = sum(results[name]['final_score'] for name in successful) / len(successful)
        print(f"📈 Average quality score: {avg_score:.1f}/10")
    return results

# Usage
tools_to_optimize = [
    "FDA_get_active_ingredient_info_by_drug_name",
    "ChEMBL_get_drug_by_chembl_id",
    "EuropePMC_search_articles"
]
batch_optimize_tools(tools_to_optimize)
Advanced Integration Script¶
#!/usr/bin/env python3
"""Advanced Integration Example"""
from tooluniverse import ToolUniverse
import os
from typing import Dict, Optional

class ToolOptimizationManager:
    """Advanced tool optimization management with reporting."""

    def __init__(self, output_dir: str = "optimizations"):
        self.tu = ToolUniverse()
        self.tu.load_tools()
        self.output_dir = output_dir
        os.makedirs(output_dir, exist_ok=True)

    def optimize_with_strategy(self, tool_name: str, strategy: str = "standard") -> Optional[Dict]:
        """Optimize a tool with a predefined strategy."""
        strategies = {
            "quick": {"max_iterations": 2, "satisfaction_threshold": 7.5},
            "standard": {"max_iterations": 3, "satisfaction_threshold": 8.0},
            "thorough": {"max_iterations": 5, "satisfaction_threshold": 8.5}
        }
        if strategy not in strategies:
            raise ValueError(f"Unknown strategy: {strategy}")
        config = strategies[strategy]
        tool_config = self.tu.get_tool_description(tool_name)
        if not tool_config:
            print(f"Tool '{tool_name}' not found!")
            return None
        result = self.tu.run({
            "name": "ToolDescriptionOptimizer",
            "arguments": {
                "tool_config": tool_config,
                "output_file": f"{self.output_dir}/{tool_name}_{strategy}.txt",
                **config
            }
        })
        return result

    def compare_strategies(self, tool_name: str) -> Dict:
        """Compare different optimization strategies for a tool."""
        results = {}
        for strategy in ["quick", "standard", "thorough"]:
            print(f"Testing {strategy} strategy...")
            results[strategy] = self.optimize_with_strategy(tool_name, strategy)
        return results

# Usage
manager = ToolOptimizationManager()
# Single tool with specific strategy
result = manager.optimize_with_strategy("FDA_get_active_ingredient_info_by_drug_name", "thorough")
# Compare strategies
comparison = manager.compare_strategies("ChEMBL_get_drug_by_chembl_id")
Summary¶
The ToolDescriptionOptimizer provides automated, AI-powered enhancement of tool documentation with:
🎯 Precision: Matches descriptions to actual tool behavior through testing
🔄 Iteration: Multi-round improvement with feedback-driven enhancement
📊 Measurement: Objective quality scoring across 6 key criteria
🛠️ Flexibility: Configurable for different quality targets and use cases
📝 Documentation: Comprehensive reporting of optimization process
Quick Reference¶
Development:
max_iterations=2, satisfaction_threshold=7.5
Production:
max_iterations=3, satisfaction_threshold=8.0
Critical:
max_iterations=5, satisfaction_threshold=8.5
Start with the Quick Start section above, then refer to specific sections as needed for your use case.