Available Tools Reference
Complete reference of all ToolUniverse scientific tools and their capabilities.
ToolUniverse provides about 600 tools (588 in the summary below) across eight major categories, each serving specific computational and analytical requirements in scientific research.
Tool Ecosystem Overview
ToolUniverse integrates tools across eight major categories:
ToolUniverse Ecosystem (588 tools in eight categories):
ML Models (AI/ML): 15 tools → Prediction, Classification, Generation
AI Agents (Agentic): 33 tools → Autonomous Planning, Tool Routing
Software Packages: 164 tools → Bioinformatics, Analysis Packages
Human Expert Feedback: 6 tools → Consultation, Validation, Feedback
Robotics (Automation): 1 tool → ROS Communication, Lab Automation
Databases (Storage): 84 tools → Structured Data, Knowledge Bases
Embedding Stores: 4 tools → Vector Search, Semantic Retrieval
APIs (Integration): 281 tools → External Services, Data Access
Tool Categories Summary
Category | Count | Percentage | Primary Use Cases |
---|---|---|---|
APIs | 281 | 47.8% | External data access, real-time information |
Software Packages | 164 | 27.9% | Computational analysis, local processing |
Databases | 84 | 14.3% | Structured data storage and retrieval |
AI Agents | 33 | 5.6% | Autonomous reasoning and planning |
ML Models | 15 | 2.6% | Prediction and classification tasks |
Expert Feedback | 6 | 1.0% | Human validation and guidance |
Embedding Stores | 4 | 0.7% | Semantic search and similarity |
Robotics | 1 | 0.2% | Laboratory automation |
Total | 588 | 100% | Comprehensive scientific ecosystem |
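The percentage column can be derived from the counts alone. The following minimal sketch recomputes each category's share of the total; the dictionary restates the counts from the summary, and the variable names are illustrative only.

```python
# Tool counts per category, copied from the summary table.
counts = {
    "APIs": 281,
    "Software Packages": 164,
    "Databases": 84,
    "AI Agents": 33,
    "ML Models": 15,
    "Expert Feedback": 6,
    "Embedding Stores": 4,
    "Robotics": 1,
}

total = sum(counts.values())

# Share of the total for each category, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, n in counts.items():
    print(f"{name:<18} {n:>4} {shares[name]:>5.1f}%")
print(f"{'Total':<18} {total:>4}")
```

Because each share is rounded independently, the rounded values may not add up to exactly 100.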
Molecular & Genetic Data
UniProt - Protein Information
Access comprehensive protein and gene information.
Key Functions:
* UniProt_get_protein_info
- Get detailed protein information by gene symbol
* UniProt_search_proteins
- Search proteins by keywords
* UniProt_get_protein_sequence
- Retrieve protein sequences
Example:
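from tooluniverse import ToolUniverse
# Create the client and load the tool registry once; later examples reuse tu
tu = ToolUniverse()
tu.load_tools()
query = {
    "name": "UniProt_get_protein_info",
    "arguments": {"gene_symbol": "BRCA1"}
}
result = tu.run(query)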
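Every example on this page passes tu.run a dictionary with a "name" key and an "arguments" key. A small helper can build that dictionary and reject an empty tool name early. This is only a sketch: make_query is a hypothetical convenience function, not part of ToolUniverse.

```python
def make_query(name, **arguments):
    """Build a query dict in the {"name": ..., "arguments": {...}} shape.

    make_query is a hypothetical helper, not a ToolUniverse function.
    """
    if not isinstance(name, str) or not name:
        raise ValueError("tool name must be a non-empty string")
    return {"name": name, "arguments": dict(arguments)}


query = make_query("UniProt_get_protein_info", gene_symbol="BRCA1")
print(query)
```

The resulting dictionary can be passed to tu.run in the same way as the handwritten ones.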
Gene Ontology - Functional Annotation
Gene Ontology annotations and functional analysis.
Key Functions:
* GeneOntology_get_annotations
- Get GO annotations for genes
* GeneOntology_search_terms
- Search GO terms
* GeneOntology_get_enrichment
- Functional enrichment analysis
Example:
query = {
    "name": "GeneOntology_get_annotations",
    "arguments": {"gene_symbols": ["BRCA1", "BRCA2", "TP53"]}
}
Enrichr - Gene Set Analysis
Comprehensive gene set enrichment analysis.
Key Functions:
* Enrichr_analyze_gene_list
- Enrichment analysis for gene lists
* Enrichr_get_libraries
- List available gene set libraries
* Enrichr_download_results
- Download enrichment results
Example:
query = {
    "name": "Enrichr_analyze_gene_list",
    "arguments": {
        "genes": ["BRCA1", "BRCA2", "TP53", "ATM", "CHEK2"],
        "library": "KEGG_2021_Human"
    }
}
Disease & Target Data
OpenTargets Platform
Comprehensive disease-target association data.
Key Functions:
* OpenTargets_get_associated_targets_by_disease_efoId
- Disease-associated targets
* OpenTargets_get_associated_diseases_by_target
- Target-associated diseases
* OpenTargets_get_disease_id_description_by_name
- Disease lookup
* OpenTargets_get_evidence
- Evidence for associations
* OpenTargets_get_drug_info
- Drug information and mechanisms
Example:
# Get targets for Alzheimer's disease
query = {
    "name": "OpenTargets_get_associated_targets_by_disease_efoId",
    "arguments": {"efoId": "EFO_0000249"}
}
EFO - Experimental Factor Ontology
Disease and experimental factor ontology.
Key Functions:
* EFO_search_diseases
- Search diseases by name
* EFO_get_disease_hierarchy
- Get disease relationships
* EFO_get_synonyms
- Get disease synonyms
Example:
query = {
    "name": "EFO_search_diseases",
    "arguments": {"query": "diabetes"}
}
Drug & Chemical Data
PubChem - Chemical Information
Comprehensive chemical compound database.
Key Functions:
* PubChem_get_compound_info
- Get compound information by name/ID
* PubChem_search_compounds
- Search compounds by structure/properties
* PubChem_get_compound_properties
- Molecular properties
* PubChem_similarity_search
- Chemical similarity search
Example:
query = {
    "name": "PubChem_get_compound_info",
    "arguments": {"compound_name": "aspirin"}
}
ChEMBL - Bioactivity Data
Chemical bioactivity and drug discovery data.
Key Functions:
* ChEMBL_get_compound_targets
- Get targets for compounds
* ChEMBL_get_compounds_by_target
- Get compounds targeting proteins
* ChEMBL_get_bioactivity_data
- Bioactivity measurements
* ChEMBL_search_similar_compounds
- Chemical similarity search
Example:
query = {
    "name": "ChEMBL_get_compounds_by_target",
    "arguments": {"target_symbol": "EGFR"}
}
Drug Safety & Regulatory
OpenFDA - FDA Data
FDA drug labeling and adverse event data.
Key Functions:
* FAERS_count_reactions_by_drug_event
- Count adverse reactions by drug
* openfda_get_warnings_by_drug_name
- Get FDA warnings
* OpenFDA_get_drug_labels
- Drug labeling information
* OpenFDA_search_recalls
- Drug recall information
Example:
# Search adverse events
query = {
    "name": "FAERS_count_reactions_by_drug_event",
    "arguments": {"medicinalproduct": "warfarin"}
}
# Get FDA warnings
query = {
    "name": "openfda_get_warnings_by_drug_name",
    "arguments": {"medicinalproduct": "warfarin"}
}
DailyMed - Drug Labeling
Official FDA drug labeling information.
Key Functions:
* DailyMed_get_drug_label
- Get official drug labels
* DailyMed_search_drugs
- Search drugs by name
* DailyMed_get_NDC_info
- NDC (drug code) information
Example:
query = {
    "name": "DailyMed_get_drug_label",
    "arguments": {"medicinalproduct": "metformin"}
}
Clinical Research
ClinicalTrials.gov
Clinical trial registry and results database.
Key Functions:
* ClinicalTrials_search_studies
- Search clinical trials
* ClinicalTrials_get_study_details
- Get detailed study information
* ClinicalTrials_get_trial_results
- Get trial results
* ClinicalTrials_search_by_condition
- Find trials by medical condition
Example:
query = {
    "name": "ClinicalTrials_search_studies",
    "arguments": {
        "condition": "breast cancer",
        "intervention": "immunotherapy"
    }
}
Literature & Publications
PubTator - Biomedical Literature
PubMed literature with named entity recognition.
Key Functions:
* PubTator_search_publications
- Search literature with entities
* PubTator_get_annotations
- Get entity annotations
* PubTator_search_by_entity
- Search by specific entities
Example:
query = {
    "name": "PubTator_search_publications",
    "arguments": {
        "query": "@GENE_BRCA1 @DISEASE_cancer"
    }
}
Europe PMC
European literature database with full-text access.
Key Functions:
* EuropePMC_search_articles
- Search articles and abstracts
* EuropePMC_get_full_text
- Get full-text when available
* EuropePMC_get_citations
- Get citation data
Example:
query = {
    "name": "EuropePMC_search_articles",
    "arguments": {"query": "CRISPR gene therapy"}
}
Semantic Scholar
AI-powered academic search engine.
Key Functions:
* SemanticScholar_search_papers
- Search academic papers
* SemanticScholar_get_paper_details
- Get detailed paper information
* SemanticScholar_get_citations
- Citation network analysis
Example:
query = {
    "name": "SemanticScholar_search_papers",
    "arguments": {"query": "machine learning drug discovery"}
}
OpenAlex
Open academic publication database.
Key Functions:
* OpenAlex_search_works
- Search academic works
* OpenAlex_get_author_info
- Author information and metrics
* OpenAlex_get_institution_data
- Institution research data
Specialized Databases
Human Protein Atlas
Tissue and cell expression data.
Key Functions:
* HPA_get_tissue_expression
- Tissue expression patterns
* HPA_get_cell_expression
- Single-cell expression data
* HPA_get_protein_localization
- Subcellular localization
Example:
query = {
    "name": "HPA_get_tissue_expression",
    "arguments": {"gene_symbol": "BRCA1"}
}
Reactome Pathways
Biological pathway database.
Key Functions:
* Reactome_get_pathways_by_gene
- Pathways for genes
* Reactome_search_pathways
- Search pathway database
* Reactome_get_pathway_details
- Detailed pathway information
Example:
query = {
    "name": "Reactome_get_pathways_by_gene",
    "arguments": {"gene_symbol": "TP53"}
}
HumanBase
Tissue-specific gene networks.
Key Functions:
* HumanBase_get_gene_networks
- Tissue-specific networks
* HumanBase_predict_gene_function
- Gene function prediction
* HumanBase_get_tissue_expression
- Tissue expression patterns
MedlinePlus
Consumer health information.
Key Functions:
* MedlinePlus_get_health_topics
- Health topic information
* MedlinePlus_search_conditions
- Search medical conditions
* MedlinePlus_get_drug_info
- Consumer drug information
AI-Powered Tools
Machine Learning Models (15 tools)
Apply machine learning algorithms for prediction, classification, and generation tasks.
Core ML Tools:
boltz2_docking - Protein-ligand binding prediction
{
    "name": "boltz2_docking",
    "arguments": {
        "protein_structure": "1ABC",
        "ligand_smiles": "CCO"
    }
}
# Returns: binding_affinity, binding_probability, confidence_score
ADMET_predict_CYP_interactions - Drug metabolism prediction
{
    "name": "ADMET_predict_CYP_interactions",
    "arguments": {
        "smiles": "CC(=O)OC1=CC=CC=C1C(=O)O",  # Aspirin
        "cyp_enzymes": ["CYP3A4", "CYP2D6"]
    }
}
# Returns: interaction_probabilities, metabolic_stability
run_TxAgent_biomedical_reasoning - Therapeutic reasoning
{
    "name": "run_TxAgent_biomedical_reasoning",
    "arguments": {
        "query": "What are the therapeutic targets for Alzheimer's disease?",
        "context": "precision_medicine"
    }
}
# Returns: therapeutic_insights, target_recommendations
AI Agents (33 tools)
Autonomous tools that perceive environments, make decisions, and take actions toward research goals.
Literature & Analysis Agents:
HypothesisGenerator - Generate research hypotheses
{
    "name": "HypothesisGenerator",
    "arguments": {
        "research_area": "cancer immunotherapy",
        "constraints": ["FDA-approved targets", "known biomarkers"],
        "num_hypotheses": 5
    }
}
# Returns: ranked_hypotheses, supporting_evidence, testable_predictions
ExperimentalDesignScorer - Evaluate experimental designs
{
    "name": "ExperimentalDesignScorer",
    "arguments": {
        "experiment_description": "Phase II trial for EGFR inhibitor",
        "evaluation_criteria": ["feasibility", "statistical_power", "ethics"]
    }
}
# Returns: design_score, improvement_suggestions, risk_assessment
MedicalLiteratureReviewer - Comprehensive literature analysis
{
    "name": "MedicalLiteratureReviewer",
    "arguments": {
        "topic": "CAR-T cell therapy safety profile",
        "databases": ["PubMed", "ClinicalTrials.gov"],
        "time_range": "2020-2024"
    }
}
# Returns: comprehensive_review, key_findings, research_gaps
Tool Discovery & Composition
AI tools for discovering and combining other tools.
Key Functions:
* discover_tools_by_description
- Find tools by natural language
* compose_tools_for_workflow
- Create tool workflows
* optimize_tool_descriptions
- Improve tool descriptions
Example:
query = {
    "name": "discover_tools_by_description",
    "arguments": {
        "description": "I need to find genes associated with heart disease"
    }
}
Search & Integration Tools
Tool Finder
Find appropriate tools for your research needs.
Key Functions:
* find_tools_by_keyword
- Keyword-based tool search
* find_tools_by_category
- Browse tools by category
* get_tool_recommendations
- Get tool recommendations
Example:
query = {
    "name": "find_tools_by_keyword",
    "arguments": {"keywords": ["drug", "safety", "adverse"]}
}
Embedding Stores (4 tools)
Store and retrieve vectorized representations of scientific data for semantic search.
Core Embedding Tools:
embedding_tool_finder - Semantic tool discovery
{
    "name": "embedding_tool_finder",
    "arguments": {
        "query": "predict protein folding dynamics",
        "top_k": 10,
        "similarity_threshold": 0.7
    }
}
# Returns: relevant_tools, similarity_scores, tool_descriptions
embedding_database_search - Vector similarity search
{
    "name": "embedding_database_search",
    "arguments": {
        "query_vector": embedding_vector,
        "database": "pubmed_abstracts",
        "top_k": 50
    }
}
# Returns: similar_documents, relevance_scores, metadata
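To illustrate what a vector similarity search computes, here is a self-contained sketch of cosine-similarity ranking with a top_k cutoff and a similarity threshold. It is not the implementation behind embedding_database_search; the function names and toy vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query_vector, documents, top_k=3, similarity_threshold=0.0):
    """Return up to top_k (doc_id, score) pairs, best score first."""
    scored = [
        (doc_id, cosine_similarity(query_vector, vector))
        for doc_id, vector in documents.items()
    ]
    kept = [pair for pair in scored if pair[1] >= similarity_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


# Toy 3-dimensional "embeddings"; real embeddings have far more dimensions.
documents = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
hits = top_k_similar([1.0, 0.0, 0.0], documents, top_k=2, similarity_threshold=0.7)
print(hits)
```

A production store would typically replace this linear scan with an approximate nearest-neighbor index, but the ranking criterion is the same idea.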
Data Integration
Tools for combining data from multiple sources.
Key Functions:
* integrate_gene_data
- Combine gene data from multiple sources
* cross_reference_identifiers
- Map between different ID systems
* validate_data_consistency
- Check data consistency
Tool Usage Patterns
Single Tool Queries
Simple, focused queries for specific information:
# Get protein info
protein_query = {
    "name": "UniProt_get_protein_info",
    "arguments": {"gene_symbol": "EGFR"}
}
# Search adverse events
safety_query = {
    "name": "FAERS_count_reactions_by_drug_event",
    "arguments": {"medicinalproduct": "metformin"}
}
Multi-Tool Workflows
Combine multiple tools for comprehensive analysis:
# Step 1: Get disease info
disease_query = {
    "name": "OpenTargets_get_disease_id_description_by_name",
    "arguments": {"diseaseName": "diabetes"}
}
# Step 2: Get associated targets
# (disease_id is the EFO ID taken from the Step 1 result)
targets_query = {
    "name": "OpenTargets_get_associated_targets_by_disease_efoId",
    "arguments": {"efoId": disease_id}
}
# Step 3: Analyze target pathways
# (target_list holds the gene symbols taken from the Step 2 result)
pathway_query = {
    "name": "Enrichr_analyze_gene_list",
    "arguments": {
        "genes": target_list,
        "library": "KEGG_2021_Human"
    }
}
Batch Processing
Process multiple related queries efficiently:
# Process multiple genes
genes = ["BRCA1", "BRCA2", "TP53", "ATM"]
results = {}
for gene in genes:
    query = {
        "name": "UniProt_get_protein_info",
        "arguments": {"gene_symbol": gene}
    }
    results[gene] = tu.run(query)
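In a long batch, one failing query should not discard the results already collected. The sketch below records a per-item error instead of aborting. run_batch is a hypothetical helper; it accepts any callable with the same calling convention as tu.run, and a stand-in function is used here so the example runs without network access.

```python
def run_batch(run, tool_name, argument_name, values):
    """Run one tool over many values, collecting results and errors separately.

    `run` is any callable that takes a query dict, for example tu.run.
    """
    results, errors = {}, {}
    for value in values:
        query = {"name": tool_name, "arguments": {argument_name: value}}
        try:
            results[value] = run(query)
        except Exception as exc:  # keep going; report the failure afterwards
            errors[value] = str(exc)
    return results, errors


def fake_run(query):
    """Stand-in for tu.run that fails for one gene (illustration only)."""
    gene = query["arguments"]["gene_symbol"]
    if gene == "ATM":
        raise RuntimeError("simulated timeout")
    return {"gene": gene}


results, errors = run_batch(
    fake_run, "UniProt_get_protein_info", "gene_symbol",
    ["BRCA1", "BRCA2", "TP53", "ATM"],
)
print(sorted(results), errors)
```

Keeping errors in their own dictionary makes it easy to retry only the failed items later.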
Integration Patterns
Multi-Tool Workflows
Combine multiple tools for comprehensive analysis:
from tooluniverse import ToolUniverse
# Drug discovery workflow
# Tool and argument names follow the OpenTargets examples earlier on this page;
# the result shapes (disease_info['id'], target['id'], drug['name']) are assumed.
def drug_discovery_pipeline(disease_name):
    tooluni = ToolUniverse()
    tooluni.load_tools()
    # 1. Find disease ID
    disease_query = {
        "name": "OpenTargets_get_disease_id_description_by_name",
        "arguments": {"diseaseName": disease_name}
    }
    disease_info = tooluni.run(disease_query)
    # 2. Get associated targets
    targets_query = {
        "name": "OpenTargets_get_associated_targets_by_disease_efoId",
        "arguments": {"efoId": disease_info['id']}
    }
    targets = tooluni.run(targets_query)
    # 3. Find drugs for each target
    drugs = []
    for target in targets[:5]:  # Top 5 targets
        drugs_query = {
            "name": "OpenTargets_get_associated_drugs_by_target_ensemblID",
            "arguments": {
                "target_ensembl_id": target['id'],
                "size": 10,
                "cursor": ""
            }
        }
        target_drugs = tooluni.run(drugs_query)
        drugs.extend(target_drugs)
    # 4. Check safety profiles
    for drug in drugs[:10]:  # Top 10 drugs
        safety_query = {
            "name": "openfda_get_warnings_by_drug_name",
            "arguments": {"drug_name": drug['name']}
        }
        safety = tooluni.run(safety_query)
        drug['safety_warnings'] = safety
    return drugs
Tool Composition Patterns
Sequential Workflows:
# Disease → Targets → Compounds → Prediction
workflow = [
    ("OpenTargets_get_associated_targets_by_disease_efoId", {"efoId": disease_id}),
    ("ChEMBL_search_compounds_by_target", {"target_id": target_result}),
    ("boltz2_docking", {"protein_id": target, "ligand_smiles": compound}),
    ("ADMETAI_predict_admet_properties", {"smiles": compound})
]
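One way to execute a sequential workflow is a small runner in which each step builds its arguments from the previous step's result. This is a sketch under assumptions: run_sequence is hypothetical, the return values of fake_run are invented, and real tools return richer structures that need their own extraction logic.

```python
def run_sequence(run, steps, initial):
    """Run steps in order; each step is (tool_name, build_arguments).

    build_arguments receives the previous result (or `initial` for the
    first step) and returns the arguments dict for the next call.
    """
    previous = initial
    history = []
    for tool_name, build_arguments in steps:
        query = {"name": tool_name, "arguments": build_arguments(previous)}
        previous = run(query)
        history.append((tool_name, previous))
    return previous, history


def fake_run(query):
    """Stand-in for tu.run with made-up return values (illustration only)."""
    if query["name"] == "OpenTargets_get_associated_targets_by_disease_efoId":
        return ["EGFR"]
    return {"received": query["arguments"]}


steps = [
    ("OpenTargets_get_associated_targets_by_disease_efoId",
     lambda disease_id: {"efoId": disease_id}),
    ("ChEMBL_get_compounds_by_target",
     lambda targets: {"target_symbol": targets[0]}),
]
final, history = run_sequence(fake_run, steps, initial="EFO_0000249")
print(final)
```

Passing argument builders instead of fixed dictionaries keeps the data dependency between steps explicit.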
Parallel Data Gathering:
# Multi-database literature search
parallel_searches = [
    ("PubTator_search_publications", {"query": research_topic}),
    ("EuropePMC_search_articles", {"query": research_topic}),
    ("SemanticScholar_search_papers", {"query": research_topic})
]
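Independent searches like these are I/O-bound, so one option is to issue them from a thread pool. The sketch below uses the standard-library ThreadPoolExecutor; run_parallel is a hypothetical helper, fake_run stands in for tu.run, and whether tu.run is safe to call from several threads is an assumption to verify before using this pattern.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(run, searches, max_workers=4):
    """Run (tool_name, arguments) pairs concurrently; results keyed by tool name."""
    queries = [{"name": name, "arguments": args} for name, args in searches]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so outputs line up with `searches`.
        outputs = list(pool.map(run, queries))
    return {name: output for (name, _), output in zip(searches, outputs)}


def fake_run(query):
    """Stand-in for tu.run (illustration only)."""
    return f"{query['name']} searched for {query['arguments']['query']!r}"


research_topic = "CRISPR gene therapy"
parallel_searches = [
    ("PubTator_search_publications", {"query": research_topic}),
    ("EuropePMC_search_articles", {"query": research_topic}),
    ("SemanticScholar_search_papers", {"query": research_topic}),
]
results = run_parallel(fake_run, parallel_searches)
print(len(results))
```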
Feedback Loops:
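# Iterative optimization (ml_model_prediction, chemical_database_search and
# select_best_analog are placeholders for your own functions)
satisfactory_result = False
while not satisfactory_result:
    prediction = ml_model_prediction(current_compound)
    if prediction.score < threshold:
        # Score too low: move to the best analog and try again
        analogs = chemical_database_search(current_compound)
        current_compound = select_best_analog(analogs)
    else:
        satisfactory_result = True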
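A loop of this kind never terminates if no analog ever reaches the threshold, so in practice it helps to cap the number of rounds. The following sketch shows one bounded variant; optimize and its callback parameters are hypothetical, and the toy callbacks use integers in place of compounds.

```python
def optimize(start, predict, find_analogs, threshold, max_iterations=10):
    """Replace the candidate with its best-scoring analog until it passes.

    Stops when the score reaches `threshold`, when no analogs are found,
    or after `max_iterations` rounds, whichever comes first.
    """
    current = start
    for _ in range(max_iterations):
        if predict(current) >= threshold:
            break
        analogs = find_analogs(current)
        if not analogs:
            break
        current = max(analogs, key=predict)
    return current, predict(current)


# Toy stand-ins: "compounds" are integers and the score is value / 10.
best, score = optimize(
    start=1,
    predict=lambda compound: compound / 10,
    find_analogs=lambda compound: [compound + 1, compound + 2],
    threshold=0.7,
)
print(best, score)
```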
Tool Performance Tips
Optimization Strategies
Use specific queries: More specific queries return faster
Limit results: Use the limit parameter to control result size
Cache results: Enable caching for repeated queries
Batch when possible: Some tools support batch operations
Rate Limiting
ToolUniverse automatically handles API rate limits, but for large batches you can also space out your own requests:
import time
# Add delays for large batch operations
for query in large_query_list:
    result = tu.run(query)
    time.sleep(0.1)  # Small delay between requests
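A fixed delay slows every request, including those that would have succeeded anyway. An alternative is to retry only failed calls with exponential backoff. The sketch below is one possible shape: run_with_backoff is a hypothetical helper, flaky_run stands in for tu.run, and catching every Exception is a simplification; real code would catch only the error that signals rate limiting.

```python
import random
import time


def run_with_backoff(run, query, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Retry run(query) with exponential backoff and jitter.

    Waits base_delay * 2**attempt plus up to 10% random jitter between
    attempts, and re-raises the last error once attempts are exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return run(query)
        except Exception:
            if attempt == max_attempts - 1:
                raise
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))


calls = {"count": 0}


def flaky_run(query):
    """Stand-in for tu.run that fails twice, then succeeds (illustration only)."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("simulated rate limit")
    return {"data": "ok"}


# Record the waits instead of sleeping so the example finishes instantly.
waits = []
result = run_with_backoff(flaky_run, {"name": "example"}, sleep=waits.append)
print(result, len(waits))
```

The injectable sleep parameter exists only to make the helper easy to test; by default it calls time.sleep.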
Error Handling
Always include error handling for robust applications:
try:
    result = tu.run(query)
    if result and 'data' in result:
        # Process successful result
        process_data(result['data'])
    else:
        print("No data returned")
except Exception as e:
    print(f"Query failed: {e}")
Performance Optimization
Category-Specific Considerations
ML Models:
- Remote execution reduces local resource requirements
- Batch predictions when possible
- Cache results for expensive computations
APIs:
- Respect rate limits and implement backoff
- Use pagination for large datasets
- Cache frequent queries
Databases:
- Use specific field queries instead of full searches
- Implement result limits for exploration
- Index frequently accessed data
Agents:
- Configure appropriate timeout values
- Use streaming for long-running tasks
- Implement progress monitoring
Best Practices
Tool Selection: Choose the right tool for your specific use case
Rate Limiting: Respect API rate limits to avoid blocking
Error Handling: Always handle potential API errors gracefully
Caching: Use caching for frequently accessed data
Batch Processing: Use batch operations when available for efficiency
Configuration: Configure tools appropriately for your environment
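As an illustration of the caching practice, the sketch below wraps a query function in a simple in-memory cache keyed by the serialized query. make_cached_run is a hypothetical helper and does not describe any caching built into ToolUniverse; it assumes query arguments are JSON-serializable.

```python
import json


def make_cached_run(run):
    """Wrap `run` so identical queries are answered from an in-memory cache."""
    cache = {}

    def cached_run(query):
        # sort_keys makes the key independent of dict insertion order.
        key = json.dumps(query, sort_keys=True)
        if key not in cache:
            cache[key] = run(query)
        return cache[key]

    cached_run.cache = cache
    return cached_run


call_log = []


def fake_run(query):
    """Stand-in for tu.run that records each real call (illustration only)."""
    call_log.append(query["name"])
    return {"data": query["arguments"]}


cached = make_cached_run(fake_run)
q1 = {"name": "UniProt_get_protein_info", "arguments": {"gene_symbol": "EGFR"}}
q2 = {"arguments": {"gene_symbol": "EGFR"}, "name": "UniProt_get_protein_info"}
first = cached(q1)
second = cached(q2)  # same query with keys in a different order
print(len(call_log))
```

This cache never expires entries, which suits short scripts; long-running services would also need an eviction or time-to-live policy.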
Tool Discovery & Selection
Finding the Right Tools
By Category:
# List tools by category
ml_tools = tu.list_tools_by_category("ML Models")
database_tools = tu.list_tools_by_category("Databases")
api_tools = tu.list_tools_by_category("APIs")
By Functionality:
# Semantic search across all categories
protein_tools = tu.run({
    "name": "find_tools",
    "arguments": {"query": "protein structure prediction", "limit": 10}
})
drug_tools = tu.run({
    "name": "find_tools",
    "arguments": {"query": "drug safety analysis", "limit": 10}
})
literature_tools = tu.run({
    "name": "find_tools",
    "arguments": {"query": "literature review automation", "limit": 10}
})
By Domain:
# Load domain-specific tools
tu.load_tools(tool_type=[
    "opentarget",  # Disease-target data
    "ChEMBL",      # Chemical data
    "uniprot",     # Protein data
    "pubtator"     # Literature with entities
])
API Authentication
# Environment-based API key management
import os
from tooluniverse import ToolUniverse
# Recommended: Use environment variables
api_keys = {
    'OPENTARGETS_API_KEY': os.getenv('OPENTARGETS_API_KEY'),
    'NCBI_API_KEY': os.getenv('NCBI_API_KEY'),
    'SEMANTIC_SCHOLAR_API_KEY': os.getenv('SEMANTIC_SCHOLAR_API_KEY')
}
# ToolUniverse automatically manages authentication
tu = ToolUniverse()
tu.configure_api_keys(api_keys)
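Note that os.getenv returns None for a variable that is not set, so a dictionary built this way can silently contain None values. A minimal sketch for separating found keys from missing ones before configuring anything; collect_api_keys is a hypothetical helper, and a plain dict stands in for the environment to keep the example reproducible.

```python
import os


def collect_api_keys(names, environ=os.environ):
    """Split the requested variable names into found keys and missing names."""
    found = {name: environ[name] for name in names if environ.get(name)}
    missing = [name for name in names if not environ.get(name)]
    return found, missing


names = ["OPENTARGETS_API_KEY", "NCBI_API_KEY", "SEMANTIC_SCHOLAR_API_KEY"]
# A plain dict stands in for os.environ so the output does not depend on the machine.
demo_environ = {"NCBI_API_KEY": "example-value"}
found, missing = collect_api_keys(names, environ=demo_environ)
print(found, missing)
```

Reporting the missing names up front is usually easier to debug than an authentication failure in the middle of a workflow.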
Future Extensions
Planned Categories:
- Visualization Tools: Interactive plotting and dashboard generation
- Workflow Engines: Advanced orchestration and scheduling
- Cloud Services: Distributed computing and storage
- Compliance Tools: Regulatory and ethics validation
Community Contributions:
- Tool submission guidelines
- Quality assurance processes
- Community voting and validation
- Maintenance and updates
Next Steps
Now that you know what tools are available:
Try Examples: Examples & Code Samples - See tools in action
Build Workflows: Scientific Workflows - Combine tools for research
Optimize: best_practices - Performance and production tips
Create Custom: ../tutorials/custom_tools - Build your own tools
Tip
Discovery tip: Use the AI-powered tool discovery features to find the right tools for your specific research questions!
Tip
Tool ecosystem synergy: The eight categories are designed to work together. APIs provide data access, ML models add intelligence, agents orchestrate complex workflows, while databases and embedding stores enable efficient information management.