Available Tools Reference

Complete reference of all ToolUniverse scientific tools and their capabilities.

ToolUniverse provides 600+ tools across eight major categories, each serving specific computational and analytical requirements in scientific research.

Tool Ecosystem Overview

ToolUniverse integrates tools across eight major categories:

ToolUniverse Ecosystem (600+ Tools):

┌─────────────────┐
│   ML Models     │ 15 tools  → Prediction, Classification, Generation
│     (AI/ML)     │
└─────────────────┘

┌─────────────────┐
│   AI Agents     │ 33 tools  → Autonomous Planning, Tool Routing
│   (Agentic)     │
└─────────────────┘

┌─────────────────┐
│   Software      │ 164 tools → Bioinformatics, Analysis Packages
│   Packages      │
└─────────────────┘

┌─────────────────┐
│ Human Expert    │ 6 tools   → Consultation, Validation, Feedback
│   Feedback      │
└─────────────────┘

┌─────────────────┐
│   Robotics      │ 1 tool    → ROS Communication, Lab Automation
│  (Automation)   │
└─────────────────┘

┌─────────────────┐
│   Databases     │ 84 tools  → Structured Data, Knowledge Bases
│   (Storage)     │
└─────────────────┘

┌─────────────────┐
│  Embedding      │ 4 tools   → Vector Search, Semantic Retrieval
│   Stores        │
└─────────────────┘

┌─────────────────┐
│      APIs       │ 281 tools → External Services, Data Access
│  (Integration)  │
└─────────────────┘

Tool Categories Summary

Tool Distribution by Category

Category            Count   Percentage   Primary Use Cases
-----------------   -----   ----------   -------------------------------------------
APIs                281     47.8%        External data access, real-time information
Software Packages   164     27.9%        Computational analysis, local processing
Databases           84      14.3%        Structured data storage and retrieval
AI Agents           33      5.6%         Autonomous reasoning and planning
ML Models           15      2.6%         Prediction and classification tasks
Expert Feedback     6       1.0%         Human validation and guidance
Embedding Stores    4       0.7%         Semantic search and similarity
Robotics            1       0.2%         Laboratory automation
Total               588     100%         Comprehensive scientific ecosystem

🧬 Molecular & Genetic Data

UniProt - Protein Information

Access comprehensive protein and gene information.

Key Functions:

  • UniProt_get_protein_info - Get detailed protein information by gene symbol
  • UniProt_search_proteins - Search proteins by keywords
  • UniProt_get_protein_sequence - Retrieve protein sequences

Example:

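from tooluniverse import ToolUniverse

tu = ToolUniverse()
tu.load_tools()  # load tool definitions once; later examples reuse `tu`
query = {
    "name": "UniProt_get_protein_info",
    "arguments": {"gene_symbol": "BRCA1"}
}
result = tu.run(query)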

Gene Ontology - Functional Annotation

Gene Ontology annotations and functional analysis.

Key Functions:

  • GeneOntology_get_annotations - Get GO annotations for genes
  • GeneOntology_search_terms - Search GO terms
  • GeneOntology_get_enrichment - Functional enrichment analysis

Example:

query = {
    "name": "GeneOntology_get_annotations",
    "arguments": {"gene_symbols": ["BRCA1", "BRCA2", "TP53"]}
}

Enrichr - Gene Set Analysis

Comprehensive gene set enrichment analysis.

Key Functions:

  • Enrichr_analyze_gene_list - Enrichment analysis for gene lists
  • Enrichr_get_libraries - List available gene set libraries
  • Enrichr_download_results - Download enrichment results

Example:

query = {
    "name": "Enrichr_analyze_gene_list",
    "arguments": {
        "genes": ["BRCA1", "BRCA2", "TP53", "ATM", "CHEK2"],
        "library": "KEGG_2021_Human"
    }
}

🎯 Disease & Target Data

OpenTargets Platform

Comprehensive disease-target association data.

Key Functions:

  • OpenTargets_get_associated_targets_by_disease_efoId - Disease-associated targets
  • OpenTargets_get_associated_diseases_by_target - Target-associated diseases
  • OpenTargets_get_disease_id_description_by_name - Disease lookup
  • OpenTargets_get_evidence - Evidence for associations
  • OpenTargets_get_drug_info - Drug information and mechanisms

Example:

# Get targets for Alzheimer's disease
query = {
    "name": "OpenTargets_get_associated_targets_by_disease_efoId",
    "arguments": {"efoId": "EFO_0000249"}
}

EFO - Experimental Factor Ontology

Disease and experimental factor ontology.

Key Functions:

  • EFO_search_diseases - Search diseases by name
  • EFO_get_disease_hierarchy - Get disease relationships
  • EFO_get_synonyms - Get disease synonyms

Example:

query = {
    "name": "EFO_search_diseases",
    "arguments": {"query": "diabetes"}
}

💊 Drug & Chemical Data

PubChem - Chemical Information

Comprehensive chemical compound database.

Key Functions:

  • PubChem_get_compound_info - Get compound information by name/ID
  • PubChem_search_compounds - Search compounds by structure/properties
  • PubChem_get_compound_properties - Molecular properties
  • PubChem_similarity_search - Chemical similarity search

Example:

query = {
    "name": "PubChem_get_compound_info",
    "arguments": {"compound_name": "aspirin"}
}

ChEMBL - Bioactivity Data

Chemical bioactivity and drug discovery data.

Key Functions:

  • ChEMBL_get_compound_targets - Get targets for compounds
  • ChEMBL_get_compounds_by_target - Get compounds targeting proteins
  • ChEMBL_get_bioactivity_data - Bioactivity measurements
  • ChEMBL_search_similar_compounds - Chemical similarity search

Example:

query = {
    "name": "ChEMBL_get_compounds_by_target",
    "arguments": {"target_symbol": "EGFR"}
}

🛡️ Drug Safety & Regulatory

OpenFDA - FDA Data

FDA drug labeling and adverse event data.

Key Functions:

  • FAERS_count_reactions_by_drug_event - Count adverse reactions by drug
  • openfda_get_warnings_by_drug_name - Get FDA warnings
  • OpenFDA_get_drug_labels - Drug labeling information
  • OpenFDA_search_recalls - Drug recall information

Example:

# Search adverse events
query = {
    "name": "FAERS_count_reactions_by_drug_event",
    "arguments": {"medicinalproduct": "warfarin"}
}

# Get FDA warnings
query = {
    "name": "openfda_get_warnings_by_drug_name",
    "arguments": {"medicinalproduct": "warfarin"}
}

DailyMed - Drug Labeling

Official FDA drug labeling information.

Key Functions:

  • DailyMed_get_drug_label - Get official drug labels
  • DailyMed_search_drugs - Search drugs by name
  • DailyMed_get_NDC_info - NDC (drug code) information

Example:

query = {
    "name": "DailyMed_get_drug_label",
    "arguments": {"medicinalproduct": "metformin"}
}

🧪 Clinical Research

ClinicalTrials.gov

Clinical trial registry and results database.

Key Functions:

  • ClinicalTrials_search_studies - Search clinical trials
  • ClinicalTrials_get_study_details - Get detailed study information
  • ClinicalTrials_get_trial_results - Get trial results
  • ClinicalTrials_search_by_condition - Find trials by medical condition

Example:

query = {
    "name": "ClinicalTrials_search_studies",
    "arguments": {
        "condition": "breast cancer",
        "intervention": "immunotherapy"
    }
}

📚 Literature & Publications

PubTator - Biomedical Literature

PubMed literature with named entity recognition.

Key Functions:

  • PubTator_search_publications - Search literature with entities
  • PubTator_get_annotations - Get entity annotations
  • PubTator_search_by_entity - Search by specific entities

Example:

query = {
    "name": "PubTator_search_publications",
    "arguments": {
        "query": "@GENE_BRCA1 @DISEASE_cancer"
    }
}

Europe PMC

European literature database with full-text access.

Key Functions:

  • EuropePMC_search_articles - Search articles and abstracts
  • EuropePMC_get_full_text - Get full text when available
  • EuropePMC_get_citations - Get citation data

Example:

query = {
    "name": "EuropePMC_search_articles",
    "arguments": {"query": "CRISPR gene therapy"}
}

Semantic Scholar

AI-powered academic search engine.

Key Functions:

  • SemanticScholar_search_papers - Search academic papers
  • SemanticScholar_get_paper_details - Get detailed paper information
  • SemanticScholar_get_citations - Citation network analysis

Example:

query = {
    "name": "SemanticScholar_search_papers",
    "arguments": {"query": "machine learning drug discovery"}
}

OpenAlex

Open academic publication database.

Key Functions:

  • OpenAlex_search_works - Search academic works
  • OpenAlex_get_author_info - Author information and metrics
  • OpenAlex_get_institution_data - Institution research data

📊 Specialized Databases

Human Protein Atlas

Tissue and cell expression data.

Key Functions:

  • HPA_get_tissue_expression - Tissue expression patterns
  • HPA_get_cell_expression - Single-cell expression data
  • HPA_get_protein_localization - Subcellular localization

Example:

query = {
    "name": "HPA_get_tissue_expression",
    "arguments": {"gene_symbol": "BRCA1"}
}

Reactome Pathways

Biological pathway database.

Key Functions:

  • Reactome_get_pathways_by_gene - Pathways for genes
  • Reactome_search_pathways - Search pathway database
  • Reactome_get_pathway_details - Detailed pathway information

Example:

query = {
    "name": "Reactome_get_pathways_by_gene",
    "arguments": {"gene_symbol": "TP53"}
}

HumanBase

Tissue-specific gene networks.

Key Functions:

  • HumanBase_get_gene_networks - Tissue-specific networks
  • HumanBase_predict_gene_function - Gene function prediction
  • HumanBase_get_tissue_expression - Tissue expression patterns

MedlinePlus

Consumer health information.

Key Functions:

  • MedlinePlus_get_health_topics - Health topic information
  • MedlinePlus_search_conditions - Search medical conditions
  • MedlinePlus_get_drug_info - Consumer drug information

🤖 AI-Powered Tools

Machine Learning Models (15 tools)

Apply machine learning algorithms for prediction, classification, and generation tasks.

Core ML Tools:

boltz2_docking - Protein-ligand binding prediction

{
    "name": "boltz2_docking",
    "arguments": {
        "protein_structure": "1ABC",
        "ligand_smiles": "CCO"
    }
}
# Returns: binding_affinity, binding_probability, confidence_score

ADMET_predict_CYP_interactions - Drug metabolism prediction

{
    "name": "ADMET_predict_CYP_interactions",
    "arguments": {
        "smiles": "CC(=O)OC1=CC=CC=C1C(=O)O",  # Aspirin
        "cyp_enzymes": ["CYP3A4", "CYP2D6"]
    }
}
# Returns: interaction_probabilities, metabolic_stability

run_TxAgent_biomedical_reasoning - Therapeutic reasoning

{
    "name": "run_TxAgent_biomedical_reasoning",
    "arguments": {
        "query": "What are the therapeutic targets for Alzheimer's disease?",
        "context": "precision_medicine"
    }
}
# Returns: therapeutic_insights, target_recommendations

AI Agents (33 tools)

Autonomous tools that perceive environments, make decisions, and take actions toward research goals.

Literature & Analysis Agents:

HypothesisGenerator - Generate research hypotheses

{
    "name": "HypothesisGenerator",
    "arguments": {
        "research_area": "cancer immunotherapy",
        "constraints": ["FDA-approved targets", "known biomarkers"],
        "num_hypotheses": 5
    }
}
# Returns: ranked_hypotheses, supporting_evidence, testable_predictions

ExperimentalDesignScorer - Evaluate experimental designs

{
    "name": "ExperimentalDesignScorer",
    "arguments": {
        "experiment_description": "Phase II trial for EGFR inhibitor",
        "evaluation_criteria": ["feasibility", "statistical_power", "ethics"]
    }
}
# Returns: design_score, improvement_suggestions, risk_assessment

MedicalLiteratureReviewer - Comprehensive literature analysis

{
    "name": "MedicalLiteratureReviewer",
    "arguments": {
        "topic": "CAR-T cell therapy safety profile",
        "databases": ["PubMed", "ClinicalTrials.gov"],
        "time_range": "2020-2024"
    }
}
# Returns: comprehensive_review, key_findings, research_gaps

Tool Discovery & Composition

AI tools for discovering and combining other tools.

Key Functions:

  • discover_tools_by_description - Find tools by natural language
  • compose_tools_for_workflow - Create tool workflows
  • optimize_tool_descriptions - Improve tool descriptions

Example:

query = {
    "name": "discover_tools_by_description",
    "arguments": {
        "description": "I need to find genes associated with heart disease"
    }
}

🔍 Search & Integration Tools

Tool Finder

Find appropriate tools for your research needs.

Key Functions:

  • find_tools_by_keyword - Keyword-based tool search
  • find_tools_by_category - Browse tools by category
  • get_tool_recommendations - Get tool recommendations

Example:

query = {
    "name": "find_tools_by_keyword",
    "arguments": {"keywords": ["drug", "safety", "adverse"]}
}

Embedding Stores (4 tools)

Store and retrieve vectorized representations of scientific data for semantic search.

Core Embedding Tools:

embedding_tool_finder - Semantic tool discovery

{
    "name": "embedding_tool_finder",
    "arguments": {
        "query": "predict protein folding dynamics",
        "top_k": 10,
        "similarity_threshold": 0.7
    }
}
# Returns: relevant_tools, similarity_scores, tool_descriptions

embedding_database_search - Vector similarity search

{
    "name": "embedding_database_search",
    "arguments": {
        "query_vector": embedding_vector,
        "database": "pubmed_abstracts",
        "top_k": 50
    }
}
# Returns: similar_documents, relevance_scores, metadata

Data Integration

Tools for combining data from multiple sources.

Key Functions:

  • integrate_gene_data - Combine gene data from multiple sources
  • cross_reference_identifiers - Map between different ID systems
  • validate_data_consistency - Check data consistency

🛠️ Tool Usage Patterns

Single Tool Queries

Simple, focused queries for specific information:

# Get protein info
protein_query = {
    "name": "UniProt_get_protein_info",
    "arguments": {"gene_symbol": "EGFR"}
}

# Search adverse events
safety_query = {
    "name": "FAERS_count_reactions_by_drug_event",
    "arguments": {"medicinalproduct": "metformin"}
}

Multi-Tool Workflows

Combine multiple tools for comprehensive analysis:

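# Step 1: Get disease info
disease_query = {
    "name": "OpenTargets_get_disease_id_description_by_name",
    "arguments": {"diseaseName": "diabetes"}
}
disease_result = tu.run(disease_query)

# Step 2: Get associated targets
# disease_id is the EFO ID taken from disease_result
targets_query = {
    "name": "OpenTargets_get_associated_targets_by_disease_efoId",
    "arguments": {"efoId": disease_id}
}
targets_result = tu.run(targets_query)

# Step 3: Analyze target pathways
# target_list holds the gene symbols taken from targets_result
pathway_query = {
    "name": "Enrichr_analyze_gene_list",
    "arguments": {
        "genes": target_list,
        "library": "KEGG_2021_Human"
    }
}
pathway_result = tu.run(pathway_query)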

Batch Processing

Process multiple related queries efficiently:

# Process multiple genes
genes = ["BRCA1", "BRCA2", "TP53", "ATM"]

results = {}
for gene in genes:
    query = {
        "name": "UniProt_get_protein_info",
        "arguments": {"gene_symbol": gene}
    }
    results[gene] = tu.run(query)
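
A loop like the one above stops at the first query that raises. The following is a minimal sketch of a more tolerant variant. `run_batch` is a hypothetical helper, not part of ToolUniverse; it takes the runner as a parameter (in practice `tu.run`) so the example can be exercised with a stand-in function and no network access.

```python
from typing import Any, Callable, Dict, Iterable, Tuple

Query = Dict[str, Any]


def run_batch(
    run: Callable[[Query], Any],
    tool_name: str,
    arg_name: str,
    values: Iterable[str],
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Run one tool once per value, collecting results and errors separately."""
    results: Dict[str, Any] = {}
    errors: Dict[str, str] = {}
    for value in values:
        query = {"name": tool_name, "arguments": {arg_name: value}}
        try:
            results[value] = run(query)
        except Exception as exc:  # keep going and record what failed
            errors[value] = str(exc)
    return results, errors


# Stand-in for tu.run (hypothetical behaviour, for illustration only).
def fake_run(query: Query) -> Any:
    gene = query["arguments"]["gene_symbol"]
    if gene == "BAD":
        raise ValueError("unknown gene")
    return {"gene": gene}


results, errors = run_batch(
    fake_run, "UniProt_get_protein_info", "gene_symbol", ["BRCA1", "BAD", "TP53"]
)
```

Keeping failures in a separate dictionary makes it easy to retry only the values that did not succeed.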

Integration Patterns

Multi-Tool Workflows

Combine multiple tools for comprehensive analysis:

from tooluniverse import ToolUniverse

# Drug discovery workflow
def drug_discovery_pipeline(disease_name):
    tooluni = ToolUniverse()
    tooluni.load_tools()

    # 1. Find disease ID
    disease_query = {
        "name": "OpenTargets_get_disease_id_description_by_name",
        "arguments": {"diseaseName": disease_name}
    }
    disease_info = tooluni.run(disease_query)

    # 2. Get associated targets
    targets_query = {
        "name": "OpenTargets_get_associated_targets_by_disease_efoId",
        "arguments": {"efoId": disease_info['id']}
    }
    targets = tooluni.run(targets_query)

    # 3. Find drugs for each target
    drugs = []
    for target in targets[:5]:  # Top 5 targets
        drugs_query = {
            "name": "opentarget_get_associated_drugs_by_target_ensemblID",
            "arguments": {
                "target_ensembl_id": target['id'],
                "size": 10,
                "cursor": ""
            }
        }
        target_drugs = tooluni.run(drugs_query)
        drugs.extend(target_drugs)

    # 4. Check safety profiles
    for drug in drugs[:10]:  # Top 10 drugs
        safety_query = {
            "name": "openfda_get_warnings_by_drug_name",
            "arguments": {"drug_name": drug['name']}
        }
        safety = tooluni.run(safety_query)
        drug['safety_warnings'] = safety

    return drugs

Tool Composition Patterns

Sequential Workflows:

# Disease → Targets → Compounds → Prediction
workflow = [
    ("OpenTargets_get_associated_targets_by_disease_efoId", {"efoId": disease_id}),
    ("ChEMBL_search_compounds_by_target", {"target_id": target_result}),
    ("boltz2_docking", {"protein_id": target, "ligand_smiles": compound}),
    ("ADMETAI_predict_admet_properties", {"smiles": compound})
]
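
A step list like this still needs a small driver that passes each result to the next step. The sketch below is one way to write it, under stated assumptions: `run_sequence` is a hypothetical helper, each step supplies a function that builds its arguments from the previous result, and the runner (in practice `tu.run`) is passed in so a stand-in can be used here. The step names in the demo are placeholders, not real tools.

```python
from typing import Any, Callable, Dict, List, Tuple

Query = Dict[str, Any]
# Each step: (tool name, function that builds arguments from the previous result)
Step = Tuple[str, Callable[[Any], Dict[str, Any]]]


def run_sequence(run: Callable[[Query], Any], steps: List[Step], initial: Any) -> List[Any]:
    """Run steps in order, feeding each result into the next argument builder."""
    outputs: List[Any] = []
    previous = initial
    for tool_name, build_args in steps:
        previous = run({"name": tool_name, "arguments": build_args(previous)})
        outputs.append(previous)
    return outputs


# Stand-in runner: echoes the tool name and arguments it received.
def fake_run(query: Query) -> Dict[str, Any]:
    return {"tool": query["name"], "args": query["arguments"]}


steps: List[Step] = [
    ("step_one", lambda disease_id: {"efoId": disease_id}),
    ("step_two", lambda prev: {"source": prev["tool"]}),
]
outputs = run_sequence(fake_run, steps, "EFO_0000249")
```

Because the argument builders are ordinary functions, any result-parsing logic specific to a tool stays next to the step that needs it.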

Parallel Data Gathering:

# Multi-database literature search
parallel_searches = [
    ("PubTator_search_publications", {"query": research_topic}),
    ("EuropePMC_search_articles", {"query": research_topic}),
    ("SemanticScholar_search_papers", {"query": research_topic})
]
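
Since these searches do not depend on each other, they can run concurrently. The sketch below uses the standard-library thread pool. `run_parallel` is a hypothetical helper, and it assumes the runner passed in (in practice `tu.run`) is safe to call from several threads, which this page does not confirm.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List, Tuple

Query = Dict[str, Any]


def run_parallel(
    run: Callable[[Query], Any],
    searches: List[Tuple[str, Dict[str, Any]]],
    max_workers: int = 4,
) -> Dict[str, Any]:
    """Run independent (tool_name, arguments) pairs concurrently, keyed by tool name."""
    queries = [{"name": name, "arguments": args} for name, args in searches]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with searches
        results = list(pool.map(run, queries))
    return {name: result for (name, _), result in zip(searches, results)}


# Stand-in for tu.run (hypothetical behaviour, for illustration only).
def fake_run(query: Query) -> str:
    return f"{query['name']}:{query['arguments']['query']}"


research_topic = "CRISPR gene therapy"
found = run_parallel(fake_run, [
    ("PubTator_search_publications", {"query": research_topic}),
    ("EuropePMC_search_articles", {"query": research_topic}),
    ("SemanticScholar_search_papers", {"query": research_topic}),
])
```

Threads suit this case because the work is dominated by waiting on remote services rather than local computation.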

Feedback Loops:

# Iterative optimization
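# ml_model_prediction, chemical_database_search, and select_best_analog
# are placeholders for your own tool calls.
for _ in range(max_rounds):  # bound the loop so it always terminates
    prediction = ml_model_prediction(current_compound)
    if prediction.score >= threshold:
        break  # result is good enough
    analogs = chemical_database_search(current_compound)
    current_compound = select_best_analog(analogs)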

📈 Tool Performance Tips

Optimization Strategies

  1. Use specific queries: More specific queries return faster

  2. Limit results: Use limit parameter to control result size

  3. Cache results: Enable caching for repeated queries

  4. Batch when possible: Some tools support batch operations
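
The caching advice can be illustrated with a small in-memory wrapper. This is a sketch that does not use ToolUniverse's own caching option, which this page does not describe; `cached` is a hypothetical helper that wraps any runner (in practice `tu.run`) and keys results on the serialized query.

```python
import json
from typing import Any, Callable, Dict, List

Query = Dict[str, Any]


def cached(run: Callable[[Query], Any]) -> Callable[[Query], Any]:
    """Wrap a runner so identical queries are executed only once."""
    store: Dict[str, Any] = {}

    def wrapper(query: Query) -> Any:
        # sort_keys makes the key independent of dict insertion order
        key = json.dumps(query, sort_keys=True)
        if key not in store:
            store[key] = run(query)
        return store[key]

    return wrapper


calls: List[str] = []


# Stand-in for tu.run that records how often it is invoked.
def fake_run(query: Query) -> str:
    calls.append(query["name"])
    return "ok"


run_cached = cached(fake_run)
query = {"name": "UniProt_get_protein_info", "arguments": {"gene_symbol": "EGFR"}}
first = run_cached(query)
second = run_cached(query)  # served from the cache; fake_run is not called again
```

The cache lives only for the lifetime of the process and never expires entries, so it fits exploratory sessions better than long-running services.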

Rate Limiting

ToolUniverse automatically handles API rate limits, but for large batch operations a short delay between requests reduces the chance of being throttled:

import time

# Add delays for large batch operations
for query in large_query_list:
    result = tu.run(query)
    time.sleep(0.1)  # Small delay between requests

Error Handling

Always include error handling for robust applications:

try:
    result = tu.run(query)
    if result and 'data' in result:
        # Process successful result
        process_data(result['data'])
    else:
        print("No data returned")
except Exception as e:
    print(f"Query failed: {e}")
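
For transient failures such as timeouts, retrying with a growing delay is often more useful than reporting the first error. The sketch below shows exponential backoff. `run_with_retry` is a hypothetical helper, the retry count and delays are illustrative values, and the runner (in practice `tu.run`) and the sleep function are passed in so the example runs instantly with stand-ins.

```python
import time
from typing import Any, Callable, Dict, List

Query = Dict[str, Any]


def run_with_retry(
    run: Callable[[Query], Any],
    query: Query,
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Retry a failing query, doubling the wait after each failed attempt."""
    for attempt in range(attempts):
        try:
            return run(query)
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))


# Stand-in runner that fails twice, then succeeds.
state = {"calls": 0}


def flaky_run(query: Query) -> str:
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


delays: List[float] = []  # records requested waits instead of sleeping
result = run_with_retry(flaky_run, {"name": "demo", "arguments": {}}, sleep=delays.append)
```

In real use, catching a narrower exception type than `Exception` avoids retrying errors that will never succeed, such as a misspelled tool name.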

Performance Optimization

Category-Specific Considerations

ML Models:

  • Remote execution reduces local resource requirements
  • Batch predictions when possible
  • Cache results for expensive computations

APIs:

  • Respect rate limits and implement backoff
  • Use pagination for large datasets
  • Cache frequent queries

Databases:

  • Use specific field queries instead of full searches
  • Implement result limits for exploration
  • Index frequently accessed data

Agents:

  • Configure appropriate timeout values
  • Use streaming for long-running tasks
  • Implement progress monitoring

Best Practices

  1. Tool Selection: Choose the right tool for your specific use case

  2. Rate Limiting: Respect API rate limits to avoid blocking

  3. Error Handling: Always handle potential API errors gracefully

  4. Caching: Use caching for frequently accessed data

  5. Batch Processing: Use batch operations when available for efficiency

  6. Configuration: Configure tools appropriately for your environment

Tool Discovery & Selection

Finding the Right Tools

By Category:

# List tools by category
ml_tools = tu.list_tools_by_category("ML Models")
database_tools = tu.list_tools_by_category("Databases")
api_tools = tu.list_tools_by_category("APIs")

By Functionality:

# Semantic search across all categories
protein_tools = tu.run({
    "name": "find_tools",
    "arguments": {"query": "protein structure prediction", "limit": 10}
})
drug_tools = tu.run({
    "name": "find_tools",
    "arguments": {"query": "drug safety analysis", "limit": 10}
})
literature_tools = tu.run({
    "name": "find_tools",
    "arguments": {"query": "literature review automation", "limit": 10}
})

By Domain:

# Load domain-specific tools
tu.load_tools(tool_type=[
    "opentarget",    # Disease-target data
    "ChEMBL",        # Chemical data
    "uniprot",       # Protein data
    "pubtator"       # Literature with entities
])

API Authentication

# Environment-based API key management
import os

from tooluniverse import ToolUniverse

# Recommended: Use environment variables
api_keys = {
    'OPENTARGETS_API_KEY': os.getenv('OPENTARGETS_API_KEY'),
    'NCBI_API_KEY': os.getenv('NCBI_API_KEY'),
    'SEMANTIC_SCHOLAR_API_KEY': os.getenv('SEMANTIC_SCHOLAR_API_KEY')
}

# ToolUniverse automatically manages authentication
tu = ToolUniverse()
tu.configure_api_keys(api_keys)
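
`os.getenv` returns `None` for a variable that is not set, so it can help to check which keys are actually present before handing them over. The sketch below is one way to do that. `collect_api_keys` is a hypothetical helper, not part of ToolUniverse, and the demo uses an explicit mapping in place of the real environment.

```python
import os
from typing import Dict, List, Mapping, Tuple


def collect_api_keys(
    names: List[str], env: Mapping[str, str] = os.environ
) -> Tuple[Dict[str, str], List[str]]:
    """Split the requested variable names into found keys and missing names."""
    found = {name: env[name] for name in names if env.get(name)}
    missing = [name for name in names if name not in found]
    return found, missing


# Demo with an explicit mapping instead of the real environment.
demo_env = {"NCBI_API_KEY": "abc123"}
found, missing = collect_api_keys(
    ["OPENTARGETS_API_KEY", "NCBI_API_KEY", "SEMANTIC_SCHOLAR_API_KEY"], demo_env
)
```

Reporting the missing names up front gives a clearer message than an authentication failure later in a workflow.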

Future Extensions

Planned Categories:

  • Visualization Tools: Interactive plotting and dashboard generation
  • Workflow Engines: Advanced orchestration and scheduling
  • Cloud Services: Distributed computing and storage
  • Compliance Tools: Regulatory and ethics validation

Community Contributions:

  • Tool submission guidelines
  • Quality assurance processes
  • Community voting and validation
  • Maintenance and updates

🎯 Next Steps

Now that you know what tools are available:

  • 🚀 Try Examples: Examples & Code Samples - See tools in action

  • 🔬 Build Workflows: Scientific Workflows - Combine tools for research

  • ⚡ Optimize: best_practices - Performance and production tips

  • 🛠️ Create Custom: ../tutorials/custom_tools - Build your own tools

Tip

Discovery tip: Use the AI-powered tool discovery features to find the right tools for your specific research questions!

Tip

Tool ecosystem synergy: The eight categories are designed to work together. APIs provide data access, ML models add intelligence, agents orchestrate complex workflows, while databases and embedding stores enable efficient information management.