UniRef Tools

Configuration File: uniref_tools.json
Tool Type: Local
Tools Count: 2

This page contains all tools defined in the uniref_tools.json configuration file.

Available Tools

UniRef_get_cluster (Type: UniRefTool)

UniRef_get_cluster tool specification

Tool Information:

  • Name: UniRef_get_cluster

  • Type: UniRefTool

  • Description: Get detailed information about a UniProt UniRef protein sequence cluster. UniRef clusters group related protein sequences at different identity thresholds: UniRef100 (identical sequences + sub-fragments), UniRef90 (90%+ identity), UniRef50 (50%+ identity). Returns cluster name, member count, representative member details (protein name, organism, accession, sequence length), common taxon, seed sequence, update date, and full protein sequence. Use this to understand protein sequence families and find representative sequences.

Parameters:

  • cluster_id (string) (required) UniRef cluster ID. Format: UniRefNN_ACCESSION, where NN is the identity level (100, 90, or 50). Examples: 'UniRef90_P04637' (p53 cluster at 90% identity, 160 members), 'UniRef50_P04637' (p53 at 50% identity), 'UniRef100_P04637' (sequences identical to p53), 'UniRef90_P00533' (EGFR at 90% identity).
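Because cluster_id follows a fixed UniRefNN_ACCESSION shape, it can be checked locally before a call is made. The following is a minimal sketch, not part of the tool itself: parse_cluster_id is a hypothetical helper, and the accession character set (letters, digits, hyphen) is an assumption rather than something this page specifies.

```python
import re

# Assumed shape: "UniRef" + identity level (100, 90, or 50) + "_" + accession.
# The accession character set is an assumption, not taken from the tool spec.
_CLUSTER_ID = re.compile(r"UniRef(100|90|50)_([A-Za-z0-9-]+)")


def parse_cluster_id(cluster_id: str) -> tuple[int, str]:
    """Split a cluster ID into (identity level, accession). Hypothetical helper."""
    match = _CLUSTER_ID.fullmatch(cluster_id)
    if match is None:
        raise ValueError(f"not a UniRefNN_ACCESSION cluster ID: {cluster_id!r}")
    return int(match.group(1)), match.group(2)


print(parse_cluster_id("UniRef90_P04637"))  # prints (90, 'P04637')
```

Rejecting a malformed ID up front avoids a round trip that would most likely fail anyway.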

Example Usage:

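from tooluniverse import ToolUniverse

# Assumes the standard ToolUniverse setup: create the registry and load tools once.
tu = ToolUniverse()
tu.load_tools()

query = {
    "name": "UniRef_get_cluster",
    "arguments": {
        "cluster_id": "UniRef90_P04637"  # p53 cluster at 90% identity
    }
}
result = tu.run(query)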

UniRef_search_clusters (Type: UniRefTool)

UniRef_search_clusters tool specification

Tool Information:

  • Name: UniRef_search_clusters

  • Type: UniRefTool

  • Description: Search UniProt UniRef protein sequence clusters by protein name, gene, organism, or keyword. UniRef clusters group related sequences to reduce redundancy: UniRef90 (default, 90% identity) clusters ~250 million sequences into ~150 million clusters. Useful for finding protein families, identifying sequence redundancy, and obtaining representative sequences for a protein of interest.

Parameters:

  • query (string) (required) Search query. Accepts protein names, gene symbols, and organism names. Examples: 'p53', 'insulin', 'kinase Homo sapiens', 'BRCA1', 'hemoglobin'.

  • cluster_type (string) (optional) Cluster identity level: 'UniRef100' (identical), 'UniRef90' (90% identity, default), 'UniRef50' (50% identity). A lower identity threshold yields fewer, larger clusters.

  • size (integer) (optional) Maximum number of results to return (default: 10, max: 25).
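The parameter constraints above (three allowed cluster types, a default size of 10 and a maximum of 25) can be enforced when the query dictionary is built. This is a sketch under stated assumptions: build_search_query is a hypothetical helper, not part of the tool, and the lower bound of 1 for size is assumed because this page only gives the maximum.

```python
VALID_CLUSTER_TYPES = ("UniRef100", "UniRef90", "UniRef50")
MAX_SIZE = 25  # documented maximum for the size parameter


def build_search_query(query: str, cluster_type: str = "UniRef90", size: int = 10) -> dict:
    """Build a UniRef_search_clusters call dictionary. Hypothetical helper."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if cluster_type not in VALID_CLUSTER_TYPES:
        raise ValueError(f"cluster_type must be one of {VALID_CLUSTER_TYPES}")
    # The lower bound of 1 is an assumption; only the maximum is documented.
    if not 1 <= size <= MAX_SIZE:
        raise ValueError(f"size must be between 1 and {MAX_SIZE}")
    return {
        "name": "UniRef_search_clusters",
        "arguments": {"query": query, "cluster_type": cluster_type, "size": size},
    }


search = build_search_query("kinase Homo sapiens", cluster_type="UniRef50", size=25)
print(search["arguments"])
```

The returned dictionary has the same shape as the query passed to tu.run in the usage example for this tool.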

Example Usage:

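# tu is an initialized ToolUniverse instance with tools already loaded.
query = {
    "name": "UniRef_search_clusters",
    "arguments": {
        "query": "p53",
        "cluster_type": "UniRef90",  # optional; this is the default
        "size": 10  # optional; default 10, max 25
    }
}
result = tu.run(query)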