PDC Tools

Configuration File: pdc_tools.json
Tool Type: Local
Tools Count: 6

This page contains all tools defined in the pdc_tools.json configuration file.

Available Tools

PDC_get_clinical_data (Type: PDCTool)

Tool Information:

  • Name: PDC_get_clinical_data

  • Type: PDCTool

  • Description: Get clinical metadata for patient cases in a cancer proteomics study from the NCI Proteomics Data Commons (PDC). Given a PDC study accession number, returns paginated clinical data including case IDs, disease type, primary site, and demographic information (gender, ethnicity, race) for each patient. Supports pagination with offset and limit parameters. Use this to understand the patient population in a proteomics study for downstream analysis or to link proteomics data with clinical characteristics.

Parameters:

  • operation (string) (required) Operation type

  • pdc_study_id (string) (required) PDC study accession number (e.g., ‘PDC000127’)

  • offset (integer) (optional) Pagination offset (default: 0)

  • limit (integer) (optional) Number of records to return (default: 20, max varies by study)

Example Usage:

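from tooluniverse import ToolUniverse

tu = ToolUniverse()
tu.load_tools()  # load tool definitions before calling run()

query = {
    "name": "PDC_get_clinical_data",
    "arguments": {
        "operation": "example_value",  # placeholder; valid values are not listed on this page
        "pdc_study_id": "PDC000127"
    }
}
result = tu.run(query)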
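The offset and limit parameters described above can be combined into a simple paging loop. The following is a minimal sketch, not part of the tool itself: `iter_clinical_pages`, `run`, and `extract_records` are hypothetical names, and because this page does not document the response layout, the caller supplies `extract_records` to pull the list of records out of each response. The loop advances by the number of records actually received, so it still works if a study caps the page size below the requested limit.

```python
from typing import Any, Callable, Dict, Iterator, List


def iter_clinical_pages(
    run: Callable[[Dict[str, Any]], Any],
    extract_records: Callable[[Any], List[Any]],
    operation: str,
    pdc_study_id: str,
    page_size: int = 20,
) -> Iterator[List[Any]]:
    """Yield pages of clinical records until the service returns an empty page.

    `run` is any callable that executes a query dict (for example `tu.run`).
    `extract_records` is caller-supplied because the response layout is an
    assumption here, not something documented on this page.
    """
    offset = 0
    while True:
        query = {
            "name": "PDC_get_clinical_data",
            "arguments": {
                "operation": operation,  # placeholder value supplied by the caller
                "pdc_study_id": pdc_study_id,
                "offset": offset,
                "limit": page_size,
            },
        }
        records = extract_records(run(query))
        if not records:
            break  # an empty page means there is nothing left to fetch
        yield records
        # Advance by what was actually returned, in case the limit was capped.
        offset += len(records)
```

With a loaded ToolUniverse instance, this would be called as `iter_clinical_pages(tu.run, extract, "example_value", "PDC000127")`, where `extract` is a small function written against the response layout you actually observe.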

PDC_get_gene_protein (Type: PDCTool)

Tool Information:

  • Name: PDC_get_gene_protein

  • Type: PDCTool

  • Description: Get protein information and proteomics study coverage for a gene from the NCI Proteomics Data Commons (PDC). Given a gene symbol (e.g., TP53, EGFR, MYC), returns the gene description, NCBI gene ID, HGNC authority, associated protein accessions (UniProt, RefSeq, Ensembl), and spectral count data across all PDC studies where the protein was detected. Spectral counts indicate protein abundance evidence in each study. Useful for understanding which cancer proteomics datasets have measured a specific protein.

Parameters:

  • operation (string) (required) Operation type

  • gene_name (string) (required) Gene symbol to look up (e.g., ‘TP53’, ‘EGFR’, ‘BRCA1’, ‘MYC’)

Example Usage:

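# Assumed setup: `tu` is a ToolUniverse instance with tools already loaded.
query = {
    "name": "PDC_get_gene_protein",
    "arguments": {
        "operation": "example_value",  # placeholder; valid values are not listed on this page
        "gene_name": "TP53"
    }
}
result = tu.run(query)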

PDC_get_quant_data_matrix (Type: PDCTool)

Tool Information:

  • Name: PDC_get_quant_data_matrix

  • Type: PDCTool

  • Description: Get the quantitative protein abundance matrix (gene x aliquot/case) for a CPTAC/PDC study - the core quantitative CPTAC output (log2 ratios / abundance values), NOT spectral counts. PDC’s other tools return study metadata, programs, clinical data, and per-gene spectral counts (presence evidence) only. Given a PDC study ID (e.g. ‘PDC000127’ for CPTAC CCRCC) and a data_type (e.g. ‘log2_ratio’), returns the column header of aliquot identifiers and per-gene quantitative value rows. The full matrix can be thousands of genes x hundreds of aliquots; gene rows are truncated to max_genes (default 50) while the aliquot header is always returned in full. Find study IDs with PDC_search_studies or PDC_get_study_summary.

Parameters:

  • operation (string) (required) Operation type

  • pdc_study_id (string) (required) PDC study identifier (e.g. ‘PDC000127’ for CPTAC CCRCC Discovery Study - Proteome).

  • data_type (string) (optional) Quantitation data type. Common values: ‘log2_ratio’ (default), ‘unshared_log2_ratio’, ‘precursor_area’, ‘unshared_precursor_area’.

  • max_genes (integer) (optional) Maximum number of gene rows to return (the aliquot column header is always returned in full). Default 50. Use a larger value to retrieve more of the matrix.

Example Usage:

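# Assumed setup: `tu` is a ToolUniverse instance with tools already loaded.
query = {
    "name": "PDC_get_quant_data_matrix",
    "arguments": {
        "operation": "example_value",  # placeholder; valid values are not listed on this page
        "pdc_study_id": "PDC000127"
    }
}
result = tu.run(query)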
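Since gene rows are truncated to max_genes by default, a request for a larger slice of the matrix needs the optional parameters set explicitly. The sketch below shows one way to build such a query; `build_quant_matrix_query` and `COMMON_DATA_TYPES` are hypothetical names introduced for illustration. The data types listed are only the common values named in the parameter list, so an unlisted value triggers a warning rather than an error.

```python
import warnings
from typing import Any, Dict

# Common values named in the parameter list above; the service may accept others.
COMMON_DATA_TYPES = (
    "log2_ratio",
    "unshared_log2_ratio",
    "precursor_area",
    "unshared_precursor_area",
)


def build_quant_matrix_query(
    operation: str,
    pdc_study_id: str,
    data_type: str = "log2_ratio",
    max_genes: int = 50,
) -> Dict[str, Any]:
    """Build a PDC_get_quant_data_matrix query with the optional parameters set."""
    if max_genes < 1:
        raise ValueError("max_genes must be a positive integer")
    if data_type not in COMMON_DATA_TYPES:
        # Only warn: the documented list is described as common values, not a closed set.
        warnings.warn(f"data_type {data_type!r} is not one of the common values")
    return {
        "name": "PDC_get_quant_data_matrix",
        "arguments": {
            "operation": operation,  # placeholder value supplied by the caller
            "pdc_study_id": pdc_study_id,
            "data_type": data_type,
            "max_genes": max_genes,
        },
    }
```

For example, `build_quant_matrix_query("example_value", "PDC000127", max_genes=500)` produces a query that asks for up to 500 gene rows; the aliquot header is returned in full regardless.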

PDC_get_study_summary (Type: PDCTool)

Tool Information:

  • Name: PDC_get_study_summary

  • Type: PDCTool

  • Description: Get detailed metadata for a specific cancer proteomics study from the NCI Proteomics Data Commons (PDC). Given a PDC study accession number (e.g., PDC000127), returns comprehensive study information including disease type, primary site, analytical fraction (Proteome, Phosphoproteome, Acetylome, Glycoproteome), experiment type (TMT10, TMT11, iTRAQ, LFQ), sample counts, program/project names, embargo status, and file counts by data category. Useful for understanding study scope, data availability, and experimental design before downloading data.

Parameters:

  • operation (string) (required) Operation type

  • pdc_study_id (string) (required) PDC study accession number (e.g., ‘PDC000127’, ‘PDC000120’, ‘PDC000173’)

Example Usage:

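# Assumed setup: `tu` is a ToolUniverse instance with tools already loaded.
query = {
    "name": "PDC_get_study_summary",
    "arguments": {
        "operation": "example_value",  # placeholder; valid values are not listed on this page
        "pdc_study_id": "PDC000127"
    }
}
result = tu.run(query)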

PDC_list_programs (Type: PDCTool)

Tool Information:

  • Name: PDC_list_programs

  • Type: PDCTool

  • Description: List all programs and projects in the NCI Proteomics Data Commons (PDC). Returns the complete hierarchy of programs (CPTAC, ICPC, APOLLO, CBTN, Georgetown, Broad Institute, etc.) and their constituent projects. Each program represents a major cancer proteomics initiative, and projects within a program correspond to specific cancer studies or sub-studies. Use program/project IDs to explore associated studies.

Parameters:

  • operation (string) (required) Operation type

Example Usage:

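# Assumed setup: `tu` is a ToolUniverse instance with tools already loaded.
query = {
    "name": "PDC_list_programs",
    "arguments": {
        "operation": "example_value"  # placeholder; valid values are not listed on this page
    }
}
result = tu.run(query)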

PDC_search_studies (Type: PDCTool)

Tool Information:

  • Name: PDC_search_studies

  • Type: PDCTool

  • Description: Search the NCI Proteomics Data Commons (PDC) for cancer proteomics studies by keyword. PDC houses annotated data from CPTAC, ICPC, APOLLO, and other cancer proteomics programs covering 19+ cancer types with 160+ datasets. Search by disease name (e.g., ‘Breast’, ‘Lung’, ‘CCRCC’), program name (e.g., ‘CPTAC’, ‘HTAN’), or analytical fraction (e.g., ‘Proteome’, ‘Phosphoproteome’). Returns matching study IDs, names, and PDC accession numbers. Use PDC study IDs with PDC_get_study_summary for detailed metadata.

Parameters:

  • operation (string) (required) Operation type

  • query (string) (required) Search keyword for studies. Can be disease name (Breast, Lung, Renal), program name (CPTAC, HTAN), or fraction type (Proteome, Phosphoproteome, Acetylome).

Example Usage:

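# Assumed setup: `tu` is a ToolUniverse instance with tools already loaded.
query = {
    "name": "PDC_search_studies",
    "arguments": {
        "operation": "example_value",  # placeholder; valid values are not listed on this page
        "query": "Breast"
    }
}
result = tu.run(query)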
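The description above suggests passing the study IDs found by a search to PDC_get_study_summary. A minimal sketch of that second step follows; `build_summary_queries` is a hypothetical helper, and it starts from a plain list of accession strings because the layout of the search response is not documented on this page. Duplicates are dropped, since a study can plausibly match a search more than once.

```python
from typing import Any, Dict, Iterable, List


def build_summary_queries(operation: str, study_ids: Iterable[str]) -> List[Dict[str, Any]]:
    """Build one PDC_get_study_summary query per unique study ID, keeping input order."""
    seen = set()
    queries = []
    for raw_id in study_ids:
        study_id = raw_id.strip()
        if not study_id or study_id in seen:
            continue  # skip blanks and IDs already queued
        seen.add(study_id)
        queries.append(
            {
                "name": "PDC_get_study_summary",
                "arguments": {
                    "operation": operation,  # placeholder value supplied by the caller
                    "pdc_study_id": study_id,
                },
            }
        )
    return queries
```

Each resulting dict can then be passed to `tu.run` in turn. The operation value expected by PDC_get_study_summary may differ from the one used for PDC_search_studies, so it is left as a caller-supplied argument here.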