CDC Tools

Configuration File: cdc_tools.json
Tool Type: Local
Tools Count: 3

This page contains all tools defined in the cdc_tools.json configuration file.

Available Tools

cdc_data_aggregate (Type: CDCRESTTool)

cdc_data_aggregate tool specification

Tool Information:

  • Name: cdc_data_aggregate

  • Type: CDCRESTTool

  • Description: Runs server-side SoQL aggregation on a Data.CDC.gov (Socrata) dataset through the /resource/{dataset_id}.json endpoint, which honors $select/$group/$having/$query (unlike the legacy /api/views endpoint used by cdc_data_get_dataset). Grouped statistics such as counts, sums, or averages per disease event, state, or year are computed across the ENTIRE dataset on the server, so there is no need to download up to 50000 raw rows and aggregate them client-side. Provide either select_clause containing an aggregate (e.g. ‘event, count(*)’) together with group_clause (‘event’), or a full soql_query (e.g. ‘SELECT event, count(*) AS n GROUP BY event ORDER BY n DESC LIMIT 10’). Field names are the dataset’s API field names (lowercase). Find dataset_id with cdc_data_search_datasets and inspect field names with cdc_data_get_dataset. No authentication is required.

Parameters:

  • dataset_id (string) (required) Dataset ID (Socrata 4x4 resource id) from Data.CDC.gov, e.g. ‘jkcx-ndu8’ (weekly notifiable disease counts).

  • select_clause ([‘string’, ‘null’]) (optional) SoQL $select expression, typically grouping columns plus an aggregate function. Examples: ‘event, count(*)’, ‘states, sum(m1)’, ‘avg(value) AS mean’. Aggregate result columns are named like ‘count_1’, ‘sum_m1’ unless you add ‘AS alias’.

  • group_clause ([‘string’, ‘null’]) (optional) SoQL $group expression: the comma-separated non-aggregate columns that appear in select_clause. Examples: ‘event’, ‘states, year’.

  • where_clause ([‘string’, ‘null’]) (optional) SoQL $where filter applied before aggregation. Examples: “year = ‘2019’”, “states = ‘ALABAMA’”.

  • having_clause ([‘string’, ‘null’]) (optional) SoQL $having filter applied to aggregated groups. Example: ‘count(*) > 100’.

  • order_by ([‘string’, ‘null’]) (optional) SoQL $order expression. Examples: ‘count_1 DESC’, ‘event ASC’.

  • soql_query ([‘string’, ‘null’]) (optional) Full SoQL query string. When provided, it OVERRIDES select_clause/where_clause/group_clause/having_clause/order_by/limit/offset, so it must carry its own clauses. Example: ‘SELECT event, count(*) AS n GROUP BY event ORDER BY n DESC LIMIT 10’.

  • limit ([‘integer’, ‘null’]) (optional) Maximum number of grouped result rows to return (default 50, max 50000). Ignored when soql_query is provided.

  • offset ([‘integer’, ‘null’]) (optional) Number of result rows to skip for pagination (default 0). Ignored when soql_query is provided.
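
The parameters above correspond to standard SoQL URL parameters on the /resource/{dataset_id}.json endpoint. The sketch below shows one plausible mapping. It is not the tool's actual implementation: the helper name build_aggregate_url and the host https://data.cdc.gov are assumptions made for illustration, and no request is sent.

```python
from urllib.parse import urlencode

BASE_URL = "https://data.cdc.gov"  # assumed host for the Data.CDC.gov portal
MAX_LIMIT = 50000  # documented maximum for `limit`


def build_aggregate_url(dataset_id, select_clause=None, group_clause=None,
                        where_clause=None, having_clause=None, order_by=None,
                        soql_query=None, limit=50, offset=0):
    """Hypothetical helper: map the tool's parameters onto SoQL URL parameters."""
    endpoint = f"{BASE_URL}/resource/{dataset_id}.json"
    if soql_query is not None:
        # A full query overrides every individual clause, including limit/offset.
        params = {"$query": soql_query}
    else:
        candidates = {
            "$select": select_clause,
            "$group": group_clause,
            "$where": where_clause,
            "$having": having_clause,
            "$order": order_by,
            "$limit": min(limit, MAX_LIMIT),
            "$offset": offset,
        }
        # Drop clauses the caller left unset.
        params = {key: value for key, value in candidates.items() if value is not None}
    return f"{endpoint}?{urlencode(params)}"


url = build_aggregate_url("jkcx-ndu8", select_clause="event, count(*)", group_clause="event")
print(url)
```

Treating soql_query as a separate branch mirrors the documented rule that a full query replaces all other clauses instead of being merged with them.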

Example Usage:

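# `tu` is assumed to be an already-initialized tool runner.
query = {
    "name": "cdc_data_aggregate",
    "arguments": {
        "dataset_id": "jkcx-ndu8",  # weekly notifiable disease counts
        "select_clause": "event, count(*)",
        "group_clause": "event"
    }
}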
result = tu.run(query)
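
The same tool also accepts a complete soql_query in place of individual clauses. As a minimal sketch, the hypothetical helper compose_soql below (not part of the tool) joins clauses in the order SoQL expects and reproduces the example query from the parameter description. The field name event is taken from the documented examples and is assumed to exist in the dataset.

```python
def compose_soql(select, where=None, group=None, having=None, order=None,
                 limit=None, offset=None):
    """Hypothetical helper: join SoQL clauses in the order SoQL expects."""
    parts = [f"SELECT {select}"]
    if where:
        parts.append(f"WHERE {where}")
    if group:
        parts.append(f"GROUP BY {group}")
    if having:
        parts.append(f"HAVING {having}")
    if order:
        parts.append(f"ORDER BY {order}")
    if limit is not None:
        parts.append(f"LIMIT {limit}")
    if offset is not None:
        parts.append(f"OFFSET {offset}")
    return " ".join(parts)


# Because soql_query overrides limit/offset, the LIMIT lives inside the query string.
query = {
    "name": "cdc_data_aggregate",
    "arguments": {
        "dataset_id": "jkcx-ndu8",
        "soql_query": compose_soql("event, count(*) AS n", group="event",
                                   order="n DESC", limit=10),
    },
}
print(query["arguments"]["soql_query"])
```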

cdc_data_get_dataset (Type: CDCRESTTool)

cdc_data_get_dataset tool specification

Tool Information:

  • Name: cdc_data_get_dataset

  • Type: CDCRESTTool

  • Description: Retrieve data from a specific CDC dataset on Data.CDC.gov. Requires a dataset ID (view ID) which can be found using cdc_data_search_datasets.

Parameters:

  • dataset_id (string) (required) Dataset ID (view ID) from Data.CDC.gov (e.g., ‘p5x4-u35c’)

  • limit (integer) (optional) Maximum number of rows to return (default: 100, max: 50000)

  • offset (integer) (optional) Number of rows to skip for pagination (default: 0)

  • where_clause (string) (optional) SoQL WHERE clause for filtering (e.g., “year = ‘2020’”)

  • order_by (string) (optional) Column name to order by (e.g., ‘year’)
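
The limit and offset parameters above can be combined to page through a dataset. The following is a minimal sketch, assuming a hypothetical helper page_queries that only builds the query dictionaries and sends no requests. The order_by value ‘year’ is illustrative and may not exist in a given dataset; a fixed ordering is generally advisable so that pages do not overlap.

```python
def page_queries(dataset_id, page_size=1000, max_rows=5000, order_by=None):
    """Hypothetical helper: yield one cdc_data_get_dataset query per page."""
    if not 1 <= page_size <= 50000:  # 50000 is the documented maximum for `limit`
        raise ValueError("page_size must be between 1 and 50000")
    for offset in range(0, max_rows, page_size):
        arguments = {
            "dataset_id": dataset_id,
            # The final page may be shorter than page_size.
            "limit": min(page_size, max_rows - offset),
            "offset": offset,
        }
        if order_by is not None:
            arguments["order_by"] = order_by
        yield {"name": "cdc_data_get_dataset", "arguments": arguments}


queries = list(page_queries("p5x4-u35c", page_size=1000, max_rows=2500, order_by="year"))
for q in queries:
    print(q["arguments"]["offset"], q["arguments"]["limit"])
```

Each generated dictionary can then be passed to the same runner used in the usage example, stopping early once a page comes back with fewer rows than requested.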

Example Usage:

# `tu` is assumed to be an already-initialized tool runner.
query = {
    "name": "cdc_data_get_dataset",
    "arguments": {
        "dataset_id": "p5x4-u35c",
        "limit": 10
    }
}
result = tu.run(query)

cdc_data_search_datasets (Type: CDCRESTTool)

cdc_data_search_datasets tool specification

Tool Information:

  • Name: cdc_data_search_datasets

  • Type: CDCRESTTool

  • Description: Search for datasets on Data.CDC.gov (CDC’s Socrata-based open data portal). Returns a list of available datasets matching search criteria. Use this to discover datasets before querying data.

Parameters:

  • search_query (string) (optional) Search term to find datasets (e.g., ‘mortality’, ‘vaccination’, ‘covid’)

  • category (string) (optional) Category filter (e.g., ‘Health’, ‘Public Safety’)

  • limit (integer) (optional) Maximum number of datasets to return (default: 50, max: 1000)

  • offset (integer) (optional) Number of results to skip for pagination (default: 0)
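
Because every search parameter is optional, it can help to assemble the arguments in one place and check the documented bounds before calling the tool. The sketch below uses a hypothetical helper search_arguments. The lower bound of 1 for limit and the rejection of negative offsets are assumptions, since only the maximum of 1000 is documented.

```python
def search_arguments(search_query=None, category=None, limit=50, offset=0):
    """Hypothetical helper: build arguments for cdc_data_search_datasets."""
    if not 1 <= limit <= 1000:  # 1000 is the documented maximum
        raise ValueError("limit must be between 1 and 1000")
    if offset < 0:
        raise ValueError("offset must not be negative")
    arguments = {"limit": limit, "offset": offset}
    # Only include the filters the caller actually set.
    if search_query:
        arguments["search_query"] = search_query
    if category:
        arguments["category"] = category
    return arguments


query = {
    "name": "cdc_data_search_datasets",
    "arguments": search_arguments("vaccination", category="Health", limit=10),
}
print(query)
```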

Example Usage:

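# `tu` is assumed to be an already-initialized tool runner.
query = {
    "name": "cdc_data_search_datasets",
    "arguments": {
        "search_query": "mortality",
        "limit": 10
    }
}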
result = tu.run(query)