Data Quality Tools

Configuration File: data_quality_tools.json
Tool Type: Local
Tools Count: 1

This page contains all tools defined in the data_quality_tools.json configuration file.

Available Tools

DataQuality_assess (Type: DataQualityTool)

Assess the quality of a tabular dataset (CSV file or JSON array of records).

DataQuality_assess tool specification

Tool Information:

  • Name: DataQuality_assess

  • Type: DataQualityTool

  • Description: Assess the quality of a tabular dataset (CSV file or JSON array of records). Returns per-column statistics (data type, missing count/percentage, unique values, numeric min/max/mean/std, categorical mode/top values), overall summary (total rows, columns, complete cases), and warnings for columns with >20% missing values, zero variance, potential outliers (>3 SD from mean), and highly correlated numeric pairs (|r| > 0.95). Pure local computation with pandas – no external API calls. Useful for pre-analysis data validation and quality control.
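The warning rules in the description can be approximated with a few lines of pandas. The sketch below is not the tool's actual implementation: the function name sketch_quality_warnings, the message wording, and the use of the sample standard deviation are assumptions made for illustration. Only the thresholds (20% missing, 3 SD, |r| > 0.95) come from the description.

```python
import pandas as pd


def sketch_quality_warnings(df: pd.DataFrame) -> list[str]:
    """Illustrative approximation of the documented warning rules (hypothetical helper)."""
    warnings = []
    numeric = df.select_dtypes(include="number")

    # Rule 1: columns with more than 20% missing values.
    for col, frac in df.isna().mean().items():
        if frac > 0.20:
            warnings.append(f"{col}: {frac:.0%} missing")

    for col in numeric.columns:
        values = numeric[col].dropna()
        if values.empty:
            continue
        # Rule 2: zero variance (every non-missing value is identical).
        if values.nunique() == 1:
            warnings.append(f"{col}: zero variance")
            continue
        # Rule 3: potential outliers, more than 3 SD from the mean.
        # Assumption: sample standard deviation (pandas default, ddof=1).
        distance = (values - values.mean()).abs()
        n_out = int((distance > 3 * values.std()).sum())
        if n_out:
            warnings.append(f"{col}: {n_out} potential outlier(s)")

    # Rule 4: highly correlated numeric pairs, |r| > 0.95.
    corr = numeric.corr()
    cols = list(corr.columns)
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:
            # Constant columns yield NaN here, which never exceeds the threshold.
            if abs(corr.loc[a, b]) > 0.95:
                warnings.append(f"{a} / {b}: |r| > 0.95")
    return warnings
```

Running a check like this on a small sample before calling the tool can help confirm that the warnings it returns match what the thresholds would suggest.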

Parameters:

  • data (array or string) (required) Input dataset: either a JSON array of records (list of dicts) or an absolute path to a CSV file.

  • columns (array or null) (optional) List of column names to assess. Default: all columns.

Example Usage:

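from tooluniverse import ToolUniverse  # assumed import path for the `tu` client

tu = ToolUniverse()
tu.load_tools()  # assumed setup step: load tool definitions before calling run()

query = {
    "name": "DataQuality_assess",
    "arguments": {
        # Illustrative records; an absolute path to a CSV file is also accepted.
        "data": [{"age": 34, "city": "Oslo"}, {"age": None, "city": "Oslo"}]
    }
}
result = tu.run(query)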
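
The example above passes only the required data argument. The optional columns argument and the CSV-path form of data could be combined as in the following sketch. The helper build_assess_query is hypothetical and not part of the tool or its library; it only assembles the query dictionary and rejects relative paths, since the parameter description asks for an absolute path.

```python
import os


def build_assess_query(data, columns=None):
    """Hypothetical helper: assemble a DataQuality_assess query dictionary."""
    # A string is treated as a CSV path, which the tool expects to be absolute.
    if isinstance(data, str) and not os.path.isabs(data):
        raise ValueError("CSV path must be absolute")
    arguments = {"data": data}
    # Omitting `columns` leaves the documented default in place: all columns.
    if columns is not None:
        arguments["columns"] = list(columns)
    return {"name": "DataQuality_assess", "arguments": arguments}


# Restrict the assessment to two columns of an in-memory dataset.
records = [{"age": 34, "city": "Oslo", "id": 1}, {"age": 41, "city": "Bergen", "id": 2}]
query = build_assess_query(records, columns=["age", "city"])
```

The resulting dictionary has the same shape as the query in the example above and would be passed to the same run call.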