Data Validator
Perform comprehensive data quality checks on datasets: validate schemas, detect anomalies, find duplicates, and enforce data contracts. Essential for ETL pipelines, where bad data silently corrupts downstream analytics and dashboards.
Overview
The Data Validator is a skill for AI agents, part of the TerminalSkills/skills repository on GitHub (72 stars). It lets coding-focused agents such as Claude, Gemini, and Codex check data integrity inside automated workflows and ETL pipelines: validating schemas, detecting anomalies, identifying duplicates, and enforcing data contracts. The goal is to confirm that incoming data meets predefined standards before further processing or visualization, so that bad records do not silently corrupt downstream analytics and dashboards.
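As a rough illustration of the three kinds of checks named in this overview, here is a minimal standard-library sketch. It is not the skill's own implementation: the contract format, the function names (`check_schema`, `find_duplicates`, `find_anomalies`), the sample rows, and the choice of a modified z-score for anomaly detection are all assumptions made for the example.

```python
from collections import Counter
from statistics import median

# Hypothetical data contract: column name -> expected Python type.
CONTRACT = {"order_id": int, "customer": str, "amount": float}


def check_schema(rows, contract):
    """Return (row_index, column, problem) for every contract violation."""
    problems = []
    for i, row in enumerate(rows):
        for column, expected in contract.items():
            if column not in row:
                problems.append((i, column, "missing"))
            elif not isinstance(row[column], expected):
                problems.append((i, column, "wrong type"))
    return problems


def find_duplicates(rows, key):
    """Return the key values that occur in more than one row."""
    counts = Counter(row[key] for row in rows if key in row)
    return sorted(value for value, n in counts.items() if n > 1)


def find_anomalies(values, threshold=3.5):
    """Flag values whose modified z-score (based on the median absolute
    deviation) exceeds the threshold. The threshold is an assumed default."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread around the median, nothing can be ranked
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Made-up sample data, for illustration only.
rows = [
    {"order_id": 1, "customer": "ana", "amount": 20.0},
    {"order_id": 2, "customer": "bo", "amount": 22.5},
    {"order_id": 2, "customer": "bo", "amount": 22.5},
    {"order_id": 3, "customer": "cy", "amount": "n/a"},
    {"order_id": 4, "customer": "di", "amount": 21.0},
    {"order_id": 5, "customer": "ed", "amount": 5000.0},
]

schema_problems = check_schema(rows, CONTRACT)
duplicate_ids = find_duplicates(rows, "order_id")
amounts = [r["amount"] for r in rows if isinstance(r["amount"], float)]
outliers = find_anomalies(amounts)

print(schema_problems)
print(duplicate_ids)
print(outliers)
```

Running the checks before a load step, and refusing to continue when any of the three lists is non-empty, is one simple way to treat the contract as a gate rather than a report.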
Use Cases
Install Notes
# Review source first
open https://github.com/TerminalSkills/skills/blob/main/skills/data-validator/SKILL.md
Copy or clone the skill folder into your agent skills directory after reviewing its instructions and scripts.
Security Notes
Users should ensure that the AI agent has appropriate read permissions for the datasets being analyzed. When processing sensitive or regulated information, verify that the agent's environment complies with local data privacy standards, as the skill interacts directly with dataset contents to perform validation and anomaly detection.
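One way to act on the read-permission advice is to check access before handing any files to the agent. The sketch below is an assumption about how such a pre-check might look, not part of the skill; the helper name `readable_datasets` and the example paths are hypothetical.

```python
import os


def readable_datasets(paths):
    """Split paths into files the current process can read and those it cannot."""
    readable, blocked = [], []
    for path in paths:
        # os.access reports what the current user may do; a missing file
        # or a directory is treated as blocked.
        if os.path.isfile(path) and os.access(path, os.R_OK):
            readable.append(path)
        else:
            blocked.append(path)
    return readable, blocked


if __name__ == "__main__":
    # Hypothetical dataset paths, for illustration only.
    ok, blocked = readable_datasets(["data/orders.csv", "data/customers.csv"])
    print("readable:", ok)
    print("blocked:", blocked)
```

A check like this only confirms that the files can be read. Whether the data should be read in that environment is a separate privacy and compliance question.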
Related Skills
Feedback Analysis
TerminalSkills/skills
Collect user feedback from multiple channels, categorize it, extract patterns, and turn it into prioritized product decisions. Build a systematic process from raw input to actionable insight.
Data Extractor
TerminalSkills/skills
Extract structured data from documents in any format: PDF, DOCX, HTML, TXT, images, and more. Converts unstructured or semistructured content into clean JSON, CSV, or other structured formats. Handles invoices, forms, reports, and freetext documents.
Pandas
TerminalSkills/skills
Pandas is a Python library for loading, cleaning, transforming, and analyzing tabular data. It provides DataFrames for structured data manipulation, supports CSV, Excel, SQL, JSON, and Parquet formats, and offers powerful groupby aggregation, merge/join operations, time series resampling, and method chaining for buildi
Data Analysis
TerminalSkills/skills
Analyze tabular data from CSV, Excel, or other structured formats. Generate summary statistics, discover patterns, answer specific questions, and produce visualizations. Uses Python with pandas for data manipulation and matplotlib/seaborn for charts.