Codex, Claude Code, Claude, Cursor, Gemini

Data Validator

Perform comprehensive data quality checks on datasets: validate schemas, detect anomalies, find duplicates, and enforce data contracts. Useful in ETL pipelines, where bad data silently corrupts downstream analytics and dashboards.

Overview

Data Validator is a skill for AI coding agents, published in the TerminalSkills/skills repository on GitHub (72 stars). It lets agents such as Claude, Gemini, and Codex run data quality checks inside automated workflows and ETL pipelines: validating schemas, detecting anomalies, identifying duplicate records, and enforcing data contracts. The goal is to catch data that fails predefined standards before it reaches further processing or visualization, where it would otherwise silently corrupt downstream analytics and dashboards.
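To make "enforcing a data contract" concrete, here is a minimal sketch of the kind of schema check an agent might run. It is not taken from the skill's own scripts: the `FieldRule` class, the `validate_rows` function, and the sample field names are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class FieldRule:
    """One field in a hypothetical data contract."""
    name: str
    type_: type
    required: bool = True


def validate_rows(rows: list[dict[str, Any]], contract: list[FieldRule]) -> list[str]:
    """Return readable violations; an empty list means every row conforms."""
    errors: list[str] = []
    for i, row in enumerate(rows):
        for rule in contract:
            value = row.get(rule.name)
            if value is None:
                # Missing and null are treated the same in this sketch.
                if rule.required:
                    errors.append(f"row {i}: missing required field '{rule.name}'")
                continue
            # bool is a subclass of int in Python, so reject it for non-bool fields.
            wrong_type = not isinstance(value, rule.type_) or (
                isinstance(value, bool) and rule.type_ is not bool
            )
            if wrong_type:
                errors.append(
                    f"row {i}: field '{rule.name}' expected "
                    f"{rule.type_.__name__}, got {type(value).__name__}"
                )
    return errors


# Illustrative contract and data (assumed, not from the skill).
contract = [
    FieldRule("order_id", int),
    FieldRule("amount", float),
    FieldRule("coupon", str, required=False),
]
rows = [
    {"order_id": 1, "amount": 19.99, "coupon": None},
    {"order_id": "2", "amount": 5.0},
    {"amount": 3.5},
]
violations = validate_rows(rows, contract)
for message in violations:
    print(message)
```

Returning a list of violations rather than raising on the first one lets a pipeline report every problem in a batch before deciding whether to halt.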

Use Cases

Verifying dataset schemas against predefined contracts during ETL pipeline execution.

Identifying statistical anomalies and duplicate records in raw data files.

Ensuring data quality before feeding information into analytics dashboards.
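The duplicate and anomaly checks in the use cases above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the skill's actual method: the function names and sample rows are hypothetical, and the outlier test shown is the modified z-score (median and median absolute deviation, with the conventional 3.5 cutoff), picked here because it is robust on small samples.

```python
from collections import Counter
from statistics import median


def find_duplicates(rows, key_fields):
    """Return key tuples that occur more than once, mapped to their counts."""
    counts = Counter(tuple(row.get(f) for f in key_fields) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


def find_outliers(values, threshold=3.5):
    """Return indices whose modified z-score exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread to measure against, so nothing is flagged
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Illustrative raw data (assumed): one repeated order_id, one extreme amount.
rows = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": 12.0},
    {"order_id": 3, "amount": 11.0},
    {"order_id": 1, "amount": 9.5},
    {"order_id": 4, "amount": 10.5},
    {"order_id": 5, "amount": 250.0},
]
duplicates = find_duplicates(rows, ["order_id"])
outliers = find_outliers([row["amount"] for row in rows])
print(duplicates)  # {(1,): 2}
print(outliers)    # [5]
```

A median-based score is used instead of a mean-based one because a single extreme value inflates the standard deviation and can hide itself in a short column.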

Install Notes

# Review source first
open https://github.com/TerminalSkills/skills/blob/main/skills/data-validator/SKILL.md

Copy or clone the skill folder into your agent skills directory after reviewing its instructions and scripts.

Security Notes

Users should ensure that the AI agent has appropriate read permissions for the datasets being analyzed. When processing sensitive or regulated information, verify that the agent's environment complies with local data privacy standards, as the skill interacts directly with dataset contents to perform validation and anomaly detection.

Related Skills

Interview Me

addyosmani/agent-skills

Data Analysis

Extracts what the user actually wants instead of what they think they should want. Achieves this through one-question-at-a-time interview until ~95% confidence about the underlying intent. Use when an ask is underspecified ("build me X" without "for whom" or "why now"), when the user explicitly invokes ("interview m...

react, review

80,654 Stars · MIT

Derive Client

vercel-labs/agent-browser

Data Analysis

Reverse-engineer a website's internal API by recording browser traffic into a HAR file, then generate a standalone client or CLI that calls the endpoints directly, with no browser needed after the first recording. Use when asked to "derive a client", "build a CLI for <site>", "reverse engineer this site's API", "rec...

browser, data

39,283 Stars · Apache-2.0

Electron

vercel-labs/agent-browser

Data Analysis

Automate Electron desktop apps (VS Code, Slack, Discord, Figma, Notion, Spotify, etc.) using agent-browser via Chrome DevTools Protocol. Use when the user needs to interact with an Electron app, automate a desktop app, connect to a running app, control a native app, or test an Electron application. Triggers include...

Codex, Claude

design, browser

39,283 Stars · Apache-2.0

Codeql

trailofbits/skills

Data Analysis

Scans a codebase for security vulnerabilities using CodeQL's interprocedural data flow and taint tracking analysis. Triggers on "run codeql", "codeql scan", "codeql analysis", "build codeql database", or "find vulnerabilities with codeql". Supports "run all" (security-and-quality + security-experimental suites) and...

Claude Code, Claude

typescript, python

6,280 Stars · Source linked

Supabase Postgres Best Practices

supabase/agent-skills

Data Analysis

Postgres performance optimization and best practices from Supabase. Use this skill when writing, reviewing, or optimizing Postgres queries, schema designs, or database configurations.

security, review

2,428 Stars · MIT

Supabase

supabase/agent-skills

Data Analysis

Use when doing ANY task involving Supabase. Triggers: Supabase products (Database, Auth, Edge Functions, Realtime, Storage, Vectors, Cron, Queues); client libraries and SSR integrations (supabase-js, @supabase/ssr) in Next.js, React, SvelteKit, Astro, Remix; auth issues (login, logout, sessions, JWT, cookies, getSes...

react, frontend

2,428 Stars · MIT