agent-browser
Browser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, testing web apps, or automating any browser task. Triggers include requests to "open a website", "fill out a form", "click a button",
Overview
The agent-browser skill, hosted in the mxyhi/ok-skills repository, provides a command-line interface that lets AI agents such as Claude, Cursor, and Windsurf drive a web browser: navigating between pages, interacting with UI elements like buttons and forms, and capturing screenshots for visual verification. According to the documentation in the source repository, it is particularly suited to automating repetitive browser-based workflows and extracting structured data from web pages. The repository currently shows 423 stars. With this skill installed, an agent can work against live websites and web applications, which supports automated testing and real-time information retrieval.
Use Cases
Install Notes
# Review source first
open https://github.com/mxyhi/ok-skills/blob/main/agent-browser/SKILL.md
Copy or clone the skill folder into your agent skills directory after reviewing its instructions and scripts.
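The copy step could be scripted along these lines. This is only a sketch under stated assumptions: the function name install_skill and the destination path are hypothetical, and the location of the skills directory depends on the agent you use, so check its documentation rather than relying on the path shown here.

```shell
#!/bin/sh
# Sketch: copy an already-reviewed skill folder into an agent's skills directory.
# install_skill and both paths are hypothetical; they are not part of the skill.
install_skill() {
    src=${1%/}   # reviewed skill folder (trailing slash stripped)
    dest=$2      # your agent's skills directory (location varies by agent)

    # Refuse to copy a folder that has no SKILL.md to review.
    if [ ! -f "$src/SKILL.md" ]; then
        echo "no SKILL.md found in $src" >&2
        return 1
    fi

    mkdir -p "$dest" || return 1
    # Copies the folder itself, so the result is $dest/<folder name>/SKILL.md.
    cp -R "$src" "$dest/"
}

# Example usage (destination path is an assumption):
#   git clone https://github.com/mxyhi/ok-skills.git
#   install_skill ok-skills/agent-browser "$HOME/agent-skills"
```

The function only copies; cloning and reviewing stay as manual steps so nothing is installed before its instructions and scripts have been read.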
Security Notes
This skill lets an AI agent interact directly with live web environments and enter data into forms. As with any browser automation tool, monitor the agent's actions to ensure compliance with website terms of service and to protect sensitive information during automated sessions.
Related Skills
Web Application Testing
anthropics/skills
Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.
Electron App Automation
mxyhi/ok-skills
Automate Electron desktop apps (VS Code, Slack, Discord, Figma, Notion, Spotify, etc.) using agent-browser via Chrome DevTools Protocol. Use when the user needs to interact with an Electron app, automate a desktop app, connect to a running app, control a native app, or test an Electron application. Triggers include "au
Browser Trace
mxyhi/ok-skills
Capture a full DevTools-protocol trace of any browser automation — CDP firehose, screenshots, and DOM dumps — then bisect the stream into per-page searchable buckets. Use when the user wants to debug a failed run, audit network/console/DOM activity, attach a trace to an in-progress session, or feed structured per-page
Kimi WebBridge
mxyhi/ok-skills
Control the user's real browser (with their login sessions) via a local daemon at http://127.0.0.1:10086.
Ably — Realtime Infrastructure as a Service
TerminalSkills/skills
You are an expert in Ably, the enterprise-grade realtime messaging platform. You help developers add pub/sub messaging, presence, chat, live updates, and event streaming to applications with guaranteed message ordering, exactly-once delivery, automatic reconnection, and global edge infrastructure, handling millions of m