Skills / blink skill
blink skill
Proactive screen awareness + Claude Vision assistance
Installation
Compatibility
Description
๐๏ธ Blink Skill ยท ่ฎฉไฝ ็ AI ๅญฆไผใ็จ็ผใ่งๅฏไฝ ็็ต่
Teach your AI to blink โ watch your screen, understand your context, help without being asked.
ไฝๆๆฌ่ทๅพ AI ไธปๅจๅๅฉๅ่ฝ๏ผๆ็ฅไฝ ๅจๅไปไน โ ไธปๅจ้ฎไฝ ้ไธ้่ฆๅธฎๅฟ โ ๆชๅพ โ Claude Vision ๅๆ โ ็ปๅบๅๅฉใ
ไธไธช็จ็ผ๏ผAI ๅฐฑ็ฅ้ไฝ ๅจๅไปไนใ
ไธ้่ฆไฝ ไธปๅจๆ่ฟฐ๏ผไธ้่ฆๆชๅพๅ็ปๅฎโโๅฎ่ชๅทฑ็๏ผ่ชๅทฑ้ฎ๏ผ่ชๅทฑๅธฎใ
โจ What is Blink? ไปไนๆฏใ็จ็ผใ๏ผ
Most AI assistants are reactive โ they wait for you to describe your situation, paste content, or upload screenshots. That's friction.
Blink removes that friction. It's a proactive AI skill that watches your PC activity via a lightweight Python sentinel, and when it detects you're in a meeting, working on a document, watching a video, or coding โ it reaches out and asks if you need help. One screenshot is all it takes.
ๅคงๅคๆฐ AI ๅฉๆๆฏ่ขซๅจ็โโไฝ ๅพๆ่ฟฐๆ ๅตใ็ฒ่ดดๅ ๅฎนใไธไผ ๆชๅพใ
็จ็ผ็ๆไบ่ฟไธๅ๏ผAI ่ชๅทฑ็๏ผ่ชๅทฑ้ฎ๏ผไธๅผ ๆชๅพ๏ผ็ดๆฅๅๅฉใ
PC Sentinel watches your screen (Python, 1s interval)
โ
Detects: Tencent Meeting / WPS doc / Bilibili video / Claude Code
โ
AI asks proactively: "Hey, want me to help with that?"
โ
You say: "Yes" / "ๅธฎๆๆด็ไธไธ" / "Summarize this"
โ
Blink takes ONE screenshot โ Claude Vision analyzes it
โ
Result: meeting notes / document analysis / video summary / code review
โ
Need more? Scroll manually โ say "็ปง็ปญ" โ up to 3 screenshots total
๐ Demo
| Scenario | What Blink Does | |----------|----------------| | ๐ฅ Tencent Meeting opens | "Want me to record and summarize the meeting?" | | ๐ WPS document opens | "Want me to help with this document?" โ screenshots & analyzes it | | ๐ฌ Bilibili/YouTube video | "Want me to summarize this video?" | | ๐ป Claude Code / Codex active | "Want a screenshot analysis of your coding context?" |
๐ Architecture
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Windows PC โ
โ โ
โ pc_sentinel.pyw โโwritesโโโถ pc-status.json (every 1s) โ
โ (Python watcher) { status, detail, foreground } โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโฌโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ fs.watch / poll
โผ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Node.js WebSocket Server โ
โ โ
โ โโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ โ
โ โ Proactive Layer โ โ Blink Engine โ โ
โ โ โ โ โ โ
โ โ fs.watch detectsโโโโโถโ blinkOnce(task, opts) โ โ
โ โ meeting / WPS / โ โ โโ CDP screenshot (browser) โ โ
โ โ video / coding โ โ โโ PowerShell screenshot โ โ
โ โ โ โ (WPS / system) โ โ
โ โ _proactiveOn โ โ โ โ
โ โ Connect() โ โ analyzeAndArchive(task, opts)โ โ
โ โโโโโโโโโโโโโโโโโโโ โ โโ multi-screen + Word doc โ โ
โ โโโโโโโโโโโโโโโโฌโโโโโโโโโโโโโโโโโโ โ
โ โ axios POST โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโผโโโโโโโโโโโโโโโโโโโโโ
โผ
โโโโโโโโโโโโโโโโโโโโ
โ Claude Vision โ
โ API (Anthropic) โ
โโโโโโโโโโโโโโโโโโโโ
โ
โโโโโโโโโโโโผโโโโโโโโโโ
โ Browser UI / TTS โ
โ (WebSocket client) โ
โโโโโโโโโโโโโโโโโโโโโโ
๐ Project Structure
changzheng-blink/
โโโ README.md # This file
โโโ LICENSE # MIT
โโโ skill.md # Claude Code skill manifest
โโโ snippets/
โโโ screen_vision.js # Core: screenshot + Claude Vision
โโโ proactive-detection.js # Core: PC state detection + proactive trigger
โโโ blink-functions.js # Core: blink session management
Integration into ้ฟๅพๆบ (pc-server/)
The full server integration lives in the main Changzheng repo:
pc-server/
โโโ server.js # WebSocket server + routing
โ โโโ fs.watch(STATUS_FILE) # โ proactive trigger (background apps)
โ โโโ _proactiveOnConnect(ws) # โ trigger on new connection
โ โโโ _broadcastProactive(...) # โ send question to all clients
โ โโโ _blinkStart(ws, ...) # โ start a blink session
โ โโโ _blinkContinue(ws, ...) # โ handle "็ปง็ปญ" for next screenshot
โโโ lib/
โโโ screen_vision.js # โ screenshot + Vision (same as snippets/)
๐ ๏ธ Setup
Prerequisites
- Windows 10/11 (screenshot uses PowerShell)
- Node.js 18+
- Python 3.8+ (for
pc_sentinel.pyw) - Claude API Key (get one here)
- Chrome/Edge browser (optional, for CDP-based browser screenshots)
Install
npm install ws axios docx
1. Run the PC Sentinel
The sentinel watches active windows and writes state every second:
python pc_sentinel.pyw
# Writes to: ~/.openclaw/workspace/pc-status.json
pc-status.json format:
{
"status": "meeting",
"detail": "bg:wemeetapp",
"foreground": "msedge",
"fg_title": "My Browser - Microsoft Edge",
"idle_seconds": 0,
"updated_at": "2026-04-08T14:30:00"
}
Possible status values: meeting, office, browsing, coding, video, unknown
2. Integrate into your WebSocket server
const screenVision = require('./snippets/screen_vision');
const { watchProactiveTrigger, checkOnConnect } = require('./snippets/proactive-detection');
const { startBlink, handleBlinkContinue } = require('./snippets/blink-functions');
// Watch for proactive triggers
watchProactiveTrigger(STATUS_FILE, getContext, broadcastProactive);
// On new WebSocket connection
wss.on('connection', (ws) => {
checkOnConnect(ws, getContext, sendFn, sendTTSFn);
});
// In your message router (highest priority)
async function routeMessage(ws, text) {
// 1. Check for blink continuation
const continued = await handleBlinkContinue(ws, text, { send, sendTTS, addHistory, reply });
if (continued) return;
// 2. Check for proactive confirmation/denial
if (ws._proactivePending) {
if (DENY_RE.test(text)) { /* dismiss */ return; }
if (/* isConfirm */ true) { await startBlink(ws, text, { send, sendTTS, addHistory }); return; }
}
// 3. Normal routing...
}
โ๏ธ Configuration
| Option | Default | Description |
|--------|---------|-------------|
| BLINK_MAX | 3 | Maximum screenshots per session |
| PROBE_DISMISS_MS | 30 min | Cooldown after user dismissal |
| forceSystem | false | Use PowerShell screenshot instead of CDP |
| scrollCount | 5 | Screens for analyzeAndArchive (multi-page) |
๐ Key State Variables
// Per-connection (on ws object)
ws._proactivePending // { type, appKey, askedAt } โ pending proactive question
ws._blinkSession // { task, forceSystem, summaries[], count } โ active blink session
// Global
_probeDismissed // Map<key, timestamp> โ 30-min cooldown per app
_probeLastKey // string โ last broadcast key (dedup)
_probeCooldown // number โ timestamp of last broadcast
๐ง How Proactive Detection Works
// pc-status.json is watched for changes
fs.watch(STATUS_FILE, async () => {
const ctx = await getContext();
// Parse detail (can be string or object depending on detection method)
// String: "bg:wemeetapp" (background process)
// Object: { matched_by, fg_process, fg_window } (foreground process)
const { matchedBy, fgProcess } = parseDetail(ctx);
// Detect trigger type
if (ctx.status === 'meeting' && /wemeet|zoom/i.test(matchedBy + fgProcess)) {
// โ ask about recording
} else if (ctx.status === 'office' && /wps/i.test(matchedBy + fgProcess)) {
// โ ask about document help
}
// Use unified key format: "probe:meeting", "probe:document"
const key = `probe:${triggerType}`;
if (!_isDismissed(key)) broadcastProactive(...);
});
๐ธ Screenshot Methods
| Method | Used For | How |
|--------|----------|-----|
| CDP (Chrome DevTools Protocol) | Browser tabs | chrome.tabs.captureVisibleTab via CDP |
| PowerShell EncodedCommand | WPS, system-wide | System.Drawing.Graphics.CopyFromScreen via Base64-encoded PS |
PowerShell commands are Base64-encoded as UTF-16LE to avoid quoting issues and AMSI false positives.
๐ Blink Session Flow
User confirms help
โ
โผ
[WPS? โ AppActivate WPS window]
โ
โผ
blinkOnce(task, { forceSystem })
โ
โผ
Claude Vision analyzes screenshot
โ
โผ
Output: "ใ็ฌฌ1ๅผ ๆชๅพๅๆใ..."
โ
โผ
"็ฟป้กตๅ่ฏด็ปง็ปญๅฏไปฅๅ็ไธๅฑ (max 3)"
โ
User scrolls manually
โ
User says: "็ปง็ปญ" / "ไธไธ้กต" / "ๅฅฝไบ"
โ
โผ
blinkOnce(task, { screenIndex: 2, prevSummaries: [...] })
โ
โผ
Output: "ใ็ฌฌ2ๅผ ๆชๅพๅๆใ..." (new content only)
๐ค Contributing
PRs welcome! Key areas for improvement:
- [ ] macOS support (replace PowerShell with
screencapture) - [ ] Auto-scroll for WPS (currently manual due to window focus issues)
- [ ] More trigger apps (Slack, Notion, VSCode)
- [ ] Voice confirmation (STT integration)
๐ License
MIT ยฉ 2026 ๅฑฑ่
๐ฅ Contributors
| | Name | Role | |---|------|------| | ๐งโ๐ป | ZHANG Tianrui (@zhangtianruiwork-droid) | Creator ยท Architecture ยท Integration | | ๐ค | Claude (Anthropic) | Co-developer ยท Code generation ยท Vision analysis |
Built as part of ้ฟๅพๆบ (Changzheng) โ a personal AI assistant that runs 24/7 on Windows, powered by Claude + DeepSeek.
Related Skills
last30days skill
AI agent skill that researches any topic across Reddit, X, YouTube, HN, Polymarket, and the web - then synthesizes a grounded summary
frontend slides
Create beautiful slides on the web using Claude's frontend skills
context mode
Context window optimization for AI coding agents. Sandboxes tool output, 98% reduction. 14 platforms
claude seo
Universal SEO skill for Claude Code. 19 sub-skills, 12 subagents, 3 extensions (DataForSEO, Firecrawl, Banana). Technical SEO, E-E-A-T, schema, GEO/AEO, backlinks, local SEO, maps intelligence, Google APIs, and PDF/Excel reporting.
claude ads
Comprehensive paid advertising audit & optimization skill for Claude Code. 250+ checks across Google, Meta, YouTube, LinkedIn, TikTok, Microsoft & Apple Ads with weighted scoring, parallel agents, industry templates, and AI creative generation.
claude obsidian
Claude + Obsidian knowledge companion. Persistent, compounding wiki vault based on Karpathy's LLM Wiki pattern. /wiki /save /autoresearch