From johnlindquist-claude
Analyze images using Gemini's vision capabilities. Use for image analysis, text extraction from screenshots, and visual content understanding.
npx claudepluginhub joshuarweaver/cascade-ai-ml-engineering --plugin johnlindquist-claudeThis skill uses the workspace's default tool permissions.
Analyze images using Gemini Pro's vision capabilities.
Creates isolated Git worktrees for feature branches with prioritized directory selection, gitignore safety checks, auto project setup for Node/Python/Rust/Go, and baseline verification.
Executes implementation plans in current session by dispatching fresh subagents per independent task, with two-stage reviews: spec compliance then code quality.
Dispatches parallel agents to independently tackle 2+ tasks like separate test failures or subsystems without shared state or dependencies.
Analyze images using Gemini Pro's vision capabilities.
pip install google-generativeai
export GEMINI_API_KEY=your_api_key
# Analyze an image
gemini -m pro -f /path/to/image.png "Describe this image in detail"
# With specific question
gemini -m pro -f screenshot.png "What error message is shown?"
# Multiple images
gemini -m pro -f image1.png -f image2.png "Compare these two images"
gemini -m pro -f image.png "Describe this image comprehensively:
1. Main subject/content
2. Colors and composition
3. Text visible (if any)
4. Context and purpose
5. Notable details"
gemini -m pro -f screenshot.png "Extract all text from this image.
Format as plain text, preserving layout where possible.
Include any text in buttons, labels, or UI elements."
gemini -m pro -f code-screenshot.png "Extract the code from this screenshot.
Provide as properly formatted code with correct indentation.
Note any parts that are unclear or partially visible."
gemini -m pro -f ui-screenshot.png "Analyze this UI:
1. What application/website is this?
2. What page/screen is shown?
3. Main UI elements and their purpose
4. User flow/actions available
5. Any UX issues or suggestions"
gemini -m pro -f error-screenshot.png "Analyze this error:
1. What error is shown?
2. What is the likely cause?
3. How to fix it?
4. Any related information visible?"
gemini -m pro -f diagram.png "Explain this diagram:
1. What type of diagram is this?
2. Main components and their relationships
3. Data/process flow
4. Key takeaways"
gemini -m pro -f debug-screen.png "I'm debugging an issue. From this screenshot:
1. What is the current state?
2. What errors or warnings are visible?
3. What should I look at?
4. Suggested next steps"
gemini -m pro -f before.png -f after.png "Compare these before and after images:
1. What changed?
2. Is this an improvement?
3. Any issues in the 'after' version?
4. Anything missing?"
gemini -m pro -f design.png "Provide design feedback:
1. Visual hierarchy
2. Color usage
3. Typography
4. Spacing and alignment
5. Accessibility concerns
6. Suggestions for improvement"
gemini -m pro -f chart.png "Extract data from this chart:
1. Chart type
2. Data series and values
3. Axes labels and ranges
4. Key trends or insights
5. Output as structured data if possible"
gemini -m pro -f form.png "Analyze this form:
1. Form purpose
2. Fields and their types
3. Required vs optional
4. Validation rules visible
5. UX suggestions"
# Capture screenshot (macOS)
screencapture -i /tmp/bug.png
# Analyze and format as issue
gemini -m pro -f /tmp/bug.png "Create a bug report from this screenshot:
## Summary
[One-line description]
## Steps to Reproduce
[Inferred from screenshot]
## Expected Behavior
[What should happen]
## Actual Behavior
[What the screenshot shows]
## Environment
[Any visible system info]"
gemini -m pro -f ui-design.png "Generate React component code that recreates this UI:
- Use Tailwind CSS for styling
- Make it responsive
- Include proper TypeScript types
- Add appropriate accessibility attributes"
gemini -m pro -f app-screen.png "Write user documentation for this screen:
- What this screen is for
- How to use each feature
- Common tasks
- Tips and notes"