gemini

MCP Server that enables Claude code to interact with Gemini

★ 184von @RLabs-IncMITGitHub →

Installation

Claude Code

claude mcp add gemini -- npx -y @rlabs-inc/gemini-mcp

npx

npx -y @rlabs-inc/gemini-mcp

npm: @rlabs-inc/gemini-mcp

Transport

stdio

Tools (20)

Feature

Description

Claude

Gemini

Architecture

Visual components

Integration

React/CSS styling

Reasoning

Creative generation

Variable

Required

Description

GEMINI_API_KEY

Yes

GEMINI_OUTPUT_DIR

GEMINI_MODEL

GEMINI_PRO_MODEL

GEMINI_FLASH_MODEL

GEMINI_IMAGE_MODEL

GEMINI_IMAGE_THINKING_LEVEL

GEMINI_VIDEO_MODEL

VERBOSE

QUIET

GEMINI_ENABLED_TOOLS

GEMINI_TOOL_PRESET

Preset

Tool Groups

Dokumentation

MCP Server Gemini

A Model Context Protocol (MCP) server for integrating Google's Gemini 3 models with Claude Code, enabling powerful collaboration between both AI systems. Now with a beautiful CLI!

MCP Registry Support: Now discoverable in the official MCP ecosystem!

Features

| Feature |-------------------------------|- | Deep Research Agent | Token Counting | Text-to-Speech | URL Analysis | Context Caching | YouTube Analysis | Document Analysis | 4K Image Generation | Multi-Turn Image Editing | Video Generation | Code Execution | Google Search | Structured Output | Data Extraction | Thinking Levels | Direct Query | Brainstorming | Code Analysis | Summarization | Description | ----------------------------------------------------------------| | Autonomous multi-step research with web search and citations | | Count tokens and estimate costs before API calls | | 30 unique voices, single speaker or two-speaker dialogues | | Analyze, compare, and extract data from web pages | | Cache large documents for efficient repeated queries | | Analyze videos by URL with timestamp clipping | | PDFs, DOCX, spreadsheets with table extraction | | Generate images up to 4K with 10 aspect ratios | | Iteratively refine images through conversation | | Create videos with Veo 2.0 (async with polling) | | Gemini writes and runs Python code (pandas, numpy, matplotlib) | | Real-time web information with inline citations | | JSON responses with schema validation | | Extract entities, facts, sentiment from text | | Control reasoning depth (minimal/low/medium/high) | | Send prompts to Gemini 3 Pro/Flash models | | Claude + Gemini collaborative problem-solving | | Analyze code for quality, security, performance | | Summarize content at different detail levels |

Quick Installation

MCP Server for Claude Code

# Using npm (Recommended)
claude mcp add gemini -s user -- env GEMINI_API_KEY=YOUR_KEY npx -y @rlabs-inc/gemini-mcp

# Using bun
claude mcp add gemini -s user -- env GEMINI_API_KEY=YOUR_KEY bunx @rlabs-inc/gemini-mcp

CLI (Global Install)

# Install globally
npm install -g @rlabs-inc/gemini-mcp

# Set your API key once (stored securely)
gcli config set api-key YOUR_KEY

# Now use any command!
gcli search "latest news"
glci image "sunset over mountains" --ratio 16:9

Get your API key: Visit Google AI Studio - it's free and takes seconds!

Installation Options

# With verbose logging
claude mcp add gemini -s user -- env GEMINI_API_KEY=YOUR_KEY VERBOSE=true bunx -y @rlabs-inc/gemini-mcp

# With custom output directory for generated images/videos
claude mcp add gemini -s user -- env GEMINI_API_KEY=YOUR_KEY GEMINI_OUTPUT_DIR=/path/to/output bunx -y @rlabs-inc/gemini-mcp

Available Tools

gemini-query

Direct queries to Gemini with thinking level control:

prompt: "Explain quantum entanglement"
model: "pro" or "flash"
thinkingLevel: "low" | "medium" | "high" (optional)

low: Fast responses, minimal reasoning
medium: Balanced (Flash only)
high: Deep reasoning for complex tasks (default)

gemini-generate-image

Generate images with Nano Banana Pro (Claude can SEE them!):

prompt: "a futuristic city at sunset"
style: "cyberpunk" (optional)
aspectRatio: "16:9" (1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9)
imageSize: "2K" (1K, 2K, 4K)
useGoogleSearch: false (ground in real-world info)
thinkingLevel: "high" (optional - minimal, low, medium, high)
personGeneration: "ALLOW_ALL" (optional - ALLOW_ALL, ALLOW_ADULT, ALLOW_NONE)
seed: 42 (optional - for reproducible results)

gemini-start-image-edit

Start a multi-turn image editing session:

prompt: "a cozy cabin in the mountains"
aspectRatio: "16:9"
imageSize: "2K"
useGoogleSearch: false
thinkingLevel: "high" (optional - minimal, low, medium, high)
personGeneration: "ALLOW_ALL" (optional - ALLOW_ALL, ALLOW_ADULT, ALLOW_NONE)
seed: 42 (optional - for reproducible results)

Returns a session ID for iterative editing.

gemini-continue-image-edit

Continue refining an image:

sessionId: "edit-123456789"
prompt: "add snow on the roof and make it nighttime"

gemini-end-image-edit

Close an editing session:

sessionId: "edit-123456789"

gemini-list-image-sessions

List all active editing sessions.

gemini-generate-video

Generate videos using Veo:

prompt: "a cat playing piano"
aspectRatio: "16:9" (optional)
negativePrompt: "blurry, text" (optional)

Video generation is async (takes 1-5 minutes). Use gemini-check-video to poll.

gemini-check-video

Check video generation status and download when complete:

operationId: "operations/xxx-xxx-xxx"

gemini-analyze-code

Analyze code for issues:

code: "function foo() { ... }"
language: "typescript" (optional)
focus: "quality" | "security" | "performance" | "bugs" | "general"

gemini-analyze-text

Analyze text content:

text: "Your text here..."
type: "sentiment" | "summary" | "entities" | "key-points" | "general"

gemini-brainstorm

Collaborative brainstorming:

prompt: "How could we implement real-time collaboration?"
claudeThoughts: "I think we should use WebSockets..."
maxRounds: 3 (optional)

gemini-summarize

Summarize content:

content: "Long text to summarize..."
length: "brief" | "moderate" | "detailed"
format: "paragraph" | "bullet-points" | "outline"

gemini-run-code

Let Gemini write and execute Python code:

prompt: "Calculate the first 50 prime numbers and plot them"
data: "optional CSV data to analyze" (optional)

Supports libraries: numpy, pandas, matplotlib, scipy, scikit-learn, tensorflow, and more. Generated charts are saved to the output directory and returned as images.

gemini-search

Real-time web search with citations:

query: "What happened in tech news this week?"
returnCitations: true (default)

Returns grounded responses with inline citations and source URLs.

gemini-structured

Get JSON responses matching a schema:

prompt: "Extract the meeting details from this email..."
schema: '{"type":"object","properties":{"date":{"type":"string"},"attendees":{"type":"array"}}}'
useGoogleSearch: false (optional)

gemini-extract

Convenience tool for common extraction patterns:

text: "Your text to analyze..."
extractType: "entities" | "facts" | "summary" | "keywords" | "sentiment" | "custom"
customFields: "name, date, amount" (for custom extraction)

gemini-youtube

Analyze YouTube videos directly:

url: "https://www.youtube.com/watch?v=..."
question: "What happens at 2:30?"
startTime: "1m30s" (optional, for clipping)
endTime: "5m00s" (optional, for clipping)

gemini-youtube-summary

Quick video summarization:

url: "https://www.youtube.com/watch?v=..."
style: "brief" | "detailed" | "bullet-points" | "chapters"

gemini-analyze-document

Analyze PDFs and documents:

filePath: "/path/to/document.pdf"
question: "Summarize the key findings"
mediaResolution: "low" | "medium" | "high"

gemini-summarize-pdf

Quick PDF summarization:

filePath: "/path/to/document.pdf"
style: "brief" | "detailed" | "outline" | "key-points"

gemini-extract-tables

Extract tables from documents:

filePath: "/path/to/document.pdf"
outputFormat: "markdown" | "csv" | "json"

Workflow: Claude + Gemini

The killer combination for development:

| Claude | Gemini | |--------|--------| | Complex logic | Frontend/UI | | Architecture | Visual components | | Backend code | Image generation | | Integration | React/CSS styling | | Reasoning | Creative generation |

Example workflow:

Ask Claude to design the backend API
Use gemini-generate-image for UI mockups
Ask Gemini to generate React components via gemini-query
Use multi-turn editing to refine visuals
Let Claude wire everything together

Environment Variables

| Variable | Required | Default | Description | |-------------------------|----------|------------------------------|-------------------------------| | GEMINI_API_KEY | Yes | - | Your Google Gemini API key | | GEMINI_OUTPUT_DIR | No | ./gemini-output | Where to save generated files | | GEMINI_MODEL | No | - | Override model for init test | | GEMINI_PRO_MODEL | No | gemini-3-pro-preview | Pro model (Gemini 3) | | GEMINI_FLASH_MODEL | No | gemini-3-flash-preview | Flash model (Gemini 3) | | GEMINI_IMAGE_MODEL | No | gemini-3-pro-image-preview | Image model (Nano Banana Pro) | | GEMINI_IMAGE_THINKING_LEVEL | No | high | Default thinking level for image generation (minimal, low, medium, high) | | GEMINI_VIDEO_MODEL | No | veo-2.0-generate-001 | Video model | | VERBOSE | No | false | Enable verbose logging | | QUIET | No | false | Minimize logging | | GEMINI_ENABLED_TOOLS | No | - | Comma-separated list of tool groups to load (e.g., query,search,image-gen) | | GEMINI_TOOL_PRESET | No | - | Preset profile: minimal, text, image, research, media, full |

Tool Configuration

By default, all 37 tools are loaded. To reduce context usage, configure which tools to load:

Available Presets

| Preset | Tool Groups | |--------|-------------| | minimal | query, brainstorm | | text | query, brainstorm, analyze, summarize, structured | | image | query, image-gen, image-edit, image-analyze | | research | query, search, deep-research, url-context, document | | media | query, image-gen, image-edit, image-analyze, video-gen, youtube, speech | | full | All 18 tool groups (default) |

Using Presets

# Minimal - query and brainstorm
GEMINI_TOOL_PRESET=minimal

# Text processing
GEMINI_TOOL_PRESET=text  # query, brainstorm, analyze, summarize, structured

# Image workflows
GEMINI_TOOL_PRESET=image  # query, image-gen, image-edit, image-analyze

# Research workflows
GEMINI_TOOL_PRESET=research  # query, search, deep-research, url-context, document

Using Explicit Tool Lists

# Only specific tools
GEMINI_ENABLED_TOOLS=query,search,image-gen

Combining Preset + Explicit

# Start with preset, add extras
GEMINI_TOOL_PRESET=minimal
GEMINI_ENABLED_TOOLS=search,image-gen  # Adds to minimal preset

Available Tool Groups

| Group | Tools | |-------|-------| | query | gemini-query | | brainstorm | gemini-brainstorm | | analyze | gemini-analyze-code, gemini-analyze-text | | summarize | gemini-summarize | | image-gen | gemini-generate-image, gemini-image-prompt | | image-edit | gemini-start-image-edit, gemini-continue-image-edit, gemini-end-image-edit, gemini-list-image-sessions | | video-gen | gemini-generate-video, gemini-check-video | | code-exec | gemini-run-code | | search | gemini-search | | structured | gemini-structured, gemini-extract | | youtube | gemini-youtube, gemini-youtube-summary | | document | gemini-analyze-document, gemini-summarize-pdf, gemini-extract-tables | | url-context | gemini-analyze-url, gemini-compare-urls, gemini-extract-from-url | | cache | gemini-create-cache, gemini-query-cache, gemini-list-caches, gemini-delete-cache | | speech | gemini-speak, gemini-dialogue, gemini-list-voices | | token-count | gemini-count-tokens | | deep-research | gemini-deep-research, gemini-check-research, gemini-research-followup | | image-analyze | gemini-analyze-image |

Manual Installation

Global Install

# Using npm
npm install -g @rlabs-inc/gemini-mcp

# Using bun
bun install -g @rlabs-inc/gemini-mcp

Claude Code Configuration

{
  "gemini": {
    "command": "npx",
    "args": ["-y", "@rlabs-inc/gemini-mcp"],
    "env": {
      "GEMINI_API_KEY": "your-api-key",
      "GEMINI_OUTPUT_DIR": "/path/to/save/files"
    }
  }
}

Troubleshooting

Rate Limits (429 Errors)

If you're hitting rate limits on the free tier:

Set GEMINI_MODEL=gemini-3-flash-preview to use Flash for init (higher limits)
Or upgrade to a paid plan

Connection Issues

Verify your API key at Google AI Studio
Check server status: claude mcp list
Try with verbose logging: VERBOSE=true

Image/Video Issues

Ensure your API key has access to image/video generation
Check output directory permissions
Files save to GEMINI_OUTPUT_DIR (default: ./gemini-output)
For 4K images, generation takes longer

Previous Versions

0.7.2

Beautiful CLI with Themes! Use Gemini directly from your terminal:

# Install globally
npm install -g @rlabs-inc/gemini-mcp

# Set your API key once
gcli config set api-key YOUR_KEY

# Generate images, videos, search, research, and more!
gcli image "a cat astronaut" --size 4K
gcli search "latest AI news"
gcli research "quantum computing applications" --wait
gcli speak "Hello world" --voice Puck

5 Beautiful Themes: terminal, neon, ocean, forest, minimal

CLI Commands:

gcli query - Direct Gemini queries with thinking levels
gcli search - Real-time web search with citations
gcli research - Deep research agent
gcli image - Generate images (up to 4K)
gcli video - Generate videos with Veo
gcli speak - Text-to-speech with 30 voices
gcli tokens - Count tokens and estimate costs
gcli config - Manage settings

v0.6.x: Deep Research, Token Counting, TTS, URL analysis, Context Caching v0.5.x: 30+ tools, YouTube analysis, Document analysis v0.4.x: Code execution, Google Search v0.3.x: Thinking levels, Structured output, 4K images v0.2.x: Image/Video generation with Veo

Development

git clone https://github.com/rlabs-inc/gemini-mcp.git
cd gemini-mcp
bun install
bun run build
bun run dev -- --verbose

Scripts

| Command | Description | |---------------------|-----------------------------| | bun run build | Build for production | | bun run dev | Development mode with watch | | bun run typecheck | Type check without emitting | | bun run format | Format with Prettier | | bun run lint | Lint with ESLint |

License

MIT License

Made with Claude + Gemini working together