nano-banana

nilecui/nano-banana

Design

26 installs

About

Generate and edit high-quality AI images using Google's Gemini 3 Pro Image model (Nano Banana Pro) via MCP...

SKILL.md

Nano Banana Pro - AI Image Generation

Generate 4K images, edit photos, and create graphics with accurate text rendering using Google's Gemini 3 Pro Image model via MCP.

When to Use

Invoke when the user:

Asks to "generate an image" or "create a picture"
Wants to "edit this photo" or "modify this image"
Needs graphics with text (logos, infographics, diagrams)
Requests "consistent characters" across multiple images
Says "visualize this" or "make me a [visual thing]"

Prerequisites

1. Gemini API Key

Get a free API key from Google AI Studio:

Sign in with Google account
Click "Get API Key" → "Create API Key"
Copy and save securely

2. MCP Server Setup

Recommended: NanoBanana-MCP (uses Gemini 3 Pro for highest quality)

# Quick install via Claude Code CLI
claude mcp add nano-banana --env GEMINI_API_KEY=your-key-here -- npx -y nanobanana-mcp

Or add to ~/.claude/settings.json manually:

{
  "mcpServers": {
    "nano-banana": {
      "command": "npx",
      "args": ["-y", "nanobanana-mcp"],
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}
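
If you would rather generate this entry than hand-edit it, a minimal Python sketch might look like the following. The helper name `build_mcp_entry` is hypothetical and not part of NanoBanana-MCP; it only reproduces the structure shown above and refuses the placeholder key.

```python
import json


def build_mcp_entry(api_key: str) -> dict:
    """Return the mcpServers entry shown above with the given key filled in."""
    # Guard against pasting the placeholder from the docs by mistake.
    if not api_key or api_key.startswith("your-"):
        raise ValueError("Provide a real Gemini API key, not the placeholder")
    return {
        "mcpServers": {
            "nano-banana": {
                "command": "npx",
                "args": ["-y", "nanobanana-mcp"],
                "env": {"GEMINI_API_KEY": api_key},
            }
        }
    }


# Demo with a dummy value; in practice read the key from a secure source.
print(json.dumps(build_mcp_entry("demo-key-for-illustration"), indent=2))
```

The printed JSON can then be merged into the config file by hand, so the existing entries in that file are not overwritten.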

Alternative: Nano-Banana-MCP by ConechoAI (Gemini 2.5 Flash - faster, lower cost)

{
  "mcpServers": {
    "nano-banana": {
      "command": "npx",
      "args": ["nano-banana-mcp"],
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}

Available Tools

Once MCP is configured, these tools become available:

Core Tools

Tool	Purpose	Key Parameters
`gemini_generate_image`	Create new images from text prompts	`prompt`, `model`, `aspectRatio`, `imageSize`
`gemini_edit_image`	Modify existing images with instructions	`imagePath`, `instructions`, `model`
`continue_editing`	Refine the last generated image	`instructions`
`get_image_history`	List all generated images in session	-

Model Options

Model ID	Description
`gemini-3-pro-image-preview`	Default. Highest quality, 4K support, best text rendering
`gemini-2.0-flash-exp`	Faster generation, good quality, lower cost
`gemini-2.0-flash-preview-image-generation`	Alternative 2.0 model

Image Size (Gemini 3 only)

Size	Use Case
`4K`	Final assets, print, marketing materials
`2K`	Balanced quality and speed
`1K`	Fast iteration, prototyping
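
As a rough illustration of how the parameters, model IDs, and sizes above fit together, the sketch below assembles an arguments dict for `gemini_generate_image`. Only the parameter names, model IDs, and size values come from the tables; the helper `build_generate_args`, its validation rules, and the `"16:9"` aspect-ratio format are assumptions made for the example.

```python
GEMINI_3_MODEL = "gemini-3-pro-image-preview"
VALID_SIZES = ("1K", "2K", "4K")


def build_generate_args(prompt, model=GEMINI_3_MODEL, aspect_ratio=None, image_size=None):
    """Assemble an arguments dict for gemini_generate_image.

    The checks here are a local convenience; the server is not known
    to enforce them in exactly this way.
    """
    if not prompt or not prompt.strip():
        raise ValueError("prompt must not be empty")
    args = {"prompt": prompt.strip(), "model": model}
    if aspect_ratio is not None:
        args["aspectRatio"] = aspect_ratio  # value format is an assumption
    if image_size is not None:
        if image_size not in VALID_SIZES:
            raise ValueError(f"imageSize must be one of {VALID_SIZES}")
        if model != GEMINI_3_MODEL:
            # The size table above is marked "Gemini 3 only".
            raise ValueError("imageSize applies to the Gemini 3 model only")
        args["imageSize"] = image_size
    return args


# 1K while iterating, 4K for the final asset.
draft = build_generate_args("A cozy coffee shop interior", image_size="1K")
final = build_generate_args("A cozy coffee shop interior", aspect_ratio="16:9", image_size="4K")
print(draft)
print(final)
```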

Advanced Features

Feature	Capability
4K Output	Up to 5632×3072 pixels
Text Rendering	Accurate text in images (signs, labels, UI)
Multi-Image Composition	Combine up to 14 reference images
Character Consistency	Maintain same character across 5+ images
Google Search Grounding	Real-world accurate imagery

Prompting Best Practices

Structure Your Prompts

[Subject] + [Style] + [Details] + [Technical Specs]

Example:

"A cozy coffee shop interior, watercolor illustration style, warm lighting, wooden furniture, steaming cup on table, 4K resolution, soft morning light through windows"

For Best Results

Be Specific - Include colors, materials, lighting, mood
Specify Style - "photorealistic", "oil painting", "3D render", "anime"
Add Context - Time of day, weather, setting
Request Resolution - "4K", "high resolution", "detailed"
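
The [Subject] + [Style] + [Details] + [Technical Specs] structure can be sketched as a small helper that joins the four parts in order. The function name `compose_prompt` is hypothetical; the comma-separated output simply mirrors the coffee shop example above.

```python
def compose_prompt(subject, style, details=(), technical_specs=()):
    """Join subject, style, details, and technical specs into one prompt string."""
    parts = [subject, style, *details, *technical_specs]
    # Skip empty pieces so optional sections do not leave stray commas.
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = compose_prompt(
    "A cozy coffee shop interior",
    "watercolor illustration style",
    details=["warm lighting", "wooden furniture", "steaming cup on table"],
    technical_specs=["4K resolution"],
)
print(prompt)
```

Keeping the parts separate makes it easy to swap only the style or only the technical specs between attempts.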

Precision Mode (JSON Prompting)

For high-stakes work that needs tight, repeatable control over individual elements, use structured JSON schemas.

When to Activate

Trigger phrases:

"I need exact control over..."
"Create a product shot for [brand]..."
"Generate a UI mockup..."
"Make an infographic showing..."
"I want to iterate on just the lighting..."
"A/B test different versions..."

Three Schema Types

Type	Use Case	Key Controls
`marketing_image`	Product shots, hero images	subject, props, lighting, camera, brand locks
`ui_builder`	App screens, dashboards	tokens, screens, containers, components
`diagram_spec`	Flowcharts, infographics	nodes, edges, data constraints

The Translator Workflow

Describe - User explains what they want in plain English
Clarify - Claude asks targeted questions for missing fields
Generate - Claude outputs structured JSON schema
Review - User checks key fields match intent
Render - JSON converts to precise prompt for Nano Banana Pro
Iterate - Modify specific fields, re-render (scoped changes)

Example: Product Shot

User: "I need a hero shot for Aurora Lime seltzer"

Claude asks: "For the Aurora Lime hero shot:

Can size? (12oz standard?)
Props? (lime slices, ice, condensation?)
Background style? (solid color, gradient, bokeh?)
Lighting mood? (bright/refreshing or moody/premium?)"

Result: Structured JSON with exact specifications that can be iterated field-by-field.
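
To make the "structured JSON" result concrete, here is a hypothetical `marketing_image` spec for the Aurora Lime example and a sketch of the Render step. The field names follow the key controls listed in the schema table, but the exact layout, the values, and the `render_prompt` helper are assumptions; the real templates live in references/schemas/.

```python
# Hypothetical spec; values are illustrative answers to the clarifying questions.
spec = {
    "type": "marketing_image",
    "subject": "12oz can of Aurora Lime seltzer",
    "props": ["lime slices", "ice", "condensation"],
    "background": "soft gradient",
    "lighting": "bright, refreshing",
    "camera": "eye-level, centered",
    "brand_locks": ["keep the label text legible"],
}


def render_prompt(spec: dict) -> str:
    """Flatten a marketing_image spec into one prompt string, field by field."""
    lines = [
        f"Subject: {spec['subject']}",
        f"Props: {', '.join(spec['props'])}",
        f"Background: {spec['background']}",
        f"Lighting: {spec['lighting']}",
        f"Camera: {spec['camera']}",
    ]
    if spec.get("brand_locks"):
        lines.append("Must keep: " + "; ".join(spec["brand_locks"]))
    return ". ".join(lines) + "."


print(render_prompt(spec))
```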

Scoped Edits (The Key Unlock)

A JSON spec lets you change one field and re-render while every other field stays pinned:

Change	What Stays Fixed
Swap lighting direction	Subject, props, background
Try different camera angle	Lighting, props, environment
Change background color	Subject geometry, lighting setup
Add/remove props	Everything else
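
A scoped edit can be sketched as "copy the spec, replace one field, confirm nothing else moved". The helpers `scoped_edit` and `changed_fields` and the nested spec layout are hypothetical. Note that re-rendering still asks the model for a new image; the spec only pins what the prompt says about the untouched fields.

```python
import copy


def scoped_edit(spec, path, value):
    """Return a copy of spec with exactly one (possibly nested) field replaced."""
    updated = copy.deepcopy(spec)
    node = updated
    *parents, leaf = path.split(".")
    for key in parents:
        node = node[key]
    if leaf not in node:
        raise KeyError(f"unknown field: {path}")
    node[leaf] = value
    return updated


def changed_fields(before, after, prefix=""):
    """List the dotted paths whose values differ between two specs."""
    diffs = []
    for key in sorted(before.keys() | after.keys()):
        path = f"{prefix}{key}"
        a, b = before.get(key), after.get(key)
        if isinstance(a, dict) and isinstance(b, dict):
            diffs.extend(changed_fields(a, b, prefix=f"{path}."))
        elif a != b:
            diffs.append(path)
    return diffs


base = {
    "subject": {"name": "seltzer can"},
    "lighting": {"direction": "left", "mood": "bright"},
    "background": {"color": "white"},
}
edited = scoped_edit(base, "lighting.direction", "right")
print(changed_fields(base, edited))
```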

Reference Docs

references/json-prompting.md - Full JSON prompting guide
references/translator-prompt.md - Translator system prompt
references/schemas/ - Template schemas for each type
references/examples-json.md - Filled-out examples

Text in Images

Nano Banana Pro excels at text rendering:

"A vintage movie poster for 'COSMIC ADVENTURE' with bold retro typography, starfield background, astronaut silhouette, 1970s sci-fi aesthetic"

Character Consistency

For consistent characters across images:

Generate initial character with detailed description
Use history:0 reference in subsequent prompts
Describe scene changes while referencing original

First: "A young woman with red curly hair, freckles, green eyes, wearing a blue jacket"
Then: "The same woman from history:0, now sitting at a café, reading a book"
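
The same pattern can be sketched for a series of scenes: one prompt defines the character, and each later prompt points back at it through the `history:0` reference. The helper `character_prompts` and the exact follow-up wording are assumptions modeled on the example above.

```python
def character_prompts(character, scenes, ref="history:0"):
    """Build an initial character prompt plus one follow-up prompt per scene."""
    prompts = [character]
    for scene in scenes:
        # Each follow-up names the reference instead of re-describing the character.
        prompts.append(f"The same character from {ref}, now {scene}")
    return prompts


series = character_prompts(
    "A young woman with red curly hair, freckles, green eyes, wearing a blue jacket",
    ["sitting at a café, reading a book", "walking through a rainy street"],
)
for p in series:
    print(p)
```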

Workflow Examples

Basic Image Generation

User: "Create an image of a futuristic city at sunset"

Claude uses: gemini_generate_image
Prompt: "Futuristic cityscape at golden hour sunset, towering glass skyscrapers with holographic advertisements, flying vehicles, warm orange and purple sky, photorealistic, 4K resolution, cinematic lighting"

Photo Editing

User: "Edit this photo to make it look like winter"

Claude uses: gemini_edit_image
Input: [user's image path]
Instructions: "Transform to winter scene: add snow on ground and surfaces, frost on windows, visible breath, overcast sky, cool blue color grading"

Iterative Refinement

User: "Make the lighting warmer"

Claude uses: continue_editing
Instructions: "Adjust lighting to warmer tones, add golden hour glow, enhance orange/yellow highlights, softer shadows"
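
The three workflows differ mainly in which tool is called and which parameter carries the text. A minimal sketch of that choice follows; the function `plan_call` and its selection rule are assumptions for illustration, while the tool and parameter names are taken from the Core Tools table.

```python
def plan_call(text, image_path=None, refine_last=False):
    """Pick a tool and build its arguments for one user request."""
    if image_path is not None:
        # An existing image was supplied: edit it.
        return "gemini_edit_image", {"imagePath": image_path, "instructions": text}
    if refine_last:
        # No new image, but the user is refining the previous result.
        return "continue_editing", {"instructions": text}
    # Otherwise start from a text prompt.
    return "gemini_generate_image", {"prompt": text}


print(plan_call("Futuristic cityscape at golden hour sunset"))
print(plan_call("Transform to winter scene", image_path="photo.jpg"))
print(plan_call("Adjust lighting to warmer tones", refine_last=True))
```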

Output Management

Images save to: ~/Documents/nanobanana_generated/

Naming format: generated-[timestamp]-[id].png
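
To pick up the most recent result from a script, something like the following sketch could work. It assumes the directory and the `generated-` prefix stated above and orders files by modification time, since the timestamp format inside the filename is not specified here; `latest_image` is a hypothetical helper.

```python
from pathlib import Path

DEFAULT_OUTPUT_DIR = Path.home() / "Documents" / "nanobanana_generated"


def latest_image(output_dir=DEFAULT_OUTPUT_DIR):
    """Return the most recently modified generated-*.png, or None if there is none."""
    output_dir = Path(output_dir)
    if not output_dir.is_dir():
        return None
    images = list(output_dir.glob("generated-*.png"))
    # Use modification time rather than parsing the filename timestamp.
    return max(images, key=lambda p: p.stat().st_mtime, default=None)


print(latest_image())
```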

Security Notes

API keys are stored locally in environment variables
Never commit API keys to version control
Generated images are saved locally; prompts and input images are sent to the Gemini API for processing
Use .env files for key management in projects, and keep them out of version control
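
Before a session it can help to confirm that a key is present in the environment and that the output directory is writable. The sketch below is one hedged way to do that; `preflight` is a hypothetical helper, and a key set only inside the MCP config will not show up in the shell environment it inspects.

```python
import os
from pathlib import Path


def preflight(output_dir, env=os.environ):
    """Return a list of problems found; an empty list means the basics look fine."""
    problems = []
    if not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is not set")
    out = Path(output_dir).expanduser()
    if not out.is_dir():
        problems.append(f"output directory does not exist: {out}")
    elif not os.access(out, os.W_OK):
        problems.append(f"output directory is not writable: {out}")
    return problems


# Report only what is missing; the key itself is never printed.
for problem in preflight("~/Documents/nanobanana_generated"):
    print("warning:", problem)
```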

Model Comparison

Model	Speed	Quality	Cost	Best For
`gemini-3-pro-image-preview`	Slower	Highest (4K)	Higher	Final assets, print, marketing
`gemini-2.0-flash-exp`	Fast	Good	Lower	Prototyping, iteration, drafts

Troubleshooting

Issue	Solution
"API key invalid"	Verify key at AI Studio
"Rate limited"	Wait 60s, or upgrade API tier
"MCP not connected"	Restart Claude Code, check config syntax
"Image not saving"	Check write permissions on output directory

Integration

Works well with:

Artifacts Builder - Generate images for HTML artifacts
Process Mapper - Create diagram visuals
Research to Essay - Add illustrations to content

References

references/prompting-guide.md - Detailed prompting techniques
references/examples.md - Sample prompts by category
