Content Blueprint

Generate Images With AI as Creative Director

AgriciDaniel · May 2026

A Claude Code skill that orchestrates Google's Gemini Nano Banana models for image generation, editing, and visual asset creation with intent-based prompt engineering.

Image Generation · Image Editing · Multi-Turn Sessions · Prompt Engineering · Batch Workflows

Install Command

Run this command to deploy the blueprint to your environment.

What problem does this solve?

Turns Claude into a Creative Director that interprets your visual intent instead of passing raw text to an image model. It analyzes what you actually need, selects the right domain expertise (Cinema, Product, Portrait, Editorial, UI, Logo, Landscape, Infographic, Abstract), constructs optimized prompts using Google's 5-component formula, and orchestrates Gemini for the best possible results. Without this intelligence, you either get mediocre results from vague prompts or spend hours manually engineering every detail.

How does it work?

Once installed, you use the `/banana` command with variants for different tasks:

1. For image generation, describe your intent. Claude asks clarifying questions about use case and brand context, then constructs a detailed 5-component prompt (Subject → Action → Location/Context → Composition → Style) and generates the image.
2. For editing, provide an image path and edit instructions. Claude rephrases them into specific transformations (edge-preserving background removal, color shifts, outpainting).
3. For multi-turn sessions, use `/banana chat` to refine characters or styles across iterations while maintaining consistency.
4. For batch variations, `/banana batch` generates N distinct versions by rotating different prompt components.

Additional modes: `/banana inspire` to browse 2,500+ prompt examples, `/banana preset` to manage brand/style templates, and `/banana cost` to track API spend.

Output: image file, crafted prompt, settings, and refinement suggestions.
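The 5-component ordering described above can be illustrated with a small sketch. This is not the skill's actual code: the `PromptComponents` class, its field names, and `build_prompt` are hypothetical, and the sentence layout is only one plausible way to assemble the components.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptComponents:
    """The five components named in the text; field names are illustrative."""
    subject: str
    action: str
    location: str
    composition: str
    style: str


def build_prompt(c: PromptComponents) -> str:
    """Join components in Subject, Action, Location, Composition, Style order."""
    # Subject, action, and location read naturally as one scene sentence.
    scene = " ".join(p.strip() for p in (c.subject, c.action, c.location) if p.strip())
    parts = [scene, c.composition, c.style]
    # Skip empty components so a missing field never leaves a dangling separator.
    return ". ".join(p.strip().rstrip(".") for p in parts if p.strip()) + "."


example = PromptComponents(
    subject="A ceramic pour-over coffee set",
    action="steaming gently",
    location="on a walnut counter in morning light",
    composition="45-degree product shot, shallow depth of field",
    style="Clean editorial photography, muted warm palette",
)
prompt = build_prompt(example)
print(prompt)
```

Keeping the components as separate fields, rather than one free-form string, is what makes it possible to swap a single component later (for example in a batch run) without rewriting the whole prompt.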

What's the biggest win?

Professional-quality images on the first try, because Claude understands intent and applies domain expertise to construct the prompt instead of you guessing how to describe it.

What should I know technically?

Requires Node.js 18+ for MCP setup (it uses native fetch for API calls). A Google AI API key is free but rate-limited. Fallback scripts (generate.py, edit.py) bypass MCP and call the Gemini REST API directly if MCP is unavailable. ImageMagick 7 (the `magick` command) is preferred for post-processing; the skill falls back to ImageMagick 6 (the `convert` command) if v7 is absent.
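The ImageMagick 7 to 6 fallback can be sketched as a lookup on the PATH. The function names `find_imagemagick` and `crop_command` are hypothetical and the skill may detect the binary differently; only `shutil.which` and the ImageMagick options `-gravity`, `-crop`, and `+repage` are taken as given.

```python
import shutil
from typing import Callable, Optional


def find_imagemagick(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[list[str]]:
    """Return the ImageMagick command prefix, or None if it is not installed.

    Prefers the v7 `magick` entry point and falls back to the v6 `convert` binary.
    Note: on Windows, `convert` can resolve to an unrelated system utility.
    """
    if which("magick"):
        return ["magick"]
    if which("convert"):
        return ["convert"]
    return None


def crop_command(prefix: list[str], src: str, dst: str, width: int, height: int) -> list[str]:
    """Build (but do not run) a centre-crop command usable with either version."""
    return [*prefix, src, "-gravity", "center", "-crop", f"{width}x{height}+0+0", "+repage", dst]
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`. The `which` parameter exists only so the lookup can be exercised without either binary installed.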

What should I watch out for?

The Google Gemini free tier has strict rate limits (5-15 requests per minute, 20-500 per day as of Dec 2025), a 92% reduction from previous limits. ImageMagick is optional for generation but required for post-processing (background removal, cropping, format conversion). Text rendering in generated images can be unreliable; keep text under 25 characters and test before publishing. Safety filters block certain content; when a prompt is blocked, use the rephrase strategies in the prompt-engineering reference.
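One way to stay under a per-minute limit is to throttle on the client side. The sketch below is a generic sliding-window limiter, not something the blueprint is known to ship; `SlidingWindowLimiter` and `text_fits` are hypothetical names, and the limit of 5 calls per minute in the usage note is just the low end of the range quoted above.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most `max_calls` within any `window_seconds` span."""

    def __init__(self, max_calls: int, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock
        self._stamps = deque()  # send times of recent requests, oldest first

    def wait_time(self) -> float:
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Forget calls that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            return 0.0
        return self.window_seconds - (now - self._stamps[0])

    def record(self) -> None:
        """Note that a request was just sent."""
        self._stamps.append(self.clock())


def text_fits(label: str, limit: int = 25) -> bool:
    """True when text intended for rendering is under the character limit."""
    return len(label) < limit
```

A caller would create `SlidingWindowLimiter(max_calls=5)`, then before each request run `time.sleep(limiter.wait_time())` followed by `limiter.record()`. This only smooths the per-minute rate; the daily quota still needs separate tracking.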

Key Features

Creative Director Orchestration

Claude interprets intent, selects domain mode, and constructs 5-component prompts instead of passing raw user text to Gemini

9 Domain Modes

Cinema, Product, Portrait, Editorial, UI/Web, Logo, Landscape, Infographic, Abstract. Each has specialized prompt construction and composition rules

Image Editing & Multi-Turn Sessions

Intelligent editing that preserves details, and `/banana chat` for iterative refinement with style consistency across turns

Brand Presets & Batch Workflows

Save and load style presets, generate N variations with component rotation, and track costs via built-in logging
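Component rotation for batch variations might look like the following sketch, which changes one prompt component at a time in round-robin order. The function name `rotate_variations` and the dictionary layout are assumptions for illustration; the source does not describe how the skill actually rotates components.

```python
def rotate_variations(
    base: dict[str, str],
    alternatives: dict[str, list[str]],
    n: int,
) -> list[dict[str, str]]:
    """Return up to n variants of `base`, each changing exactly one component.

    Components are visited round-robin, so early variants differ along
    different axes instead of exhausting one component first.
    Assumes every alternative differs from the base value it replaces.
    """
    variations: list[dict[str, str]] = []
    longest = max((len(options) for options in alternatives.values()), default=0)
    for round_index in range(longest):
        for component, options in alternatives.items():
            if len(variations) == n:
                return variations
            if round_index < len(options):
                variations.append({**base, component: options[round_index]})
    return variations


base = {"subject": "a red bicycle", "composition": "wide shot", "style": "film photo"}
alternatives = {
    "composition": ["close-up", "overhead view"],
    "style": ["watercolor"],
}
for variant in rotate_variations(base, alternatives, 3):
    print(variant)
```

Changing a single component per variant keeps each result comparable to the base image, which makes it easier to tell which change produced which effect.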

14 Aspect Ratios & 4K Output

Supports aspect ratios including 1:1, 16:9, 9:16, and 21:9 (ultra-wide cinematic), with resolution control up to 4096x4096 and post-processing pipelines
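As a rough illustration of how an aspect ratio maps to pixel dimensions under a 4096-pixel cap, the arithmetic can be written out as below. This is plain geometry under the assumption that the long edge is pinned to the cap; `dimensions_for` is a hypothetical helper, and the sizes the model actually returns for each ratio may differ.

```python
def dimensions_for(ratio: str, long_edge: int = 4096) -> tuple[int, int]:
    """Return (width, height) for a 'W:H' ratio with the longer side set to long_edge."""
    w, h = (int(part) for part in ratio.split(":"))
    if w <= 0 or h <= 0:
        raise ValueError(f"ratio terms must be positive: {ratio!r}")
    if w >= h:
        # Landscape or square: width takes the cap, height is scaled down.
        return long_edge, round(long_edge * h / w)
    # Portrait: height takes the cap, width is scaled down.
    return round(long_edge * w / h), long_edge


for r in ("1:1", "16:9", "9:16", "21:9"):
    print(r, dimensions_for(r))
```

For 16:9 this gives 4096 by 2304, since 4096 × 9 / 16 = 2304; only the 1:1 ratio reaches the full 4096x4096.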

Tools in this Blueprint

Google Gemini Nano Banana models
Claude
ImageMagick
FFmpeg

About This Blueprint

License: MIT
Industry: Technology
Skills: 0 workflows, 0 sub-skills, 1 standalone