Generate Images With Creative Director Intelligence
AI image generation skill where Claude acts as Creative Director using Gemini Nano models, with 5-component prompt engineering and 9 domain expertise modes.
Content Blueprint
A Claude Code skill that orchestrates Gemini Nano for image generation, editing, and visual asset creation with intent-based prompt engineering.
Turns Claude into a Creative Director that interprets your visual intent instead of passing raw text to an image model. It analyzes what you actually need, selects the right domain expertise (Cinema, Product, Portrait, Editorial, UI, Logo, Landscape, Infographic, Abstract), constructs optimized prompts using Google's 5-component formula, and orchestrates Gemini for the best possible results. Without this intelligence, you either get mediocre results from vague prompts or spend hours manually engineering every detail.
Once installed, you use the `/banana` command with variants for different tasks.
1. For image generation, describe your intent. Claude asks clarifying questions about use case and brand context, constructs a detailed 5-component prompt (Subject, Action, Location/Context, Composition, Style), and generates the image.
2. For editing, provide an image path and edit instructions. Claude rephrases them into specific transformations (edge-preserving background removal, color shifts, outpainting).
3. For multi-turn sessions, use `/banana chat` to refine characters or styles across iterations while maintaining consistency.
4. For batch variations, `/banana batch` generates N distinct versions by rotating different prompt components.
Additional modes: `/banana inspire` to browse 2,500+ prompt examples, `/banana preset` to manage brand/style templates, `/banana cost` to track API spend.
Output: image file, crafted prompt, settings, and refinement suggestions.
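To make the 5-component formula concrete, here is a minimal Python sketch of how the five parts could be assembled into one prompt string. The `PromptComponents` class, its field names, and the comma-joined output are illustrative assumptions, not the skill's actual internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptComponents:
    """Hypothetical container for the five parts of the prompt formula."""
    subject: str
    action: str
    location: str
    composition: str
    style: str

    def render(self) -> str:
        # Join the parts in formula order, skipping any left empty.
        parts = [self.subject, self.action, self.location,
                 self.composition, self.style]
        return ", ".join(p.strip() for p in parts if p.strip())


prompt = PromptComponents(
    subject="a ceramic pour-over coffee set",
    action="steam rising from a freshly filled cup",
    location="on a walnut counter in a sunlit kitchen",
    composition="45-degree overhead shot with shallow depth of field",
    style="clean commercial product photography",
)
print(prompt.render())
```

Keeping the components as separate fields, rather than one free-form string, is what would let a batch mode swap a single part while holding the others fixed.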
Professional-quality images on the first try, because Claude understands intent and applies domain expertise to construct the prompt instead of you guessing how to describe it.
Requires Node.js 18+ for MCP setup (it uses native fetch for API calls). The Google AI API key is free but rate-limited. Fallback scripts (generate.py, edit.py) bypass MCP and call the Gemini REST API directly if MCP is unavailable. ImageMagick 7 (the magick command) is preferred for post-processing; the skill falls back to ImageMagick 6 (the convert command) if v7 is absent.
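The ImageMagick fallback described above can be sketched in a few lines of Python. The function name `find_imagemagick` and the injectable `which` parameter are assumptions made for illustration and testing; the skill's own detection logic may differ.

```python
import shutil
from typing import Callable, Optional


def find_imagemagick(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return 'magick' (v7) if on PATH, else 'convert' (v6), else None."""
    # Order matters: prefer the v7 entry point over the legacy v6 one.
    for name in ("magick", "convert"):
        if which(name):
            return name
    return None


command = find_imagemagick()
if command is None:
    print("ImageMagick not found; post-processing would be skipped")
else:
    print(f"Using ImageMagick via '{command}'")
```

One caveat: on Windows, `convert` can resolve to an unrelated system utility, so a real check would probably also confirm the binary's version output before using it.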
Google's Gemini free tier has strict rate limits: 5-15 requests per minute and 20-500 per day as of Dec 2025, a 92% reduction from the previous limits. ImageMagick is optional for generation but required for post-processing (background removal, cropping, format conversion). Text rendering in generated images can be unreliable; keep text under 25 characters and test before publishing. Safety filters block certain content; when a prompt is blocked, use the rephrase strategies in the prompt-engineering reference.
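Given per-minute limits this tight, a caller may want to pace requests on the client side. The following is a minimal sliding-window sketch in Python; the `SlidingWindowLimiter` class is hypothetical, and the limit of 5 calls per 60 seconds is only the low end of the range quoted above, not a value read from the API.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any rolling window of period seconds."""

    def __init__(self, max_calls: int, period: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.calls: Deque[float] = deque()

    def wait_time(self) -> float:
        """Seconds to wait before the next call is allowed (0.0 if none)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.period - (now - self.calls[0])

    def record(self) -> None:
        """Register that a call was just made."""
        self.calls.append(self.clock())


limiter = SlidingWindowLimiter(max_calls=5, period=60.0)
delay = limiter.wait_time()
if delay > 0:
    time.sleep(delay)
limiter.record()  # the API request itself would go here
```

The clock is injectable so the pacing logic can be exercised without real sleeping. A daily quota would need a second, longer window or a persisted counter.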
Creative Director Orchestration
Claude interprets intent, selects domain mode, and constructs 5-component prompts instead of passing raw user text to Gemini
9 Domain Modes
Cinema, Product, Portrait, Editorial, UI/Web, Logo, Landscape, Infographic, Abstract. Each has specialized prompt construction and composition rules
Image Editing & Multi-Turn Sessions
Intelligent editing that preserves details, plus `/banana chat` for iterative refinement with style consistency across turns
Brand Presets & Batch Workflows
Save and load style presets, generate N variations with component rotation, and track costs via built-in logging
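One plausible reading of "component rotation" is that each variation changes a single prompt component while the rest stay fixed. The Python sketch below illustrates that idea only; `batch_variations`, the `ORDER` tuple, and the alternatives dictionary are hypothetical names, and the skill may rotate components differently.

```python
from typing import Dict, List

# Component order of the 5-part formula.
ORDER = ("subject", "action", "location", "composition", "style")


def batch_variations(base: Dict[str, str],
                     alternatives: Dict[str, List[str]],
                     n: int) -> List[str]:
    """Build n prompts, each swapping one component of base for an alternative."""
    rotatable = [key for key in ORDER if alternatives.get(key)]
    if not rotatable:
        raise ValueError("at least one component needs alternatives")
    prompts = []
    for i in range(n):
        # Cycle through the rotatable components, then through their options.
        key = rotatable[i % len(rotatable)]
        options = alternatives[key]
        choice = options[(i // len(rotatable)) % len(options)]
        variant = {**base, key: choice}
        prompts.append(", ".join(variant[k] for k in ORDER))
    return prompts


base = {
    "subject": "a trail running shoe",
    "action": "splashing through a puddle",
    "location": "on a forest path at dawn",
    "composition": "low-angle close-up",
    "style": "high-contrast sports photography",
}
alternatives = {
    "composition": ["wide shot with leading lines", "top-down flat lay"],
    "style": ["soft pastel illustration", "moody cinematic grade"],
}
for line in batch_variations(base, alternatives, 4):
    print(line)
```

The results are only distinct while n stays within the total number of supplied alternatives; beyond that the rotation repeats.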
14 Aspect Ratios & 4K Output
Supports 14 aspect ratios, including 1:1, 16:9, 9:16, and 21:9 ultra-wide cinematic, with resolution control up to 4096x4096 and post-processing pipelines
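The relationship between an aspect ratio and the 4096-pixel ceiling is simple arithmetic, sketched below in Python. The `dimensions` helper is hypothetical, and it assumes the longer edge is the one pinned to the cap; the underlying API may instead take a ratio name and a size tier rather than explicit pixel dimensions.

```python
from typing import Tuple


def dimensions(ratio: str, long_edge: int = 4096,
               max_edge: int = 4096) -> Tuple[int, int]:
    """Return (width, height) for a 'W:H' ratio with the long edge capped."""
    w, h = (int(part) for part in ratio.split(":"))
    if w <= 0 or h <= 0:
        raise ValueError(f"invalid aspect ratio: {ratio!r}")
    long_edge = min(long_edge, max_edge)
    if w >= h:
        # Landscape or square: width is the long edge.
        return long_edge, round(long_edge * h / w)
    # Portrait: height is the long edge.
    return round(long_edge * w / h), long_edge


for r in ("1:1", "16:9", "9:16", "21:9"):
    print(r, dimensions(r))
```

For example, 16:9 at the cap works out to 4096x2304, and 9:16 is the same pair transposed.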