imgx-mcp
The AI you’re already working with generates and edits images — informed by your article, your project, your context. No prompt engineering. No switching services. An MCP server for context-aware image generation and editing. Works with Claude Code, Gemini CLI, Cursor, Windsurf, and any MCP-compatible tool.
Context-aware image generation
When you need an image in the middle of development or content creation, switching to a separate service breaks your context. The article content, the project background, the brand direction — none of it carries over. You end up re-explaining the same information to a different tool.
With imgx-mcp, the AI you’re already working with uses everything it knows — the article’s purpose, target audience, project direction — to generate the right image. Just say “I need a cover image for this article” and the AI determines the composition, style, and prompt. Review and editing happen in the same session.
Key Features
Multi-provider
Use Gemini and OpenAI through a single interface. Switching providers doesn't change how you work. The default model, gemini-2.5-flash-image (Nano Banana), has a free tier (10 RPM / 500 RPD, no credit card required). Start free, then upgrade to paid models like Nano Banana 2 or Nano Banana Pro when you need higher resolution or quality. You can also compare results from different providers using the same prompt.
Text-based editing
No masks needed. Edit images by describing what you want: “darken the background”, “add a person”. Pass an existing image file as input and describe the changes in text.
Iterative editing with edit_last
Use the output of one generation as input for the next, without specifying file paths. Generate, adjust, adjust, done. Iterate within the same session.
Undo/redo for safe experimentation
Go back to any previous state after a series of edits. Undo, then take a different direction from that point. History is managed per session. Switch between multiple image chains and resume where you left off.
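One way to picture the undo/redo behaviour described above is a list of outputs with a cursor. This is a minimal sketch, not imgx-mcp's actual implementation: `EditChain` and its methods are hypothetical names, and discarding undone steps when a new edit arrives is an assumption.

```typescript
// Hypothetical sketch: one image chain with an undo/redo cursor.
// None of these names come from imgx-mcp; they only illustrate the idea.
class EditChain {
  private paths: string[] = [];
  private cursor = -1; // index of the current image, -1 while the chain is empty

  // Record a new output. Steps that were undone are dropped (an assumption),
  // so an edit made after an undo starts a new direction from that point.
  push(path: string): void {
    this.paths = this.paths.slice(0, this.cursor + 1);
    this.paths.push(path);
    this.cursor = this.paths.length - 1;
  }

  // The image an edit_last-style call would take as its input.
  current(): string | undefined {
    return this.cursor >= 0 ? this.paths[this.cursor] : undefined;
  }

  // Step back one image; stays on the first image if there is nothing older.
  undo(): string | undefined {
    if (this.cursor > 0) this.cursor -= 1;
    return this.current();
  }

  // Step forward one image; stays put if nothing was undone.
  redo(): string | undefined {
    if (this.cursor < this.paths.length - 1) this.cursor += 1;
    return this.current();
  }
}
```

Per-session history could then be modelled as one such chain per session ID, with switching sessions amounting to choosing which chain is active.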
You: "Create a cover image for this article"
AI: Reads article content, determines style → generate_image
You: "Make it warmer" → edit_last
You: "Good. Save to images/" → edit_last

Skill: An image production expert in your dev environment
The MCP server gives the AI the ability to generate and edit images. The Skill gives it the knowledge to do it well — so you don’t need to learn prompt syntax, model specifications, or service-specific parameters.
Without the Skill, the AI has powerful tools but limited knowledge of how to use them. With the Skill loaded, the AI draws on a structured body of image production expertise and applies it automatically.
What the Skill brings:
- Automatic prompt construction — You say “I need a cover image.” The AI builds a structured prompt using the Subject-Context-Style framework: identifying what to show, where to place it, and how it should look. No prompt engineering on your part
- 24 editing techniques — Proven approaches for atmosphere, composition, element manipulation, and style transfer. “Make it warmer” or “add depth of field” — the AI selects the right instruction for the model
- Intelligent model selection — Starts with the free model. Suggests paid upgrades only when your needs exceed free tier capabilities — and explains what the upgrade adds. Knows which model has the best text rendering, which supports transparent backgrounds, and which offers the best cost-performance
- Platform-aware sizing — Mention “Twitter OGP” or “App Store screenshot” and the AI selects the correct aspect ratio and resolution. Covers social media, OGP, app stores, print, and blog platforms
- Trending style templates — Ghibli, action figure in box, 3D clay, pixel art, chibi, and more. Say the style name and the AI applies the right prompt structure
- Multi-image consistency — Design tokens and character DNA templates keep slide decks, social media series, and brand assets visually coherent across multiple generations
The image generation models already have these capabilities. The Skill is what makes them accessible without specialized knowledge. It ships with the Plugin and can be installed standalone.
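The Subject-Context-Style framework mentioned above can be pictured as three fields combined into one prompt. The sketch below is an illustration only: the `PromptParts` fields, the `buildPrompt` helper, and the labelled output layout are assumptions, not the format the Skill actually produces.

```typescript
// Hypothetical sketch of a Subject-Context-Style prompt.
// Field names and output layout are assumptions made for illustration.
interface PromptParts {
  subject: string; // what to show
  context: string; // where to place it
  style: string; // how it should look
}

function buildPrompt(parts: PromptParts): string {
  const lines: string[] = [];
  // Add a labelled line only when the part has content.
  const add = (label: string, value: string): void => {
    const trimmed = value.trim();
    if (trimmed.length > 0) lines.push(`${label}: ${trimmed}`);
  };
  add("Subject", parts.subject);
  add("Context", parts.context);
  add("Style", parts.style);
  return lines.join("\n");
}

// Example values are invented for this sketch.
const coverPrompt = buildPrompt({
  subject: "A steaming cup of coffee beside an open laptop",
  context: "On a wooden counter in a small cafe, morning light",
  style: "Warm flat illustration, muted palette",
});
```

With the Skill loaded, the AI would fill in these three parts from the conversation, so the user never writes them by hand.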
Setup
Add the following to your MCP config file.
Claude Code
.mcp.json in your project root:
{
  "mcpServers": {
    "imgx": {
      "command": "npx",
      "args": ["--package=imgx-mcp", "-y", "imgx-mcp"],
      "env": { "GEMINI_API_KEY": "your-key" }
    }
  }
}

Claude Desktop
claude_desktop_config.json (see "Claude Desktop config file location" below):
{
  "mcpServers": {
    "imgx": {
      "command": "npx",
      "args": ["--package=imgx-mcp", "-y", "imgx-mcp"],
      "env": { "GEMINI_API_KEY": "your-key" }
    }
  }
}

Windows: change "command" to "cmd" and "args" to ["/c", "npx", "--package=imgx-mcp", "-y", "imgx-mcp"].
Get your API key from Google AI Studio (free, no credit card required — the default model gemini-2.5-flash-image has a free tier). To use OpenAI, add "OPENAI_API_KEY": "your-key" to env.
For other tools (Gemini CLI, Cursor, Windsurf, etc.), see the usage guide.
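The Windows and OpenAI variations described above can be folded into one small helper that emits the config object. This is a sketch for illustration: `buildImgxConfig` and its options are hypothetical and not part of imgx-mcp; only the command, args, and env keys come from the setup instructions.

```typescript
// Hypothetical helper: builds the "mcpServers" entry shown in the setup section.
interface ServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

interface ConfigOptions {
  geminiKey: string;
  openaiKey?: string; // only needed when using OpenAI models
  windows?: boolean; // true wraps npx in "cmd /c"
}

function buildImgxConfig(opts: ConfigOptions): { mcpServers: { imgx: ServerEntry } } {
  const npxArgs = ["--package=imgx-mcp", "-y", "imgx-mcp"];
  const env: Record<string, string> = { GEMINI_API_KEY: opts.geminiKey };
  if (opts.openaiKey !== undefined) env["OPENAI_API_KEY"] = opts.openaiKey;
  const entry: ServerEntry =
    opts.windows === true
      ? { command: "cmd", args: ["/c", "npx", ...npxArgs], env }
      : { command: "npx", args: npxArgs, env };
  return { mcpServers: { imgx: entry } };
}

// JSON text that could be pasted into the config file on Windows with both providers.
const windowsConfigJson = JSON.stringify(
  buildImgxConfig({ geminiKey: "your-key", openaiKey: "your-key", windows: true }),
  null,
  2,
);
```

The output is the same JSON as the blocks above, so writing the file by hand remains the simplest route; the helper only makes the two switches explicit.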
Claude Desktop config file location
| OS | Path |
|---|---|
| Windows | %APPDATA%\Claude\claude_desktop_config.json |
| macOS | ~/Library/Application Support/Claude/claude_desktop_config.json |
Restart Claude Desktop after editing.
Supported Tools
Works with any MCP-compatible tool, including:
| Tool | Type |
|---|---|
| Claude Code | CLI agent |
| Claude Desktop | Desktop app |
| Gemini CLI | CLI agent |
| Codex CLI | CLI agent |
| Cursor | IDE |
| Windsurf | IDE |
| Cline | IDE extension |
| Continue.dev | IDE extension |
| Zed | Editor |
If your tool supports MCP, the same config works.
MCP Tools
| Tool | Description |
|---|---|
| generate_image | Generate an image from text |
| edit_image | Edit an existing image with text instructions |
| edit_last | Edit the previous output directly (no path needed) |
| undo_edit | Undo the last edit, revert to previous image |
| redo_edit | Redo a previously undone edit |
| edit_history | Show all sessions and their edit history |
| switch_session | Switch to a different session to resume work |
| clear_history | Clear history (per-session or all; only deletes files in managed directories) |
| set_output_dir | Change the default output directory |
| list_providers | List available providers and their capabilities |
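Under the hood, an MCP client invokes these tools with JSON-RPC 2.0 `tools/call` requests. The sketch below shows roughly what that looks like for a generate, edit, undo sequence. The tool names come from the table above, but the `prompt` argument name is an assumption, since the table does not list each tool's parameters, and `toolCall` is a hypothetical helper.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextRequestId = 1;

// Hypothetical helper that wraps a tool name and its arguments in a request.
function toolCall(name: string, args: Record<string, unknown> = {}): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// The "prompt" argument name is assumed for illustration only.
const exampleRequests: ToolCallRequest[] = [
  toolCall("generate_image", { prompt: "Cover image for an article about MCP servers" }),
  toolCall("edit_last", { prompt: "Make it warmer" }),
  toolCall("undo_edit"),
];
```

In normal use the AI client builds and sends these requests itself; you only describe what you want in conversation.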
Providers
| Provider | Model | Capabilities |
|---|---|---|
| Gemini | gemini-2.5-flash-image (Nano Banana) | Default. Free tier (10 RPM / 500 RPD, no credit card). 1024px max, 7 aspect ratios |
| Gemini | gemini-3.1-flash-image-preview (Nano Banana 2) | Paid. Fast, up to 4K, 14 aspect ratios, $0.045–$0.151/image |
| Gemini | gemini-3-pro-image-preview (Nano Banana Pro) | Paid. Highest quality, up to 4K, resolution control, reference image support |
| OpenAI | gpt-image-1 | Paid. Generate, edit, multiple outputs, format selection (PNG/JPEG/WebP), background transparency |
| OpenAI | gpt-image-1.5 | Paid. ~4x faster, 20% cheaper than gpt-image-1, improved text rendering |
| OpenAI | gpt-image-1-mini | Paid. Budget model at $0.005–$0.036/image |
imgx-mcp itself is free and open source (MIT). The default model (Nano Banana) is free to use — no credit card required. Paid models are available when you need higher resolution or quality. You use your own API keys.
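The "start free, upgrade only when needed" approach can be sketched as a lookup over the provider table. This is an illustration under stated assumptions: `pickModel` and the capability flags are hypothetical, only a subset of models is listed, and a capability the table does not mention for a model is treated as absent, which may not match the real model.

```typescript
// Hypothetical capability list, transcribed from the provider table above.
// A capability the table does not list is recorded as false (an assumption).
interface ModelInfo {
  model: string;
  free: boolean;
  upTo4K: boolean;
  transparentBackground: boolean;
}

// Ordered so the free default is tried first.
const MODELS: ModelInfo[] = [
  { model: "gemini-2.5-flash-image", free: true, upTo4K: false, transparentBackground: false },
  { model: "gemini-3.1-flash-image-preview", free: false, upTo4K: true, transparentBackground: false },
  { model: "gemini-3-pro-image-preview", free: false, upTo4K: true, transparentBackground: false },
  { model: "gpt-image-1", free: false, upTo4K: false, transparentBackground: true },
];

interface Needs {
  upTo4K?: boolean;
  transparentBackground?: boolean;
}

// Return the first model that covers every stated need, or undefined if none does.
function pickModel(needs: Needs): string | undefined {
  const match = MODELS.find(
    (m) =>
      (needs.upTo4K !== true || m.upTo4K) &&
      (needs.transparentBackground !== true || m.transparentBackground),
  );
  return match?.model;
}
```

With no special needs the free default is returned; a paid model is chosen only when a requirement such as 4K output or a transparent background rules the default out.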
The OGP images and diagrams on this page were themselves produced by imgx-mcp during the content creation session — prompt generation, image creation, iterative editing, and embedding were all handled as a continuous flow.
Links
- GitHub — Source code and issues
- npm — Package
- Usage guide — Setup details and examples
- Story — Why a coffee shop builds image tools

