Skip to main content

Image Generation

KOPI O Agent generates images from text prompts via FAL.ai. Nine models are supported out of the box, each with different speed, quality, and cost tradeoffs. The active model is user-configurable via kopi tools and persists in config.yaml.

Supported Models

ModelSpeedStrengthsPrice
fal-ai/flux-2/klein/9b (default)<1sFast, crisp text$0.006/MP
fal-ai/flux-2-pro~6sStudio photorealism$0.03/MP
fal-ai/z-image/turbo~2sBilingual EN/CN, 6B params$0.005/MP
fal-ai/nano-banana-pro~8sGemini 3 Pro, reasoning depth, text rendering$0.15/image (1K)
fal-ai/gpt-image-1.5~15sPrompt adherence$0.034/image
fal-ai/gpt-image-2~20sSOTA text rendering + CJK, world-aware photorealism$0.04–0.06/image
fal-ai/ideogram/v3~5sBest typography$0.03–0.09/image
fal-ai/recraft/v4/pro/text-to-image~8sDesign, brand systems, production-ready$0.25/image
fal-ai/qwen-image~12sLLM-based, complex text$0.02/MP

Prices are FAL's pricing at time of writing; check fal.ai for current numbers.

Setup

Nous Subscribers

If you have a paid Nous Portal subscription, you can use image generation through the Tool Gateway without a FAL API key. Your model selection persists across both paths.

If the managed gateway returns HTTP 4xx for a specific model, that model isn't yet proxied on the portal side — the agent will tell you so, with remediation steps (set FAL_KEY for direct access, or pick a different model).

Get a FAL API Key

  1. Sign up at fal.ai
  2. Generate an API key from your dashboard

Configure and Pick a Model

Run the tools command:

kopi tools

Navigate to 🎨 Image Generation, pick your backend (Nous Subscription or FAL.ai), then the picker shows all supported models in a column-aligned table — arrow keys to navigate, Enter to select:

  Model                          Speed    Strengths                    Price
fal-ai/flux-2/klein/9b <1s Fast, crisp text $0.006/MP ← currently in use
fal-ai/flux-2-pro ~6s Studio photorealism $0.03/MP
fal-ai/z-image/turbo ~2s Bilingual EN/CN, 6B $0.005/MP
...

Your selection is saved to config.yaml:

image_gen:
model: fal-ai/flux-2/klein/9b
use_gateway: false # true if using Nous Subscription

GPT-Image Quality

The fal-ai/gpt-image-1.5 and fal-ai/gpt-image-2 request quality is pinned to medium (~$0.034–$0.06/image at 1024×1024). We don't expose the low / high tiers as a user-facing option so that Nous Portal billing stays predictable across all users — the cost spread between tiers is 3–22×. If you want a cheaper option, pick Klein 9B or Z-Image Turbo; if you want higher quality, use Nano Banana Pro or Recraft V4 Pro.

Usage

The agent-facing schema is intentionally minimal — the model picks up whatever you've configured:

Generate an image of a serene mountain landscape with cherry blossoms
Create a square portrait of a wise old owl — use the typography model
Make me a futuristic cityscape, landscape orientation

Aspect Ratios

Every model accepts the same three aspect ratios from the agent's perspective. Internally, each model's native size spec is filled in automatically:

Agent inputimage_size (flux/z-image/qwen/recraft/ideogram)aspect_ratio (nano-banana-pro)image_size (gpt-image-1.5)image_size (gpt-image-2)
landscapelandscape_16_916:91536x1024landscape_4_3 (1024×768)
squaresquare_hd1:11024x1024square_hd (1024×1024)
portraitportrait_16_99:161024x1536portrait_4_3 (768×1024)

GPT Image 2 maps to 4:3 presets rather than 16:9 because its minimum pixel count is 655,360 — the landscape_16_9 preset (1024×576 = 589,824) would be rejected.

This translation happens in _build_fal_payload() — agent code never has to know about per-model schema differences.

Automatic Upscaling

Upscaling via FAL's Clarity Upscaler is gated per-model:

ModelUpscale?Why
fal-ai/flux-2-proBackward-compat (was the pre-picker default)
All othersFast models would lose their sub-second value prop; hi-res models don't need it

When upscaling runs, it uses these settings:

SettingValue
Upscale factor
Creativity0.35
Resemblance0.6
Guidance scale4
Inference steps18

If upscaling fails (network issue, rate limit), the original image is returned automatically.

How It Works Internally

  1. Model resolution_resolve_fal_model() reads image_gen.model from config.yaml, falls back to the FAL_IMAGE_MODEL env var, then to fal-ai/flux-2/klein/9b.
  2. Payload building_build_fal_payload() translates your aspect_ratio into the model's native format (preset enum, aspect-ratio enum, or GPT literal), merges the model's default params, applies any caller overrides, then filters to the model's supports whitelist so unsupported keys are never sent.
  3. Submission_submit_fal_request() routes via direct FAL credentials or the managed Nous gateway.
  4. Upscaling — runs only if the model's metadata has upscale: True.
  5. Delivery — final image URL returned to the agent, which emits a MEDIA:<url> tag that platform adapters convert to native media.

Debugging

Enable debug logging:

export IMAGE_TOOLS_DEBUG=true

Debug logs go to ./logs/image_tools_debug_<session_id>.json with per-call details (model, parameters, timing, errors).

Platform Delivery

PlatformDelivery
CLIImage URL printed as markdown ![](url) — click to open
TelegramPhoto message with the prompt as caption
DiscordEmbedded in a message
SlackURL unfurled by Slack
WhatsAppMedia message
OthersURL in plain text

Limitations

  • Requires FAL credentials (direct FAL_KEY or Nous Subscription)
  • Text-to-image only — no inpainting, img2img, or editing via this tool
  • Temporary URLs — FAL returns hosted URLs that expire after hours/days; save locally if needed
  • Per-model constraints — some models don't support seed, num_inference_steps, etc. The supports filter silently drops unsupported params; this is expected behavior