ERNIE-Image vs Turbo: The 50 vs 8 Step Tradeoff

One endpoint charges $0.03 per megapixel and runs 50 diffusion steps. The other charges $0.01 and runs 8. Here is where the 3x price delta earns its money and where it quietly burns your budget.

By ernie-api editorial. Apr 19, 2026. 6 min read.

fal hosts two inference variants of ERNIE-Image. The standard fal-ai/ernie-image runs 50 denoising steps at $0.03 per megapixel. fal-ai/ernie-image/turbo runs 8 steps at $0.01 per megapixel. Same 8B DiT base, same Apache 2.0 weights, same aspect ratio options. The choice between them is less a quality question than a workload question.

A 1024 by 1024 render is roughly 1 megapixel, so you are paying $0.03 versus $0.01 per image. Scale that to a batch of 1000 posters and the delta is $20. Scale it to 100,000 product thumbnails a month at the same size and it becomes a $2,000 a month decision. The question is which bucket your workload actually belongs in.
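The per image arithmetic above can be sketched as a small helper. This is a minimal sketch, assuming billing is linear in pixel count and that the per megapixel prices quoted in this article are current; `costPerImage` and `monthlyDelta` are hypothetical names, not part of any fal SDK.

```typescript
// Prices per megapixel as quoted in this article (assumption: still current).
const PRICE_PER_MP = { standard: 0.03, turbo: 0.01 } as const;
type Endpoint = keyof typeof PRICE_PER_MP;

// Cost of one render, assuming billing is linear in width * height / 1,000,000.
function costPerImage(endpoint: Endpoint, width: number, height: number): number {
  const megapixels = (width * height) / 1_000_000;
  return PRICE_PER_MP[endpoint] * megapixels;
}

// Spend difference between the two endpoints for a given image volume.
function monthlyDelta(images: number, width: number, height: number): number {
  return images * (costPerImage('standard', width, height) - costPerImage('turbo', width, height));
}

console.log(monthlyDelta(1_000, 1024, 1024).toFixed(2));
console.log(monthlyDelta(100_000, 1024, 1024).toFixed(2));
```

With exact pixel counts a 1024 by 1024 render is 1.048576 megapixels, so the printed deltas land slightly above the rounded $20 and $2,000 figures.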

What 8 steps still does well

Turbo holds up for simple scenes, single subjects, clean backgrounds, and short text strings. If you are rendering product hero shots with the brand name as a single word at the top, Turbo gives you 90 percent of the standard endpoint quality. Photographic work without dense typography is the sweet spot. A plate of food, a sneaker on a gradient, a model portrait against a solid wall. Turbo ships those.

Turbo also works well as a draft layer in a two pass pipeline. Render 20 Turbo variants of a prompt at $0.20 total, pick the composition you like, then render the final at 50 steps for $0.03. You have paid $0.23 for a curated final versus $0.60 if you had rendered 20 variants at full quality. Same creative outcome at a little under 40 percent of the cost.
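The two pass comparison can be checked with a few lines. A minimal sketch, assuming 1 megapixel renders and the prices quoted in this article; the function names are illustrative only.

```typescript
// Prices per megapixel as quoted in this article (assumption: still current).
const STANDARD_PER_MP = 0.03;
const TURBO_PER_MP = 0.01;

// Draft sweep on Turbo plus a single final render on the standard endpoint.
function twoPassCost(variants: number, megapixels: number): number {
  return variants * megapixels * TURBO_PER_MP + megapixels * STANDARD_PER_MP;
}

// Every variant rendered at full quality on the standard endpoint.
function fullQualityCost(variants: number, megapixels: number): number {
  return variants * megapixels * STANDARD_PER_MP;
}

const twoPass = twoPassCost(20, 1);
const full = fullQualityCost(20, 1);
const ratioPercent = Math.round((twoPass / full) * 100);

console.log(twoPass.toFixed(2), full.toFixed(2), `${ratioPercent}%`); // 0.23 0.60 38%
```

The ratio comes to about 38 percent of the full quality spend, which is a saving of roughly 60 percent.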

Where 8 steps falls apart

Three workload shapes make Turbo the wrong tool.

First, dense typography. Anything with more than 30 characters of text across multiple lines will show glyph artifacts at 8 steps that the 50 step pass cleans up. The difference is most visible in Simplified Chinese and Japanese, where glyph edges are complex and the model needs the extra denoising budget to resolve them.

Second, multi panel layouts. The comic style and storyboard renders that ERNIE-Image handles well at 50 steps lose panel separation and text balloon integrity at 8. You end up with merged panels and smeared captions.

Third, fine detail subjects. Jewelry on a white background, embroidery pattern references, watch dial photography. Anything where the buyer is inspecting a 2400 pixel crop of a specific region needs the 50 step pass to hold edge crispness.

The real cost of a 1000 poster batch

Run the math on 1000 posters at 1600 by 900, which is roughly 1.44 megapixels per image. Standard endpoint: 1000 times $0.03 times 1.44 equals $43.20. Turbo endpoint: 1000 times $0.01 times 1.44 equals $14.40. You save $28.80 by going Turbo.

Now add the reshoot cost. If 15 percent of the Turbo batch needs to be rerendered on the standard endpoint because the typography failed, that is 150 extra full quality renders at $0.0432 each, which adds $6.48. Your net savings drop to $22.32. If the reshoot rate climbs to 40 percent, you pay $17.28 in reshoots and your savings collapse to $11.52. At a 50 percent reshoot rate you pay $21.60 in reshoots and keep only $7.20, and render costs alone break even at roughly 67 percent. Operator time for every reshoot comes on top, so the practical break-even arrives earlier.
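The reshoot arithmetic generalizes to any reshoot rate. The sketch below assumes the prices quoted in this article, that every reshoot succeeds on its first standard render, and that operator time costs nothing; `netSavings` and `breakEvenRate` are hypothetical names.

```typescript
// Prices per megapixel as quoted in this article (assumption: still current).
const STANDARD_PER_MP = 0.03;
const TURBO_PER_MP = 0.01;

// Net savings of a Turbo-first batch where a fraction of images is rerendered
// on the standard endpoint, compared with rendering everything on standard.
function netSavings(images: number, megapixels: number, reshootRate: number): number {
  const allStandard = images * megapixels * STANDARD_PER_MP;
  const turboFirst =
    images * megapixels * TURBO_PER_MP +
    images * reshootRate * megapixels * STANDARD_PER_MP;
  return allStandard - turboFirst;
}

// Reshoot rate at which Turbo-first costs the same as going straight to standard.
const breakEvenRate = (STANDARD_PER_MP - TURBO_PER_MP) / STANDARD_PER_MP;

for (const rate of [0.15, 0.4, 0.5]) {
  const saved = netSavings(1000, 1.44, rate).toFixed(2);
  console.log(`${Math.round(rate * 100)}% reshoots: $${saved} saved`);
}
console.log(`break-even at ${(breakEvenRate * 100).toFixed(1)}% reshoots`);
```

On render cost alone the break-even depends only on the price ratio, here two thirds. Any value you assign to operator time per reshoot pulls the practical threshold lower.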

The rule of thumb. If your prompt is typography heavy and you expect a reshoot rate above 30 percent, skip Turbo entirely and go straight to 50 steps. If your prompt is visually simple and reshoots are rare, Turbo saves you real money at scale.

Calling both endpoints

Here is the pattern for the two pass pipeline. Turbo for the variant sweep, standard for the final.

import { fal } from '@fal-ai/client';

fal.config({ credentials: process.env.FAL_KEY });

const prompt = 'A vintage film festival poster, muted teal and rust palette, bold sans serif title "NIGHT FRAMES 2026" centered upper third, festival dates "June 4 to 12" in bottom fifth, grain texture';

// Draft pass: 8 Turbo variants, each with its own explicit seed so that a
// picked draft can be reproduced later. Assumption: the Turbo endpoint accepts
// the same seed input as the standard endpoint.
const draftSeeds = Array.from({ length: 8 }, (_, i) => 834210 + i);

const drafts = await Promise.all(
  draftSeeds.map((seed) =>
    fal.subscribe('fal-ai/ernie-image/turbo', {
      input: {
        prompt,
        image_size: 'landscape_16_9',
        num_inference_steps: 8,
        enable_prompt_enhancer: true,
        seed,
      },
    })
  )
);

// Print each draft URL next to the seed that produced it.
drafts.forEach((d, i) => console.log(draftSeeds[i], d.data.images[0].url));

// Final pass: reuse the seed of the draft you picked (834217 in this example).
const finalPass = await fal.subscribe('fal-ai/ernie-image', {
  input: {
    prompt,
    image_size: 'landscape_16_9',
    num_inference_steps: 50,
    enable_prompt_enhancer: true,
    seed: 834217,
  },
});

console.log(finalPass.data.images[0].url);

Once you have picked a draft you like, pass that draft's seed to the standard pass, so you are rendering the same composition at higher quality rather than starting over.

A decision rule for production pipelines

Write a small router at the top of your image generation service. If the prompt contains a quoted string longer than 30 characters, send it to standard. If the prompt contains the words comic, panel, menu, poster, or signage, send it to standard. Otherwise, default to Turbo. You will route roughly 70 percent of typical e-commerce and marketing workloads to Turbo, catch the dense typography cases automatically, and avoid the manual reshoot spiral.
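A minimal sketch of that router follows. It assumes quoted text in prompts uses straight double quotes and that plural forms of the keywords should also match; `routePrompt` and `longestQuotedLength` are hypothetical helper names.

```typescript
type ErnieEndpoint = 'fal-ai/ernie-image' | 'fal-ai/ernie-image/turbo';

// Keywords from the rule above; the optional "s" also catches plurals (an assumption).
const STANDARD_KEYWORDS = /\b(comic|panel|menu|poster|signage)s?\b/i;

// Length of the longest "double quoted" string in the prompt, excluding the quotes.
function longestQuotedLength(prompt: string): number {
  const matches: string[] = prompt.match(/"[^"]*"/g) ?? [];
  return matches.reduce((max, m) => Math.max(max, m.length - 2), 0);
}

// Send dense typography and layout-heavy prompts to standard, everything else to Turbo.
function routePrompt(prompt: string): ErnieEndpoint {
  if (longestQuotedLength(prompt) > 30) return 'fal-ai/ernie-image';
  if (STANDARD_KEYWORDS.test(prompt)) return 'fal-ai/ernie-image';
  return 'fal-ai/ernie-image/turbo';
}

console.log(routePrompt('A sneaker on a teal gradient background'));
console.log(routePrompt('A four panel comic about a cat learning to code'));
```

The returned string can be passed directly as the first argument to `fal.subscribe`. Keyword matching is deliberately blunt, so it is worth logging the routing decision next to the reshoot outcome and tuning the list against your own prompts.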

This is the pattern that keeps monthly bills predictable. The 3x price delta is real, but it stops being a decision once the routing logic handles the classification for you.
