03

Finding the usability boundary of AI product images

Company: Kuka Home
Year: 2025 – 2026
Type: AIGC · Cross-border E-commerce
Role: Product Intern · AI Content Production

The cross-border e-commerce team needed large volumes of furniture scene images for Amazon/Wayfair listings. Traditional shoots cost thousands per set. I used AIGC tools to explore a core question — which AI-generated images can ship directly, which need editing, and which must be regenerated.

Product Intern · Cross-border E-commerce Product Management

I was responsible for exploring and optimizing AI-generated product images for dining chairs, bar chairs, and single chairs. I used Flowith AI, Dreamer (Nano Banana), and Google Gemini for scene image generation, worked to Amazon/Wayfair channel requirements, and delivered 120+ scene cases and 50+ image tasks.

AI-generated images look good ≠ ready for e-commerce listings

Furniture scene images for overseas platforms need to show accurate product details, material texture, proper scale, and visual credibility. AI tools generate drafts in seconds, but outputs frequently fail on the details that matter most for actual product listings.

01

Color drift

The models frequently altered product colors. Prompts had to repeat constraints such as "chair color 100% accurate" and "strictly match the reference image", and the output still drifted.

02

Angle & orientation loss

Listings needed precise control over product orientation ("rotate 90° to face front", "no bird's-eye view", "45° oblique downward"), but the models interpreted spatial instructions inconsistently.

03

Quantity & placement errors

Specifying "one table, four chairs" might yield three or five. When replacing products in existing scenes, position and scale often shifted unpredictably.

04

CN→EN prompt gap

Chinese marketing language couldn't be fed directly to models. Translating into "instructions the model understands" was a separate skill — not just language translation, but concept translation.

6 scene templates × 5 operation modes: a reusable prompt production system

I decomposed the work into two dimensions — scene templates define "what to generate", operation modes define "how to generate and adjust". This meant each new task could start from an existing template library instead of writing prompts from scratch.

“The key shift was treating prompt work as a repeatable production method — not "think of a good prompt every time", but select from template → assemble → fine-tune → accept/reject.”

Dining Room

One table, four chairs, group dining scene — family of four or friends gathering

Bedroom

Vanity corner with female model — skincare or getting-ready scene

Reading Corner

Floor-to-ceiling bookshelf, female model reading, cozy atmosphere

Home Office

Male model working at desk with laptop, warm natural light

Cafe

Large cafe scene, three table-chair pairs, two people seated drinking coffee

Patio

American-style yard, large parasol + outdoor table + two dining chairs

Six standard scene templates
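The template library above can be sketched as plain data plus one assembly function. This is a minimal illustration, not the team's actual tooling: the names `SceneTemplate`, `TEMPLATES`, and `assemble_prompt`, the field set, and the example color are all assumptions, and the scene strings are condensed from the six templates listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneTemplate:
    name: str
    scene: str  # scene description, condensed from the template list


# The six scene templates, keyed by a hypothetical short name.
TEMPLATES = {
    t.name: t
    for t in [
        SceneTemplate("dining_room", "One table, four chairs, group dining scene"),
        SceneTemplate("bedroom", "Vanity corner with female model getting ready"),
        SceneTemplate("reading_corner", "Floor-to-ceiling bookshelf, female model reading, cozy atmosphere"),
        SceneTemplate("home_office", "Male model working at desk with laptop, warm natural light"),
        SceneTemplate("cafe", "Large cafe, three table-chair pairs, two people seated drinking coffee"),
        SceneTemplate("patio", "American-style yard, large parasol, outdoor table, two dining chairs"),
    ]
}


def assemble_prompt(template_name, product, color, angle, negatives=()):
    """Fill one scene template with product details (hypothetical field set)."""
    template = TEMPLATES[template_name]
    parts = [
        f"Commercial photography of {product}, matching the reference image",
        f"Chair color {color}, 100% accurate",
        f"Camera angle: {angle}",
        template.scene,
    ]
    prompt = ". ".join(parts) + "."
    if negatives:
        # Negative constraints are appended as a plain-language "Avoid" clause.
        prompt += " Avoid: " + ", ".join(negatives) + "."
    return prompt


if __name__ == "__main__":
    print(assemble_prompt(
        "patio",
        product="a modern dining chair",
        color="sage green",  # made-up example value
        angle="45° oblique downward",
        negatives=["bird's-eye view"],
    ))
```

Keeping the scene text in data rather than in the function means a seventh template is a one-line addition, which matches the "pick template, then fine-tune" approach described here.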

5 operation modes that cover the full generation workflow

01

Scene Generation

Product white-background image + scene description → direct image generation. The most common starting point.

02

Reverse Prompt

Extract prompts from existing good images, then reuse for new products. "Reverse-engineer this scene photo's AI prompt for me."

03

Scene Replacement

Swap new product into existing scene composition. "Replace the bar chairs in image 4 with these 3 reference images, keep positions unchanged."

04

Attribute Fine-tuning

Precise adjustments to color, angle, perspective, and element additions. "Rotate the front chair 90°", "Change to 45° oblique downward", "Add tableware matching chair count."

05

Style Iteration

Keep product unchanged, experiment with different scene styles and atmospheres. "Don't add too many elements, let the image breathe. Try different styles."
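One way to make the five modes above reusable is to store an instruction skeleton per mode and fill it in per task. The sketch below is illustrative only: the `Mode` enum, the `SKELETONS` table, and the placeholder names are hypothetical, and the skeleton wording is adapted from the example phrasings quoted above rather than copied from a production prompt set.

```python
from enum import Enum


class Mode(Enum):
    SCENE_GENERATION = "scene_generation"
    REVERSE_PROMPT = "reverse_prompt"
    SCENE_REPLACEMENT = "scene_replacement"
    ATTRIBUTE_FINE_TUNING = "attribute_fine_tuning"
    STYLE_ITERATION = "style_iteration"


# Instruction skeletons; the {placeholders} are hypothetical parameter names.
SKELETONS = {
    Mode.SCENE_GENERATION: "Using the white-background product image, generate: {scene}.",
    Mode.REVERSE_PROMPT: "Reverse-engineer the AI prompt for this scene photo.",
    Mode.SCENE_REPLACEMENT: (
        "Replace the {product} in image {target} with the product in the "
        "reference images, keep positions unchanged."
    ),
    Mode.ATTRIBUTE_FINE_TUNING: "{adjustment}. Keep everything else unchanged.",
    Mode.STYLE_ITERATION: (
        "Keep the product unchanged. Try a {style} style; "
        "do not add too many elements."
    ),
}


def build_instruction(mode, **params):
    """Return the instruction text for one operation mode.

    Raises KeyError if a placeholder the skeleton needs is missing.
    """
    return SKELETONS[mode].format(**params)


if __name__ == "__main__":
    print(build_instruction(Mode.SCENE_REPLACEMENT, product="bar chairs", target=4))
    print(build_instruction(Mode.ATTRIBUTE_FINE_TUNING, adjustment="Rotate the front chair 90°"))
```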

From product white-background photo to platform-ready asset

01

Requirement Decomposition

Identify product category (dining/bar/single chair), target scene, and channel requirements (Amazon main image vs scene image vs video first frame).

02

Template Selection & Assembly

Pick base framework from 6 scene templates, fill in product description, material, color, angle requirements. Add style constraints and negative prompts.

03

Multi-tool Trial

Dreamer is fast for batch iteration; Gemini has stronger comprehension for complex scene descriptions. Choose tool based on task type.

04

Iterative Refinement (3-5 rounds)

Address color drift, angle errors, and scale distortion by adjusting individual prompt segments. Operations include scene replacement, attribute fine-tuning, and style iteration.

05

Quality Judgment & Delivery

Classify output into three tiers: ✅ ready for listing / ⚠️ needs minor post-editing / ❌ unusable, must regenerate.

AIGC workflow: Product Input → Prompt → Generation → Review → Final
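The final review step of the workflow above can be written down as a small rule, which helps keep the three tiers consistent across reviewers. This is a sketch under stated assumptions: the defect tags and, in particular, the split between defects that can be fixed in post-editing and defects that force regeneration are hypothetical, not the team's documented acceptance criteria.

```python
from enum import Enum


class Tier(Enum):
    READY = "ready for listing"
    MINOR_EDIT = "needs minor post-editing"
    REGENERATE = "unusable, must regenerate"


# Hypothetical defect tags. Which defects are editable and which are
# blocking is an assumption made for illustration.
EDITABLE = {"background_clutter", "lighting"}
BLOCKING = {"color_drift", "wrong_angle", "scale_distortion", "wrong_quantity"}


def classify(defects):
    """Map a reviewer's defect tags for one image to a delivery tier."""
    defects = set(defects)
    unknown = defects - EDITABLE - BLOCKING
    if unknown:
        raise ValueError(f"unknown defect tags: {sorted(unknown)}")
    if defects & BLOCKING:
        # Any blocking defect outweighs everything else.
        return Tier.REGENERATE
    if defects:
        return Tier.MINOR_EDIT
    return Tier.READY


if __name__ == "__main__":
    for tags in ([], ["lighting"], ["lighting", "color_drift"]):
        print(tags, "->", classify(tags).value)
```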

From vague brief to precise visual instruction

Below are real prompts from the project, showing how a vague request like "shoot a set of scene photos for this dining chair" becomes precise instructions an AI model can execute.

Dining Room (with models)

Commercial photography of modern dining chairs, one table four chairs, chairs matching the reference image. Set in a sunlit, warm beige dining room. 4 happy friends (American, ~30 years old, 2M 2F) sitting around a rectangular wooden table chatting and laughing. Soft natural light through large windows. Light wood floor, rug under table, decorative cabinet in background. Nordic interior, photorealistic, 8K, sharp focus.

Bedroom vanity (with model)

An elegant woman in her thirties wearing a champagne silk robe, sitting on the modern sculptural chair from the reference image, applying makeup with a brush at a bedroom vanity. Side-sitting pose, graceful manner. Minimalist vanity, large round frameless mirror on the wall. Carpet on floor, a large-leaf plant on the left. Warm afternoon sunlight from the side. High-end home magazine quality, cinematic lighting, minimalist style, 8K HD.

Scene replacement instruction

The first 3 images are white-background photos of the dining chair (same chair, multiple angles). Replace the dining chairs in reference image 4 with this chair, keep positions unchanged. Dining chair color accurate, 8K, realistic, front light.

Prompt to result comparison

What this practice demonstrated

AIGC production is a workflow problem, not an inspiration problem

Good outputs relied on repeatable templates + operation modes + quality gates, not on finding one "perfect prompt".

E-commerce usability is far stricter than looking good

Color off by a shade, angle slightly wrong, scale distorted = cannot ship. Aesthetic judgment must align with commercial standards.

Reusable templates cut trial-and-error costs in half

After establishing 6 scene templates + 5 operation modes, new product image generation went from "start from zero every time" to "pick template → fine-tune".

~25% conversion improvement vs traditional workflow

AI visual solutions demonstrated clear advantages in cost and speed over traditional photography, translating into higher content production efficiency and listing performance.

120+

Scene cases covered

50+

Image tasks completed

3-5

Iteration rounds per task

10+

Templates & SOPs delivered

If I continued, what would come next

01

Automated quality scoring

Use rules or a lightweight model to auto-classify generated images into "ready / needs editing / discard" — reducing manual per-image review.

02

Dynamic prompt templates

Evolve static templates into parameterized ones — input category + channel + season, auto-generate base prompt.

03

Multi-model orchestration

Different tools excel at different tasks (Dreamer for speed, Gemini for comprehension, Flowith for style). Build a selection decision tree.
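A first cut of the selection decision tree from the last item could be as small as the function below. It only encodes the three strengths named above; the function name, the boolean task traits, and the priority order between them are assumptions that would need checking against real task outcomes.

```python
def choose_tool(complex_scene=False, style_exploration=False):
    """Pick a generation tool from coarse task traits (illustrative rules only)."""
    if complex_scene:
        # Gemini: stronger comprehension of complex scene descriptions.
        return "Gemini"
    if style_exploration:
        # Flowith: used here for style work.
        return "Flowith"
    # Dreamer: fast batch iteration, used as the default.
    return "Dreamer"


if __name__ == "__main__":
    print(choose_tool(complex_scene=True))
    print(choose_tool(style_exploration=True))
    print(choose_tool())
```

Checking `complex_scene` first is a deliberate but unverified choice: it assumes that a misunderstood scene description costs more rounds than a slower tool does.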
