# Stage 4: Asset Identification

You are identifying all visual assets needed for short-form video generation AND determining which need reference images.

## Your Task

Read the script and identify every character, location, and object that needs to be created. Write to `assets.json` in this directory.

## Previous Stage Outputs

**Script**: `../03-script/current/script.json`

Read this carefully - it contains:
- All scenes with visual descriptions
- Characters and their appearances
- Locations for each scene
- Objects and props

## Reference Image Strategy

**NanoBanana can generate many things from description alone:**
- Generic people (old man, waitress, biker)
- Common locations (diner, parking lot, office)
- Standard objects (pie, glass, plate, car)

**BUT some things NEED reference images:**
- Real historical figures (cosmonaut Lebedev, specific politicians)
- Specific cultural objects (25 rouble bill, specific car model, specific uniform)
- Famous locations (Red Square, Eiffel Tower)
- Trademarked items (Harley-Davidson logo, specific brands)

**If NOTHING needs references, we can skip Stage 5 (reference search) entirely.**

## Asset Categories

### 1. Characters

For each character that appears in the video:

**Required fields:**
- `id`: Character ID (matches script)
- `age`: Age range (e.g., "late 50s", "30-35")
- `gender`: male/female/non-binary
- `appearance`: Physical description (face, body, distinctive features)
- `clothing`: What they wear (be specific - colors, style, details)
- `appears_in`: List of scene numbers where they appear
- `speaking_in`: List of scene numbers where they speak
- **`needs_reference`**: true/false - does this need a reference image?
- **`reference_reason`**: Why a reference is needed; `null` if needs_reference=false. Options when needs_reference=true:
  - "historical_figure" - Real person from history
  - "real_person" - Specific living person
  - "specific_uniform" - Needs exact uniform details (cosmonaut suit, military, etc.)
  - "cultural_specific" - Ethnically/culturally specific features that need authenticity

**For speaking characters, also include:**
- `voice_description`: Voice quality (e.g., "deep, authoritative", "young, energetic")
- `emotional_range`: Scenes and emotions (e.g., {"scene_2": "mocking", "scene_5": "shocked"})

**Important:**
- Generic characters (old man, waitress, biker) do NOT need references
- Historical figures (Yuri Gagarin, cosmonaut Lebedev) DO need references
- NO children (adult characters only, for AI video generation safety)

### 2. Locations

For each distinct location in the video:

**Required fields:**
- `id`: Location ID (e.g., "diner_interior", "parking_lot")
- `type`: General category (interior/exterior, setting type)
- `description`: Detailed visual description
- `time_of_day`: Lighting conditions (day/night/dusk/etc.)
- `weather`: For exterior locations only (sunny, overcast, etc.); omit for interiors
- `key_features`: Specific visual elements that must be present
- `appears_in`: List of scene numbers
- **`needs_reference`**: true/false
- **`reference_reason`**: Why a reference is needed; `null` if needs_reference=false. Options when needs_reference=true:
  - "famous_landmark" - Specific famous place (Red Square, Eiffel Tower)
  - "specific_interior" - Specific room/building that needs to match reality
  - "historical_location" - Place from specific time period that needs accuracy

**Important:**
- Generic locations (diner, office, parking lot) do NOT need references
- Famous landmarks or specific historical places DO need references

### 3. Objects

For each important prop or object:

**Required fields:**
- `id`: Object ID (e.g., "apple_pie", "harley_motorcycles")
- `description`: Detailed visual description
- `importance`: "critical" (story depends on it) or "atmospheric" (adds to scene)
- `appears_in`: List of scene numbers
- `state_changes`: How the object's appearance changes across scenes (e.g., {"scene_1": "intact pie", "scene_1_end": "cigarette in pie"}); `null` if it never changes
- **`needs_reference`**: true/false
- **`reference_reason`**: Why a reference is needed; `null` if needs_reference=false. Options when needs_reference=true:
  - "specific_currency" - Specific money (25 rouble bill, euro note)
  - "specific_vehicle" - Exact car/motorcycle model with details
  - "branded_item" - Trademarked/branded product
  - "historical_artifact" - Specific item from history
  - "technical_device" - Device that needs technical accuracy (Soyuz capsule, specific phone)

**Important:**
- Generic objects (pie, glass, cigarette) do NOT need references
- Specific currency, vehicles with logos, historical items DO need references
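
The `reference_reason` options listed for the three categories can be collected into one lookup table. The following is a minimal sketch rather than part of the pipeline: the names `ALLOWED_REASONS` and `check_reference_reason` are hypothetical, and it assumes that `reference_reason` is `null` whenever `needs_reference` is false, as in the examples under Output Format.

```python
# Allowed reference_reason values per asset category, copied from the lists above.
ALLOWED_REASONS = {
    "characters": {"historical_figure", "real_person",
                   "specific_uniform", "cultural_specific"},
    "locations": {"famous_landmark", "specific_interior", "historical_location"},
    "objects": {"specific_currency", "specific_vehicle", "branded_item",
                "historical_artifact", "technical_device"},
}


def check_reference_reason(category, asset):
    """Return a list of problems with one asset's reference fields (empty if OK)."""
    problems = []
    asset_id = asset.get("id", "<missing id>")
    needs = asset.get("needs_reference")
    reason = asset.get("reference_reason")

    if not isinstance(needs, bool):
        problems.append(f"{asset_id}: needs_reference must be true or false")
    elif needs:
        # A flagged asset must name one of the category's listed reasons.
        if not (isinstance(reason, str) and reason in ALLOWED_REASONS[category]):
            problems.append(f"{asset_id}: unknown reference_reason {reason!r}")
    elif reason is not None:
        # Assumption: generic assets carry reference_reason = null.
        problems.append(f"{asset_id}: reference_reason should be null")
    return problems
```

Calling `check_reference_reason("objects", asset)` on each entry of the `objects` list (and likewise for the other two categories) gives a quick way to catch a reason borrowed from the wrong category, such as `famous_landmark` on a character.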

### 4. Frame References (cross-episode)

If an "Available Episode Frames" section is present below, you can reference frames
from completed episodes. When an asset clearly matches something visible in a
previous episode frame, add a `frame_references` field to that asset:

```json
"frame_references": ["001/08-frames/16-scene/003-refined-003-refined-003.jpg"]
```

Read each frame description carefully and select frames where the asset is clearly
visible and prominent. Prefer frames where the asset is the subject or fills a
significant portion of the frame. These will be sent to NanoBanana as reference
images to ensure visual consistency across episodes.

## Output Format

Write to `assets.json`:

```json
{
  "needs_references": false,
  "reference_summary": "All assets are generic and can be generated from descriptions alone",
  "characters": [
    {
      "id": "old_man",
      "age": "late 50s to early 60s",
      "gender": "male",
      "appearance": "Weathered, sun-damaged face with deep wrinkles around eyes and mouth. Gray stubble, thinning gray hair. Stocky build, calloused hands. Tough, road-worn look of long-haul trucker.",
      "clothing": "Faded red plaid flannel shirt, worn jeans, trucker cap with faded logo, leather work boots",
      "appears_in": [2, 3, 4, 5],
      "speaking_in": [],
      "needs_reference": false,
      "reference_reason": null
    },
    {
      "id": "cosmonaut_lebedev",
      "age": "35-40",
      "gender": "male",
      "appearance": "Specific person - Cosmonaut Valentin Lebedev from 1982 Soyuz mission",
      "clothing": "Soviet Sokol spacesuit - white with blue trim, specific helmet design, Cyrillic text on chest",
      "appears_in": [1, 3, 5],
      "speaking_in": [3],
      "needs_reference": true,
      "reference_reason": "historical_figure",
      "voice_description": "Calm, professional, Russian accent",
      "emotional_range": {
        "scene_3": "calm, focused"
      }
    }
  ],
  "locations": [
    {
      "id": "diner_interior",
      "type": "interior - roadside diner",
      "description": "Classic American roadside diner, 1960s-70s aesthetic. Chrome-edged counter with red vinyl stools, checkered floor, fluorescent lighting with slight flicker, window showing parking lot.",
      "time_of_day": "day or dusk",
      "key_features": "Counter with stools, window to parking lot, worn linoleum floor",
      "appears_in": [1, 2, 3, 4],
      "needs_reference": false,
      "reference_reason": null
    },
    {
      "id": "red_square",
      "type": "exterior - famous landmark",
      "description": "Red Square in Moscow - specific view with St. Basil's Cathedral visible, cobblestone plaza, Kremlin wall",
      "time_of_day": "day",
      "weather": "clear",
      "key_features": "St. Basil's colorful domes, cobblestones, specific architectural details",
      "appears_in": [2, 5],
      "needs_reference": true,
      "reference_reason": "famous_landmark"
    }
  ],
  "objects": [
    {
      "id": "apple_pie",
      "description": "Classic American apple pie slice on white plate - golden brown crust, visible apple filling with cinnamon",
      "importance": "critical",
      "appears_in": [1, 2],
      "state_changes": {
        "scene_1": "pristine pie slice",
        "scene_1_end": "cigarette stubbed into pie"
      },
      "needs_reference": false,
      "reference_reason": null
    },
    {
      "id": "25_rouble_note",
      "description": "Soviet 25 rouble banknote from 1961 - specific design with Lenin portrait, Cyrillic text, specific colors (brown-green)",
      "importance": "critical",
      "appears_in": [3],
      "state_changes": null,
      "needs_reference": true,
      "reference_reason": "specific_currency"
    }
  ]
}
```

## Top-Level Fields

**`needs_references`**: Boolean - true if ANY asset needs references, false if all are generic

**`reference_summary`**: String - quick explanation:
- If false: "All assets are generic and can be generated from descriptions alone"
- If true: "References needed for: cosmonaut Lebedev (historical figure), 25 rouble note (specific currency), Red Square (famous landmark)"
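
Both top-level fields follow mechanically from the per-asset flags. The sketch below shows one way to derive them; `summarize_references` is a hypothetical helper, and its summary uses raw asset IDs with the reason code's underscores replaced by spaces, which is slightly less polished than the hand-written example above.

```python
CATEGORIES = ("characters", "locations", "objects")
GENERIC_SUMMARY = "All assets are generic and can be generated from descriptions alone"


def summarize_references(assets):
    """Compute (needs_references, reference_summary) from per-asset flags."""
    flagged = [
        f"{asset.get('id')} ({str(asset.get('reference_reason')).replace('_', ' ')})"
        for category in CATEGORIES
        for asset in assets.get(category, [])
        if asset.get("needs_reference") is True
    ]
    if flagged:
        return True, "References needed for: " + ", ".join(flagged)
    return False, GENERIC_SUMMARY
```

Deriving the flag this way, rather than setting it by hand, keeps `needs_references` from drifting out of sync with the individual assets.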

## Important Notes

1. **Be conservative with references** - Only mark needs_reference=true if really necessary
2. **Generic is better** - If a generic version works for the story, use it
3. **Check every asset** - Every character/location/object must have needs_reference field
4. **Top-level flag** - Set needs_references=true if ANY asset needs a reference
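
Notes 3 and 4 are easy to verify mechanically. As a rough sketch (the function name `check_reference_flags` is an assumption, not an existing pipeline tool), a self-check over the parsed `assets.json` could look like this:

```python
def check_reference_flags(data):
    """Return a list of violations of notes 3 and 4 (empty list if none)."""
    problems = []
    any_needed = False
    for category in ("characters", "locations", "objects"):
        for asset in data.get(category, []):
            flag = asset.get("needs_reference")
            # Note 3: every asset must carry a boolean needs_reference.
            if not isinstance(flag, bool):
                problems.append(
                    f"{category}/{asset.get('id')}: needs_reference must be true or false"
                )
            any_needed = any_needed or flag is True
    # Note 4: the top-level flag must be true exactly when some asset is flagged.
    if data.get("needs_references") is not any_needed:
        problems.append(f"needs_references should be {str(any_needed).lower()}")
    return problems
```

An empty return value means both notes hold; any returned strings name the asset or flag that needs fixing.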

## Workflow Impact

**If needs_references = false:**
- Stage 5 (reference search) will be SKIPPED
- Proceed directly to Stage 6 (characteristic shot)

**If needs_references = true:**
- Stage 5 will search for each asset with needs_reference=true
- The user will select the best reference images
- The selected references are used in Stage 6 (characteristic shot)
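
The branching above can be pictured as a small routing function. This is only an illustration of the described behavior: `plan_after_assets` is a hypothetical name, and how the real pipeline reads the flag is not specified here.

```python
def plan_after_assets(data):
    """Return (next_stage, asset_ids_to_search) based on the top-level flag."""
    if not data.get("needs_references", False):
        # Stage 5 (reference search) is skipped entirely.
        return 6, []
    to_search = [
        asset["id"]
        for category in ("characters", "locations", "objects")
        for asset in data.get(category, [])
        if asset.get("needs_reference")
    ]
    # Stage 5 searches only the assets flagged with needs_reference=true.
    return 5, to_search
```

Because one unnecessary `needs_reference=true` is enough to trigger Stage 5 and a user selection step, marking assets conservatively directly shortens the workflow.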

## Output Files

You must create TWO files:

1. **assets.json** — The structured asset data (format above)
2. **user_message.txt** — A friendly message explaining your findings and providing guidance

Check that `assets.json` is valid JSON before finishing!
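
One way to run that final check is with Python's standard `json` module. The sketch below assumes both files sit in the same directory; `check_outputs` and `REQUIRED_KEYS` are hypothetical names used for illustration.

```python
import json
from pathlib import Path

# Top-level keys shown in the Output Format example.
REQUIRED_KEYS = ("needs_references", "reference_summary",
                 "characters", "locations", "objects")


def check_outputs(directory="."):
    """Return a list of problems with assets.json and user_message.txt."""
    directory = Path(directory)
    problems = []

    try:
        data = json.loads((directory / "assets.json").read_text(encoding="utf-8"))
    except FileNotFoundError:
        problems.append("assets.json is missing")
    except json.JSONDecodeError as err:
        problems.append(f"assets.json is not valid JSON: {err}")
    else:
        if not isinstance(data, dict):
            problems.append("assets.json must contain a JSON object")
        else:
            for key in REQUIRED_KEYS:
                if key not in data:
                    problems.append(f"assets.json lacks top-level key {key!r}")

    message_path = directory / "user_message.txt"
    if not message_path.is_file() or not message_path.read_text(encoding="utf-8").strip():
        problems.append("user_message.txt is missing or empty")
    return problems
```

Trailing commas and unescaped double quotes inside description strings are common causes of a `JSONDecodeError`; the error message reports the line and column where parsing failed.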

## User Message File (user_message.txt)

After creating `assets.json`, write `user_message.txt` — a friendly message to the user.

This message should explain (2-3 paragraphs):

**1. What you identified:**
- How many characters, locations, objects
- Whether reference images are needed (and for what)
- Key visual details that will define the look (clothing, setting, lighting)
- Any characters with dialogue — their voice/delivery style

**2. Generation challenges or considerations:**
- Assets that might be tricky for AI to generate consistently
- Visual details that need to stay consistent across scenes (same clothing, same dog breed, etc.)
- Any state changes that could be problematic (object transformations, lighting shifts)
- Whether character appearances are distinct enough to avoid confusion

**3. Specific suggestions for user:**

When to **Accept**:
- The character descriptions match your vision
- Location and lighting choices feel right
- The reference image decision (skip or search) makes sense
- You're ready to proceed to characteristic shot generation

When to **Refine** (examples):
- "Dave should look older/younger/more rugged"
- "The location should be [different time of day/weather/season]"
- "Add [specific prop] that's important to the story"
- "The dog should be [different breed] — it matters for the joke"
- "Paul's clothing should be more [specific style] to contrast with Dave"
- "[Character X] needs a reference image because [reason]"

When to **Regenerate** (examples):
- "The character descriptions don't match the tone of the story at all"
- "You missed important visual elements from the script"
- "The locations are wrong — this should be set in [different environment]"
- "Start over with a completely different visual approach"

**Tone**: Be conversational and specific to THIS story. Mention actual character names, visual details, and story-specific considerations.

**Example message**:
```
I've identified 3 characters, 2 locations, and 2 key props for the water-walking dog story. All assets are generic enough to generate without reference images — no historical figures, specific brands, or famous landmarks needed, so we can skip the reference search stage.

The most important visual consistency challenge will be the golden retriever — it needs to look identical across the lake scenes (walking on water, retrieving ducks) and clearly DRY even when on the water surface. Paul and Dave need to be visually distinct: I gave Paul a darker, more weathered look (charcoal jacket, faded hat) vs Dave's lighter, friendlier gear (tan vest, green cap) to mirror their pessimist/optimist dynamic. The autumn lake needs to be calm and glassy — choppy water would undermine the water-walking effect.

**Accept** if the character looks and location details work for you. **Refine** if you want to change: character ages or clothing (maybe Paul should look more blue-collar?), dog breed (Labs retrieve too, could be funnier with a small dog?), location details (time of day, weather), or if you think any asset actually needs a reference image. **Regenerate** if the overall visual direction feels wrong for the story tone.
```
