Asset Uploads

Backend services and API endpoints for uploading image and voice assets

Overview

Asset uploads allow users to submit image and voice files that feed into the Indexing Pipeline. Uploaded assets go through the Media System (which stores S3 coordinates and serves content via a proxy), are recorded in the likeness_assets table, and are automatically queued for AI tag generation.

Both image and voice uploads follow the same architecture: an API route delegates to a service function in @zooly/likeness-search, which uses uploadMediaFile + createMedia, stores the generated filename (not raw S3 URLs) as contentUrl, and enqueues indexing events.

Architecture

sequenceDiagram
    participant Client
    participant API as Next.js API Route
    participant SVC as Service Function
    participant Media as uploadMediaFile
    participant S3 as AWS S3
    participant MediaDB as media table
    participant DB as likeness_assets
    participant Queue as Indexing Queue

    Client->>API: POST multipart/form-data or JSON
    API->>SVC: addImageAssets / addVoiceAssets
    SVC->>SVC: Validate file types
    SVC->>Media: uploadMediaFile(buffer, name, type, basePath)
    Media->>S3: uploadToS3(buffer, key, contentType)
    Media-->>SVC: { filename, s3Key, s3Bucket, s3Region, contentType }
    SVC->>MediaDB: createMedia(...)
    SVC->>DB: createAsset(type, contentUrl: filename)
    SVC->>Queue: addToQueue(IMAGE_ASSET / VOICE_ASSET)
    SVC-->>API: result
    API-->>Client: JSON response
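
The call order in the diagram can be sketched as one function with its collaborators injected. This is a minimal sketch, not the actual implementation: the UploadDeps interface, the uploadOneImage name, and the argument shapes of createMedia, createAsset, and addToQueue are assumptions made for illustration. Only the call order, the uploadMediaFile arguments, and its result shape come from the diagram.

```typescript
// Result shape as shown in the diagram.
interface UploadedMedia {
  filename: string;
  s3Key: string;
  s3Bucket: string;
  s3Region: string;
  contentType: string;
}

// Hypothetical dependency bundle; the real helpers may take different arguments.
interface UploadDeps {
  uploadMediaFile(buffer: Uint8Array, name: string, type: string, basePath: string): Promise<UploadedMedia>;
  createMedia(media: UploadedMedia): Promise<void>;
  createAsset(input: { accountId: string; type: "IMAGE" | "VOICE"; contentUrl: string }): Promise<string>;
  addToQueue(event: { type: "IMAGE_ASSET" | "VOICE_ASSET"; assetId: string }): Promise<void>;
}

interface IncomingFile {
  name: string;
  type: string;
  buffer: Uint8Array;
}

async function uploadOneImage(deps: UploadDeps, accountId: string, file: IncomingFile): Promise<string> {
  // 1. Upload the bytes to S3 under the image base path.
  const media = await deps.uploadMediaFile(file.buffer, file.name, file.type, "likeness-image-assets");
  // 2. Persist the S3 coordinates in the media table.
  await deps.createMedia(media);
  // 3. Record the asset, storing the generated filename rather than an S3 URL.
  const assetId = await deps.createAsset({ accountId, type: "IMAGE", contentUrl: media.filename });
  // 4. Queue the new asset for indexing.
  await deps.addToQueue({ type: "IMAGE_ASSET", assetId });
  return assetId;
}
```

Injecting the helpers keeps the sequence testable with in-memory fakes and makes the ordering explicit: the asset row is created only after the upload and the media row have succeeded.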

API Endpoints

Add Image Assets

Endpoint: POST /api/ui-api/user-image-assets/add

Location: apps/zooly-app/app/api/ui-api/user-image-assets/add/route.ts

Authentication: Required (user cookie)

Service Function: addImageAssets() from @zooly/likeness-search

Supports two input formats:

File Upload (Multipart Form Data)

POST /api/ui-api/user-image-assets/add
Content-Type: multipart/form-data

FormData:
  files: File[] (one or more image files)

Accepted MIME types: image/jpeg, image/png, image/webp
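
The allowlist above could be checked with a small helper like the one below. This is a sketch only: isAcceptedImageType and ACCEPTED_IMAGE_TYPES are hypothetical names, and the normalization (lowercasing, dropping MIME parameters) is an assumption rather than documented behavior.

```typescript
// MIME types listed in this section.
const ACCEPTED_IMAGE_TYPES: ReadonlySet<string> = new Set(["image/jpeg", "image/png", "image/webp"]);

// Hypothetical helper: true when the reported MIME type is on the allowlist.
function isAcceptedImageType(mimeType: string): boolean {
  // Assumption: ignore parameters such as "; charset=..." and compare case-insensitively.
  const base = (mimeType.split(";")[0] ?? "").trim().toLowerCase();
  return ACCEPTED_IMAGE_TYPES.has(base);
}
```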

URL-Based Upload (JSON)

{
  "imageUrls": [
    "https://example.com/photo1.jpg",
    "https://example.com/photo2.png"
  ]
}

Response

{
  "success": true,
  "message": "Processed 2 URLs",
  "created": 2,
  "duplicates": 0,
  "assetIds": ["abc123...", "def456..."]
}

Status Codes:

  • 200 - Success (all or partial)
  • 400 - Bad request (no files or URLs provided, or all inputs invalid)
  • 401 - Unauthorized
  • 500 - Internal server error
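
A client might interpret the status code and response body along these lines. This is a hedged sketch: summarizeUpload is a hypothetical helper, and treating inputs that were neither created nor reported as duplicates as "rejected" is an assumption about what partial success means.

```typescript
// Response body shape shown above.
interface AddAssetsResponse {
  success: boolean;
  message: string;
  created: number;
  duplicates: number;
  assetIds: string[];
}

// Hypothetical client-side helper: turns a status code and body into a short summary.
function summarizeUpload(status: number, body: AddAssetsResponse | null, submitted: number): string {
  if (status === 401) return "Not signed in";
  if (status === 400) return "Nothing valid to upload";
  if (status !== 200 || body === null) return "Upload failed";
  // Assumption: inputs that were neither created nor duplicates were rejected as invalid.
  const rejected = submitted - body.created - body.duplicates;
  return rejected > 0
    ? `Partial success: ${body.created} created, ${body.duplicates} duplicates, ${rejected} rejected`
    : `Success: ${body.created} created, ${body.duplicates} duplicates`;
}
```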

Add Voice Assets

Endpoint: POST /api/ui-api/user-voice-assets/add

Location: apps/zooly-app/app/api/ui-api/user-voice-assets/add/route.ts

Authentication: Required (user cookie)

Service Function: addVoiceAssets() from @zooly/likeness-search

Supports two input formats:

File Upload (Multipart Form Data)

POST /api/ui-api/user-voice-assets/add
Content-Type: multipart/form-data

FormData:
  files: File[] (one or more audio files)

Accepted MIME types: audio/*, video/*
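
The audio/* and video/* rule amounts to a prefix check on the MIME type. The helper below is an illustrative sketch; isAcceptedVoiceType is a hypothetical name and the normalization step is an assumption.

```typescript
// Hypothetical helper mirroring the audio/* and video/* rule stated above.
function isAcceptedVoiceType(mimeType: string): boolean {
  // Assumption: ignore MIME parameters and compare case-insensitively.
  const base = (mimeType.split(";")[0] ?? "").trim().toLowerCase();
  return base.startsWith("audio/") || base.startsWith("video/");
}
```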

URL-Based Upload (JSON)

{
  "voiceUrls": [
    "https://example.com/audio1.mp3",
    "https://example.com/audio2.wav"
  ]
}

Response

{
  "success": true,
  "message": "Processed 2 URLs",
  "created": 2,
  "duplicates": 0,
  "assetIds": ["abc123...", "def456..."]
}

Status Codes:

  • 200 - Success (all or partial)
  • 400 - Bad request (no files or URLs provided, or all inputs invalid)
  • 401 - Unauthorized
  • 500 - Internal server error

Service Functions

addImageAssets(cookieHeader, imageUrls?, files?)

Location: packages/likeness-search/src/addImageAssets.ts

Import: import { addImageAssets } from "@zooly/likeness-search"

Process:

  1. Resolves accountId from cookie via resolveAccountId()
  2. If files are provided, validates MIME type and uploads each via uploadMediaFile(buffer, file.name, file.type, "likeness-image-assets"), then createMedia() to persist S3 metadata
  3. If URLs are provided, validates each URL string
  4. Checks existing likeness_assets records (type IMAGE) for duplicate contentUrl values
  5. Creates new likeness_assets records with type: "IMAGE" and contentUrl set to the generated filename (not S3 URL)
  6. Enqueues an IMAGE_ASSET event via addToQueue() to trigger the indexing pipeline

See Media System for details on how filenames are generated and served via the proxy.
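
Step 3 does not say how URL strings are validated. One plausible check, shown purely as an assumption, accepts only absolute http or https URLs; isValidAssetUrl is a hypothetical name and not part of @zooly/likeness-search.

```typescript
// Hypothetical check: accept only absolute http(s) URLs.
function isValidAssetUrl(value: string): boolean {
  try {
    const url = new URL(value.trim());
    return url.protocol === "http:" || url.protocol === "https:";
  } catch {
    // new URL() throws on strings that are not absolute URLs.
    return false;
  }
}
```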

addVoiceAssets(cookieHeader, voiceUrls?, files?)

Location: packages/likeness-search/src/addVoiceAssets.ts

Import: import { addVoiceAssets } from "@zooly/likeness-search"

Process:

  1. Resolves accountId from cookie via resolveAccountId()
  2. If files are provided, validates MIME type (audio/* or video/*) and uploads each via uploadMediaFile(buffer, file.name, file.type, "likeness-voice-assets"), then createMedia() to persist S3 metadata
  3. If URLs are provided, validates each with validateAudioUrl() (rejects video platform URLs)
  4. Checks existing likeness_assets records (type VOICE) for duplicate contentUrl values
  5. Creates new likeness_assets records with type: "VOICE" and contentUrl set to the generated filename
  6. Enqueues a VOICE_ASSET event via addToQueue() to trigger the indexing pipeline

See Media System for details.
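
The internals of validateAudioUrl() are not shown here. The sketch below illustrates the kind of check step 3 describes, under loud assumptions: checkAudioUrl is a hypothetical stand-in, and the host list is an example only, not the real blocklist.

```typescript
// Assumption: example hosts only; the real list of video platforms is not documented here.
const VIDEO_PLATFORM_HOSTS = ["youtube.com", "youtu.be", "vimeo.com"];

type UrlCheck = { ok: true } | { ok: false; reason: string };

// Hypothetical stand-in for validateAudioUrl().
function checkAudioUrl(value: string): UrlCheck {
  let url: URL;
  try {
    url = new URL(value);
  } catch {
    return { ok: false, reason: "not a valid URL" };
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return { ok: false, reason: "unsupported protocol" };
  }
  const host = url.hostname.toLowerCase();
  // Match the listed host itself and any subdomain (for example www.youtube.com).
  const isVideoPlatform = VIDEO_PLATFORM_HOSTS.some((h) => host === h || host.endsWith("." + h));
  return isVideoPlatform ? { ok: false, reason: "video platform URL" } : { ok: true };
}
```

Returning a reason alongside the verdict lets the caller report why a URL was skipped instead of silently dropping it.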

S3 Storage and Media System

Assets are stored in the project's S3 bucket with the following key structure:

Asset Type | S3 Key Pattern                   | Example
Image      | likeness-image-assets/{filename} | likeness-image-assets/photo-a3f2.jpg
Voice      | likeness-voice-assets/{filename} | likeness-voice-assets/sample-x7k9.mp3

contentUrl in likeness_assets stores the filename (e.g., photo-a3f2.jpg), not the raw S3 URL. Clients receive proxy URLs via getMediaUrl(filename) — e.g., {NEXT_PUBLIC_APP_URL}/api/media/photo-a3f2.jpg. The Media System documents the proxy endpoint, media table, and upload helpers.
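
The relationship between a stored filename, its S3 key, and its proxy URL can be sketched as follows. s3KeyFor and mediaUrlFor are hypothetical names; mediaUrlFor only approximates getMediaUrl(), takes the app URL as a parameter instead of reading NEXT_PUBLIC_APP_URL, and its trailing-slash and encoding handling are assumptions.

```typescript
type AssetType = "IMAGE" | "VOICE";

// Base paths from the key patterns above.
const BASE_PATHS: Record<AssetType, string> = {
  IMAGE: "likeness-image-assets",
  VOICE: "likeness-voice-assets",
};

// Builds the S3 key for a generated filename.
function s3KeyFor(type: AssetType, filename: string): string {
  return `${BASE_PATHS[type]}/${filename}`;
}

// Approximation of getMediaUrl(): maps a stored filename to its proxy URL.
function mediaUrlFor(appUrl: string, filename: string): string {
  const base = appUrl.replace(/\/+$/, ""); // assumption: tolerate a trailing slash
  return `${base}/api/media/${encodeURIComponent(filename)}`;
}
```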

Database Records

Each upload creates a record in the likeness_assets table:

{
  id: nanoid(),            // auto-generated
  accountId: "...",        // from authenticated user
  type: "IMAGE" | "VOICE",
  contentUrl: "photo-a3f2.jpg",  // media filename (not S3 URL)
  description: null,
  searchTags: null,        // populated later by AI tag generation
  voiceSampleUrl: null,    // populated later for VOICE assets (also filename)
  tagAttemptCount: 0,
  tagLastAttemptAt: null,
}

S3 metadata (key, bucket, region) is stored in the media table. List endpoints and search results map filenames through getMediaUrl() before returning to clients.

For schema details, see Database Schema — likeness_assets.
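
For callers that want a typed view of this record, the shape above might be expressed as below. This is an illustrative sketch, not the actual schema: the field types for searchTags and tagLastAttemptAt are assumptions, and newAssetRecord is a hypothetical factory that takes the id as an argument instead of calling nanoid().

```typescript
interface LikenessAssetRecord {
  id: string;
  accountId: string;
  type: "IMAGE" | "VOICE";
  contentUrl: string;               // media filename, not an S3 URL
  description: string | null;
  searchTags: string[] | null;      // assumption: tags are stored as a string array
  voiceSampleUrl: string | null;    // filename, populated later for VOICE assets
  tagAttemptCount: number;
  tagLastAttemptAt: Date | null;    // assumption: timestamp column
}

// Hypothetical factory applying the initial values listed above.
function newAssetRecord(
  id: string,
  accountId: string,
  type: "IMAGE" | "VOICE",
  filename: string,
): LikenessAssetRecord {
  return {
    id,
    accountId,
    type,
    contentUrl: filename,
    description: null,
    searchTags: null,
    voiceSampleUrl: null,
    tagAttemptCount: 0,
    tagLastAttemptAt: null,
  };
}
```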

Deduplication

Both upload functions check for existing assets before creating new records:

  1. Load all likeness_assets for the account filtered by type (IMAGE or VOICE)
  2. Build a set of existing contentUrl values
  3. Skip any incoming URL that already exists in the set
  4. Return the duplicate count in the response

This prevents the same file from being uploaded and indexed multiple times.
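
The four steps above can be sketched as a pure function over the existing contentUrl values. dedupeContentUrls is a hypothetical name, and counting repeats within a single request as duplicates is an assumption; the real functions may handle that case differently.

```typescript
interface DedupResult {
  fresh: string[];     // values that should become new likeness_assets records
  duplicates: number;  // count reported back in the response
}

function dedupeContentUrls(existing: Iterable<string>, incoming: string[]): DedupResult {
  // Step 2: build a set of contentUrl values already stored for the account and type.
  const seen = new Set<string>(existing);
  const fresh: string[] = [];
  let duplicates = 0;
  for (const url of incoming) {
    // Step 3: skip anything already in the set.
    if (seen.has(url)) {
      duplicates += 1;
      continue;
    }
    seen.add(url); // assumption: repeats within the same request also count as duplicates
    fresh.push(url);
  }
  // Step 4: return the duplicate count alongside the values to create.
  return { fresh, duplicates };
}
```

A set gives constant-time membership checks, so the cost stays linear in the number of existing plus incoming values.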

Indexing Pipeline Integration

After assets are created, the upload functions enqueue events that trigger the Indexing Pipeline:

Upload Type | Queue Event | What Happens Next
Image       | IMAGE_ASSET | Cron daemon picks up event → data sufficiency check → AI tag generation via Gemini vision → upsert to search index
Voice       | VOICE_ASSET | Cron daemon picks up event → data sufficiency check → AI tag generation via Gemini audio + ElevenLabs voice clone → upsert to search index

The indexing daemon runs every 2 minutes. After an upload, an asset is typically processed and searchable within one or two cron cycles (roughly 2 to 4 minutes), depending on queue depth.
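
The mapping from upload type to queue event, and the rough wait implied by the 2-minute schedule, can be written down as a small sketch. queueEventFor and maxWaitMinutes are hypothetical helpers; the estimate ignores processing time and assumes the asset is picked up within the given number of cycles.

```typescript
type UploadType = "IMAGE" | "VOICE";
type QueueEvent = "IMAGE_ASSET" | "VOICE_ASSET";

// Maps an upload type to the event enqueued for it.
function queueEventFor(type: UploadType): QueueEvent {
  return type === "IMAGE" ? "IMAGE_ASSET" : "VOICE_ASSET";
}

// The indexing daemon's schedule as stated above.
const CRON_INTERVAL_MINUTES = 2;

// Rough upper bound on the wait until pickup, given a number of cron cycles.
function maxWaitMinutes(cycles: number): number {
  return cycles * CRON_INTERVAL_MINUTES;
}
```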