Database Schema

Database tables and relationships for Likeness Search

Overview

The Likeness Search system uses eight core database tables, all defined using Drizzle ORM and following the project's conventions (nanoid IDs, snake_case columns, timestamps with timezone).

Schema Location

All schemas are located in: packages/db/src/schema/

Table Relationships

erDiagram account ||--o{ likeness_assets : "has" account ||--o{ likeness_search : "indexed in" account ||--o{ likeness_search_vector : "has embedding" account ||--o{ likeness_need_indexing_queue : "has events" account ||--o{ account_social_links : "has" account ||--o{ scrapes : "has" account ||--o{ ip_terms : "has" account ||--o{ eleven_labs : "has voice clones" likeness_assets ||--o| likeness_search : "tags aggregated to" account_social_links ||--o| likeness_search : "follower count aggregated to" ip_terms ||--o| likeness_search : "required for indexing" likeness_assets ||--o| likeness_search_vector : "tags normalized to" eleven_labs ||--o| likeness_assets : "voiceId linked to"

Core Tables

1. likeness_assets

File: packages/db/src/schema/likenessAssets.ts

Stores uploaded images and voice samples with AI-extracted search tags.

Key Fields:

  • id - Primary key (nanoid)
  • accountId - Foreign key to account.id
  • type - Enum: IMAGE or VOICE
  • contentUrl - Media filename (e.g., photo-a3f2.jpg), resolved to proxy URL via Media System
  • searchTags - JSON field storing AI-extracted tags
  • voiceSampleUrl - Media filename for generated voice sample (VOICE assets only)
  • tagAttemptCount - Number of failed AI extraction attempts (max 5)
  • tagLastAttemptAt - Timestamp of last attempt

Access Functions: packages/db/src/access/likenessAssets.ts

  • getAssetById(id)
  • listAssetsByAccountId(accountId)
  • getUnprocessedAssets(accountId, maxAttempts)
  • createAsset(data)
  • updateAssetSearchTags(id, searchTags)
  • updateAssetVoiceSample(id, voiceSampleUrl)
  • incrementTagAttemptCount(id)
  • resetTagAttemptCount(id)

File: packages/db/src/schema/likenessSearch.ts

SQL search index with 30+ enum fields for exact filtering.

Key Fields:

  • id - Primary key (nanoid)
  • accountId - Foreign key to account.id (unique)
  • Category & Basic: category, gender, country, targetAudience
  • Social Data: numberOfFollowers, primaryPlatform, isVerified, contentNiche, engagementRate
  • Physical Data: birthYear, height, weight, bodyType
  • Visual Data - Hair: hairColor, hairLength, hairType
  • Visual Data - Face: eyeColor, skinColor, ethnicity, faceShape, facialHair, wearGlasses
  • Visual Data - Modifications: hasTattoos, hasPiercings
  • Voice Data: voicePitch, voiceTone, accent, primaryLanguage

Access Functions: packages/db/src/access/likenessSearch.ts

  • upsertSearchIndex(accountId, data)
  • getSearchIndexByAccountId(accountId)
  • deleteSearchIndex(accountId)
  • searchByFilters(filters, limit, offset)

3. likeness_search_vector

File: packages/db/src/schema/likenessSearchVector.ts

pgvector embeddings for semantic similarity search fallback.

Key Fields:

  • accountId - Primary key, foreign key to account.id
  • content - Text representation of tags (normalized for embedding)
  • embedding - pgvector column (1536 dimensions, OpenAI text-embedding-3-small)

Access Functions: packages/db/src/access/likenessSearchVector.ts

  • vectorUpsert(accountId, content, embedding)
  • vectorSearch(queryEmbedding, topK, maxDistance)
  • vectorSearchWithImageAssets(queryEmbedding, opts)
  • vectorSearchWithVoiceAssets(queryEmbedding, opts)
  • deleteFromVectorIndex(accountId)

4. likeness_need_indexing_queue

File: packages/db/src/schema/likenessNeedIndexingQueue.ts

Event queue driving the indexing pipeline.

Key Fields:

  • id - Primary key (nanoid)
  • accountId - Foreign key to account.id
  • elementType - Enum: IMAGE_ASSET, VOICE_ASSET, VOICE_TERM, IMAGE_TERM, LIKENESS_TERM, PROFILE_DESCRIPTION, SOCIAL_DATA
  • elementId - Optional ID of the element that triggered the event
  • status - Enum: PENDING, IN_PROGRESS, COMPLETED, FAILED, DISCARDED, TIMEOUT, AWAITING_RETRY
  • info - JSON field for additional metadata
  • retryCount - Number of retry attempts (max 5)
  • retryAt - Timestamp for retry (when status is AWAITING_RETRY)

Indexes:

  • (status, createdAt) - For getting oldest PENDING events
  • (accountId, status) - For checking active events per account
  • (status, retryAt) - For processing retryable events

Access Functions: packages/db/src/access/likenessNeedIndexingQueue.ts

  • addToQueue(accountId, elementType, elementId, info)
  • getOldestPendingEvent()
  • getEventById(eventId)
  • markEventInProgress(eventId)
  • markEventCompleted(eventId)
  • markEventFailed(eventId, errorInfo)
  • markEventDiscarded(eventId, reason)
  • markEventAwaitingRetry(eventId, retryAt)
  • markTimedOutEvents()
  • processRetryableEvents()
  • hasActiveEventForAccount(accountId)
  • markOtherPendingEventsAsDiscarded(accountId, excludeEventId)
  • getEventRetryCount(eventId)

File: packages/db/src/schema/accountSocialLinks.ts

Social media links and platform-specific follower counts.

Key Fields:

  • id - Primary key (nanoid)
  • accountId - Foreign key to account.id
  • platform - Enum: INSTAGRAM, TIKTOK, TWITTER, YOUTUBE, LINKEDIN, FACEBOOK, TWITCH, SNAPCHAT, ONLYFANS
  • url - Social media profile URL
  • followersCount - Platform-specific follower count (updated via scraping)

Unique Constraint: (accountId, platform) - One link per platform per account

Access Functions: packages/db/src/access/accountSocialLinks.ts

  • getSocialLinksByAccountId(accountId)
  • getSocialLinkByAccountAndPlatform(accountId, platform)
  • upsertSocialLink(data)
  • deleteSocialLink(id)
  • getTotalFollowersCount(accountId)

6. scrapes

File: packages/db/src/schema/scrapes.ts

Social media scraping results and retry tracking.

Key Fields:

  • id - Primary key (nanoid)
  • accountId - Foreign key to account.id
  • linkhash - Hash of the scraped URL (for deduplication)
  • link - The scraped URL
  • name - Scraped name/username
  • avatar - Scraped avatar URL
  • followers - Scraped follower count
  • error - Error message if scraping failed
  • rawData - JSON field storing raw scraping response
  • attemptCount - Number of failed scrape attempts (max 5)
  • lastAttemptAt - Timestamp of last attempt

Unique Constraint: (accountId, linkhash) - One scrape record per account-link combination

Indexes:

  • (accountId) - For getting all scrapes for an account

Access Functions: packages/db/src/access/scrapes.ts

  • getScrapesByAccountId(accountId)
  • upsertScrape(data)
  • incrementScrapeAttemptCount(accountId, linkhash)
  • resetScrapeAttemptCount(accountId, linkhash)

7. eleven_labs

File: packages/db/src/schema/elevenLabs.ts

ElevenLabs voice clones linked to accounts. This table stores voice IDs created on ElevenLabs and is used system-wide, not just for likeness search.

Key Fields:

  • accountId - Foreign key to account.id (part of composite primary key)
  • voiceId - ElevenLabs voice ID (part of composite primary key)
  • modelId - TTS model ID (e.g., "eleven_multilingual_v2")
  • stability - Voice stability setting (0-100)
  • similarityBoost - Similarity boost setting (0-100)
  • style - Style setting (0-100)
  • useSpeakerBoost - Enable speaker boost (boolean)
  • voiceTitle - Custom title for the voice
  • voiceExample - Media filename for TTS sample audio
  • createdAt - Timestamp when voice was created
  • updatedAt - Timestamp when voice was last updated

Primary Key: Composite (accountId, voiceId) - Allows multiple voice clones per account

Foreign Key: accountId references account.id with ON DELETE CASCADE

Access Functions: packages/db/src/access/elevenLabs.ts

  • getVoiceByAccountAndVoiceId(accountId, voiceId)
  • listVoicesByAccountId(accountId)
  • upsertVoice(data)
  • updateVoiceExample(accountId, voiceId, voiceExampleUrl)
  • deleteVoiceByAccountAndVoiceId(accountId, voiceId)
  • deleteAllVoicesByAccountId(accountId)

Usage:

  • Voice clones created during likeness search indexing are stored here
  • Voice IDs can be reused for other features (e.g., persona voice, marketplace previews)
  • voiceExample stores the generated TTS sample URL for playback

8. media

File: packages/db/src/schema/media.ts

Central registry for all uploaded files. Stores S3 coordinates and a unique filename used as the public reference. Content is served via the proxy endpoint GET /api/media/[filename] — see Media System.

Key Fields:

  • id - Primary key (nanoid)
  • filename - Unique public reference: {base}-{4char}.{ext} (e.g., photo-a3f2.jpg)
  • originalFilename - Original file name
  • s3Key, s3Bucket, s3Region - S3 metadata
  • contentType - MIME type
  • status - Enum: UPLOADING, COMPLETE, SOFT_DELETE, DELETE

Access Functions: packages/db/src/access/media.ts

  • createMedia(data)
  • getMediaByFilename(filename)
  • getMediaById(id)
  • updateMediaStatus(id, status)

Supporting Tables

account

File: packages/db/src/schema/account.ts (existing)

Used by the search system for:

  • displayName - Required for base requirements check
  • imageUrl - Required for base requirements check
  • slug - Used in search results
  • description - Can contribute tags to search index

ip_terms

File: packages/db/src/schema/ipTerms.ts (existing)

Used by the search system for:

  • Base requirements check (requires at least one approved term)
  • Voice search filtering (requires approved VoiceOver term for voice samples)

Enum Types

The system uses 23 PostgreSQL enum types, all defined in the schema files:

Category & Basic

  • likeness_search_category - MODELS, ACTORS, MUSICIANS, etc.
  • likeness_search_gender - MALE, FEMALE, NON_BINARY
  • likeness_search_country - USA, UK, etc.
  • likeness_search_target_audience - AGE_18_35, LUXURY, FITNESS
  • likeness_search_content_niche - FASHION, FITNESS, BEAUTY, etc.

Social

  • likeness_search_primary_platform - INSTAGRAM, TIKTOK, YOUTUBE, etc.
  • account_social_links_platform - Same as primary platform

Physical & Visual

  • likeness_search_body_type - SLIM, ATHLETIC, MUSCULAR, etc.
  • likeness_search_hair_color - BLACK, BROWN, BLONDE, etc.
  • likeness_search_hair_length - SHORT, MEDIUM, LONG
  • likeness_search_hair_type - STRAIGHT, CURLY, WAVY, KINKY
  • likeness_search_eye_color - BLUE, BROWN, GREEN, HAZEL
  • likeness_search_skin_color - FAIR, MEDIUM, DARK
  • likeness_search_ethnicity - CAUCASIAN, AFRICAN_AMERICAN, ASIAN, etc.
  • likeness_search_face_shape - OVAL, ROUND, SQUARE, etc.
  • likeness_search_facial_hair - CLEAN_SHAVEN, BEARD_FULL, etc.

Voice

  • likeness_search_voice_pitch - VERY_HIGH, HIGH, MEDIUM, LOW, VERY_LOW
  • likeness_search_voice_tone - WARM, AUTHORITATIVE, FRIENDLY, etc.
  • likeness_search_accent - AMERICAN_GENERAL, BRITISH_RP, etc.
  • likeness_search_language - ENGLISH, SPANISH, FRENCH, etc.

Queue

  • likeness_need_indexing_queue_element_type - Event trigger types
  • likeness_need_indexing_queue_status - Event status values
  • likeness_assets_type - IMAGE, VOICE
  • media_status - UPLOADING, COMPLETE, SOFT_DELETE, DELETE

Data Types

JSON Fields

Several tables use JSONB fields for flexible data storage:

  • likeness_assets.searchTags - AI-extracted tags (validated by Zod schemas)
  • likeness_need_indexing_queue.info - Additional event metadata
  • scrapes.rawData - Raw scraping response data

Vector Type

The likeness_search_vector.embedding field uses PostgreSQL's vector type (pgvector extension):

  • Dimensions: 1536 (OpenAI text-embedding-3-small)
  • Distance Metric: Cosine similarity (<=> operator)
  • Storage: Efficient binary format

Indexes and Constraints

Primary Keys

All tables use text primary keys with nanoid generation (except likeness_search_vector which uses accountId as primary key).

Foreign Keys

All tables reference account.id with ON DELETE CASCADE behavior (handled by Drizzle).

Unique Constraints

  • likeness_search.accountId - One search index entry per account
  • likeness_search_vector.accountId - One vector entry per account
  • account_social_links(accountId, platform) - One link per platform per account
  • scrapes(accountId, linkhash) - One scrape record per account-link combination
  • eleven_labs(accountId, voiceId) - Composite primary key (allows multiple voices per account)
  • media.filename - Unique index for proxy lookup

Indexes

likeness_need_indexing_queue:

  • (status, createdAt) - For FIFO event processing
  • (accountId, status) - For checking active events
  • (status, retryAt) - For processing retryable events

scrapes:

  • (accountId) - For getting all scrapes for an account

Migration

Latest Migration File: packages/db/drizzle/0009_modern_daredevil.sql

The migrations create:

  • 23 PostgreSQL enum types
  • 7 tables with proper foreign keys
  • Indexes for performance
  • Unique constraints for data integrity
  • Composite primary key for eleven_labs table

To Apply: Run npm run db:migrate in packages/db

Type Definitions

All TypeScript types are defined in packages/types/src/types/:

  • LikenessAssets - Asset type with all fields
  • NewLikenessAssets - Type for creating assets (omits auto-generated fields)
  • LikenessSearch - Search index type with all enum fields
  • SearchFilters - Search filter interface
  • FormattedSearchResult - Enriched search result type
  • LikenessQueueEvent - Queue event type
  • LikenessQueueElementType - Event trigger type union
  • LikenessQueueStatus - Event status type union
  • AccountSocialLink - Social link type
  • NewAccountSocialLink - Type for creating social links
  • ElevenLabsVoice - ElevenLabs voice clone type
  • NewElevenLabsVoice - Type for creating voice clones
  • Media - Media record type
  • MediaStatus - UPLOADING | COMPLETE | SOFT_DELETE | DELETE

Access Pattern

Important: Tables are never exposed directly. All database access goes through access functions in packages/db/src/access/. This ensures:

  • Consistent filtering (e.g., soft-deleted records)
  • Type safety
  • Data integrity
  • Easier testing and mocking

Example:

// ✅ Correct - Use access function
import { getAssetById } from "@zooly/app-db";
const asset = await getAssetById(assetId);

// ❌ Wrong - Don't access table directly
import { likenessAssetsTable } from "@zooly/app-db";
const asset = await db.select().from(likenessAssetsTable)...