Activity Model

Overview

This document describes the recommended data model for activities, contacts, people, and organizations in Reply Pilot.

This page is the source of truth for the activity/contact data model. When the schema changes in this area, update this page and regenerate the diagrams with rp.

The model is intentionally split into two layers:

  • identity layer: who the company, person, email address, or phone number is
  • activity layer: what happened, when it happened, and who participated

The design is inspired by the OFBiz approach built around Party, ContactMech, PartyRelationship, and CommunicationEvent, but simplified for Reply Pilot and adapted to PostgreSQL with standard BIGINT primary keys.

The main design goal is:

  • store every incoming or manually created event separately
  • store every discovered email address or phone number separately
  • link activities to parties and contact mechanisms without losing raw source data
  • support deduplication over email addresses, phone numbers, people, and organizations
  • keep the timeline query simple

Typical examples:

  • an incoming or sent email creates one activity row and one activity_email row
  • the same email may also create zero or more activity_email_attachment and activity_email_link rows
  • every address from from, reply-to, to, cc, and bcc is stored as a separate contact_mech
  • participants of the email are linked through activity_participant
  • a phone call creates one activity row and one activity_call row
  • the caller and callee are linked through activity_participant

Design Principles

  • party represents a business identity, either a person or an organization
  • contact_mech represents a communication endpoint, for example an email address or a phone number
  • party_contact_mech links a party to a contact mechanism
  • party_relationship links one party to another party
  • activity is the shared timeline record
  • activity_* tables store per-type details
  • activity_participant links an activity to parties and contact mechanisms
  • deduplication belongs to the contact mechanism layer, not to activity
  • code domains such as participant roles and relationship types stay as application constants in V1
  • V1 does not introduce a separate activity_thread table
  • a single activity must be insertable even when a party is still unknown

Identity Layer

party

Purpose:

  • root identity table for both people and organizations

Columns:

  • id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
  • party_type_code TEXT NOT NULL
  • display_name TEXT NOT NULL DEFAULT ''
  • status_code TEXT NOT NULL DEFAULT 'ACTIVE'
  • merged_into_party_id BIGINT NULL REFERENCES public.party (id) ON DELETE SET NULL
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
  • updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Recommended values for party_type_code:

  • PERSON
  • ORGANIZATION
  • TEAM
  • UNKNOWN

Notes:

  • party is the stable identity layer used everywhere else
  • merges should not delete rows immediately; use merged_into_party_id

party_person

Purpose:

  • person-specific details

Columns:

  • party_id BIGINT PRIMARY KEY REFERENCES public.party (id) ON DELETE CASCADE
  • first_name TEXT NOT NULL DEFAULT ''
  • last_name TEXT NOT NULL DEFAULT ''
  • middle_name TEXT NOT NULL DEFAULT ''

party_organization

Purpose:

  • organization-specific details

Columns:

  • party_id BIGINT PRIMARY KEY REFERENCES public.party (id) ON DELETE CASCADE
  • legal_name TEXT NOT NULL DEFAULT ''
  • normalized_legal_name TEXT NOT NULL DEFAULT ''
  • company_registration_number TEXT NOT NULL DEFAULT ''
  • tax_identifier TEXT NOT NULL DEFAULT ''
  • show_by_default BOOLEAN NOT NULL DEFAULT TRUE
  • default_email_sales_user_id BIGINT NULL REFERENCES public.app_user (id) ON DELETE SET NULL

Notes:

  • show_by_default = FALSE skryje organizaci z beznych seznamu, odvozenych company lookupu a company search dokumentu, ale neblokuje primy pristup na detail firmy
  • default_email_sales_user_id urcuje vychoziho obchodnika pro emailovou komunikaci konkretni spolecnosti; pokud neni vyplneny nebo uzivatel neni dostupny pro Jira assignment, aplikace pouzije globalni konfiguraci a dalsi fallbacky

party_role

Purpose:

  • attach business roles to a party

Columns:

  • id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
  • party_id BIGINT NOT NULL REFERENCES public.party (id) ON DELETE CASCADE
  • role_type_code TEXT NOT NULL
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Recommended values for role_type_code:

  • CUSTOMER
  • SUPPLIER
  • CONTACT
  • EMPLOYEE
  • LEAD

Constraint:

  • UNIQUE (party_id, role_type_code)

party_outreach_policy

Purpose:

  • store explicit commercial outreach policy for one organization

Columns:

  • party_id BIGINT PRIMARY KEY REFERENCES public.party (id) ON DELETE CASCADE
  • status_code TEXT NOT NULL DEFAULT 'ACTIVE_LEAD'
  • note TEXT NOT NULL DEFAULT ''
  • set_by_user_id BIGINT NULL REFERENCES public.app_user (id) ON DELETE SET NULL
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
  • updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Recommended values for status_code:

  • ACTIVE_LEAD
  • DO_NOT_CONTACT
  • DISQUALIFIED

Notes:

  • this table is the source of truth for explicit commercial decisions such as do not contact
  • show_by_default = FALSE may still be used as a UI-level hide flag, but not as the business reason itself
  • operators should always attach a human note when switching a company to DO_NOT_CONTACT

party_identifier

Purpose:

  • store deduplication and lookup identifiers for a party

Columns:

  • id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
  • party_id BIGINT NOT NULL REFERENCES public.party (id) ON DELETE CASCADE
  • identifier_type_code TEXT NOT NULL
  • value_raw TEXT NOT NULL
  • value_normalized TEXT NOT NULL
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Recommended values for identifier_type_code:

  • COMPANY_REGISTRATION_NUMBER
  • TAX_IDENTIFIER
  • GLN
  • DOMAIN
  • WEBSITE
  • EXTERNAL_ID
  • VENDOR_CODE
  • CME_DODAVATEL_ID

Constraint:

  • UNIQUE (identifier_type_code, value_normalized)

party_cme_match

Purpose:

  • keep the latest known CME lookup result for one organization

Columns:

  • party_id BIGINT PRIMARY KEY REFERENCES public.party (id) ON DELETE CASCADE
  • match_state TEXT NOT NULL DEFAULT 'NO_MATCH'
  • cme_record_type TEXT NOT NULL DEFAULT ''
  • integration_state TEXT NOT NULL DEFAULT ''
  • cme_dodavatel_id BIGINT NULL
  • match_reason TEXT NOT NULL DEFAULT ''
  • matched_by_ico BOOLEAN NOT NULL DEFAULT FALSE
  • matched_by_dic BOOLEAN NOT NULL DEFAULT FALSE
  • matched_by_domain BOOLEAN NOT NULL DEFAULT FALSE
  • matched_by_name BOOLEAN NOT NULL DEFAULT FALSE
  • matched_ico TEXT NOT NULL DEFAULT ''
  • matched_dic TEXT NOT NULL DEFAULT ''
  • matched_domain TEXT NOT NULL DEFAULT ''
  • matched_name TEXT NOT NULL DEFAULT ''
  • checked_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
  • last_error TEXT NOT NULL DEFAULT ''

Recommended values for match_state:

  • NO_MATCH
  • MATCHED
  • AMBIGUOUS
  • ERROR

Notes:

  • this table stores current integration state, not a full audit log
  • integration_state mirrors current CME interpretation such as SUPPLIER_INTEGRATED, DODAVATEL_ONLY, or ONBOARDING_ONLY
  • stable CME supplier ids still belong into party_identifier as CME_DODAVATEL_ID

party_cme_check_queue

Purpose:

  • queue organizations waiting for CME verification

Columns:

  • party_id BIGINT PRIMARY KEY REFERENCES public.party (id) ON DELETE CASCADE
  • reason TEXT NOT NULL DEFAULT ''
  • priority INTEGER NOT NULL DEFAULT 100
  • requested_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
  • not_before TIMESTAMPTZ NOT NULL DEFAULT NOW()
  • attempt_count INTEGER NOT NULL DEFAULT 0
  • locked_at TIMESTAMPTZ NULL

Notes:

  • used by the background worker to process only pending or stale organizations
  • avoids minute-based full scans over the whole party_organization table

Party Requirement Tracking

This view extends the party/activity model with requirement tracking for supplier onboarding and similar workflows.

requirement_type

Purpose:

  • list supported requirement codes tracked for organizations

Columns:

  • code TEXT PRIMARY KEY
  • name TEXT NOT NULL
  • sort_order INTEGER NOT NULL UNIQUE
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Recommended values for code:

  • ENLISTMENT_TABLE
  • PRICE_LIST
  • TOP10
  • FEED
  • BUSINESS_TERMS
  • SUPPLIER_IDENTIFIER

requirement_state_type

Purpose:

  • list supported aggregate states for one requirement on one party

Columns:

  • code TEXT PRIMARY KEY
  • name TEXT NOT NULL
  • sort_order INTEGER NOT NULL UNIQUE
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Recommended values for code:

  • UNKNOWN
  • REQUESTED
  • CANDIDATE_RECEIVED
  • CONFIRMED_RECEIVED
  • REJECTED
  • NEEDS_REVIEW

requirement_evidence_source_type

Purpose:

  • list supported evidence sources

Columns:

  • code TEXT PRIMARY KEY
  • name TEXT NOT NULL
  • sort_order INTEGER NOT NULL UNIQUE
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Recommended values for code:

  • EMAIL_THREAD
  • EMAIL_MESSAGE
  • EMAIL_ATTACHMENT
  • MANUAL
  • IMPORT

requirement_evidence_verdict_type

Purpose:

  • list supported verdicts for one evidence item

Columns:

  • code TEXT PRIMARY KEY
  • name TEXT NOT NULL
  • sort_order INTEGER NOT NULL UNIQUE
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Recommended values for code:

  • PRESENT
  • NOT_PRESENT
  • UNCLEAR

party_requirement_state

Purpose:

  • store the current aggregate state of one tracked requirement for one party

Columns:

  • id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
  • party_id BIGINT NOT NULL REFERENCES public.party (id) ON DELETE CASCADE
  • requirement_code TEXT NOT NULL REFERENCES public.requirement_type (code)
  • state_code TEXT NOT NULL REFERENCES public.requirement_state_type (code)
  • last_evidence_id BIGINT NULL REFERENCES public.party_requirement_evidence (id) ON DELETE SET NULL
  • is_manual_override BOOLEAN NOT NULL DEFAULT FALSE
  • manual_note TEXT NOT NULL DEFAULT ''
  • requested_at TIMESTAMPTZ NULL
  • fulfilled_at TIMESTAMPTZ NULL
  • resolved_delivery_mode_code TEXT NOT NULL DEFAULT ''
  • resolved_value_type_code TEXT NOT NULL DEFAULT ''
  • resolved_value_raw TEXT NOT NULL DEFAULT ''
  • resolved_value_normalized TEXT NOT NULL DEFAULT ''
  • updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Constraint:

  • UNIQUE (party_id, requirement_code)

Notes:

  • this table is the source of truth for the current requirement state shown on the company detail or in company filters
  • manual overrides belong here; the evidence rows stay immutable audit records
  • FEED a SUPPLIER_IDENTIFIER tu navic drzi finalni propagovanou hodnotu vybranou agregatorem z historickych evidence/fact zaznamu

party_requirement_evidence

Purpose:

  • store one evidence item that supports or rejects a requirement state

Columns:

  • id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
  • party_id BIGINT NOT NULL REFERENCES public.party (id) ON DELETE CASCADE
  • requirement_code TEXT NOT NULL REFERENCES public.requirement_type (code)
  • source_type_code TEXT NOT NULL REFERENCES public.requirement_evidence_source_type (code)
  • verdict_code TEXT NOT NULL REFERENCES public.requirement_evidence_verdict_type (code)
  • activity_email_id BIGINT NULL REFERENCES public.activity_email (activity_id) ON DELETE SET NULL
  • external_thread_id TEXT NOT NULL DEFAULT ''
  • attachment_filename TEXT NOT NULL DEFAULT ''
  • attachment_relative_path TEXT NOT NULL DEFAULT ''
  • attachment_sha256 TEXT NOT NULL DEFAULT ''
  • confidence NUMERIC(5,4) NULL
  • reason TEXT NOT NULL DEFAULT ''
  • extract_json JSONB NOT NULL DEFAULT '{}'::jsonb
  • model_name TEXT NOT NULL DEFAULT ''
  • model_version TEXT NOT NULL DEFAULT ''
  • prompt_version TEXT NOT NULL DEFAULT ''
  • created_by TEXT NOT NULL DEFAULT 'system'
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Recommended constraint:

  • CHECK (confidence IS NULL OR (confidence >= 0 AND confidence <= 1))

Notes:

  • evidence may point to a concrete imported email through activity_email_id
  • the thread id still stays inline because V1 still has no dedicated thread table
  • evidence may denormalize selected attachment metadata even though activity_email_attachment exists, because evidence rows should remain immutable audit records
  • extract_json stores structured AI output, not the source of truth for the aggregate state

party_requirement_eval_queue

Purpose:

  • queue parties waiting for requirement-state re-evaluation

Columns:

  • id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
  • party_id BIGINT NOT NULL REFERENCES public.party (id) ON DELETE CASCADE
  • requirement_code TEXT NOT NULL REFERENCES public.requirement_type (code)
  • reason TEXT NOT NULL DEFAULT ''
  • priority INTEGER NOT NULL DEFAULT 100
  • not_before TIMESTAMPTZ NOT NULL DEFAULT NOW()
  • attempt_count INTEGER NOT NULL DEFAULT 0
  • locked_at TIMESTAMPTZ NULL
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Constraint:

  • UNIQUE (party_id, requirement_code)

Notes:

  • used by the background worker to re-evaluate only affected companies after a new email, prompt change, or manual reset
  • avoids minute-based rescans over every company and every past email

Extracted Email Facts

This view stores concrete values extracted from one email, attachment, or AI pass without collapsing them directly into the final company state.

party_feed_fact

Purpose:

  • store one extracted candidate feed value for one party

Columns:

  • id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
  • party_id BIGINT NOT NULL REFERENCES public.party (id) ON DELETE CASCADE
  • activity_email_id BIGINT NOT NULL REFERENCES public.activity_email (activity_id) ON DELETE CASCADE
  • evidence_id BIGINT NULL REFERENCES public.party_requirement_evidence (id) ON DELETE SET NULL
  • external_thread_id TEXT NOT NULL DEFAULT ''
  • attachment_filename TEXT NOT NULL DEFAULT ''
  • attachment_relative_path TEXT NOT NULL DEFAULT ''
  • attachment_sha256 TEXT NOT NULL DEFAULT ''
  • url_raw TEXT NOT NULL DEFAULT ''
  • url_normalized TEXT NOT NULL DEFAULT ''
  • delivery_mode_code TEXT NOT NULL
  • confidence NUMERIC(5,4) NULL
  • reason TEXT NOT NULL DEFAULT ''
  • extract_json JSONB NOT NULL DEFAULT '{}'::jsonb
  • created_by TEXT NOT NULL DEFAULT 'system'
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Recommended values for delivery_mode_code:

  • DIRECT_URL
  • LOGIN_REQUIRED
  • IDENTIFIER_ONLY
  • MANUAL_EXPORT
  • NO_FEED
  • OTHER

Recommended constraints:

  • CHECK (confidence IS NULL OR (confidence >= 0 AND confidence <= 1))
  • CHECK (delivery_mode_code <> '')

Notes:

  • one email may produce multiple feed facts, for example several URLs or one login URL plus one identifier-only hint
  • evidence_id is optional because deterministic extractors may write facts before or without a requirement-evidence row
  • facts may denormalize selected attachment metadata or URLs even though activity_email_attachment and activity_email_link exist, because facts should preserve the exact extraction input used at the time
  • the final company attribute still belongs to party_requirement_state

party_supplier_identifier_fact

Purpose:

  • store one extracted supplier identifier candidate for one party

Columns:

  • id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
  • party_id BIGINT NOT NULL REFERENCES public.party (id) ON DELETE CASCADE
  • activity_email_id BIGINT NOT NULL REFERENCES public.activity_email (activity_id) ON DELETE CASCADE
  • evidence_id BIGINT NULL REFERENCES public.party_requirement_evidence (id) ON DELETE SET NULL
  • external_thread_id TEXT NOT NULL DEFAULT ''
  • attachment_filename TEXT NOT NULL DEFAULT ''
  • attachment_relative_path TEXT NOT NULL DEFAULT ''
  • attachment_sha256 TEXT NOT NULL DEFAULT ''
  • identifier_type_code TEXT NOT NULL
  • value_raw TEXT NOT NULL DEFAULT ''
  • value_normalized TEXT NOT NULL DEFAULT ''
  • confidence NUMERIC(5,4) NULL
  • reason TEXT NOT NULL DEFAULT ''
  • extract_json JSONB NOT NULL DEFAULT '{}'::jsonb
  • created_by TEXT NOT NULL DEFAULT 'system'
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Recommended values for identifier_type_code:

  • same code family as party_identifier.identifier_type_code
  • typical values for supplier onboarding:
  • COMPANY_REGISTRATION_NUMBER
  • TAX_IDENTIFIER
  • GLN
  • VENDOR_CODE
  • CME_DODAVATEL_ID
  • EXTERNAL_ID

Recommended constraints:

  • CHECK (confidence IS NULL OR (confidence >= 0 AND confidence <= 1))
  • CHECK ((value_raw <> '' OR value_normalized <> '') AND identifier_type_code <> '')

Notes:

  • one email may produce multiple identifier facts
  • once confirmed, a selected identifier may later be promoted into party_identifier, but the extracted fact rows remain immutable source data
  • the final company attribute still belongs to party_requirement_state

Contact Mechanisms

This view focuses on party and contact tables only.

Contact model schema

contact_mech

Purpose:

  • generic communication endpoint
  • shared root record used for activity linking and contact subtype storage

Columns:

  • id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
  • contact_mech_type_code TEXT NOT NULL
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
  • updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Recommended values for contact_mech_type_code:

  • EMAIL
  • PHONE
  • WEB
  • ADDRESS

Notes:

  • contact_mech must exist even when no matching party exists yet
  • V1 does not define a dedicated contact mechanism lifecycle beyond the fields described in this schema

contact_mech_email

Purpose:

  • email-specific fields

Columns:

  • contact_mech_id BIGINT PRIMARY KEY REFERENCES public.contact_mech (id) ON DELETE CASCADE
  • email TEXT NOT NULL

Constraints:

  • UNIQUE (email)

Notes:

  • email must be stored in normalized form and is the exact deduplication key
  • the original imported header value stays in activity_participant.address_raw

contact_mech_phone

Purpose:

  • phone-specific fields

Columns:

  • contact_mech_id BIGINT PRIMARY KEY REFERENCES public.contact_mech (id) ON DELETE CASCADE
  • phone_raw TEXT NOT NULL
  • phone_e164 TEXT NOT NULL
  • country_code TEXT NOT NULL DEFAULT ''

Constraints:

  • UNIQUE (phone_e164)

Notes:

  • phone_e164 is the exact deduplication key for phone numbers

party_contact_mech

Purpose:

  • assign a contact mechanism to a party

Columns:

  • id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
  • party_id BIGINT NOT NULL REFERENCES public.party (id) ON DELETE CASCADE
  • contact_mech_id BIGINT NOT NULL REFERENCES public.contact_mech (id) ON DELETE CASCADE
  • verified BOOLEAN NOT NULL DEFAULT FALSE
  • is_primary BOOLEAN NOT NULL DEFAULT FALSE
  • from_date TIMESTAMPTZ NOT NULL DEFAULT NOW()
  • thru_date TIMESTAMPTZ NULL
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Constraints:

  • UNIQUE (party_id, contact_mech_id)

Notes:

  • shared mailboxes such as sales@company.com can be linked directly to an organization party and, when needed, also to a person party
  • a person can have multiple email addresses and phone numbers

party_contact_mech_purpose

Purpose:

  • assign business meaning to the party/contact linkage

Columns:

  • id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
  • party_contact_mech_id BIGINT NOT NULL REFERENCES public.party_contact_mech (id) ON DELETE CASCADE
  • purpose_code TEXT NOT NULL
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Recommended values for purpose_code:

  • PRIMARY_EMAIL
  • WORK_EMAIL
  • BILLING_EMAIL
  • PRIMARY_PHONE
  • WORK_PHONE

Constraint:

  • UNIQUE (party_contact_mech_id, purpose_code)

Party Relationships

party_relationship

Purpose:

  • link one party to another party with direction, role, and validity

Columns:

  • id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
  • from_party_id BIGINT NOT NULL REFERENCES public.party (id) ON DELETE CASCADE
  • to_party_id BIGINT NOT NULL REFERENCES public.party (id) ON DELETE CASCADE
  • relationship_type_code TEXT NOT NULL
  • from_role_type_code TEXT NOT NULL DEFAULT ''
  • to_role_type_code TEXT NOT NULL DEFAULT ''
  • from_date TIMESTAMPTZ NOT NULL DEFAULT NOW()
  • thru_date TIMESTAMPTZ NULL
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Recommended values for relationship_type_code:

  • EMPLOYMENT
  • CONTACT_FOR
  • SUPPLIER_RELATIONSHIP
  • CUSTOMER_RELATIONSHIP
  • REPORTS_TO

Examples:

  • person -> organization via EMPLOYMENT
  • person -> organization via CONTACT_FOR
  • organization -> organization via SUPPLIER_RELATIONSHIP

Activity Layer

This view focuses on activity tables only.

Activity model schema

activity_type

Purpose:

  • list of supported activity types

Columns:

  • id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
  • code TEXT NOT NULL UNIQUE
  • name TEXT NOT NULL
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Initial seed:

  • email
  • call
  • meeting
  • note

activity

Purpose:

  • shared timeline record for every event

Columns:

  • id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
  • activity_type_id BIGINT NOT NULL REFERENCES public.activity_type (id)
  • primary_party_id BIGINT NULL REFERENCES public.party (id) ON DELETE SET NULL
  • occurred_at TIMESTAMPTZ NOT NULL
  • title TEXT NOT NULL DEFAULT ''
  • summary TEXT NOT NULL DEFAULT ''
  • status_code TEXT NOT NULL DEFAULT 'ACTIVE'
  • created_by_user_id BIGINT NULL REFERENCES public.app_user (id) ON DELETE SET NULL
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
  • updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Notes:

  • primary_party_id is optional because an activity may arrive before the system knows the right person or organization
  • in V1 it stays nullable, and activity_participant is the primary source of party/contact linkage
  • occurred_at is the primary timeline sort key

Activity Detail Tables

Each detail table must use a strict one-to-one relationship:

  • activity_id BIGINT PRIMARY KEY REFERENCES public.activity (id) ON DELETE CASCADE

activity_email

Purpose:

  • email message details
  • stores both imported incoming messages and sent messages materialized after a successful Gmail send

Columns:

  • activity_id BIGINT PRIMARY KEY REFERENCES public.activity (id) ON DELETE CASCADE
  • provider TEXT NOT NULL DEFAULT 'gmail'
  • external_message_id TEXT NOT NULL
  • external_thread_id TEXT NOT NULL DEFAULT ''
  • internet_message_id TEXT NOT NULL DEFAULT ''
  • subject TEXT NOT NULL DEFAULT ''
  • body_text TEXT NOT NULL DEFAULT ''
  • body_html TEXT NOT NULL DEFAULT ''
  • sent_at TIMESTAMPTZ NULL
  • received_at TIMESTAMPTZ NULL

Constraints:

  • UNIQUE (provider, external_message_id)

Important rule:

  • one activity_email row should represent one email message, not one entire email thread
  • the thread is represented by external_thread_id
  • V1 does not introduce a separate activity_thread table
  • body_html stores decoded text/html MIME body parts only; it is not raw RFC822 storage and does not include separate inline CID image metadata
  • summaries, search, AI context, deterministic facts, and link extraction keep using body_text

activity_email_attachment

Purpose:

  • store attachment metadata for one imported email message
  • bridge activity_email to the existing file-based attachment cache

Columns:

  • id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
  • activity_email_id BIGINT NOT NULL REFERENCES public.activity_email (activity_id) ON DELETE CASCADE
  • storage_backend_code TEXT NOT NULL DEFAULT 'REPLY_PILOT_BE_CACHE'
  • filename TEXT NOT NULL DEFAULT ''
  • mime_type TEXT NOT NULL DEFAULT 'application/octet-stream'
  • size_bytes BIGINT NOT NULL DEFAULT 0
  • relative_path TEXT NOT NULL DEFAULT ''
  • content_hash TEXT NOT NULL DEFAULT ''
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Recommended values for storage_backend_code:

  • REPLY_PILOT_BE_CACHE

Recommended constraints:

  • CHECK (size_bytes >= 0)
  • CHECK (relative_path <> '')

Notes:

  • this table stores attachment metadata only, not binary content
  • binary payload stays in the existing backend cache under reply-pilot-be/data/emails/attachments/
  • relative_path is the storage bridge used to resolve the file from the DB row into the local cache

Purpose:

  • store explicit URLs extracted from one imported email message

Columns:

  • id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
  • activity_email_id BIGINT NOT NULL REFERENCES public.activity_email (activity_id) ON DELETE CASCADE
  • source_code TEXT NOT NULL DEFAULT 'BODY_TEXT'
  • url_raw TEXT NOT NULL DEFAULT ''
  • url_normalized TEXT NOT NULL DEFAULT ''
  • scheme TEXT NOT NULL DEFAULT ''
  • host TEXT NOT NULL DEFAULT ''
  • path TEXT NOT NULL DEFAULT ''
  • context_snippet TEXT NOT NULL DEFAULT ''
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Recommended values for source_code:

  • BODY_TEXT

Recommended constraints:

  • CHECK (url_raw <> '')
  • CHECK (url_normalized <> '')

Notes:

  • V1 stores only explicit http and https links extracted from body_text
  • url_raw preserves the exact string seen in the email, while url_normalized is the stable lookup key used by later extractors
  • this table is the source of truth for feed URL candidates before they are classified into company-level facts

activity_call

Purpose:

  • phone or online call details

Columns:

  • activity_id BIGINT PRIMARY KEY REFERENCES public.activity (id) ON DELETE CASCADE
  • started_at TIMESTAMPTZ NOT NULL
  • ended_at TIMESTAMPTZ NULL
  • direction_code TEXT NOT NULL DEFAULT ''
  • notes TEXT NOT NULL DEFAULT ''

Recommended values for direction_code:

  • INBOUND
  • OUTBOUND

activity_meeting

Purpose:

  • meeting details

Columns:

  • activity_id BIGINT PRIMARY KEY REFERENCES public.activity (id) ON DELETE CASCADE
  • started_at TIMESTAMPTZ NOT NULL
  • ended_at TIMESTAMPTZ NULL
  • location TEXT NOT NULL DEFAULT ''
  • meeting_url TEXT NOT NULL DEFAULT ''
  • notes TEXT NOT NULL DEFAULT ''

activity_note

Purpose:

  • manually entered note details

Columns:

  • activity_id BIGINT PRIMARY KEY REFERENCES public.activity (id) ON DELETE CASCADE
  • note_text TEXT NOT NULL

Lead Import Workflow

This view stores server-side batch imports of supplier leads prepared outside Reply Pilot, for example from reply-pilot-wholesale-scout.

lead_import_batch

Purpose:

  • store one imported JSONL file prepared for operator review

Columns:

  • id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
  • source_filename TEXT NOT NULL
  • source_sha256 TEXT NOT NULL
  • source_path TEXT NOT NULL
  • state_code TEXT NOT NULL DEFAULT 'REVIEW'
  • created_by_user_id BIGINT NULL REFERENCES public.app_user (id) ON DELETE SET NULL
  • total_item_count INTEGER NOT NULL DEFAULT 0
  • processed_item_count INTEGER NOT NULL DEFAULT 0
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
  • completed_at TIMESTAMPTZ NULL

Recommended values for state_code:

  • REVIEW
  • COMPLETED
  • FAILED

Notes:

  • the source file lives on backend storage under a server-managed directory
  • DB stores the audit trail and review state, not the original file bytes

lead_import_item

Purpose:

  • store one supplier candidate from an imported JSONL batch

Columns:

  • id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
  • batch_id BIGINT NOT NULL REFERENCES public.lead_import_batch (id) ON DELETE CASCADE
  • line_number INTEGER NOT NULL
  • state_code TEXT NOT NULL DEFAULT 'PENDING'
  • supplier_name TEXT NOT NULL DEFAULT ''
  • legal_name TEXT NOT NULL DEFAULT ''
  • website TEXT NOT NULL DEFAULT ''
  • source_record_json JSONB NOT NULL DEFAULT '{}'::jsonb
  • matched_party_id BIGINT NULL REFERENCES public.party (id) ON DELETE SET NULL
  • matched_reason TEXT NOT NULL DEFAULT ''
  • imported_party_id BIGINT NULL REFERENCES public.party (id) ON DELETE SET NULL
  • jira_task_id BIGINT NULL REFERENCES public.task (id) ON DELETE SET NULL
  • jira_key TEXT NOT NULL DEFAULT ''
  • ai_email_subject TEXT NOT NULL DEFAULT ''
  • ai_email_body TEXT NOT NULL DEFAULT ''
  • ai_model TEXT NOT NULL DEFAULT ''
  • ai_generated_at TIMESTAMPTZ NULL
  • operator_note TEXT NOT NULL DEFAULT ''
  • last_error TEXT NOT NULL DEFAULT ''
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
  • updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Recommended values for state_code:

  • PENDING
  • IMPORTED
  • DO_NOT_CONTACT
  • SKIPPED
  • ERROR

Notes:

  • matched_party_id stores the best prefill match against an existing company
  • imported_party_id stores the actual target company after operator decision
  • source_record_json keeps the immutable imported payload, while AI draft and operator decision live in explicit columns

Jira Task Layer

Jira remains the source of truth for workflow status and task text. Reply Pilot keeps a local cache in task / task_jira so the app can join Jira work with companies, email threads, users, and lead import decisions without querying Jira for every local linkage decision. Jira summary and description are not stored in Reply Pilot; the app may fetch them from Jira only for the active request.

task

Purpose:

  • root table for local task records

Columns:

  • id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
  • task_type_code TEXT NOT NULL DEFAULT 'jira'
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
  • updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Recommended values for task_type_code:

  • jira

task_jira_work_type

Purpose:

  • lookup for the Jira work type represented by one task_jira cache row

Columns:

  • code TEXT PRIMARY KEY
  • name TEXT NOT NULL
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Recommended values for code:

  • supplier_onboarding
  • email_thread_reply

task_jira

Purpose:

  • cache one Jira work item and link it to a company

Columns:

  • task_id BIGINT PRIMARY KEY REFERENCES public.task (id) ON DELETE CASCADE
  • jira_work_type_code TEXT NOT NULL REFERENCES public.task_jira_work_type (code)
  • jira_key TEXT NOT NULL DEFAULT ''
  • status TEXT NOT NULL DEFAULT ''
  • party_id BIGINT NULL REFERENCES public.party (id) ON DELETE SET NULL
  • user_id BIGINT NULL REFERENCES public.app_user (id) ON DELETE SET NULL
  • resolution TEXT NOT NULL DEFAULT ''
  • jira_updated_at TIMESTAMPTZ NULL
  • jira_synced_at TIMESTAMPTZ NULL
  • jira_sync_error TEXT NOT NULL DEFAULT ''
  • jira_assignee_account_id TEXT NOT NULL DEFAULT ''
  • jira_assignee_display_name TEXT NOT NULL DEFAULT ''
  • jira_assignee_email TEXT NOT NULL DEFAULT ''

Notes:

  • status, resolution, and assignee fields are Jira cache values; Jira owns the canonical values
  • Jira summary and description are intentionally not cached in this table
  • party_id is the common company link for both supplier onboarding and email reply work items

task_jira_supplier_onboarding

Purpose:

  • subtype row for a supplier onboarding Jira work item

Columns:

  • task_id BIGINT PRIMARY KEY REFERENCES public.task_jira (task_id) ON DELETE CASCADE
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Notes:

  • all pre-existing local Jira tasks are migrated into this subtype
  • Jira Cloud work type migration of existing tickets is an operator bulk-move step, not an automatic DB migration

task_jira_email_thread_reply

Purpose:

  • subtype row for an email-thread reply Jira work item

Columns:

  • task_id BIGINT PRIMARY KEY REFERENCES public.task_jira (task_id) ON DELETE CASCADE
  • external_thread_id TEXT NOT NULL
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Notes:

  • company link remains in task_jira.party_id
  • email thread link uses external_thread_id until the model grows a dedicated email thread table
  • incoming email moves only linked email_thread_reply work items to Drafting Reply
  • sending a reply moves only linked email_thread_reply work items to Waiting for Reply

Activity Participants

activity_participant

Purpose:

  • link an activity to parties and contact mechanisms
  • preserve raw imported values from email headers or call sources
  • connect the activity layer to the contact_mech model

Columns:

  • id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
  • activity_id BIGINT NOT NULL REFERENCES public.activity (id) ON DELETE CASCADE
  • participant_role_code TEXT NOT NULL
  • party_id BIGINT NULL REFERENCES public.party (id) ON DELETE SET NULL
  • contact_mech_id BIGINT NULL REFERENCES public.contact_mech (id) ON DELETE SET NULL
  • display_name_raw TEXT NOT NULL DEFAULT ''
  • address_raw TEXT NOT NULL DEFAULT ''
  • sort_order INTEGER NOT NULL DEFAULT 0
  • is_internal BOOLEAN NOT NULL DEFAULT FALSE
  • created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()

Recommended values for participant_role_code:

  • FROM
  • REPLY_TO
  • TO
  • CC
  • BCC
  • CALLER
  • CALLEE
  • ORGANIZER
  • ATTENDEE
  • AUTHOR

Recommended constraints:

  • CHECK (party_id IS NOT NULL OR contact_mech_id IS NOT NULL OR address_raw <> '')

Notes:

  • contact_mech_id is the critical link for matching imported email addresses or phone numbers to the deduplicated contact layer
  • party_id is optional because the correct person or organization may be resolved later
  • display_name_raw and address_raw preserve the original source value

This view shows how activities link to contact and party records.

Activity contact linkage schema

Incoming Email Flow

This is the main integration point between activity and contact_mech.

When an incoming email arrives:

  1. create one activity row with type email
  2. create one activity_email row for the concrete email message
  3. store normalized plain text in body_text
  4. store decoded HTML MIME body content in body_html when present
  5. if attachment metadata is already available in the backend cache, create one activity_email_attachment row per attachment and keep the binary payload in backend file storage
  6. extract explicit http(s) links from activity_email.body_text and create activity_email_link rows
  7. for each address in from, reply-to, to, cc, and bcc:
  8. normalize the email address
  9. look up an existing contact_mech through contact_mech_email.email
  10. if it does not exist, create contact_mech and contact_mech_email
  11. try to resolve an existing party through party_contact_mech
  12. create one activity_participant row with the proper role code
  13. if no matching party exists yet, keep the participant linked only to contact_mech_id
  14. when a person or organization is resolved later, update or add the party_contact_mech linkage without rewriting the activity history

This approach ensures that:

  • every email address is stored independently
  • to, cc, and bcc are searchable
  • deduplication works across multiple activities
  • later identity resolution does not destroy source history

Incoming Call Flow

When a phone call is recorded:

  1. create one activity row with type call
  2. create one activity_call row
  3. normalize the caller and callee phone numbers
  4. look up an existing contact_mech through contact_mech_phone.phone_e164
  5. if it does not exist, create contact_mech and contact_mech_phone
  6. link both sides through activity_participant
  7. resolve party_id only when available

Deduplication Strategy

Exact Deduplication

Exact matching should be based on:

  • contact_mech_email (email)
  • contact_mech_phone (phone_e164)
  • party_identifier (identifier_type_code, value_normalized)

Soft Deduplication

Candidate duplicate people or organizations can be suggested using:

  • matching name plus same email domain
  • matching person name plus shared company relationship
  • matching normalized organization name plus domain
  • repeated reuse of the same contact_mech by an unresolved party candidate

Merge Strategy

Recommended merge strategy:

  • do not hard-delete duplicates immediately
  • set party.merged_into_party_id
  • optionally add the same pattern later for contact_mech

Integrity Rules

  • one detail row per activity
  • one activity email row per external message
  • one contact mechanism per normalized email or phone value
  • one party_contact_mech row per unique party/contact pair
  • manual contact removal should set party_contact_mech.thru_date; it should not delete the shared contact_mech or historical activities
  • manual removal of a person from a company should set party_relationship.thru_date on the active CONTACT_FOR relationship; it should not delete the party_person
  • an activity participant may point to a party, a contact mechanism, or both
  • activity inserts and detail inserts must happen in one transaction
  • imported raw values must be preserved even after later normalization
  • activity (occurred_at DESC)
  • activity (primary_party_id, occurred_at DESC)
  • activity_participant (activity_id, participant_role_code, sort_order)
  • activity_participant (contact_mech_id)
  • activity_participant (party_id)
  • contact_mech_email (email) unique index
  • contact_mech_phone (phone_e164) unique index
  • party_contact_mech (party_id, contact_mech_id) unique index
  • party_identifier (identifier_type_code, value_normalized) unique index
  • activity_email (provider, external_message_id) unique index
  • activity_email (external_thread_id)
  • activity_email_attachment (relative_path) unique index
  • activity_email_attachment (activity_email_id)
  • activity_email_attachment (content_hash)
  • activity_email_link (activity_email_id, source_code, url_normalized) unique index
  • activity_email_link (activity_email_id)
  • activity_email_link (host)
  • party_requirement_state (requirement_code, state_code, updated_at DESC)
  • party_requirement_evidence (party_id, requirement_code, created_at DESC)
  • party_requirement_evidence (activity_email_id)
  • party_requirement_evidence (external_thread_id)
  • party_requirement_evidence (attachment_sha256)
  • party_requirement_eval_queue (not_before, priority, created_at)
  • party_requirement_eval_queue (locked_at)
  • party_feed_fact (party_id, created_at DESC)
  • party_feed_fact (activity_email_id)
  • party_feed_fact (evidence_id)
  • party_feed_fact (url_normalized)
  • party_feed_fact (delivery_mode_code)
  • party_supplier_identifier_fact (party_id, identifier_type_code, created_at DESC)
  • party_supplier_identifier_fact (activity_email_id)
  • party_supplier_identifier_fact (evidence_id)
  • party_supplier_identifier_fact (identifier_type_code, value_normalized)
  • party_outreach_policy (status_code)
  • lead_import_batch (state_code, created_at DESC)
  • lead_import_item (batch_id, state_code, line_number)
  • lead_import_item (matched_party_id)
  • lead_import_item (imported_party_id)

Migration Status

The legacy company, company_contact, and communication_history tables were transitional only. The repo has already switched writes to the party/contact_mech/activity model and the legacy tables were removed after verification.

Historical migration path was:

  1. create the new party, contact_mech, and activity tables
  2. migrate each company into party with party_type_code = 'ORGANIZATION'
  3. migrate each company contact email into:
  4. contact_mech
  5. contact_mech_email
  6. party_contact_mech
  7. migrate each communication-history row into:
  8. activity
  9. activity_email
  10. move the legacy Gmail thread id into activity_email.external_thread_id
  11. if the old data was only thread-based, keep the first migrated version thread-based and later split it into message-based import
  12. switch the backend write path to the new tables
  13. remove the legacy tables after production verification

MVP Scope

The first implementation should include:

  • party
  • party_organization
  • contact_mech
  • contact_mech_email
  • party_contact_mech
  • activity_type
  • activity
  • activity_email
  • activity_participant
  • incoming email import linked to contact_mech

The first implementation does not need:

  • full postal address support
  • advanced role taxonomies
  • automatic duplicate merges
  • strict trigger-based enforcement between activity type and detail table