Activity Model
Overview
This document describes the recommended data model for activities, contacts, people, and organizations in Reply Pilot.
This page is the source of truth for the activity/contact data model. When the
schema changes in this area, update this page and regenerate the diagrams with
rp.
The model is intentionally split into two layers:
- identity layer: who the company, person, email address, or phone number is
- activity layer: what happened, when it happened, and who participated
The design is inspired by the OFBiz approach built around Party,
ContactMech, PartyRelationship, and CommunicationEvent, but simplified
for Reply Pilot and adapted to PostgreSQL with standard BIGINT primary keys.
The main design goal is:
- store every incoming or manually created event separately
- store every discovered email address or phone number separately
- link activities to parties and contact mechanisms without losing raw source data
- support deduplication over email addresses, phone numbers, people, and organizations
- keep the timeline query simple
Typical examples:
- an incoming or sent email creates one
activityrow and oneactivity_emailrow - the same email may also create zero or more
activity_email_attachmentandactivity_email_linkrows - every address from
from,reply-to,to,cc, andbccis stored as a separatecontact_mech - participants of the email are linked through
activity_participant - a phone call creates one
activityrow and oneactivity_callrow - the caller and callee are linked through
activity_participant
Design Principles
partyrepresents a business identity, either a person or an organizationcontact_mechrepresents a communication endpoint, for example an email address or a phone numberparty_contact_mechlinks a party to a contact mechanismparty_relationshiplinks one party to another partyactivityis the shared timeline recordactivity_*tables store per-type detailsactivity_participantlinks an activity to parties and contact mechanisms- deduplication belongs to the contact mechanism layer, not to
activity - code domains such as participant roles and relationship types stay as application constants in V1
- V1 does not introduce a separate
activity_threadtable - a single activity must be insertable even when a party is still unknown
Identity Layer
party
Purpose:
- root identity table for both people and organizations
Columns:
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEYparty_type_code TEXT NOT NULLdisplay_name TEXT NOT NULL DEFAULT ''status_code TEXT NOT NULL DEFAULT 'ACTIVE'merged_into_party_id BIGINT NULL REFERENCES public.party (id) ON DELETE SET NULLcreated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Recommended values for party_type_code:
PERSONORGANIZATIONTEAMUNKNOWN
Notes:
partyis the stable identity layer used everywhere else- merges should not delete rows immediately; use
merged_into_party_id
party_person
Purpose:
- person-specific details
Columns:
party_id BIGINT PRIMARY KEY REFERENCES public.party (id) ON DELETE CASCADEfirst_name TEXT NOT NULL DEFAULT ''last_name TEXT NOT NULL DEFAULT ''middle_name TEXT NOT NULL DEFAULT ''
party_organization
Purpose:
- organization-specific details
Columns:
party_id BIGINT PRIMARY KEY REFERENCES public.party (id) ON DELETE CASCADElegal_name TEXT NOT NULL DEFAULT ''normalized_legal_name TEXT NOT NULL DEFAULT ''company_registration_number TEXT NOT NULL DEFAULT ''tax_identifier TEXT NOT NULL DEFAULT ''show_by_default BOOLEAN NOT NULL DEFAULT TRUEdefault_email_sales_user_id BIGINT NULL REFERENCES public.app_user (id) ON DELETE SET NULL
Notes:
show_by_default = FALSEskryje organizaci z beznych seznamu, odvozenych company lookupu a company search dokumentu, ale neblokuje primy pristup na detail firmydefault_email_sales_user_idurcuje vychoziho obchodnika pro emailovou komunikaci konkretni spolecnosti; pokud neni vyplneny nebo uzivatel neni dostupny pro Jira assignment, aplikace pouzije globalni konfiguraci a dalsi fallbacky
party_role
Purpose:
- attach business roles to a party
Columns:
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEYparty_id BIGINT NOT NULL REFERENCES public.party (id) ON DELETE CASCADErole_type_code TEXT NOT NULLcreated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Recommended values for role_type_code:
CUSTOMERSUPPLIERCONTACTEMPLOYEELEAD
Constraint:
UNIQUE (party_id, role_type_code)
party_outreach_policy
Purpose:
- store explicit commercial outreach policy for one organization
Columns:
party_id BIGINT PRIMARY KEY REFERENCES public.party (id) ON DELETE CASCADEstatus_code TEXT NOT NULL DEFAULT 'ACTIVE_LEAD'note TEXT NOT NULL DEFAULT ''set_by_user_id BIGINT NULL REFERENCES public.app_user (id) ON DELETE SET NULLcreated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Recommended values for status_code:
ACTIVE_LEADDO_NOT_CONTACTDISQUALIFIED
Notes:
- this table is the source of truth for explicit commercial decisions such as
do not contact show_by_default = FALSEmay still be used as a UI-level hide flag, but not as the business reason itself- operators should always attach a human note when switching a company to
DO_NOT_CONTACT
party_identifier
Purpose:
- store deduplication and lookup identifiers for a party
Columns:
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEYparty_id BIGINT NOT NULL REFERENCES public.party (id) ON DELETE CASCADEidentifier_type_code TEXT NOT NULLvalue_raw TEXT NOT NULLvalue_normalized TEXT NOT NULLcreated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Recommended values for identifier_type_code:
COMPANY_REGISTRATION_NUMBERTAX_IDENTIFIERGLNDOMAINWEBSITEEXTERNAL_IDVENDOR_CODECME_DODAVATEL_ID
Constraint:
UNIQUE (identifier_type_code, value_normalized)
party_cme_match
Purpose:
- keep the latest known CME lookup result for one organization
Columns:
party_id BIGINT PRIMARY KEY REFERENCES public.party (id) ON DELETE CASCADEmatch_state TEXT NOT NULL DEFAULT 'NO_MATCH'cme_record_type TEXT NOT NULL DEFAULT ''integration_state TEXT NOT NULL DEFAULT ''cme_dodavatel_id BIGINT NULLmatch_reason TEXT NOT NULL DEFAULT ''matched_by_ico BOOLEAN NOT NULL DEFAULT FALSEmatched_by_dic BOOLEAN NOT NULL DEFAULT FALSEmatched_by_domain BOOLEAN NOT NULL DEFAULT FALSEmatched_by_name BOOLEAN NOT NULL DEFAULT FALSEmatched_ico TEXT NOT NULL DEFAULT ''matched_dic TEXT NOT NULL DEFAULT ''matched_domain TEXT NOT NULL DEFAULT ''matched_name TEXT NOT NULL DEFAULT ''checked_at TIMESTAMPTZ NOT NULL DEFAULT NOW()last_error TEXT NOT NULL DEFAULT ''
Recommended values for match_state:
NO_MATCHMATCHEDAMBIGUOUSERROR
Notes:
- this table stores current integration state, not a full audit log
integration_statemirrors current CME interpretation such asSUPPLIER_INTEGRATED,DODAVATEL_ONLY, orONBOARDING_ONLY- stable CME supplier ids still belong into
party_identifierasCME_DODAVATEL_ID
party_cme_check_queue
Purpose:
- queue organizations waiting for CME verification
Columns:
party_id BIGINT PRIMARY KEY REFERENCES public.party (id) ON DELETE CASCADEreason TEXT NOT NULL DEFAULT ''priority INTEGER NOT NULL DEFAULT 100requested_at TIMESTAMPTZ NOT NULL DEFAULT NOW()not_before TIMESTAMPTZ NOT NULL DEFAULT NOW()attempt_count INTEGER NOT NULL DEFAULT 0locked_at TIMESTAMPTZ NULL
Notes:
- used by the background worker to process only pending or stale organizations
- avoids minute-based full scans over the whole
party_organizationtable
Party Requirement Tracking
This view extends the party/activity model with requirement tracking for supplier onboarding and similar workflows.
requirement_type
Purpose:
- list supported requirement codes tracked for organizations
Columns:
code TEXT PRIMARY KEYname TEXT NOT NULLsort_order INTEGER NOT NULL UNIQUEcreated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Recommended values for code:
ENLISTMENT_TABLEPRICE_LISTTOP10FEEDBUSINESS_TERMSSUPPLIER_IDENTIFIER
requirement_state_type
Purpose:
- list supported aggregate states for one requirement on one party
Columns:
code TEXT PRIMARY KEYname TEXT NOT NULLsort_order INTEGER NOT NULL UNIQUEcreated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Recommended values for code:
UNKNOWNREQUESTEDCANDIDATE_RECEIVEDCONFIRMED_RECEIVEDREJECTEDNEEDS_REVIEW
requirement_evidence_source_type
Purpose:
- list supported evidence sources
Columns:
code TEXT PRIMARY KEYname TEXT NOT NULLsort_order INTEGER NOT NULL UNIQUEcreated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Recommended values for code:
EMAIL_THREADEMAIL_MESSAGEEMAIL_ATTACHMENTMANUALIMPORT
requirement_evidence_verdict_type
Purpose:
- list supported verdicts for one evidence item
Columns:
code TEXT PRIMARY KEYname TEXT NOT NULLsort_order INTEGER NOT NULL UNIQUEcreated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Recommended values for code:
PRESENTNOT_PRESENTUNCLEAR
party_requirement_state
Purpose:
- store the current aggregate state of one tracked requirement for one party
Columns:
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEYparty_id BIGINT NOT NULL REFERENCES public.party (id) ON DELETE CASCADErequirement_code TEXT NOT NULL REFERENCES public.requirement_type (code)state_code TEXT NOT NULL REFERENCES public.requirement_state_type (code)last_evidence_id BIGINT NULL REFERENCES public.party_requirement_evidence (id) ON DELETE SET NULLis_manual_override BOOLEAN NOT NULL DEFAULT FALSEmanual_note TEXT NOT NULL DEFAULT ''requested_at TIMESTAMPTZ NULLfulfilled_at TIMESTAMPTZ NULLresolved_delivery_mode_code TEXT NOT NULL DEFAULT ''resolved_value_type_code TEXT NOT NULL DEFAULT ''resolved_value_raw TEXT NOT NULL DEFAULT ''resolved_value_normalized TEXT NOT NULL DEFAULT ''updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Constraint:
UNIQUE (party_id, requirement_code)
Notes:
- this table is the source of truth for the current requirement state shown on the company detail or in company filters
- manual overrides belong here; the evidence rows stay immutable audit records
- FEED a SUPPLIER_IDENTIFIER tu navic drzi finalni propagovanou hodnotu vybranou agregatorem z historickych evidence/fact zaznamu
party_requirement_evidence
Purpose:
- store one evidence item that supports or rejects a requirement state
Columns:
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEYparty_id BIGINT NOT NULL REFERENCES public.party (id) ON DELETE CASCADErequirement_code TEXT NOT NULL REFERENCES public.requirement_type (code)source_type_code TEXT NOT NULL REFERENCES public.requirement_evidence_source_type (code)verdict_code TEXT NOT NULL REFERENCES public.requirement_evidence_verdict_type (code)activity_email_id BIGINT NULL REFERENCES public.activity_email (activity_id) ON DELETE SET NULLexternal_thread_id TEXT NOT NULL DEFAULT ''attachment_filename TEXT NOT NULL DEFAULT ''attachment_relative_path TEXT NOT NULL DEFAULT ''attachment_sha256 TEXT NOT NULL DEFAULT ''confidence NUMERIC(5,4) NULLreason TEXT NOT NULL DEFAULT ''extract_json JSONB NOT NULL DEFAULT '{}'::jsonbmodel_name TEXT NOT NULL DEFAULT ''model_version TEXT NOT NULL DEFAULT ''prompt_version TEXT NOT NULL DEFAULT ''created_by TEXT NOT NULL DEFAULT 'system'created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Recommended constraint:
CHECK (confidence IS NULL OR (confidence >= 0 AND confidence <= 1))
Notes:
- evidence may point to a concrete imported email through
activity_email_id - the thread id still stays inline because V1 still has no dedicated thread table
- evidence may denormalize selected attachment metadata even though
activity_email_attachmentexists, because evidence rows should remain immutable audit records extract_jsonstores structured AI output, not the source of truth for the aggregate state
party_requirement_eval_queue
Purpose:
- queue parties waiting for requirement-state re-evaluation
Columns:
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEYparty_id BIGINT NOT NULL REFERENCES public.party (id) ON DELETE CASCADErequirement_code TEXT NOT NULL REFERENCES public.requirement_type (code)reason TEXT NOT NULL DEFAULT ''priority INTEGER NOT NULL DEFAULT 100not_before TIMESTAMPTZ NOT NULL DEFAULT NOW()attempt_count INTEGER NOT NULL DEFAULT 0locked_at TIMESTAMPTZ NULLcreated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Constraint:
UNIQUE (party_id, requirement_code)
Notes:
- used by the background worker to re-evaluate only affected companies after a new email, prompt change, or manual reset
- avoids minute-based rescans over every company and every past email
Extracted Email Facts
This view stores concrete values extracted from one email, attachment, or AI pass without collapsing them directly into the final company state.
party_feed_fact
Purpose:
- store one extracted candidate feed value for one party
Columns:
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEYparty_id BIGINT NOT NULL REFERENCES public.party (id) ON DELETE CASCADEactivity_email_id BIGINT NOT NULL REFERENCES public.activity_email (activity_id) ON DELETE CASCADEevidence_id BIGINT NULL REFERENCES public.party_requirement_evidence (id) ON DELETE SET NULLexternal_thread_id TEXT NOT NULL DEFAULT ''attachment_filename TEXT NOT NULL DEFAULT ''attachment_relative_path TEXT NOT NULL DEFAULT ''attachment_sha256 TEXT NOT NULL DEFAULT ''url_raw TEXT NOT NULL DEFAULT ''url_normalized TEXT NOT NULL DEFAULT ''delivery_mode_code TEXT NOT NULLconfidence NUMERIC(5,4) NULLreason TEXT NOT NULL DEFAULT ''extract_json JSONB NOT NULL DEFAULT '{}'::jsonbcreated_by TEXT NOT NULL DEFAULT 'system'created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Recommended values for delivery_mode_code:
DIRECT_URLLOGIN_REQUIREDIDENTIFIER_ONLYMANUAL_EXPORTNO_FEEDOTHER
Recommended constraints:
CHECK (confidence IS NULL OR (confidence >= 0 AND confidence <= 1))CHECK (delivery_mode_code <> '')
Notes:
- one email may produce multiple feed facts, for example several URLs or one login URL plus one identifier-only hint
evidence_idis optional because deterministic extractors may write facts before or without a requirement-evidence row- facts may denormalize selected attachment metadata or URLs even though
activity_email_attachmentandactivity_email_linkexist, because facts should preserve the exact extraction input used at the time - the final company attribute still belongs to
party_requirement_state
party_supplier_identifier_fact
Purpose:
- store one extracted supplier identifier candidate for one party
Columns:
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEYparty_id BIGINT NOT NULL REFERENCES public.party (id) ON DELETE CASCADEactivity_email_id BIGINT NOT NULL REFERENCES public.activity_email (activity_id) ON DELETE CASCADEevidence_id BIGINT NULL REFERENCES public.party_requirement_evidence (id) ON DELETE SET NULLexternal_thread_id TEXT NOT NULL DEFAULT ''attachment_filename TEXT NOT NULL DEFAULT ''attachment_relative_path TEXT NOT NULL DEFAULT ''attachment_sha256 TEXT NOT NULL DEFAULT ''identifier_type_code TEXT NOT NULLvalue_raw TEXT NOT NULL DEFAULT ''value_normalized TEXT NOT NULL DEFAULT ''confidence NUMERIC(5,4) NULLreason TEXT NOT NULL DEFAULT ''extract_json JSONB NOT NULL DEFAULT '{}'::jsonbcreated_by TEXT NOT NULL DEFAULT 'system'created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Recommended values for identifier_type_code:
- same code family as
party_identifier.identifier_type_code - typical values for supplier onboarding:
COMPANY_REGISTRATION_NUMBERTAX_IDENTIFIERGLNVENDOR_CODECME_DODAVATEL_IDEXTERNAL_ID
Recommended constraints:
CHECK (confidence IS NULL OR (confidence >= 0 AND confidence <= 1))CHECK ((value_raw <> '' OR value_normalized <> '') AND identifier_type_code <> '')
Notes:
- one email may produce multiple identifier facts
- once confirmed, a selected identifier may later be promoted into
party_identifier, but the extracted fact rows remain immutable source data - the final company attribute still belongs to
party_requirement_state
Contact Mechanisms
This view focuses on party and contact tables only.

contact_mech
Purpose:
- generic communication endpoint
- shared root record used for activity linking and contact subtype storage
Columns:
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEYcontact_mech_type_code TEXT NOT NULLcreated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Recommended values for contact_mech_type_code:
EMAILPHONEWEBADDRESS
Notes:
contact_mechmust exist even when no matchingpartyexists yet- V1 does not define a dedicated contact mechanism lifecycle beyond the fields described in this schema
contact_mech_email
Purpose:
- email-specific fields
Columns:
contact_mech_id BIGINT PRIMARY KEY REFERENCES public.contact_mech (id) ON DELETE CASCADEemail TEXT NOT NULL
Constraints:
UNIQUE (email)
Notes:
emailmust be stored in normalized form and is the exact deduplication key- the original imported header value stays in
activity_participant.address_raw
contact_mech_phone
Purpose:
- phone-specific fields
Columns:
contact_mech_id BIGINT PRIMARY KEY REFERENCES public.contact_mech (id) ON DELETE CASCADEphone_raw TEXT NOT NULLphone_e164 TEXT NOT NULLcountry_code TEXT NOT NULL DEFAULT ''
Constraints:
UNIQUE (phone_e164)
Notes:
phone_e164is the exact deduplication key for phone numbers
party_contact_mech
Purpose:
- assign a contact mechanism to a party
Columns:
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEYparty_id BIGINT NOT NULL REFERENCES public.party (id) ON DELETE CASCADEcontact_mech_id BIGINT NOT NULL REFERENCES public.contact_mech (id) ON DELETE CASCADEverified BOOLEAN NOT NULL DEFAULT FALSEis_primary BOOLEAN NOT NULL DEFAULT FALSEfrom_date TIMESTAMPTZ NOT NULL DEFAULT NOW()thru_date TIMESTAMPTZ NULLcreated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Constraints:
UNIQUE (party_id, contact_mech_id)
Notes:
- shared mailboxes such as
sales@company.comcan be linked directly to an organization party and, when needed, also to a person party - a person can have multiple email addresses and phone numbers
party_contact_mech_purpose
Purpose:
- assign business meaning to the party/contact linkage
Columns:
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEYparty_contact_mech_id BIGINT NOT NULL REFERENCES public.party_contact_mech (id) ON DELETE CASCADEpurpose_code TEXT NOT NULLcreated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Recommended values for purpose_code:
PRIMARY_EMAILWORK_EMAILBILLING_EMAILPRIMARY_PHONEWORK_PHONE
Constraint:
UNIQUE (party_contact_mech_id, purpose_code)
Party Relationships
party_relationship
Purpose:
- link one party to another party with direction, role, and validity
Columns:
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEYfrom_party_id BIGINT NOT NULL REFERENCES public.party (id) ON DELETE CASCADEto_party_id BIGINT NOT NULL REFERENCES public.party (id) ON DELETE CASCADErelationship_type_code TEXT NOT NULLfrom_role_type_code TEXT NOT NULL DEFAULT ''to_role_type_code TEXT NOT NULL DEFAULT ''from_date TIMESTAMPTZ NOT NULL DEFAULT NOW()thru_date TIMESTAMPTZ NULLcreated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Recommended values for relationship_type_code:
EMPLOYMENTCONTACT_FORSUPPLIER_RELATIONSHIPCUSTOMER_RELATIONSHIPREPORTS_TO
Examples:
- person -> organization via
EMPLOYMENT - person -> organization via
CONTACT_FOR - organization -> organization via
SUPPLIER_RELATIONSHIP
Activity Layer
This view focuses on activity tables only.

activity_type
Purpose:
- list of supported activity types
Columns:
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEYcode TEXT NOT NULL UNIQUEname TEXT NOT NULLcreated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Initial seed:
emailcallmeetingnote
activity
Purpose:
- shared timeline record for every event
Columns:
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEYactivity_type_id BIGINT NOT NULL REFERENCES public.activity_type (id)primary_party_id BIGINT NULL REFERENCES public.party (id) ON DELETE SET NULLoccurred_at TIMESTAMPTZ NOT NULLtitle TEXT NOT NULL DEFAULT ''summary TEXT NOT NULL DEFAULT ''status_code TEXT NOT NULL DEFAULT 'ACTIVE'created_by_user_id BIGINT NULL REFERENCES public.app_user (id) ON DELETE SET NULLcreated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Notes:
primary_party_idis optional because an activity may arrive before the system knows the right person or organization- in V1 it stays nullable, and
activity_participantis the primary source of party/contact linkage occurred_atis the primary timeline sort key
Activity Detail Tables
Each detail table must use a strict one-to-one relationship:
activity_id BIGINT PRIMARY KEY REFERENCES public.activity (id) ON DELETE CASCADE
activity_email
Purpose:
- email message details
- stores both imported incoming messages and sent messages materialized after a successful Gmail send
Columns:
activity_id BIGINT PRIMARY KEY REFERENCES public.activity (id) ON DELETE CASCADEprovider TEXT NOT NULL DEFAULT 'gmail'external_message_id TEXT NOT NULLexternal_thread_id TEXT NOT NULL DEFAULT ''internet_message_id TEXT NOT NULL DEFAULT ''subject TEXT NOT NULL DEFAULT ''body_text TEXT NOT NULL DEFAULT ''body_html TEXT NOT NULL DEFAULT ''sent_at TIMESTAMPTZ NULLreceived_at TIMESTAMPTZ NULL
Constraints:
UNIQUE (provider, external_message_id)
Important rule:
- one
activity_emailrow should represent one email message, not one entire email thread - the thread is represented by
external_thread_id - V1 does not introduce a separate
activity_threadtable body_htmlstores decodedtext/htmlMIME body parts only; it is not raw RFC822 storage and does not include separate inline CID image metadata- summaries, search, AI context, deterministic facts, and link extraction keep
using
body_text
activity_email_attachment
Purpose:
- store attachment metadata for one imported email message
- bridge
activity_emailto the existing file-based attachment cache
Columns:
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEYactivity_email_id BIGINT NOT NULL REFERENCES public.activity_email (activity_id) ON DELETE CASCADEstorage_backend_code TEXT NOT NULL DEFAULT 'REPLY_PILOT_BE_CACHE'filename TEXT NOT NULL DEFAULT ''mime_type TEXT NOT NULL DEFAULT 'application/octet-stream'size_bytes BIGINT NOT NULL DEFAULT 0relative_path TEXT NOT NULL DEFAULT ''content_hash TEXT NOT NULL DEFAULT ''created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Recommended values for storage_backend_code:
REPLY_PILOT_BE_CACHE
Recommended constraints:
CHECK (size_bytes >= 0)CHECK (relative_path <> '')
Notes:
- this table stores attachment metadata only, not binary content
- binary payload stays in the existing backend cache under
reply-pilot-be/data/emails/attachments/ relative_pathis the storage bridge used to resolve the file from the DB row into the local cache
activity_email_link
Purpose:
- store explicit URLs extracted from one imported email message
Columns:
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEYactivity_email_id BIGINT NOT NULL REFERENCES public.activity_email (activity_id) ON DELETE CASCADEsource_code TEXT NOT NULL DEFAULT 'BODY_TEXT'url_raw TEXT NOT NULL DEFAULT ''url_normalized TEXT NOT NULL DEFAULT ''scheme TEXT NOT NULL DEFAULT ''host TEXT NOT NULL DEFAULT ''path TEXT NOT NULL DEFAULT ''context_snippet TEXT NOT NULL DEFAULT ''created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Recommended values for source_code:
BODY_TEXT
Recommended constraints:
CHECK (url_raw <> '')CHECK (url_normalized <> '')
Notes:
- V1 stores only explicit
httpandhttpslinks extracted frombody_text url_rawpreserves the exact string seen in the email, whileurl_normalizedis the stable lookup key used by later extractors- this table is the source of truth for feed URL candidates before they are classified into company-level facts
activity_call
Purpose:
- phone or online call details
Columns:
activity_id BIGINT PRIMARY KEY REFERENCES public.activity (id) ON DELETE CASCADEstarted_at TIMESTAMPTZ NOT NULLended_at TIMESTAMPTZ NULLdirection_code TEXT NOT NULL DEFAULT ''notes TEXT NOT NULL DEFAULT ''
Recommended values for direction_code:
INBOUNDOUTBOUND
activity_meeting
Purpose:
- meeting details
Columns:
activity_id BIGINT PRIMARY KEY REFERENCES public.activity (id) ON DELETE CASCADEstarted_at TIMESTAMPTZ NOT NULLended_at TIMESTAMPTZ NULLlocation TEXT NOT NULL DEFAULT ''meeting_url TEXT NOT NULL DEFAULT ''notes TEXT NOT NULL DEFAULT ''
activity_note
Purpose:
- manually entered note details
Columns:
activity_id BIGINT PRIMARY KEY REFERENCES public.activity (id) ON DELETE CASCADEnote_text TEXT NOT NULL
Lead Import Workflow
This view stores server-side batch imports of supplier leads prepared outside
Reply Pilot, for example from reply-pilot-wholesale-scout.
lead_import_batch
Purpose:
- store one imported JSONL file prepared for operator review
Columns:
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEYsource_filename TEXT NOT NULLsource_sha256 TEXT NOT NULLsource_path TEXT NOT NULLstate_code TEXT NOT NULL DEFAULT 'REVIEW'created_by_user_id BIGINT NULL REFERENCES public.app_user (id) ON DELETE SET NULLtotal_item_count INTEGER NOT NULL DEFAULT 0processed_item_count INTEGER NOT NULL DEFAULT 0created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()completed_at TIMESTAMPTZ NULL
Recommended values for state_code:
REVIEWCOMPLETEDFAILED
Notes:
- the source file lives on backend storage under a server-managed directory
- DB stores the audit trail and review state, not the original file bytes
lead_import_item
Purpose:
- store one supplier candidate from an imported JSONL batch
Columns:
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEYbatch_id BIGINT NOT NULL REFERENCES public.lead_import_batch (id) ON DELETE CASCADEline_number INTEGER NOT NULLstate_code TEXT NOT NULL DEFAULT 'PENDING'supplier_name TEXT NOT NULL DEFAULT ''legal_name TEXT NOT NULL DEFAULT ''website TEXT NOT NULL DEFAULT ''source_record_json JSONB NOT NULL DEFAULT '{}'::jsonbmatched_party_id BIGINT NULL REFERENCES public.party (id) ON DELETE SET NULLmatched_reason TEXT NOT NULL DEFAULT ''imported_party_id BIGINT NULL REFERENCES public.party (id) ON DELETE SET NULLjira_task_id BIGINT NULL REFERENCES public.task (id) ON DELETE SET NULLjira_key TEXT NOT NULL DEFAULT ''ai_email_subject TEXT NOT NULL DEFAULT ''ai_email_body TEXT NOT NULL DEFAULT ''ai_model TEXT NOT NULL DEFAULT ''ai_generated_at TIMESTAMPTZ NULLoperator_note TEXT NOT NULL DEFAULT ''last_error TEXT NOT NULL DEFAULT ''created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Recommended values for state_code:
PENDINGIMPORTEDDO_NOT_CONTACTSKIPPEDERROR
Notes:
matched_party_idstores the best prefill match against an existing companyimported_party_idstores the actual target company after operator decisionsource_record_jsonkeeps the immutable imported payload, while AI draft and operator decision live in explicit columns
Jira Task Layer
Jira remains the source of truth for workflow status and task text. Reply Pilot
keeps a local cache in task / task_jira so the app can join Jira work with
companies, email threads, users, and lead import decisions without querying Jira
for every local linkage decision. Jira summary and description are not stored in
Reply Pilot; the app may fetch them from Jira only for the active request.
task
Purpose:
- root table for local task records
Columns:
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEYtask_type_code TEXT NOT NULL DEFAULT 'jira'created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Recommended values for task_type_code:
jira
task_jira_work_type
Purpose:
- lookup for the Jira work type represented by one
task_jiracache row
Columns:
code TEXT PRIMARY KEYname TEXT NOT NULLcreated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Recommended values for code:
supplier_onboardingemail_thread_reply
task_jira
Purpose:
- cache one Jira work item and link it to a company
Columns:
task_id BIGINT PRIMARY KEY REFERENCES public.task (id) ON DELETE CASCADEjira_work_type_code TEXT NOT NULL REFERENCES public.task_jira_work_type (code)jira_key TEXT NOT NULL DEFAULT ''status TEXT NOT NULL DEFAULT ''party_id BIGINT NULL REFERENCES public.party (id) ON DELETE SET NULLuser_id BIGINT NULL REFERENCES public.app_user (id) ON DELETE SET NULLresolution TEXT NOT NULL DEFAULT ''jira_updated_at TIMESTAMPTZ NULLjira_synced_at TIMESTAMPTZ NULLjira_sync_error TEXT NOT NULL DEFAULT ''jira_assignee_account_id TEXT NOT NULL DEFAULT ''jira_assignee_display_name TEXT NOT NULL DEFAULT ''jira_assignee_email TEXT NOT NULL DEFAULT ''
Notes:
status, resolution, and assignee fields are Jira cache values; Jira owns the canonical values- Jira summary and description are intentionally not cached in this table
party_idis the common company link for both supplier onboarding and email reply work items
task_jira_supplier_onboarding
Purpose:
- subtype row for a supplier onboarding Jira work item
Columns:
task_id BIGINT PRIMARY KEY REFERENCES public.task_jira (task_id) ON DELETE CASCADEcreated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Notes:
- all pre-existing local Jira tasks are migrated into this subtype
- Jira Cloud work type migration of existing tickets is an operator bulk-move step, not an automatic DB migration
task_jira_email_thread_reply
Purpose:
- subtype row for an email-thread reply Jira work item
Columns:
task_id BIGINT PRIMARY KEY REFERENCES public.task_jira (task_id) ON DELETE CASCADEexternal_thread_id TEXT NOT NULLcreated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Notes:
- company link remains in
task_jira.party_id - email thread link uses
external_thread_iduntil the model grows a dedicated email thread table - incoming email moves only linked
email_thread_replywork items toDrafting Reply - sending a reply moves only linked
email_thread_replywork items toWaiting for Reply
Activity Participants
activity_participant
Purpose:
- link an activity to parties and contact mechanisms
- preserve raw imported values from email headers or call sources
- connect the activity layer to the
contact_mechmodel
Columns:
id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEYactivity_id BIGINT NOT NULL REFERENCES public.activity (id) ON DELETE CASCADEparticipant_role_code TEXT NOT NULLparty_id BIGINT NULL REFERENCES public.party (id) ON DELETE SET NULLcontact_mech_id BIGINT NULL REFERENCES public.contact_mech (id) ON DELETE SET NULLdisplay_name_raw TEXT NOT NULL DEFAULT ''address_raw TEXT NOT NULL DEFAULT ''sort_order INTEGER NOT NULL DEFAULT 0is_internal BOOLEAN NOT NULL DEFAULT FALSEcreated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
Recommended values for participant_role_code:
FROMREPLY_TOTOCCBCCCALLERCALLEEORGANIZERATTENDEEAUTHOR
Recommended constraints:
CHECK (party_id IS NOT NULL OR contact_mech_id IS NOT NULL OR address_raw <> '')
Notes:
contact_mech_idis the critical link for matching imported email addresses or phone numbers to the deduplicated contact layerparty_idis optional because the correct person or organization may be resolved laterdisplay_name_rawandaddress_rawpreserve the original source value
This view shows how activities link to contact and party records.

Incoming Email Flow
This is the main integration point between activity and contact_mech.
When an incoming email arrives:
- create one
activityrow with typeemail - create one
activity_emailrow for the concrete email message - store normalized plain text in
body_text - store decoded HTML MIME body content in
body_htmlwhen present - if attachment metadata is already available in the backend cache, create
one
activity_email_attachmentrow per attachment and keep the binary payload in backend file storage - extract explicit
http(s)links fromactivity_email.body_textand createactivity_email_linkrows - for each address in
from,reply-to,to,cc, andbcc: - normalize the email address
- look up an existing
contact_mechthroughcontact_mech_email.email - if it does not exist, create
contact_mechandcontact_mech_email - try to resolve an existing
partythroughparty_contact_mech - create one
activity_participantrow with the proper role code - if no matching
partyexists yet, keep the participant linked only tocontact_mech_id - when a person or organization is resolved later, update or add the
party_contact_mechlinkage without rewriting the activity history
This approach ensures that:
- every email address is stored independently
to,cc, andbccare searchable- deduplication works across multiple activities
- later identity resolution does not destroy source history
Incoming Call Flow
When a phone call is recorded:
- create one
activityrow with typecall - create one
activity_callrow - normalize the caller and callee phone numbers
- look up an existing
contact_mechthroughcontact_mech_phone.phone_e164 - if it does not exist, create
contact_mechandcontact_mech_phone - link both sides through
activity_participant - resolve
party_idonly when available
Deduplication Strategy
Exact Deduplication
Exact matching should be based on:
contact_mech_email (email)contact_mech_phone (phone_e164)party_identifier (identifier_type_code, value_normalized)
Soft Deduplication
Candidate duplicate people or organizations can be suggested using:
- matching name plus same email domain
- matching person name plus shared company relationship
- matching normalized organization name plus domain
- repeated reuse of the same
contact_mechby an unresolved party candidate
Merge Strategy
Recommended merge strategy:
- do not hard-delete duplicates immediately
- set
party.merged_into_party_id - optionally add the same pattern later for
contact_mech
Integrity Rules
- one detail row per activity
- one activity email row per external message
- one contact mechanism per normalized email or phone value
- one
party_contact_mechrow per unique party/contact pair - manual contact removal should set
party_contact_mech.thru_date; it should not delete the sharedcontact_mechor historical activities - manual removal of a person from a company should set
party_relationship.thru_dateon the activeCONTACT_FORrelationship; it should not delete theparty_person - an activity participant may point to a party, a contact mechanism, or both
- activity inserts and detail inserts must happen in one transaction
- imported raw values must be preserved even after later normalization
Recommended Indexes
activity (occurred_at DESC)activity (primary_party_id, occurred_at DESC)activity_participant (activity_id, participant_role_code, sort_order)activity_participant (contact_mech_id)activity_participant (party_id)contact_mech_email (email)unique indexcontact_mech_phone (phone_e164)unique indexparty_contact_mech (party_id, contact_mech_id)unique indexparty_identifier (identifier_type_code, value_normalized)unique indexactivity_email (provider, external_message_id)unique indexactivity_email (external_thread_id)activity_email_attachment (relative_path)unique indexactivity_email_attachment (activity_email_id)activity_email_attachment (content_hash)activity_email_link (activity_email_id, source_code, url_normalized)unique indexactivity_email_link (activity_email_id)activity_email_link (host)party_requirement_state (requirement_code, state_code, updated_at DESC)party_requirement_evidence (party_id, requirement_code, created_at DESC)party_requirement_evidence (activity_email_id)party_requirement_evidence (external_thread_id)party_requirement_evidence (attachment_sha256)party_requirement_eval_queue (not_before, priority, created_at)party_requirement_eval_queue (locked_at)party_feed_fact (party_id, created_at DESC)party_feed_fact (activity_email_id)party_feed_fact (evidence_id)party_feed_fact (url_normalized)party_feed_fact (delivery_mode_code)party_supplier_identifier_fact (party_id, identifier_type_code, created_at DESC)party_supplier_identifier_fact (activity_email_id)party_supplier_identifier_fact (evidence_id)party_supplier_identifier_fact (identifier_type_code, value_normalized)party_outreach_policy (status_code)lead_import_batch (state_code, created_at DESC)lead_import_item (batch_id, state_code, line_number)lead_import_item (matched_party_id)lead_import_item (imported_party_id)
Migration Status
The legacy company, company_contact, and communication_history tables
were transitional only. The repo has already switched writes to the
party/contact_mech/activity model and the legacy tables were removed after
verification.
Historical migration path was:
- create the new
party,contact_mech, andactivitytables - migrate each company into
partywithparty_type_code = 'ORGANIZATION' - migrate each company contact email into:
contact_mechcontact_mech_emailparty_contact_mech- migrate each communication-history row into:
activityactivity_email- move the legacy Gmail thread id into
activity_email.external_thread_id - if the old data was only thread-based, keep the first migrated version thread-based and later split it into message-based import
- switch the backend write path to the new tables
- remove the legacy tables after production verification
MVP Scope
The first implementation should include:
partyparty_organizationcontact_mechcontact_mech_emailparty_contact_mechactivity_typeactivityactivity_emailactivity_participant- incoming email import linked to
contact_mech
The first implementation does not need:
- full postal address support
- advanced role taxonomies
- automatic duplicate merges
- strict trigger-based enforcement between activity type and detail table