Arkloop Developers
Specifications

Database Architecture and Data Models

This document describes Arkloop's database boundaries, core tables, and architectural constraints for permissions, auditing, and billing. The production environment uses PostgreSQL as the sole target backend.

Migration Tool: Goose (embedded in src/services/api/internal/migrate/migrations/, with 139 migration files).

Note: Organization concept has been removed via migration 00118. Account is the sole tenant unit.

1. Terminology

account is the tenant boundary:

  • Data isolation boundary (permissions, exports, deletion, retention policies).
  • Auditing boundary (log attribution and accountability scope).
  • Billing and quota boundary (budgets, multipliers, usage reports).

platform is the global scope of the deployment instance:

  • Platform-level default configurations and platform-level credentials (ensures a new org can run without configuration).
  • Managed by platform_admin, not belonging to any specific org.
  • Org-level configurations are for overrides only and should not act as "global defaults."

2. Top-level Structure: org / team / project

2.1 accounts (Tenants/Companies)

ColumnDescription
idPK
slugURL-friendly identifier
nameDisplay name
created_atCreation time

2.2 users (User Entities)

ColumnDescription
idPK
usernameUsername
created_atCreation time

2.3 account_memberships (Organization Memberships)

ColumnDescription
idPK
account_idFK -> accounts
user_idFK -> users
roleRole (owner / member)

2.4 teams (Groups within an Organization)

ColumnDescription
idPK
account_idFK -> accounts
nameName

2.5 projects (Projects/Collaboration Domains)

ColumnDescription
idPK
account_idFK -> accounts
team_idFK -> teams (optional)
nameName
descriptionDescription
visibilityVisibility
deleted_atSoft delete flag

3. Threads and Messages

3.1 threads (Conversation Containers)

ColumnDescription
idPK
account_idFK -> accounts
created_by_user_idFK -> users
titleTitle
project_idFK -> projects (optional)
privatePrivate flag
deleted_atSoft delete
created_atCreation time

3.2 messages (Messages)

ColumnDescription
idPK
thread_idFK -> threads
account_idFK -> accounts
roleuser / assistant / system
contentText content
content_jsonJSONB structured content
hiddenHidden flag
created_atCreation time

4. Runs and Events

4.1 runs (Execution Instances)

ColumnDescription
idPK
account_idFK -> accounts
thread_idFK -> threads
created_by_user_idFK -> users
statusState machine
parent_run_idFK -> runs (sub-run)
created_atCreation time
updated_atUpdate time

4.2 run_events (Event Stream -- Single Source of Truth)

Partitioned by month (created_at), with automatic partition lifecycle management (ARKLOOP_RUN_EVENTS_RETENTION_MONTHS).

ColumnDescription
event_idPK
run_idFK -> runs
seqMonotonically increasing sequence within the run
tsServer-side timestamp
typeEvent type
data_jsonJSONB payload
tool_nameColumn index
error_classColumn index
created_atPartition key

Key Constraints:

  • seq strictly increases within the same run.
  • Written by the Worker, read and replayed as SSE by the API.
  • Supports after_seq cursor for resuming after disconnection.

5. LLM Providers and Routes

5.1 Provider Accounts

ColumnDescription
idPK
account_idFK -> accounts (currently an org-level resource)
providerProvider identifier
nameDisplay name
secret_idFK -> secrets (stored encrypted)
key_prefixKey prefix (for identification)
base_urlCustom base URL
advanced_jsonJSONB advanced configuration

5.2 llm_routes (Model Routing Rules)

ColumnDescription
idPK
account_idFK -> accounts
credential_idFK -> provider account record
modelModel identifier
priorityPriority
is_defaultDefault route flag
when_jsonJSONB conditional rules
multiplierRate multiplier
cache_pricing_jsonCache pricing

5.3 secrets (Generic Encrypted Storage)

Encrypted with AES-256-GCM, using the key provided by ARKLOOP_ENCRYPTION_KEY.

ColumnDescription
idPK
account_idFK -> accounts (required for org scope; NULL for platform scope)
scopeorg / platform
nameLogical key (unique within the same scope)
encrypted_valueEncrypted value (base64)
key_versionEncryption version
rotated_atRotation time (optional)
created_atCreation time
updated_atUpdate time

Constraints:

  • scope='org': (account_id, name) must be unique.
  • scope='platform': name must be globally unique.

secrets usage:

  • API Keys for LLM / ASR credentials.
  • API Keys for Tool Providers.

Currently: Configuration items in the Config Registry marked as Sensitive=true are masked when returned by the API; values are written to platform_settings/org_settings unencrypted.

5.4 platform_settings / org_settings (Unified Configuration: Config Resolver)

Used for Track A Config Resolver (key-value configuration), supporting platform defaults and org overrides.

platform_settings

ColumnDescription
keyPK
valueConfiguration value (non-sensitive)
updated_atUpdate time

org_settings

ColumnDescription
account_idFK -> accounts
keyConfiguration key
valueConfiguration value (non-sensitive)
updated_atUpdate time

Resolver Priority Chain (high to low):

  1. ENV override (forced deployment layer override)
  2. org_settings
  3. platform_settings
  4. Registry default value

5.5 tool_provider_configs (Tool Backend Activation and Credential Association)

Used for backend selection, credentials, and base_url configuration for Tool Groups such as web_search and web_fetch.

ColumnDescription
idPK
account_idFK -> accounts (required for org scope; NULL for platform scope)
scopeorg / platform
group_nameTool Group name (tool name as seen by LLM, e.g., web_search)
provider_nameProvider name (internal tool name, e.g., web_search.tavily)
is_activeActivation status (at most one active per scope + group)
secret_idFK -> secrets (API Key, stored encrypted)
key_prefixKey prefix (for Console display)
base_urlCustom endpoint (SearXNG / self-hosted Firecrawl, etc.)
config_jsonNon-sensitive parameters (JSONB)
created_atCreation time
updated_atUpdate time

Resolution Chain:

  • Org scope active provider prioritized.
  • Falls back to platform scope active provider if no org configuration exists.

6. Personas

6.1 personas (Persona Definitions)

ColumnDescription
idPK
account_idFK -> accounts
persona_keyPersona identifier
versionVersion
display_nameDisplay name
descriptionDescription
prompt_mdSystem prompt
tool_allowlistAllowed tools list
tool_denylistForbidden tools list
budgets_jsonRuntime budgets such as temperature, output cap, and tool budgets
modelOptional model selector, stored as provider_name^model_name or a bare model
reasoning_modeReasoning mode
stream_thinkingWhether to emit message.delta with channel: thinking to clients (default true; omitting the key in persona YAML means true)
prompt_cache_controlPrompt cache policy
preferred_credentialFallback provider name when model is empty

Persona now absorbs the executable configuration that previously lived in Agent Configs and Prompt Templates, so there is no separate configuration table layer anymore.

7. Billing and Quotas

7.1 plans (Subscription Plans)

ColumnDescription
idPK
namePlan identifier
display_nameDisplay name

7.2 subscriptions (Subscription Relations)

ColumnDescription
idPK
account_idFK -> accounts
plan_idFK -> plans
statusStatus
current_period_startCurrent period start
current_period_endCurrent period end
cancelled_atCancellation time

7.3 plan_entitlements (Plan Feature Quotas)

ColumnDescription
idPK
plan_idFK -> plans
keyFeature key
valueQuota value
value_typeValue type

7.4 org_entitlement_overrides (Organization-level Overrides)

ColumnDescription
idPK
account_idFK -> accounts
keyFeature key
valueOverride value
reasonReason
expires_atExpiration time

7.5 credits / credit_transactions (Credits System)

TableKey Columns
creditsaccount_id, amount, balance
credit_transactionscredits_id, amount, type

7.6 usage_records (Usage Records)

Cached columns: input_tokens, output_tokens, cache_hit_rate.

8. Social and Sharing

TableDescription
thread_starsStars (thread_id + user_id)
thread_sharesShares (shared_by_user_id, recipient_user_id)
thread_reportsReports (reason, status)

9. Infrastructure

9.1 jobs (Background Task Queue)

Task queue implemented using a PostgreSQL table and Advisory Locks.

ColumnDescription
idPK
job_typeType (run.execute / webhook.deliver / email.send)
payload_jsonJSONB payload (cross-language protocol, must be versioned)
statusStatus
available_atAvailable time
leased_untilLease expiration
attemptsRetry attempts
worker_tagsWorker capability tags

9.2 worker_registrations (Worker Registrations)

ColumnDescription
idPK
nameWorker name
capabilities_jsonCapabilities set
heartbeat_atHeartbeat time

9.3 webhook_endpoints (Webhooks)

ColumnDescription
idPK
urlCallback URL
eventsArray of subscribed event types
activeActive status

9.4 api_keys (API Keys)

ColumnDescription
idPK
account_idFK -> accounts
key_prefixKey prefix
last_used_atLast used time

10. Authentication and Security

TableDescription
user_credentialsLogin credentials (login, password_hash)
refresh_tokensJWT refresh tokens (user_id, token, revoked_at)
email_verification_tokensEmail verification
email_otp_tokensOTP (email, code, expires_at)
rbac_rolesRole definitions (permissions_json)

11. Notifications and Auditing

TableDescription
notificationsUser notifications (type, title, body, read_at)
notification_broadcastsPlatform broadcasts (soft delete)
audit_logsAudit logs (user_id, action, resource_type, ip_address, user_agent)

12. MCP and External Integrations

12.1 mcp_configs (MCP Server Configurations)

ColumnDescription
idPK
account_idFK -> accounts
nameServer name
urlConnection URL
env_jsonEnvironment variables
tools_jsonTool definitions

12.2 asr_credentials (Speech-to-Text Credentials)

Structure similar to provider secret storage, managed independently.

13. Miscellaneous

TableDescription
user_memory_snapshotsUser memory snapshots (account_id, data_json, hits_json), interfaces with OpenViking
platform_settingsGlobal platform configuration (key-value JSONB)
feature_flagsFeature flags
redemption_codesRedemption codes (value, usage_count, expires_at)
invite_codesInvite codes

14. Architecture Decision Records

  • Storage Engine: PostgreSQL (sole production backend).
  • Encryption: AES-256-GCM (ARKLOOP_ENCRYPTION_KEY), used for LLM provider secrets, asr_credentials, and secrets.
  • Partitioning: run_events partitioned by month (created_at), with automatic cleanup of expired partitions.
  • Soft Deletion: threads, notification_broadcasts, and projects use deleted_at.
  • UUID: Primary keys use UUID (pgcrypto extension).
  • Task Queue: PostgreSQL table + Advisory Lock (no dependency on external MQ).
  • Real-time Push: PostgreSQL LISTEN/NOTIFY -> SSE.
  • Credential Scopes: LLM providers support both platform-level (account_id is NULL) and org-level scopes.

On this page