Subprocessor Register

Last updated: June 22, 2026
Register SR-3.0

This register inventories every third-party service that Context Pilot can communicate with. Context Pilot has no mandatory third-party dependencies beyond a single LLM provider. All other integrations are opt-in and require explicit operator configuration.

1. LLM Inference Providers

At least one LLM provider must be configured for AI inference capabilities. The operator selects their provider and supplies their own API key. Context Pilot supports the following providers. Only the configured provider receives data.

| Provider | Endpoint | Data Categories Transmitted | Opt-In Mechanism |
|---|---|---|---|
| Anthropic | api.anthropic.com (Direct API) | System prompt, conversation messages, code context from open files, tool definitions | ANTHROPIC_API_KEY |
| Anthropic (OAuth) | api.claude.ai (Claude Code OAuth) | System prompt, conversation messages, code context from open files, tool definitions | Claude Code CLI credentials |
| OpenAI | api.openai.com (Direct API) | System prompt, conversation messages, code context from open files, tool definitions | OPENAI_API_KEY |
| Google AI | generativelanguage.googleapis.com (Generative Language API) | System prompt, conversation messages, code context from open files, tool definitions | GOOGLE_AI_API_KEY |
| Mistral AI | api.mistral.ai (Direct API) | System prompt, conversation messages, code context from open files, tool definitions | MISTRAL_API_KEY |
| Groq | api.groq.com (Direct API) | Short prompts for auxiliary inference (soul journal) | GROQ_API_KEY |
| MiniMax | api.minimaxi.com (Direct API) | System prompt, conversation messages, code context from open files, tool definitions | MINIMAX_API_KEY |
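The opt-in mechanism in the table above can be sketched as a lookup from environment variable to provider. This is a minimal illustration, not Context Pilot's actual implementation: it assumes keys are read from environment variables, the function name `configured_providers` is hypothetical, and the Anthropic OAuth path is omitted because it relies on Claude Code CLI credentials rather than a key.

```python
import os

# Environment variable -> (provider, endpoint), copied from the table above.
PROVIDER_KEYS = {
    "ANTHROPIC_API_KEY": ("Anthropic", "api.anthropic.com"),
    "OPENAI_API_KEY": ("OpenAI", "api.openai.com"),
    "GOOGLE_AI_API_KEY": ("Google AI", "generativelanguage.googleapis.com"),
    "MISTRAL_API_KEY": ("Mistral AI", "api.mistral.ai"),
    "GROQ_API_KEY": ("Groq", "api.groq.com"),
    "MINIMAX_API_KEY": ("MiniMax", "api.minimaxi.com"),
}


def configured_providers(env=None):
    """Return (provider, endpoint) pairs whose API key is set and non-empty.

    Hypothetical helper: an unset or blank key means the provider is not
    configured and therefore would receive no data.
    """
    env = os.environ if env is None else env
    return [
        target
        for var, target in PROVIDER_KEYS.items()
        if env.get(var, "").strip()
    ]
```

Calling `configured_providers()` with no argument inspects the current process environment, which gives an operator a quick way to list the endpoints that could receive data before starting a session.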

2. Auxiliary Services

The following services provide optional capabilities. Each requires explicit operator configuration. None are active by default.

| Service | Endpoint | Purpose | Data Categories Transmitted | Opt-In Mechanism |
|---|---|---|---|---|
| Voyage AI | api.voyageai.com (REST API) | Text embedding generation for hybrid semantic search (voyage-code-3 model) | Code and text chunks from indexed project files | VOYAGE_API_KEY |
| Brave Search | api.search.brave.com (REST API) | Web search (result snippets and deep content extraction) | Search query strings | BRAVE_API_KEY |
| Firecrawl | api.firecrawl.dev (REST API) | Web page scraping, crawling, and structured content extraction | Target URLs for scraping | FIRECRAWL_API_KEY |
| Datalab (Surya) | www.datalab.to (REST API) | OCR and document-to-text conversion | Document files (PDF, images) submitted for text extraction | DATALAB_API_KEY |
| GitHub | api.github.com (REST API via gh CLI) | Repository operations (issues, pull requests, releases) | Git operations scoped to operator's authenticated repositories | GITHUB_TOKEN |

3. Local Services

The following services run locally on the operator's workstation. No data leaves the machine through these services.

| Service | Binding | Purpose | External Traffic |
|---|---|---|---|
| Meilisearch | 127.0.0.1 (localhost), loopback only | Full-text and semantic project search indexing | None |
| Console Server | Unix domain socket (filesystem) | Child process lifecycle management (build commands, shells) | None |
| Orchestrator | 127.0.0.1:7878, loopback only | Multi-agent fleet management, SSE streaming, REST API | None |
| SQLite (Entities) | Embedded library, in-process | Structured entity and domain knowledge storage | None |
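An operator who wants to confirm the loopback claims for the TCP-bound services above could check a configured binding with the standard library. The sketch below is illustrative only. The helper name `is_loopback_binding` is hypothetical, and it validates an address string rather than inspecting a running process.

```python
import ipaddress


def is_loopback_binding(binding):
    """Return True if a 'host' or 'host:port' binding is a loopback IP address."""
    # Strip a trailing ':port' only when there is exactly one colon, so that
    # bare IPv6 literals such as '::1' are passed through unchanged.
    host = binding.rsplit(":", 1)[0] if binding.count(":") == 1 else binding
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a hostname): treat as unverified.
        return False
```

For the bindings listed in this section, `is_loopback_binding("127.0.0.1:7878")` is true, while a wildcard binding such as `0.0.0.0:7878` would be rejected.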

4. Subprocessor Compliance Matrix

Detailed compliance posture for each external subprocessor, based on independent research conducted June 2026. Operators subject to regulatory requirements should verify all claims directly with each provider.

| Provider | Trust Center | SOC 2 | ISO 27001 | DPA | API Data Retention | Training on API Data | Data Location | Risk |
|---|---|---|---|---|---|---|---|---|
| Anthropic | trust.anthropic.com | Type II | 27001 | Auto-included | 7 days (ZDR avail.) | No | US (EU via Bedrock/Vertex) | Low |
| OpenAI | trust.openai.com | Type 2 | 27001 | Public DPA | 30 days (ZDR avail.) | No | US (EU avail. Enterprise) | Low |
| Google AI | Cloud Compliance | Type 2 | 27001 | Cloud DPA | 55 days (paid API) | No (paid) | Global (EU via Vertex) | Low |
| Mistral AI | trust.mistral.ai | Type II | Aligned | Public DPA | 30 days | No | EU + US (since Feb 2025) | Medium |
| Groq | trust.groq.com | Type II | None | Public DPA | Transient (no storage) | N/A | US only | Low |
| MiniMax | None | None | None | Not public | Not documented | Unclear | China | High |
| Brave Search | API Security | Type II | Aligned | Public DPA | 90 days (query logs) | N/A | US | Low |
| Firecrawl | Enterprise | Type II | None | Available (not public) | Transient (ZDR default) | N/A | US (self-host avail.) | Low |
| Voyage AI | None | None | None | Not public | Not documented | Default YES | US | High |
| Datalab | SOC 2 badge (no portal) | Type II | None | Custom (VPC tier) | Transient (OCR) | Unclear | US (self-host avail.) | Medium |
| GitHub | trust.github.com | Type 2 | 27001 | Enterprise Agreement | Per service (90d audit logs) | No (gh CLI) | US (EU preview) | Low |

5. Detailed Compliance Notes

Material compliance observations for each subprocessor, including certifications, jurisdictional considerations, and known risks. Research conducted June 2026.

Anthropic (Low Risk)

SOC 2 Type II + ISO 27001:2022 + ISO/IEC 42001:2023 (AI Management System). API data retention reduced to 7 days (Sept 2025). Zero Data Retention available for Enterprise. DPA with SCCs auto-included in Commercial Terms. HIPAA BAA available. EU data residency for Team/Enterprise (Aug 2025). FedRAMP in progress. API data never used for training.

OpenAI (Low Risk)

SOC 2 Type 2 + ISO 27001/27017/27018/27701 + CSA STAR + PCI-DSS. Public DPA (effective Dec 2025) and public subprocessor list (updated Feb 2026). 30-day default retention; ZDR approval-gated. EU data residency available for Enterprise/Edu/API Projects (Nov 2025). HIPAA BAA available. AES-256 at rest. EU-US DPF certified.

Google AI — Gemini (Low Risk)

Context Pilot uses the paid Gemini API tier (not free AI Studio). Inherits Google Cloud certifications: SOC 1/2/3, ISO 27001/27017/27018/27701, ISO 42001 (AI Management), FedRAMP High, PCI-DSS v4.0. 55-day retention for abuse monitoring only. Section 17 "Training Restriction" contractually prohibits training. Cloud DPA auto-incorporated for EEA. Note: Free AI Studio tier has no DPA, is prohibited for EEA/UK users, and uses data for training — operators must not use the free tier.

Mistral AI (Medium Risk)

French company (Paris), directly subject to GDPR. SOC 2 Type I+II. ISO 27001/27701 framework alignment (not explicitly certified). Public DPA with SCCs. 30-day API retention, no training. Risks: (1) US processing added Feb 2025 creates CLOUD Act exposure despite EU headquarters; (2) CNIL complaint (Feb 2025) regarding free-tier opt-out difficulty — decision pending. Self-deployment option eliminates both risks.

Groq (Low Risk)

Inference platform (LPU hardware) running third-party open-source models — not a model developer. SOC 2 Type II. Data transient by design (not stored beyond request lifecycle). Formal DPA with SCCs covering GDPR/CCPA/PDPL. US-only (GCP). HIPAA explicitly not available. Training question is N/A since Groq does not train models.

MiniMax (High Risk)

Chinese company (Shanghai/Beijing) subject to PIPL, Cybersecurity Law, and Data Security Law. No trust center, no SOC 2, no ISO 27001, no public DPA, no public subprocessor list. Data retention and training policies not clearly documented. Key risks: Chinese jurisdiction enables government data access under domestic law; no Western-standard certifications; limited public transparency; PIPL cross-border transfer restrictions. Operators handling regulated data should avoid this provider or evaluate jurisdictional exposure carefully.

Brave Search (Low Risk)

SOC 2 Type II (Oct 2025). External pentests (HackerOne). ISO 27001/27701 framework aligned but not yet certified. Public DPA with SCCs — note: DPA does not cover Search Query Data (Brave's legal position: API queries are machine-to-machine, not personal data). 90-day query log retention; ZDR available for Enterprise. Independent 40B+ page search index. TEE for AI features.

Firecrawl (Low Risk)

SOC 2 Type II. Fundamentally transient processing: web pages processed in memory and immediately deleted — zero data retention by default. DPA available but not publicly linked. Full self-host option for air-gapped deployments. 99.9% SLA on Enterprise tier. Context Pilot sends only target URLs; scraped content returned and discarded by Firecrawl.

Voyage AI (High Risk)

No trust center, no SOC 2, no ISO 27001, no public DPA. Critical: Privacy policy grants Voyage AI a "worldwide, irrevocable, perpetual, royalty-free license" to use customer content for training by default. Opt-out requires explicit request. No public subprocessor list. Limited compliance maturity. Mitigating factor: Context Pilot sends only code chunks and log entries for embedding generation (semantic search indexing), not conversation content or prompts. Operators concerned about code exposure should consider disabling the Voyage AI embedder (VOYAGE_API_KEY is optional — keyword search works without it).

Datalab / Surya (Medium Risk)

SOC 2 Type II for Managed Cloud tier. Open-source models (Surya 56.6k stars, Marker 11.1k stars on GitHub) — fully auditable code. Three deployment tiers: Managed Cloud (SOC 2, pay-as-you-go), VPC (AWS/GCP/Azure, custom BAA/DPA), and On-prem/Air-gapped (zero internet). Trusted by Anthropic, Siemens Healthineers, Stanford, MIT. Context Pilot sends documents temporarily for OCR processing. Self-host option eliminates external data transfer entirely.

GitHub (Low Risk)

Microsoft subsidiary. SOC 1/2 Type 2, SOC 3, ISO 27001, CSA CAIQ, FedRAMP authorized. DPA included in Enterprise Customer Agreement; Microsoft DPA apparatus inherited. EU data residency in preview. Context Pilot uses the gh CLI for repository operations (issues, PRs) — not GitHub Copilot. Private repos not used for training without consent. EU-US DPF certified (via Microsoft).

6. Subprocessor Change Log

Material changes to the subprocessor register are documented below. Enterprise customers evaluating Context Pilot for procurement may reference this log to track third-party dependency changes across versions.

| Date | Change Type | Service | Details |
|---|---|---|---|
| June 2026 | Added | MiniMax | Added as optional LLM provider. Note: Chinese jurisdiction; see Compliance Matrix for risk assessment |
| June 2026 | Enhanced | All providers | Independent compliance research conducted: trust centers, certifications, DPA availability, data retention, training policies documented per subprocessor |
| June 2026 | Added | Groq | Added as optional LLM provider for auxiliary inference workloads |
| May 2026 | Added | Voyage AI | Added for text embedding generation (hybrid search). Optional; keyword search works without it |
| May 2026 | Added | Datalab (Surya) | Added for OCR and document conversion. Optional service |
| April 2026 | Initial | All others | Initial subprocessor register published with Anthropic, OpenAI, Google AI, Mistral, Brave, Firecrawl, GitHub, Meilisearch |

7. Important Notice

Operator Responsibility

The operator is solely responsible for evaluating the data handling practices of their selected third-party providers. Context Pilot facilitates connections to these services but does not act as an intermediary, does not negotiate data processing terms on behalf of operators, and does not have access to data transmitted between the operator's workstation and the provider's endpoint.

Operators subject to regulatory requirements (GDPR, HIPAA, CCPA, etc.) should independently verify that their chosen providers offer appropriate data processing agreements, data residency options, and contractual safeguards for their jurisdiction and use case.

8. Vendor Risk Assessment Criteria

The following criteria are evaluated when considering new third-party service integrations for Context Pilot. This framework ensures that any addition to the subprocessor register meets minimum security and privacy standards.

| Criterion | Requirement | Weight |
|---|---|---|
| API-key authentication | Provider must support API key authentication (no OAuth flows that require Context Pilot to hold session tokens on behalf of users) | Mandatory |
| TLS encryption | Provider must serve all API endpoints over HTTPS with TLS 1.2 or higher | Mandatory |
| Published privacy policy | Provider must publish a clear privacy policy describing data handling, retention, and processing purposes | Mandatory |
| Opt-in activation | Integration must require explicit operator configuration (API key provisioning). No provider may be active by default. | Mandatory |
| Data minimization | Only data strictly necessary for the service's function is transmitted (e.g., search queries for search, code context for inference) | Mandatory |
| No training on API data | Provider should not use API-submitted data for model training without explicit opt-in from the operator | Strong preference |
| DPA availability | Provider should offer a Data Processing Agreement for enterprise customers | Preferred |
| SOC 2 / ISO 27001 | Provider should hold recognized security certifications or demonstrate equivalent controls | Preferred |
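The criteria above can be read as a gate (mandatory items) plus a report (preferences). The following sketch shows one way to encode that reading. The criterion keys, the `Assessment` type, and the `assess` function are all hypothetical names chosen for illustration. The register does not say how non-mandatory criteria are weighed, so the sketch only lists which of them are unmet.

```python
from dataclasses import dataclass

# Hypothetical machine-readable keys for the criteria in the table above.
MANDATORY = (
    "api_key_auth",
    "tls_1_2_or_higher",
    "published_privacy_policy",
    "opt_in_activation",
    "data_minimization",
)
NON_MANDATORY = (
    "no_training_on_api_data",  # strong preference
    "dpa_available",            # preferred
    "soc2_or_iso27001",         # preferred
)


@dataclass
class Assessment:
    eligible: bool
    failed_mandatory: list
    unmet_preferences: list


def assess(candidate):
    """Evaluate a dict of criterion -> bool; missing criteria count as unmet."""
    failed = [c for c in MANDATORY if not candidate.get(c, False)]
    unmet = [c for c in NON_MANDATORY if not candidate.get(c, False)]
    return Assessment(
        eligible=not failed,
        failed_mandatory=failed,
        unmet_preferences=unmet,
    )
```

Under this reading, a provider that satisfies every mandatory criterion but holds no certifications and offers no public DPA is still eligible for the register, with its unmet preferences surfaced for the risk rating.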

9. Data Flow Summary

The following diagram summarizes the data flow architecture. All arrows represent operator-initiated, operator-configured connections. No connection exists that is not explicitly listed below.

Operator's Workstation (all processes local; no inbound connections)
    TUI Agent: Rust binary
    Orchestrator: :7878 (loopback)
    Web Frontend: React / Vite
    Meilisearch: localhost only
    Console Server: Unix socket
    SQLite Entities: embedded

Connection: TLS 1.2+, outbound only, operator-initiated

Configured API Providers (operator's API keys; only configured providers receive data)
    LLM Inference: Anthropic, OpenAI, Google AI, Mistral AI, Groq, MiniMax
    Auxiliary Services: Brave Search, Firecrawl, Voyage AI, Datalab, GitHub
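
The "TLS 1.2+" requirement on outbound connections in the summary above can be illustrated with a client-side TLS configuration. This is a sketch using Python's standard library. It does not describe how the Context Pilot binary configures its own connections, and the function name `outbound_tls_context` is hypothetical.

```python
import ssl


def outbound_tls_context():
    """Client-side context that verifies certificates and refuses TLS below 1.2."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Reject any handshake that would negotiate a protocol older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

A context built this way can be passed to standard-library clients, for example as the `context` argument of `http.client.HTTPSConnection`, so that a provider endpoint offering only older protocol versions fails at the handshake rather than receiving data.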