Platform Architecture

Neostra's microservices architecture, service interactions, data flows, and database strategy.

Service Overview

Neostra is composed of independent microservices, each owning its domain. Services communicate via REST APIs and Google Cloud Pub/Sub for event-driven processing.

| Service | Port | Language | Database | Purpose |
|---|---|---|---|---|
| neostra-core | 9000 | Java/Spring Boot | MongoDB | DSAR, workflows, assessments, privacy centers, breach management |
| astra-core | 9010 | Java/Spring Boot | MongoDB | Tenant provisioning, user management, RBAC, subscriptions |
| cpmp-api-service | 9001 | Java/Spring Boot | PostgreSQL (read) | Consent REST API, publishes to Pub/Sub |
| cpmp-log-receipt-service | 9002 | Java/Spring Boot | PostgreSQL (write) | Pub/Sub consumer, writes consent ledger |
| cpmp-reporting-service | 9003 | Java/Spring Boot | PostgreSQL | Analytics, exports, preference center |
| cpmp-crawler-service | 9004 | Java/Spring Boot | MongoDB | Selenium-based cookie and script scanning |
| data-discovery-boot | 8080 | Java/Spring Boot | PostgreSQL | Scan orchestration, integration management |
| data-discovery-scanner | Worker | Python 3 | PostgreSQL | PII detection using Presidio + regex |
| cpmp-modal | CDN | Svelte 5/TypeScript | - | Cookie consent banner widget |

Database Strategy

Neostra uses a polyglot persistence approach, pairing each service with MongoDB or PostgreSQL as listed in the Service Overview.

MongoDB is used by neostra-core, astra-core, and cpmp-crawler-service for flexible document storage.

Key collections include: tenants, users, roles, permissions, subject-requests, workflow-instances, assessments, assessment-templates, cookies, collection-points, privacy-centers, integrations, audit-records, and 30+ more.

All documents inherit from BaseDocument with fields: id, createdAt, updatedAt, version (optimistic locking). Multi-tenant isolation is enforced via indexed tenantId on every document.

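The consent management system uses an event-driven architecture for reliable, auditable consent processing: cpmp-api-service accepts consent requests over REST and publishes them to Pub/Sub, and cpmp-log-receipt-service consumes those events and writes them to the consent ledger.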

The consent ledger maintains integrity using SHA-256 hash chains. Each entry references the previous entry's hash, so altering any earlier entry invalidates every hash after it. The result is an immutable, tamper-evident audit trail.
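A hash chain of this kind can be sketched in a few lines. The entry layout (`payload`, `prev_hash`, `hash`), the all-zero genesis hash, and the JSON canonicalization below are assumptions for illustration. The actual ledger schema in cpmp-log-receipt-service is not documented here.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for the first entry's predecessor


def entry_hash(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(ledger: list[dict], payload: dict) -> dict:
    """Append a new entry that links to the hash of the current last entry."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS_HASH
    entry = {
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": entry_hash(prev_hash, payload),
    }
    ledger.append(entry)
    return entry


def verify_chain(ledger: list[dict]) -> bool:
    """Recompute every hash in order; any edited entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in ledger:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Changing the payload of any stored entry causes `verify_chain` to return `False`, because the recomputed hash no longer matches the stored one.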

Data Discovery Pipeline

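Data discovery uses a two-tier architecture separating orchestration from scanning: data-discovery-boot (Java/Spring Boot) handles scan orchestration and integration management, while data-discovery-scanner (a Python 3 worker) performs PII detection.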

The Python scanner supports multiple data sources: PostgreSQL, MySQL, MongoDB, AWS S3, AWS DynamoDB. PII detection identifies: emails, phone numbers, Aadhaar numbers, PAN cards, IP addresses, UPI IDs, and more.
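The regex side of PII detection can be sketched as follows. The patterns are simplified assumptions, not the scanner's actual rules. They skip Presidio entirely, do not validate Aadhaar checksums or IPv4 octet ranges, and omit UPI IDs because a naive pattern would overlap with email addresses.

```python
import re

# Simplified, illustrative patterns; real detection rules are likely stricter.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PAN": re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b"),            # 5 letters, 4 digits, 1 letter
    "AADHAAR": re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}\b"),    # 12 digits, optional separators
    "IPV4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),         # octet range not checked
}


def detect_pii(text: str) -> list[tuple[str, str, int]]:
    """Return (label, matched_text, start_offset) tuples ordered by position."""
    findings = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((label, match.group(), match.start()))
    return sorted(findings, key=lambda finding: finding[2])


if __name__ == "__main__":
    sample = "Reach asha@example.com, PAN ABCDE1234F, Aadhaar 1234 5678 9012, IP 10.0.0.1"
    for label, value, offset in detect_pii(sample):
        print(f"{label:8} {value!r} at offset {offset}")
```

A table-driven design like this makes it straightforward to add further entity types by registering another named pattern.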

DSAR Workflow

Data Subject Access Requests (DSARs) flow through a configurable workflow engine in neostra-core.

Cross-Service Communication

Services communicate through REST calls and Pub/Sub events. Internal REST calls are authenticated as described here.

Internal service calls use private key authentication. Each service holds a secret key configured via environment variables (CONSENT_API_PRIVATE_KEY, CRAWLER_SERVICE_PRIVATE_KEY, discProxyRequestPrivateKey). The calling service includes this key in request headers for verification.
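The header check described above might look like the following sketch. The environment variable name `CONSENT_API_PRIVATE_KEY` comes from the text. The header name `X-Private-Key` and both function names are hypothetical, since the actual header is not specified.

```python
import hmac
import os
from typing import Mapping

PRIVATE_KEY_HEADER = "X-Private-Key"  # hypothetical header name


def build_internal_headers(env_var: str, environ: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Caller side: attach the shared key read from the environment."""
    return {PRIVATE_KEY_HEADER: environ[env_var]}


def is_authorized(
    headers: Mapping[str, str],
    env_var: str,
    environ: Mapping[str, str] = os.environ,
) -> bool:
    """Receiver side: compare the presented key with the configured one."""
    expected = environ.get(env_var)
    presented = headers.get(PRIVATE_KEY_HEADER)
    if not expected or not presented:
        return False  # reject when the key is unset or the header is missing
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

Passing `environ` explicitly keeps the functions easy to test. In a running service the default `os.environ` would supply the configured key.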

Infrastructure