R-Dash Architecture Freeze (Checkpoint 2)
Decision
R-Dash architecture is locked as follows.
- Topology: modular monolith (3 processes / 1 codebase).
- Backend: Python 3.13 + FastAPI 0.115 + Pydantic 2 + SQLAlchemy 2.0 (typed) + Alembic.
- Frontend: React 19 + Vite 6 + TanStack (Query v5 + Router v1) + Zustand + Tailwind 4 + shadcn/Radix.
- Charts: ECharts 5 + visx + Observable Plot.
- Editor: Monaco.
- Queue: RQ 1.16 + rq-scheduler + Redis 7.4 (Valkey fallback if Redis licensing becomes a blocker).
- Metadata DB: Postgres 17.
- Semantic layer: Cube.js OSS, self-hosted.
- AI runtime: Claude API + LangGraph + RestrictedPython sandbox.
- Auth: Azure AD OIDC primary + SAML2 fallback via python-saml2; JWT with 15 min access and 7 d refresh tokens; MFA handled upstream; break-glass via AAD emergency access (no in-app local auth).
- Observability: OpenTelemetry → Prometheus + Grafana + Loki + Tempo.
- Deploy: Docker Compose for the POC → Kubernetes 1.32 for prod.
- Tenancy: single-tenant, workspace-partitioned (no tenant_id).
- Data model: 30 tables across 12 modules.
- RLS: defense-in-depth (Cube primary + Postgres RLS secondary, paired contract tests CI-blocking).
- API: REST + OpenAPI 3.1 + WebSocket, RFC 7807 errors, typed codes RDASH-{MODULE}-{ERROR}.
- Canonical SoT artifacts: contracts/openapi.yaml + schema.sql + auth-model.md + rls-catalog.yaml.
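To make the error contract concrete, here is a minimal sketch of an RFC 7807 problem-details body carrying a typed RDASH-{MODULE}-{ERROR} code. The record does not say where the code lives in the payload, so the `code` extension member, the `Problem` and `make_problem` names, the segment character set, and the placeholder `type` URI are all assumptions for illustration.

```python
import json
import re
from dataclasses import asdict, dataclass

# Assumed shape of the RDASH-{MODULE}-{ERROR} convention; the allowed
# characters per segment are not specified in the decision record.
CODE_PATTERN = re.compile(r"^RDASH-[A-Z0-9]+-[A-Z0-9_]+$")


@dataclass(frozen=True)
class Problem:
    """RFC 7807 problem details plus a typed code as an extension member."""

    type: str      # URI reference identifying the problem type
    title: str     # short, human-readable summary
    status: int    # HTTP status code
    detail: str    # explanation specific to this occurrence
    instance: str  # URI reference for this occurrence
    code: str      # hypothetical extension member: RDASH-{MODULE}-{ERROR}

    def __post_init__(self) -> None:
        # Reject codes that do not follow the typed-code convention.
        if not CODE_PATTERN.match(self.code):
            raise ValueError(f"malformed error code: {self.code!r}")

    def to_json(self) -> str:
        return json.dumps(asdict(self))


def make_problem(module: str, error: str, status: int, title: str,
                 detail: str, instance: str) -> Problem:
    """Build a Problem whose code is derived from module and error names."""
    code = f"RDASH-{module.upper()}-{error.upper()}"
    return Problem(
        type=f"https://errors.example.invalid/{code}",  # placeholder URI
        title=title,
        status=status,
        detail=detail,
        instance=instance,
        code=code,
    )


problem = make_problem(
    "query", "timeout", 504, "Query timed out",
    "The query exceeded its execution limit.", "/api/queries/123/run",
)
print(problem.to_json())
```

A body like this would be served with the `application/problem+json` media type that RFC 7807 defines. Validating the code at construction time keeps malformed codes out of responses.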
Rationale
All choices are latest-stable and OSS-only (zero paid SaaS in v1), right-sized for a 3-engineer team, a 50→1000 user target, a 6-month v1 deadline, and ENTERPRISE scope. Monolith-first matches architecture-taste rule #1: microservices would be premature optimization at this scale with this team. FastAPI over NestJS because Python keeps the Redash-port path open and brings the data-tooling ecosystem. Postgres over MySQL because of JSONB, tsvector, RLS, pgcrypto, and team skillset. Cube.js because it is the only mature OSS semantic layer with Snowflake + RLS + code-as-schema support that doesn’t require dbt in the critical path. Azure AD as sole IdP because Runwal’s IdP is Azure AD; a second IdP would create ambiguity and fail Law 2. The bleeding-edge risk of React 19 + Tailwind 4 is mitigated by a 1-week evaluation plus a fallback branch at React 18 + Tailwind 3. The Cube-first query path, per-DS RLS, paired contract tests, and a monthly pen-test together cover the #1 risk (cross-workspace data leak). Checkpoint 2 is self-approved per feedback_expert_recommendations.md, since AJ delegates tactical expert decisions.
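The idea behind a paired contract test can be sketched as follows: run the same fixture rows through both enforcement layers and fail if either layer leaks a foreign-workspace row or if the two layers disagree. This is only an illustration. The two filter functions are in-memory stand-ins rather than real Cube or Postgres calls, and the `workspace_id` column, the fixture data, and every function name are assumptions.

```python
from typing import Callable, Iterable

Row = dict
Filter = Callable[[Iterable[Row], str], list]

# Hypothetical fixture rows; "workspace_id" as the partition column is assumed.
ROWS = [
    {"id": 1, "workspace_id": "finance"},
    {"id": 2, "workspace_id": "sales"},
    {"id": 3, "workspace_id": "finance"},
]


def cube_filter(rows: Iterable[Row], workspace: str) -> list:
    """Stand-in for the primary (semantic-layer) restriction."""
    return [r for r in rows if r["workspace_id"] == workspace]


def postgres_rls_filter(rows: Iterable[Row], workspace: str) -> list:
    """Stand-in for the secondary (database-policy) restriction."""
    return [r for r in rows if r["workspace_id"] == workspace]


def leaky_filter(rows: Iterable[Row], workspace: str) -> list:
    """Deliberately broken layer that forgets to filter."""
    return list(rows)


def check_pair(primary: Filter, secondary: Filter,
               rows: Iterable[Row], workspace: str) -> list:
    """Return a list of contract violations; an empty list means the pair holds."""
    rows = list(rows)
    violations = []
    for name, layer in (("primary", primary), ("secondary", secondary)):
        leaked = [r["id"] for r in layer(rows, workspace)
                  if r["workspace_id"] != workspace]
        if leaked:
            violations.append(f"{name} leaked rows {leaked}")
    primary_ids = sorted(r["id"] for r in primary(rows, workspace))
    secondary_ids = sorted(r["id"] for r in secondary(rows, workspace))
    if primary_ids != secondary_ids:
        violations.append(f"layers disagree: {primary_ids} vs {secondary_ids}")
    return violations


print(check_pair(cube_filter, postgres_rls_filter, ROWS, "finance"))
print(check_pair(cube_filter, leaky_filter, ROWS, "finance"))
```

The second call shows why the pairing matters: a secondary layer that silently stops filtering is caught even while the primary layer still hides the leak from end users.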
Alternatives Rejected
- Next.js 15 App Router (frontend): rejected. SSR/ISR is irrelevant for an auth-walled internal admin app; App Router is still churning; Vite dev speed wins.
- NestJS (backend): rejected. End-to-end TS is nice, but it loses Redash-port ease and the Python data-tooling ecosystem (pandas/numpy/sqlparse/RestrictedPython), and the team’s Python skillset is stronger.
- Django 5 + DRF: rejected. Heavier framework, slower async story than FastAPI, worse OpenAPI story.
- Microservices topology: rejected. Violates architecture-taste rule #1 at this scale; a 3-engineer team cannot operate microservices; extract services later if load profiles diverge in year 2+.
- MySQL 8: rejected. Weaker JSON support (no JSONB equivalent), no native row-level security, and less analytic SQL than Postgres 17.
- Celery (instead of RQ): rejected. Overkill for a 200K queries/day ceiling; complex failure modes; RQ carries over directly from Redash.
- Malloy / dbt Metrics / custom-built semantic layer: rejected. Malloy is operationally immature; dbt Metrics forces dbt into the critical path; custom-built is a multi-engineer-year effort.
- OpenAI + AutoGen AI runtime: rejected. Breaks Runwal’s Claude API alignment; Project 120 uses Claude.
- Authentik / Keycloak IdP: rejected. Duplicates Azure AD; single source of truth wins.
- Next.js + tRPC end-to-end TS: rejected. The Python backend rules it out.
- Datadog / Honeycomb SaaS observability: rejected. Violates the zero-paid-SaaS v1 directive.
- Auth0 / Clerk SaaS auth: rejected. Violates zero-paid-SaaS and data-sovereignty; Azure AD handles it.
- Schema-per-workspace or DB-per-workspace tenancy: rejected. This is a single-tenant platform; workspace isolation doesn’t need physical separation; RLS suffices; physical separation adds migration and backup complexity.
- react-ace query editor (Redash choice): rejected. Monaco’s LSP story is the year-1 AI completion bet.
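As a quick sanity check on the 200K queries/day ceiling cited in the Celery entry above, the arithmetic below converts it to a per-second rate. The daily figure comes from this record. The burst multiplier is a hypothetical value chosen only to illustrate headroom reasoning, not a measured number.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

DAILY_CEILING = 200_000  # queries/day, from the decision record
PEAK_FACTOR = 10         # hypothetical burst multiplier (assumption)


def average_rate(queries_per_day: int) -> float:
    """Average queries per second if load were spread evenly over a day."""
    return queries_per_day / SECONDS_PER_DAY


avg = average_rate(DAILY_CEILING)
peak = avg * PEAK_FACTOR
print(f"average: {avg:.2f} jobs/s, assumed peak: {peak:.1f} jobs/s")
# prints "average: 2.31 jobs/s, assumed peak: 23.1 jobs/s"
```

Whether a given worker pool absorbs that rate depends on job duration and worker count, neither of which this record specifies.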
Outcome
Pending
Related
- r-dash-requirements-freeze-checkpoint-1
- Redash architecture
- Snowflake
- Cube.js
- Azure AD
- FastAPI
- React 19
- Postgres 17
- r-dash-azure-ad-deferred-local-auth-v1-stack
- r-dash-modular-monolith-3-process-topology
- r-dash-frozen-stack-2025
- expert-recommendations-working-principle-aj
- modular-monolith-correct-for-small-team-single-tenant-intern
- r-same-azure-ad-sso-deferred-local-auth-v1
- modular-monolith-3-process-correct-for-3eng-1000user
- r-taskflow-path-b-clean-slate-web-native-pwa-rebuild