Skip to content

Architektur · KI-Plattform

Layer-Modell

┌─ User-Interfaces ─────────────────────────────────────────────────┐
│  ben-e-fit.ai (Hero)   chat.* (OWUI)   apps.* (Dify)              │
│  n8n.*   trino.*   catalog.*   minio.*   docs.*   grafana.*       │
└─ Cloudflare Tunnel · Edge ────────────────────────────────────────┘

┌─ Reverse Proxy / Tunnel ─────────────────────────────────────────┐
│  cloudflared (Tunnel UUID: bf6c481e-ef21-4278-8435-fbc327546335)  │
└──────────────────────────────────────────────────────────────────┘

┌─ Application Layer ─────────────────────────────────────────────┐
│  hub  hub-kg  impulse-suite  kpi-mining  mkdocs                  │
│  open-webui  open-webui-kg  dify-{api,worker,web}                │
│  n8n  n8n-kg  documenso  trino  openmetadata  airbyte            │
└──────────────────────────────────────────────────────────────────┘

┌─ Model Gateway ──────────────────────────────────────────────────┐
│  litellm (25 Modelle: Cloud + Local + NVIDIA NIM)                │
│    OpenRouter Auto  Claude 4.6  GPT-5  Gemini 2.5  DeepSeek      │
│    Llama 3.3 70b  Qwen 2.5 32b  Gemma 3 27b  Phi 4               │
│    bge-m3 / mxbai / nomic Embeddings                             │
└──────────────────────────────────────────────────────────────────┘

┌─ Inference / Compute ────────────────────────────────────────────┐
│  ollama (NVIDIA GB10 GPU)                                        │
└──────────────────────────────────────────────────────────────────┘

┌─ Data Layer ─────────────────────────────────────────────────────┐
│  Vector:    qdrant  weaviate  OWUI vector_db (ChromaDB)          │
│  Search:    opensearch (BM25 + Hybrid)                           │
│  Object:    minio (WORM, 6 Buckets)                              │
│  Relational: postgres × 7 (Dify, Langfuse, Keycloak, LiteLLM,   │
│              Airbyte, Documenso, OpenWebUI)                      │
│  KV:        redis × 2 (Dify, Plattform)                          │
│  OLAP:      clickhouse (Langfuse Traces)                         │
│  Federation: trino (cross-DB SQL)                                │
└──────────────────────────────────────────────────────────────────┘

┌─ Identity & Audit ───────────────────────────────────────────────┐
│  keycloak (Realms · Groups · SSO)                                │
│  langfuse (LLM-Observability · Audit-Trail)                      │
└──────────────────────────────────────────────────────────────────┘

┌─ Operations ────────────────────────────────────────────────────┐
│  prometheus  grafana  uptime-kuma  dozzle  gitops-watcher        │
│  restic-backup (Off-Site, verschlüsselt)                         │
└──────────────────────────────────────────────────────────────────┘

Datenfluss · LLM-Anfrage

User → chat.{tenant} (OWUI)
  → litellm:4000 (Gateway · Routing · Spend-Tracking)
    → Cloud-Modell (OpenRouter / Anthropic / OpenAI direkt)
    → ODER Local-Modell (ollama, lokales GB10-GPU)
  → Response zu OWUI
  → Langfuse-Trace (Audit · Cost · Quality)
  → Response zum User

RAG-Workflow

Datei-Upload → OWUI Knowledge
  → Chunking (~500 Tokens)
  → Embedding (bge-m3 lokal via Ollama)
  → ChromaDB (vector_db/)

Frage → Embedding der Frage
  → Vector-Search Top-K (Cosine-Similarity)
  → Kontext + Frage → LLM
  → Antwort mit Quellen-Zitation

Multi-Tenancy

ben-e-fit.ai (eigene Plattform):
  hub container → öffentliche Hero
  open-webui container → Mitarbeiter-Chat
  n8n container → Eigene Workflows

ki-guru.com (Kunde-Mandant):
  hub-kg container → Mandanten-Hero
  open-webui-kg container → Mandanten-Chat
  n8n-kg container → Mandanten-Workflows

Geteilt (single-instance):
  litellm, ollama, langfuse, keycloak (mit Realms),
  trino, openmetadata, minio (mit Bucket-Trennung)

Sicherheits-Ringe

  1. Edge — Cloudflare-Tunnel, kein offener Port
  2. DNS — Cloudflare-Authoritative, DNSSEC
  3. Network — Private rmki-edge Docker-Network, sysctl IPv6 disabled
  4. Auth — Keycloak SSO, OAuth2 / OIDC für alle Apps
  5. Application — Per-User-Sessions, Per-Group-Permissions
  6. Audit — Langfuse-Traces, Postgres-Logs, GitOps-Audit, MinIO-WORM

Engineering-Controls (Regulatorik)

13 Controls, mapped to GDPR / EU-AI-Act / 21 CFR Part 11 / ISO-27001:

# Control Tool Compliance
1 Network-Segmentation Docker rmki-edge ISO 27001 A.13
2 Edge-WAF cloudflared + Cloudflare-Rules OWASP
3 Identity Keycloak Realms + MFA GDPR Art. 32
4 Authorization OWUI Groups, Dify Workspaces ISO 27001 A.9
5 DLP / PII-Masking Presidio (LiteLLM-Pre) GDPR Art. 9
6 Guardrails NeMo · Halluzinations-Filter EU-AI-Act
7 Audit-Trail Langfuse + Postgres-Audit 21 CFR Part 11
8 Object-Lock MinIO WORM HGB §239, ALCOA+
9 Backup Restic Off-Site Encrypt DSGVO Art. 32
10 Vulnerability Scan Trivy (CI) NIS-2
11 Anomaly Detection CrowdSec ISO 27001 A.16
12 Documentation MkDocs Material (versioned) Art. 30 ROPA
13 Test-Automation Cucumber/BDD EU-AI-Act Hochrisiko