Architektur · KI-Plattform¶
Layer-Modell¶
┌─ User-Interfaces ─────────────────────────────────────────────────┐
│ ben-e-fit.ai (Hero) chat.* (OWUI) apps.* (Dify) │
│ n8n.* trino.* catalog.* minio.* docs.* grafana.* │
└─ Cloudflare Tunnel · Edge ────────────────────────────────────────┘
┌─ Reverse Proxy / Tunnel ─────────────────────────────────────────┐
│ cloudflared (Tunnel UUID: bf6c481e-ef21-4278-8435-fbc327546335) │
└──────────────────────────────────────────────────────────────────┘
┌─ Application Layer ─────────────────────────────────────────────┐
│ hub hub-kg impulse-suite kpi-mining mkdocs │
│ open-webui open-webui-kg dify-{api,worker,web} │
│ n8n n8n-kg documenso trino openmetadata airbyte │
└──────────────────────────────────────────────────────────────────┘
┌─ Model Gateway ──────────────────────────────────────────────────┐
│ litellm (25 Modelle: Cloud + Local + NVIDIA NIM) │
│ OpenRouter Auto Claude 4.6 GPT-5 Gemini 2.5 DeepSeek │
│ Llama 3.3 70b Qwen 2.5 32b Gemma 3 27b Phi 4 │
│ bge-m3 / mxbai / nomic Embeddings │
└──────────────────────────────────────────────────────────────────┘
┌─ Inference / Compute ────────────────────────────────────────────┐
│ ollama (NVIDIA GB10 GPU) │
└──────────────────────────────────────────────────────────────────┘
┌─ Data Layer ─────────────────────────────────────────────────────┐
│ Vector: qdrant weaviate OWUI vector_db (ChromaDB) │
│ Search: opensearch (BM25 + Hybrid) │
│ Object: minio (WORM, 6 Buckets) │
│ Relational: postgres × 7 (Dify, Langfuse, Keycloak, LiteLLM, │
│ Airbyte, Documenso, OpenWebUI) │
│ KV: redis × 2 (Dify, Plattform) │
│ OLAP: clickhouse (Langfuse Traces) │
│ Federation: trino (cross-DB SQL) │
└──────────────────────────────────────────────────────────────────┘
┌─ Identity & Audit ───────────────────────────────────────────────┐
│ keycloak (Realms · Groups · SSO) │
│ langfuse (LLM-Observability · Audit-Trail) │
└──────────────────────────────────────────────────────────────────┘
┌─ Operations ────────────────────────────────────────────────────┐
│ prometheus grafana uptime-kuma dozzle gitops-watcher │
│ restic-backup (Off-Site, verschlüsselt) │
└──────────────────────────────────────────────────────────────────┘
Datenfluss · LLM-Anfrage¶
User → chat.{tenant} (OWUI)
→ litellm:4000 (Gateway · Routing · Spend-Tracking)
→ Cloud-Modell (OpenRouter / Anthropic / OpenAI direkt)
→ ODER Local-Modell (ollama, lokales GB10-GPU)
→ Response zu OWUI
→ Langfuse-Trace (Audit · Cost · Quality)
→ Response zum User
RAG-Workflow¶
Datei-Upload → OWUI Knowledge
→ Chunking (~500 Tokens)
→ Embedding (bge-m3 lokal via Ollama)
→ ChromaDB (vector_db/)
Frage → Embedding der Frage
→ Vector-Search Top-K (Cosine-Similarity)
→ Kontext + Frage → LLM
→ Antwort mit Quellen-Zitation
Multi-Tenancy¶
ben-e-fit.ai (eigene Plattform):
hub container → öffentliche Hero
open-webui container → Mitarbeiter-Chat
n8n container → Eigene Workflows
ki-guru.com (Kunde-Mandant):
hub-kg container → Mandanten-Hero
open-webui-kg container → Mandanten-Chat
n8n-kg container → Mandanten-Workflows
Geteilt (single-instance):
litellm, ollama, langfuse, keycloak (mit Realms),
trino, openmetadata, minio (mit Bucket-Trennung)
Sicherheits-Ringe¶
- Edge — Cloudflare-Tunnel, kein offener Port
- DNS — Cloudflare-Authoritative, DNSSEC
- Network — Private rmki-edge Docker-Network, sysctl IPv6 disabled
- Auth — Keycloak SSO, OAuth2 / OIDC für alle Apps
- Application — Per-User-Sessions, Per-Group-Permissions
- Audit — Langfuse-Traces, Postgres-Logs, GitOps-Audit, MinIO-WORM
Engineering-Controls (Regulatorik)¶
13 Controls, mapped to GDPR / EU-AI-Act / 21 CFR Part 11 / ISO-27001:
| # | Control | Tool | Compliance |
|---|---|---|---|
| 1 | Network-Segmentation | Docker rmki-edge | ISO 27001 A.13 |
| 2 | Edge-WAF | cloudflared + Cloudflare-Rules | OWASP |
| 3 | Identity | Keycloak Realms + MFA | GDPR Art. 32 |
| 4 | Authorization | OWUI Groups, Dify Workspaces | ISO 27001 A.9 |
| 5 | DLP / PII-Masking | Presidio (LiteLLM-Pre) | GDPR Art. 9 |
| 6 | Guardrails | NeMo · Halluzinations-Filter | EU-AI-Act |
| 7 | Audit-Trail | Langfuse + Postgres-Audit | 21 CFR Part 11 |
| 8 | Object-Lock | MinIO WORM | HGB §239, ALCOA+ |
| 9 | Backup | Restic Off-Site Encrypt | DSGVO Art. 32 |
| 10 | Vulnerability Scan | Trivy (CI) | NIS-2 |
| 11 | Anomaly Detection | CrowdSec | ISO 27001 A.16 |
| 12 | Documentation | MkDocs Material (versioned) | Art. 30 ROPA |
| 13 | Test-Automation | Cucumber/BDD | EU-AI-Act Hochrisiko |