Initial commit: Whisper API with FastAPI, GPU support and Admin Dashboard

2026-01-28 23:16:44 +01:00
commit 008ef63bfd
28 changed files with 1871 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,335 @@
+# Whisper API
+
+A local Whisper API with GPU acceleration and a web admin interface for transcribing audio files.
+
+## Features
+
+- **OpenAI-compatible API** - Drop-in replacement for the OpenAI Whisper API
+- **GPU-accelerated** - Uses NVIDIA GPUs (CUDA) for fast transcription
+- **Default: large-v3** - Best quality on your RTX 3090
+- **Web admin interface** - API key management and statistics at `/admin`
+- **API key authentication** - Secure access control
+- **Cross-platform** - Docker-based, runs on Windows and Linux
+- **Automatic cleanup** - Logs are deleted automatically after 30 days
+
+## Architecture
+
+```
+┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
+│   Client/App    │────▶│   FastAPI App    │────▶│  Whisper GPU    │
+│  (Clawdbot etc) │     │   (Port 8000)    │     │  (large-v3)     │
+└─────────────────┘     └──────────────────┘     └─────────────────┘
+                               │
+                               ▼
+                        ┌──────────────────┐
+                        │  /admin Panel    │
+                        │  - Key Mgmt      │
+                        │  - Dashboard     │
+                        │  - Logs          │
+                        └──────────────────┘
+```
+
+## Quick Start
+
+### Prerequisites
+
+- Docker Desktop (Windows) or Docker + docker-compose (Linux)
+- NVIDIA GPU with CUDA support (RTX 3090)
+- NVIDIA Container Toolkit installed
+
+### Installation
+
+1. **Clone the repository:**
+```bash
+git clone https://gitea.ragtag.rocks/b0rborad/whisper-api.git
+cd whisper-api
+```
+
+2. **Configure the environment variables:**
+```bash
+cp .env.example .env
+# Edit .env to suit your setup
+```
+
+3. **Start the Docker containers:**
+```bash
+docker-compose up -d
+```
+
+4. **First start:**
+   - The `large-v3` model (~3 GB) is downloaded automatically
+   - This can take 5-10 minutes
+   - Check the status with `docker-compose logs -f`
+
+### Verification
+
+```bash
+# Health check
+curl http://localhost:8000/health
+
+# API info
+curl http://localhost:8000/v1/models
+```
+
+## API Documentation
+
+### Authentication
+
+All API endpoints (except `/health` and `/admin`) require an API key, sent as a bearer token in the `Authorization` header:
+
+```bash
+Authorization: Bearer sk-your-api-key-here
+```
+
+### Endpoints
+
+#### POST /v1/audio/transcriptions
+
+Transcribes an audio file.
+
+**Request:**
+```bash
+curl -X POST http://localhost:8000/v1/audio/transcriptions \
+  -H "Authorization: Bearer sk-your-api-key" \
+  -H "Content-Type: multipart/form-data" \
+  -F "file=@/path/to/audio.mp3" \
+  -F "model=large-v3" \
+  -F "language=de" \
+  -F "response_format=json"
+```
+
+**Response:**
+```json
+{
+  "text": "Hallo Welt, das ist ein Test."
+}
+```
+
+#### POST /v1/audio/transcriptions (with timestamps)
+
+**Request:**
+```bash
+curl -X POST http://localhost:8000/v1/audio/transcriptions \
+  -H "Authorization: Bearer sk-your-api-key" \
+  -F "file=@audio.mp3" \
+  -F "timestamp_granularities[]=word" \
+  -F "response_format=verbose_json"
+```
+
+**Response:**
+```json
+{
+  "text": "Hallo Welt",
+  "segments": [
+    {
+      "id": 0,
+      "start": 0.0,
+      "end": 1.5,
+      "text": "Hallo Welt",
+      "words": [
+        {"word": "Hallo", "start": 0.0, "end": 0.5},
+        {"word": "Welt", "start": 0.6, "end": 1.2}
+      ]
+    }
+  ]
+}
+```
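+
+A minimal sketch of consuming this response in Python, assuming the `verbose_json` shape shown above. `iter_words` is a hypothetical helper name used for illustration and is not part of the API:
+
+```python
+def iter_words(result):
+    """Yield (word, start, end) for every word in a verbose_json result."""
+    for segment in result.get("segments", []):
+        # "words" is only expected when word-level timestamps were requested
+        for w in segment.get("words", []):
+            yield w["word"], w["start"], w["end"]
+
+
+# Sample payload copied from the response above
+sample = {
+    "text": "Hallo Welt",
+    "segments": [
+        {
+            "id": 0,
+            "start": 0.0,
+            "end": 1.5,
+            "text": "Hallo Welt",
+            "words": [
+                {"word": "Hallo", "start": 0.0, "end": 0.5},
+                {"word": "Welt", "start": 0.6, "end": 1.2},
+            ],
+        }
+    ],
+}
+
+for word, start, end in iter_words(sample):
+    print(f"{start:5.2f}-{end:5.2f}  {word}")
+```
+
+Using `.get()` with an empty default keeps the helper usable on plain `json` responses, which carry no `segments` key.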
+
+#### GET /v1/models
+
+Lists the available models.
+
+#### GET /health
+
+Health check with GPU status.
+
+**Response:**
+```json
+{
+  "status": "healthy",
+  "gpu": {
+    "available": true,
+    "name": "NVIDIA GeForce RTX 3090",
+    "vram_used": "2.1 GB",
+    "vram_total": "24.0 GB"
+  },
+  "model": "large-v3",
+  "version": "1.0.0"
+}
+```
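+
+For monitoring, the VRAM fields can be reduced to a single utilisation figure. This is a sketch only: it assumes both values are strings of the form `"<number> GB"` as in the sample above, and `vram_fraction` is a hypothetical helper name:
+
+```python
+def vram_fraction(health):
+    """Return used/total VRAM as a fraction between 0 and 1."""
+    gpu = health["gpu"]
+    # Assumption: both values look like "2.1 GB" and share the same unit
+    used = float(gpu["vram_used"].split()[0])
+    total = float(gpu["vram_total"].split()[0])
+    return used / total
+
+
+# Sample payload copied from the response above
+health = {
+    "status": "healthy",
+    "gpu": {
+        "available": True,
+        "name": "NVIDIA GeForce RTX 3090",
+        "vram_used": "2.1 GB",
+        "vram_total": "24.0 GB",
+    },
+    "model": "large-v3",
+    "version": "1.0.0",
+}
+
+print(f"VRAM in use: {vram_fraction(health):.1%}")
+```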
+
+## Admin Interface
+
+The web interface is available at `http://localhost:8000/admin`
+
+### Login
+
+- **Username:** `admin` (configurable in `.env`)
+- **Password:** `-whisper12510-` (configurable in `.env`)
+
+### Features
+
+- **Dashboard:** Usage overview and performance statistics
+- **API keys:** Manage keys (create, deactivate, delete)
+- **Logs:** Detailed transcription logs with filtering
+
+## Configuration
+
+### .env.example
+
+```bash
+# Server
+PORT=8000
+HOST=0.0.0.0
+
+# Whisper
+WHISPER_MODEL=large-v3
+WHISPER_DEVICE=cuda
+WHISPER_COMPUTE_TYPE=float16
+
+# Authentication
+# Separate multiple API keys with commas
+API_KEYS=sk-your-first-key,sk-your-second-key
+ADMIN_USER=admin
+ADMIN_PASSWORD=-whisper12510-
+
+# Data retention (days)
+LOG_RETENTION_DAYS=30
+
+# Optional: Sentry for error tracking
+# SENTRY_DSN=https://...
+```
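+
+How the application reads these values is not shown here. The following sketch only illustrates the intended semantics of `API_KEYS` (a comma-separated list) and `LOG_RETENTION_DAYS` (a cutoff in days); `parse_api_keys` and `retention_cutoff` are hypothetical names:
+
+```python
+from datetime import datetime, timedelta, timezone
+
+
+def parse_api_keys(raw):
+    """Split a comma-separated API_KEYS value, dropping blanks and whitespace."""
+    return [key.strip() for key in raw.split(",") if key.strip()]
+
+
+def retention_cutoff(days, now=None):
+    """Return the timestamp before which log entries count as expired."""
+    now = now or datetime.now(timezone.utc)
+    return now - timedelta(days=days)
+
+
+print(parse_api_keys("sk-your-first-key, sk-your-second-key"))
+print(retention_cutoff(30))
+```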
+
+### Docker Compose Adjustments
+
+```yaml
+services:
+  whisper-api:
+    # ...
+    environment:
+      - PORT=8000  # Can be changed
+      - WHISPER_MODEL=large-v3
+    volumes:
+      - ./models:/app/models    # Persists models
+      - ./data:/app/data        # SQLite database
+      - ./uploads:/app/uploads  # Temporary uploads
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: 1
+              capabilities: [gpu]
+```
+
+## Migrating to Linux
+
+The Docker configuration is platform-independent. On Linux:
+
+1. **Install NVIDIA Docker:**
+```bash
+# Ubuntu/Debian
+distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
+curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
+curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
+
+sudo apt-get update
+sudo apt-get install -y nvidia-docker2
+sudo systemctl restart docker
+```
+
+2. **Clone and start the project:**
+```bash
+git clone https://gitea.ragtag.rocks/b0rborad/whisper-api.git
+cd whisper-api
+docker-compose up -d
+```
+
+3. **Verify GPU passthrough:**
+```bash
+docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi
+```
+
+## Integration with Clawdbot
+
+To integrate with a Clawdbot skill:
+
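+```python
+import requests
+
+API_URL = "http://localhost:8000/v1/audio/transcriptions"
+API_KEY = "sk-your-api-key"
+
+def transcribe_audio(audio_path, language="de", timeout=300):
+    """Upload an audio file and return the transcribed text."""
+    with open(audio_path, "rb") as f:
+        response = requests.post(
+            API_URL,
+            headers={"Authorization": f"Bearer {API_KEY}"},
+            files={"file": f},
+            data={"language": language},
+            timeout=timeout,  # without a timeout, requests waits indefinitely
+        )
+    response.raise_for_status()  # surface 401/5xx instead of a KeyError below
+    return response.json()["text"]
+```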
+
+## Performance
+
+With an RTX 3090 and large-v3:
+- **1 minute of audio:** ~3-5 seconds of processing time
+- **VRAM usage:** ~10 GB
+- **Batch processing:** Possible for parallel requests
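+
+The per-minute figure above gives a rough way to budget longer jobs. The sketch below assumes processing time scales linearly with audio length, which is an assumption and not a measured result; `estimate_seconds` is a hypothetical helper:
+
+```python
+def estimate_seconds(audio_minutes, low=3.0, high=5.0):
+    """Return a (best, worst) processing-time estimate in seconds.
+
+    low/high are the seconds of processing per minute of audio quoted above.
+    """
+    return audio_minutes * low, audio_minutes * high
+
+
+best, worst = estimate_seconds(60)
+print(f"60 min of audio: roughly {best:.0f}-{worst:.0f} s")
+```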
+
+## Troubleshooting
+
+### GPU Not Detected
+
+```bash
+# Check the NVIDIA Container Toolkit
+docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi
+
+# Check the logs
+docker-compose logs whisper-api
+```
+
+### Slow Model Download
+
+```bash
+# The model can also be downloaded manually
+mkdir -p models
+# Models are downloaded from Hugging Face
+```
+
+### Port Already in Use
+
+```bash
+# Change the port in .env
+PORT=8001
+```
+
+## Backup
+
+Important data:
+- `./data/` - SQLite database (API keys, logs)
+- `./models/` - Downloaded Whisper models
+- `./.env` - Configuration
+
+```bash
+# Create a backup
+tar -czvf whisper-api-backup.tar.gz data/ models/ .env
+```
+
+## License
+
+MIT License - see the LICENSE file
+
+## Support
+
+If you run into problems:
+1. Check the logs: `docker-compose logs -f`
+2. Run the health check: `curl http://localhost:8000/health`
+3. Open an issue on Gitea
+
+---
+
+**Created for:** b0rborad @ ragtag.rocks  
+**Hardware:** Dual RTX 3090 setup  
+**Purpose:** Clawdbot skill integration