MinbarLive
A deep-dive into the architecture, AI pipeline, and infrastructure powering live khutbah captions for mosques worldwide.
A multi-module web platform for live sermon captions with real-time AI transcription, multilingual translation, video studio, podcast companion, AI sermon preparation, and persistent archive.
Browser mic (AudioWorklet with ScriptProcessorNode fallback), phone upload, or external mixer. Admin selects ingest mode at session start.
Audio chunks sent via Socket.IO to backend. Smart buffering + segmentation for smooth subtitle display.
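The buffering and segmentation step can be pictured with a small sketch. It assumes a simple rule that is not stated in the source: a segment is released when a sentence ends or when a display limit is reached. The class name `SubtitleSegmenter` and the `max_chars` value are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SubtitleSegmenter:
    """Buffers transcribed words and releases display-ready segments."""

    max_chars: int = 80  # hypothetical display limit, not taken from the source
    _words: list = field(default_factory=list)

    def feed(self, text: str) -> list:
        """Add transcribed text and return any segments that are ready."""
        ready = []
        for word in text.split():
            self._words.append(word)
            current = " ".join(self._words)
            # Release on sentence-ending punctuation or when the line is full.
            if word.endswith((".", "?", "!")) or len(current) >= self.max_chars:
                ready.append(current)
                self._words = []
        return ready

    def flush(self) -> list:
        """Release whatever is still buffered, for example at session end."""
        if not self._words:
            return []
        remaining = " ".join(self._words)
        self._words = []
        return [remaining]
```

Feeding partial transcripts as they arrive keeps half-finished sentences in the buffer, so listeners see whole phrases rather than flickering fragments.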
Real-time speech-to-text via Deepgram WebSocket streaming. 20 source languages + auto-detect. Islamic vocabulary keyword boosting for BS/HR/SR.
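As a rough illustration of keyword boosting, the sketch below builds a Deepgram streaming URL with boosted terms passed as `keywords=term:boost` query parameters. The terms, boost values, and the helper name `build_listen_url` are assumptions for illustration. Keyword support also varies by Deepgram model, so the current Deepgram documentation should be checked before relying on it.

```python
from urllib.parse import urlencode

DEEPGRAM_WS = "wss://api.deepgram.com/v1/listen"


def build_listen_url(language: str, boosted_terms: dict) -> str:
    """Return a streaming URL with one `keywords` parameter per boosted term."""
    params = [("language", language), ("interim_results", "true")]
    # Each boosted term is sent as "term:boost" (illustrative values only).
    params += [("keywords", f"{term}:{boost}") for term, boost in boosted_terms.items()]
    return f"{DEEPGRAM_WS}?{urlencode(params)}"


# Hypothetical vocabulary for a Bosnian-language session.
url = build_listen_url("bs", {"hutba": 2, "minber": 2})
```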
OpenAI GPT-4o-mini translates via two-step pipeline: normalize mixed content to Croatian, then translate to target language. On-demand for 135+ languages.
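The two-step flow can be sketched as two chained model calls. The prompts below are illustrative, not the platform's actual prompts, and `complete` is a hypothetical callable standing in for whatever function sends a prompt to the model and returns its text.

```python
from typing import Callable

# Hypothetical signature: prompt text in, model output text out.
Complete = Callable[[str], str]


def translate_two_step(text: str, target_lang: str, complete: Complete,
                       pivot_lang: str = "Croatian") -> str:
    """Normalize mixed-language text into the pivot language, then translate."""
    normalized = complete(
        f"Rewrite the following mixed-language sermon text entirely in "
        f"{pivot_lang}, keeping Islamic terms intact:\n{text}"
    )
    if target_lang == pivot_lang:
        return normalized  # step two is unnecessary for the pivot language
    return complete(
        f"Translate the following {pivot_lang} text into {target_lang}:\n{normalized}"
    )
```

Normalizing first means the second prompt always receives single-language input, which is the property the two-step design appears to rely on.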
Translated segments pushed via Socket.IO broadcast to all connected listeners. Pre-computed languages stored in Supabase, on-demand in MongoDB. HTTP polling as fallback.
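For the HTTP polling fallback, one plausible approach is a cursor: the listener sends the last sequence number it has seen and receives only newer segments. This is a sketch under that assumption. The `Segment` shape and the `segments_since` helper are hypothetical, and an in-memory list stands in for the real store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    seq: int   # monotonically increasing position within the session
    text: str  # translated caption text


def segments_since(store: list, last_seq: int, limit: int = 50) -> list:
    """Return segments newer than the client's cursor, oldest first."""
    newer = [s for s in store if s.seq > last_seq]
    return sorted(newer, key=lambda s: s.seq)[:limit]
```

A client that loses its socket connection can call this repeatedly with its latest `seq` and catch up without receiving duplicates.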
Session ends → transcript generation triggered (background thread, Redis job tracking). Export available per language.
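The background-thread pattern with job tracking might look like the sketch below. A plain dictionary guarded by a lock stands in for Redis here, and every function name is hypothetical. A real deployment would write the same status fields to Redis keys instead.

```python
import threading
import uuid

jobs = {}                     # stand-in for Redis job records
jobs_lock = threading.Lock()  # protects the stand-in store


def _set(job_id: str, **fields) -> None:
    with jobs_lock:
        jobs.setdefault(job_id, {}).update(fields)


def build_transcript(segments: list) -> str:
    """Placeholder for the real transcript builder."""
    return "\n".join(segments)


def _run(job_id: str, segments: list) -> None:
    try:
        _set(job_id, status="running")
        _set(job_id, status="done", transcript=build_transcript(segments))
    except Exception as exc:  # record the failure instead of losing it
        _set(job_id, status="failed", error=str(exc))


def start_transcript_job(segments: list):
    """Start generation in a background thread and return (job_id, thread)."""
    job_id = uuid.uuid4().hex
    _set(job_id, status="queued")
    worker = threading.Thread(target=_run, args=(job_id, segments), daemon=True)
    worker.start()
    return job_id, worker
```

The caller gets a job ID back immediately and can poll the job record for `status` while the session-end request returns without waiting.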
Backend (FastAPI + Python — 56 routers, 455 endpoints)
server.py - app assembly, Socket.IO handlers, lifespan (402 lines)
audio_pipeline.py - STT orchestration, Deepgram WebSocket, mixed Arabic/Latin detection
pipeline_persistence.py - two-step translation, BCS heuristic, on-demand cache
studio/ - 10 modules (pipeline, queue, audio, transcription, translation, subtitles)
routers/admin_khutbah_kb.py - HutbaAsistent AI (generate, refine, templates, versions)
routers/admin_podcasts.py - Podcast Companion (CRUD, live stream, QR)
routers/public_stream_routes.py - on-demand translation, stale detection, typing UX
costing/ - 6 billing modules (plans, invoices, PDF, email dispatch)
271 Python modules, ~111,000 lines (+ 295 test files / ~59,000 lines)
Frontend (Next.js 15 + React 19)
src/app/(admin)/ - 60+ admin pages (HutbaLive, Studio, Podcast, HutbaAsistent, Billing)
src/app/join/ - live session listener (Socket.IO, 135+ languages, typing animation)
src/lib/api.js - API client and auth session management
src/lib/stream-realtime.js - Socket.IO real-time client and translation handlers
src/lib/browser-audio-capture.js - AudioWorklet capture with ScriptProcessor fallback
src/lib/translations/ - 9 languages × 2,900+ i18n keys (~26,000 strings)
src/components/ - LanguagePicker, OEmbedPreview, Sidebar, Header, UI kit
public/audio-capture-processor.js - AudioWorklet processor (separate audio thread)
420 source files, ~90,000 lines
6-module billing engine: fixed costs, usage events (Deepgram + OpenAI), org allocation (equal/custom), daily breakdowns, monthly reports, markup separation.
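The equal/custom org allocation of fixed costs could be computed along the following lines. This is a minimal sketch with assumed rules: costs are rounded to cents and any rounding remainder is assigned to the first organization. The function name `allocate` and the weight format are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional

CENT = Decimal("0.01")


def allocate(total: Decimal, orgs: list, weights: Optional[dict] = None) -> dict:
    """Split a fixed cost across orgs, equally or by custom weights."""
    if not orgs:
        raise ValueError("at least one organization is required")
    # Equal allocation is the special case where every weight is 1.
    weights = weights or {org: Decimal(1) for org in orgs}
    weight_sum = sum(weights[org] for org in orgs)
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")
    shares = {
        org: (total * weights[org] / weight_sum).quantize(CENT, ROUND_HALF_UP)
        for org in orgs
    }
    # Assumed rule: the first org absorbs the rounding remainder so shares sum to total.
    shares[orgs[0]] += total - sum(shares.values())
    return shares
```

For example, splitting 100.00 equally across three organizations yields 33.34, 33.33, and 33.33, so the allocated amounts always add back up to the invoiced total.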
Beyond live sessions — Studio Mode processes pre-recorded video files through the same AI pipeline, producing synchronized subtitles and styled exports.
Admin uploads video file via Studio UI. Chunked upload bypasses proxy limits. Billing plan gating ensures only authorized orgs access Studio.
Server-side FFmpeg extracts audio track from video. ffprobe validates format and duration. Supports MP4, MKV, WebM, MOV.
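A sketch of the probe-then-extract step is shown below, using standard `ffprobe` and `ffmpeg` command-line flags. The output format (16 kHz mono WAV), the duration limit, and all function names are assumptions, since the source does not specify them.

```python
import json
import subprocess
from pathlib import Path

ALLOWED_SUFFIXES = {".mp4", ".mkv", ".webm", ".mov"}


def probe_cmd(video: Path) -> list:
    """ffprobe command that prints container format and duration as JSON."""
    return ["ffprobe", "-v", "error", "-show_entries",
            "format=format_name,duration", "-of", "json", str(video)]


def extract_cmd(video: Path, audio_out: Path) -> list:
    """ffmpeg command that drops video (-vn) and writes 16 kHz mono PCM audio."""
    return ["ffmpeg", "-y", "-i", str(video), "-vn",
            "-ac", "1", "-ar", "16000", "-c:a", "pcm_s16le", str(audio_out)]


def parse_duration(probe_json: str, max_seconds: float) -> float:
    """Read the duration from ffprobe output and enforce an upper limit."""
    duration = float(json.loads(probe_json)["format"]["duration"])
    if not 0 < duration <= max_seconds:
        raise ValueError(f"unsupported duration: {duration}s")
    return duration


def extract_audio(video: Path, audio_out: Path, max_seconds: float = 4 * 3600) -> float:
    """Validate the file, then extract its audio track. Returns the duration."""
    if video.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"unsupported container: {video.suffix}")
    probe = subprocess.run(probe_cmd(video), capture_output=True, text=True, check=True)
    duration = parse_duration(probe.stdout, max_seconds)
    subprocess.run(extract_cmd(video, audio_out), check=True)
    return duration
```

Probing before extraction lets the server reject a bad upload cheaply, before spending time decoding the whole file.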
Extracted audio sent to OpenAI Whisper for high-accuracy transcription with word-level timestamps. Segments stored with start_ms/end_ms precision.
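Word-level timestamps can be grouped into `start_ms`/`end_ms` segments roughly as follows. The sketch assumes each word arrives as a dictionary with `word`, `start`, and `end` fields in seconds, and that a new segment starts after a pause or a word limit. Both thresholds and the function names are hypothetical.

```python
def _close(items: list) -> dict:
    """Turn a run of (word, start_ms, end_ms) tuples into one segment."""
    return {
        "text": " ".join(word for word, _, _ in items),
        "start_ms": items[0][1],
        "end_ms": items[-1][2],
    }


def words_to_segments(words: list, max_gap_ms: int = 600, max_words: int = 12) -> list:
    """Group timestamped words into subtitle segments with millisecond bounds."""
    segments = []
    current = []
    for w in words:
        start_ms = round(w["start"] * 1000)
        end_ms = round(w["end"] * 1000)
        # Start a new segment after a long pause or when the segment is full.
        if current and (start_ms - current[-1][2] > max_gap_ms
                        or len(current) >= max_words):
            segments.append(_close(current))
            current = []
        current.append((w["word"].strip(), start_ms, end_ms))
    if current:
        segments.append(_close(current))
    return segments
```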
Transcribed segments translated via OpenAI GPT. Same Islamic-terminology-tuned prompts as live mode. Multi-language output per segment.
Blob URL-based video player with real-time subtitle overlay. Authenticated streaming without exposing tokens. Custom React component isolates HTML5 video from React Native View constraints.
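The overlay has to pick the subtitle that matches the current playback position. The player itself runs in the browser, so the sketch below only illustrates the lookup logic, written in Python for consistency with the backend. It assumes segments are sorted by `start_ms` and do not overlap, and the function name is hypothetical.

```python
from bisect import bisect_right


def active_segment(segments: list, position_ms: int):
    """Return the segment covering position_ms, or None during a gap."""
    starts = [s["start_ms"] for s in segments]
    # Index of the last segment that starts at or before the playback position.
    i = bisect_right(starts, position_ms) - 1
    if i >= 0 and position_ms < segments[i]["end_ms"]:
        return segments[i]
    return None
```

A binary search keeps the lookup fast even for long recordings, which matters because it runs on every playback time update.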
4 export formats: plain transcript, styled transcript (Source Serif 4 font), styled translation, bilingual side-by-side. Print-ready CSS with responsive design.
An automated feedback system that analyzes every session, scores quality, identifies issues, and generates actionable improvement suggestions.
GET /api/admin/quality/report - fetch quality metrics
POST /api/admin/quality/suggest - submit manual suggestions
PATCH /api/admin/quality/suggestions/:id - apply, dismiss, or reopen a suggestion
Phase 1 — Stabilization
Phase 2 — Optimization
Phase 3 — Scaling
Phase 4 — Enterprise
Live audio capture, Deepgram STT, OpenAI translation, Socket.IO delivery, QR audience access.
Video processing pipeline, live podcast companion, AI sermon preparation with templates and knowledge base.
On-demand 135+ languages, stale cache detection, typing animation UX, BCS heuristic.
56 router files, modular studio, god-file decomposition (−88%), security audit, i18n (9 lang × 2,900+ keys), AudioWorklet.
Stripe + Revolut live. Self-serve subscriptions, usage-based billing, invoices/PDF, EU VAT, coupons, trial period.
First paying mosques onboarded and live every Friday. Scaling the reference base, collecting weekly uptime & accuracy metrics, imam testimonials.
Custom domains, brand settings, DNS verification, multi-tenant branding — live for institutional clients.
Speaker identification, cross-session search, auto-tagging, content recommendations.
CTO Assessment: "The product has expanded from a live translation tool into a complete imam workflow platform. ~200,000 lines of core product code across 691 files, 7 modules, 455 API endpoints, 295 test files. Payments (Stripe + Revolut) and white-label are live, and the first paying clients are onboarded — the focus now is scaling repeatable weekly usage." — CTO Brief, June 2026