Dunno, but this guy (All About AI) builds one with 'faster-whisper', so perhaps you can get a few pointers there? I believe he chunks the audio on silence. He has a few other speech2x videos. Have fun. https://youtu.be/k6nIxWGdrS4
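In case it helps, here is a rough sketch of what "chunking on silence" can look like. I have not checked how the video actually does it, so treat this as a generic energy-threshold approach and not his method. The function name `split_on_silence` and all the parameter defaults are made up for illustration, and it assumes the audio is already a list of float samples in the range -1.0 to 1.0.

```python
import math
from typing import List, Sequence, Tuple


def split_on_silence(
    samples: Sequence[float],
    sample_rate: int,
    threshold: float = 0.01,      # assumed RMS level below which a frame counts as silent
    min_silence_s: float = 0.3,   # assumed pause length that ends a chunk
    frame_s: float = 0.02,        # analysis frame length in seconds
) -> List[Tuple[int, int]]:
    """Return (start, end) sample index pairs for the non-silent chunks."""
    frame_len = max(1, int(sample_rate * frame_s))
    min_silent_frames = max(1, round(min_silence_s / frame_s))

    chunks: List[Tuple[int, int]] = []
    start = None          # start index of the chunk being built, if any
    last_voiced_end = 0   # end index of the most recent non-silent frame
    silent_run = 0        # consecutive silent frames seen inside the chunk

    for pos in range(0, len(samples), frame_len):
        frame = samples[pos:pos + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))

        if rms >= threshold:
            if start is None:
                start = pos
            last_voiced_end = min(pos + frame_len, len(samples))
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silent_frames:
                # The pause is long enough: close the chunk at the last voiced frame.
                chunks.append((start, last_voiced_end))
                start = None
                silent_run = 0

    # Flush a chunk that was still open when the audio ended.
    if start is not None:
        chunks.append((start, last_voiced_end))
    return chunks


if __name__ == "__main__":
    rate = 1000
    tone = [0.5, -0.5] * 100            # 200 "loud" samples
    audio = tone + [0.0] * 400 + tone   # two bursts separated by a 0.4 s pause
    for begin, end in split_on_silence(audio, rate):
        print(f"chunk from {begin / rate:.2f}s to {end / rate:.2f}s")
```

Each `(start, end)` pair can then be sliced out of the sample buffer and handed to whichever transcriber you end up using. The threshold and pause length will need tuning for your microphone and room noise.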
this post was submitted on 10 Mar 2024
Just stumbled upon this speedy one: https://github.com/sanchit-gandhi/whisper-jax
And this one for word-level timestamps: https://github.com/m-bain/whisperX
Here is an alternative Piped link:
https://piped.video/k6nIxWGdrS4
Piped is a privacy-respecting open-source alternative frontend to YouTube.
I found this so far: https://github.com/KoljaB/RealtimeSTT
Maybe I can modify it to use the Whisper API.
I don't know enough to answer your question, but you could check how Home Assistant does it. I think that should point you in the right direction.