Speech to Text Apps
Related links:
- Speech to Text Apps
- Text to Speech Apps
- Speech to Speech (Fake Voice Generator)
Speech to Text
- DeepSpeech: simpler, although inferior to the toolkits below.
- Kaldi: supports hybrid NN-HMM and lattice-free MMI models. It is widely used both in research and in production.
- Lingvo: the open-source version of Google's speech recognition toolkit, with support mostly for end-to-end models.
- ESPnet: good and well known for end-to-end models as well.
- RASR + RETURNN: very good as well, both for end-to-end models and hybrid NN-HMM, but free for non-commercial applications only; commercial use requires a commercial licence (disclaimer: I work at the university chair which develops these frameworks).
- http://gkarsay.github.io/parlatype/
- https://github.com/juanerasmoe/pmTrans
- https://pythonbasics.org/transcribe-audio/
- Wav2Letter, the tool by Facebook.
- Silero Speech to Text (snakers4/silero-models)
- Coqui: STT and TTS
- voice2json - Command-line tools for speech and intent recognition on Linux
- VOSK: offline speech recognition API
Dataset
- English: TED-LIUM, LibriSpeech, etc.
- https://github.com/gooofy/zamia-speech
- https://commonvoice.mozilla.org/en/datasets
- https://www.openslr.org/resources.php
- snakers4/silero-models: Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
Speech to Text with Indonesian Support
- Voice Notebook
- Speech Texter
- Voicenote
- Speechnotes
- Dictation
- Dictanote
- oTranscribe
- Google Web Speech API
- Google Docs "Type with your voice": open Tools, and then Voice typing
Speech Recognition
- Wav2vec: Semi and Unsupervised Speech Recognition - Vaclav Kosar's Blog
- The Illustrated Wav2vec - Jonathan Bgn
Video Transcriber
- Transcribe File
- Edit Video Fast | Simon Says
- Audio/Video Transcription | 99% Accuracy, 12-HR Turnaround
- Transcription
- AssemblyAI | #1 API Platform for AI Models