Service information
Plans
free
Platforms
Description
Bark is an open-source, transformer-based text-to-audio model created by Suno. It generates realistic, multilingual speech as well as other audio, such as music and sound effects. It is available for free on GitHub, with pretrained model checkpoints offered for research purposes.
Features
- Bark is a transformer-based text-to-audio model.
- It supports multiple languages and automatically detects the language of the input text.
- Its voice presets capture tone, pitch, emotion, and prosody, and it can also generate unique, random voices.
- It generates audio in roughly real time on a GPU. The full model requires around 12 GB of VRAM, and smaller versions of the model are available for 8 GB of VRAM.
- It generates audio from scratch, supports 100 speaker presets, and runs on both CPU and GPU.
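As a rough illustration of the features above, the following Python sketch picks a speaker preset, switches to the smaller models when VRAM is limited, and writes the result to a WAV file. It assumes the `bark` package from the Suno GitHub repository and `scipy` are installed. The helper names (`preset_name`, `configure_for_vram`, `synthesize`) are hypothetical, the `v2/<lang>_speaker_<n>` preset naming with indices 0 to 9 is an assumption based on the project's published examples, and the 12 GB threshold simply mirrors the figure quoted in the feature list.

```python
import os


def preset_name(lang: str, index: int) -> str:
    """Build a speaker preset identifier (hypothetical helper).

    Assumption: presets are named "v2/<lang>_speaker_<n>" with n in 0..9.
    """
    if not 0 <= index <= 9:
        raise ValueError("speaker index is assumed to be in the range 0..9")
    return f"v2/{lang}_speaker_{index}"


def configure_for_vram(vram_gb: float) -> dict:
    """Return environment settings for the available VRAM (hypothetical helper).

    Below roughly 12 GB, request the smaller model versions through the
    SUNO_USE_SMALL_MODELS environment variable.
    """
    if vram_gb < 12:
        return {"SUNO_USE_SMALL_MODELS": "True"}
    return {}


def synthesize(text: str, out_path: str, lang: str = "en",
               speaker: int = 6, vram_gb: float = 12.0) -> str:
    """Generate speech for `text` and save it as a WAV file."""
    # The environment variable has to be set before bark is imported.
    os.environ.update(configure_for_vram(vram_gb))

    # Imported lazily so the helpers above work without bark installed.
    from bark import SAMPLE_RATE, generate_audio, preload_models
    from scipy.io.wavfile import write as write_wav

    preload_models()  # loads the model checkpoints
    audio = generate_audio(text, history_prompt=preset_name(lang, speaker))
    write_wav(out_path, SAMPLE_RATE, audio)
    return out_path
```

Calling `synthesize("Hello, my name is Suno.", "hello.wav", vram_gb=8)` would be expected to fetch the model checkpoints on first use and then write `hello.wav`. Generation also runs on CPU, only more slowly.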
Perfect for
- Researchers studying text-to-audio generation.
- Developers who need generated speech or sound effects in their applications.
- Content creators looking for unique voices, music, or sound effects.
- Language enthusiasts exploring language recognition and switching.