Bark

Bark is an open-source text-to-audio model created by Suno. It generates realistic, multilingual speech as well as other audio such as music and sound effects. The code is available for free on GitHub, along with pretrained model checkpoints offered for research purposes.

Bark is a transformer-based model that generates audio from scratch.
It supports multiple languages and detects the language automatically from the input text.
It offers around 100 speaker presets and tries to match the tone, pitch, emotion, and prosody of the chosen preset. It can also generate unique, random voices.
It works on both CPU and GPU, and on a capable GPU it generates audio in close to real time.
The full version requires around 12 GB of VRAM. Smaller versions of the models are available for GPUs with 8 GB of VRAM.
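
As a rough illustration of the speaker presets and the smaller models mentioned above, here is a minimal Python sketch. It assumes the `bark` package from Suno's GitHub repository and `scipy` are installed, that the smaller models are selected with the `SUNO_USE_SMALL_MODELS` environment flag, and that presets follow the `v2/<language>_speaker_<n>` naming seen in the project's README. The helper names `configure_small_models`, `speaker_preset`, and `synthesize` are hypothetical and exist only for this example.

```python
import os
from typing import Optional


def configure_small_models(enabled: bool = True) -> None:
    """Ask Bark for its smaller checkpoints (assumed flag name, see lead-in).

    Call this before importing bark, since the flag is assumed to be read
    when the library is first loaded.
    """
    os.environ["SUNO_USE_SMALL_MODELS"] = "True" if enabled else "False"


def speaker_preset(language_code: str, index: int) -> str:
    """Build a preset name such as 'v2/en_speaker_6' (assumed naming scheme)."""
    if index < 0:
        raise ValueError("speaker index must be non-negative")
    return f"v2/{language_code}_speaker_{index}"


def synthesize(text: str, preset: Optional[str] = None,
               out_path: str = "bark_out.wav") -> str:
    """Generate audio for `text` and write it to a WAV file.

    Imports are done lazily so the helpers above work without Bark installed.
    """
    from bark import SAMPLE_RATE, generate_audio, preload_models
    from scipy.io.wavfile import write as write_wav

    preload_models()  # downloads and caches the checkpoints on first use
    audio = generate_audio(text, history_prompt=preset)  # NumPy array of samples
    write_wav(out_path, SAMPLE_RATE, audio)
    return out_path
```

A typical call would be `configure_small_models()` followed by `synthesize("Hello, my name is Suno.", speaker_preset("en", 6))`. Passing no preset leaves the voice up to the model, which matches the random-voice behavior described above.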

Researchers can use Bark and its pretrained checkpoints in their studies of generative audio.
Developers can use it to generate speech or sound effects for their applications.
Content creators may find it useful for generating unique voices, music, or sound effects.
Language enthusiasts can use it to study language recognition and switching.
