Spaces:

juliankoe
/

tts2jk

Runtime error

App Files Files Community

tts2jk / docs /source /models /tortoise.md

juliankoe

Upload folder using huggingface_hub

a9384d7 almost 2 years ago

preview code

raw

history blame contribute delete

3.08 kB

	# 🐢 Tortoise
	Tortoise is a very expressive TTS system with impressive voice cloning capabilities. It is based on an GPT like autogressive acoustic model that converts input
	text to discritized acouistic tokens, a diffusion model that converts these tokens to melspeectrogram frames and a Univnet vocoder to convert the spectrograms to
	the final audio signal. The important downside is that Tortoise is very slow compared to the parallel TTS models like VITS.

	Big thanks to 👑[@manmay-nakhashi](https://github.com/manmay-nakhashi) who helped us implement Tortoise in 🐸TTS.

	Example use:

	```python
	from TTS.tts.configs.tortoise_config import TortoiseConfig
	from TTS.tts.models.tortoise import Tortoise

	config = TortoiseConfig()
	model = Tortoise.init_from_config(config)
	model.load_checkpoint(config, checkpoint_dir="paths/to/models_dir/", eval=True)

	# with random speaker
	output_dict = model.synthesize(text, config, speaker_id="random", extra_voice_dirs=None, **kwargs)

	# cloning a speaker
	output_dict = model.synthesize(text, config, speaker_id="speaker_n", extra_voice_dirs="path/to/speaker_n/", **kwargs)
	```

	Using 🐸TTS API:

	```python
	from TTS.api import TTS
	tts = TTS("tts_models/en/multi-dataset/tortoise-v2")

	# cloning `lj` voice from `TTS/tts/utils/assets/tortoise/voices/lj`
	# with custom inference settings overriding defaults.
	tts.tts_to_file(text="Hello, my name is Manmay , how are you?",
	file_path="output.wav",
	voice_dir="path/to/tortoise/voices/dir/",
	speaker="lj",
	num_autoregressive_samples=1,
	diffusion_iterations=10)

	# Using presets with the same voice
	tts.tts_to_file(text="Hello, my name is Manmay , how are you?",
	file_path="output.wav",
	voice_dir="path/to/tortoise/voices/dir/",
	speaker="lj",
	preset="ultra_fast")

	# Random voice generation
	tts.tts_to_file(text="Hello, my name is Manmay , how are you?",
	file_path="output.wav")
	```

	Using 🐸TTS Command line:

	```console
	# cloning the `lj` voice
	tts --model_name tts_models/en/multi-dataset/tortoise-v2 \
	--text "This is an example." \
	--out_path "output.wav" \
	--voice_dir path/to/tortoise/voices/dir/ \
	--speaker_idx "lj" \
	--progress_bar True

	# Random voice generation
	tts --model_name tts_models/en/multi-dataset/tortoise-v2 \
	--text "This is an example." \
	--out_path "output.wav" \
	--progress_bar True
	```


	## Important resources & papers
	- Original Repo: https://github.com/neonbjb/tortoise-tts
	- Faster implementation: https://github.com/152334H/tortoise-tts-fast
	- Univnet: https://arxiv.org/abs/2106.07889
	- Latent Diffusion:https://arxiv.org/abs/2112.10752
	- DALL-E: https://arxiv.org/abs/2102.12092

	## TortoiseConfig
	```{eval-rst}
	.. autoclass:: TTS.tts.configs.tortoise_config.TortoiseConfig
	:members:
	```

	## TortoiseArgs
	```{eval-rst}
	.. autoclass:: TTS.tts.models.tortoise.TortoiseArgs
	:members:
	```

	## Tortoise Model
	```{eval-rst}
	.. autoclass:: TTS.tts.models.tortoise.Tortoise
	:members:
	```