Paid Robot

Speak text

🤖/text/speak synthesizes speech from text documents.

You can use the returned audio in your application, or pass it on to other Robots, for example to add a voice track to a video.

Another common use case is making your product accessible to people with a reading disability.

Warning: Transloadit aims to be deterministic, but this Robot uses third-party AI services. The providers (AWS, GCP) will evolve their models over time, giving different responses for the same input media. Avoid relying on exact responses in your tests and application.

Supported languages and voices

AWS

Language	Voices
arb	female-1
cmn-CN	female-1
da-DK	female-1, male-1
nl-NL	female-1, male-1
en-AU	female-1, male-1
en-GB	female-1, male-1
en-IN	female-1
en-US	female-1, female-child-1, male-1, male-child-1
en-GB-WLS	male-1
fr-FR	female-1, male-1
fr-CA	female-1
de-DE	female-1, male-1
hi-IN	female-1
is-IS	female-1, male-1
it-IT	female-1, male-1
ja-JP	female-1, male-1
ko-KR	female-1
nb-NO	female-1
pl-PL	female-1, male-1
pt-BR	female-1, male-1
pt-PT	female-1, male-1
ro-RO	female-1
ru-RU	female-1, male-1
es-ES	female-1, male-1
es-MX	female-1
es-US	female-1, male-1
sv-SE	female-1
tr-TR	female-1
cy-GB	female-1

GCP

Language	Voices
ar-XA	female-1, male-1
bn-IN	female-1, male-1
yue-HK	female-1, male-1
cs-CZ	female-1
da-DK	female-1, male-1
nl-NL	female-1, male-1
en-AU	female-1, male-1
en-IN	female-1, male-1
en-GB	female-1, male-1
en-US	female-1, female-2, female-3, male-1
fil-PH	female-1, male-1
fi-FI	female-1
fr-CA	female-1, male-1
fr-FR	female-1, male-1
de-DE	female-1, male-1
el-GR	female-1
gu-IN	female-1, male-1
hi-IN	female-1, male-1
hu-HU	female-1
id-ID	female-1, male-1
it-IT	female-1, male-1
ja-JP	female-1, male-1
kn-IN	female-1, male-1
ko-KR	female-1, male-1
ml-IN	female-1, male-1
cmn-CN	female-1, male-1
cmn-TW	female-1, male-1
nb-NO	female-1, male-1
pl-PL	female-1, male-1
pt-BR	female-1
pt-PT	female-1, male-1
ro-RO	female-1
ru-RU	female-1, male-1
sk-SK	female-1
es-ES	female-1, male-1
sv-SE	female-1
ta-IN	female-1, male-1
te-IN	female-1, male-1
th-TH	female-1
tr-TR	female-1, male-1
uk-UA	female-1
vi-VN	female-1, male-1
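
If you want to catch unsupported combinations before submitting an Assembly, the tables above can be encoded as a lookup. The following is a minimal sketch: `SUPPORTED_VOICES` and `is_supported` are hypothetical names, not part of any Transloadit SDK, and the dictionary holds only a few rows copied from the tables, so you would need to extend it with the languages you use.

```python
# Excerpt of the support tables on this page (not the full list).
# Keys are provider -> language -> set of voice names.
SUPPORTED_VOICES = {
    "aws": {
        "en-US": {"female-1", "female-child-1", "male-1", "male-child-1"},
        "en-IN": {"female-1"},
        "de-DE": {"female-1", "male-1"},
    },
    "gcp": {
        "en-US": {"female-1", "female-2", "female-3", "male-1"},
        "en-IN": {"female-1", "male-1"},
        "pt-BR": {"female-1"},
    },
}


def is_supported(provider, target_language, voice):
    """Return True if the voice is listed for this provider and language."""
    languages = SUPPORTED_VOICES.get(provider, {})
    return voice in languages.get(target_language, set())


# The same language can offer different voices depending on the provider.
print(is_supported("aws", "en-IN", "male-1"))
print(is_supported("gcp", "en-IN", "male-1"))
```

Checking locally like this only reflects the tables as copied; the providers may add or retire voices over time.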

Usage example

Synthesize speech from uploaded text documents, using a female voice in American English:

{
  "steps": {
    "synthesized": {
      "robot": "/text/speak",
      "use": ":original",
      "provider": "aws",
      "voice": "female-1",
      "target_language": "en-US"
    }
  }
}
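
If you generate Assembly Instructions programmatically, the same example can be sketched in Python. The helper name `build_speak_instructions` is hypothetical and not part of any Transloadit SDK; it only assembles the parameters documented on this page into a dictionary and serializes it to JSON.

```python
import json


def build_speak_instructions(provider, voice="female-1", target_language="en-US",
                             prompt=None, ssml=False, use=":original"):
    """Assemble Assembly Instructions for a single /text/speak Step.

    Hypothetical helper: it builds a dict from the documented parameters
    and does not contact Transloadit.
    """
    if provider not in ("aws", "gcp"):
        raise ValueError('provider must be "aws" or "gcp"')

    step = {
        "robot": "/text/speak",
        "use": use,
        "provider": provider,
        "voice": voice,
        "target_language": target_language,
    }
    # Only include optional parameters when they differ from their defaults.
    if prompt is not None:
        step["prompt"] = prompt
    if ssml:
        step["ssml"] = True
    return {"steps": {"synthesized": step}}


# Reproduces the usage example: a female American English voice via AWS.
instructions = build_speak_instructions("aws")
print(json.dumps(instructions, indent=2))
```

The Step name "synthesized" is taken from the example above; any other Step name would work equally well.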

Parameters

use

String / Array of Strings / Object required
Specifies which Step(s) to use as input.
- You can pick any names for Steps except ":original" (reserved for user uploads handled by Transloadit)
- You can provide several Steps as input with arrays:
```
"use": [
  ":original",
  "encoded",
  "resized"
]
```
💡 That’s likely all you need to know about use, but you can view Advanced use cases.

prompt

String

Which text to speak. You can also set this to null and supply an input text file.

provider

String required

Which AI provider to leverage. Valid values are "aws" and "gcp".

Transloadit outsources this task and abstracts the interface, so you can expect the same data structures, but latencies and the information returned will differ between providers. Each cloud vendor has its own strengths, so we recommend trying both to see which yields the best results for your use case.

target_language

String ⋅ default: "en-US"

The written language of the document. This will also be the language of the spoken text.

The language should be specified in the BCP-47 format, such as "en-GB", "de-DE" or "fr-FR". Please consult the list of supported languages and voices.

voice

String ⋅ default: "female-1"

The voice to be used for speech synthesis, such as "female-1" or "male-1". Available voices depend on the provider and language. Please consult the list of supported languages and voices.

ssml

Boolean ⋅ default: false

Supply Speech Synthesis Markup Language (SSML) instead of raw text to gain more control over how your text is voiced, including pauses and pronunciations.

Please see the supported syntaxes for AWS and GCP.
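
As an illustration, the sketch below builds a small SSML document with a pause between two sentences and places it in a Step with `ssml` enabled. It assumes that SSML can be passed through `prompt` (this page only says SSML replaces raw text, so supplying it as an input file may be the intended route) and that `use` is kept as in the usage example. `ssml_with_pause` is a hypothetical helper. The `<speak>` and `<break>` elements are standard SSML, but check each provider's supported syntax before relying on other tags.

```python
import json
from xml.sax.saxutils import escape


def ssml_with_pause(first, second, pause_ms=500):
    """Wrap two sentences in <speak>, separated by a pause (hypothetical helper)."""
    # escape() protects characters such as & and < that are special in XML.
    return (
        f"<speak>{escape(first)}"
        f'<break time="{pause_ms}ms"/>'
        f"{escape(second)}</speak>"
    )


step = {
    "robot": "/text/speak",
    "use": ":original",  # assumption: kept as in the usage example
    "provider": "aws",
    "target_language": "en-US",
    "voice": "female-1",
    # Assumption: SSML is supplied via prompt rather than an input file.
    "prompt": ssml_with_pause("Terms & conditions apply.", "Thank you for listening."),
    "ssml": True,
}

print(json.dumps({"steps": {"synthesized": step}}, indent=2))
```

Escaping the text before embedding it matters because an unescaped `&` or `<` would make the SSML malformed.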

Demos

Convert text into speech
