OCI Speech
OCI Text to Speech example
In this post, I’ll walk through the steps to get a very simple example of Text-to-Speech working. This example builds upon my previous posts on OCI Language, OCI Speech and others, so make sure you check out those posts.
The first thing to check before you proceed is whether Text-to-Speech is available in your region. At the time of writing, this feature was only available in Phoenix, which is one of the cloud regions I have access to. There are plans to roll it out to other regions, but I'm not aware of the timeline for this. Seeing Speech listed on your AI menu in OCI does not guarantee that the Text-to-Speech feature is available; it only means the speech transcription feature is available.
So if Text-to-Speech is available in your region, the following will get you up and running.
The first thing you need to do is read in the Config file from the OS.
#initial setup, read Config file, create OCI Client
import oci
from oci.config import from_file
##########
from oci_ai_speech_realtime import RealtimeSpeechClient, RealtimeSpeechClientListener
from oci.ai_speech.models import RealtimeParameters
##########
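CONFIG_PROFILE = "DEFAULT"
config = oci.config.from_file('~/.oci/config', profile_name=CONFIG_PROFILE)
# Compartment used by the calls below (not defined elsewhere in this post);
# placeholder value, replace it with your own compartment OCID
COMPARTMENT_ID = "<your-compartment-ocid>"
### Update region to point to Phoenix before creating the client
config.update({'region':'us-phoenix-1'})
print(config)
### Create the client from the updated config so it targets Phoenix
ai_speech_client = oci.ai_speech.AIServiceSpeechClient(config)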
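If reading the Config file fails, it can help to check which entries the profile is missing. The sketch below is my own illustration and does not use the OCI SDK: it parses the INI-style text with Python's configparser and reports which of the usual API-key entries (user, fingerprint, tenancy, region, key_file) are absent. The helper name and the sample values are hypothetical placeholders.

```python
import configparser

# Keys a profile in an OCI config file normally carries for API-key authentication.
REQUIRED_KEYS = ("user", "fingerprint", "tenancy", "region", "key_file")

def missing_profile_keys(config_text, profile="DEFAULT"):
    """Return the required keys that are absent from the given profile."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if profile not in parser:
        raise KeyError(f"profile {profile!r} not found")
    section = parser[profile]
    return [key for key in REQUIRED_KEYS if key not in section]

# Hypothetical file contents with placeholder values only (region left out on purpose).
SAMPLE = """
[DEFAULT]
user = ocid1.user.oc1..example
fingerprint = aa:bb:cc
tenancy = ocid1.tenancy.oc1..example
key_file = ~/.oci/oci_api_key.pem
"""

print(missing_profile_keys(SAMPLE))
```

For a real file, pass the contents of ~/.oci/config as config_text. The SDK also has its own check, oci.config.validate_config, which is worth trying first.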
A simple little test to see if the Text-to-Speech feature is enabled for your region is to display the available list of voices.
list_voices_response = ai_speech_client.list_voices(
    compartment_id=COMPARTMENT_ID,
    display_name="Text-to-Speech")
# opc_request_id="1GD0CV5QIIS1RFPFIOLF<unique_ID>")
# Get the data from response
print(list_voices_response.data)
This produces a long JSON object with many characteristics of the available voices. A simpler listing gives the name, gender and language of each voice.
# Print one line per voice: name, gender and language
for voice in list_voices_response.data.items:
    print(voice.display_name + ' [' + voice.gender + ']\t' + voice.language_description)
------
Brian [MALE] English (United States)
Annabelle [FEMALE] English (United States)
Bob [MALE] English (United States)
Stacy [FEMALE] English (United States)
Phil [MALE] English (United States)
Cindy [FEMALE] English (United States)
Brad [MALE] English (United States)
Richard [MALE] English (United States)
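When choosing a voice, it can be handy to group the names by gender. The sketch below is an illustration only: it uses stand-in objects carrying the same display_name and gender attributes as the listing above, and the helper name is hypothetical. With a real response you would pass list_voices_response.data.items instead of the sample list.

```python
from types import SimpleNamespace

def voices_by_gender(voices):
    """Group voice display names by their gender attribute, keeping listing order."""
    groups = {}
    for voice in voices:
        groups.setdefault(voice.gender, []).append(voice.display_name)
    return groups

# Stand-in data copied from the first four lines of the listing above.
sample_voices = [
    SimpleNamespace(display_name="Brian", gender="MALE"),
    SimpleNamespace(display_name="Annabelle", gender="FEMALE"),
    SimpleNamespace(display_name="Bob", gender="MALE"),
    SimpleNamespace(display_name="Stacy", gender="FEMALE"),
]

print(voices_by_gender(sample_voices))
```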
Now let's set up a Text-to-Speech example using the simple text "Hello. My name is Brendan and this is an example of using Oracle OCI Speech service". First, let's define a function to save the audio to a file.
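def save_audi_response(data, filename="tts_output.mp3"):
    # filename was undefined in the original; the default used here is an assumed name
    # Write the streamed audio chunk by chunk; the with block closes the file
    with open(filename, 'wb') as f:
        for b in data.iter_content():
            f.write(b)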
We can now establish a connection, define the text, call the OCI Speech function to create the audio, and then save the audio to a file.
import IPython.display as ipd
# Initialize service client with default config file
ai_speech_client = oci.ai_speech.AIServiceSpeechClient(config)
TEXT_DEMO = "Hello. My name is Brendan and this is an example of using Oracle OCI Speech service"
#speech_response = ai_speech_client.synthesize_speech(compartment_id=COMPARTMENT_ID)
speech_response = ai_speech_client.synthesize_speech(
    synthesize_speech_details=oci.ai_speech.models.SynthesizeSpeechDetails(
        text=TEXT_DEMO,
        is_stream_enabled=True,
        compartment_id=COMPARTMENT_ID,
        configuration=oci.ai_speech.models.TtsOracleConfiguration(
            model_family="ORACLE",
            model_details=oci.ai_speech.models.TtsOracleTts2NaturalModelDetails(
                model_name="TTS_2_NATURAL",
                voice_id="Annabelle"),
            speech_settings=oci.ai_speech.models.TtsOracleSpeechSettings(
                # TEXT_DEMO is plain text rather than SSML markup; "TEXT" may be the better fit here
                text_type="SSML",
                sample_rate_in_hz=18288,
                output_format="MP3",
                speech_mark_types=["WORD"])),
        audio_config=oci.ai_speech.models.TtsBaseAudioConfig(config_type="BASE_AUDIO_CONFIG") #, save_path='I'm not sure what this should be')
    )
)
# Get the data from response
#print(speech_response.data)
save_audi_response(speech_response.data)