
Manipulating Say with SSML

There are many cases where you need, or simply want, to control the pitch, volume and intonation of your prompts and responses. To make this easy, Tropo natively supports a standard called the Speech Synthesis Markup Language (SSML).

SSML is an international standard from the W3C for controlling the pace, tone, pitch and overall sound of computer-generated voices. Here’s a script that says the same sentence twice: once at normal speed and then again at half speed:

JavaScript:

say("<speak>One potato, two potato, three potato, four. <prosody rate='-50%'>One potato, two potato, three potato, four.</prosody></speak>");

Ruby:

say "<speak>One potato, two potato, three potato, four.
    <prosody rate='-50%'>One potato, two potato, three potato, four.</prosody>
    </speak>"

PHP:

<?php
    say("<speak>One potato, two potato, three potato, four. <prosody rate='-50%'>One potato, two potato, three potato, four.</prosody></speak>");
?>

Python:

say("<speak>One potato, two potato, three potato, four. <prosody rate='-50%'>One potato, two potato, three potato, four.</prosody></speak>")

Groovy:

say("<speak>One potato, two potato, three potato, four. <prosody rate='-50%'>One potato, two potato, three potato, four.</prosody></speak>")

The previous example uses the rate attribute of the SSML prosody element to control the playback speed. Other attributes of the prosody element include pitch, contour and volume.
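As a rough sketch of how the other prosody attributes might be combined, the Python snippet below builds a prosody-wrapped SSML string from keyword arguments. The helper name prosody is hypothetical and not part of Tropo. The values 'high' and 'loud' are standard SSML labels for pitch and volume. The fallback to print is an assumption added only so the snippet also runs outside Tropo's hosted environment, where say is not defined.

```python
from xml.sax.saxutils import escape, quoteattr


def prosody(text, **attrs):
    """Wrap text in <speak><prosody ...> using the given prosody attributes.

    Hypothetical helper for illustration; it is not a Tropo function.
    """
    # Sort attribute names so the generated markup is deterministic.
    attr_str = "".join(
        " %s=%s" % (name, quoteattr(value))
        for name, value in sorted(attrs.items())
    )
    # Escape characters such as & and < so the markup stays well-formed.
    return "<speak><prosody%s>%s</prosody></speak>" % (attr_str, escape(text))


ssml = prosody("One potato, two potato, three potato, four.",
               pitch="high", volume="loud")

# say() is provided by Tropo's hosted scripting environment.
# Outside it (assumption: local testing), just print the markup.
try:
    say(ssml)
except NameError:
    print(ssml)
```

Building the string in a helper keeps the attribute quoting and text escaping in one place. How a given pitch or volume label actually sounds depends on the text-to-speech voice in use.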

say-as

In addition to controlling pitch, volume and intonation, you sometimes need to control how the text-to-speech engine interprets text, especially numbers. The SSML say-as element lets you specify whether text should be interpreted as currency, digits, number, date, time or phone. Most of the options are self-explanatory, but note the difference between digits and number: digits reads each digit individually ('1234' becomes 'one, two, three, four'), while number reads the text as a single value ('1234' becomes 'one thousand two hundred thirty four'). Here's a code example showing the use of say-as:

JavaScript:

// Wrap a value in an SSML say-as element, log the markup, and speak it.
function say_as(value, type) {
    var ssml_start = "<?xml version='1.0'?><speak>";
    var ssml_end = "</speak>";
    var ssml = "<say-as interpret-as='" + type + "'>" + value + "</say-as>";
    var complete_string = ssml_start + ssml + ssml_end;
    log('@@ Say as: ' + complete_string);
    say(complete_string);
}

// Pause briefly before speaking.
wait(3000);

say_as('USD51.33', 'currency');
say_as('20314253', 'digits');
say_as('2031.435', 'number');
say_as('4075551212', 'phone');
say_as('20090226', 'date');
say_as('0515a', 'time');

Ruby:

# Wrap a value in an SSML say-as element, log the markup, and speak it.
def say_as(value, type)
    ssml_start = "<?xml version='1.0'?><speak>"
    ssml_end = "</speak>"
    ssml = "<say-as interpret-as='#{type}'>#{value}</say-as>"
    complete_string = ssml_start + ssml + ssml_end
    log '@@ Say as: ' + complete_string
    say complete_string
end

# Pause briefly before speaking.
wait(3000)

say_as('USD51.33', 'currency')
say_as('20314253', 'digits')
say_as('2031.435', 'number')
say_as('4075551212', 'phone')
say_as('20090226', 'date')
say_as('0515a', 'time')

PHP:

<?php
    // Wrap a value in an SSML say-as element, log the markup, and speak it.
    function say_as($value, $type) {
        $ssml_start = "<?xml version='1.0'?><speak>";
        $ssml_end = "</speak>";
        $ssml = "<say-as interpret-as=\"$type\">$value</say-as>";
        $complete_string = $ssml_start . $ssml . $ssml_end;
        _log('@@ Say as: ' . $complete_string);
        say($complete_string);
    }

    // Pause briefly before speaking.
    wait(3000);

    say_as("USD51.33", "currency");
    say_as("20314253", "digits");
    say_as("2031.435", "number");
    say_as("4075551212", "phone");
    say_as("20090226", "date");
    say_as("0515a", "time");
?>

Python:

# Wrap a value in an SSML say-as element, log the markup, and speak it.
def say_as(value, type):
    ssml_start = "<?xml version='1.0'?><speak>"
    ssml_end = "</speak>"
    ssml = "<say-as interpret-as='" + type + "'>" + value + "</say-as>"
    complete_string = ssml_start + ssml + ssml_end
    log('@@ Say as: ' + complete_string)
    say(complete_string)

# Pause briefly before speaking.
wait(3000)

say_as('USD51.33', 'currency')
say_as('20314253', 'digits')
say_as('2031.435', 'number')
say_as('4075551212', 'phone')
say_as('20090226', 'date')
say_as('0515a', 'time')

Groovy:

// Wrap a value in an SSML say-as element, log the markup, and speak it.
def say_as(value, type) {
    def ssml_start = "<?xml version='1.0'?><speak>"
    def ssml_end = "</speak>"
    def ssml = "<say-as interpret-as='$type'>$value</say-as>"
    def complete_string = ssml_start + ssml + ssml_end
    log('@@ Say as: ' + complete_string)
    say complete_string
}

// Pause briefly before speaking.
await(3000)

say_as('USD51.33', 'currency')
say_as('20314253', 'digits')
say_as('2031.435', 'number')
say_as('4075551212', 'phone')
say_as('20090226', 'date')
say_as('0515a', 'time')

SSML Support

Tropo supports all of the required elements of the SSML specification, with one exception. SSML defines a voice element that lets a developer specify the voice by name. In Tropo, voice is instead a parameter of the say and ask verbs and should be set there, as documented. Setting the voice within SSML will cause the text either to fail to render or to be spoken in a voice other than the one intended.
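A minimal Python sketch of that separation follows: the markup carries no voice element, and the voice travels alongside it as an option. Several things here are assumptions rather than confirmed details: the helper name voice_args is hypothetical, the voice name "allison" is only a placeholder, and passing options as a dictionary with a "voice" key in the second argument of say should be checked against the say documentation. The print fallback exists only so the snippet runs outside Tropo's hosted environment.

```python
def voice_args(text, voice):
    """Return (ssml, options) with the voice kept out of the markup.

    Hypothetical helper for illustration; it is not a Tropo function.
    """
    # No <voice> element here: the voice is carried as a say/ask option.
    ssml = "<speak>" + text + "</speak>"
    # Assumption: options are a dict with a "voice" key.
    options = {"voice": voice}
    return ssml, options


# "allison" is a placeholder voice name, not a confirmed value.
ssml, options = voice_args("Thanks for calling.", "allison")

# say() is provided by Tropo's hosted scripting environment.
# Outside it (assumption: local testing), just print what would be sent.
try:
    say(ssml, options)
except NameError:
    print(ssml)
    print(options)
```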