Internet Engineering Task Force D. Cromwell
INTERNET DRAFT M. Durling
File: draft-cromwell-navdec-media-req-00.txt Nortel Networks
Date: November 1998
Requirements For Control Of A Media Services Function
<draft-cromwell-navdec-media-req-00.txt>
Status of this Document
This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
To view the entire list of current Internet-Drafts, please check the
"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), ftp.nordu.net (Northern
Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au (Pacific
Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast).
Abstract
This document describes the functional requirements for a protocol used
by a call processing agent in a packet network to control a media
services function located in the same packet network or at the
interface of the packet network and the traditional telephone network.
The primary focus of the protocol is on audio; however, the protocol
could be extended in the future to support other media streams such as
video.
The protocol provides the standard audio operations of play audio,
collect DTMF, and record speech. It supports direct references to
simple audio as well as indirect references to simple and complex
audio. It provides multi-language audio variables, interruptibility
of audio, digit buffer control, special key sequences, and support
for reprompting during data collection.
Cromwell, Durling expires May 1999 [Page 1]
INTERNET DRAFT Media Services Control November 1998
The approach used in specifying protocol functionality was to look at
several existing protocols currently in use in the telephone network,
taking the best concepts from each and attempting to avoid any limi-
tations. The following protocols were examined: ITU CS1-R [2], Nor-
tel Extended CS1-R [3], Bellcore GR-1129 [4], and Bellcore SR-3511
[5]. The protocol described in this document provides at a minimum a
superset of the functionality of these protocols.
Table of Contents
1. Notation
2. Segments
2.1. Terminology
2.2. Segment Types
3. Variables
3.1. Specification
3.2. Inflection
3.3. Variable Types
3.4. Date
3.5. Digits
3.6. Duration
3.7. Money
3.8. Month
3.9. Number
3.10. Silence
3.11. String
3.12. Text
3.13. Time
3.14. Tone
3.15. Weekday
4. Operations
4.1. Play Operation
4.1.1. Announcements
4.1.2. Iterations
4.1.3. Duration
4.1.4. Speed
4.1.5. Volume
4.1.6. Optional Parameters
4.1.7. Return Values
4.1.8. Examples
4.2. Play Collect Operation
4.2.1. Announcements
4.2.2. Speed
4.2.3. Volume
4.2.4. Interruptibility
4.2.5. Digit Buffer Control
4.2.6. Pattern Matching
4.2.7. Timers
4.2.8. Key Definitions
4.2.9. Number Of Attempts
4.2.10. Optional Parameters
4.2.11. Return Values
4.2.12. Examples
4.3. Play Record Operation
4.3.1. Announcements
4.3.2. Speed
4.3.3. Volume
4.3.4. Interruptibility
4.3.5. Digit Buffer Control
4.3.6. Timers
4.3.7. Key Definitions
4.3.8. Optional Parameters
4.3.9. Return Values
4.3.10. Examples
5. Other Requirements
5.1. Invoke Application
5.2. Audio Management
6. Open Issues
7. Implementation
8. References
9. Author's Address
Notation
Protocol operations are represented in this document by pseudo code.
These representations are not intended to imply an actual implementa-
tion syntax and are for purposes of illustration only.
1. Segments
1.1. Terminology
A discrete unit of playable speech can be classified as a fragment, a
segment, or an announcement. A fragment is the smallest unit and
typically consists of one or more phonemes, e.g. "\w\", the first
sound in "welcome." A segment can be either composed of a series of
fragments or defined atomically and typically consists of one or more
words, e.g. "Welcome" or "Welcome to." An announcement is composed
of one or more segments and typically embodies a complete logical
expression, e.g. "Welcome to Bell South's Automated Directory Assis-
tance Service." It is possible for an announcement to be defined as
a single segment. In this document "announcement" is a logical con-
cept while "segment" refers to actual audio.
Media operations supported by the protocol should reference announce-
ments. Announcements should be specifiable either as a sequence of
segment id's given as a parameter to a media output operation or as a
sequence of segment id's provisioned in data and referenced by a
single identifier. This identifier can be used as a parameter to a
media output operation. Allowing both parameter driven and data
driven specification of announcements provides application designers
a great deal of flexibility when choosing an application and provi-
sioning model.
In practice, however, the majority of references made by a call
processing agent will likely be to a single segment that is a logically
complete pre-recorded announcement, e.g. play(27), where segment 27
points to a recording of "Please enter your card number after the
tone...<tone>".
1.2. Segment Types
The protocol should support the following segment types:
RECORDING: A reference by unique id to a single piece of pro-
visioned audio.
TEXT: A reference to a block of text to be converted to speech
or to be displayed on a device. Reference may be by unique id
to a block of provisioned text or by direct specification of
text in a parameter.
SILENCE: A specification of a length of silence to be played in
units of 100 milliseconds.
TONE: The specification of a tone to be played by algorithmic
generation. Most tones however will probably be recorded, not
generated. Exact specification of this segment type is tbd.
VARIABLE: The specification of a multilanguage voice variable
by type, subtype, language, and value. Specification of vari-
ables is considered in more detail in a subsequent section of
this document.
COMPOSITE: A reference by unique id to a provisioned sequence of
mixed recording, text, silence, tone, variable, or composite
segments.
Recursive definition of composite segments should be allowed.
For example, composite A could have as one of its elements
composite B, which has as one of its elements composite C. However,
this feature should be used with caution given the additional
complexity it introduces.
Direct or transitive definition of a composite segment in terms
of itself must not be permitted, e.g. composite A having as one
of its elements composite B, which has as one of its elements
composite A.
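The prohibition on self-referential composites could be enforced when
segments are provisioned. The following Python sketch is illustrative
only: the dictionary layout and the name find_composite_cycle are
assumptions made for this example and are not part of the protocol.

```python
def find_composite_cycle(composites):
    """Return a list of segment ids forming a cycle, or None if none exists.

    `composites` maps a composite segment id to the ordered list of
    segment ids it contains. Ids that are not keys of the mapping are
    treated as non-composite segments (recording, text, silence, tone,
    or variable).
    """
    UNVISITED, ON_PATH, DONE = 0, 1, 2
    state = {seg_id: UNVISITED for seg_id in composites}
    path = []

    def visit(seg_id):
        state[seg_id] = ON_PATH
        path.append(seg_id)
        for child in composites[seg_id]:
            if child not in composites:      # leaf segment, cannot recurse
                continue
            if state[child] == ON_PATH:      # child already on current path
                return path[path.index(child):] + [child]
            if state[child] == UNVISITED:
                cycle = visit(child)
                if cycle:
                    return cycle
        path.pop()
        state[seg_id] = DONE
        return None

    for seg_id in composites:
        if state[seg_id] == UNVISITED:
            cycle = visit(seg_id)
            if cycle:
                return cycle
    return None


# Legal nesting: A contains B, which contains C.
nested = {"A": [1, "B"], "B": [2, "C"], "C": [3]}
# Illegal: A contains B, which contains A.
looped = {"A": [1, "B"], "B": [2, "A"]}
```

A provisioning tool could reject any definition for which the function
returns a non-empty list.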
2. Variables
2.1. Specification
Variables should be specified by type, subtype, language, and value.
Subtype is a refinement of type. For example the variable type Money
might have an associated range of subtypes such as Dollar, Rupee,
Dinar, etc. Not all variables require a subtype, and for these vari-
ables the subtype parameter should be set to null.
ISO standard 639, Code For The Representation Of Names Of Languages
[6], lists the names of many languages and should be used as a start-
ing point in defining the range of available languages. A small
excerpt from ISO 639 follows:
_________________
|Code | Language |
|_____|__________|
| cs | Czech |
| cy | Welsh |
| da | Danish |
|_____|__________|
Note that ISO 639 is not a complete list. For example the standard
includes Chinese but does not mention the Mandarin or Cantonese
dialects.
In some cases it may be desirable to play an announcement with an
embedded variable without playing the variable itself. If the value
for a variable is NULL, the variable must not be played.
2.2. Inflection
Specification of inflection is beyond the scope of this protocol;
however, a media services function should support rising, flat, and
falling inflections as appropriate.
2.3. Variable Types
The protocol should support the following multilanguage voice vari-
ables and should be extensible to support additional variable types.
A list of supported variables follows:
 _____________________________
|Type     | Subtype           |
|_________|___________________|
|DATE     | none              |
|_________|___________________|
|DIGITS   | GENERIC           |
|         | NORTH AMERICAN DN |
|_________|___________________|
|DURATION | none              |
|_________|___________________|
|MONEY    | currency_type     |
|_________|___________________|
|MONTH    | none              |
|_________|___________________|
|NUMBER   | CARDINAL          |
|         | ORDINAL           |
|_________|___________________|
|SILENCE  | none              |
|_________|___________________|
|STRING   | none              |
|_________|___________________|
|TEXT     | DISPLAY           |
|         | SPEECH            |
|_________|___________________|
|TIME     | TWELVEHOUR        |
|         | TWENTYFOURHOUR    |
|_________|___________________|
|TONE     | none              |
|_________|___________________|
|WEEKDAY  | none              |
|_________|___________________|
2.4. Date
Speaks a date. For example "101598" is spoken as "October fifteenth
nineteen ninety eight."
2.5. Digits
Speaks a string of digits one at a time. If the subtype is North
American DN, the format of which is NPA-NXX-XXXX, the digits are
spoken with appropriate pauses between the NPA and NXX and between the
NXX and XXXX. If the subtype is generic, the digits are spoken with no
pauses.
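As an illustration of where the pauses fall, the following Python
sketch splits a digit string into the groups between which a pause
would be inserted. The function name digit_groups and the use of the
subtype names as plain strings are assumptions made for this example.

```python
def digit_groups(digits, subtype="GENERIC"):
    """Split a digit string into groups; a pause separates adjacent groups."""
    if not digits.isdigit():
        raise ValueError("digits must contain only 0-9")
    if subtype == "NORTH AMERICAN DN":
        if len(digits) != 10:
            raise ValueError("a North American DN has exactly 10 digits")
        # NPA (3 digits), NXX (3 digits), XXXX (4 digits)
        return [digits[0:3], digits[3:6], digits[6:10]]
    if subtype == "GENERIC":
        return [digits]                      # one group, so no pauses
    raise ValueError("unknown subtype: " + subtype)
```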
2.6. Duration
Duration is specified in seconds and is spoken in one or more units
of time as appropriate, e.g. "3661" is spoken as "One hour, one
minute, and one second."
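The conversion from seconds to units of time can be sketched as
follows. This Python fragment is illustrative only; the function names
are assumptions, and numbers are left as numerals because rendering
them as words is a language-specific task for the media services
function.

```python
def split_duration(total_seconds):
    """Break a duration in seconds into (hours, minutes, seconds)."""
    if total_seconds < 0:
        raise ValueError("duration must not be negative")
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return hours, minutes, seconds


def duration_phrase(total_seconds):
    """Build an English phrase that names only the non-zero units."""
    units = ("hour", "minute", "second")
    parts = []
    for count, unit in zip(split_duration(total_seconds), units):
        if count:
            parts.append(f"{count} {unit}" + ("" if count == 1 else "s"))
    if not parts:
        return "0 seconds"
    if len(parts) == 1:
        return parts[0]
    if len(parts) == 2:
        return f"{parts[0]} and {parts[1]}"
    return f"{parts[0]}, {parts[1]}, and {parts[2]}"
```

For the value "3661" used above, split_duration(3661) yields one hour,
one minute, and one second.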
2.7. Money
Money is specified in the smallest units of a given currency and is
spoken in one or more units of currency as appropriate, e.g. "110" in
U.S. Dollars would be spoken "one dollar and ten cents." The list of
currencies specified in ISO 4217, Currency And Funds Code List [7],
should be used as a starting point in defining the currency subtype.
A small excerpt from ISO 4217 follows:
__________________________________________________________
|Alpha-code | Numeric-code | Currency | Entity |
|___________|______________|__________|___________________|
|GQE | 226 | Ekwele | Equatorial Guinea |
|GRD | 300 | Drachma | Greece |
|GTQ | 320 | Quetzal | Guatemala |
|___________|______________|__________|___________________|
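The split of a value in smallest units into spoken units of currency
can be sketched as follows. This Python fragment is illustrative only:
the two lookup tables and the function names are assumptions made for
this example, and only the U.S. Dollar entry is filled in.

```python
# Hypothetical tables keyed by ISO 4217 alpha code.
MINOR_PER_MAJOR = {"USD": 100}
UNIT_NAMES = {"USD": ("dollar", "cent")}


def split_money(amount, currency):
    """Return (major_units, minor_units) for an amount in minor units."""
    if amount < 0:
        raise ValueError("amount must not be negative")
    return divmod(amount, MINOR_PER_MAJOR[currency])


def money_phrase(amount, currency):
    """Build an English phrase; numerals are left unconverted."""
    major, minor = split_money(amount, currency)
    major_name, minor_name = UNIT_NAMES[currency]
    parts = []
    if major:
        parts.append(f"{major} {major_name}" + ("" if major == 1 else "s"))
    if minor or not parts:
        parts.append(f"{minor} {minor_name}" + ("" if minor == 1 else "s"))
    return " and ".join(parts)
```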
2.8. Month
Speaks the specified month, e.g. "October."
2.9. Number
Speaks a number in cardinal form or in ordinal form. For example,
"100" is spoken as "one hundred" in cardinal form and "one hundredth"
in ordinal form.
2.10. Silence
Plays a specified period of silence. Specification is in 100 mil-
lisecond units.
2.11. String
Speaks each character of a string, e.g. "a34bc" is spoken "A, three,
four, b, c."
2.12. Text
Produces the specified text as speech or displays it on a device.
2.13. Time
Speaks a time (specified in twenty four hour format) in either twelve
hour format or twenty four hour format. For example "1700" is spoken
as "Five pm" in twelve hour format or as "Seventeen hundred hours" in
twenty four hour format.
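The conversion behind the twelve hour subtype can be sketched as
follows. This Python fragment is illustrative only; the function name
to_twelve_hour and the four-character "HHMM" input are assumptions
based on the "1700" example above.

```python
def to_twelve_hour(hhmm):
    """Convert a 24 hour "HHMM" string to (hour, minute, "am" or "pm")."""
    if len(hhmm) != 4 or not hhmm.isdigit():
        raise ValueError("expected four digits in HHMM form")
    hour, minute = int(hhmm[:2]), int(hhmm[2:])
    if hour > 23 or minute > 59:
        raise ValueError("time out of range")
    period = "am" if hour < 12 else "pm"
    hour12 = hour % 12 or 12                 # 0 and 12 are both spoken as 12
    return hour12, minute, period
```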
2.14. Tone
Plays an algorithmically generated tone, specification of which is
tbd. Probably most applications will use prerecorded tones.
2.15. Weekday
Speaks the day of the week, e.g. "Monday."
3. Operations
This section describes the functional requirements for a set of media
control operations. Three operations are defined: play, play collect,
and play record. Specification of endpoint, port, or channel on a per
operation basis is not a protocol requirement; however, it may be
required in particular implementations.
3.1. Play Operation
The play operation should play an announcement in situations where
there is no need for interaction with the user. Because there is no
need to monitor the incoming media stream this operation is an effi-
cient mechanism for treatments, informational announcements, etc.
The play operation should be specified as follows:
3.1.1. Announcements
The play operation should play an ordered sequence of one or more
segments of the following types: recording, text, silence, tone,
variable, and composite.
3.1.2. Iterations
The protocol should support specification of the maximum number of
times an announcement is to be played. It should be possible to
specify that an announcement be repeated forever, and it should also
be possible to specify an interval of silence (in 100 millisecond
units) to be inserted between announcement plays. If the number of
iterations is not specified, it should be assumed to be one (i.e. a
single play). If the inter-announcement interval is not specified,
it should be assumed to be one second.
3.1.3. Duration
The protocol should support specification of the maximum amount of
time (in 100 millisecond units) allowed to play and possibly replay
an announcement. It should be possible to specify that an announce-
ment be played forever. If duration is specified, it should take
precedence over iteration and interval. For example, if a 10 second
announcement is to be played 5 times with 2 seconds of silence
between plays, the total playing time would be 58 seconds. However,
if duration is set to 290 (29 seconds), the announcement will only be
played for 29 seconds, i.e. the entire announcement will be played
twice but the third play will be terminated after 5 seconds.
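The arithmetic in this example can be checked with a short sketch. The
Python function below is illustrative only; its name and signature are
assumptions. All values are in 100 millisecond units, so 29 seconds is
written as 290, and playing forever is not modelled.

```python
def play_schedule(length, iterations=1, interval=10, duration=None):
    """Return the (start, end) window of each play, in 100 ms units.

    `length` is the announcement length, `interval` the silence between
    plays, and `duration` an optional cap on the whole operation.
    """
    plays = []
    clock = 0
    for _ in range(iterations):
        if duration is not None and clock >= duration:
            break                            # cap reached during silence
        end = clock + length
        if duration is not None:
            end = min(end, duration)         # truncate the final play
        plays.append((clock, end))
        clock = end + interval
    return plays


uncapped = play_schedule(100, iterations=5, interval=20)
capped = play_schedule(100, iterations=5, interval=20, duration=290)
```

Here the last window of uncapped ends at 580 (58 seconds), while capped
contains two complete plays and a third that runs from 240 to 290,
i.e. 5 seconds.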
3.1.4. Speed
The relative playback speed of the announcement should be specifiable
as a percentage variation from the normal playback speed. This initial
setting should apply to the entire playing of the announcement
and should not be changeable. The normal playback speed and the
range of change allowed are implementation dependent.
3.1.5. Volume
The relative playback level of an announcement should be specifiable
as a percentage variation from the normal playback level. This initial
setting should apply to the entire playing of the announcement
and should not be changeable. The normal volume and the amount of
change allowed are implementation dependent.
3.1.6. Optional Parameters
All parameters to the play operation except the announcement parame-
ter are optional. Certain parameters default to reasonable values.
This allows the call agent to specify only the minimum set of parame-
ters it needs in a given situation. If an announcement is not speci-
fied an error will be returned to the call agent. The defaults are:
_______________________
|Parameter | Default |
|___________|__________|
|Iterations | 1 |
| Interval | 1 second |
|___________|__________|
3.1.7. Return Values
In addition to a return code that describes the outcome of a play
operation, the following information is returned:
The interrupting key sequence, if any.
If an announcement was interrupted, the length of the portion of
the announcement that was played before the interrupt.
3.1.8. Examples
Assume the following syntax:
__________________________________________________________________
| |
| play(announcement,iterations,interval,duration,speed,volume) |
|________________________________________________________________|
Play a single recording, text, or composite segment:
______________________
| |
| play(segment(5)) |
|____________________|
Play a sequence of three segments:
____________________________________________
| |
| play(segment(5),segment(6),segment(7)) |
|__________________________________________|
Play three seconds of silence:
_______________________
| |
| play(silence(30)) |
|_____________________|
Play text as speech:
___________________________
| |
| play(speech("hello")) |
|_________________________|
Display text on a device:
____________________________
| |
| play(display("hello")) |
|__________________________|
Play "Eleven dollars and fifty three cents" in English:
__________________________________________________
| |
| play(variable(MONEY,USDOLLARS,ENGLISH,1153)) |
|________________________________________________|
Specification of a variable without a subtype:
________________________________________
| |
| play(variable(DIGITS,,HINDI,1234)) |
|______________________________________|
Play a segment followed by 1 second of silence, followed by "one,
two, three, four" in Hindi, followed by another segment:
_____________________________________________________________________________
| |
| play(segment(45),silence(10),variable(DIGITS,,HINDI,1234),segment(543)) |
|___________________________________________________________________________|
The same operation as above. The sequence of segment, silence,
variable, and segment is defined in data as segment 37:
__________________________________________
| |
| play(segment(37),DIGITS,,HINDI,1234) |
|________________________________________|
Play an announcement 10% faster than normal speed and 5% softer than
normal volume:
____________________________________________
| |
| play(segment(7),speed(+10),volume(-5)) |
|__________________________________________|
Play an announcement three times with two seconds of silence between
plays:
__________________________________________________
| |
| play(segment(98),iterations(3),interval(20)) |
|________________________________________________|
The same operation as above, except that the operation is terminated
after twenty seconds:
________________________________________________________________
| |
| play(segment(98),iterations(3),interval(20),duration(200)) |
|______________________________________________________________|
3.2. Play Collect Operation
The play collect operation should play a prompt and collect DTMF
digits. If no digits are entered or an invalid digit pattern is
entered, the user may be reprompted and given another chance to enter
the correct digits. The play collect operation should be specified
as follows:
3.2.1. Announcements
The play collect operation should optionally play one or more
announcements, each consisting of an ordered sequence of one or more
segments of the following types: recording, text, silence, tone,
variable, and composite. All play collect announcements are optional
and some default to other announcements if they are not specified.
For example if the user fails to enter any digits the no digits
reprompt is played. If the no digits reprompt is undefined then the
reprompt is played. If the reprompt is undefined then the initial
prompt is played, and if the initial prompt is not defined then no
announcement is played.
This concept of cascading defaults allows the level of audio
customization to decay gracefully all the way back to a single
announcement for all errors and means that applications are not forced
to specify any more announcement functionality than they need. The
following announcements should be supported for the play collect
command. Default relationships are indicated by indentation.
INITIAL PROMPT - If the initial prompt is not specified, digit
collection should begin immediately.
   REPROMPT - Played after the user has made an error; asks
   the user to try again. Should default to Initial prompt if
   not set.
      NO DIGITS REPROMPT - Played when the user has not
      entered any digits. Should default to Reprompt if not
      set.
FAILURE ANNOUNCEMENT - Played when all data entry attempts
have failed.
SUCCESS ANNOUNCEMENT - Played when the operation has succeeded.
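The cascading defaults described above amount to a short lookup chain.
The following Python sketch is illustrative only; the dictionary keys
and the name resolve_prompt are assumptions made for this example.

```python
# Each announcement falls back to the one it is indented under.
FALLBACK = {
    "no_digits_reprompt": "reprompt",
    "reprompt": "initial_prompt",
}


def resolve_prompt(prompts, name):
    """Return the announcement to play for `name`, or None if none applies.

    `prompts` maps announcement names to segment ids; names that were
    not specified are simply absent (or mapped to None).
    """
    while name is not None:
        if prompts.get(name) is not None:
            return prompts[name]
        name = FALLBACK.get(name)            # None ends the chain
    return None
```

With only an initial prompt defined, a request for the no digits
reprompt resolves to the initial prompt. The failure and success
announcements have no fallback in this sketch because the text above
defines none for them.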
3.2.2. Speed
The relative playback speed of the announcement should be specifiable
as a percentage variation from the normal playback speed. This initial
setting should apply to the playing of all announcements associated
with a particular play collect operation. The normal playback
speed and the range of change allowed are implementation dependent.
3.2.3. Volume
The relative playback level of an announcement should be specifiable
as a percentage variation from the normal playback level. This initial
setting should apply to the playing of all announcements associated
with a particular play collect operation. The normal volume and the
amount of change allowed are implementation dependent.
3.2.4. Interruptibility
The play collect operation should support interruptibility by DTMF.
A prompt is interruptible if it stops playing when the user presses a
DTMF key; if it is non-interruptible it continues to play. Interrup-
tibility should be specifiable in a protocol command on a per segment
basis.
3.2.5. Digit Buffer Control
The protocol should support the ability to clear the digit buffer
prior to playing the initial prompt. The default should be to not
clear the buffer. By default the buffer should always be cleared fol-
lowing the playing of an uninterruptible segment and before playing a
reprompt in response to invalid input.
3.2.6. Pattern Matching
The protocol should support specification of the maximum and minimum
number of digits to collect. It should support digit pattern match-
ing using extended regular expressions as supported by the Rogue Wave
Class Library [8], which supports a subset of the POSIX.2 standard
[9] for regular expressions.
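As a rough illustration of combining the digit count limits with a
pattern, the following sketch uses Python's re module. This is a
stand-in: Python's syntax is not identical to POSIX.2 extended regular
expressions, although simple bracket expressions such as the one used
here behave the same way. The function name digits_match and the
choice to require the pattern to match the whole digit string are
assumptions made for this example.

```python
import re


def digits_match(digits, min_digits=None, max_digits=None, pattern=None):
    """Return True if the collected digits satisfy every given constraint."""
    if min_digits is not None and len(digits) < min_digits:
        return False
    if max_digits is not None and len(digits) > max_digits:
        return False
    if pattern is not None and re.fullmatch(pattern, digits) is None:
        return False
    return True


# First digit 3, 4, or 5; second digit anything except 5, 6, or 7.
TWO_DIGIT_PATTERN = "[3-5][^567]"
```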
3.2.7. Timers
The protocol should support the following event timers for the play
collect operation:
FIRST DIGIT - The amount of time allowed for the user to enter
the first digit. Specified in units of 100 milliseconds.
INTER DIGIT - The amount of time allowed for the user to enter
each subsequent digit. Specified in units of 100 milliseconds.
EXTRA DIGIT - The amount of time to wait for a user to enter a
digit once the maximum expected number of digits has been
entered. Specified in units of 100 milliseconds. Typically
this timer is used to wait for a terminating key in applications
where a specific key has been defined to terminate input.
This timer addresses the "# key ambiguity problem." Suppose the
application is expecting 5 digits terminated by the # key, but
the digits are valid even if not terminated by the # key. If the
digits are sent to the call processing agent as soon as the
fifth key is entered, the # key, when and if it is received, is
ambiguous: it could be interpreted as a terminating key for the
digits entered previously or as something else.
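The interaction of the three timers can be sketched as a small
simulation over timestamped key presses. This Python fragment is
illustrative only: the function name, the outcome labels, and the
extra digit default of 0 are assumptions made for this example, and
key definitions other than a single end key are ignored. Times are in
100 millisecond units, measured from the start of collection.

```python
def collect_digits(events, max_digits, first_digit=50, inter_digit=30,
                   extra_digit=0, end_key="#"):
    """Apply the timers to `events`, a time-ordered list of (time, key).

    Returns (outcome, digits), where outcome is "terminated", "complete",
    "timeout", or "no_digits".
    """
    digits = ""
    last_time = 0
    for time, key in events:
        if not digits:
            limit = first_digit              # waiting for the first digit
        elif len(digits) < max_digits:
            limit = inter_digit              # waiting for the next digit
        else:
            limit = extra_digit              # waiting for a terminator
        if time - last_time > limit:
            break                            # the running timer expired
        if key == end_key:
            return "terminated", digits
        if len(digits) == max_digits:
            break                            # not a terminator; left unread
        digits += key
        last_time = time
    if not digits:
        return "no_digits", digits
    if len(digits) == max_digits:
        return "complete", digits
    return "timeout", digits


# Five digits followed one second later by the # key.
keys = [(10, "1"), (15, "2"), (20, "3"), (25, "4"), (30, "5"), (40, "#")]
```

With an extra digit timer of 20 the # key is consumed as the
terminator of the five digits. With the timer at 0 the digits are
reported as complete immediately and the # key is left over, which is
the ambiguity described above.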
3.2.8. Key Definitions
The protocol should support the following keys: 0-9,#,*,A,B,C, and D
and should provide the ability to specify the semantics of keys
received during the play collect operation as defined below. Defined
keys are processed in the following order of precedence from highest
to lowest: command keys, playcontrol keys, startinput keys, and
endinput keys. Any keys not defined should be collected.
COMMAND KEY - A key followed by a sequence of zero or more keys
that has one of the following meanings:
RESTART - Discard any digits collected, replay the prompt,
and resume collection.
REINPUT - Discard any digits collected and resume
collection.
RETURN - Terminate the current operation and any queued
operations and return the terminating key sequence to the
call processing agent.
PLAYCONTROL KEY - A key that is valid only while an announcement
is playing and has one of the following meanings. Play control
keys are never collected.
POSITION - Stop playing the current announcement and resume
playing at another position within the announcement. A play
control key can be defined to resume playing at one of the
following positions: the beginning of the first, last,
previous, next, or the current segment of the announcement.
If the announcement consists of a single segment, the first
and previous positions are equivalent to the beginning of
the announcement. The last and next positions are
equivalent to the end of the announcement.
STOP - Terminate playback of the announcement.
STARTINPUT KEYS - A set of one or more keys that are acceptable
as the first digit collected. It should be possible to specify
for each key whether it interrupts a playing announcement or is
ignored while an announcement is playing.
ENDINPUT KEY - A key that signals the end of user input. It
should be possible to specify whether or not this key is included
in the collected digits.
The protocol should support specification of the maximum number of
times a user may use a restart key to restart the operation or use a
reinput key to re-attempt DTMF entry.
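The order of precedence among key definitions can be sketched as a
simple classifier. This Python fragment is illustrative only: the
dictionary layout and the name classify_key are assumptions, and a
command key is represented by its leading key alone, without the
sequence of keys that may follow it.

```python
# Highest precedence first, as stated in this section.
PRECEDENCE = ("command", "playcontrol", "startinput", "endinput")


def classify_key(key, key_block, announcement_playing):
    """Return the class a key is processed as, or "collect" if undefined."""
    for kind in PRECEDENCE:
        if kind == "playcontrol" and not announcement_playing:
            continue                         # valid only during playback
        if key in key_block.get(kind, ()):
            return kind
    return "collect"                         # undefined keys are collected


example_keys = {
    "command": {"*"},
    "playcontrol": {"1", "2"},
    "startinput": set("0123456789"),
    "endinput": {"#"},
}
```

In this example the key 1 is treated as a play control key while an
announcement is playing and as a start input key otherwise.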
3.2.9. Number Of Attempts
The protocol should support specification of the number of times the
user can attempt to make a valid entry.
3.2.10. Optional Parameters
All parameters to the play collect operation are optional. Certain
parameters default to reasonable values. This allows the call agent
to specify only the minimum set of parameters it needs in a given
situation. The defaults are:
________________________________
| Parameter | Default |
|___________________|___________|
| Iterations | 1 |
| Interval | 1 second |
| Clear DTMF | false |
|First digit timer | 5 seconds |
| Interdigit timer | 3 seconds |
| Start input key | 0-9 |
| End input key | # |
|Number of attempts | 1 |
|___________________|___________|
3.2.11. Return Values
In addition to a return code that describes the outcome of a play
collect operation, the following information is returned:
The interrupting key sequence, if any.
If an announcement was interrupted, the length of the portion of
the announcement that was played before the interrupt.
The number of attempts it took the user to enter a valid
sequence of DTMF keys.
The digits that were collected.
3.2.12. Examples
Assume the following syntax:
__________________________________________________________________________
| |
| play_collect(prompt_block,timer_block,key_block,pattern_block,speed, |
| volume,cleardigits,attempts) |
| |
| prompt_block = (initial_prompt,reprompt,no_digits_reprompt, |
| success_announcement,failure_announcement) |
| |
| timer_block = (first_digit,inter_digit,extra_digit) |
| |
| key_block = (command_block,playcontrol_block,startinput,endinput) |
| |
| pattern_block = (min_digits,max_digits,pattern) |
| |
|________________________________________________________________________|
Clear the digit buffer before initial prompt, play 5% faster than
normal speed, 2 percent less than normal volume, and give the user
three attempts to enter some valid data:
___________________________________________________________________________
| |
| play_collect(prompt_block,timer_block,key_block,speed(+5),volume(-2), |
| cleardigits(TRUE),attempts(3)) |
|_________________________________________________________________________|
Prompt block with only an initial prompt defined:
_________________________________________
| |
| prompt_block = (initial_prompt(87)) |
|_______________________________________|
Prompt block with all prompts defined:
______________________________________________________________________
| |
| prompt_block = (initial_prompt(87),reprompt(5), |
| no_digits_reprompt(419),failure_announcement(9), |
| success_announcement(18)) |
|____________________________________________________________________|
Timer block with first_digit timer set to 3 seconds and the
inter_digit timer set to 2 seconds (timers are specified in units of
100 milliseconds):
_____________________________________________________
| |
| timer_block = (first_digit(30),inter_digit(20)) |
|___________________________________________________|
Pattern block specifying collection of 1 to 4 digits:
___________________________________________________
| |
| pattern_block = (min_digits(1),max_digits(4)) |
|_________________________________________________|
Pattern block specifying collection of 2 digits where the first digit
is 3, 4, or 5 and the second digit is any digit except 5, 6, or 7:
________________________________________________________________________
| |
| pattern_block = (min_digits(2),max_digits(2),pattern([3-5][^567])) |
|______________________________________________________________________|
Key block specifying a set of digits that are valid as the first
digit of input and specifying that these keys will interrupt the
current announcement. It also specifies a key to end input; this key
is not included in any digits collected:
_________________________________________________________________
| |
| key_block = (command_block,playcontrol_block, |
| startinput(0-9,INTERRUPT),endinput(#,EXCLUDE)) |
|_______________________________________________________________|
Command_block specifying a restart key sequence and a return key
sequence:
____________________________________________________
| |
| command_block = ((*,76,RESTART),(*,83,RETURN)) |
|__________________________________________________|
3.3. Play Record Operation
The play record operation should play a prompt and record user
speech. If the user does not speak, the user may be reprompted and
given another chance to record. The play record operation should be
specified as follows:
3.3.1. Announcements
The play record operation should optionally play one or more
announcements, each consisting of an ordered sequence of one or more
segments of the following types: recording, text, silence, tone,
variable, and composite. All play record announcements are optional
and some default to other announcements if they are not specified.
For example if the user does not speak the no speech reprompt is
played. If the no speech reprompt is undefined then the reprompt is
played. If the reprompt is undefined then the initial prompt is
played, and if the initial prompt is not defined then no announcement
is played.
This concept of cascading defaults allows the level of audio
customization to decay gracefully all the way back to a single
announcement for all errors and means that applications are not forced
to specify any more announcement functionality than they need. The
following announcements should be supported for the play record
command. Default relationships are indicated by indentation.
INITIAL PROMPT - If the initial prompt is not specified,
recording should begin immediately.
   REPROMPT - Played after the user has made an error; asks
   the user to try again. Should default to Initial prompt if
   not set.
      NO SPEECH REPROMPT - Played when the user has not
      spoken. Should default to Reprompt if not set.
FAILURE ANNOUNCEMENT - Played when all data entry attempts
have failed.
SUCCESS ANNOUNCEMENT - Played when the operation has succeeded.
3.3.2. Speed
The relative playback speed of the announcement should be specifiable
as a percentage variation from the normal playback speed. This initial
setting should apply to all announcements associated with a
particular play record operation. The normal playback speed and the
range of change allowed are implementation dependent.
3.3.3. Volume
The relative playback level of an announcement should be specifiable
as a percentage variation from the normal playback level. This
initial setting should apply to all announcements associated with a
particular play record operation. The normal volume and the amount of
change allowed are implementation dependent.
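As a rough illustration, a percentage variation for speed or volume
might be applied as follows. The function name apply_variation, the
linear scaling, and the 50 percent limit are assumptions chosen for the
sketch; the actual normal value and allowed range are implementation
dependent.

```python
def apply_variation(normal, percent, max_change=50):
    """Scale `normal` by a signed percentage.

    `max_change` is a hypothetical implementation limit; requests
    outside +/- max_change are clamped to that range.
    """
    clamped = max(-max_change, min(max_change, percent))
    return normal * (100 + clamped) / 100
```

For example, apply_variation(100, 5) yields 105.0 and
apply_variation(100, -2) yields 98.0.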
3.3.4. Interruptibility
The play record operation should support interruptibility by DTMF. A
prompt is interruptible if it stops playing when the user presses a
DTMF key; if it is non-interruptible it continues to play. Interrup-
tibility is specifiable in a protocol command on a per segment basis.
3.3.5. Digit Buffer Control
The protocol should support the ability to clear the digit buffer
prior to playing the initial prompt. The default should be to not
clear the buffer. By default the digit buffer should always be
cleared following the playing of an uninterruptible segment, before
playing a reprompt in response to invalid input, and before beginning
a recording.
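The digit buffer rules above might be expressed as a single decision
function. This is a sketch under assumptions: the event names and the
function should_clear_digit_buffer are hypothetical and exist only for
illustration.

```python
# Hypothetical names for the points at which the buffer is always
# cleared by default.
ALWAYS_CLEAR = {
    "after_uninterruptible_segment",
    "before_invalid_input_reprompt",
    "before_recording",
}


def should_clear_digit_buffer(event, cleardigits=False):
    """Decide whether the digit buffer is cleared at `event`."""
    if event == "before_initial_prompt":
        # Cleared only on request; the default is not to clear.
        return cleardigits
    return event in ALWAYS_CLEAR
```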
3.3.6. Timers
The protocol should support the following event timers for the play
record operation:
PRE-SPEECH - The amount of time to wait for the user to
initially speak. Specified in units of 100 milliseconds.
POST-SPEECH - The amount of silence necessary after the end of
the last speech segment for the recording to be considered
complete. Specified in units of 100 milliseconds.
TOTAL RECORDING LENGTH - The maximum allowable length of the
recording, not including pre-speech or post-speech silence.
Specified in units of 100 milliseconds.
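One way a media services function might evaluate these timers is
sketched below. The class RecordTimers, the function check_timers, and
its arguments are assumptions made for the example; only the three
timers and their 100 millisecond units come from the text above.

```python
from dataclasses import dataclass

UNIT_MS = 100  # timers are specified in units of 100 milliseconds


@dataclass
class RecordTimers:
    pre_speech: int              # units of 100 ms
    post_speech: int             # units of 100 ms
    total_recording_length: int  # units of 100 ms


def check_timers(timers, speech_started, silence_ms, recorded_ms):
    """Return the name of the timer that has expired, or None.

    silence_ms  - length of the current run of silence
    recorded_ms - speech recorded so far, excluding pre-speech and
                  post-speech silence
    """
    if not speech_started:
        if silence_ms >= timers.pre_speech * UNIT_MS:
            return "pre_speech"  # the user never spoke
        return None
    if silence_ms >= timers.post_speech * UNIT_MS:
        return "post_speech"  # the recording is considered complete
    if recorded_ms >= timers.total_recording_length * UNIT_MS:
        return "total_recording_length"
    return None
```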
3.3.7. Key Definitions
The protocol should support the following keys: 0-9, #, *, A, B, C,
and D, and should provide the ability to specify the semantics of keys
received during the play record operation as defined below. Defined
keys are processed in the following order of precedence from highest
to lowest: command keys, playcontrol keys, and endinput key.
COMMAND KEY - A key followed by a sequence of zero or more keys
that has one of the following meanings:
   RESTART - Discard any recording in progress, replay the
   prompt, and resume collection.
   REINPUT - Discard any recording in progress and resume
   collection.
   RETURN - Terminate the current operation and any queued
   operations and return the terminating key sequence to the
   call processing agent.
PLAYCONTROL KEY - A key that is valid only while an announcement
is playing and has one of the following meanings. Play control
keys are never collected.
   POSITION - Stop playing the current announcement and resume
   playing at another position within the announcement. A play
   control key can be defined to resume playing at one of the
   following positions: the beginning of the first, last,
   previous, next, or the current segment of the announcement.
   If the announcement consists of a single segment, the first
   and previous positions are equivalent to the beginning of
   the announcement. The last and next positions are
   equivalent to the end of the announcement.
   STOP - Terminate playback of the announcement.
ENDINPUT KEY - A key that signals the end of user input. It
should be possible to specify whether or not this key is included
in the collected digits.
The protocol should support specification of the maximum number of
times a user may use a restart key to restart the operation or use a
reinput key to re-attempt recording.
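The order of precedence among defined keys can be sketched as follows.
This is a simplified illustration, not a protocol definition: the
function classify_key and its arguments are hypothetical, and command
keys are reduced to single keys rather than key sequences.

```python
def classify_key(key, command_keys, playcontrol_keys, endinput_key,
                 playing):
    """Return (kind, meaning) for `key`, honoring the precedence order.

    command_keys and playcontrol_keys map a key to its meaning, for
    example {"*": "RESTART"}. `playing` is True while an announcement
    is being played.
    """
    # Highest precedence: command keys.
    if key in command_keys:
        return ("command", command_keys[key])
    # Play control keys are valid only while an announcement plays.
    if playing and key in playcontrol_keys:
        return ("playcontrol", playcontrol_keys[key])
    # Lowest precedence: the end input key.
    if key == endinput_key:
        return ("endinput", None)
    return ("undefined", None)
```

If the same key is defined both as a play control key and as the end
input key, it acts as a play control key while an announcement is
playing and as the end input key otherwise.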
3.3.8. Optional Parameters
All parameters to the play record operation are optional. Certain
parameters should default to reasonable values. This allows the call
agent to specify only the minimum set of parameters it needs in a
given situation. The defaults are:
__________________________________
| Parameter          | Default   |
|____________________|___________|
| Iterations         | 1         |
| Interval           | 1 second  |
| Clear DTMF         | false     |
| Pre speech timer   | 3 seconds |
| Post speech timer  | 2 seconds |
| Start input key    | 0-9       |
| End input key      | #         |
| Number of attempts | 1         |
|____________________|___________|
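A call agent or media services function might apply these defaults by
merging them with whatever parameters were actually specified. The
sketch below assumes hypothetical parameter names (the dictionary keys)
and a hypothetical helper, effective_parameters; only the default
values are taken from the table.

```python
# Default values from the table above. The key names are hypothetical;
# the "_s" suffix marks values expressed in seconds, as in the table.
DEFAULTS = {
    "iterations": 1,
    "interval_s": 1,
    "clear_dtmf": False,
    "pre_speech_timer_s": 3,
    "post_speech_timer_s": 2,
    "start_input_key": "0-9",
    "end_input_key": "#",
    "number_of_attempts": 1,
}


def effective_parameters(**specified):
    """Overlay the specified parameters on the defaults."""
    return {**DEFAULTS, **specified}
```

A request that specifies only a pre speech timer of 5 seconds keeps
every other default, including "#" as the end input key.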
3.3.9. Return Values
In addition to a return code that describes the outcome of a play
record operation, the following information is returned:
The interrupting key sequence, if any.
If an announcement was interrupted, the length of the portion of
the announcement that was played before the interrupt.
The number of attempts it took the user to make a recording.
A reference to any recording that was made.
3.3.10. Examples
Assume the following syntax:
___________________________________________________________________
|                                                                 |
|  play_record(prompt_block,timer_block,key_block,speed,volume,   |
|              cleardigits,attempts)                              |
|                                                                 |
|  prompt_block = (initial_prompt,reprompt,no_speech_reprompt,    |
|                  success_announcement,failure_announcement)     |
|                                                                 |
|  timer_block = (pre_speech,post_speech,total_recording_length)  |
|                                                                 |
|  key_block = (command_block,playcontrol_block,endinput)         |
|_________________________________________________________________|
Clear the digit buffer before the initial prompt, play all
announcements at 5 percent faster than normal speed and 2 percent
lower than normal volume, and give the user only one attempt to make
a recording:
__________________________________________________________________________
|                                                                        |
|  play_record(prompt_block,timer_block,key_block,speed(+5),volume(-2),  |
|              cleardigits(TRUE),attempts(1))                            |
|________________________________________________________________________|
Specify an initial prompt and a reprompt:
______________________________________________
|                                            |
|  prompt_block = (initial(3),reprompt(45))  |
|____________________________________________|
Specify a pre-speech timer of 5 seconds and a post-speech timer of 2
seconds (timer values are in units of 100 milliseconds):
____________________________________________________
|                                                  |
|  timer_block = (pre_speech(50),post_speech(20))  |
|__________________________________________________|
4. Other Requirements
This section describes other functional requirements related to media
control. These operations do not necessarily directly map to actual
commands in a protocol implementation.
4.1. Invoke Application
The protocol should support invocation of a custom application
residing on the media services function by application id and an
accompanying unstructured data block.
4.2. Audio Management
Audio recordings are temporary by default and exist only for the life
of the call. The protocol should provide the capability to change at
call time the status of a piece of audio from temporary to permanent
or from permanent to temporary.
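The temporary and permanent states described above might be tracked as
in the following sketch. The class AudioStore and its methods are
hypothetical and only illustrate the intended lifetime rules.

```python
class AudioStore:
    """Tracks whether each recording is temporary or permanent."""

    def __init__(self):
        self._permanent = {}  # recording id -> True if permanent

    def add_recording(self, rec_id):
        # Recordings are temporary by default.
        self._permanent[rec_id] = False

    def set_permanent(self, rec_id, permanent):
        # The status may be changed in either direction at call time.
        if rec_id not in self._permanent:
            raise KeyError(rec_id)
        self._permanent[rec_id] = permanent

    def end_call(self):
        """Discard temporary recordings; return the ids that remain."""
        self._permanent = {r: p for r, p in self._permanent.items() if p}
        return sorted(self._permanent)
```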
5. Open Issues
The following issues are unresolved:
1. Support for voice recognition.
2. Specification of a dynamically generated TONE segment.
6. Implementation
Some systems may not be capable of supporting the entire protocol.
Implementations of a subset of the protocol should make every attempt
to remain logically consistent.
7. References
[1] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[2] ITU Recommendation Q.1218, "INAP Protocol For Support Of
Capability Set 1", April-May, 1995.
[3] Nortel CS1-R Extensions Specification, internal Nortel
document.
[4] Bellcore GR-1129-CORE, "AINGR: Switch-Intelligent
Peripheral Interface (IPI)", Issue 3, September 1997.
[5] Bellcore SR-3511, "ISCP-IP Interface Specification", Issue
2, Version 5.0, January 1997.
[6] ISO 639, "Code For The Representation Of Names Of
Languages", 1998.
[7] ISO 4217, "Currency And Funds Code List", 1981.
[8] Tools.h++ Class Reference Version 7, Rogue Wave Software
Inc., 1996.
[9] ANSI/IEEE Standard 1003.2 (Portable Operating System
Interface), Version D11.2, September 1991.
8. Authors' Addresses
David Cromwell
Nortel Networks
Box 13010
Research Triangle Park, NC 27709
Phone: (919) 992-1373
email: cromwell@nortel.ca
Michael Durling
Nortel Networks
Box 13010
Research Triangle Park, NC 27709