Network Working Group                                          E. Burger
Internet Draft                                  SnowShore Networks, Inc.
Document: draft-burger-sipping-em-rqt-00.txt            October 12, 2001
Category: Informational
Expires: April 12, 2002
Why Early Media in SIP
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026 [1].
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts. Internet-Drafts are draft documents valid for a maximum of
six months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-Drafts
as reference material or to cite them other than as "work in
progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Discussions of requirements for SIP occur in the SIPPING working
group. The SIPPING working group homepage is at
<http://www.softarmor.com/sipping/>. The SIPPING discussion list is
at sipping@ietf.org.
1. Abstract
This document describes the requirements for SIP Early Media. Early
Media is the ability of a SIP network to deliver real-time media
traffic from the called party to the calling party after the calling
party issues an INVITE but before the called party accepts the
INVITE with a 200 response.
2. Conventions used in this document
This document refers to a media server for playing announcements. A
media server is a general-purpose media resource processor that is
capable of tone detection and generation, conferencing, interactive
voice response, and announcements. We use the term media server in
this document for simplicity. However, any SIP endpoint, such as an
Burger INFORMATIONAL - Expires 4/2002 1
Why Early Media in SIP October 2001
intelligent SIP Phone or a dumb announcement server, can play the
role described, as appropriate to the situation.
This document refers to the calling party (the SIP User Agent Client
or UAC) in the masculine (he, him, his) and the called party (the
SIP User Agent Server or UAS) in the feminine (she, her, hers).
This convention is purely for convenience and makes no assumption
about the gender of the parties involved.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
this document are to be interpreted as described in RFC-2119 [2].
TABLE OF CONTENTS
1. Abstract...........................................................1
2. Conventions used in this document..................................1
3. Informative Text...................................................2
4. Introduction.......................................................3
5. PSTN Early Media...................................................3
6. SIP Early Media....................................................4
6.1. PSTN Interworking................................................4
6.1.1. Proxy Announcement.............................................5
6.1.2. Early Media with SIP Announcements.............................6
6.2. SIP Endpoints....................................................7
6.2.1. Intelligent SIP Endpoint.......................................7
6.2.2. Intelligent SIP Endpoint with External Media Server............8
7. Security Considerations............................................9
8. References.........................................................9
9. Acknowledgments...................................................10
10. Author's Addresses...............................................10
11. Full Copyright Statement.........................................11
3. Informative Text
Note well that while there are call flows in this document, they are
purely informative. Implementers MUST NOT depend on mechanisms or
proposals in this document as an agreed-upon standard of any type.
The good call flows will make it into their own, normative document.
This is an INFORMATIONAL document, not a STANDARDS TRACK document.
4. Introduction
People have written at length about how to do early media in SIP
[3]. However, no one has published why there is a need for early
media. Some people state there is no need for early media with the
exception of providing an early talk path in the event the media
arrives at the caller's user agent before the signaling does.
Others state there is a need for early media to replicate the
billing mechanisms of the legacy PSTN. Neither position is entirely
correct.
This document describes a number of scenarios where early media may
be appropriate. Note that this document is a true Request for
Comments: if you know of other scenarios where early media is
appropriate, or if you can implement the scenarios without resorting
to early media, please contact the author and participate on the
SIPPING list.
5. PSTN Early Media
The PSTN uses early media for two purposes. The first is to deliver
the far-end talk path to the caller. If the far-end talk path were
not open before the answer signaling arrived, the network could
clip the initial utterance spoken by the called party. For example,
if the called party answers the phone and says "Hello", the caller
might hear only "Lo". With early media, the calling party hears the
entire utterance.
The second purpose for early media in the PSTN is to deliver
signaling information to the caller in the absence of a digital
signaling path.
        Digital             Analog
       Signaling           Signaling
+--------+   +----+     +----+   +----+   +--------------+
| Caller |---| CO | ... | CO |<--| EO |---| Called Party |
+--------+   +----+     +----+   +----+   +--------------+
The Caller's terminal device may have digital signaling to the PSTN,
such as through ISDN BRI. In the ISDN, the terminal device can
display call progress and generate local signaling tones. Call
progress is the state of the call attempt: Alerting (ringing), Busy,
No Answer, Redirected, etc. Signaling tones are the country-
specific tones you hear when you place a call. Examples of such
tones are dial tone, ring tone, busy tone, reorder tone, etc. ITU-T
document E.180 Supplement 2 [4] describes the actual tones in use in
different countries.
Even though the caller's device can generate and display call
progress, call progress information may not be available. The
Figure above shows a typical situation where the call progress
information is not available in a digital form. In the depicted
situation, the terminating End Office (EO) does not have digital
signaling connectivity to its adjacent Central Office (CO). The
only way for the EO to signal the call progress of the caller's call
attempt is with in-band tones. In the PSTN this is not a major
problem in that the caller expects to hear ring tone, busy tone, and
so on.
For the most part, callers cannot tell the difference between
locally generated and remotely generated call progress tones. One
situation where the caller can tell that call progress comes from
the far end is an international call. As one can see from E.180
Supplement 2, there are many different tone plans for call progress.
Said differently, when placing an international call, the call
progress tones sound "foreign." Currently, almost all international
calls use the far-end generated signaling tones in early media for
caller signaling.
6. SIP Early Media
This section describes various scenarios that people have proposed
for early media in SIP. For some of the scenarios, there are
alternative ways of achieving the same results without resorting to
early media. For others, early media appears to be the appropriate
solution.
6.1. PSTN Interworking
Consider the following network topology.
                                  SIP  +-------+
                                 ------| Proxy |
+--------+   +----+     +----+  /      +-------+
| Caller |===| CO | ... | GW |<            | SIP
+--------+   +----+     +----+  \ RTP  +--------+
                                 ======| Media  |
                                       | Server |
                                       +--------+
In this figure, the caller places a call to a SIP endpoint. Before
terminating the call to the SIP endpoint or rejecting the call, the
proxy wishes to play an announcement to the caller. The
announcement could be verbal, but often consists of call progress
tones.
Some drafts propose using early media between the Media Server and
the gateway. Proposed call flows (for example [5] and [6]) follow
this general theme. The gateway invites the SIP endpoint. A proxy
intercepts the call and routes it to a media server, which plays
the appropriate call signaling tones. In the example
portrayed in the figure below, the media server plays an
announcement and then returns a 486 Busy Here indication. A
scenario for this would be a Do Not Disturb service that plays a
"I'm sorry, but Chris is not available. Please try her again
later." message and then returns busy.
6.1.1. Proxy Announcement
GW                   Proxy              Media Server
|                      |                      |
|        INVITE        |                      |
|--------------------->|        INVITE        |
|                      |--------------------->|
|                      |      180 Ringing     |
|      180 Ringing     |<---------------------|
|<---------------------| 183 Session Progress |
| 183 Session Progress |<---------------------|
|<---------------------|                      |
|                     RTP                     |
|<============================================|
|                      |    486 Busy Here     |
|    486 Busy Here     |<---------------------|
|<---------------------|                      |
|                      |                      |
The 183 and 486 responses are used on the SIP side of the network
because mappings from SIP to ISUP, for example, map 200 OK to
answered, 180 to ringing, 486 to busy, etc.
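As a rough illustration of why the choice of status codes matters, the
gateway's mapping can be sketched as a lookup table. The sketch below
is in Python. The state names and the function names are assumptions
made for this example and are not taken from any SIP-ISUP mapping
specification. Only the 200, 180, and 486 entries come from the text
above; the 183 entry is an assumption for illustration.

```python
from typing import Iterable

# Illustrative only: how a gateway might translate SIP responses into
# PSTN-side call states. The state names are assumptions of this sketch.
SIP_TO_PSTN_STATE = {
    180: "ringing",
    183: "in-band progress",  # assumption: early media is cut through
    200: "answered",
    486: "busy",
}


def pstn_state_for(status_code: int) -> str:
    """Return the PSTN-side call state for a SIP status code."""
    return SIP_TO_PSTN_STATE.get(status_code, "unmapped")


def call_is_answered(responses: Iterable[int]) -> bool:
    """A call counts as answered only if a response maps to 'answered'."""
    return any(pstn_state_for(code) == "answered" for code in responses)
```

For the responses in the proxy announcement flow (180, 183, 486),
call_is_answered returns False. The PSTN side never sees an answer,
even though the caller heard the announcement.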
It may be appropriate for the gateway and proxy to use early media
signaling. Early media signaling accurately indicates the state of
the call.
In the scenario described above, the media server generates the call
progress state (e.g., 180, 183, and 486). The proxy, as defined in
Section 17 of [3], passes all responses to the gateway unmodified.
The proxy could inform the media server which interim call progress
state and final result code to return. It could carry this
information in the INVITE, possibly as parameters to the
Request-URI, in a new SIP header, or in the message body.
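To make the Request-URI option concrete, here is a minimal Python
sketch. The parameter names "interim" and "final", the user "annc",
and the host "ms.example.com" are all hypothetical. No SIP
specification or draft defines them; they only show the shape such a
mechanism might take.

```python
# Hypothetical sketch: the proxy encodes the desired call progress in
# Request-URI parameters. The parameter names "interim" and "final" are
# invented for this example and are not defined by any specification.

def build_request_uri(user: str, host: str, interim: int, final: int) -> str:
    """Build a SIP Request-URI carrying the response codes to generate."""
    return f"sip:{user}@{host};interim={interim};final={final}"


def parse_progress_params(request_uri: str) -> dict:
    """Extract the hypothetical interim/final parameters (media server side)."""
    # URI parameters follow the first ';' and are separated by ';'.
    _, _, param_part = request_uri.partition(";")
    params = {}
    for item in param_part.split(";"):
        name, sep, value = item.partition("=")
        if sep and name in ("interim", "final"):
            params[name] = int(value)
    return params


uri = build_request_uri("annc", "ms.example.com", interim=183, final=486)
print(uri)
print(parse_progress_params(uri))
```

A new header or a message body would carry the same two values. The
Request-URI form is shown only because it is the shortest to write
down.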
One point in favor of such a configuration is that it is trivially
easy to correlate the gateway-proxy leg and the proxy-media server
leg. They are both part of the same, proxied call.
Although this scenario is workable, it does not truly drive a
requirement for early media between the proxy and the media server.
There are other configurations that can satisfy the need for early
media signaling to the gateway without passing early media
throughout the SIP network.
6.1.2. Early Media with SIP Announcements
As described above, it is possible for the media server to drive
call progress. However, by definition [7], media servers do not
have the application logic to determine the appropriate interim and
final result codes.
It is more appropriate for the proxy to hand off the call to an
application server, or have the application server functionality
built-in, to terminate the call signaling and send the appropriate
events to the gateway.
GW        Proxy    App Server    Media Server
|           |           |              |
|  INVITE   |           |              |
|---------->|  INVITE   |              |
|           |---------->|    INVITE    |
|           |    180    |------------->|
|    180    |<----------|    200 OK    |
|<----------|    183    |<-------------|
|    183    |<----------|     ACK      |
|<----------|           |------------->|
|                  RTP                 |
|<=====================================|
|           |           |              |
|           |           |     BYE      |
|           |    486    |<-------------|
|    486    |<----------|   OK (BYE)   |
|<----------|           |------------->|
|           |           |              |
In this configuration, the interaction between the application
server and the gateway is standard SIP and SIP-PSTN inter-working.
The status codes and the fact that there is early media make sense
within the SIP framework. Likewise, the interaction between the
application server and the media server makes sense. While the call
from the gateway to the intended SIP endpoint may or may not be
successful, the call from the application server to the media server
is successful.
The application server contains all of the state machine and call
logic for generating the proper result codes in the direction of the
gateway.
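The application server logic in this section can be sketched as a
small event handler. The Python below is a minimal sketch, not an
implementation of any product or draft. The class name, the method
names, and the use of plain lists to record messages are all
assumptions made for illustration.

```python
# Sketch of the application server logic in Section 6.1.2. Event and
# method names are assumptions made for this example.

class AnnouncementAppServer:
    """Maps events on the media server leg to responses on the gateway leg."""

    def __init__(self, final_code=486):
        self.final_code = final_code     # result the gateway finally sees
        self.sent_to_gateway = []        # responses sent toward the gateway
        self.sent_to_media_server = []   # messages sent on the other leg

    def on_gateway_invite(self):
        # Start a normal call to the media server and report progress.
        self.sent_to_media_server.append("INVITE")
        self.sent_to_gateway.append(180)

    def on_media_server_200(self):
        # The media server leg is a completed call; the gateway leg is
        # only told that early media is coming.
        self.sent_to_gateway.append(183)
        self.sent_to_media_server.append("ACK")

    def on_media_server_bye(self):
        # The announcement is done: reject the gateway leg and confirm
        # the BYE on the media server leg.
        self.sent_to_gateway.append(self.final_code)
        self.sent_to_media_server.append("200 OK (BYE)")


server = AnnouncementAppServer()
server.on_gateway_invite()
server.on_media_server_200()
server.on_media_server_bye()
print(server.sent_to_gateway)        # [180, 183, 486]
print(server.sent_to_media_server)   # ['INVITE', 'ACK', '200 OK (BYE)']
```

The gateway leg never carries a 200, while the media server leg is an
ordinary completed call.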
One may expect proxies that implement SIP-PSTN inter-working to have
the application server functionality built in. This makes the
characteristics of the proxy more like a back-to-back user agent
(B2BUA).
Such a configuration might impact billing systems that use SIP
signaling for billing. Namely, there are now completed calls in the
network, between the application server and the media server. The
billing system may need to correlate the gateway-application server
leg with the application server-media server leg. This is not an
insurmountable problem, however.
In addition, such a configuration has an added benefit. Service
providers can now outsource announcement services. Consider the
following administrative mapping.
                                     :
         Administrative              :  Administrative
            Domain 1                 :     Domain 2
                                     :
+----+  +-------+  +------------+    :  +--------------+
| GW |  | Proxy |  | App Server |    :  | Media Server |
+----+  +-------+  +------------+    :  +--------------+
                                     :
Here, announcements are a service of administrative domain 2, while
PSTN termination and routing are a service of administrative domain
1. By using normal signaling between the application server and the
media server, SIP-based billing between the two domains works as
usual. There are completed "calls" to the announcement service, as
opposed to incomplete "early media sessions".
6.2. SIP Endpoints
6.2.1. Intelligent SIP Endpoint
Is early media purely a PSTN - SIP inter-working problem? We
propose that it is not. Here is an example of a service that is a
pure SIP user agent to SIP user agent interaction that requires
early media.
Consider an intelligent SIP phone with a Do Not Disturb feature.
The user of the phone can record or select an announcement to play
when a caller calls. Once the announcement plays, the phone rejects
the call ("hangs up").
In this scenario, a 200 OK response from the phone to the INVITE
would be incorrect. The user agent is not accepting the call. In
fact, the user agent will ultimately reject the call.
Here is a sample call flow.
UAC                    UAS
|                      |
|        INVITE        |
|--------------------->|
|      180 Ringing     |
|<---------------------|
| 183 Session Progress |
|<---------------------|
|          RTP         |
|<=====================|
|                      |
|    486 Busy Here     |
|<---------------------|
|                      |
One might ask why the UAS does not simply return a 486 result code
with Error-Info filled in with a URI for the announcement. In
certain circumstances that may be appropriate. However, a SIP Phone
UAS is unlikely to have the capacity to handle arbitrary
announcement requests simultaneously with arbitrary inbound calls.
Using early media allows the UAS to control the serving of the
announcement.
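The Do Not Disturb flow in this section can be modeled in a few lines
of Python. This is a sketch under stated assumptions: the tuple
representation of messages and both function names are invented for
illustration. It shows that every announcement packet is early media
in the sense of the Abstract, because no 200 response ever precedes
it.

```python
# Sketch of the Do Not Disturb flow in Section 6.2.1. Messages are
# modeled as (kind, value) tuples; this is an assumption of the sketch.

def do_not_disturb_flow(announcement_packets):
    """Yield what the UAS sends to the UAC, in order."""
    yield ("SIP", 180)
    yield ("SIP", 183)
    for seq in range(announcement_packets):
        yield ("RTP", seq)       # the recorded announcement
    yield ("SIP", 486)           # reject the call; never a 200


def count_early_media(messages):
    """Count RTP packets sent before any 200 response."""
    count = 0
    for kind, value in messages:
        if kind == "SIP" and value == 200:
            break                # media after a 200 is session media
        if kind == "RTP":
            count += 1
    return count


flow = list(do_not_disturb_flow(3))
print(flow[-1])
print(count_early_media(flow))
```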
6.2.2. Intelligent SIP Endpoint with External Media Server
What if the UAS does not have the capability of retrieving and
playing announcements? It can call in the services of a media
server.
The following figure describes such a call flow.
UAC         UAS       Media Server
|           |              |
|  INVITE   |              |
|---------->|              |
|           |    INVITE    |
|           |------------->|
|    180    |    200 OK    |
|<----------|<-------------|
|    183    |     ACK      |
|<----------|------------->|
|            RTP           |
|<=========================|
|           |              |
|           |     BYE      |
|           |<-------------|
|    486    |   OK (BYE)   |
|<----------|------------->|
|           |              |
Note that as described in section 6.1.1, it would not be appropriate
for the media server to be using early media signaling. However, it
is quite appropriate for the UAS to be using early media signaling
to the UAC. 200 OK is inappropriate between the UAS and the UAC, as
the UAS will not accept the call.
Said differently, even though the UAS - Media Server interaction
does not require early media, the UAC - UAS interaction does.
7. Security Considerations
A network that allows early media may treat it differently from
session media. For example, one or both of the parties may pay for
session media while one or both parties might not have to pay for
early media. If there is a billing difference between early media
and session media, there may be an incentive for users to abuse the
early media mechanisms to get free service.
In Section 3, we admonished the reader not to implement the call
flows in this document directly. We have not analyzed these call
flows for any security issues they may present.
8. References
1 Bradner, S., "The Internet Standards Process -- Revision 3", BCP
9, RFC 2026, October 1996.
2 Bradner, S., "Key words for use in RFCs to Indicate Requirement
Levels", BCP 14, RFC 2119, March 1997.
3 Handley, M., Schulzrinne, H., Schooler, E., Rosenberg, J., "SIP:
Session Initiation Protocol", draft-ietf-sip-rfc2543bis-04.txt,
July 2001, work in progress.
4 -, "VARIOUS TONES USED IN NATIONAL NETWORKS", Publication E.180
Supplement 2, International Telecommunication Union
Telecommunication Standardization Sector (ITU-T), January 1994. An
informative reference for this document.
5 O'Connor, W., Burger, E., and Van Dyke, J., "Network
Announcements with SIP", draft-burger-sipping-netann-00.txt, July
2001, work in progress.
6 Sen, S., Bharatia, J., Hogg, C., Audet, F., "Early Media Issues
and Scenarios", draft-sen-sip-earlymedia-00.txt, July 2001, work
in progress.
7 Hoffpauir, S., and Maxon, Lisa-Marie, "Enhanced Services
Framework", International Softswitch Consortium, June 2001, work
in progress.
9. Acknowledgments
10. Author's Addresses
Eric Burger
SnowShore Networks, Inc.
285 Billerica Rd.
Chelmsford, MA 01824-4120
USA
Phone: +1 978/367-8403
Email: eburger@snowshore.com
11. Full Copyright Statement
Copyright (C) The Internet Society (2001). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns. This
document and the information contained herein is provided on an "AS
IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK
FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT
LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL
NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY
OR FITNESS FOR A PARTICULAR PURPOSE.
Acknowledgement
The Internet Society currently provides funding for the RFC Editor
function.