Internet DRAFT - draft-hayata-ipo-carrier-needs
Internet Engineering Task Force Hirokazu Ishimatsu
Internet-Draft Yoshihiro Hayata
Susumu Yoneda
Japan Telecom Co., LTD.
Expiration Date: May 2001 Ramesh Bhandari
George Newsome
Eve Varma
Lucent Technologies
November, 2000
Carrier Needs Regarding Survivability and Maintenance for
Switched Optical Networks
draft-hayata-ipo-carrier-needs-00.txt
Status of this Memo
This document is an Internet-Draft and is in full
conformance with all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet
Engineering Task Force (IETF), its areas, and its working
groups. Note that other groups may also distribute working
documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of
six months and may be updated, replaced, or obsoleted by
other documents at any time. It is inappropriate to use
Internet-Drafts as reference material or to cite them other
than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be
accessed at http://www.ietf.org/shadow.html.
1. Abstract
As discussed in [1], the need for survivable optical networks is
critical, and introducing capabilities that further enhance network
survivability continues to be an essential objective. This is
particularly important for operators with stringent requirements for
network resilience and service survivability. However, disruption of
service can result not only from faults, but also from scheduled
maintenance procedures. This draft introduces some additional
considerations and carrier needs related to failure recovery and
scheduled maintenance work in switched optical networks. These are of
critical importance for serving business customers who require super
high quality service assurance and pay correspondingly high tariffs for
a guarantee of this level of QoS.
2. Conventions used in this document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as
described in RFC-2119.
3. Introduction
The explosion of data services is increasingly imposing challenging
network infrastructure requirements at the same time that wavelength
services are emerging in the marketplace. Next generation optical
networking solutions must enable scalable, flexible, and reliable
networks as well as increased responsiveness to client network needs.
An optical layer service framework has been discussed in [2] in the
context of service considerations that are important to inter-city
network operators. As described there, some key objectives include
service functionality, a workable business model, and evolvability in a
heterogeneous network environment.
Key service functionality cited in [2] has included rapid provisioning
and restoration. Automated provisioning of optical layer resources in
support of scheduled and demand-based customer/client needs offers
opportunities for supporting new services as well as handling routine
maintenance activities in a non-service disrupting manner (e.g.,
scheduled or predictable maintenance-related churn).
Assuring support for a workable business model that can adapt to change,
e.g., arbitrage, is important. It has become clear that there is a range
of reasonable business models that might be utilized in an operator's
network, depending upon the scope and objectives of the enterprise. As
discussed in [4], such models might be used in various ways, and for
various purposes, even by different organizations within the same
network operator domain.
Evolvability is an important consideration as it is essential for
service providers to have a smooth network evolution path for addressing
the unique problems inherent in simultaneously supporting an existing
network while deploying a new multi-service infrastructure. Clearly, it
is also necessary to enable emergent service providers to optimally
tailor their networks for their targeted market and service offerings;
however, emergent providers quickly need to deal with an embedded base
as soon as initial deployment of resources has occurred.
Within the remainder of this draft, we will focus upon service
functionality and business model objectives in relation to service
survivability and maintenance considerations for highly reliable
services such as the super high quality services discussed below.
4. Switched Optical Services
The basic requirement of a switched optical service is that a channel is
established via an appropriate signaling mechanism before data can be
transferred, and that this establishment is achieved in the following
manner:
- a real-time client specifies its traffic characteristics and its
end-to-end performance requirements to the server
- the most suitable route for a channel that meets these requirements is
determined
- the end-to-end parameters are translated into local parameters at each
NE, and resource reservation is attempted via signaling.
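As an illustration only, the reservation step in the list above might be sketched as follows. The per-link table `free_channels` and the function name `reserve_along_route` are hypothetical and do not correspond to any particular signaling protocol; a real implementation would carry the local parameters in signaling messages between NEs rather than in a shared table.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str]


def reserve_along_route(free_channels: Dict[Link, int],
                        route: List[str],
                        demand: int = 1) -> bool:
    """Try to reserve `demand` channels on every link of `route`.

    Returns True if every link had enough free channels. If any link
    falls short, the reservations already made are released again and
    False is returned, so a failed attempt leaves the table unchanged.
    """
    reserved: List[Link] = []
    for a, b in zip(route, route[1:]):
        link = (a, b)
        if free_channels.get(link, 0) < demand:
            # Roll back the links reserved so far.
            for done in reserved:
                free_channels[done] += demand
            return False
        free_channels[link] -= demand
        reserved.append(link)
    return True
```

The roll-back on failure reflects the word "attempt" in the list: a request that cannot be satisfied end to end should not leave resources stranded at intermediate NEs.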
The service abstraction defines a contractual relationship between
client and server. Hence, once the connection is established, the server
guarantees that, in the absence of a failure, it will meet its
contractual obligations. This contract is agreed before data transfer
begins. Because the server guarantees the contract, several actions have
to be taken in case of a failure. This draft addresses those actions in
Sec. 4.2.
4.1 Super High Quality Services Characteristics
Super high quality services (also known as private line services)
offered by a carrier currently have the following characteristics:
- The exact physical and logical location of a private line user's path
in the network (i.e., the optical fiber cable, fiber, optical
channel, SDH logical path, port of transmission equipment/router,
etc.) is known to the network operator and is uniquely identifiable.
- When a logical path or port is switched to an alternate route (i.e.,
a back-up path) due to an unexpected event or failure, the carrier
switches traffic back from the alternate path to the original path
once the failure is repaired.
- For scheduled maintenance, the carrier always asks customers having
super high quality services (that may be affected due to this
maintenance work) their preference in terms of when this work may be
carried out. The carrier then carries out the scheduled maintenance
work according to customer preference regarding date and time, as it
is essential that important customers not be adversely impacted in
any way by scheduled maintenance work.
- The carrier provides for guaranteed service survivability in the
event of failures. It does so by providing alternate paths for
carrying services, with the service and alternate paths being
physically and topologically diverse.
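The switch-back behaviour in the second item above can be sketched as a small state holder. This is a minimal illustration, not a protection protocol: the class and method names are hypothetical, and the `revertive` flag is an assumption added to cover operators that choose not to revert.

```python
class ProtectedConnection:
    """Tracks which path currently carries traffic for one connection."""

    def __init__(self, revertive: bool = True) -> None:
        self.revertive = revertive
        self.primary_ok = True
        self.active = "primary"

    def on_primary_failure(self) -> None:
        # Traffic moves to the alternate (back-up) path.
        self.primary_ok = False
        self.active = "backup"

    def on_primary_repaired(self) -> None:
        # A revertive connection returns to the original path;
        # a non-revertive one stays where it is.
        self.primary_ok = True
        if self.revertive:
            self.active = "primary"
```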
4.2 Service Survivability Considerations
As discussed in [3], there is a range of failures that can occur
within a network, and high reliability applications will require a
variety of failures to be taken into account. Examples that have been
considered include office outages, failures arising from diverse
circuits traversing shared protection facilities such as rings, and
natural disasters. It is essential to fully prepare for natural
disasters such as earthquakes, volcanoes and typhoons. Further, for
super high quality services, there is extreme sensitivity to service
interruptions. Thus, it is important that the service and alternate
paths neither have links belonging to a common Shared Risk Link
Group (SRLG) [3] nor pass through the same "region of failure".
Additionally, in order to assure an optimized survivable network
architecture, it is desirable that traffic can be switched back from
the alternate path to the original service path once the failure is
repaired (note that not all carriers may choose to revert). The
following grades of service may be defined, each with actions to be
taken in the event of a failure:
- Standard service, which is provided from a given source to a given
destination over a path computed in accordance with normal network
capacity constraints; when the customer loses connection on account
of a fault, the customer may request the same connection which the
network will then try to establish on a newly computed path.
- Medium High Quality Service which, at the customer's request,
provides a connection over a path that avoids a certain set of cities
or regions, which are prone to damage due to natural disasters such
as earthquakes, volcanoes, typhoons, etc. These "regions of failure"
may each be ascribed a "radius of failure" determined from a study of
the past history of the spatial extent and severity of damage in
those regions; in the event of a failure of this service, the
customer may request reestablishment of a connection, which the
network will attempt to provide over a new path.
- High Quality Service, which is provided with a physically disjoint
back-up path in case of failure of the primary path; there are no
requirements on city avoidance, etc.; as a result, the back-up
guarantees continuity of service only in the event of link or
equipment failure.
- Super High Quality Service, which is provided with a physically
disjoint back-up path, constrained to have no "region of failure" in
common with the original path. This type of service may be requested
by big business customers who essentially want continuity of service
at all times. In fact, since the downtime of the primary path may be
very long in major catastrophes such as those due to earthquakes,
floods, etc., a carrier may offer to provide a back-up for the
back-up path to which the guaranteed services were switched upon
failure of the primary path.
The above four types of service may be summarized in the table below:

Service Type    Physically disjoint    Avoid a Region of
                protection path        Failure
------------    -------------------    -----------------
Standard        No                     No
Medium High     No                     Yes
High            Yes                    No
Super High      Yes                    Yes
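One possible way to encode the table above, together with a check that a candidate primary/back-up pair satisfies a grade, is sketched below. The names `Grade`, `GRADES` and `pair_is_acceptable` are hypothetical, and SRLGs and regions of failure are represented simply as sets of identifiers, which is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class Grade:
    disjoint_backup: bool  # physically disjoint protection path
    avoid_regions: bool    # region-of-failure constraint applies


GRADES: Dict[str, Grade] = {
    "standard":    Grade(disjoint_backup=False, avoid_regions=False),
    "medium_high": Grade(disjoint_backup=False, avoid_regions=True),
    "high":        Grade(disjoint_backup=True,  avoid_regions=False),
    "super_high":  Grade(disjoint_backup=True,  avoid_regions=True),
}


def pair_is_acceptable(grade: Grade,
                       primary_srlgs: Set[str], backup_srlgs: Set[str],
                       primary_regions: Set[str], backup_regions: Set[str]) -> bool:
    """Check a primary/back-up pair against the grade's constraints.

    Only grades with a back-up path are constrained here: the two paths
    must share no SRLG, and for the grade that also avoids regions of
    failure they must share no region either.
    """
    if grade.disjoint_backup and primary_srlgs & backup_srlgs:
        return False
    if grade.disjoint_backup and grade.avoid_regions \
            and primary_regions & backup_regions:
        return False
    return True
```

The Medium High case, in which a single path avoids customer-named regions, is a constraint on route selection rather than on a path pair, so it is not covered by `pair_is_acceptable`.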
In the event the constraints for the above high quality services can
only be met partially (e.g., 100% physical diversity between a given
pair of source and destination cannot be provided because it simply
does not exist for that source-destination pair), then the customer,
instead of being refused the desired service, may simply be offered
service with a correspondingly reduced level of service protection. For
example, if the fiber overlap between the primary and secondary routes
is x%, then the customer may be offered the service with the service
continuity guarantee reduced by x%, and thus also at a correspondingly
reduced cost.
Furthermore, in those cases where the customer does not want to pay the
full cost of the above high quality services, even when such service
exists, service may still be provided, but with correspondingly
reduced quality guarantees within the class of service under
consideration.
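As a worked illustration of the partial-diversity case, the sketch below computes the overlap x. It assumes (this draft does not define the measure) that overlap is the length of the primary route's fiber spans that are also used by the secondary route, expressed as a percentage of the primary route length. The function name, span identifiers and lengths are invented for the example.

```python
from typing import Dict


def overlap_percent(primary_spans: Dict[str, float],
                    secondary_spans: Dict[str, float]) -> float:
    """Percentage of the primary route length that rides on fiber spans
    also used by the secondary route.

    Both arguments map a span identifier to its length in km.
    """
    total = sum(primary_spans.values())
    if total <= 0:
        raise ValueError("primary route must have positive length")
    shared = sum(length for span, length in primary_spans.items()
                 if span in secondary_spans)
    return 100.0 * shared / total


primary = {"span-1": 100.0, "span-2": 300.0, "span-3": 100.0}
secondary = {"span-3": 100.0, "span-4": 450.0}
x = overlap_percent(primary, secondary)  # 100 km shared out of 500 km: 20.0
```

Under this assumption the customer in the example would be offered the service with the continuity guarantee reduced by 20%.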
4.3 Data Bases and Algorithms
Because natural disasters such as earthquakes, typhoons, etc. can damage
a large area in one instance, it is important to ascertain the regions
within the service provider's network prone to damage by such
calamities. Normally, such areas have a history of damage, and it should
be possible to construct a data base on the location, intensity of
disaster, its frequency, and the size of the area affected; the area
affected may be expressed as a "radius of failure". It may also be
possible to use the information on the intensity of disaster and the
frequency of occurrence to assign probabilities of failure to the
offered services. For path computation, the following data bases are
needed:
- Nodes, links, and their fiber span content, or alternatively, nodes,
fiber spans, and the links riding each individual span (such sets of
links are also called Shared Risk Link Groups, or SRLGs); clearly, if
a link or node is not in service, it is not included in path
computation.
- Regions of failure, corresponding radii of failure and locations
within the service provider's network; these should be taken into
account before computing paths for the medium high and super high
quality services.
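A minimal sketch of the second data base is given below, showing how a "radius of failure" could be used to find the nodes lying within each region before path computation. It assumes node and region positions are given as planar coordinates in kilometres, which is a simplification; a real data base would more likely hold geographic coordinates. The names `Region` and `nodes_in_regions` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Region:
    name: str
    x_km: float
    y_km: float
    radius_km: float  # the "radius of failure"


def nodes_in_regions(nodes: Dict[str, Tuple[float, float]],
                     regions: List[Region]) -> Dict[str, Set[str]]:
    """Map each node to the names of the regions of failure containing it.

    A node on the boundary of a region is treated as inside it.
    """
    result: Dict[str, Set[str]] = {}
    for node, (x, y) in nodes.items():
        result[node] = {
            r.name for r in regions
            if math.hypot(x - r.x_km, y - r.y_km) <= r.radius_km
        }
    return result
```

Nodes that map to a non-empty set would be excluded (or penalised) when computing paths for the medium high and super high quality services.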
For highly reliable services such as the super high quality services,
physically disjoint paths are required in real-life networks (which
involve span-sharing links, or SRLGs). Ref. [5] describes algorithms for
such real-life networks. The algorithms emphasize optimality to save
network costs. Depending upon the span-sharing topologies of a given
network, these optimal algorithms can be very fast, and thus suitable
for running in a real-time environment. For networks with very
complicated span-sharing topologies, exact algorithms do exist [5], but
they are slow for large networks, since the problem becomes NP-complete.
In such situations, fast heuristics may be developed [5] (see also [2]
for a discussion on diversity).
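To make the idea of a fast heuristic concrete, the sketch below shows a simple two-step approach: compute a cheapest primary path, remove every link that shares an SRLG with it, and compute a cheapest path in what remains. This is not one of the algorithms of [5]; it is a common naive heuristic, it does not minimise the total cost of the pair, and it can fail to find a pair in topologies where one exists. The link representation and function names are assumptions of this sketch, and link costs are assumed to be non-negative.

```python
import heapq
from typing import Dict, FrozenSet, List, Optional, Tuple

# One undirected link: (node_a, node_b, cost, SRLG ids the link rides on).
LinkRec = Tuple[str, str, float, FrozenSet[str]]


def shortest_path(links: List[LinkRec], src: str, dst: str) -> Optional[List[int]]:
    """Dijkstra over `links`; returns link indices of a cheapest path, or None."""
    adj: Dict[str, List[Tuple[str, float, int]]] = {}
    for i, (a, b, cost, _srlgs) in enumerate(links):
        adj.setdefault(a, []).append((b, cost, i))
        adj.setdefault(b, []).append((a, cost, i))
    best: Dict[str, float] = {src: 0.0}
    prev: Dict[str, Tuple[str, int]] = {}
    heap: List[Tuple[float, str]] = [(0.0, src)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == dst:
            break
        if dist > best[node]:
            continue  # stale heap entry
        for nxt, cost, i in adj.get(node, []):
            if dist + cost < best.get(nxt, float("inf")):
                best[nxt] = dist + cost
                prev[nxt] = (node, i)
                heapq.heappush(heap, (best[nxt], nxt))
    if dst not in best:
        return None
    path: List[int] = []
    node = dst
    while node != src:
        node, i = prev[node]
        path.append(i)
    return path[::-1]


def srlg_disjoint_pair(links: List[LinkRec], src: str, dst: str
                       ) -> Optional[Tuple[List[int], List[int]]]:
    """Two-step heuristic: primary first, then a back-up sharing no SRLG."""
    primary = shortest_path(links, src, dst)
    if primary is None:
        return None
    used = set()
    for i in primary:
        used |= links[i][3]
    # Keep only links that are not on the primary and share no SRLG with it.
    keep = [i for i, link in enumerate(links)
            if i not in primary and not (link[3] & used)]
    backup = shortest_path([links[i] for i in keep], src, dst)
    if backup is None:
        return None
    return primary, [keep[j] for j in backup]
```

Both paths are returned as lists of indices into `links`, so the caller can look up the SRLGs and spans of each hop in the same data base.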
4.4 Business Model Considerations
As described in [4], there are several business models that may be
applicable for network operators: ISP owning all Layer 1 infrastructure
and only delivering IP-based services, ISP owning or leasing Layer 1
infrastructure and only delivering IP-based services, retailer or
wholesaler for multi-services, and a carrier's carrier or bandwidth
broker. A carrier owns the layer 1 infrastructure and sells multiple
service types to customers, which may include other operator networks.
This bandwidth brokering, or reseller, role takes on a new meaning in
the context of service resilience. For many years, in Japan, operators
have collaborated to handle traffic in the event of natural disasters,
so that bandwidth can be borrowed from each other. Thus, if an operator
does not have the capacity, it can borrow capacity from another
network. Accommodating the unexpected is a key factor in this case.
Indeed, it seems to be a common pattern in industry that businesses that
provide service and operate their own infrastructure tend to separate
into two businesses. This makes it likely that even though
infrastructure may be wholly owned today, it may well not be tomorrow.
It is therefore important to take account of fully separated business
models (cases 3 and 4 of [4]) even if these do not seem to represent the
majority of today's businesses.
5. Implications for switched optical networks
Considering the discussion in Subsections 4.1 - 4.4, switched optical
networks must minimally:
- Support the various grades of high quality services, including the
Super High Quality Service described in Sec. 4.1.
- Support survivability considerations related to diverse routing,
tailored to the unique characteristics of Japan's geography and
routing of fibers.
- Enable "bandwidth borrowing on demand" from other carriers as well as
support for multiple service types.
Examples of necessary functionality are provided in more detail below,
as well as some related connection setup operations.
5.1 Functions
Referring to Section 4, we can see that the following functions
need to be supported:
- Ability for network operator to manually set the date and time that a
path switching function should take place, and have that occur
automatically. (The guarantee that the switch occurs as scheduled is
closely linked to resource allocation policies; see T1X1.5/2000-194
for further discussion on scheduled connections.)
- Ability to specify switching to a physically/topologically disjoint
path from the service path.
- Ability to maintain and update the data bases in a timely manner so
that a connection request is supported with the most current
knowledge of the network.
- Ability for operator to support a survivability policy that enables
the capability for switch-back to the original service path.
- Ability to support an operator policy to prioritize service requests
so that, in the event of a fault, customers with super high quality
services have first priority in being switched to disjoint paths.
- Ability to enable key customers to request constraints on the
connection path (e.g., avoid City X because an earthquake has just
occurred, or simply because the city is very much prone to damage
from natural disasters such as earthquakes, volcanoes and typhoons).
This involves the ability to express geographic constraints, as
opposed to just physical (equipment) or topological constraints.
- Ability to prevent new customers from being added to a particular
link for a certain amount of time (e.g., because of a failure,
natural disaster, scheduled maintenance). This requires the ability
to mark particular resources as out of service.
- Ability for the operator to query the service management function to
establish the exact location and characteristics of service paths for
key customers.
- Ability for the operator to view information regarding which
customer/user is associated with which service path(s).
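Two of the functions above, prioritized switching after a fault and temporarily closing a link to new customers, can be sketched as follows. The priority values, grade labels and function names are hypothetical. This draft only requires that super high quality services come first; the relative order of the remaining grades is an assumption of this sketch.

```python
from datetime import datetime
from typing import Dict, List, Tuple

# Lower number means switched earlier.
PRIORITY: Dict[str, int] = {
    "super_high": 0,
    "high": 1,
    "medium_high": 2,
    "standard": 3,
}


def restoration_order(affected: List[Tuple[str, str]]) -> List[str]:
    """Order affected connections for switching to disjoint paths.

    `affected` holds (connection_id, grade) pairs. The sort is stable,
    so connections of the same grade keep their input order.
    """
    ranked = sorted(affected, key=lambda item: PRIORITY[item[1]])
    return [connection_id for connection_id, _grade in ranked]


def link_accepts_new_connections(link_id: str,
                                 now: datetime,
                                 blocked_until: Dict[str, datetime]) -> bool:
    """True unless the operator has marked the link out of service for
    new connections until a time later than `now`."""
    until = blocked_until.get(link_id)
    return until is None or now >= until
```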
5.2 Connection Setup Operation
Referring to [4], some relevant connection setup parameters include:
1) Scheduled service - ability to request the connection to be made at
some specified time in the future (see T1X1.5/2000-194 for further
discussions).
2) Scheduled duration - ability to specify a duration for the
connection.
3) Resilience - ability to request resilience against server layer
faults, and specify a particular degree of risk (see Sec. 4.2).
4) Connection Constraints - ability to specify the constraints as in the
three levels of high quality service described in Sec. 4.2.
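The four parameters above could be carried in a request record along the following lines. This is a sketch only: the field names, the grade labels and the use of region identifiers for constraints are assumptions, and no particular client API from [4] is implied.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import FrozenSet, Optional

KNOWN_GRADES = frozenset({"standard", "medium_high", "high", "super_high"})


@dataclass(frozen=True)
class ConnectionRequest:
    source: str
    destination: str
    # 1) Scheduled service: None means "set up immediately".
    start_time: Optional[datetime] = None
    # 2) Scheduled duration: None means "until released".
    duration: Optional[timedelta] = None
    # 3) Resilience: requested grade of service (see Sec. 4.2).
    grade: str = "standard"
    # 4) Connection constraints: regions of failure to avoid.
    avoid_regions: FrozenSet[str] = frozenset()

    def __post_init__(self) -> None:
        if self.grade not in KNOWN_GRADES:
            raise ValueError(f"unknown grade: {self.grade!r}")

    def end_time(self) -> Optional[datetime]:
        """Scheduled end, or None if start or duration is unspecified."""
        if self.start_time is None or self.duration is None:
            return None
        return self.start_time + self.duration
```

Making the record immutable means a scheduled request cannot be altered silently between the time it is accepted and the time the connection is made.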
6. References
[1] J. Luciani, B. Rajagopalan, D. Awduche, B. Cain, B. Jamoussi, "IP
over Optical Networks - A Framework",
<draft-ip-optical-framework-00.txt>, March 2000
[2] John Strand, "Optical Layer Services Framework", T1X1.5/2000-142
[3] Monica Lazer, John Strand, "Some Routing Constraints",
T1X1.5/2000-143
[4] George Newsome, "ASON - Requirements at the Client API",
T1X1.5/2000-158
[5] Ramesh Bhandari, "Survivable Networks - Algorithms for Diverse
Routing", Kluwer Academic Publishers, 1999.
7. Authors' Contact Information
Hirokazu Ishimatsu
Japan Telecom
hirokazu@japan-telecom.co.jp
Yoshihiro Hayata
Japan Telecom
hayata@japan-telecom.co.jp
Susumu Yoneda
Japan Telecom
yone@japan-telecom.co.jp
Ramesh Bhandari
Lucent Technologies
bhandari1@lucent.com
George Newsome
Lucent Technologies
gnewsome@lucent.com
Eve Varma
Lucent Technologies
evarma@lucent.com