INTERNET-DRAFT Supratik Bhattacharyya
Gianluca Iannaccone
Sprint ATL
Christophe Diot
Intel
June 1 2003
Deployment of inter-operable and cost-effective
monitoring infrastructure in ISP networks
<draft-bhattacharyya-monitoring-deployment-00.txt>
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC 2119].
Abstract
This document identifies issues and concerns in monitoring ISP
networks. It outlines the components of a monitoring infrastructure
designed to support ISP requirements. It discusses deployment and
inter-operability issues. Related IETF working groups are identified.
The goal of this document is to open a discussion on whether there
should be a BOF addressing this issue at the next IETF (Vienna, July
2003).
Bhattacharyya et al [Page 1]
INTERNET-DRAFT June 1 2003
1. Introduction
As the Internet continues to grow rapidly in size and complexity, it has
become increasingly clear that its evolution is closely tied to a
detailed understanding of network traffic. Network traffic measurements
are invaluable for a wide range of tasks such as network capacity
planning, traffic engineering and fault diagnosis. IP networks are
designed with the goal of providing high availability and low delay/loss
while keeping operational complexity low. Meeting these goals is a
highly challenging task and can only be achieved through a detailed
understanding of the network.
Monitoring and measuring traffic in IP networks is difficult for a
number of reasons. First, the designers of IP networks have
traditionally attached less importance to network monitoring and
resource accounting than to issues such as distributed management,
robustness to failure and support for diverse services and protocols
[1]. Thus IP network elements (routers and end-hosts) have not been
designed to retain detailed information about the traffic flowing
through them and IP protocols typically do not provide detailed
information about the state of the underlying network. This poses the
problem of adding enhanced monitoring and measurement capabilities to
existing equipment and/or integrating special purpose monitoring
equipment into existing networks. In addition, IP protocols have been
designed to automatically respond to congestion (e.g., TCP) and failures
(e.g., routing protocols such as IS-IS/OSPF). This makes it hard for a
network administrator to track down the cause of a network failure or
congestion before the network itself takes corrective action. Finally,
the Internet is organized as a set of loosely connected networks
(Autonomous Systems) that are administered independently. Hence the
operator of a single network has no control over events occurring in
other networks it exchanges traffic with. However, a network operator
can gain some knowledge of the state and problems of other networks by
studying the traffic exchanged with those networks.
This document highlights some concerns and issues in monitoring ISP
networks. The goal is to foster discussions about the work needed in the
IETF to address these issues, and how this work should be organized
under the aegis of different working groups. The major question to be
raised by this document is whether we need a new working group in the
Operations and Management area of the IETF to work on the deployment of
an inter-operable monitoring infrastructure for ISPs.
2. Challenges posed by Operational Monitoring
Various types of measurement data need to be collected to support the
monitoring applications [1]. We classify them into two broad
categories: (i) aggregate information that needs to be collected at
coarse time-scales and reported on a regular basis (e.g., SNMP
statistics, flow records, routing tables); (ii) packet-level traces
collected to analyze and understand a specific phenomenon.
Capturing, processing, summarizing and exporting data at the required
level of granularity, at the time that it is needed, poses a number of
implementation challenges. Some of these problems are being addressed
in different IETF working groups, whereas others are not.
The question we ask here is whether a new working group is needed to
undertake the following activities: (i) define a framework for
monitoring needed to support day-to-day operations in IP networks, (ii)
identify existing and on-going work in the IETF on various aspects of
the framework and ensure that this work guarantees inter-operability
among ISPs, and (iii) provide clear guidelines to equipment vendors on
what infrastructure is needed to support monitoring in ISP networks.
3. Related IETF Working Groups
The IP Performance Metrics (IPPM) Working Group [3] has been chartered
to develop a set of standard metrics that can be applied to the
quality, performance, and reliability of Internet data delivery
services. The focus of this group has been on defining metrics based
on active measurements such as one way delay, round-trip delay, link
bandwidth capacity, etc.
However, active measurements alone are insufficient to address the
scale and complexity involved in continuously monitoring large ISP
networks. Also, with the evolution of the technology and of the
understanding of IP networks, new metrics have emerged.
The Packet Sampling (PSAMP) working group [4] is chartered to define a
standard set of capabilities for network elements to sample subsets of
packets by statistical and other methods. The capabilities should be
simple enough that they can be implemented ubiquitously at any link
rate. They should be rich enough to support a range of existing and
emerging measurement-based applications, and other IETF working groups
where appropriate.
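As a rough illustration of the kind of capability PSAMP is concerned
with, the following Python sketch shows two simple selection schemes:
count-based systematic sampling (one packet in every N) and uniform
probabilistic sampling. The function names and the treatment of packets
as opaque items are assumptions made for this example; they are not
taken from any PSAMP specification.

```python
import random
from typing import Iterable, Iterator, Optional, TypeVar

P = TypeVar("P")


def systematic_sample(packets: Iterable[P], period: int) -> Iterator[P]:
    """Select the first packet out of every `period` packets (count-based)."""
    if period < 1:
        raise ValueError("period must be >= 1")
    for index, packet in enumerate(packets):
        if index % period == 0:
            yield packet


def random_sample(packets: Iterable[P], probability: float,
                  rng: Optional[random.Random] = None) -> Iterator[P]:
    """Select each packet independently with the given probability."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    rng = rng or random.Random()
    for packet in packets:
        # rng.random() lies in [0.0, 1.0), so probability 1.0 keeps everything
        # and probability 0.0 keeps nothing.
        if rng.random() < probability:
            yield packet
```

Both schemes need only a counter or a random number per packet, which
is in line with the requirement that the capabilities be simple enough
to implement at any link rate.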
While the work in PSAMP addresses a critical aspect of an operational
monitoring framework, it can benefit from the definition of a set of
metrics to be derived from the sampled and filtered packet data. It is
very likely that different metrics will require different sampling
techniques. Defining a set of metrics that are of common interest to
many ISPs will ensure that PSAMP-capable monitoring systems (routers
and/or special-purpose systems) have the capability to support the
derivation of these metrics from collected data.
The IP Flow Information Export (IPFIX) Working Group [5] has defined
an architecture for flow information export. This includes flow
definitions, a metering process at the observation point with
sampling/filtering capabilities, an export process to export the data,
and an export protocol for communication between the observation
points and the collection stations. The two-level monitoring system
envisaged in Section 2 fits well within the scope of the IPFIX
architecture. However, ISPs need to develop a better understanding of
their own monitoring needs and provide feedback to the IPFIX Working
Group in order to ensure that the IPFIX architecture and flow export
protocol meet their needs. There are several open issues, e.g., what
are some commonly useful metrics, what is the volume of information
that can be exported in practice, what features are needed for the
protocol that controls the interaction between the observation points
and the collection stations, etc. ISPs need to work toward answering
these questions to ensure that systems based on the IPFIX architecture
meet their operational monitoring needs.
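To make the metering and export steps more concrete, here is a minimal
Python sketch of a metering process that aggregates packets into flow
records keyed by the familiar five-tuple, together with an export step
that hands idle records to a collector callback. The Packet and
FlowRecord types, the field names and the idle-timeout rule are
assumptions chosen for illustration; IPFIX defines flows and the export
protocol in more general terms.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical flow key: (source, destination, source port, dest port, protocol)
FlowKey = Tuple[str, str, int, int, int]


@dataclass
class Packet:
    timestamp: float
    src: str
    dst: str
    sport: int
    dport: int
    proto: int
    length: int


@dataclass
class FlowRecord:
    key: FlowKey
    first_seen: float
    last_seen: float
    packets: int = 0
    octets: int = 0


class Meter:
    """Metering process at one observation point (illustrative only)."""

    def __init__(self, idle_timeout: float) -> None:
        self.idle_timeout = idle_timeout
        self.active: Dict[FlowKey, FlowRecord] = {}

    def observe(self, pkt: Packet) -> None:
        """Account one packet to its flow, creating the record if needed."""
        key = (pkt.src, pkt.dst, pkt.sport, pkt.dport, pkt.proto)
        rec = self.active.get(key)
        if rec is None:
            rec = FlowRecord(key, pkt.timestamp, pkt.timestamp)
            self.active[key] = rec
        rec.last_seen = pkt.timestamp
        rec.packets += 1
        rec.octets += pkt.length

    def export_idle(self, now: float,
                    collector: Callable[[FlowRecord], None]) -> int:
        """Export and drop flows idle for at least `idle_timeout` seconds."""
        expired = [key for key, rec in self.active.items()
                   if now - rec.last_seen >= self.idle_timeout]
        for key in expired:
            collector(self.active.pop(key))
        return len(expired)
```

Even in a sketch this small, the open questions listed above show up
directly: which fields a record should carry, and how often records
are exported, determine the volume of data sent to the collection
stations.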
4. Discussion
ISPs face several challenges in using monitoring to ease the management
of their networks. Some of these, such as packet sampling/filtering and
IP flow information export, are being addressed by IETF working groups.
However, the work in these groups would greatly benefit from knowledge
about real-world experience in monitoring ISP networks.
In addition, there are a number of open issues:
(i) Inter-operability
The extent to which the monitoring infrastructures of different ISPs
need to inter-operate needs to be understood. This will involve
communication among ISPs to specify requirements for monitoring data
exchange, define metrics of common interest, etc.
(ii) Storage, analysis and aging
The storage and analysis of exported information presents a significant
challenge for ISPs. This includes designing large storage systems (e.g.,
storage area networks) and harnessing processing power to analyze the
collected information on a continuous basis. Moreover, ISPs need to
determine how to age historical data that is retained for long-term
planning.
(iii) Control Plane
Given the diverse information needs of ISPs and the wide range of tasks
to be supported by monitoring, there needs to be a sophisticated control
protocol between the observation points and collection stations. This
protocol is primarily required for dynamically configuring the
sampling/filtering/summarization processes at the observation points. It
may also be used to coordinate communication between multiple
observation points and collection stations, or between collection
stations themselves. ISPs need to converge on a set of requirements on
which the design of such a control plane can be based.
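The aging question raised under item (ii) above can be illustrated with
a simple policy: keep recent measurements at fine granularity and merge
older ones into coarser time bins. The Python sketch below applies that
idea to per-interval byte counts. The age threshold, the bin width and
the function name are hypothetical, and a real policy would depend on
each ISP's long-term planning needs.

```python
from typing import Dict, Tuple


def age_counters(fine: Dict[int, int], now: int, max_age: int,
                 coarse_width: int) -> Tuple[Dict[int, int], Dict[int, int]]:
    """Split fine-grained byte counts into recent and archived data.

    `fine` maps the start time of each bin (in seconds) to a byte count.
    Bins that started at most `max_age` seconds before `now` are returned
    unchanged; older bins are summed into bins aligned on multiples of
    `coarse_width` seconds.
    """
    if coarse_width < 1:
        raise ValueError("coarse_width must be >= 1")
    recent: Dict[int, int] = {}
    archived: Dict[int, int] = {}
    for start, count in fine.items():
        if now - start <= max_age:
            recent[start] = count
        else:
            coarse_start = start - (start % coarse_width)
            archived[coarse_start] = archived.get(coarse_start, 0) + count
    return recent, archived
```

Totals are preserved by such a roll-up, but the time resolution of the
archived data is reduced, which is the trade-off an aging policy has to
settle.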
ISPs will clearly benefit from a process that facilitates the sharing of
their experiences and requirements, leading to a faster deployment of
ubiquitous monitoring and management infrastructure. This process could
also benefit equipment vendors by specifying what is needed to
support the monitoring needs of ISPs.
We propose to organize a BOF meeting at the 57th IETF (Vienna, July
2003) to discuss the need for a new working group whose charter would be
to (i) identify the missing parts in an operational monitoring
framework, (ii) define what needs to be standardized in order to
guarantee inter-operability among multiple ISPs, and (iii) provide
guidelines to routing or monitoring equipment vendors to help them meet
the requirements of the monitoring infrastructure.
We believe that this approach is essential to ease and expedite the
design of a standardized and comprehensive monitoring infrastructure.
5. References:
[1] S. Bhattacharyya et al. "Network Measurement and Monitoring: A
Sprint Perspective". Internet-Draft
draft-bhattacharyya-monitoring-sprint-01. Work in Progress.
[2] G. Iannaccone et al. "Monitoring very high speed links". In
Proceedings of the First ACM SIGCOMM Internet Measurement Workshop
(IMW), November 2001.
[3] IP Performance Metrics (IPPM) Working Group.
http://www.ietf.org/html.charters/ippm-charter.html
[4] Packet Sampling (PSAMP) Working Group.
http://www.ietf.org/html.charters/psamp-charter.html
[5] IP Flow Information Export (IPFIX) Working Group.
http://www.ietf.org/html.charters/ipfix-charter.html
6. Authors' Addresses:
Supratik Bhattacharyya
Gianluca Iannaccone
Sprint Advanced Technology Labs
1 Adrian Court
Burlingame CA 94010 USA
{supratik,gianluca}@sprintlabs.com
Christophe Diot
Intel
15 JJ Thomson Avenue
Cambridge CB3 0FD UK
christophe.diot@intel.com