Network Working Group Keith Gutfreund
Internet Draft AltaVista Internet Software
May 10, 1999
Expires November 10, 1999
Internet Content Filtering Protocol
draft-gutfreund-content-filtering-protocol-00.txt
Status of This Memo
This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Internet Draft draft-content-filtering-protocol-00.txt May 1999
Abstract
The Content Filtering Protocol (CFP) has been developed to
facilitate the connection of content filtering databases to
Internet firewall systems. CFP compliance allows content filters
to be located "behind the firewall," where they are safe from
outside hostile attack.
The CFP is a binary protocol used by firewall systems to
communicate over a private TCP/IP connection to the content
filtering database server.
K. Gutfreund [Page 2]
Table of Contents
1. Introduction
   1.1 Firewalls
   1.2 Content Filtering Databases
   1.3 Content Filtering Protocol
2. The Content Filtering Protocol
   2.1 Overview
   2.2 Binary Protocol
   2.3 Response Request Architecture
   2.4 Message Header Block
   2.5 Messages
       2.5.1 SERVER_STATUS_REQUEST - Command ID: 1
       2.5.2 VERSION_REQUEST - Command ID: 2
       2.5.3 FEATURES_REQUEST - Command ID: 3
       2.5.4 URL_LOOKUP_REQUEST - Command ID: 4
       2.5.5 SERVER_STATUS_RESPONSE - Command ID: 101
       2.5.6 VERSION_RESPONSE - Command ID: 102
       2.5.7 FEATURES_RESPONSE - Command ID: 103
       2.5.8 URL_LOOKUP_RESPONSE - Command ID: 104
3. Character Encoding
4. Security Considerations
5. Acknowledgements
6. Author's Address
1. Introduction
The Content Filtering Protocol (CFP) is a protocol used to communicate
between firewall systems and a content filtering database server. A
typical configuration is shown below:
< Internet >  ||  +==========+     |   < Firewall protected resources >
              ||  | Firewall |     |             .
              ||  |  System  |     |             .
              ||  +==========+  < === >    +===========+
              ||       .         (CFP)     |  Content  |
              ||       .           .       | Filtering |
              ||       .           .       | Database  |
              ||  +==========+   (CFP)     |  Server   |
              ||  | Firewall |  < === >    +===========+
              ||  |  System  |     |             .
              ||  +==========+     |             .
              ||                   |             .
Figure 1, Typical firewall and content filtering database configuration
1.1 Firewalls
Firewall systems are computer systems used to authorize and secure
network traffic between inside (secure) resources and outside
(insecure) resources. When used with a content filtering database, the
firewall can restrict access to undesirable content outside of the
firewall.
Firewall systems typically play a dual role:
a) Unauthorized users outside the firewall are prevented from accessing
resources behind the firewall.
b) Users behind the firewall are kept from accessing unauthorized
resources outside the firewall.
Firewall systems are considered to be the dividing line between the
internal (secure) resources and the external (insecure) world.
1.2 Content Filtering Databases
Content filtering databases are databases of network resource
addresses. In the context of firewalls, content filtering databases
are used to identify requests originating behind the firewall for
undesirable content outside of the firewall. Once a request is
identified as undesirable, the content filtering database notifies the
firewall system that the request should be denied. The firewall system
then takes the necessary steps to deny the request. These steps may
include, for example, logging and documenting the request or
redirecting the request to a more desirable location.
1.3 Content Filtering Protocol
The Content Filtering Protocol (CFP) is a network protocol that
describes how a CFP compliant firewall communicates with a CFP
compliant content filtering database.
2. The Content Filtering Protocol
2.1 Overview
The protocol described herein governs the communications
between a firewall system and a content filtering database. Current
implementations are based on communications over a TCP/IP network
stack; this is, however, not a strict requirement for the protocol.
2.2 Binary Protocol
The Content Filtering Protocol (CFP) is a binary protocol with variable
length communication instances or messages. All numeric values are
transmitted in network byte order. Character strings are terminated by
a null character and are additionally accompanied by their location
within a message and their length in bytes.
Each message is composed of a fixed length header field (message header
block) followed by a variable length body (message body block). The
message body block may have zero (0) length. Figure 2 shows the format
of a message.
+======================+
| Message Header Block |
| ( 16 bytes) |
+======================+
| Message Body Block |
| ( variable ) |
+======================+
Figure 2, Message format
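The framing shown in Figure 2 can be illustrated with a minimal Python
sketch. It assumes that the 2-byte Length field described in Section
2.4 occupies the first two bytes of the Message Header Block and is
unsigned. The function name split_messages is hypothetical and is not
part of CFP.

```python
import struct

HEADER_SIZE = 16  # fixed size of the Message Header Block (Figure 2)


def split_messages(buffer: bytes):
    """Split a byte buffer into complete CFP messages.

    Returns (messages, remainder). Each message is a (header, body)
    pair of raw bytes; remainder holds any trailing partial message.
    Assumption: the first two header bytes are the unsigned Length
    field in network byte order, counting the entire message.
    """
    messages = []
    pos = 0
    while len(buffer) - pos >= HEADER_SIZE:
        (length,) = struct.unpack_from("!H", buffer, pos)
        if length < HEADER_SIZE:
            raise ValueError("Length field smaller than the header block")
        if len(buffer) - pos < length:
            break  # the rest of this message has not arrived yet
        header = buffer[pos:pos + HEADER_SIZE]
        body = buffer[pos + HEADER_SIZE:pos + length]  # may be empty
        messages.append((header, body))
        pos += length
    return messages, buffer[pos:]
```

A receiver would typically append newly received bytes to the
remainder and call the function again.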
2.3 Response Request Architecture
Communications between the firewall and the database follow a typical
client-server, transactional architecture. The firewall (the client)
initiates all requests to the database (the server) and the
architecture defines a precise response from the database back to the
firewall for every request.
Typically, a client will issue some "request" for information (say a
URL lookup request) and the server will return a "response." This
request-response pair is called a "transaction." For example, a client
will transmit a SERVER_STATUS_REQUEST and the server will respond with
a SERVER_STATUS_RESPONSE. The command field (described below) in the
Message Header Block semantically identifies each request or response
in a transaction.
Communications are straightforward. The firewall clients and server
communicate over a standard TCP/IP connection, on port 18311. There
may be more than one firewall client connected to the database server.
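A single transaction over an established connection might look like
the following Python sketch. It is illustrative only: the helper names
recv_exact and transact are hypothetical, the header layout is taken
from Section 2.4 with all fields assumed unsigned, and the default
protocol version (1, 0) is an assumption, since this document does not
assign version numbers.

```python
import socket
import struct

CFP_PORT = 18311                     # port given in Section 2.3
HEADER = struct.Struct("!HBBHHLL")   # 16-byte Message Header Block


def recv_exact(sock, count):
    """Read exactly count bytes, or raise if the peer closes early."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("connection closed mid-message")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def transact(sock, command, transaction_id, body=b"", version=(1, 0)):
    """Send one request; return (command, transaction_id, body) of the reply."""
    length = HEADER.size + len(body)
    request = HEADER.pack(length, version[0], version[1],
                          command, 0, 0, transaction_id)
    sock.sendall(request + body)

    fields = HEADER.unpack(recv_exact(sock, HEADER.size))
    reply_length, _major, _minor, reply_command, _r1, _r2, reply_id = fields
    if reply_length < HEADER.size:
        raise ValueError("reply Length field smaller than the header block")
    reply_body = recv_exact(sock, reply_length - HEADER.size)
    if reply_id != transaction_id:
        raise ValueError("response does not match the request")
    return reply_command, reply_id, reply_body
```

A caller could open the connection with
socket.create_connection((host, CFP_PORT)) and then call
transact(sock, 1, 42) to issue a SERVER_STATUS_REQUEST; the
transaction ID 42 is arbitrary.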
2.4 Message Header Block
The Message Header Block is used for both request and response
messages. Depending upon the request or response, it may be followed
by a Message Body Block.
Field         Size       Field
Name          (bytes)    Description
-----------   --------   ----------------------------------------------
Length        short(2)   Number of bytes in the entire message,
                         including the Message Header Block.
-----------   --------   ----------------------------------------------
Version       byte(1)    The major version number of this protocol.
(Major)
-----------   --------   ----------------------------------------------
Version       byte(1)    The minor version number of this protocol.
(Minor)
-----------   --------   ----------------------------------------------
Command       short(2)   The command identifier.
-----------   --------   ----------------------------------------------
Reserved      short(2)   Reserved, must be 0.
-----------   --------   ----------------------------------------------
Reserved      long(4)    Reserved, must be 0.
-----------   --------   ----------------------------------------------
Transaction   long(4)    A unique identifier generated by the requester
ID                       and returned in the corresponding response
                         message.
-----------   --------   ----------------------------------------------
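The header layout above can be expressed with Python's struct module.
The sketch below is one possible encoding, assuming that every field
is unsigned and that the fields are packed in table order with no
padding. The class name MessageHeader and its attribute names are
illustrative and do not appear in the protocol.

```python
import struct
from typing import NamedTuple

# "!" selects network byte order with standard sizes:
# H = 2 bytes, B = 1 byte, L = 4 bytes, for 16 bytes in total.
_HEADER = struct.Struct("!HBBHHLL")


class MessageHeader(NamedTuple):
    length: int          # entire message, header included
    version_major: int
    version_minor: int
    command: int
    transaction_id: int

    def pack(self) -> bytes:
        # Both reserved fields are always written as 0.
        return _HEADER.pack(self.length, self.version_major,
                            self.version_minor, self.command,
                            0, 0, self.transaction_id)

    @classmethod
    def unpack(cls, data: bytes) -> "MessageHeader":
        (length, major, minor, command,
         _reserved1, _reserved2, tid) = _HEADER.unpack(data[:_HEADER.size])
        return cls(length, major, minor, command, tid)
```

Packing and then unpacking a header returns an equal MessageHeader
value, which makes the pair convenient to test in isolation.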
2.5 Messages
Messages are composed of a fixed length message header block and a
variable length message body block. The message body block may have
zero length. Both requests from the firewall and responses from the
database use this same message format.
There are two types of messages, requests and responses. Requests are
sent from the client to the server; responses are sent from the server
to the client. There is a one-to-one correspondence between requests
and responses.
The command field in the message header block identifies the message.
For example, the SERVER_STATUS_REQUEST message has a message header
block containing the value of 1 in the Command ID field. The
SERVER_STATUS_RESPONSE message has a message header block containing
the value of 101 in the Command ID field. For ease of use, all
response messages have Command ID values 100 greater than their
corresponding request message ID values.
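The numbering rule can be captured in a short Python sketch. The
Command ID values are those listed for the messages in Section 2.5;
the names Command, RESPONSE_OFFSET and expected_response are
hypothetical.

```python
from enum import IntEnum


class Command(IntEnum):
    SERVER_STATUS_REQUEST = 1
    VERSION_REQUEST = 2
    FEATURES_REQUEST = 3
    URL_LOOKUP_REQUEST = 4
    SERVER_STATUS_RESPONSE = 101
    VERSION_RESPONSE = 102
    FEATURES_RESPONSE = 103
    URL_LOOKUP_RESPONSE = 104


RESPONSE_OFFSET = 100  # response ID = request ID + 100


def expected_response(request: Command) -> Command:
    """Return the response command that answers the given request."""
    if request >= RESPONSE_OFFSET:
        raise ValueError(f"{request.name} is not a request")
    return Command(request + RESPONSE_OFFSET)
```

A client can use expected_response to check that the Command field of
a reply matches the request it sent.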
2.5.1 SERVER_STATUS_REQUEST - Command ID: 1
The purpose of this message is for the firewall to request status from
the content filtering database. The request is composed of a Message
Header Block with a command identifier of SERVER_STATUS_REQUEST. This
message has no message body block. The database returns status
information in a SERVER_STATUS_RESPONSE message.
2.5.2 VERSION_REQUEST - Command ID: 2
The purpose of this message is for the firewall to request version
information from the content filtering database. The request is
composed of a Message Header Block with a command identifier of
VERSION_REQUEST. This message has no message body block. The database
returns version information in a VERSION_RESPONSE message.
2.5.3 FEATURES_REQUEST - Command ID: 3
The purpose of this message is for the firewall to request information
from the content filtering server about supported protocols, features,
filtering categories, etc. For example, the firewall could determine
whether or not the server supports filtering for the FTP protocol.
[This message is reserved for future use and is not currently
implemented.]
2.5.4 URL_LOOKUP_REQUEST - Command ID: 4
The purpose of this message is for the firewall to determine if a URL
should be filtered. The URL_LOOKUP_REQUEST is composed of a Message
Header Block with a command identifier of URL_LOOKUP_REQUEST, followed
by a Message Body Block as shown below. The database responds with a
URL_LOOKUP_RESPONSE message.
URL_LOOKUP_REQUEST Message Body Block
Field         Size        Field
Name          (bytes)     Description
-----------   ---------   ---------------------------------------------
Protocol      short(2)    A value indicating the protocol type: HTTP,
                          FTP, NNTP, etc. The value used is the port
                          number for the protocol from RFC 1700.
-----------   ---------   ---------------------------------------------
URL string    short(2)    The length of the following URL string in
length                    bytes.
-----------   ---------   ---------------------------------------------
Source        long(4)     The IP address of the original client host
address                   (not the Firewall) requesting the URL.
-----------   ---------   ---------------------------------------------
URL           char(var)   The requested URL. This string is null
string                    terminated.
-----------   ---------   ---------------------------------------------
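As an illustration, the request body above could be assembled as in
the following Python sketch. Two points are assumptions rather than
statements of the protocol: the URL string length is taken to exclude
the terminating null character, and the source address is taken to be
an IPv4 address in network byte order. The function name
build_url_lookup_body is hypothetical.

```python
import socket
import struct


def build_url_lookup_body(protocol_port: int, source_ip: str, url: str) -> bytes:
    """Build a URL_LOOKUP_REQUEST Message Body Block."""
    encoded = url.encode("utf-8")
    fixed = struct.pack(
        "!HH4s",
        protocol_port,                # e.g. 80 for HTTP, per RFC 1700
        len(encoded),                 # assumption: excludes the null
        socket.inet_aton(source_ip),  # 4-byte IPv4 address
    )
    return fixed + encoded + b"\x00"  # URL string is null terminated
```

The Length field of the accompanying Message Header Block would then
be 16 plus the length of the returned body.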
2.5.5 SERVER_STATUS_RESPONSE - Command ID: 101
The server status response is composed of the standard message header
and the following message body:
Field         Size        Field
Name          (bytes)     Description
-----------   ---------   ---------------------------------------------
Status code   short(2)    The server status code.
-----------   ---------   ---------------------------------------------
Status msg    short(2)    The length of the server status message,
length                    or 0 if no message.
-----------   ---------   ---------------------------------------------
License       long(4)     The number of workstations licensed to use
count                     the database.
-----------   ---------   ---------------------------------------------
Reserved      long(4)     Reserved, must be 0.
-----------   ---------   ---------------------------------------------
Reserved      long(4)     Reserved, must be 0.
-----------   ---------   ---------------------------------------------
Message       char(var)   A status message string corresponding to the
String                    status code. The string is null terminated.
-----------   ---------   ---------------------------------------------
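A client might decode this body as in the Python sketch below. It
assumes unsigned fields in table order and makes no assumption about
whether the status message length counts the terminating null; any
trailing null is simply stripped. The names ServerStatus and
parse_server_status are hypothetical, and the meaning of individual
status codes is not defined here.

```python
import struct
from typing import NamedTuple

# status code, status msg length, license count, two reserved fields
_FIXED = struct.Struct("!HHLLL")  # 16 bytes


class ServerStatus(NamedTuple):
    status_code: int
    license_count: int
    message: str


def parse_server_status(body: bytes) -> ServerStatus:
    """Decode a SERVER_STATUS_RESPONSE Message Body Block."""
    code, msg_len, licenses, _r1, _r2 = _FIXED.unpack_from(body)
    raw = body[_FIXED.size:_FIXED.size + msg_len]
    # Strip the terminator whether or not msg_len included it.
    message = raw.rstrip(b"\x00").decode("utf-8")
    return ServerStatus(code, licenses, message)
```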
2.5.6 VERSION_RESPONSE - Command ID: 102
This message has no message body block. The response is simply the
Message Header Block with a command identifier of VERSION_RESPONSE. The
server sends it in reply to a VERSION_REQUEST message.
2.5.7 FEATURES_RESPONSE - Command ID: 103
This message has no message body block. The response is simply the
Message Header Block with a command identifier of FEATURES_RESPONSE.
The server sends it in reply to a FEATURES_REQUEST message. [This
message is reserved for future use and is not currently implemented.]
2.5.8 URL_LOOKUP_RESPONSE - Command ID: 104
The URL lookup response is composed of a Message Header Block and the
message body below. When a URL is to be blocked, the database server
must return a lookup code greater than 0 and zero or more of the
following items:
a. An optional text message describing why the URL was blocked,
suitable for display on the client browser.
b. An optional "redirected" (replacement) URL.
c. An optional rating label or category.
Field         Size        Field
Name          (bytes)     Description
-----------   ---------   ---------------------------------------------
lookup code   short(2)    Zero if the request should not be blocked.
                          Non-zero lookup codes indicate that a
                          request should be blocked, the license has
                          been exceeded, or an error occurred.
-----------   ---------   ---------------------------------------------
Rating        short(2)    Zero if no rating label or category found.
label                     Otherwise, length of the rating label and/
length                    or category string in bytes.
-----------   ---------   ---------------------------------------------
Rating        short(2)    Offset from the beginning of this structure
label                     to the rating label and/or category string.
offset
-----------   ---------   ---------------------------------------------
Message       short(2)    Zero if no rating message. Otherwise,
label                     length of an HTML formatted message for
length                    display on the client's browser. This
                          string is null terminated.
-----------   ---------   ---------------------------------------------
Message       short(2)    Offset from the beginning of this structure
label                     to the rating message string.
offset
-----------   ---------   ---------------------------------------------
Redirect      short(2)    Zero if no redirected URL. Otherwise,
URL length                length of a redirected URL string for
                          the client's browser. This
                          string is null terminated.
-----------   ---------   ---------------------------------------------
Redirect      short(2)    Offset from the beginning of this structure
URL offset                to the redirected URL string.
-----------   ---------   ---------------------------------------------
Reserved      short(2)    Reserved, must be 0.
-----------   ---------   ---------------------------------------------
Rating        char(var)   The rating label and/or category for the
label                     requested URL, if available. This string
string                    is null-terminated. This string is only
                          present if the rating label string length
                          is > 0.
-----------   ---------   ---------------------------------------------
Message       char(var)   For blocked URLs, HTML formatted text for
string                    display on the client's browser. This
                          string is null-terminated. This string is
                          only present if the message string length
                          is > 0.
-----------   ---------   ---------------------------------------------
Redirected    char(var)   For blocked URLs, a redirected URL for the
URL string                client's browser. This string does not
                          need to be null-terminated. This string
                          is only present if the redirected URL
                          string length is > 0.
-----------   ---------   ---------------------------------------------
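The length and offset pairs above lend themselves to a small decoding
routine, sketched here in Python. The sketch assumes that "the
beginning of this structure" means the first byte of the Message Body
Block and that all fields are unsigned. A string with a length of zero
is reported as absent, and a trailing null is stripped if one is
present. The names UrlLookupResult and parse_url_lookup_response are
hypothetical.

```python
import struct
from typing import NamedTuple, Optional

_FIXED = struct.Struct("!8H")  # eight 2-byte fields, 16 bytes in total


class UrlLookupResult(NamedTuple):
    lookup_code: int
    rating: Optional[str]
    message: Optional[str]
    redirect_url: Optional[str]


def _extract(body: bytes, offset: int, length: int) -> Optional[str]:
    """Return the string at body[offset:offset+length], or None if absent."""
    if length == 0:
        return None
    if offset + length > len(body):
        raise ValueError("string extends past the end of the message body")
    return body[offset:offset + length].rstrip(b"\x00").decode("utf-8")


def parse_url_lookup_response(body: bytes) -> UrlLookupResult:
    """Decode a URL_LOOKUP_RESPONSE Message Body Block."""
    (code, rating_len, rating_off, msg_len, msg_off,
     redir_len, redir_off, _reserved) = _FIXED.unpack_from(body)
    return UrlLookupResult(
        code,
        _extract(body, rating_off, rating_len),
        _extract(body, msg_off, msg_len),
        _extract(body, redir_off, redir_len),
    )
```

A firewall would allow the request when lookup_code is zero and
otherwise consult the optional fields to decide how to deny it.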
3. Character Encoding
All character strings in CFP are UTF-8 encoded and null-terminated,
and are accompanied by their length (in bytes).
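Because lengths are counted in bytes, they can differ from the number
of characters in a string. The Python sketch below shows one way to
encode a string for transmission. It assumes that the reported length
excludes the terminating null, which this document does not state
explicitly; the function name encode_cfp_string is hypothetical.

```python
def encode_cfp_string(text: str):
    """Return (wire_bytes, length) for a CFP character string.

    wire_bytes is the UTF-8 encoding followed by a null terminator.
    length is the number of encoded bytes, excluding the terminator
    (an assumption, see above).
    """
    encoded = text.encode("utf-8")
    if b"\x00" in encoded:
        raise ValueError("string contains an embedded null character")
    return encoded + b"\x00", len(encoded)
```

For example, a four-character string containing one accented letter
encodes to five bytes plus the terminator.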
4. Security Considerations
Using the content filtering protocol allows the content filtering
database server to be safely located behind the firewall.
Alternatively, the content filtering database server can also be
located on the Internet side of the firewall, as shown below:
                         < Internet >
                              ^^
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ || ~ ~ ~ ~ ~ ~ ~ ~
                              vv
                         +===========+
+==========+             |  Content  |
| Firewall |  < ===== >  | Filtering |
|  System  |             | Database  |
+==========+             |  Server   |
    ^^                   +===========+
~ ~ || ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
    vv
< Firewall protected resources >
Figure 3, Content filtering database located outside the firewall
In this configuration, the content filtering database works as a proxy
server for all content requests sent to and retrieved from the
Internet. The principal advantage here is that many firewalls and
content servers can communicate with each other without any special
protocol. The firewall treats the content server as a super-firewall
(operating in a firewall-within-a-firewall mode) and the content server
accepts or rejects content according to its configuration. This
configuration is very easy to implement.
The principal disadvantage to the above configuration is that the
content filtering database server is not protected by the firewall.
One other disadvantage is that the firewall management system is not
able to communicate with the database server, other than to send and
retrieve content requests.
5. Acknowledgements
The author would like to thank the following people for their support
and feedback on this specification:
The AltaVista Security Team, Compaq Computer Corporation
Steve Shannon, The Content Advisor
Chao Yu, Log-On Data Corporation
Myrna, Olivia, Sander and Maxine Gutfreund
6. Author's Address
Please address all comments to:
Keith Gutfreund
AltaVista Internet Software
Compaq Computer Corporation
550 King Street
Littleton, MA 01460
Email: keith.gutfreund@compaq.com
Phone: (978) 506-2147
Document Expiration Date: November 10, 1999