Internet DRAFT - draft-hanson-nnmp

INMD Ad Hoc Folks
INTERNET-DRAFT draft-hanson-nnmp-01.txt
Expires Feb 14th, 2000
C. Hanson
Arcticus
Sept 16th, 1999
INMD: Internet Metadata

Status of this Document
This is a rough draft of an idea that deserves wider dissemination and
comment. Distribution is unlimited. Sarcastic humor is included.

This is the second (01) revision and reflects changes made to expand
the scope of the rating system beyond Netnews to include generalized 
Internet resources such as web sites/servers/documents. The second
edition changes the proposed name from NNMP to INMD.


This document is an Internet-Draft and is NOT offered in accordance
with Section 10 of RFC2026, and the author does not provide the IETF 
with any rights other than to publish as an Internet-Draft.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups.  Note that other
groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.


Contributing Authors
People who have recently contributed ideas to this brainstorm include
Steve Koren, Bob Maple, David Kessner, Ian Cahoon, Eric Schultz,
Frank Weed, Earl Miles, Jamie Krutz, Michael Ash and Dave Warner.

Abstract
We attempt to design an extensible parallel distributed client-server
customizable profiles-based ratings system with as many buzzwords as
possible. Additionally the system should be capable of amassing,
tracking and indexing other Netnews/Internet "metadata" -- any data
about netnews articles or Internet resources other than the articles/
resources themselves. By this method, we seek to win the Nobel Peace
Prize in Communications by staving off smothering regulation and making
the world's largest and most democratic information/discussion forum
useful again.

Table of Contents
1. Definitions
2. Introduction
3. Pros
4. Cons
5. Client
6. Server
7. Example of Usage
8. Notes
9. Related Works
10. Security Considerations
11. Author's Address

Definitions
Metadata: Data about data. In the context of Netnews, metadata is any
information about the news articles other than the news articles
themselves; in this draft, that chiefly means rating information. In
the case of Internet resources (web/ftp servers/sites/documents, etc.),
this could mean rating or classification information.

Introduction
NNTP-based Netnews, aka Usenet, once a valuable source of information
(signal), has for quite some time been drowning in 'noise'. The signal
has been lost. The noise consists of deliberate 'spam', off-topic
content, and content of little to no value. More vexing, the actual
signal level has not been increasing at the same rate as the noise,
and indeed may even be decreasing as numbers of quality authors depart
Netnews in general disgust at the decline of the medium.

Those familiar with the 'early' and 'late' days of Usenet understand
the difference. Anyone who has seen the evolution (devolution?) of
the Lightwave mailing list into the Lightwave newsgroup, and the
degeneration of both into a morass of noise, will clearly see the
problem.

A filter needs to be constructed to separate the signal from the noise
at the reader's bidding. The problems are many. To begin with, not all
readers agree on what is signal and what is noise. Additionally, a
filter must be capable of coping with the vast numbers of messages
in different formats and languages that travel the Usenet. A filter
must be very clever to interpret the actual content of a message 
regardless of the format, language, encoding or other medium-specific
attributes. Finally, a filter must be immune to accidental or even
very deliberate filter manipulation or spoofing.

What nature of computer exists that can meet these strict criteria?
Only one: The amassed brainpower of the global Internet userbase.

We propose to design and implement a system to facilitate the rating
of Usenet/Internet articles/content by each and every reader/browser
(if they so desire). The system stores and indexes these ratings and
will permit users to sort and present (unviewed) Usenet messages/
Internet resources based upon how a user-selected set of peers rated
said messages/content.


Pros
* Distributed Client-Server Architecture
* 'Expert' system
* Does not enforce censorship
* Difficult to spoof

Cons
* May be a resource hog with large numbers of raters.
* No (?) known interesting business model for implementers or service providers to profit from.

Client
The client needs to be integrated into popular newsreaders/browsers.
The client needs to contact the INMD server and negotiate any login/
security session issues to access the user's Profile. Subsequently,
when the user reads a Usenet message/views a document, the client
should offer the user the option to submit a rating for the message/
document. Ratings could be a simple 1 to 10 scale, or -10 to 10, or
such, or could attempt to rate multiple aspects of the message --
relevance, coherency, G/PG/PG13/R/NC17/X, overall score, etc.
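
As an illustration only, a rating submission might be modeled as
below. The field names (profile_id, resource_id, overall, aspects) and
the -10 to 10 range are assumptions picked from the options mentioned
above; this draft does not fix a rating format.

```python
from dataclasses import dataclass, field
from typing import Dict

MIN_SCORE, MAX_SCORE = -10, 10  # assumed scale; the draft leaves this open


@dataclass
class Rating:
    """One user's rating of one message or document (hypothetical layout)."""
    profile_id: str    # who rated, e.g. "joe@foo.net"
    resource_id: str   # Usenet message-id or URL being rated
    overall: int       # overall score on the assumed scale
    aspects: Dict[str, int] = field(default_factory=dict)  # e.g. relevance

    def __post_init__(self) -> None:
        # Reject any score, overall or per-aspect, outside the assumed range.
        for name, value in {"overall": self.overall, **self.aspects}.items():
            if not MIN_SCORE <= value <= MAX_SCORE:
                raise ValueError(f"{name} score {value} is out of range")
```

A client would build one such record each time the user rates a
message, for example Rating("joe@foo.net", "<123@foo.net>", 7,
{"relevance": 9}), and hand it to whatever submission call the
protocol eventually defines.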

When retrieving messages/documents, the client should request rating
information from the server pertaining to the message(s)/document(s)
in question, and use the ratings information to rank and/or filter
the unread messages. Which profile or profiles are used to generate
the rating, and in what proportions, should be up to the user.
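
One plausible way to combine several profiles "in proportions" is a
weighted mean. The sketch below assumes each profile yields a single
numeric score per message and that the user assigns each profile a
non-negative weight; neither detail is specified by this draft, and
the function name is made up for the example.

```python
from typing import Dict, Optional


def composite_score(ratings: Dict[str, int],
                    weights: Dict[str, float]) -> Optional[float]:
    """Weighted mean of the ratings from the profiles the user selected.

    ratings maps profile id -> that profile's score for one message.
    weights maps profile id -> the proportion the user gave that profile.
    Profiles that have not rated the message are skipped; the result is
    None when no selected profile has rated it.
    """
    total = 0.0
    weight_sum = 0.0
    for profile_id, weight in weights.items():
        if profile_id in ratings and weight > 0:
            total += weight * ratings[profile_id]
            weight_sum += weight
    return total / weight_sum if weight_sum else None
```

With biff weighted 3 and jim weighted 1, ratings of 8 and 2 combine to
(3*8 + 1*2) / 4 = 6.5. Returning None for unrated messages lets the
client decide separately whether to show, hide or defer them.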

Server
The server should securely keep a profile for each authenticated user
it 'hosts', and track any rating submitted by that user by Usenet
message-id or Internet URL (Uniform Resource Locator). Additionally,
other users should be able to request a rating by specifying profile
id and message-id/URL. The server should be able to accept and fulfill
requests for rating from multiple profiles and multiple message-ids/
URLs in one transaction for efficiency. The format of the data in the
actual TCP/IP packets is not yet defined, but should be as concise and
brief as possible.
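
The multi-profile, multi-resource request described above could be
served along these lines. This is a sketch of the lookup logic only:
the in-memory dictionary stands in for a real database, and the
function and type names are invented for the example. The wire format
remains undefined.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical in-memory store: (profile id, message-id or URL) -> score.
RatingStore = Dict[Tuple[str, str], int]


def batch_lookup(store: RatingStore,
                 profile_ids: Iterable[str],
                 resource_ids: Iterable[str]) -> Dict[str, Dict[str, int]]:
    """Answer one request covering many profiles and many resources.

    Returns {resource id: {profile id: score}}, omitting any pair that
    has no stored rating so the reply stays brief.
    """
    profiles = list(profile_ids)  # reused for every resource below
    result: Dict[str, Dict[str, int]] = {}
    for resource_id in resource_ids:
        found = {p: store[(p, resource_id)]
                 for p in profiles if (p, resource_id) in store}
        if found:
            result[resource_id] = found
    return result
```

Leaving unrated pairs out of the reply is one way to honor the "as
concise and brief as possible" goal; a client treats a missing entry
as "no rating".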

Optionally, the server could generate a list of the 'top 50' (or top n)
most popular profiles it hosts, and perhaps a list of profiles most
closely matching a user's own ratings. (See Related Works for examples
of this type of feature.)
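
A 'top n' list needs some measure of popularity. The sketch below
assumes popularity means "number of users who have selected the
profile", which is only one possible reading; the function name and
the shape of the input are likewise assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, List


def top_profiles(selections: Dict[str, Iterable[str]],
                 n: int = 50) -> List[str]:
    """Return the n profile ids selected by the most users.

    selections maps a user id to the profile ids that user has chosen
    to filter by. Ties are broken alphabetically for a stable result.
    """
    counts: Counter = Counter()
    for chosen in selections.values():
        counts.update(set(chosen))  # count each user at most once
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [profile_id for profile_id, _ in ranked[:n]]
```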

Database speed and compactness are the primary criteria for the server.
Record sizes for each rating have not yet been estimated.

Example of Usage
User 'Joe' starts up his INMD-enabled newsreader. It contacts his ISP's
NNTP and INMD servers (news.foo.net and ratings.foo.net) and performs
whatever authentication is necessary.

Joe browses the group alt.frogs.small.green. His newsreader sees that
Joe has selected several other users' profiles (biff@bar.com and
jim@baz.org) to use in rating messages in alt.frogs.small.green. As the
newsreader fetches NNTP headers from the NNTP server, it also contacts
ratings.bar.com and ratings.baz.org to fetch ratings for each article
via their message-id. The newsreader has not yet fetched the actual
article(s). It then sorts all of the messages in alt.frogs.small.green
according to the composite score it calculated from biff's and jim's
ratings.
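
In the example above the newsreader has to work out which ratings
server to ask for each selected profile. The sketch below assumes the
convention implied by the example (profile user@domain is hosted on
ratings.domain); this draft does not define server discovery, so treat
the naming rule and both function names as placeholders.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def ratings_server_for(profile_id: str) -> str:
    """Guess the ratings host for a profile such as 'biff@bar.com'.

    The 'ratings.' prefix mirrors the example in the text; it is an
    assumed naming convention, not something the draft specifies.
    """
    local, at, domain = profile_id.rpartition("@")
    if not (local and at and domain):
        raise ValueError(f"not a user@domain profile id: {profile_id!r}")
    return "ratings." + domain.lower()


def group_by_server(profile_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Group profiles by host so each server is contacted only once."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for profile_id in profile_ids:
        grouped[ratings_server_for(profile_id)].append(profile_id)
    return dict(grouped)
```

Grouping first means a client that follows many profiles on the same
server can fetch all of their ratings in one transaction.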

As Joe reads, he rates each message himself. His ratings are submitted
to his ratings server (ratings.foo.net) by his netnews client.

Later, when Sue goes and reads alt.frogs.small.green, her news client
requests the ratings profiles from Joe, Biff and Jim (because she had
previously selected them) and sorts and filters based upon the
composite score.

Now, Sue decides to read alt.controversial.topic. She has never read
this group before. Her newsreader suggests the top 50 profiles for the
group. She recognizes several notable personalities whose opinions she
agrees with, and several she vehemently opposes. She selects a few
profiles of people she agrees with, and a few of her associates as
well. In this way, she avoids becoming victim to any sort of mass
censorship -- she chooses who has the ability to restrict her media
input.

Later, Joe begins to read alt.molecular.diagrams. He does not select
any profiles to sort his messages by because he does not recognize any
names in the field. He begins to rate messages himself. After a few
days (weeks?) he has rated a statistically significant number of
messages, and his news client queries his INMD server (and a few other
well-known servers?) for a list of other profiles that rated the same
articles as Joe, and rated them similarly. Now, though Joe does not
know any of these individuals, he at least knows they share a roughly
similar opinion on the content of messages in alt.molecular.diagrams,
so he selects a few (more is always better) profiles to govern the
filtering and sorting.
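
Finding profiles that "rated the same articles as Joe, and rated them
similarly" needs some similarity measure. The sketch below uses the
mean absolute difference over commonly rated articles, with a minimum
overlap standing in for "statistically significant". Both choices, and
all names in the code, are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def similar_profiles(mine: Dict[str, int],
                     others: Dict[str, Dict[str, int]],
                     min_overlap: int = 3) -> List[Tuple[str, float]]:
    """Rank other profiles by how closely their ratings track mine.

    mine maps article id -> my score; others maps profile id -> that
    profile's {article id: score}. The distance is the mean absolute
    difference over articles both parties rated (lower is closer).
    Profiles sharing fewer than min_overlap articles are ignored.
    """
    ranked = []
    for profile_id, theirs in others.items():
        shared = mine.keys() & theirs.keys()
        if len(shared) < min_overlap:
            continue
        distance = sum(abs(mine[a] - theirs[a]) for a in shared) / len(shared)
        ranked.append((profile_id, distance))
    # Closest first; ties broken by profile id for a stable order.
    return sorted(ranked, key=lambda pair: (pair[1], pair[0]))
```

A production server would more likely use an established
collaborative-filtering measure, but the shape of the query stays the
same: my ratings in, a ranked list of candidate profiles out.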

Similarly, in a web/Internet document situation, the client software
should query and retrieve rating profile entries for a document URL
prior to retrieving/displaying the document itself. Potentially, the
ratings could be retrieved and used for filtering as the browser/client
encountered references to URLs in other documents. In this way, the
browser/client could advise the user of the rating implications as the
user moved the mouse pointer over destination URLs to 'test the waters'.
Submitting a rating for a page is considered trivially obvious and is
not elaborated on here.

Notes
* Because the system assigns a level of respect for popular 'raters',
one could imagine certain parties who perform well becoming de-facto
expert raters for certain groups. A cyber-Siskel or electronic Ebert. ;)
This may be roughly analogous to the position of "Sub-Op" or "Board-Op"
or "Moderator" from the BBS and Compuserve era. Who knows what this
level of respect can be parlayed into? Power and World Domination? ;)

* Concerning cliques and Special Interest Groups: A system such as this
would allow organizations to create and maintain their own ratings/
approval profiles. Whether use of these profiles is voluntary or not
could be a subject of abuse. In a voluntary sense, companies such as
SurfWatch and NetNanny and the like could sell the use of their
profiles to those who desire them. Other organizations that might be
motivated to maintain and publish their own 'official' profile might
include RIAA, MPAA, PMRC, PTA, NEA, NRA, Republicans, Democrats,
Libertarians, Fascists, Communists, National Governments and every
other special interest group on the planet. If users want their
thinking to be guided/restricted by a group or groups, that shall
be their choice.

* Spam, being the natural target of this endeavour, will not fare well.
Anyone who does read it will derive a natural satisfaction from
feeling like they're doing the world a favour (and getting revenge on
the poster) by downrating it. The server might want to specially flag
messages that get rapidly and substantially downrated by a majority of
the readers, and provide this list as an independent 'profile' for
those who wish to use it. This could present a non-legislative
technological and democratic answer to the spam problem. People only
spam because they know it works. By 'works' I mean that it gets read
by some fraction of a large number of people, willingly or not.
However, if we dramatically reduce that fraction, we reduce the
motivation to post ineffective spam, perhaps shuttling it permanently
into an evolutionary backwater, like wisdom teeth or the appendix.
Out-Darwinned. If a Spam gets posted in the forest and no one is
there to read it, does it continue to get posted?

* The protocol should be extensible -- if other useful categories of
metadata are devised, the same server and protocol should be capable
of supporting, storing, indexing and delivering the new metadata as
well. The index field is an NNTP message-id or an Internet URL in the
above cases, but could conceivably be any unique identifying ID for
other unforeseen types of metadata.
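
The spam note above suggests flagging messages that a majority of
readers downrate substantially. A minimal sketch of that test follows.
The threshold of -5, the minimum of five raters and the function name
are all assumptions, and the "rapidly" part (how quickly the
downratings arrive) is left out for brevity.

```python
from typing import Dict, List, Set


def likely_spam(scores: Dict[str, List[int]],
                bad_score: int = -5,
                min_raters: int = 5) -> Set[str]:
    """Return message-ids that most raters scored at or below bad_score.

    scores maps message-id -> every score submitted for that message.
    Messages with fewer than min_raters ratings are never flagged, so a
    couple of grudge votes cannot bury a message.
    """
    flagged = set()
    for message_id, values in scores.items():
        if len(values) < min_raters:
            continue
        downrated = sum(1 for v in values if v <= bad_score)
        if downrated * 2 > len(values):  # strict majority
            flagged.add(message_id)
    return flagged
```

The resulting set could be published as the independent 'profile'
described in the note, for those who wish to subscribe to it.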

Related Works
Deja News:     http://www.deja.com
Slashdot:      http://www.slashdot.org
MovieLens:     http://www.movielens.umn.edu
MovieCritic:   http://www.moviecritic.com
Net Nanny:     http://www.netnanny.com
SurfWatch:     http://www.surfwatch.com
Google:        http://www.google.com

Security Considerations
There must be heaps of security considerations. They need to be
discussed. As I understand it, in order to use a port number less than
1024 under Unix, the program needs to run with special privileges.
Is this true? Desired? Other than that, it is hoped that the server
could run without any special system privileges.

Author's Address
Chris Hanson		xenon@arcticus.com
Steve Koren
Bob Maple
David Kessner
Ian Cahoon
Eric Schultz
Frank Weed
Earl Miles
Jamie Krutz
Michael Ash
Dave Warner

I have created a mailing-list to discuss this document at
NNMP@Arcticus.com. All of the people listed above are currently
subscribed.

Expires Feb 14th, 2000.