The Resource Catalog (RC) Server

The purpose of the RC server is to provide ``authoritative'' information about an Internet-accessible resource to potential users of that resource. An RC server accepts network-based requests from users (clients) for information associated with URNs or LIFNs, and returns appropriate answers. ``Authoritative'' means that the information is provided either by the naming authority associated with that URN or LIFN, or by someone authorized by the naming authority to add Assertions to a URN or LIFN.

The tremendous growth of the Internet imposes the requirement that the RC meta-information service be able to scale to several orders of magnitude. One way to scale the amount of load that the service can handle is to replicate data across several RC servers for a particular naming authority. (This also provides a measure of fault-tolerance.) Individual RC servers therefore need facilities for keeping in reasonable synchronization with their peers. To avoid needing an excessive number of servers for a particular domain, each single RC server needs to be reasonably efficient.

The RC service can be scaled in terms of the amount of information provided, by allowing the RC space for a particular naming authority to be subdivided, and delegating different portions of the RC space to different servers. To maintain fault-tolerance, it should be possible to delegate each portion of the namespace to multiple servers.

Information services on the Internet are still in flux, and the requirements of an RC service are not yet well understood. In addition, there are many different formats in use for resource cataloging data, and there may be no clean mapping from one kind of cataloging record to another. The RC server should therefore have no knowledge of the details of cataloging records, so that it may adapt to whatever standards are eventually chosen, or contain multiple representations for any particular resource.

The RC service should not restrict the types of data that can be stored as part of a resource description, nor be dependent on any particular written language. This implies that each of the various fields can potentially hold any sequence of byte values. Since there is no single character set that meets the needs of all countries, this also implies that it must be usable with multiple character sets.

Many important providers on the Internet are individuals or small organizations which lack the resources to hire a system administrator or database administrator to maintain an RC server. The prototype RC server is therefore designed to be easy to administer, in the hope that individuals and small organizations can continue to have a voice in the Internet information world.

The RC server supports the notion of multiple sources of data (``asserters'') for any particular URN and multiple sources of location data for any particular LIFN. Every piece of information about a URN carries with it a notion of who supplied that piece of information; every location of a LIFN is supplied by the file server for that location. However, because it is undesirable for the RC server to have a wired-in knowledge of the structure of cataloging records, and because we do not want to have to specify permissions on a per-field basis, the RC server allows any asserter authorized by the naming authority to supply information for any particular field. (In other words, write permission is specified per-record, rather than per-field.) A user requesting information about a particular field for a particular URN may choose to receive all assertions for that field (along with the ID of each asserter), or only the most recent assertion by each asserter.
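The two retrieval modes described above can be sketched as follows. This is only an illustration, not the prototype's implementation: the `Assertion` tuple and the function `select_assertions` are hypothetical names, and the sketch assumes that a larger serial number marks a more recent assertion by the same asserter.

```python
from collections import namedtuple

# Hypothetical in-memory form of an Assertion; only the fields needed here.
Assertion = namedtuple("Assertion", "name type asserter serial value")

def select_assertions(record, name, type_, want_all):
    """Return assertions for (name, type_): all of them, or the latest per asserter."""
    matches = [a for a in record if a.name == name and a.type == type_]
    if want_all:
        return matches
    latest = {}
    for a in matches:
        # Assumption: a larger serial means a more recent assertion.
        if a.asserter not in latest or a.serial > latest[a.asserter].serial:
            latest[a.asserter] = a
    return sorted(latest.values(), key=lambda a: a.asserter)

# Example record: asserter 7 supplied two titles, asserter 9 supplied one.
record = [
    Assertion("title", 2, 7, 1, b"Draft"),
    Assertion("title", 2, 7, 2, b"Final"),
    Assertion("title", 2, 9, 1, b"Final Report"),
]
```

Requesting all assertions returns the three entries together with their asserter IDs; requesting only the most recent returns one entry for each of the two asserters.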

The RC server also supports the ability to include cryptographically signed certificates for any subset of fields about a URN. When an asserter supplies values for particular fields, the asserter may also include such Certificates. The RC server does not interpret the Certificates itself, but it does support the ability to return them in responses to queries. A client may request Certificates of one or more particular types. If any such Certificates are found, the RC server will return all fields necessary to verify each Certificate (not just those requested by the client). Finally, when updating the information about a particular URN, the RC server will not delete any field which is still needed to verify a current Certificate.
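The Certificate handling above amounts to two small rules, sketched here under stated assumptions. The function names and the `(algorithm_id, covered_indices)` pair are hypothetical, and the sketch assumes that a Certificate is ``found'' when its algorithm was requested and it covers at least one requested field; the text does not define this precisely.

```python
def fields_to_return(requested, certificates, wanted_algorithms):
    """requested: set of assertion indices matched by the query.
    certificates: list of (algorithm_id, [assertion indices]) pairs.
    Returns (sorted assertion indices to return, certificates to return)."""
    selected = [
        (alg, covered) for alg, covered in certificates
        # Assumption: a certificate is relevant if its algorithm was asked for
        # and it covers at least one requested assertion.
        if alg in wanted_algorithms and requested.intersection(covered)
    ]
    indices = set(requested)
    for _alg, covered in selected:
        # Add every field needed to verify the certificate, requested or not.
        indices.update(covered)
    return sorted(indices), selected

def can_delete(index, certificates):
    """A field may be deleted only if no current certificate still covers it."""
    return all(index not in covered for _alg, covered in certificates)

# Example: one certificate over fields 0-2 (algorithm 1), one over field 3.
certs = [(1, [0, 1, 2]), (2, [3])]
```

With this data, a client that asks only for field 1 and for Certificates of algorithm 1 receives fields 0, 1, and 2, since all three are needed for verification.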

Multiple authentication algorithms are supported for Certificates, and there can be multiple Certificates for any set of Assertions.

With an appropriate method of key exchange, Certificates are sufficient for authentication of RC fields to users. However, due to the difficulties of key exchange and the restrictions imposed by various governments, Certificates may not be usable everywhere. A separate security mechanism is therefore used to prevent unauthorized modification of an RC server's data over a network. This will not thwart all attacks: if the RC server host is compromised by any means, the data it provides may also be compromised. However, the security mechanisms built into the RC server are designed to thwart attacks based on the RC server protocol.

Structure of URN Data

Data on an RC server is stored in a keyed-access file with a single key. A resource name (assumed to be a URN) is the access key. Each record consists of zero or more Assertions followed by zero or more Certificates. Location data is also stored on an RC server; it is indexed by LIFN, and each record consists of zero or more Locations.

The naming authority (or an entity authorized by the naming authority) may advertise a number of ``Assertions'' about a URN. These Assertions essentially define the meaning of the URN, for example, by associating the resource's title, author, publisher, and the realization of that resource with the URN. In addition, naming authorities or other authorized asserters may provide ``Certificates'' by which users can verify that a set of Assertions was indeed made by the party claiming to have made them.

All Assertions have the form:

Asserter A says that at time T, the attribute named N for URN U had value V.

Each Assertion also has an optional expiration date. A Certificate has the form:

At time T, Certifier C agrees that the Assertions in list { A1, A2, ..., An } are valid for URN U.

along with a cryptographic signature algorithm identifier, and the signature itself.

For both Assertions and Certificates, all fields are supplied by the asserter. (The Certificate may be issued by another party, but the asserter supplies the field to the RC server.) Other information is maintained by the RC server itself.

An attribute-name used in an Assertion is a tuple of the form: (name, type), where 'name' is a string of ASCII characters, and 'type' is a data type identifier for the associated value. The 'type' is associated with the name, rather than the value, to avoid certain silly states. This also allows multiple data types to exist for any particular name.

Data type names are small integers. Except for some special case handling of LIFNs, the current RC server has no knowledge of data types and treats all objects the same. However, the type information may be useful to clients when presenting data or converting RC Assertions to other formats.

Currently-defined data types are octet-string, latin1-string, date, sequence, and LIFN. Additional data types may be defined in the future, for example, to support more capable character sets.

The data structure for Assertion is as follows:

typedef struct {

        /* supplied by asserter */
        String name;
        unsigned int type;

        Date assert_time;
        Date lifetime;
        char value<>;

        /* supplied by RC server */
        unsigned int index;
        unsigned int asserter;
        unsigned int serial;

} Assertion;

The assert_time is the value T that is part of the assertion, and is supplied by the asserter.

The lifetime is the date after which the asserter expects that the assertion will no longer be valid. A lifetime of infinity (all bits 1) indicates that the Assertion is expected to be valid forever; a lifetime of zero (all bits 0) indicates that there is no estimate of the lifetime of the Assertion.
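The two reserved lifetime values can be handled as in the sketch below. The width of a Date is not stated here, so the 32-bit width, the constant names, and the function `lifetime_status` are all assumptions made for illustration.

```python
DATE_BITS = 32                    # assumption: the width of Date is not given
FOREVER = (1 << DATE_BITS) - 1    # all bits 1: expected to be valid forever
UNKNOWN = 0                       # all bits 0: no estimate of the lifetime

def lifetime_status(lifetime, now):
    """Classify an Assertion's lifetime relative to the current date."""
    if lifetime == FOREVER:
        return "valid forever"
    if lifetime == UNKNOWN:
        return "no estimate"
    # Otherwise the lifetime is the date after which the asserter expects
    # the Assertion to no longer be valid.
    return "expected valid" if now <= lifetime else "past expected lifetime"
```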

The index is a unique number (within this record) used to associate Certificates with the Assertions they apply to.

The asserter id is an integer assigned by the naming authority. The RC server records the asserter id for every Assertion written to the database.

The serial number is the number of times that this asserter has supplied a new value for this name and type. These are maintained by the RC server.
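One way the server might maintain these serial numbers is a counter keyed by asserter, name, and type. The class `SerialCounter` is hypothetical, and the sketch assumes that the first value supplied receives serial number 1.

```python
from collections import defaultdict

class SerialCounter:
    """Counts how many values each asserter has supplied per (name, type)."""

    def __init__(self):
        self._counts = defaultdict(int)

    def next_serial(self, asserter, name, type_):
        key = (asserter, name, type_)
        self._counts[key] += 1   # assumption: counting starts at 1
        return self._counts[key]

# Example: asserter 7 supplies "title" twice, asserter 9 once.
counter = SerialCounter()
s1 = counter.next_serial(7, "title", 2)
s2 = counter.next_serial(7, "title", 2)
s3 = counter.next_serial(9, "title", 2)
```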

Certificates look like this:

typedef struct {

        /* supplied by asserter */
        unsigned int assertions<>;
        Date certificate_time;
        unsigned int algorithm_id;
        unsigned char signature<>;

        /* supplied by RC server */
        unsigned int asserter;
        unsigned int serial;
} Certificate;

A Certificate consists of an ordered list of Assertion indices (this is what the Assertion 'index' field is for), the time at which the certifier agrees with the Assertions, an algorithm identifier, and a signature. The format of the latter is defined by the signature algorithm. The asserter and serial number are similar to those for Assertions, except that the serial number for a Certificate is per-asserter rather than per-field. (Actually, this could be true for an Assertion also; the purpose of the serial number is to determine which of several Assertions or Certificates was the last one added.)

Each record also carries a serial number, which is increased each time the record changes and is never decreased. (However, it is not constrained to increase by one for each update.) The record's serial number is returned in query responses. Update requests may be conditional on a particular serial number from a query response; if the record has changed since that query, the entire update fails.
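A conditional update of this kind behaves like an optimistic concurrency check. The sketch below is a minimal illustration; the `Record` class, the function `conditional_update`, and the choice to increase the serial by exactly one are assumptions, not the prototype's code.

```python
class Record:
    """Hypothetical record: a serial number plus a list of assertions."""

    def __init__(self):
        self.serial = 0
        self.assertions = []

def conditional_update(record, new_assertions, expected_serial=None):
    """Apply the update only if the record is unchanged since the query."""
    if expected_serial is not None and record.serial != expected_serial:
        return False              # record changed: the entire update fails
    record.assertions.extend(new_assertions)
    record.serial += 1            # any strictly larger value would also do
    return True

# Example: two updates are both conditional on the same earlier query.
rec = Record()
seen = rec.serial                 # serial number from a query response
first = conditional_update(rec, ["a"], expected_serial=seen)
second = conditional_update(rec, ["b"], expected_serial=seen)  # now stale
```

The first update succeeds and advances the serial number, so the second update, which still carries the old serial number, is rejected as a whole.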

Location Data

Certain kinds of data are associated with the locations of a resource. These are collected under a common location independent resource name, called a LIFN. LIFN stands for Location Independent File Name.

Currently, each location of a resource is denoted by a URL. In addition, each location may specify an expiration date, after which the resource is not expected to be available at that location, and a time-to-live. The time-to-live field is the amount of advance warning (in seconds) that the file server intends to provide to the RC server before making the resource inaccessible from that location.

A Location looks like the following:

typedef struct {
        String url;
        Date expiration_date;
        unsigned int ttl;
} Location;

A Location Information record looks like:

typedef struct {
        String lifn;
        Location location_list<>;
} LocationInfo;

Format of Query Requests and Responses

There are two kinds of queries: (1) a query by resource name, which returns Assertions about a resource named by a URN (and optionally location information), and (2) a query by LIFN, which returns only location information. The latter would be used when the client already has a LIFN and needs to find additional locations, such as when a URN that names a resource and a LIFN that represents a realization of that resource are served by different RC servers.

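A query-by-resource-name contains the resource name (the key), a list of AssertionRequests, a list of Certificate algorithm identifiers, and a list of URL protocols.

Each field name goes in an AssertionRequest. The AssertionRequest contains the name and type tuple that make up the field name. It also contains flag bits that specify lookup options.

typedef struct {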
        String name;
        unsigned int type;
        unsigned int flags;
#define AR_WILDCARD             01
#define AR_WANT_CERTIFICATES    02
#define AR_ALL                  04
#define AR_WANT_LOCATION_INFO   010
} AssertionRequest;
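The flag bits are written in octal, so AR_WANT_LOCATION_INFO (010) has the value 8. The sketch below shows how a flags word might be combined and decoded; the helper `describe_flags` is a hypothetical name, and the meaning of each option is not modelled here.

```python
# Flag values copied from the AssertionRequest definition (octal constants).
AR_WILDCARD = 0o1
AR_WANT_CERTIFICATES = 0o2
AR_ALL = 0o4
AR_WANT_LOCATION_INFO = 0o10

_FLAG_NAMES = [
    ("AR_WILDCARD", AR_WILDCARD),
    ("AR_WANT_CERTIFICATES", AR_WANT_CERTIFICATES),
    ("AR_ALL", AR_ALL),
    ("AR_WANT_LOCATION_INFO", AR_WANT_LOCATION_INFO),
]

def describe_flags(flags):
    """Return the names of the option bits set in an AssertionRequest."""
    return [name for name, bit in _FLAG_NAMES if flags & bit]

# Example: ask for all assertions of a field, plus matching Certificates.
flags = AR_ALL | AR_WANT_CERTIFICATES
```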

typedef struct {
        String key;
        AssertionRequest assertion_request_list<>;
        unsigned int certificate_algorithm_identifier_list<>;
        String url_protocol_list<>;
} QueryByResourceName;

A query-by-resource-name may also return zero or more Redirects. A Redirect tells the client that queries for a particular range of keys (either URNs or LIFNs) should be sent instead to a particular server. Multiple Redirects can be returned in response to a single query, and the key ranges of Redirects may overlap. Where a URN or LIFN is within the key range of multiple Redirects, the client may use any of them, with preference given to one with the lowest preference value.

The purpose of Redirects is to allow responsibility for URN/LIFN lookup service for a particular domain to be divided across several servers, without having to organize the URN/LIFN name space in advance to allow for this.

typedef struct {
        String low_key;
        String high_key;
        unsigned int ttl;
        unsigned int preference;

} Redirect;
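A client's choice among overlapping Redirects might look like the sketch below. Several details are assumptions: the key range is taken to be inclusive at both ends, keys are compared as plain strings, and the `server` field is hypothetical, since the Redirect structure above does not show how the target server is named.

```python
from collections import namedtuple

# 'server' is a hypothetical field added for illustration only.
Redirect = namedtuple("Redirect", "low_key high_key ttl preference server")

def choose_redirect(key, redirects):
    """Pick the matching Redirect with the lowest preference value, if any."""
    # Assumption: ranges are inclusive and compared as ordinary strings.
    candidates = [r for r in redirects if r.low_key <= key <= r.high_key]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.preference)

# Example: two overlapping ranges served by different (made-up) hosts.
redirects = [
    Redirect("urn:a", "urn:m", 3600, 10, "rc1.example"),
    Redirect("urn:f", "urn:z", 3600, 5, "rc2.example"),
]
```

A key such as "urn:g" falls in both ranges, so the Redirect with preference 5 is chosen; the client remains free to fall back to the other one.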
A response to a query-by-resource-name contains:
typedef struct {
        String key;
        unsigned int status_code;
        Assertion assertion_list<>;
        Certificate certificate_list<>;
        Redirect redirection_list<>;
        LocationInfo location_info_list<>;
} QueryByResourceNameResponse;

Note that the 'index' values in Assertions and Certificates are not necessarily the same as those originally supplied to the RC server. They are only temporary identifiers for use within a QueryByResourceNameResponse.
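Because the indices are only meaningful within one response, a server is free to renumber them as long as each Certificate's list is rewritten consistently. The sketch below illustrates this; the function `renumber` and the simplified tuple shapes are hypothetical, and it assumes every index named by a Certificate is present among the returned Assertions.

```python
def renumber(assertions, certificates):
    """assertions: list of (stored_index, payload) in response order.
    certificates: list of lists of stored assertion indices.
    Returns both with indices replaced by 0..n-1 in response order."""
    mapping = {stored: new for new, (stored, _payload) in enumerate(assertions)}
    new_assertions = [(mapping[s], payload) for s, payload in assertions]
    # Each certificate keeps its ordering; only the index values change.
    new_certificates = [[mapping[s] for s in cert] for cert in certificates]
    return new_assertions, new_certificates

# Example: stored indices 17 and 4 become 0 and 1 in the response.
result = renumber([(17, "title"), (4, "author")], [[4, 17]])
```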

LocationQuery requests are as follows:

typedef struct {
        String key;
        String protocol_list<>;
} LocationQuery;

LocationQueryResponses are of the form:

typedef struct {
        int status;
        LocationInfo location_info<>;
} LocationQueryResponse;
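The role of protocol_list is not spelled out here. One plausible reading, sketched below, is that the client lists the URL protocols it can use and the server returns only matching, unexpired locations. The function `filter_locations`, the treatment of an empty list as "any protocol", and the use of integer dates are all assumptions.

```python
from urllib.parse import urlsplit

def filter_locations(locations, protocol_list, now):
    """locations: list of (url, expiration_date, ttl) tuples.
    Keep locations whose URL scheme is acceptable and which have not expired."""
    wanted = {p.lower() for p in protocol_list}
    kept = []
    for url, expiration_date, ttl in locations:
        # Assumption: an empty protocol list places no restriction.
        if wanted and urlsplit(url).scheme not in wanted:
            continue
        if expiration_date <= now:
            continue              # resource no longer expected at this location
        kept.append((url, expiration_date, ttl))
    return kept

# Example locations (made-up hosts); dates are plain integers here.
locations = [
    ("ftp://a.example/f", 2000, 60),
    ("http://b.example/f", 2000, 60),
    ("ftp://c.example/f", 500, 60),
]
```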

Format of Update Request

URN update requests contain only the asserter-supplied portions of Assertions and Certificates:
typedef struct {
        String name;
        unsigned int type;

        Date assertion_time;
        Date expiration_date;
        char value<>;

        unsigned int index;

        /* modifier flags */
        unsigned int flags;
#define AU_DEL_PREV     01
#define AU_DEL_ALL      02
#define AU_WILDCARD     04
} AssertionUpdate;
The flags are as follows:
typedef struct {
        unsigned int assertion_list<>;
        unsigned int certificate_time;
        unsigned int algorithm_identifier;
        unsigned char signature<>;

        unsigned int index;
        unsigned int supercedes;
} CertificateUpdate;

typedef struct {
        String key;
        AssertionUpdate assertion_list<>;
        CertificateUpdate certificate_list<>;

        int asserter;
} URNUpdateRequest;

The asserter field indicates who is requesting the update. The value in this field is verified by an authentication method not defined here. If the requester is not the naming authority or the asserter field does not match the requester, the Update request fails.