Chapter 2: Volume Location Server Architecture

Section 2.1: Introduction

The Volume Location Server allows AFS agents to query the location and basic status of volumes resident within the given cell. Volume Location Server functions may also be invoked directly by authorized users via the vos utility.
This chapter briefly discusses various aspects of the Volume Location Server's architecture. First, the need for volume location is examined, and the specific parties that call the Volume Location Server interface routines are identified. Then, the database maintained to provide volume location service, the Volume Location Database (VLDB), is examined. Finally, the vlserver process which implements the Volume Location Server is considered.
As with all AFS servers, the Volume Location Server uses the Rx remote procedure call package for communication with its clients.

Section 2.2: The Need For Volume Location

The Cache Manager agent is the primary consumer of AFS volume location service, on which it is critically dependent for its own operation. The Cache Manager needs to map volume names or numerical identifiers to the set of File Servers on which the volume's instances reside in order to satisfy the file system requests it is processing on behalf of its clients. Each time a Cache Manager encounters a mount point for which it does not have location information cached, it must acquire this information before the pathname resolution may be successfully completed. Once the File Server set is known for a particular volume, the Cache Manager may then select the proper site among them (e.g. choosing the single home for a read-write volume, or randomly selecting a site from a read-only volume's replication set) and begin addressing its file manipulation operations to that specific server.
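The site selection step described above can be sketched as follows. This is a minimal illustration only, not the Cache Manager's actual code: the VolumeLocation record, its field names, and the select_site function are hypothetical names introduced here, and the server names are placeholders.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VolumeLocation:
    """Hypothetical location record for one volume (not the real VLDB entry)."""
    name: str
    rw_site: Optional[str] = None                      # single home of the read-write volume
    ro_sites: List[str] = field(default_factory=list)  # read-only replication set


def select_site(loc, want_read_only, rng=random):
    """Pick the File Server to contact for one volume instance."""
    if want_read_only:
        if not loc.ro_sites:
            raise LookupError("no read-only sites known for " + loc.name)
        # Any replica will do, so choose one at random.
        return rng.choice(loc.ro_sites)
    if loc.rw_site is None:
        raise LookupError("no read-write site known for " + loc.name)
    # A read-write volume has exactly one home.
    return loc.rw_site
```

For example, given a record with rw_site "fs1" and ro_sites ["fs2", "fs3"], a read-write request always yields "fs1", while a read-only request yields either "fs2" or "fs3".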
While the Cache Manager consults the volume location service, it is not capable of changing the location of volumes, and hence cannot modify the information the service holds. The capability to perform acts which change volume location is concentrated within the Volume Server. The Volume Server process running on each server machine manages all volume operations affecting that platform, including creations, deletions, and movements between servers. It must update the volume location database every time it performs one of these actions.
None of the other AFS system agents has a need to access the volume location database for its site. Surprisingly, this also applies to the File Server process. It is only aware of the specific set of volumes that reside on the set of physical disks directly attached to the machine on which it executes. It has no knowledge of the universe of volumes resident on other servers, either within its own cell or in foreign cells.

Section 2.3: The VLDB

The Volume Location Database (VLDB) is used to allow AFS application programs to discover the location of any volume within their cell, along with select information about the nature and state of that volume. It is organized in a very straightforward fashion, and uses the ubik [4] [5] facility to provide replication across multiple server sites.

Section 2.3.1: Layout

The VLDB itself is a very simple structure, and synchronized copies may be maintained at two or more sites. Basically, each copy consists of header information, followed by a linear (yet unbounded) array of entries. There are several associated hash tables used to perform lookups into the VLDB. The first hash table looks up volume location information based on the volume's name. There are three other hash tables used for lookup, based on volume ID/type pairs, one for each possible volume type.
The VLDB for a large site may grow to contain tens of thousands of entries, so some attempts were made to make each entry as small as possible. For example, server addresses within VLDB entries are represented as single-byte indices into a table containing the full longword IP addresses.
A free list is kept for deleted VLDB entries. The VLDB will not grow unless all the entries on the free list have been exhausted, keeping it as compact as possible.
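The layout ideas in this section (a linear array of entries, lookup tables by name and by ID for each volume type, a shared server address table indexed by a single byte, and a free list of deleted entries) can be illustrated with the toy model below. It is a sketch under stated assumptions, not the on-disk VLDB format: the TinyVLDB class, its method names, the use of Python dictionaries in place of hash tables, and the "RW"/"RO"/"BK" type labels are all hypothetical.

```python
class TinyVLDB:
    """Toy in-memory model of the layout described above (hypothetical)."""

    MAX_SERVERS = 256  # a single-byte index can address at most 256 slots

    def __init__(self):
        self.server_table = []  # full server addresses, each stored once
        self.entries = []       # linear array; None marks a deleted slot
        self.free_list = []     # indices of deleted slots available for reuse
        self.by_name = {}       # volume name -> entry index
        # One ID table per volume type (labels are assumptions).
        self.by_id = {"RW": {}, "RO": {}, "BK": {}}

    def _server_index(self, address):
        """Return the one-byte index for an address, adding it if needed."""
        if address in self.server_table:
            return self.server_table.index(address)
        if len(self.server_table) >= self.MAX_SERVERS:
            raise OverflowError("server table is full")
        self.server_table.append(address)
        return len(self.server_table) - 1

    def add(self, name, ids, addresses):
        """Store an entry, reusing a free slot before growing the array."""
        entry = {
            "name": name,
            "ids": dict(ids),
            # Each server is recorded as one byte, not as a full address.
            "servers": bytes(self._server_index(a) for a in addresses),
        }
        if self.free_list:
            slot = self.free_list.pop()
            self.entries[slot] = entry
        else:
            slot = len(self.entries)
            self.entries.append(entry)
        self.by_name[name] = slot
        for vtype, vid in ids.items():
            self.by_id[vtype][vid] = slot
        return slot

    def delete(self, name):
        """Remove an entry and put its slot on the free list."""
        slot = self.by_name.pop(name)
        for vtype, vid in self.entries[slot]["ids"].items():
            del self.by_id[vtype][vid]
        self.entries[slot] = None
        self.free_list.append(slot)

    def lookup_by_name(self, name):
        """Return the full server addresses recorded for a volume name."""
        entry = self.entries[self.by_name[name]]
        return [self.server_table[i] for i in entry["servers"]]
```

In this model, deleting an entry and then adding a new one reuses the vacated slot, so the entry array does not grow until the free list is empty.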

Section 2.3.2: Database Replication

The VLDB, along with other important AFS databases, may be replicated to multiple sites to improve its availability. The ubik replication package is used to implement this functionality for the VLDB. A full description of ubik and of the quorum completion algorithm it implements may be found in [4] and [5]. The basic abstraction provided by ubik is that of a disk file replicated to multiple server locations. One machine is considered to be the synchronization site, handling all write operations on the database file. Read operations may be directed to any of the active members of the quorum, namely a subset of the replication sites large enough to ensure integrity across such failures as individual server crashes and network partitions. All of the quorum members participate in regular elections to determine the current synchronization site. The ubik algorithms allow server machines to enter and exit the quorum in an orderly and consistent fashion.
All operations on one of these replicated "abstract files" are performed as part of a transaction. If all the related operations performed under a transaction are successful, then the transaction is committed, and the changes are made permanent. Otherwise, the transaction is aborted, and all of the operations for that transaction are undone.
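The transaction semantics described above can be sketched as follows. This is not the ubik interface, and it models only a single copy of the file, with no elections or propagation between sites: the ReplicatedFile and Transaction classes and their methods are hypothetical names chosen for illustration. Writes are buffered until commit, an abort discards them, and write transactions are refused anywhere but the synchronization site.

```python
class ReplicatedFile:
    """One site's copy of the replicated abstract file (hypothetical model)."""

    def __init__(self, is_sync_site):
        self.is_sync_site = is_sync_site
        self.data = bytearray()

    def begin(self, write=False):
        """Start a transaction; writes are only accepted at the sync site."""
        if write and not self.is_sync_site:
            raise PermissionError("writes must go to the synchronization site")
        return Transaction(self, write)


class Transaction:
    """Groups operations so that they take effect together or not at all."""

    def __init__(self, file, write_mode):
        self.file = file
        self.write_mode = write_mode
        self.pending = []  # buffered (offset, payload) writes

    def read(self, offset, length):
        """Read committed data; pending writes are not yet visible."""
        return bytes(self.file.data[offset:offset + length])

    def write(self, offset, payload):
        if not self.write_mode:
            raise PermissionError("read-only transaction")
        self.pending.append((offset, bytes(payload)))

    def commit(self):
        """Apply every buffered write, making the changes permanent."""
        data = self.file.data
        for offset, payload in self.pending:
            end = offset + len(payload)
            if end > len(data):
                data.extend(b"\0" * (end - len(data)))
            data[offset:end] = payload
        self.pending.clear()

    def abort(self):
        """Discard every buffered write, leaving the file untouched."""
        self.pending.clear()
```

Buffering writes until commit is only one way to obtain the all-or-nothing behavior; an implementation could equally apply writes immediately and keep a log to undo them on abort.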

Section 2.4: The vlserver Process

The user-space vlserver process is in charge of providing volume location service for AFS clients. This program maintains the VLDB replica at its particular server, and cooperates with all other vlserver processes running in the given cell to propagate updates to the database. It implements the RPC interface defined in the vldbint.xg definition file for the rxgen RPC stub generator program. As part of its startup sequence, it must discover the VLDB version it has on its local disk, move to join the quorum of replication sites for the VLDB, and get the latest version if the one it came up with was out of date. Eventually, it will synchronize with the other VLDB replication sites, and it will begin accepting calls.
The vlserver program uses at most three Rx worker threads to listen for incoming Volume Location Server calls. It has a single, optional command line argument. If the string "-noauth" appears when the program is invoked, then vlserver will run in an unauthenticated mode where any individual is considered authorized to perform any VLDB operation. This mode is necessary when first bootstrapping an AFS installation.
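The effect of the "-noauth" argument can be illustrated with the short sketch below. It is not taken from the vlserver source: the function names parse_noauth and is_authorized, and the caller_is_authorized parameter standing in for the normal authorization check, are assumptions made for the example.

```python
def parse_noauth(argv):
    """Return True if "-noauth" appears among the command line arguments."""
    # argv[0] is the program name, so only the remaining items are examined.
    return "-noauth" in argv[1:]


def is_authorized(caller_is_authorized, noauth):
    """Decide whether a caller may perform a VLDB operation.

    In unauthenticated mode every caller is treated as authorized;
    otherwise the result of the normal check (hypothetical here) applies.
    """
    return noauth or caller_is_authorized
```

With argv equal to ["vlserver", "-noauth"], parse_noauth returns True and is_authorized then accepts every caller, which is why this mode is suited only to situations such as initial bootstrapping.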