

Inflexibility

The fundamental difference between a standard and a custom format is who specifies the format of a file. With a standard, the defining authority specifies the format for everyone; without a standard, each user is free to design his or her own. Whenever a user lets someone else design a format for his or her data, the user runs a very real risk of losing capabilities needed to accomplish the task. That is, a standard specification can be too complete, too narrowly defined. If it is so tightly defined that it fails to provide the flexibility a user needs, that user will ignore it. (Standards which are ignored by their users soon cease to be standards.)

But how tight is too tight? How much flexibility is too much? The answer lies in the ability of the user to make his or her own choices about what information can be incorporated into the file.

In writing a data file, a host of tradeoffs and compromises have to be made. If a standard file format is to meet the unique needs of its users, then they must to a large extent be able to make their own strategic choices between various conflicting goals. Thus, only a basic core format should be specified by a central defining authority, within which the user should have freedom to specify how the data are to be represented.

That is to say, a user must be able to say anything she needs to say, but all users should speak the same "language" (file format).

Another, related issue is whether a user can express any data needed in the standard format. Modelling a data field as a simple array of variables may not be the most appropriate approach in every case. One can perhaps think of each datum as a cluster of component items located in some coordinate space; the structure of such a cluster (its dimensions) constitutes what we call "Level 0" dimensions. Examples might be the components of a wind stress tensor or a wind vector. Each datum is also associated with a set of coordinates specifying its location; we can split these dimensions into those which vary within a data record in the file and those which vary between data records. We call these "Level 1" and "Level 2" dimensions, respectively. Finally, there are those coordinates at which the data are not located, but over which the data have been averaged in some way; these we call "Level 3" dimensions. Any standard file format must distinguish between, and allow for the specification of, any or all of these levels of dimensions.
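
As an illustration only, the four levels of dimensions might be recorded in a file's metadata along the following lines. This is a minimal sketch, not part of any existing format: the names `Level`, `Dimension`, `dims_at`, and `record_shape`, as well as the example wind field and its grid sizes, are all hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical encoding of the four dimension levels described above."""
    COMPONENT = 0        # structure of each datum (e.g. vector components)
    WITHIN_RECORD = 1    # coordinates that vary within one data record
    BETWEEN_RECORDS = 2  # coordinates that vary from record to record
    AVERAGED = 3         # coordinates the data have been averaged over


@dataclass(frozen=True)
class Dimension:
    name: str
    level: Level
    size: int


def dims_at(dims, level):
    """Return the names of all dimensions declared at the given level."""
    return [d.name for d in dims if d.level == level]


def record_shape(dims):
    """Shape of one data record: Level 1 sizes followed by Level 0 sizes."""
    within = [d.size for d in dims if d.level == Level.WITHIN_RECORD]
    components = [d.size for d in dims if d.level == Level.COMPONENT]
    return tuple(within + components)


# Assumed example: a two-component wind vector on a longitude/latitude grid,
# one record per time step, averaged over a single pressure layer.
wind = [
    Dimension("component", Level.COMPONENT, 2),
    Dimension("longitude", Level.WITHIN_RECORD, 144),
    Dimension("latitude", Level.WITHIN_RECORD, 73),
    Dimension("time", Level.BETWEEN_RECORDS, 12),
    Dimension("pressure_layer", Level.AVERAGED, 1),
]

print(dims_at(wind, Level.WITHIN_RECORD))
print(record_shape(wind))
```

Declaring the level alongside each dimension, rather than inferring it from array position, is one way a reader of the file could tell a coordinate the data sit on from one the data have been averaged over.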

A user must also be able to label or flag any arbitrary subset of the data with an informational tag of some sort.
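
One possible way to attach such a tag, sketched under the assumption that data items can be addressed by position, is to store each tag as a label plus the set of positions it covers. The `Tag` class, the helper functions, and the sample values (including the use of -999.0 as a placeholder) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """A label attached to an arbitrary subset of positions in a data list."""
    label: str
    indices: frozenset


def tagged_values(data, tag):
    """Return the data values covered by a tag, in position order."""
    return [data[i] for i in sorted(tag.indices)]


def labels_for(index, tags):
    """Return the labels of every tag that covers the given position."""
    return [t.label for t in tags if index in t.indices]


# Assumed sample data: -999.0 stands in for a missing measurement.
temps = [12.1, 11.8, -999.0, 12.4, -999.0]

missing = Tag("missing", frozenset(i for i, v in enumerate(temps) if v == -999.0))
night = Tag("night-time", frozenset({3, 4}))

print(tagged_values(temps, missing))
print(labels_for(4, [missing, night]))
```

Because a tag is just a set of positions, subsets may overlap freely and need not be contiguous or aligned with any dimension.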


Eric Nash 2003-09-25