Entity–attribute–value Model - The Critical Role of Metadata in EAV Systems

The Critical Role of Metadata in EAV Systems

In the words of Prof. Dr. Daniel Masys (formerly Chair of Vanderbilt University's Medical Informatics Department), the challenges of working with EAV stem from the fact that in an EAV database, the "physical schema" (the way data are stored) is radically different from the "logical schema" – the way users, and many software applications such as statistics packages, regard it, i.e., as conventional rows and columns for individual classes. (Because an EAV table conceptually mixes apples, oranges, grapefruit and chop suey, if you want to do any analysis of the data using standard off-the-shelf software, in most cases you have to convert subsets of it into columnar form. The process of doing this, called pivoting, is important enough to be discussed separately.)

Metadata helps perform the sleight of hand that lets users interact with the system in terms of the logical schema rather than the physical: the software continually consults the metadata for various operations such as data presentation, interactive validation, bulk data extraction and ad hoc query. The metadata can actually be used to customize the behavior of the system.

EAV systems trade off simplicity in the physical and logical structure of the data for complexity in their metadata, which, among other things, plays the role that database constraints and referential integrity do in standard database designs. Such a tradeoff is generally worthwhile, because in the typical mixed schema of production systems, the data in conventional relational tables can also benefit from functionality such as automatic interface generation. The structure of the metadata is complex enough that it comprises its own subschema within the database: various foreign keys in the data tables refer to tables within this subschema. This subschema is standard-relational, with features such as constraints and referential integrity being used to the hilt.

The correctness of the metadata contents, in terms of the intended system behavior, is critical and the task of ensuring correctness means that, when creating an EAV system, considerable design efforts must go into building user interfaces for metadata editing that can be used by people on the team who know the problem domain (e.g., clinical medicine) but are not necessarily programmers. (Historically, one of the main reasons why the pre-relational TMR system failed to be adopted at sites other than its home institution was that all metadata was stored in a single file with a non-intuitive structure. Customizing system behavior by altering the contents of this file, without causing the system to break, was such a delicate task that the system's authors only trusted themselves to do it.)

Where an EAV system is implemented through RDF, the RDF Schema language may conveniently be used to express such metadata. This Schema information may then be used by the EAV database engine to dynamically re-organize its internal table structure for best efficiency.

Some final caveats regarding metadata:

  • Because the business logic is in the metadata rather than explicit in the database schema (i.e., one level removed, compared with traditionally designed systems), it is less apparent to one who is unfamiliar with the system. Metadata-browsing and metadata-reporting tools are therefore important in ensuring the maintainability of an EAV system. In the common scenario where metadata is implemented as a relational sub-schema, these tools are nothing more than applications built using off-the-shelf reporting or querying tools that operate on the metadata tables.
  • It is easy for an insufficiently knowledgeable user to corrupt (i.e., introduce inconsistencies and errors in) metadata. Therefore, access to metadata must be restricted, and an audit trail of accesses and changes put into place to deal with situations where multiple individuals have metadata access. Using an RDBMS for metadata will simplify the process of maintaining consistency during metadata creation and editing, by leveraging RDBMS features such as support for transactions. Also, if the metadata is part of the same databae as the data itself, this ensures that it will be backed up at least as frequently as the data itself, so that it can be recovered to a point in time.
  • The quality of the annotation and documentation within the metadata (i.e., the narrative/explanatory text in the descriptive columns of the metadata sub-schema) must be much higher, in order to facilitate understanding by various members of the development team. Ensuring metadata quality (and keeping it current as the system evolves) takes very high priority in the long-term management and maintenance of any design that uses an EAV component. Poorly-documented or out-of-date metadata can compromise the system's long-term viability .

Read more about this topic:  Entity–attribute–value Model

Famous quotes containing the words critical, role and/or systems:

    The male has been persuaded to assume a certain onerous and disagreeable rôle with the promise of rewards—material and psychological. Women may in the first place even have put it into his head. BE A MAN! may have been, metaphorically, what Eve uttered at the critical moment in the Garden of Eden.
    Wyndham Lewis (1882–1957)

    So successful has been the camera’s role in beautifying the world that photographs, rather than the world, have become the standard of the beautiful.
    Susan Sontag (b. 1933)

    What is most original in a man’s nature is often that which is most desperate. Thus new systems are forced on the world by men who simply cannot bear the pain of living with what is. Creators care nothing for their systems except that they be unique. If Hitler had been born in Nazi Germany he wouldn’t have been content to enjoy the atmosphere.
    Leonard Cohen (b. 1934)