EAV and Cloud Computing
Many cloud computing vendors offer data stores based on the EAV model, where an arbitrary number of attributes can be associated with a given entity. Roger Jennings provides an in-depth comparison of these. In Amazon's offering, SimpleDB, the data type is limited to strings, and data that is intrinsically non-string must be coerced to string (e.g., numbers must be padded with leading zeros) if you wish to perform operations such as sorting. Microsoft's offering, Windows Azure Table Storage, offers a limited set of data types: byte, bool, DateTime, double, Guid, int, long and string . The Google App Engine offers the greatest variety of data types: in addition to dividing numeric data into int, long, or float, it also defines custom data types such as phone number, E-mail address, geocode and hyperlink. Google, but not Amazon or Microsoft, lets you define metadata that would prevent invalid attributes from being associated with a particular class of entity, by letting you create a metadata model.
Google lets you operate on the data using a subset of SQL; Microsoft offer a URL-based querying syntax that is abstracted via a LINQ provider; Amazon offer a more limited syntax. Of concern, built-in support for combining different entities through joins is currently (April '10) non-existent with all three engines. Such operations have to be performed by application code. This may not be a concern if the application servers are co-located with the data servers at the vendor's data center, but a lot of network traffic would be generated if the two were geographically separated.
An EAV approach is justified only when the attributes that are being modeled are numerous and sparse: if the data being captured does not meet this requirement, the cloud vendors' default EAV approach is often a mismatch for applications that require a true back-end database (as opposed to merely a means of persistent data storage). Retrofitting the vast majority of existing database applications, which use a traditional data-modeling approach, to an EAV-type cloud architecture, would require major surgery. Microsoft discovered, for example, that its database-application-developer base was largely reluctant to invest such effort. More recently, therefore, Microsoft has provided a premium offering – a cloud-accessible full-fledged relational engine, SQL Server Azure, which allows porting of existing database applications with modest changes.
One significant limitation of SQL Azure is that physical databases are limited to 150GB in size (as of Oct 2012). Microsoft recommends that data sets larger than this be split into multiple physical databases and accessed with parallel queries.
Read more about this topic: Entity–attribute–value Model
Famous quotes containing the word cloud:
“When the lamp is shattered,
The light in the dust lies dead;
When the cloud is scattered,
The rainbows glory is shed;
When the lute is broken,
Sweet tones are remembered not;
When the lips have spoken,
Loved accents are soon forgot.”
—Percy Bysshe Shelley (17921822)