Tuesday, February 12, 2008

Metadata and Ontologies

Cory Doctorow provided one of the best critiques of the limitations of metadata in his essay “Metacrap” (www.well.com/~doctorow/metacrap.htm).

One limitation of an ontology is that it is a data model of entities over a domain (see http://en.wikipedia.org/wiki/Ontology_(computer_science)).

Although the relational data model (see E.F. Codd, “A Relational Model of Data for Large Shared Data Banks,” Comm. ACM, June 1970, pp. 377-387) can be used to represent any model of data within a domain, it does not address the semantics of any database, for three reasons. First, defining an entity (relation) is arbitrary (W. Kent, Data and Reality, North Holland Publishing, 1978). Second, partitioning an entity into a hierarchy (Codd’s normalization) is arbitrary, and there is no a priori best hierarchy for this partitioning (W.S. Jevons, The Principles of Science, 1874; Dover Publications reprint). Third, partitioning a set of sets (concept domain) into non-overlapping subsets is an NP-complete problem, so no polynomial-time algorithm for it is known.

The best that can be done is to test a given partitioning to see whether it has overlapping subsets. If you do not require non-overlapping subsets, then any arbitrary partitioning will do, but you will not be able to use it for reasoning about the domain.
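
Testing a given partitioning, unlike finding one, is cheap. Here is a minimal Python sketch of that test, assuming each subset is simply a named set of domain elements. The function names and the small animal domain are made up for illustration and do not come from any of the sources cited above.

```python
from itertools import combinations


def overlapping_pairs(subsets):
    """Return (name_a, name_b, shared_elements) for every pair of subsets that overlap."""
    overlaps = []
    for (name_a, set_a), (name_b, set_b) in combinations(subsets.items(), 2):
        shared = set_a & set_b
        if shared:
            overlaps.append((name_a, name_b, shared))
    return overlaps


def is_partition(domain, subsets):
    """True when the subsets are pairwise disjoint and together cover the whole domain."""
    covered = set().union(*subsets.values())
    return not overlapping_pairs(subsets) and covered == set(domain)


# Hypothetical concept domain and candidate partitioning (illustrative only).
domain = {"sparrow", "penguin", "bat", "salmon"}
candidate = {
    "flies": {"sparrow", "bat"},
    "swims": {"penguin", "salmon"},
    "birds": {"sparrow", "penguin"},
}

for name_a, name_b, shared in overlapping_pairs(candidate):
    print(f"{name_a} and {name_b} overlap on {sorted(shared)}")

print("Valid partition:", is_partition(domain, candidate))
```

The check compares every pair of subsets, so its cost grows only quadratically with the number of subsets. It says nothing about which partitioning to choose: dropping "birds" from the candidate above passes the test, but so would dropping "flies" and "swims" and adding a catch-all "other" subset, and the code gives no reason to prefer either.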

This makes the creation of an ontology (from a folksonomy or any other source) an arbitrary exercise by its author, one that reflects all of the author’s unstated assumptions and prejudices.
