A site devoted mostly to everything related to Information Technology under the sun - among other things.

Tuesday, February 12, 2008

Metadata and Ontologies

Cory Doctorow provided one of the best critiques of the limitations of metadata at www.well.com/~doctorow/metacrap.htm

One limitation of an ontology is that it is a data model of entities over a domain (see http://en.wikipedia.org/wiki/Ontology_(computer_science)).

Although the relational data model (see E. Codd, “A Relational Model of Data for Large Shared Data Banks,” Comm. ACM, June 1970, pp. 377-387) can be used to represent any model of data within a domain, it does not address the semantics of any database, for three reasons. First, defining an entity (relation) is arbitrary (W. Kent, Data and Reality, North Holland Publishing, 1978). Second, partitioning an entity into a hierarchy (Codd’s normalization) is arbitrary, and there is no a priori best hierarchy for this partitioning (W.S. Jevons, The Principles of Science, Dover Publications, 1874). Third, partitioning a set of sets (concept domain) into non-overlapping subsets is an NP-complete problem, for which no polynomial-time algorithm is known.
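To give a feel for the cost of that last point, here is a minimal sketch. It assumes the partitioning problem is read as exact cover: from a collection of candidate subsets, pick some that do not overlap and together cover the whole domain. The post does not name the problem, so that reading is an assumption, and the function name find_exact_cover and the sample data are made up for illustration. The search simply tries every subcollection, which is why the work doubles with each candidate added.

```python
from itertools import combinations

def find_exact_cover(domain, candidates):
    """Brute-force search for pairwise disjoint candidates whose union is the domain.

    Tries every subcollection, so the work grows as 2**len(candidates).
    Returns a tuple of the chosen names, or None if no exact cover exists.
    """
    domain = frozenset(domain)
    names = list(candidates)
    for size in range(1, len(names) + 1):
        for chosen in combinations(names, size):
            total = sum(len(candidates[n]) for n in chosen)
            union = frozenset().union(*(candidates[n] for n in chosen))
            # Sets that cover the domain are disjoint exactly when their
            # sizes add up to the size of the domain.
            if union == domain and total == len(domain):
                return chosen
    return None

# Hypothetical candidate subsets over the domain {1, ..., 7}.
candidates = {
    "A": {1, 4, 7},
    "B": {1, 4},
    "C": {4, 5, 7},
    "D": {3, 5, 6},
    "E": {2, 3, 6, 7},
    "F": {2, 7},
}
print(find_exact_cover(range(1, 8), candidates))  # ('B', 'D', 'F')
```

With six candidates there are only 63 subcollections to try; with sixty there are more than 10^18, which is the practical meaning of the NP-completeness claim.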

The best that can be done is to test a given partitioning to see whether it has overlapping subsets. If you do not require non-overlapping subsets, then any arbitrary partitioning will do, but you will not be able to use it for reasoning about the domain.
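Testing a given partitioning, by contrast, is cheap. The sketch below is one way to do it, not the only one: it compares every pair of subsets and reports the ones that share elements. The function names and the small vehicle domain are hypothetical, chosen only to illustrate the check.

```python
from itertools import combinations

def overlapping_pairs(subsets):
    """Return (name_a, name_b, shared_elements) for every pair that overlaps."""
    named = list(subsets.items())
    return [
        (name_a, name_b, set_a & set_b)
        for (name_a, set_a), (name_b, set_b) in combinations(named, 2)
        if set_a & set_b
    ]

def is_partition(domain, subsets):
    """True if the subsets are pairwise disjoint and together cover the domain."""
    covered = set().union(*subsets.values()) if subsets else set()
    return not overlapping_pairs(subsets) and covered == set(domain)

# Hypothetical concept domain and two candidate partitionings of it.
domain = {"car", "truck", "bicycle", "sailboat"}
by_power = {"motorized": {"car", "truck"}, "unmotorized": {"bicycle", "sailboat"}}
by_use = {"land": {"car", "truck", "bicycle"}, "leisure": {"bicycle", "sailboat"}}

print(is_partition(domain, by_power))  # True
print(overlapping_pairs(by_use))       # [('land', 'leisure', {'bicycle'})]
```

The check looks at each pair of subsets once, so its cost grows with the square of the number of subsets. Note that both groupings are plausible ways to carve up the same domain, and only one of them passes the test.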

This makes the creation of an ontology (from a folksonomy or any other source) an arbitrary exercise of the author, and it reflects all of the author’s unstated assumptions and prejudices.

About Me

I am a senior software developer working for General Motors Corporation. I am interested in intelligent computing and scientific computing. I am passionate about computers as enablers for human imagination. The contents of this site are not in any way, shape, or form endorsed, approved, or otherwise authorized by HP, its subsidiaries, or its officers and shareholders.
