Wednesday, February 25, 2015

Superschema or Common Schema?

After reading the articles that inspired them (here and here), it was fun to compare Adam's blog post on the useful simplicity behind Dublin Core to Tonya's post on the plausibility of creating a superschema to rule them all.  Using very basic set theory to describe these approaches, it seems to me that Dublin Core takes the approach of the intersection of metadata sets, while the superschema idea consists of the union of all metadata sets.  Both ideas seem to posit a system in which all elements would be optional, using only those appropriate to the object being described.  However, Dublin Core works by providing a limited number of common elements that can describe nearly anything generically, while the superschema would work by providing an almost unlimited number of elements that could describe nearly anything specifically.  What an interesting contrast!
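To make the set-theory framing concrete, here is a minimal sketch in Python. The three schemas and their element names are made up for illustration; they are not real standards, and Dublin Core was not literally derived by intersecting other schemas. The point is only the contrast in size and specificity between the two approaches.

```python
# Hypothetical element sets; schema names and elements are illustrative only.
museum = {"title", "creator", "date", "medium", "dimensions"}
geodata = {"title", "creator", "date", "bounding_box", "projection"}
audio = {"title", "creator", "date", "duration", "bit_rate"}

schemas = [museum, geodata, audio]

# "Common schema" approach: keep only the elements every schema shares.
common_core = set.intersection(*schemas)

# "Superschema" approach: keep every element from every schema.
superschema = set.union(*schemas)

print(sorted(common_core))   # the small, generic core
print(sorted(superschema))   # the large, specific superset
```

With only three toy schemas the superset is already three times the size of the core, and every additional specialty schema would widen that gap.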

From a practical standpoint, DC has a lot to offer in terms of interoperability, maintainability, and the ease of building fast indexes with understandable browsing facets. The superschema idea would allow a lot of freedom, but describers would need to have a very broad knowledge base, and even local systems based on it would be highly complex.

From the user's standpoint, what would the superschema system look like?  I suspect that it would look a lot like Google.  The search algorithm would probably need to rely on keywords, with relevancy heavily informed by the tags from the various schemas (so your search for "sage in herbal remedies" wouldn't be swamped by articles published by SAGE Publishing).  While I don't know how their proprietary indexing systems work, this sounds to me an awful lot like library discovery layers and the direction they are moving in.

To me, the good news here is that the mix-and-match can, and probably will, happen at a higher level than the metadata schema.  Individual systems could continue to use the specialty schemas that work best for them.  Knowing other schemas is still important in case of migration, but hopefully combining datasets will come to rely on something more sophisticated than the most basic of crosswalks. It will be interesting to see where it all goes!

1 comment:

  1. I've enjoyed the multi-blog conversation around this topic! I'll just add one additional aspect related to the dreaded "theory/practice" divide ... ultimately, would the additional labor involved in crafting a super-schema be worth the additional cost? IOW, would enough people benefit from it versus a more "mix and match" approach? There's no definitive answer, but lots of well-formed opinions!
