Thursday, January 3, 2019

Citation.js: Wikidata Subclasses

Citation.js: Wikidata Subclasses

I am in the process of creating a better way to map CSL types to Wikidata types together with Jakob Voß, and while listing all subtypes of web page, I came across a number of these cases (link):

#defaultView:Graph
SELECT ?a ?aLabel ?b ?bLabel WHERE {
  wd:Q58803899 wdt:P279* ?a .
  ?a wdt:P279 ?b .
  FILTER EXISTS {
    ?b wdt:P279* wd:Q386724 .
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

I’m no expert on Wikidata data models, nor on the semantics of each of those items in specific, but I’m pretty sure Count of Barcelos (Q58803899) is not a subclass of web page (Q36774). However, the individual parts seem pretty reasonable, so where does it go wrong then?

Luckily, the description of subclass of (P279) provides some good guidance:

all instances of these items are instances of those items; this item is a class (subset) of that item.

So let’s check those:

  1. All counts of Barcelos are Portuguese counts (conde), as Barcelos lies in Portugal
  2. All condes are counts
  3. Counts are not hereditary titles themselves, so I think this should be a instance of (P31). Also, counts are indirectly subclass of royal or noble rank through hereditary titles, but directly instance of royal or noble rank.
  4. Hereditary titles do not seem like an exact subclass of royal or noble ranks to me, but it passes the basic test
  5. I am unsure about royal or noble rank as subclass of rank, since the latter seems way more abstract, but perhaps that’s the point
  6. (rank -> first-order metaclass) I have no idea what the intention with these metaclasses is, so I’m going to assume this is all correct (bold move)
  7. first-order metaclass -> fixed-order metaclass
  8. fixed-order metaclass -> Wikidata metaclass: “class of classes, class whose instances are classes”
  9. Wikidata metaclass -> class or metaclass of Wikidata ontology (should maybe be P31 as well, still no clue though)
  10. class or metaclass of Wikidata ontology -> Wikidata item: finally back into some form of concrete-ness. Problem being, not all instances of royal or noble ranks are Wikidata items, so something went wrong somewhere in between, maybe even at (6).
  11. Wikidata item -> Wikidata internal entity
  12. Wikidata internal entity -> Wikidata internal item
  13. Wikidata internal item -> MediaWiki page
  14. MediaWiki page -> web page: actually pretty reasonable, instances of Wikidata items are web pages.

There are a little less than 900 instances of web page so the impact is smaller than I had expected, but it’s still annoying. There are 191 subclasses of conde, not to mention the total of 280 royal or noble ranks that have no business being a subclass of web page.