The new update, v0.4.4
, contains a few Wikidata improvements (commit):
- 22 new mappings
- 2 fixed mappings (
ISSN
did not work andpublisher-place
was mapped to the wrong thing) - 2 improved mappings (
container-title
for chapters and moreURL
mappings) - 1 removed mapping (
genre
was inconsistent with the intended use, although it followed the specification)
Because most of these mappings require additional resources (recipient
has to fetch people, review-*
has to fetch the review subject and original-publisher-place
has to fetch the country and location of the publisher of the original version of a work) this would have caused a lot more requests to the API. It is a good thing, then, that there is a new resolver in place, which pre-fetches such information for all requested works at once.
This has to be done in levels; you cannot fetch the country if you do not have the location, the location requires the publisher and to find out the publisher you have to have the work already. However, temporarily ignoring the cap of 50 items per request, the number of requests is now only based on those levels (which is at most five), instead of the number of claims requiring additional requests.
For example, item Q21972834
(Assembling the 20 Gb white spruce (Picea glauca) genome from whole-genome shotgun sequencing data) takes:
- 6 requests in
v0.4.0-rc.2
, before requests for properties were grouped (so it would make a request for every author) - 3 requests in
v0.4.3
: one for the work, one for the authors, and one for journal - 2 requests in
v0.4.4
, which includes a bunch of new mappings that would have cost a total of 5 requests in the previous version
Grouping the requests from all works has the added benefit of not having to fetch the same author and journal info repeatedly — which can be quite common for bibliographies. For example, the bibliography of Groovy Cheminformatics with the Chemistry Development Kit, currently containing 74 items, takes 11 requests in the new version. For the same bibliography, the previous version would make at least 150 requests — for each item one for all the authors and one for the journal — and probably more than that.
[core] GET https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q24599948%7CQ24650571%7CQ25713029%7CQ27061829%7CQ27062363%7CQ27062596%7CQ27065423%7CQ27093381%7CQ27134682%7CQ27134746%7CQ27134827%7CQ27162658%7CQ27211680%7CQ27499209%7CQ27656255%7CQ27783585%7CQ27783587%7CQ27902272%7CQ28090714%7CQ28133283%7CQ28186592%7CQ28837846%7CQ28837922%7CQ28837925%7CQ28837939%7CQ28837943%7CQ28837947%7CQ28842810%7CQ28842968%7CQ28843132%7CQ29039683%7CQ29042322%7CQ29616639%7CQ30149558%7CQ30853915%7CQ31127242%7CQ33874102%7CQ34160151%7CQ34206190%7CQ36662828%7CQ37988904%7CQ39811432%7CQ42704791%7CQ47543807%7CQ47632144%7CQ54062338%7CQ55880270%7CQ55934414%7CQ55954394%7CQ56112883&format=json&languages=en {}
[core] GET https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q56170978%7CQ56454405%7CQ57836257%7CQ59771351%7CQ60167226%7CQ60167327%7CQ60167615%7CQ60167690%7CQ61463648%7CQ61649587%7CQ61779181%7CQ61779373%7CQ61779901%7CQ61779905%7CQ61779940%7CQ61780124%7CQ62926016%7CQ62926155%7CQ62927888%7CQ62968825%7CQ62969352%7CQ62969354%7CQ63367539%7CQ63367548&format=json&languages=en {}
[core] GET https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q135122%7CQ1223539%7CQ59196164%7CQ1860%7CQ8513%7CQ766195%7CQ7441%7CQ28946414%7CQ50332566%7CQ46330054%7CQ57218960%7CQ57218835%7CQ55965812%7CQ43370919%7CQ37391332%7CQ57677801%7CQ60446154%7CQ60447576%7CQ60449513%7CQ29946263%7CQ60465926%7CQ60890297%7CQ20895241%7CQ910067%7CQ3007982%7CQ52113739%7CQ5111731%7CQ29405902%7CQ47473872%7CQ121182%7CQ128570%7CQ910164%7CQ2383032%7CQ1130645%7CQ908710%7CQ5727848%7CQ27065426%7CQ28946652%7CQ42783959%7CQ4420286%7CQ749647%7CQ309823%7CQ28540892%7CQ29052381%7CQ29052386%7CQ4914910%7CQ12149006%7CQ853614%7CQ1418791%7CQ50731867&format=json&languages=en {}
[core] GET https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q56007851%7CQ5195068%7CQ11351%7CQ75%7CQ151332%7CQ3002926%7CQ27061944%7CQ27134800%7CQ6294930%7CQ27061853%7CQ28540435%7CQ28854723%7CQ766383%7CQ1425625%7CQ28540616%7CQ55406442%7CQ483666%7CQ11173%7CQ39972290%7CQ1069211%7CQ36534%7CQ27211732%7CQ251%7CQ1768406%7CQ1689854%7CQ30046697%7CQ55406542%7CQ5227350%7CQ38372872%7CQ38373802%7CQ3841253%7CQ55213915%7CQ7440973%7CQ30045595%7CQ451553%7CQ466769%7CQ203250%7CQ54837%7CQ3355939%7CQ40023319%7CQ26842658%7CQ109081%7CQ82264%7CQ898902%7CQ27711423%7CQ1767639%7CQ7209103%7CQ27768873%7CQ900316%7CQ48803&format=json&languages=en {}
[core] GET https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q1884753%7CQ15064016%7CQ895901%7CQ2166958%7CQ907701%7CQ309%7CQ28796322%7CQ19845641%7CQ27061849%7CQ28923506%7CQ30046701%7CQ28865170%7CQ43370334%7CQ29387575%7CQ50983316%7CQ7395247%7CQ192864%7CQ336658%7CQ58409692%7CQ56421913%7CQ55182163%7CQ55182165%7CQ56670283%7CQ925779%7CQ50731930%7CQ909510%7CQ381009%7CQ38173%7CQ30046335%7CQ43370883%7CQ3705921%7CQ1154615%7CQ2334061%7CQ1988917%7CQ898967%7CQ2425378%7CQ10354104%7CQ47 480%7CQ1709878%7CQ900502&format=json&languages=en {}
[core] GET https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q1120519%7CQ1860%7CQ113337%7CQ1767639%7CQ27134800%7CQ6294930%7CQ251%7CQ5776092%7CQ56421877%7CQ463360%7CQ109081%7CQ50731930%7CQ176916%7CQ26707540%7CQ30046697%7CQ28925563%7CQ1122491%7CQ3186908%7CQ969707%7CQ1948400%7CQ192864%7CQ898902%7CQ2835897%7CQ3007982%7CQ46155617%7CQ905549%7CQ485223%7CQ900502%7CQ15766522&format=json&languages=en {}
[core] GET https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q7050%7CQ32%7CQ64%7CQ225471&format=json&languages=en {}
[core] GET https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q47501431&format=json&languages=en {}
[core] GET https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q33467704&format=json&languages=en {}
[core] GET https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q55&format=json&languages=en {}
[core] GET https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q183%7CQ145&format=json&languages=en {}
However, I do notice slightly more requests with a low number of items than I would expect. I will look into that.
Conference papers in Wikidata
Part of the mappings that were added in the recent update are the event-*
properties, i.e. “name of the related event (e.g. the conference name when citing a conference paper)” (spec). Finding out how those are modelled in Wikidata proved a bit of a challenge, as most instances of Q23927052
(conference paper) are published in proceedings with the book
type. To get some numbers (query):
type | typeLabel | items |
---|---|---|
Q571 | book | 715 |
Q5633421 | scientific journal | 676 |
Q1143604 | proceedings | 315 |
Q16024164 | medical journal | 53 |
Q23927052 | conference paper | 48 |
Q1002697 | periodical literature | 32 |
Q41298 | magazine | 29 |
other | other | 61 |
48 conference papers are published in conference papers? That does not seem right. But maybe they just have the wrong type, but still link to the event? Well no, not really (query).
hasEvent | items |
---|---|
false | 1952 |
true | 6 |
Luckily, the information is not lost: usually, the label contains location and date information about the event, although extracting that would be probably be a tedious task. However, perhaps there are a lot of proper proceedings out there, but the articles linked to it just have the wrong type (query)?
type | typeLabel | items |
---|---|---|
Q13442814 | scholarly article | 3513 |
Q23927052 | conference paper | 315 |
Q3331189 | version, edition, or translation | 11 |
Q1143604 | proceedings | 9 |
Q10885494 | scientific conference paper | 7 |
Q1980247 | chapter | 6 |
Q191067 | article | 5 |
Q18918145 | academic journal article | 4 |
Q333291 | abstract | 3 |
other | other | 10 |
That seems to be the case: most works that are published in instances of proceedings are tagged as instances of scholarly articles, and only nine are tagged as both. And: relatively many have events linked to them (query).
hasEvent | items |
---|---|
true | 2692 |
false | 1184 |
That’s better. I’ll look into working together with WikiCite to work on the rest.