Tuesday, October 21, 2025

Updates to taxonomic coverage and search result scoring

Two parts of the Library of Identification Resources have received major updates in the last year. First, the taxon coverage field (also labeled “For identifying …” in some places) is now linked to external databases, namely Wikidata and GBIF. Second, the scoring and sorting of search results in the Find resources tool have been made more transparent and visible.

Taxon coverage

The taxon coverage field specifies the taxa to which the identification resource applies. For example, for the Key to the British Scathophagidae (Diptera) by Stuart G. Ball (B17) that would be the family Scathophagidae.

Previously, this was a plain-text field. Now, the values are all linked to a separate table. In this table, the rank of the taxon is given, and mappings are made with Wikidata and GBIF. The main taxonomy data in GBIF does not include minor ranks such as superfamilies, subfamilies, and subgenera; for those taxa, the GBIF identifiers of their children are recorded instead. The same goes for outdated taxa that are now considered paraphyletic, or that were synonymized or split up.
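To make the fallback to child identifiers concrete, here is a minimal sketch in Python of what one row of such a table might look like. The class, the field names, the function, the subfamily name "Examplinae", and all identifiers are placeholders invented for illustration; they are assumptions, not the actual schema or data of the Library.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TaxonRecord:
    """One row of a hypothetical taxon table (field names are assumptions)."""
    name: str
    rank: str
    wikidata_id: Optional[str] = None
    # Usually a single identifier; several when the identifiers of the
    # children stand in for a taxon without its own GBIF entry.
    gbif_ids: List[str] = field(default_factory=list)


def resolve_gbif_ids(own_id: Optional[str], child_ids: List[str]) -> List[str]:
    """Use the taxon's own GBIF identifier if it has one, else its children's."""
    if own_id is not None:
        return [own_id]
    return list(child_ids)


# Placeholder identifiers, not real Wikidata or GBIF values.
family = TaxonRecord("Scathophagidae", "family", "Q-PLACEHOLDER",
                     resolve_gbif_ids("GBIF-FAMILY", []))
# "Examplinae" is a made-up subfamily standing in for a minor rank.
subfamily = TaxonRecord("Examplinae", "subfamily", None,
                        resolve_gbif_ids(None, ["GBIF-GENUS-1", "GBIF-GENUS-2"]))
```

Storing a list in both cases keeps the lookup code identical whether a taxon maps to one GBIF entry or to several.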

Additionally, the parent taxa of each taxon are recorded, allowing statistics on the number of resources in larger groups. This is also shown on the taxon pages, such as that of Animalia (T55):

Screenshot of page about the taxonomic kingdom Animalia, with a permalink, GBIF identifier, links to Wikidata and Scholia, classification ("Biota > Animalia"), and a donut chart labelled "Children" with sections "Arthropoda" (~85%), "Chordata" (~7%), etc.
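As a rough illustration of how recorded parent taxa make such statistics possible, the sketch below counts each resource once for every group it falls under. The function name, the data layout, the resource "X1", and the heavily simplified classification (intermediate ranks are left out) are assumptions for the example, not the Library's actual implementation.

```python
from collections import Counter
from typing import Dict, List, Optional


def count_resources_by_group(
    parents: Dict[str, Optional[str]],
    resource_taxa: Dict[str, List[str]],
) -> Counter:
    """Count, for every taxon, the resources that cover it or a descendant."""
    counts: Counter = Counter()
    for taxa in resource_taxa.values():
        groups = set()
        for taxon in taxa:
            current: Optional[str] = taxon
            # Walk up the parent links; stop at the root or at a taxon
            # already collected for this resource.
            while current is not None and current not in groups:
                groups.add(current)
                current = parents.get(current)
        # Each resource counts once per group, even if it covers several
        # taxa within that group.
        counts.update(groups)
    return counts


# Simplified, illustrative classification (intermediate ranks omitted).
parents = {
    "Biota": None,
    "Animalia": "Biota",
    "Arthropoda": "Animalia",
    "Chordata": "Animalia",
    "Diptera": "Arthropoda",
    "Scathophagidae": "Diptera",
}
# "B17" is the key mentioned above; "X1" is a made-up second resource.
resource_taxa = {"B17": ["Scathophagidae"], "X1": ["Chordata"]}
counts = count_resources_by_group(parents, resource_taxa)
```

With this toy data, Animalia is counted for both resources and Arthropoda for only one, which is the kind of breakdown a "Children" chart would need.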

Search result sorting

In the Find resources tool, available at identification-resources.github.io/find-resources, search results are scored on a number of different factors. The score is now visually displayed next to the taxonomic completeness. Clicking on a score now shows its factors in three groups, each of which is explained in more detail.

In addition, results can now be sorted by those specific groups of factors, instead of only by the total score. Results can now also be filtered by language or by characteristics of observations and/or organisms, including keys specifically for females or males, or keys for nests, galls, eggs, nymphs, etc.
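Sorting by one group of factors and filtering by language could be sketched as follows. The group names ("group_a", "group_b", "group_c"), the result layout, and the idea that the total score is the plain sum of the groups are all assumptions made for this example; the actual factors and weights used by the Find resources tool are not reproduced here.

```python
from typing import Dict, List, Optional

Result = Dict[str, object]


def total_score(result: Result) -> float:
    """Assumption: the total is the plain sum of the group scores."""
    return sum(result["scores"].values())


def sort_results(results: List[Result], group: Optional[str] = None) -> List[Result]:
    """Sort by one group of factors, or by the total score if none is given."""
    if group is None:
        key = total_score
    else:
        key = lambda r: r["scores"][group]
    # Highest score first.
    return sorted(results, key=key, reverse=True)


def filter_by_language(results: List[Result], language: str) -> List[Result]:
    """Keep only results available in the given language."""
    return [r for r in results if language in r["languages"]]


# Made-up results with placeholder group names and scores.
results = [
    {"id": "r1", "languages": ["en"],
     "scores": {"group_a": 3, "group_b": 1, "group_c": 0}},
    {"id": "r2", "languages": ["nl", "en"],
     "scores": {"group_a": 1, "group_b": 2, "group_c": 2}},
    {"id": "r3", "languages": ["de"],
     "scores": {"group_a": 2, "group_b": 0, "group_c": 1}},
]

by_total = [r["id"] for r in sort_results(results)]
by_group_a = [r["id"] for r in sort_results(results, "group_a")]
english = [r["id"] for r in filter_by_language(results, "en")]
```

In this toy data the order by total score differs from the order by "group_a" alone, which is the situation where sorting by a single group is useful.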
