@Maciej1988 Measuring notability
Many individuals have low visibility and sparse information. To separate the most visible individuals from the less visible, we build a synthetic notability index from five dimensions and use it to rank this broader set of individuals. These dimensions are:
1. the number of Wikipedia editions of each individual;
2. the length, i.e., the total number of words found in all available biographies. It is equal to zero for individuals with just one Wikidata entry and no biography in Wikipedia;
3. the average number of biography views (hits) for each individual between 2015 and 2018 in all available language editions, obtained through the API documented at https://wikitech.wikimedia.org/wiki/Analytics/AQS/Pageviews, or zero in the absence of a Wikipedia biography;
4. the number of non-missing items retrieved from Wikipedia or Wikidata for birth date, gender and domain of influence. The intuition here is that the more notable the individual, the better documented his/her biographies will be;
5. the total number of external links (sources, references, etc.) from Wikidata.
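As an illustration of the third dimension, here is a minimal Python sketch of how the page-view average might be computed. It assumes the public Wikimedia REST endpoint for per-article page views and monthly granularity. The function names (`pageviews_url`, `mean_views`), the default date range, and the choice to pool monthly counts across language editions before averaging are assumptions made for this example, not details given in the text. No network request is made; `mean_views` expects JSON responses that have already been downloaded and parsed.

```python
from urllib.parse import quote

# Public Wikimedia REST endpoint for per-article page views.
API_ROOT = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(project, title, start="2015070100", end="2018123100"):
    """Build the request URL for one biography in one language edition.

    project is e.g. "en.wikipedia"; the default date range is an assumption.
    """
    article = quote(title.replace(" ", "_"), safe="")
    return f"{API_ROOT}/{project}/all-access/user/{article}/monthly/{start}/{end}"


def mean_views(payloads):
    """Average the monthly view counts pooled over all editions.

    payloads is a list of parsed JSON responses, one per language edition.
    Returns 0.0 when there is no Wikipedia biography (empty list).
    """
    views = [item["views"] for payload in payloads for item in payload.get("items", [])]
    return sum(views) / len(views) if views else 0.0
```

For example, `pageviews_url("en.wikipedia", "Marie Curie")` gives the URL to fetch for the English edition, and the parsed responses for all editions can then be passed to `mean_views` as a list.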
We then compute the quantile values for each dimension and sum them to define our notability measure, which is used to compare and rank individuals over time and across space.
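The aggregation step can be sketched in a few lines of Python. The text does not say how the quantile value is defined, so this example assumes it is the empirical percentile rank: the share of individuals whose value on a dimension is less than or equal to the individual's own. The dimension keys, function names and the three sample individuals are made up for illustration.

```python
from bisect import bisect_right

# Hypothetical keys for the five dimensions listed above.
DIMENSIONS = ("editions", "words", "views", "items", "links")


def quantile_ranks(values):
    """Return, for each value, the share of values less than or equal to it."""
    ordered = sorted(values)
    n = len(ordered)
    return [bisect_right(ordered, v) / n for v in values]


def notability_index(people):
    """Sum the per-dimension quantile ranks for every individual.

    people maps a name to a dict holding the five dimension values.
    """
    names = list(people)
    scores = {name: 0.0 for name in names}
    for dim in DIMENSIONS:
        ranks = quantile_ranks([people[name][dim] for name in names])
        for name, rank in zip(names, ranks):
            scores[name] += rank
    return scores


# Made-up sample data, for illustration only.
people = {
    "A": {"editions": 120, "words": 50000, "views": 9000.0, "items": 3, "links": 40},
    "B": {"editions": 3, "words": 800, "views": 12.5, "items": 2, "links": 1},
    "C": {"editions": 0, "words": 0, "views": 0.0, "items": 1, "links": 0},
}
scores = notability_index(people)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Because each dimension is converted to a rank between 0 and 1 before summing, dimensions with very different scales (word counts versus number of editions) contribute equally to the index under this reading.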