Data for "Updated science-wide author databases of standardized citation indicators"
Description
Citation metrics are widely used and misused. We have created a publicly available database of 100,000 top scientists that provides standardized information on citations, h-index, co-authorship-adjusted hm-index, citations to papers in different authorship positions, and a composite indicator. Separate data are shown for career-long and single-year impact. Metrics with and without self-citations and the ratio of citations to citing papers are given. Scientists are classified into 22 scientific fields and 176 sub-fields. Field- and subfield-specific percentiles are also provided for all scientists who have published at least 5 papers. Career-long data are updated to end-of-2019. The dataset and code provide an update to the previously released (version 1) data under https://doi.org/10.17632/btchxktzyw.1. The version 2 dataset is based on the May 06, 2020 snapshot from Scopus and is updated to citation year 2019. In addition to the updated time period and data cut, it provides a longer list of authors: it also includes the top 2% for every subfield.
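As an illustration of two of the indicators named above, the following is a minimal sketch in plain Python, not the code shipped with the dataset. It assumes the standard definition of the h-index and Schreiber's fractional-counting definition of the hm-index, in which each paper contributes 1/(number of authors) to an effective rank. The function names and the input layout are hypothetical and chosen only for this example.

```python
from typing import Sequence, Tuple


def h_index(citations: Sequence[int]) -> int:
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def hm_index(papers: Sequence[Tuple[int, int]]) -> float:
    """Co-authorship-adjusted h-index (assumed: Schreiber's hm).

    Each paper is a (citations, n_authors) pair. Papers are ranked by
    citations in descending order; a paper with n authors advances the
    effective rank by 1/n. The result is the largest effective rank
    that does not exceed the citation count of the paper at that rank.
    """
    ranked = sorted(papers, key=lambda p: p[0], reverse=True)
    effective_rank = 0.0
    hm = 0.0
    for cites, n_authors in ranked:
        effective_rank += 1.0 / n_authors
        if cites >= effective_rank:
            hm = effective_rank
        else:
            break
    return hm


# Hypothetical author with five papers: (citations, number of authors).
example_papers = [(10, 1), (8, 2), (5, 2), (4, 4), (3, 1)]
example_h = h_index([cites for cites, _ in example_papers])
example_hm = hm_index(example_papers)
```

For the hypothetical author above, four papers have at least four citations each, so the h-index is 4, while the effective ranks 1, 1.5, 2, 2.25 and 3.25 give an hm-index of 2.25. Papers with equal citation counts keep their input order here; the database's own tie-breaking rule may differ.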
Steps to reproduce
Code is provided with the dataset and runs on the ICSR Lab data sharing platform (https://www.elsevier.com/icsr/icsrlab) using Scopus data. It is written in Python (PySpark) and can be used with other datasets on any PySpark platform.
Additional metadata for Elsevier datasets
Date the data was collected: 2020-05-06T10:00:00.000Z