Supplementary Data for "Differential correction of gender imbalance for top-cited scientists across scientific subfields over time"

Published: 13 October 2023 | Version 3 | DOI: 10.17632/wwykk8d48g.3
Contributors:
, John Ioannidis, Kevin Boyack, Jeroen Baas

Description

Supplementary data examining gender imbalance among top-cited authors. The term "breakdown" as it appears here means the aggregate counts in the following list: author_count_total, author_count_top_2_pct, author_count_top_2_pct_sy, author_count_total_male, author_count_top_2_pct_male, author_count_top_2_pct_sy_male, author_count_total_female, author_count_top_2_pct_female, author_count_top_2_pct_sy_female, author_count_total_unknown, author_count_top_2_pct_unknown, author_count_top_2_pct_sy_unknown.

The analysis produces the following computed fields:

Femaleprop: the number of female authors in the top 2% of cited authors divided by all genderized authors in the top 2% of cited authors, calculated over the entire career.
Femalepropsy: the number of female authors in the top 2% of cited authors in 2021 divided by all genderized authors in the top 2% of cited authors, calculated for 2021 only.
Difference: Femalepropsy - Femaleprop.
Fmtoppropensity: for each cohort and subfield, the product (total male authors / total female authors) * (female authors in the top 2% of cited authors in 2021 / male authors in the top 2% of cited authors in 2021). Put more simply, the overall male-to-female ratio multiplied by the female-to-male ratio among authors in the top 2% of cited authors in 2021.
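The computed fields can be illustrated with a minimal sketch in plain Python. It is not the code shipped with the dataset: the function names compute_gender_metrics and safe_ratio are hypothetical, the counts are made-up numbers for a single cohort and subfield, and it assumes that "genderized" means male plus female (excluding unknown) and that the _sy columns hold the single-year 2021 counts.

```python
from typing import Optional


def safe_ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def compute_gender_metrics(counts: dict) -> dict:
    """Derive the four computed fields from one row of breakdown counts.

    Assumption: genderized authors = male + female, unknown excluded.
    """
    top_f = counts["author_count_top_2_pct_female"]
    top_m = counts["author_count_top_2_pct_male"]
    sy_f = counts["author_count_top_2_pct_sy_female"]
    sy_m = counts["author_count_top_2_pct_sy_male"]
    total_f = counts["author_count_total_female"]
    total_m = counts["author_count_total_male"]

    # Share of women among genderized top-2% authors (career-long, then 2021).
    femaleprop = safe_ratio(top_f, top_f + top_m)
    femalepropsy = safe_ratio(sy_f, sy_f + sy_m)

    difference = None
    if femaleprop is not None and femalepropsy is not None:
        difference = femalepropsy - femaleprop

    # Overall male-to-female ratio times female-to-male ratio in the 2021 top 2%.
    base_ratio = safe_ratio(total_m, total_f)
    top_ratio = safe_ratio(sy_f, sy_m)
    fmtoppropensity = None
    if base_ratio is not None and top_ratio is not None:
        fmtoppropensity = base_ratio * top_ratio

    return {
        "Femaleprop": femaleprop,
        "Femalepropsy": femalepropsy,
        "Difference": difference,
        "Fmtoppropensity": fmtoppropensity,
    }


# Hypothetical counts, for illustration only (not taken from the dataset).
example = {
    "author_count_total_male": 8000,
    "author_count_total_female": 2000,
    "author_count_top_2_pct_male": 170,
    "author_count_top_2_pct_female": 30,
    "author_count_top_2_pct_sy_male": 150,
    "author_count_top_2_pct_sy_female": 50,
}
metrics = compute_gender_metrics(example)
print(metrics)
```

With these illustrative counts, women make up 15% of the career-long top 2% and 25% of the 2021 top 2%, so Difference is about 0.10. Under this reading, a Fmtoppropensity above 1 means women are more likely than men to reach the 2021 top 2% relative to their overall numbers. Ratios with a zero denominator are returned as None rather than raising an error, which is a design choice of this sketch only.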

Steps to reproduce

Code is provided with the dataset. The underlying datasets are generated by the code in the related link, which runs on the ICSR Lab data-sharing platform (https://www.elsevier.com/icsr/icsrlab) using Scopus data. The code is written in Python (PySpark) and can be used with other datasets on any PySpark platform.

Institutions

Elsevier BV, Stanford University

Categories

Bibliometrics

Licence