Supplementary Data for "Differential correction of gender imbalance for top-cited scientists across scientific subfields over time"

Name: Supplementary Data for "Differential correction of gender imbalance for top-cited scientists across scientific subfields over time"
Creator: Thomas Collins
Published: 2023-10-13T06:14:09.211Z
Keywords: Bibliometrics

Collins, Thomas; Ioannidis, John; Boyack, Kevin; Baas, Jeroen

doi:10.17632/wwykk8d48g.3

Published: 13 October 2023 | Version 3 | DOI: 10.17632/wwykk8d48g.3

Contributors:

Thomas Collins, John Ioannidis, Kevin Boyack, Jeroen Baas

Description

This dataset examines gender imbalance among top-cited authors. The term "breakdown" as used here means the aggregate counts in the following list: author_count_total, author_count_top_2_pct, author_count_top_2_pct_sy, author_count_total_male, author_count_top_2_pct_male, author_count_top_2_pct_sy_male, author_count_total_female, author_count_top_2_pct_female, author_count_top_2_pct_sy_female, author_count_total_unknown, author_count_top_2_pct_unknown, author_count_top_2_pct_sy_unknown.

The analysis produces the following computed fields:

Femaleprop: the number of female authors in the top 2% of cited authors, over all genderized authors in the top 2% of cited authors, for the entire career.

Femalepropsy: the number of female authors in the top 2% of cited authors in 2021, over all genderized authors in the top 2% of cited authors, calculated for 2021 only.

Difference: Femalepropsy - Femaleprop.

Fmtoppropensity: for each cohort and subfield, the product (total male authors / total female authors) * (total female authors in the top 2% of cited authors in 2021 / total male authors in the top 2% of cited authors in 2021). More simply: the total male-to-female ratio multiplied by the female-to-male ratio among authors in the top 2% of cited authors in 2021.
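The computed fields defined above can be illustrated with a minimal plain-Python sketch. This is not the code distributed with the dataset. It assumes that "genderized authors" means male plus female authors (unknown excluded) and that the `_sy` counts correspond to the single-year 2021 figures. The function name `gender_metrics` and the sample counts are hypothetical and chosen only for illustration.

```python
def gender_metrics(row):
    """Compute the derived fields from one breakdown row (a dict of counts).

    Assumption: "genderized" = male + female, with unknown-gender authors excluded.
    Assumption: the *_sy counts are the single-year (2021) top 2% counts.
    Returns None for any field whose denominator is zero.
    """
    top_f = row["author_count_top_2_pct_female"]
    top_m = row["author_count_top_2_pct_male"]
    sy_f = row["author_count_top_2_pct_sy_female"]
    sy_m = row["author_count_top_2_pct_sy_male"]
    total_f = row["author_count_total_female"]
    total_m = row["author_count_total_male"]

    def ratio(numerator, denominator):
        # Guard against empty cohorts or subfields.
        return numerator / denominator if denominator else None

    femaleprop = ratio(top_f, top_f + top_m)
    femalepropsy = ratio(sy_f, sy_f + sy_m)

    difference = None
    if femaleprop is not None and femalepropsy is not None:
        difference = femalepropsy - femaleprop

    male_to_female_total = ratio(total_m, total_f)
    female_to_male_top_sy = ratio(sy_f, sy_m)
    fmtoppropensity = None
    if male_to_female_total is not None and female_to_male_top_sy is not None:
        fmtoppropensity = male_to_female_total * female_to_male_top_sy

    return {
        "Femaleprop": femaleprop,
        "Femalepropsy": femalepropsy,
        "Difference": difference,
        "Fmtoppropensity": fmtoppropensity,
    }


# Hypothetical counts for one cohort and subfield (not taken from the dataset).
example = {
    "author_count_total_male": 6000,
    "author_count_total_female": 4000,
    "author_count_top_2_pct_male": 150,
    "author_count_top_2_pct_female": 50,
    "author_count_top_2_pct_sy_male": 120,
    "author_count_top_2_pct_sy_female": 80,
}
metrics = gender_metrics(example)
print(metrics)
```

Under these assumptions, a Fmtoppropensity near 1 would indicate that women reach the single-year top 2% roughly in proportion to their share of all authors in that cohort and subfield, while values below 1 would indicate under-representation.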

Steps to reproduce

Code is provided with the dataset. The underlying datasets are generated by the code in the related link, which runs on the ICSR Lab data sharing platform (https://www.elsevier.com/icsr/icsrlab) using Scopus data. The code is written in Python (PySpark) and can be used with other datasets on any PySpark platform.

Institutions

Elsevier BV, Stanford University

Additional metadata for Elsevier datasets

Date the data was collected

2022-09-01T00:00:00.000Z
