Datasets within this collection
39 results
- August 2025 data-update for "Updated science-wide author databases of standardized citation indicators"
  Citation metrics are widely used and misused. We have created a publicly available database of top-cited scientists that provides standardized information on citations, h-index, co-authorship adjusted hm-index, citations to papers in different authorship positions and a composite indicator (c-score). Data are shown separately for career-long impact and for single recent year impact. Metrics with and without self-citations and the ratio of citations to citing papers are given, and data on retracted papers (based on the Retraction Watch database) as well as citations to/from retracted papers have been added. Scientists are classified into 22 scientific fields and 174 sub-fields according to the standard Science-Metrix classification. Field- and subfield-specific percentiles are also provided for all scientists with at least 5 papers. Career-long data are updated to end-of-2024 and single recent year data pertain to citations received during calendar year 2024. The selection is based on the top 100,000 scientists by c-score (with and without self-citations) or a percentile rank of 2% or above in the sub-field. This version (7) is based on the August 1, 2025 snapshot from Scopus, updated to end of citation year 2024. This work uses Scopus data. Calculations were performed using all Scopus author profiles as of August 1, 2025.
  If an author is not on the list, it is simply because the composite indicator value was not high enough to appear on the list. It does not mean that the author does not do good work. PLEASE ALSO NOTE THAT THE DATABASE HAS BEEN PUBLISHED IN AN ARCHIVAL FORM AND WILL NOT BE CHANGED. The published version reflects Scopus author profiles at the time of calculation. We thus advise authors to ensure that their Scopus profiles are accurate. REQUESTS FOR CORRECTIONS OF THE SCOPUS DATA (INCLUDING CORRECTIONS IN AFFILIATIONS) SHOULD NOT BE SENT TO US. They should be sent directly to Scopus, preferably by use of the Scopus to ORCID feedback wizard (https://orcid.scopusfeedback.com/) so that the correct data can be used in any future annual updates of the citation indicator databases.
  The c-score focuses on impact (citations) rather than productivity (number of publications) and it also incorporates information on co-authorship and author positions (single, first, last author). If you have additional questions, see the attached file on FREQUENTLY ASKED QUESTIONS. Finally, we alert users that all citation metrics have limitations and their use should be tempered and judicious. For more reading, we refer to the Leiden manifesto: https://www.nature.com/articles/520429a
- Data for report "Artificial Intelligence: How knowledge is created, transferred, and used"
  These are the underlying data for our report "Artificial Intelligence: How knowledge is created, transferred, and used", published in 2018. The data can be used to construct the graphs used in the report.
- Project: Fostering Transparent and Responsible Conduct of Research: What can Journals do?
  Description: Since their origin in the 17th century, publications in scientific journals have become the foundation of scholarly communication. Yet the publication process itself, the duties and responsibilities of editors, and the preparation of manuscripts for submission have gone through many changes. The current drive towards study registration, sharing of protocols, manuscript preprints, full transparency of reporting and the use of reporting guidelines, data sharing, and study replication is seen as the future of scientific communication and as a means of preventing scientific misconduct and undesirable research practices.
  The goals of this project are:
  1) Study the current state of publication ethics, research integrity- and transparency-related policies of scholarly journals (by analysing instructions to authors from a representative sample of journals in the humanities, social, natural, and life sciences);
  2) Study the trends and changes in publication ethics, research integrity- and transparency-related policies of scholarly journals (by conducting a systematic review of all studies indexed in MEDLINE, Web of Science and Scopus that have analysed instructions to authors of journals);
  3) Study editors’, authors’ and reviewers’ perceptions and attitudes towards topics related to transparent and responsible conduct of research (by conducting large scale surveys, focus groups, web-chats and acceleration room sessions);
  4) Make (evidence-based) recommendations on how publishers and journals may implement publication principles and foster the integrity and transparency of research (by summarizing the evidence of the first 3 steps of the project).
  Team Members:
  Mario Malički, ORCID iD: 0000-0003-0698-1930
  IJsbrand Jan Aalbersberg, ORCID iD: 0000-0002-0209-4480
  Lex Bouter, ORCID iD: 0000-0002-2659-5482
  Gerben ter Riet, ORCID iD: 0000-0002-2231-7637
  Project collaborators:
  Ana Jerončić, ORCID iD: 0000-0003-1621-1956
  Adrian Mulligan, Elsevier, Amsterdam, The Netherlands
  Funding: This project was funded by Elsevier.
- Collection: COVID-19: Vaccine, prevention, diagnosis & treatment datasets
  We selected vaccine, prevention, diagnosis & treatment datasets indexed by the Mendeley Data Search engine on the 2019-present COVID-19 / Coronavirus pandemic. The aim was to make it easier to find potentially relevant datasets for this specific topic.
- Collection: COVID-19: Epidemiology & infectious modelling datasets
  We selected epidemiology & infectious modelling datasets indexed by the Mendeley Data Search engine on the 2019-present COVID-19 / Coronavirus pandemic. The aim was to make it easier to find potentially relevant datasets for this specific topic.
- Collection: COVID-19: Genetics, genomics & molecular structure datasets
  We selected genetics, genomics & molecular structure datasets indexed by the Mendeley Data Search engine on the 2019-present COVID-19 / Coronavirus pandemic. The aim was to make it easier to find potentially relevant datasets for this specific topic.
- Collection: COVID-19: Public health, and societal and psychological impacts datasets
  We selected public health, and societal and psychological impacts datasets indexed by the Mendeley Data Search engine on the 2019-present COVID-19 / Coronavirus pandemic. The aim was to make it easier to find potentially relevant datasets for this specific topic.
- Collection: Mendeley Data FAIRest Datasets
  A collection of datasets published on Mendeley Data that recognizes researchers or research groups who make their research data available for additional research and do so in a way that exemplifies the FAIR data principles: Findable, Accessible, Interoperable, Reusable. Datasets in this collection have been selected by Elsevier's independent Research Data Management Advisory Board. Read Elsevier's community blog, Elsevier Connect, for interviews with researchers who published these datasets.
  * Prof. Zhiyong Shao, Fudan University, China: https://www.elsevier.com/connect/spotlighting-fair-data-and-the-researchers-behind-it
  * Prof. Ricardo Sánchez-Murillo, UNA, Costa Rica: https://www.elsevier.com/connect/we-dont-want-data-sitting-in-our-desk-says-tropical-cyclone-researcher
  * Dr. Vanessa Susini, University of Pisa, Italy: https://www.elsevier.com/connect/for-mendeley-data-winner-sharing-fair-data-helps-researchers-learn-from-each-other
- Elsevier OA CC-BY Corpus
  This is a corpus of 40k (40,001) open access (OA) CC-BY articles from across Elsevier’s journals. It represents the first cross-discipline set of research data at this scale to support NLP and ML research. The dataset was released to support the development of ML and NLP models targeting science articles from across all research domains. While the release builds on other datasets designed for specific domains and tasks, it will allow similar datasets to be derived and models to be developed that can be applied and tested across domains.
- ChEMU dataset for information extraction from chemical patents
  The discovery of new chemical compounds and their synthesis process is of great importance to the chemical industry. Patent documents contain critical and timely information about newly discovered chemical compounds, providing a rich resource for chemical research in both academia and industry. Chemical patents are often the initial venues where a new chemical compound is disclosed. Only a small proportion of chemical compounds are ever published in journals, and these publications can be delayed by up to 3 years after the patent disclosure. In addition, chemical patent documents usually contain unique information, such as reaction steps and experimental conditions for compound synthesis and mode of action. These details are crucial for the understanding of compound prior art, and provide a means for novelty checking and validation. Due to the high volume of chemical patents, approaches that enable automatic information extraction from these patents are in demand. To develop natural language processing methods for large-scale mining of chemical information from patent texts, a corpus was created providing chemical patent snippets and annotated entities and reaction steps.
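
The selection rule stated for the citation indicators dataset at the top of this list (top 100,000 scientists by c-score, or a percentile rank of 2% or above in the sub-field) can be sketched roughly as follows. This is only an illustration, not the authors' code: the record keys `id`, `subfield` and `c_score`, the function name, and the rounding of the per-sub-field cutoff are all assumptions. It also ranks on a single score, whereas the dataset applies the rule both with and without self-citations, and it does not handle ties.

```python
import math
from collections import defaultdict


def select_scientists(records, top_n=100_000, top_fraction=0.02):
    """Return the ids of records that pass either selection criterion.

    records: iterable of dicts with the hypothetical keys
             "id", "subfield" and "c_score".
    """
    records = list(records)

    # Criterion 1: the top_n records overall, ranked by c-score.
    overall = sorted(records, key=lambda r: r["c_score"], reverse=True)
    selected = {r["id"] for r in overall[:top_n]}

    # Criterion 2: the top fraction within each sub-field.
    by_subfield = defaultdict(list)
    for r in records:
        by_subfield[r["subfield"]].append(r)
    for members in by_subfield.values():
        members.sort(key=lambda r: r["c_score"], reverse=True)
        # Assumption: round the cutoff up, so a non-empty sub-field
        # always contributes at least one record.
        cutoff = math.ceil(len(members) * top_fraction)
        selected.update(r["id"] for r in members[:cutoff])

    return selected
```

Because the two criteria are combined with "or", a scientist in a small or low-citation sub-field can be selected through the sub-field criterion without being in the overall top group.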