A Representative User-centric Dataset of 10 Million GitHub Developers
Description
Using the GitHub APIs, we constructed an unbiased dataset of more than 10 million GitHub users. The data were collected between July 20 and August 27, 2018, and cover 10,649,574 users, 118,602,740 commits, and 20,999,258 repositories. Each entry is stored in JSON format and represents one GitHub user. It contains the descriptive information from the user’s profile page, together with information about the user’s commit activity and the public repositories the user created or forked.
Year of publication
2018
Type of data
Authors
Fudan University - Contributor
Harvard Dataverse - Publisher
University of Göttingen - Contributor
University of Helsinki - Contributor
Project
Other information
Fields of science
Computer and information sciences
Language
Open access
Open