SiSOB data extraction and codification: A tool to analyze scientific careers
- Source :
- Geuna, A, Kataishi, R, Toselli, M, Guzmán, E, Lawson, C, Fernández-Zubieta, A & Barros, B 2015, 'SiSOB data extraction and codification: A tool to analyze scientific careers', Research Policy, vol. 44, no. 9, pp. 1645-1658. https://doi.org/10.1016/j.respol.2015.01.017
- Publication Year :
- 2015
- Publisher :
- Elsevier BV, 2015.
Abstract
- This paper describes the methodology and software tool used to build a database on the careers and productivity of academics, using public information available on the Internet, and provides a first analysis of the data collected for a sample of 360 US scientists funded by the National Institutes of Health (NIH) and 291 UK scientists funded by the Biotechnology and Biological Sciences Research Council (BBSRC). The tool’s structured outputs can be used for either econometric research or data representation for policy analysis. The methodology and software tool are validated for a sample of US and UK biomedical scientists, but can be applied to any country where scientists’ CVs are available in English. We provide an overview of the motivations for constructing the database, and the data crawling and data mining techniques used to transform webpage-based information and CV information into a relational database. We describe the database and the effectiveness of our algorithms and provide suggestions for further improvements. The software developed is released under the free software GNU General Public License; the aim is for it to be available to the community of social scientists and economists interested in analyzing scientific production and scientific careers, who, it is hoped, will develop the tool further.
- Subjects :
- Mobility of research scientists
Academic careers
Extraction and data integration
Information retrieval
Research productivity
Engineering (all)
Management of Technology and Innovation
Strategy and Management
Tourism, Leisure and Hospitality Management
Management Science and Operations Research
Computer science
Relational database
Sample (statistics)
External Data Representation
Policy analysis
Data science
Software
Data extraction
The Internet
License
Details
- ISSN :
- 0048-7333
- Volume :
- 44
- Database :
- OpenAIRE
- Journal :
- Research Policy
- Accession number :
- edsair.doi.dedup.....b1b55495ab788d8d895aa8a0551bd701