By Tal Galili
tl;dr: People contribute to R for various reasons, which evolve over time. The main reasons appear to be: “fun coding”, personal commitment to the community, interaction with like-minded and/or important people – leading to higher self-esteem, future job opportunities, a chance to express oneself and enjoyable social inclusion.
From the abstract
One of the cornerstones of the R system for statistical computing is the multitude of packages contributed by numerous package authors. This amount of packages makes an extremely broad range of statistical techniques and other quantitative methods freely available. Thus far, no empirical study has investigated psychological factors that drive authors to participate in the R project. This article presents a study of R package authors, collecting data on different types of participation (number of packages, participation in mailing lists, participation in conferences), three psychological scales (types of motivation, psychological values, and work design characteristics), and various socio-demographic factors. The data are analyzed using item response models and subsequent generalized linear models, showing that the most important determinants for participation are a hybrid form of motivation and the social characteristics of the work design. Other factors are found to have less impact or influence only specific aspects of participation.
Summary of results
R developers, statisticians, and psychologists from Harvard University, University of Vienna, WU Vienna University of Economics, and University of Innsbruck empirically studied the psychosocial drivers of participation among R package authors. Through an online survey they collected data from 1,448 package authors. The questionnaire included psychometric scales (types of motivation, psychological values, work design), sociodemographic variables related to the work on R, and three participation measures (number of packages, participation in mailing lists, participation in conferences).
The data were analyzed using item response models, followed by generalized linear models (logistic regressions and negative-binomial regression) with SIMEX-corrected parameters.
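For readers unfamiliar with SIMEX (simulation-extrapolation), here is a minimal Python sketch of the general idea. It is not the study's code: it uses a plain linear regression instead of the logistic and negative-binomial models mentioned above, it assumes the measurement-error standard deviation `sigma_u` is known, and every name in it (`simex_slope`, `sigma_u`, the simulated data) is hypothetical. The procedure adds extra noise to the error-prone predictor at several levels `lam`, refits the model each time, and extrapolates the trend in the estimates back to `lam = -1`, the hypothetical error-free case.

```python
import numpy as np


def simex_slope(w, y, sigma_u, lambdas=(0.5, 1.0, 1.5, 2.0), n_sim=100, seed=0):
    """Return (naive_slope, simex_slope) for a simple linear regression of y on w.

    w is a predictor observed with additive measurement error of known
    standard deviation sigma_u (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)

    # lam = 0 corresponds to the naive fit on the observed, error-prone data.
    lam_grid = [0.0]
    slopes = [np.polyfit(w, y, 1)[0]]

    # Simulation step: inflate the error variance by a factor (1 + lam)
    # and average the refitted slope over n_sim noisy copies of w.
    for lam in lambdas:
        sims = []
        for _ in range(n_sim):
            w_noisy = w + np.sqrt(lam) * sigma_u * rng.standard_normal(w.shape)
            sims.append(np.polyfit(w_noisy, y, 1)[0])
        lam_grid.append(lam)
        slopes.append(float(np.mean(sims)))

    # Extrapolation step: fit a quadratic in lam and evaluate it at lam = -1.
    coef = np.polyfit(lam_grid, slopes, 2)
    return slopes[0], float(np.polyval(coef, -1.0))


# Simulated data (purely illustrative): the true slope is 2.0.
rng = np.random.default_rng(42)
n = 2000
true_slope = 2.0
sigma_u = 0.5
x = rng.standard_normal(n)                        # unobserved true predictor
y = 1.0 + true_slope * x + 0.5 * rng.standard_normal(n)
w = x + sigma_u * rng.standard_normal(n)          # observed, error-prone predictor

naive, corrected = simex_slope(w, y, sigma_u)
print("naive slope:", naive)
print("SIMEX-corrected slope:", corrected)
```

In this setup the naive slope is biased toward zero, and the extrapolated estimate recovers part of that bias. The quadratic extrapolant is only an approximation, so the corrected value typically lands closer to the true slope without matching it exactly.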
The analysis reveals that the most …
Source: r-statistics.com