By Mark Sellors, Technical Architect – Mango Solutions
As more Data Science moves from individuals working alone with small data sets on their laptops to more productionised, analytically mature settings, an increasing number of restrictions are being placed on Data Scientists in the workplace.
Perhaps your organisation has standardised on a particular version of Python or R, or perhaps you’re using a limited subset of all available big data tools. This sort of standardisation can be incredibly empowering for the business. It ensures all analysts are working with a common set of tools and allows analyses to be run anywhere across the organisation. Whether it’s a laptop, a server, or a large-scale cluster, Data Scientists and the wider business can be safe in the knowledge that the versions of their analytic tools are the same in each environment.
While incredibly useful for the business, this can at times feel very restricting for the individual Data Scientist. Maybe you want to try a new package that isn’t available for your ‘official’ version of R, or you want to try a new tool or technique that hasn’t made it into your officially supported environment yet. In all of these instances, a Data Science Lab or Analytic Lab environment can prove invaluable for keeping pace with the fast-moving data science world outside your organisation.
An effective lab environment should be designed from the ground up to support innovation, both with new tools and with new techniques and approaches. It’s rare that any two labs would be the same from one organisation to the next; however, the principles behind their implementation and operation are universal. The lab should provide a sandbox of sorts, where Data Scientists can work to improve what they do currently, …
Source: r-bloggers.com