Analytics projects are often treated as ad hoc efforts. Code and content are usually managed in a version control system (git), but frequently without full release management, and infrastructure and releases are typically deployed manually. In this post, we'll take a look at why it makes sense to manage your analytics projects as full-blown software development projects.
Although many of the components are usually in place, analytics teams are often reluctant to go the extra mile and automate every aspect of the project life cycle.
A first step is to automate infrastructure deployment, or to apply "Infrastructure as Code" (IaC). According to Wikipedia, "Infrastructure as code (IaC) is the process of managing and provisioning computer data centers through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools".
Briefly put, this means that all your installation steps and all your environment- and release-specific configuration should be run from code (scripts, templates) and managed like any other software development project. Taking this beyond just infrastructure, the benefits of treating "Everything as Code", including deployment, testing, etc., significantly outweigh the downsides.
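To make "configuration from code" a bit more concrete, here is a minimal sketch in Python of rendering an environment- and release-specific configuration file from a template. Everything in it is an assumption made for illustration: the setting names, the host names, the `render_config` helper and the release number are hypothetical and not taken from any particular tool.

```python
from string import Template

# Hypothetical template for an environment-specific configuration file.
CONFIG_TEMPLATE = Template(
    "db.host=$db_host\n"
    "db.name=analytics_$env\n"
    "release=$release\n"
)

# Hypothetical per-environment settings, kept in version control with the code.
ENVIRONMENTS = {
    "test": {"db_host": "db.test.example.internal"},
    "prod": {"db_host": "db.prod.example.internal"},
}


def render_config(env: str, release: str) -> str:
    """Render the configuration text for one environment and one release."""
    settings = ENVIRONMENTS[env]  # raises KeyError for an unknown environment
    return CONFIG_TEMPLATE.substitute(env=env, release=release, **settings)


if __name__ == "__main__":
    print(render_config("test", "1.4.0"))
```

Because the template and the per-environment settings both live in version control, a configuration change can be reviewed, released and rolled back in the same way as any other code change.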
Let's look at a number of these benefits in more detail.
- Reduced cost: automation reduces manual labor and thus saves time and ultimately money. The initial cost of developing the code to deploy infrastructure, code and content quickly pays for itself, because you can repeatedly launch completely configured servers and deploy code.
- Improved speed of execution: just like with cost, there is an initial investment that needs to be made to build the code to deploy servers, code and content. After this initial investment, spinning up a lab environment or mimicking your production environment to test a bug fix will be a lot faster than doing the same tasks by following manual procedures.
- Reduced risk: manual procedures are prone to errors. Taking the human factor out of the equation makes the entire deployment process a lot more reliable. Once automated, all of the processes in your analytics project become easily repeatable, more reliable and easier to scale.
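The repeatability mentioned in the last point can be sketched as follows. This is only an illustration of the general idea that an automated step should be safe to run any number of times; the `ensure_file` helper, the file name and the setting are hypothetical.

```python
import tempfile
from pathlib import Path


def ensure_file(path: Path, content: str) -> bool:
    """Make sure `path` holds `content`; return True only if something changed."""
    if path.exists() and path.read_text() == content:
        return False  # already in the desired state, nothing to do
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content)
    return True


if __name__ == "__main__":
    # Use a throwaway directory so the example does not touch real files.
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "etc" / "analytics.conf"
        print(ensure_file(target, "workers=4\n"))  # first run applies the change
        print(ensure_file(target, "workers=4\n"))  # second run finds nothing to do
```

Describing the desired end state, rather than a list of manual actions, is what makes a step like this safe to rerun after a partial failure or against a freshly built server.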
In summary, running your entire project "as code" requires an initial investment in time and skill development, but will provide quick returns. The benefits of Everything as Code are even bigger in cloud environments, where both infrastructure (AWS CloudFormation) and code deployment (AWS CodeDeploy) can be completely scripted.
There are a lot of different technologies to automate everything (Chef, Puppet, Ansible, Juju, ...), all of which have slightly different angles, but whichever tool or framework you choose, the key takeaway is that automating everything will save you time, money and a lot of headaches.
know.bi has experts in AWS CloudFormation, AWS CodeDeploy and Puppet, which will be the subject of more detailed and more technical follow-up posts to this one. Stay tuned!