Amazon SageMaker is a "fully managed machine learning service". This means it provisions an environment for data scientists and developers, so they do not have to manage servers themselves.
Analytics projects are often treated as ad-hoc projects. Code and content are usually managed in a version control system (Git), but rarely with full release management, and infrastructure deployment and releases are frequently done manually. In this post, we'll take a look at why it makes sense to manage your analytics projects as full-blown software development projects.
Although many of the components are usually in place, analytics teams are often reluctant to go the extra mile and automate every aspect of the project life cycle.
A first step is to automate infrastructure deployment, or to apply "Infrastructure as Code" (IaC). According to Wikipedia, "Infrastructure as code (IaC) is the process of managing and provisioning computer data centers through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools".
Briefly put, this means that all your installation and (environment- and release-specific) configuration should be run from code (scripts, templates) and managed as a software development project. Taking this beyond just infrastructure, the benefits of treating "Everything as Code", including deployment, testing, etc., significantly outweigh the downsides.
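To make the idea of environment-specific configuration "run from code" a bit more concrete, here is a minimal sketch in Python. It assumes AWS CloudFormation as the templating tool, which is one possible choice and not something the approach depends on. The project name, the `ENVIRONMENTS` settings and the `DataBucket` resource are hypothetical and only serve as illustration.

```python
import json

# Hypothetical per-environment settings; names and values are illustrative only.
ENVIRONMENTS = {
    "dev": {"bucket_suffix": "dev", "versioning": "Suspended"},
    "prod": {"bucket_suffix": "prod", "versioning": "Enabled"},
}


def build_template(project: str, env: str) -> dict:
    """Build a CloudFormation template (as a dict) for one environment."""
    settings = ENVIRONMENTS[env]
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"Data bucket for {project} ({env})",
        "Resources": {
            # "DataBucket" is an assumed logical name for this sketch.
            "DataBucket": {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    "BucketName": f"{project}-data-{settings['bucket_suffix']}",
                    "VersioningConfiguration": {
                        "Status": settings["versioning"]
                    },
                },
            }
        },
    }


def render(project: str, env: str) -> str:
    """Serialize the template deterministically so it diffs cleanly in Git."""
    return json.dumps(build_template(project, env), indent=2, sort_keys=True)


if __name__ == "__main__":
    # Print the dev template; a release pipeline could deploy this file.
    print(render("analytics-demo", "dev"))
```

Because the template is generated from a single function, the dev and prod environments differ only in the settings that are explicitly listed, and every change to either of them goes through version control and review like any other code change.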
Let's look at a number of these benefits in more detail.
Quickly move from data to insight
Why move your BI to the cloud?
As discussed in a previous post, there are many reasons to move your BI to the cloud.
Security, the ability to work from anywhere, and faster delivery with more resource flexibility at a lower cost are just a few.
What is AWS DMS?
Every day, more and more companies are moving towards cloud computing, with Amazon Web Services (AWS) undoubtedly being the biggest player. Having all the possible AWS services available at your fingertips is great, but you still need to migrate your existing infrastructure and data into the (AWS) cloud. At re:Invent 2015, Amazon announced “AWS Database Migration Service”, aiming to make the process of moving data into databases on AWS a lot easier.
AWS DMS supports most open-source and commercial databases, such as PostgreSQL, MySQL, MariaDB, Oracle and Microsoft SQL Server, as well as Amazon's own Aurora, Redshift, DynamoDB and S3 services. Both homogeneous (e.g. PostgreSQL to PostgreSQL) and heterogeneous (e.g. Oracle to MySQL) migrations are supported. At least one of the two endpoints, source or target, must be in the AWS cloud. DMS is regularly updated with new features and supported engines.
At the highest level, you have three components to take care of when starting a migration using DMS: