The Great Lift & Shift Myth
By Van Diamandakis, Feb 15, 2021 in Industry
As much as we’d like them to, most good things in life happen neither instantaneously nor perfectly. And as with life, there’s no silver bullet for migrating Hadoop data and applications to the cloud either.
No doubt the “lift and shift” camp will debate this – but it’s true. And luckily there are viable alternatives. In this post, we’ll drill down into what lift and shift is, and why it’s just not a realistic option for large-scale Hadoop data lake migrations to the cloud.
Lift and Shift Defined
Lift and shift (also known as “rehosting,” or “the forklift approach”) is when you migrate an exact copy of a workload or application – including its data store and OS – from one IT environment to another IT environment. Usually, this relocation is from on-prem to the cloud.
With a relatively short implementation timeline (6-18 months) and a lower up-front cost, lift and shift is still looked upon (incorrectly) as one of the fastest and easiest approaches to cloud migration.
The theory is attractive: you can either upload data or use a service such as AWS Snowball or Azure Data Box to transport data to the cloud. There’s no change required to application architecture or code—making it less labor-intensive. It’s kind of like moving day from office to office—after a period of downtime, everything is back to business as usual in the new digs. And lift and shift allows enterprises to avoid the overhead of modifying their analytics environments.
It’s great, in theory. And it worked, in the earlier days of cloud migration. But in today’s more complex ecosystems…it really is not a viable strategy for terabyte and petabyte-scale data migration and replication to the cloud, especially when these are production datasets under active change. Here’s why:
First off, the lift-and-shift method means that any technical debt in the on-premises analytics environment (sub-optimal software systems and business processes, for example) is carried over into the cloud. Problems you had before will come back to haunt you in the new environment, and your new-old cloud app won’t take full advantage of the capabilities the cloud offers, not to mention suffering unexpected costs and performance issues.
Next, the assumption that on-prem apps need no changes when they move to the cloud is simply incorrect. While semi-cloud-ready workloads (apps built on microservices architectures, VMware workloads, containerized applications) may work when lifted and shifted, most lift and shift projects fall short of their goals: instead of gaining the efficiencies promised by the cloud, they ignore the new capabilities available to them and carry the shortcomings of the existing implementation into the new cloud environment.
Another issue with lift and shift is the migration process itself. Lift and shift requires downtime during migration to prevent data from changing mid-copy, and every application drawing on a given data source must be cut over at the same time: a risky “big bang” approach.
Moreover, to ensure a non-disruptive migration, the lift and shift approach requires some very specific technical capabilities. Your business needs to be able to copy the production environment without shutting down the data lake, then synchronize that copy with all the potentially millions of changes that happened while the data was being shipped to the cloud. And to avoid the risks of a big bang cutover, the business will also need to run their on-premises and cloud environments side-by-side for validation. Throughout the testing period, it will be crucial to ensure that data on-premises and in the cloud are consistent at all times – not a simple task with existing tools.
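To make the consistency requirement concrete, here is a minimal sketch of the kind of validation a side-by-side run needs. It uses plain Python over local directory trees as stand-ins for an HDFS data lake and its cloud copy; `checksum` and `compare_trees` are hypothetical helpers written for illustration, not part of any migration product:

```python
import hashlib
from pathlib import Path

def checksum(path: Path) -> str:
    """MD5 of a file's contents (a stand-in for HDFS block checksums)."""
    h = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def compare_trees(source: Path, target: Path) -> dict:
    """Compare two directory trees and report inconsistencies.

    Returns files missing from the target and files whose contents
    differ -- the drift that accumulates while on-prem data keeps
    changing during a long migration window.
    """
    report = {"missing": [], "mismatched": []}
    for src_file in source.rglob("*"):
        if not src_file.is_file():
            continue
        rel = src_file.relative_to(source)
        dst_file = target / rel
        if not dst_file.exists():
            report["missing"].append(str(rel))
        elif checksum(src_file) != checksum(dst_file):
            report["mismatched"].append(str(rel))
    return report
```

A single pass like this is easy; the hard part is that it has to hold continuously against live, changing data at terabyte and petabyte scale, which is exactly where one-shot lift-and-shift copies break down.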
Lift and shift can’t deliver on the promise of a smooth and cost-effective migration to the cloud without application downtime and other risks to the business. There is a better way. An automated Hadoop migration project with zero business disruption, zero data loss, and zero risk is not an impossible task.
WANdisco LiveData Migrator enables cloud migration projects by helping businesses ensure data consistency across on-premises, hybrid-cloud, and even multi-cloud environments. Why does this matter? Because once we’ve rejected lift and shift and embraced the need to redevelop existing applications and workloads on cloud-friendly platforms (which is not as bad as you might think), the big question facing cloud migration is one of deployment. And gradual or staged deployment of a new application over live data depends on that data being available in the old environment (on-prem) and in the new environment (the cloud) simultaneously.
Hadoop data migration, like the best things in life, may not come as easily as we all wish. But advanced technology like LiveData Migrator provides a viable alternative to lift and shift, one that enables enterprises to take full advantage of cloud economics and to glean better business insights from machine learning-powered cloud analytics platforms like Databricks, Snowflake, Azure Synapse, and Amazon EMR.