Don’t Take the Cloud Migration Death March Using “Free” Tools
By Van Diamandakis, Jun 11, 2020 in Industry
Let’s consider two simple statements:
a) Enterprises want to migrate their business-critical Hadoop data to the cloud to cut costs and improve business insights; and
b) Enterprises can’t afford a process that takes months or even years, costs seven figures between a systems integrator and an army of internal staff, or constantly blocks access to business-critical applications.
Until now, points a) and b) were incompatible, if not mutually exclusive. That is why more than 200 exabytes of analytics data remain stuck on-premises in Hadoop clusters. And with Cloudera’s future uncertain, it’s time to consider cloud migration to cut costs and take advantage of better tooling, flexibility, and machine-learning-powered cloud analytics to make smarter business decisions in these uncertain economic times.
The fact is that enterprises want to fully embrace the benefits of big data in a cloud or hybrid cloud environment. They’re seeking the benefits of increased productivity, security and global data access for employees.
And most enterprises are already in the cloud. According to a recent Gartner survey, 94% of IT professionals responding were using at least one public or private cloud (McAfee put that number at 97%), and 84% had adopted a multi-cloud solution. Of those using a public cloud solution, 80% were using more than one service provider. And IDG found that 41% of enterprises are already migrating non-business-critical data to the cloud and 21% plan to do so in the coming year. The migration of static data is easy. Phase two is about the migration and replication of business-critical data under active change. The winners, like AMD, Daimler, and Starbucks, have already done this and are reaping the benefits, including Spark-based cloud analytics.
Yet until now enterprises have been hard-pressed to overcome the major hurdle of moving their large-scale on-prem data to the cloud without either losing data consistency or losing business. Large-scale data migration has thwarted CIOs for years because of the “live” nature of business data - which changes minute-to-minute and needs to be available 24/7 without interruption.
The Old Way—The Death March
Yesterday’s “open source” migration tools like DistCp, the underlying migration technology in Cloudera CDP, attempt to migrate data by copying it. The problem is that, at scale, copying petabytes of data takes time. A lot of time. Because live business data changes during copying, a multi-pass methodology is required: after the initial copy, a re-scan discovers the changes that occurred during copying and rectifies them in the copy. This pass can be lengthy for large data sets. And of course, while the re-scan is underway, further changes occur to the live data…
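As a rough illustration, the multi-pass pattern looks something like the sketch below. The cluster paths and bucket name are hypothetical, and the commands are echoed rather than executed, since actually running them requires a live Hadoop cluster with a cloud storage connector configured.

```shell
# Minimal sketch of the multi-pass DistCp pattern described above.
# SRC and DST are hypothetical placeholders.
SRC=hdfs://prod-cluster/data/warehouse
DST=s3a://example-bucket/warehouse

# Pass 1: bulk copy. For petabytes of data this can run for days,
# and the live source keeps changing the whole time.
echo hadoop distcp "$SRC" "$DST"

# Passes 2..N: re-scan the source and copy only files that differ
# from the target (-update). Each re-scan window admits new changes,
# so in practice the loop repeats until a final cutover window in
# which writes to the source are frozen.
echo hadoop distcp -update "$SRC" "$DST"
```

That final cutover window, when applications must stop writing so the last delta can be applied, is exactly the business disruption the article is describing.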
You Can Pay Me Now, or Pay Me Later
One of the world’s leading car companies went down the route of using “free” open source migration tools for a large-scale cloud migration of business-critical customer data. After months of delays and missteps, they brought in a large systems integrator and paid them a lot of money, but the project still stalled. Finally, WANdisco was brought in to help, and the migration was successfully completed within 90 days with zero disruption to business operations. This is a very common scenario, and it reminds me of the famous Fram oil filter commercials with the tagline, “You can pay me now, or you can pay me later.” In other words, don’t be fooled by “free” migration tools, because you get what you pay for.
There Is a Better Way to Do Live Data Migrations
At the core of WANdisco’s LiveData Migrator solution is unique technology that enables migration of changing data in a single pass with guaranteed consistency. This means that during a cloud migration, LiveData Migrator enables applications to continue to fully access, ingest, and update data.
LiveData Migrator’s patented technology enables seamless migration of petabytes of unstructured data from on-premise data centers to any cloud vendor in one pass. Even as data is moving to the cloud, applications continue to access the existing on-prem environment, while users can choose to direct new workloads or queries at cloud assets.
What’s more, LiveData Migrator automatically keeps on-premise data consistent with migrated cloud-based data - forming a hybrid cloud environment - while still complying with strict availability and performance service level agreements. And LiveData Migrator requires no changes to the dependent application interface and does not impede application performance.
The Bottom Line
Migrating big data to the cloud requires skill and expertise – and 62% of big data migration efforts are harder to complete than expected or fail. Data migration obstacles – like how to rapidly and safely move petabyte-scale big data to the cloud without stopping business – have been the stumbling block to innovation long enough. At WANdisco we decided to bring this to an end. Now, LiveData Migrator changes everything.
About the author
Van Diamandakis, SVP of Marketing, WANdisco
Van is a proven Silicon Valley technology executive with over 25 years of operational experience and a track record of leading global marketing transformations and driving toward meaningful financial events, including IPOs and acquisitions. Van has been at the forefront of B2B technology marketing, bringing a unique ability to marry creativity, data, technology, and leadership to rapidly build brand equity, navigate tech companies through inflection points, and accelerate revenue growth and valuation.