2021 Hadoop-to-Cloud Migration Benchmark Report
By Tony Velcich, Jul 20, 2021
More than half of the Hadoop-to-cloud data migration projects happening today are achingly old-school, and as a result, painfully inefficient.
That’s a staggering statistic pointing to lost time and resources, and it’s just one of the insights that came out in the annual 2021 Hadoop-to-Cloud Migration Benchmark Report, which surveyed more than 200 C-level leaders (e.g. CIOs, CTOs), cloud and data architects, and data professionals. The subjects were all well acquainted with Hadoop, either because they are currently using it or they had previously migrated Hadoop data lakes to the cloud.
And the majority of respondents admitted to relying on "old school" approaches and tools: shipping data to cloud vendors by truck and/or using tools not designed for on-premises-to-cloud migration, such as DistCp. That means far too many companies are not taking advantage of technologies that can ease the transition and mitigate the risks of moving to the cloud.
These same companies are moving their data to the cloud to increase flexibility and agility. They want to be able to crunch data and develop faster analytics. But there is clearly a disconnect: despite those goals, they are turning to clunky and inefficient methods of transfer, even as they seek to become more streamlined and modern.
“Manual migration, DistCp-based tooling and bulk transfer devices strain resources, add complexity and ultimately increase risks to the business,” said Van Diamandakis, SVP Marketing, WANdisco. “While customers are looking to accelerate time to business insights by leveraging cloud-scale analytics, the survey says they are looking at the wrong ways to migrate their petabyte-scale data. It’s a train wreck in the making. There is a much better way.”
Many leaders expect to live in a hybrid environment and are planning for multi-cloud data management to deliver business value.
These outdated approaches and manual tools such as bulk transfer devices and DistCp are not designed for modern migrations with large volumes of actively changing data. They either disrupt the business or demand unnecessary heroics: manual reconciliation, or custom development to handle data changes that occurred during the transfer or copy. Such migration techniques jeopardize the success of the project and put the company's data and business at risk.
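To make the reconciliation problem concrete, here is a minimal sketch (not from the report, with hypothetical paths and an in-memory stand-in for a file system) of what a point-in-time copy misses: files that change, appear, or vanish on the source while the bulk transfer is running must be found and re-copied by hand.

```python
import hashlib

def checksum(data: bytes) -> str:
    """Content fingerprint used to compare source files against the copy."""
    return hashlib.sha256(data).hexdigest()

def reconcile(source_at_end, copied):
    """Diff the live source against a point-in-time copy.

    Returns (stale, deleted): files that changed or appeared during the
    transfer and must be re-copied, and files removed from the source
    that linger on the target.
    """
    stale = {path for path, data in source_at_end.items()
             if checksum(data) != copied.get(path)}
    deleted = set(copied) - set(source_at_end)
    return stale, deleted

# Hypothetical source state when the bulk copy starts: path -> contents.
start = {"/logs/a": b"v1", "/logs/b": b"v1"}
# The copy captures exactly that point-in-time snapshot.
copied = {path: checksum(data) for path, data in start.items()}
# Meanwhile the source keeps changing: b is updated, c appears, a is removed.
end = {"/logs/b": b"v2", "/logs/c": b"v1"}

stale, deleted = reconcile(end, copied)
print(sorted(stale))    # changed or new files to re-copy: ['/logs/b', '/logs/c']
print(sorted(deleted))  # removed files to purge from the target: ['/logs/a']
```

Every re-copy pass can itself race with new changes, which is why the report's respondents describe this as custom development and risk rather than a one-off script.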
Hadoop-to-cloud migration key findings
In addition to the above, the key findings from the report were:
The next wave of Hadoop data migrations will be even larger. Having learned from the first wave of migrations led by large organizations, the next wave of companies to migrate will benefit both from lessons learned and more mature migration technology.
Migration concerns and business impacts can be solved. Each of the leading causes of business disruption reported by companies planning, completing, or avoiding migration can be avoided with software designed to handle data changes and maintain security settings, without costly and risky custom code built by scarce IT resources.
Top requirements emerge for Hadoop migration software. As organizational technology leaders set the strategy for their Hadoop migrations, the most requested requirements are 1) data migration validation, 2) support for multiple cloud targets, and 3) the ability to handle data changes without requiring operational downtime.
Companies are planning for hybrid and multi-cloud data management. Companies should have a mindset for justifying the Hadoop data migration software as an independent modern data management capability for delivering agility and ensuring data integrity across an active mix of on-premises and cloud environments.
The Report found that most IT leaders are pursuing cloud migration to lay the foundation for future business value creation. The three top drivers of migration to the cloud were:
data modernization initiatives;
cloud scale analytics; and
adopting scalable cloud storage.
This shows us that despite the use of outdated technologies, the appetite for more agile capabilities is alive and well.
The demand for cloud migration solutions that can move data to the cloud with zero downtime is clear. The challenge now lies in bringing companies into the future and encouraging more widespread adoption of the technology. WANdisco’s LiveData Platform keeps geographically dispersed data at any scale consistent between on-premises and cloud environments, allowing businesses to operate seamlessly in a hybrid or multi-cloud environment with zero downtime and zero data loss.
The days when a company needed to resign itself to downtime while migrating its data are over.
Download the 2021 Hadoop-to-Cloud Migration Benchmark Report
Tony is an accomplished product management and marketing leader with over 25 years of experience in the software industry. Tony is currently responsible for product marketing at WANdisco, helping to drive go-to-market strategy, content and activities. Tony has a strong background in data management having worked at leading database companies including Oracle, Informix and TimesTen where he led strategy for areas such as big data analytics for the telecommunications industry, sales force automation, as well as sales and customer experience analytics.