WANdisco introduces Hive metadata migration to Databricks

May 28 2021

WANdisco has added a capability to its LiveData Migrator product that allows customers to move Apache Hive metadata into Databricks.

Customers trying to migrate on-premises Hadoop to Databricks in the cloud can now use LiveData Migrator's new capability to ensure full functionality at the target. LiveData Migrator incrementally converts Hive metadata to the Delta format during the migration, so relationships between the Hadoop data are maintained once it lands on Databricks.

Previously, when migrating Hadoop data to a cloud data warehouse, LiveData Migrator moved only the raw data itself, not the dependencies within it. To ensure that their applications still worked once they were in the cloud, customers had to manually re-establish those relationships through a labor-intensive process of rewriting Hadoop code for the new cloud architecture.

WANdisco now automates that transformation for Databricks, making Hadoop and Hive data immediately available in Delta Lake on Databricks.

"It's not enough to just move the data," said WANdisco CEO David Richards.

Transforming the Hadoop and Hive data as it lands means customers can use their new cloud-based data and applications much faster and without the risk of failure inherent in doing it manually, Richards said. Customers' Hadoop environments tend to be at petabyte (PB) scale, which makes the cloud migration task even more difficult. Customers recognize migration is essential, as the alternative is to keep buying hardware to support the environment's growth -- something that will eventually become unsustainable, Richards said.

LiveData Migrator can reflect changes in the data source into the data target mid-migration, and it can do the same with Hive metadata migrations. With all ongoing changes getting captured, customers don't have to take down their production environments to perform a migration. Some WANdisco customers handle millions of transactions per second, according to Richards, making any amount of downtime unfeasible.

LiveData Migrator's Hive metadata migration capability currently works only with Databricks, but WANdisco is working on extending it to Snowflake. WANdisco targeted Databricks first because that's where most Hadoop users are migrating, Richards said.

Most migration tools move only the data, making WANdisco's new capability unusual. Next Pathway is another migration vendor that can perform PB-scale migration to cloud data warehouses while keeping data dependencies intact.

The journey to the cloud for Hadoop environments is "pretty inevitable," said Merv Adrian, research vice president at Gartner. There comes a time with every environment when customers weigh the cost of the cloud against the depreciation of their hardware. For large-scale Hadoop environments, the cloud provides a greater value proposition.

Making the move to the cloud is the tricky part, Adrian said. It's a time-consuming, manual, risky and disruptive process. It's also a one-way movement, making it highly unlikely that any organization has staff members who are experts in performing the migration. A third-party vendor would have that expertise and is the safest option, making WANdisco well-positioned to address an emerging market, Adrian said.

"There are a lot of people with lots of nodes of Hadoop, and this is de-risking a process people are worried about," Adrian said.

The biggest hurdle to Hadoop migration is that it's a high-transaction environment, so the rate of change for its data is very high, Adrian added. One of WANdisco's greatest benefits is that it can allow the environment to operate normally while the migration is happening.
