Press Release

WANdisco Deepens Product Integration with Databricks to Accelerate Time to Value for Cloud-Scale Analytics

May 24 2021

SAN RAMON, CA, May 24, 2021 - WANdisco, the LiveData company, announced today that its LiveData Migrator platform, which automates the migration and replication of Hadoop data from on-premises to the cloud, can now automate the migration of Apache Hive metadata directly into Databricks to help users save time, reduce costs, and more quickly enable new AI and machine learning capabilities. For the first time, enterprises that want to migrate on-premises Hadoop and Spark content from Hive to Databricks can do so at scale and with high efficiency, while mitigating the many risks associated with large-scale cloud migrations.

  • Data sets do not need to be migrated in full before they are converted into the Delta format. LiveData Migrator automates incremental transformation to Delta Lake.
  • Accelerate time to business insights by eliminating the need for manual data mappings with direct, native access to structured data in Databricks from on-premises environments.
  • Use a single pane of glass to manage both Hadoop data and Hive metadata migrations.

Ongoing changes to source metadata are reflected immediately in Databricks’ Lakehouse platform, and on-premises data formats used in Hadoop and Hive are automatically made available in Delta Lake on Databricks. By combining data and metadata and making on-premises content immediately usable in Databricks, users can eliminate migration tasks that previously required constructing data pipelines to transform, filter and adjust data, along with significant up-front planning and staging. Work that would otherwise be required to set up auto-load pipelines that identify newly landed data and convert it to a final form as part of a processing pipeline is set aside.

“This new feature brings together the power of Databricks and WANdisco LiveData Migrator,” said WANdisco CTO Paul Scott-Murphy. “Data and metadata are migrated automatically without any disruption or change to existing systems. Teams can implement their cloud modernization strategies without risk, immediately employing workloads and data that were locked up on-premises, now in the cloud using the lakehouse platform offered by Databricks.”

“Enterprises want to break silos and bring all their data into a lakehouse for analytics and AI but they have been constrained by their on-premises infrastructure,” said Pankaj Dugar, Vice President of Product Partnerships at Databricks. “With the new Hive metadata capabilities in WANdisco’s LiveData Migrator, it will now be much easier to take advantage of Databricks’ Lakehouse Platform.”

LiveData Migrator automates cloud data migration at any scale by enabling companies to easily migrate data from on-premises Hadoop-oriented data lakes to any cloud within minutes, even while the source data sets are under active change. Businesses can migrate their data without the expertise of engineers or other consultants to enable their digital transformation. LiveData Migrator works without any production system downtime or business disruption while ensuring the migration is complete and continuous and any ongoing data changes are replicated to the target cloud environment.

Making Hive data and metadata available for direct use in Delta Lake in Databricks can be achieved by configuring LiveData Migrator to have a data migration target available for the chosen cloud storage and Databricks. Users choose to convert content to the Delta Lake format when they create the Databricks metadata target. The desired data to migrate is then set by defining a migration rule, and selecting the Hive databases and tables that require migration.
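As a rough illustration of the steps above, the following Python sketch models a metadata target with a Delta Lake conversion option and migration rules that select Hive databases and tables. Every name in it (MetadataTarget, MigrationRule, select_tables, the pattern syntax, and the sample catalog) is hypothetical and chosen for illustration; it is not LiveData Migrator's actual API, CLI, or rule format.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

# Hypothetical model of the configuration steps described above.
# These classes are illustrative only and do not correspond to
# LiveData Migrator's real interfaces.


@dataclass(frozen=True)
class MetadataTarget:
    name: str
    convert_to_delta: bool  # chosen when the metadata target is created


@dataclass(frozen=True)
class MigrationRule:
    database_pattern: str  # shell-style wildcard, an assumed syntax
    table_pattern: str

    def matches(self, database: str, table: str) -> bool:
        # A table is selected when both its database and its name match.
        return fnmatchcase(database, self.database_pattern) and fnmatchcase(
            table, self.table_pattern
        )


def select_tables(catalog, rules):
    """Return sorted (database, table) pairs matched by at least one rule."""
    return sorted(
        (db, tbl)
        for db, tables in catalog.items()
        for tbl in tables
        if any(rule.matches(db, tbl) for rule in rules)
    )


# Sample Hive catalog, invented for this sketch.
catalog = {"sales": ["orders", "orders_tmp", "customers"], "hr": ["employees"]}

target = MetadataTarget(name="databricks-target", convert_to_delta=True)
rules = [MigrationRule("sales", "orders*"), MigrationRule("hr", "*")]
selected = select_tables(catalog, rules)

for db, tbl in selected:
    fmt = "Delta Lake" if target.convert_to_delta else "source format"
    print(f"{db}.{tbl} -> {target.name} ({fmt})")
```

In this sketch the conversion choice lives on the target rather than on each rule, mirroring the description that users opt into Delta Lake conversion when they create the Databricks metadata target.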

Learn more about successful strategies for Hadoop-to-cloud migration at the Databricks Data+AI Summit 2021, with sessions on accelerating analytics on Databricks (Wed., May 26, 4:25 p.m. PT) and on Spark-based analytics by minimizing barriers of Hadoop migration (Thurs., May 27, 11:35 a.m. PT), presented by WANdisco, Databricks, Microsoft and Avanade.

Media Contact

Josh Turner

Silicon Valley Communications

+1 (917) 231-0550

About WANdisco

WANdisco is the first and only data activation platform for accelerating digital transformation at scale. WANdisco makes infinite data actionable across clouds and enterprises in real time. WANdisco customers unleash the business value of the cloud with zero downtime, data loss, or disruption to fuel AI and machine learning, create new services, and transform businesses. For more information about WANdisco, visit