Data Migrator for Azure
Data Migrator for Azure is a native Azure service that enables users to migrate petabyte-scale Hadoop data and Hive metadata to the Azure cloud with zero application downtime and zero risk of data loss even while the source data is under active change.
With Data Migrator for Azure, you can deploy and manage your data lake migrations using the same Azure management experience you enjoy today through the Azure Portal and Azure CLI.

Native Integration with Azure Resources
As part of the Azure offering, Data Migrator for Azure is deeply integrated with native Azure resources such as the Azure Portal, Azure CLI, Role-Based Access Control, Active Directory, Azure Policy enforcement, and the Activity Log. This tight integration provides customers with a turnkey user experience similar to other native Azure services. Further integration with Azure Billing means customers are billed directly through Azure, so there is no need to add a new vendor contract or seek additional vendor approvals.

“If you’re worried about data migration, we have you covered. With WANdisco Data Migrator for Azure, you can migrate production data from on-premises big data platforms to Azure Data Lake Storage with no application downtime and no risk of data loss, even when data sets are undergoing active change.”
Priya Vijayarajendran, Vice President, Data & AI at Microsoft
Use Cases
Adopt Azure Data Lake Storage Gen2
ADLS is the only cloud storage service that is purpose-built for big data analytics. It is designed to integrate with a broad range of analytics frameworks, enabling a true enterprise Data Lake, maximized performance via true filesystem semantics, and scalability to meet the needs of the most demanding analytics workloads.
Cloud Migration without Downtime
Non-disruptive automated migration reduces the risks involved with large data migration initiatives and allows your users and systems to continue operating while migration is underway.
Cloud Analytics
Once in ADLS, the data is available to Azure analytics services such as HDInsight, Synapse, and Azure Databricks.
Key Features
Core service within Microsoft Azure
Deep integration with Azure resources enables Data Migrator for Azure to be deployed at the same time as other native Azure services and with an equivalent user experience.
Support for native Azure security and manageability
Data Migrator for Azure leverages Azure features such as Role-Based Access Control, Active Directory, Azure Policy enforcement, and Activity Log integration.
Billing integration
Customers are billed through Azure, eliminating the need for you to add a new vendor contract or require additional vendor approvals.
Quick deployment and operation
The Data Migrator for Azure resource can be created directly from the Azure portal. The Data Migrator for Azure service is installed on an edge node of your Hadoop cluster. Deployment can be performed in minutes without impact to current operations, so users can begin migrations immediately.
Self-service user experience
Migrations are designed to be easy to configure and perform: you define your target environment and keep full control over exactly which data to migrate and which data to exclude.
Hadoop data and Hive metadata migration
Data Migrator for Azure supports migration of HDFS data and Hive metadata to Azure Data Lake Storage (ADLS) Gen2. Hive metadata can optionally be further transformed to Azure SQL Database metastore, Delta Lake format on Azure Databricks, or to Snowflake. See the Data Migrator for Azure documentation for details.
Complete and continuous migration
Migration of the selected data sets is performed with a single pass through the source storage system, eliminating the overhead of repeated scans. Ongoing changes are migrated continuously from source to target, with zero disruption to current production systems.
Migration at any scale
Data Migrator for Azure migrates big data sets at any scale, from terabytes to multi-petabytes, without impact to current production environments. Begin risk-free with small migrations and scale up to multi-petabyte initiatives without any additional installation.
Azure Portal and CLI live data extension
Users can manage the full data migration directly from the Azure portal. Additionally, Data Migrator for Azure can be configured and operated from the Azure CLI.
Configurability and control
You can configure migrations to meet your specific needs. Data Migrator for Azure includes standard configuration (defining sources, targets, and data to be migrated or excluded) as well as advanced capabilities such as path mapping, scheduling, and network bandwidth management controls.
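To make the ideas of exclusions and path mapping concrete, here is a minimal Python sketch. It is not the product's configuration syntax or API: the pattern list, the mapping table, and the function names are hypothetical and only illustrate how a source path might be filtered and then rewritten to a target path.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Hypothetical rules for illustration only; not the product's actual settings.
EXCLUDE_PATTERNS = ["*/.staging/*", "*/_temporary/*", "*.tmp"]
PATH_MAPPINGS = {"/data/warehouse": "/lake/warehouse"}


def is_excluded(path: str) -> bool:
    """Return True if the path matches any exclusion pattern."""
    return any(fnmatchcase(path, pattern) for pattern in EXCLUDE_PATTERNS)


def map_path(path: str) -> str:
    """Rewrite a source path using the longest matching prefix mapping."""
    for source_prefix in sorted(PATH_MAPPINGS, key=len, reverse=True):
        if path == source_prefix or path.startswith(source_prefix + "/"):
            return PATH_MAPPINGS[source_prefix] + path[len(source_prefix):]
    return path  # no mapping applies; keep the same path on the target


def plan_target(path: str) -> Optional[str]:
    """Return the target path for a source path, or None if it is excluded."""
    if is_excluded(path):
        return None
    return map_path(path)
```

In this sketch a file under /data/warehouse lands under /lake/warehouse on the target, while anything in a .staging directory is skipped. Matching on a full path component (the prefix followed by "/") avoids accidentally remapping sibling directories such as /data/warehouse2.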
Metrics and monitoring
Data Migrator for Azure enables hands-off operations by keeping you updated on your migration jobs. It reports health and status metrics, including estimates for migration completion, files transferred over time, excluded paths, and items that failed to transfer, as well as other real-time insights regarding usage.
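As a rough illustration of how a completion estimate can be derived from transfer metrics, the following Python sketch extrapolates from the average throughput so far. This is an assumption made for illustration, not a description of how the service computes its estimates; the function name and parameters are hypothetical.

```python
from typing import Optional


def estimate_seconds_remaining(
    total_bytes: int, transferred_bytes: int, elapsed_seconds: float
) -> Optional[float]:
    """Estimate remaining time from the average transfer rate observed so far.

    Returns None when no rate can be computed yet.
    """
    if transferred_bytes <= 0 or elapsed_seconds <= 0:
        return None
    rate = transferred_bytes / elapsed_seconds  # average bytes per second
    remaining = max(total_bytes - transferred_bytes, 0)
    return remaining / rate


# Example: 250 of 1000 bytes moved in 50 seconds leaves 750 bytes at 5 bytes/s.
eta = estimate_seconds_remaining(1000, 250, 50)
```

Because the source data may still be changing during migration, the total size is itself a moving figure, so an estimate of this kind is only a snapshot and would need to be recomputed as the metrics update.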
More details on the above capabilities can be found in the Data Platform for Azure documentation.