- Quality Engineering Success Story
- Aug 13
Leveraging TOSCA DI in Data Migration from On-Prem to Azure Databricks
Background:
A leading American gasoline distributor embarked on a significant data migration initiative: moving 36 tables, containing approximately 800 million records, from on-premises SQL Server to Azure Databricks in the cloud. The goal was to preserve data integrity and quality while addressing data format, security, and compliance challenges.
Challenges:
The distributor faced several challenges during the migration process:
- Data Integrity and Data Loss: Ensuring that data remained consistent and accurate, and that no records were lost, during the migration.
- Data Format Issues: Managing different data formats between the source and target systems.
- Data Security and Compliance: Ensuring that the migration met all necessary security and compliance requirements.
- Downtime and Business Continuity: Minimizing downtime to ensure business operations were not disrupted.
- Skill and Knowledge Gaps: Bridging the gaps in skills and knowledge required for the new cloud platform.
The Solution:
To address these challenges, Narwal implemented a comprehensive solution involving Tosca Data Integrity (DI):
- Pilot Testing: Conducted pilot testing to identify potential issues before a full-scale migration.
- Data Validation: Ensured data accuracy through schema comparison, row count comparison, and checksum/hash totals.
- Data Integrity Testing: Performed consistency checks and data quality assessments, including row-by-row comparisons of high-volume tables.
- Automation: Implemented Tosca DI to automate test cases for file-file, database-database, file-database, and file-API comparisons, facilitating a more efficient and reliable validation process.
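The validation checks listed above (schema comparison, row count comparison, and checksum/hash totals) can be illustrated with a minimal Python sketch. This is not Tosca DI's implementation and does not use its API; the function names, the sample rows, and the normalization rule (values converted to text, `None` mapped to a marker) are all assumptions made for illustration.

```python
import hashlib

MODULUS = 1 << 256  # SHA-256 digests are 256 bits wide


def schema_diff(source_cols, target_cols):
    """Return {column: (source_type, target_type)} for every mismatch.

    Both arguments are hypothetical {column_name: type_name} mappings.
    A column missing on one side shows up with None for that side.
    """
    diffs = {}
    for name in sorted(set(source_cols) | set(target_cols)):
        src_type = source_cols.get(name)
        tgt_type = target_cols.get(name)
        if src_type != tgt_type:
            diffs[name] = (src_type, tgt_type)
    return diffs


def row_hash(row):
    """Hash one row after normalizing every value to text (assumed rule)."""
    normalized = "\x1f".join("<NULL>" if v is None else str(v) for v in row)
    return int(hashlib.sha256(normalized.encode("utf-8")).hexdigest(), 16)


def table_fingerprint(rows):
    """Return (row_count, hash_total) for an iterable of rows.

    The hash total is a modular sum of per-row hashes, so it does not
    depend on the order in which the rows are read.
    """
    count = 0
    total = 0
    for row in rows:
        count += 1
        total = (total + row_hash(row)) % MODULUS
    return count, total


# Made-up sample data: the same rows, read in a different order.
source_rows = [(1, "regular", 3.19), (2, "premium", 3.89), (3, "diesel", None)]
target_rows = [(3, "diesel", None), (1, "regular", 3.19), (2, "premium", 3.89)]

counts_and_hashes_match = (
    table_fingerprint(source_rows) == table_fingerprint(target_rows)
)
```

In a real migration, the type names on each side would first need to be mapped to a common vocabulary, and values such as decimals and timestamps would need a consistent text format on both systems before hashing. Otherwise the fingerprints differ even when the data is logically identical.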
Detailed Implementation:
- Data Size and Composition: The pilot table contained approximately 340 million rows and 209 columns.
- Chunking Strategy: Due to the large volume, the data was chunked into smaller segments to facilitate row-by-row comparison. Queries were created to divide the data into segments of 10 million records each, resulting in 34 distinct test cases.
- Execution of Test Cases: The 34 test cases were executed on three Tosca Distributed Execution (DEX) machines.
- Hardware Specifications: Each DEX machine was equipped with 32 GB RAM, 500 GB disk space, and a quad-core processor, sufficient for handling large data sets and intensive processing tasks.
- Execution Time: The total execution time for all 34 test cases amounted to 84 hours.
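The chunking strategy described above can be sketched as a small query generator. This is a hedged illustration only: the table name `pilot_table`, the key column `row_id`, and the assumption that the key is a dense sequence starting at 1 are hypothetical, and the actual queries used in the project are not given in the source.

```python
def chunk_ranges(total_rows, chunk_size):
    """Yield inclusive (low, high) key ranges covering 1..total_rows."""
    for low in range(1, total_rows + 1, chunk_size):
        yield low, min(low + chunk_size - 1, total_rows)


def chunk_queries(table, key, total_rows, chunk_size):
    """Build one SELECT per chunk; each query backs one comparison test case.

    ORDER BY keeps rows aligned so source and target can be compared
    row by row within a chunk.
    """
    return [
        f"SELECT * FROM {table} WHERE {key} BETWEEN {low} AND {high} ORDER BY {key}"
        for low, high in chunk_ranges(total_rows, chunk_size)
    ]


# 340 million rows in segments of 10 million gives 34 queries.
queries = chunk_queries("pilot_table", "row_id", 340_000_000, 10_000_000)
```

With these inputs the generator produces 34 queries, matching the 34 test cases. The source does not say whether the 84 hours is elapsed time or the sum across the three machines. If it is the sum, each test case averaged roughly 2.5 hours, and an even split across three machines would mean about 28 hours of elapsed time.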
Outcomes:
The implementation of Tosca DI and the automated data validation solution resulted in significant benefits for the distributor:
- Risk Reduction: Minimized the risk of data loss and integrity issues during migration.
- Efficiency: Reduced manual testing time significantly, achieving faster validation with full data coverage.
- Cost and Time Savings: Achieved 50% savings in maintenance cost and time.
- Business Continuity: Ensured smooth transition and supported ongoing business operations without significant downtime.
- Data Quality: Improved data availability and self-service capabilities, supporting better optimization on the cloud platform.
The successful validation and migration of 36 tables, holding approximately 800 million records, from SQL Server to Azure Databricks preserved data quality and integrity. This partnership with Narwal demonstrated the power of automation in data validation, enabling the distributor to focus on its core business goals and innovation.
Contact us today to unlock your business’s full potential and experience the benefits of automated data validation with Narwal and Tosca DI.