
What is data migration?

 

The question is not as obvious as it appears. At first blush, data migration could not be more straightforward, and a multitude of data migration tools exist. Tools are available from legacy database providers, cloud vendors, open source databases, and data integration companies. Despite all these tools and choices, the results are frequently disappointing. According to Gartner, more than 50 percent of data migration projects will exceed budget and/or result in some form of business disruption due to flawed execution. What is it about data migration that is not working?

Three factors about data migration are critical to the success of your next cloud migration project.

1. Data migration must be transactionally consistent

Kafka is a good example of a data pipeline that compromises on transaction semantics. Many Kafka sources and targets do not need to observe strict event ordering and transaction boundaries. However, when the source is transactional, failing to honor transaction semantics can corrupt the migrated data and lead to misleading results. This is true even for analytics use cases on NoSQL databases.

For transactional source databases, Griddable preserves the order of transactions throughout its data pipeline. Its relay process reads database redo logs using the system change number (SCN) assigned by the database. The relay publishes each change in SCN order using a high-speed internal buffer. The next component of the Griddable data pipeline is the Change History Server, which persists changes and so eliminates any need for the relay to read the redo log multiple times. Finally, the consumer process pulls changes in SCN order, either directly from the relay or from the Change History Server. The bottom line: the target in a Griddable data pipeline is always timeline-consistent with the source.
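
To make the idea of SCN-ordered replay concrete, here is a minimal Python sketch. It is not Griddable's implementation: the `Change` record, the `apply_changes` function, and the in-memory dictionary standing in for the target are all hypothetical names chosen for illustration. It only shows why applying changes in SCN order, and skipping any SCN already applied, keeps a target on the same timeline as the source.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Change:
    """One captured change (hypothetical record shape, for illustration only)."""
    scn: int               # system change number assigned by the source database
    table: str
    key: str
    value: Optional[str]   # None means the row was deleted at the source


Target = Dict[Tuple[str, str], str]


def apply_changes(target: Target, changes: Iterable[Change],
                  last_applied_scn: int = 0) -> int:
    """Apply changes to the target in SCN order and return the highest SCN applied."""
    for change in sorted(changes, key=lambda c: c.scn):
        if change.scn <= last_applied_scn:
            continue  # already applied, so replaying the same batch is harmless
        row_id = (change.table, change.key)
        if change.value is None:
            target.pop(row_id, None)
        else:
            target[row_id] = change.value
        last_applied_scn = change.scn
    return last_applied_scn
```

Even if the two updates to the same row arrive out of order, sorting by SCN means the later value wins, which is what a reader of the source database would have seen. A real pipeline would also need to respect transaction boundaries, which this sketch leaves out.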

2. Your data migration does not need all the data 

Today nearly every database contains regulated data, including personally identifiable information (PII). The penalties for data breaches are severe, and the subsequent public exposure of a breach is even worse. It’s illegal or extremely risky to copy PII, even with a Privacy Shield or binding corporate rule. As a result, data migration must include masking, encrypting, selectively removing, or replacing regulated data. Masking is an especially valuable technique because it hides sensitive values while keeping the data usable, so analytic applications remain effective after the migration.

To comply with modern data privacy regulations, Griddable policies provide push-button controls to select which data is migrated or transformed. Griddable easily masks or encrypts any number of individual data elements using separate algorithms or encryption keys. It also filters and replaces data values, or selectively removes entire rows or columns.
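
As a rough illustration of what masking, column removal, and row filtering look like during a migration, consider the Python sketch below. It does not show Griddable's policy engine or its API. The function names `mask_sha256` and `apply_policy`, the salt value, and the sample columns are assumptions made for the example, and a salted hash is only one of many possible masking techniques.

```python
import hashlib
from typing import Any, Callable, Dict, Optional, Set

Row = Dict[str, Any]


def mask_sha256(value: str, salt: str = "demo-salt") -> str:
    """Replace a value with a short, deterministic token (illustrative only)."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:12]


def apply_policy(row: Row, mask_columns: Set[str], drop_columns: Set[str],
                 keep_row: Callable[[Row], bool] = lambda r: True) -> Optional[Row]:
    """Return the transformed row, or None if the row is filtered out entirely."""
    if not keep_row(row):
        return None
    migrated: Row = {}
    for column, value in row.items():
        if column in drop_columns:
            continue  # the column is removed from the migrated data
        if column in mask_columns:
            migrated[column] = mask_sha256(str(value))
        else:
            migrated[column] = value
    return migrated
```

Because the masking here is deterministic, equal inputs produce equal tokens, so joins and group-by queries on a masked column still line up. That is one way masking can keep analytic applications working, although whether a given technique satisfies a particular regulation is a separate question.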

3. Modernization is the big payoff

According to Oracle, the average company spends a staggering 60 to 85 percent of its IT budget maintaining legacy applications that fail to meet the changing competitive needs of the business. By Forrester’s calculations, open source databases are capable of supporting 80 percent of all business applications, making them a viable alternative to commercial databases. Thus, modernization to open source databases, whether self-managed or managed by a cloud provider, can radically reduce IT costs in most enterprises.

Therefore, it’s no surprise that companies like Amazon.com Inc. and Salesforce.com Inc., two of Oracle Corp.’s biggest customers, are actively working on data migration to replace Oracle with open-source database software. Amazon confirmed late this year that it would modernize to its own Aurora database by 2020, while Salesforce expects to be completely weaned off Oracle by 2023.

The process of modernizing usually requires data migration to multiple targets simultaneously. Whether replacing a legacy Oracle database or re-architecting, the modernization process must efficiently copy data to multiple destinations of different types. The types usually differ because each application has distinct database requirements. With Griddable, both schema and data migration are handled through the Griddable policy engine. Griddable rearchitects the schemas in a monolithic database into multiple new databases, each of which contains only the relevant data.
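
The fan-out from one monolithic database to several targets can be pictured with a small Python sketch. This is not how the Griddable policy engine is configured. The `ROUTES` table, the target names, and the `fan_out` function are hypothetical, and real targets would be databases rather than in-memory lists.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Tuple

Row = Dict[str, Any]

# Hypothetical routing: which tables from the monolith go to which targets.
ROUTES: Dict[str, List[str]] = {
    "orders": ["orders_db", "analytics_db"],
    "customers": ["crm_db", "analytics_db"],
    "audit_log": [],  # deliberately not migrated
}


def fan_out(rows: Iterable[Tuple[str, Row]],
            routes: Dict[str, List[str]] = ROUTES) -> Dict[str, List[Tuple[str, Row]]]:
    """Group (table, row) pairs by target, copying each row to every routed target."""
    by_target: Dict[str, List[Tuple[str, Row]]] = defaultdict(list)
    for table, row in rows:
        # Tables with no route are skipped rather than copied everywhere.
        for target in routes.get(table, []):
            by_target[target].append((table, row))
    return dict(by_target)
```

Each target receives only the tables routed to it, so a new order database never sees customer rows, while an analytics target can receive both.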

Next step

Of course, each organization needs to answer the question “what is data migration?” for itself. To help, click the “Live Demo” button at the top of this page for a 10-minute, no-obligation tour.
