
How is the Data Lake different than replicating the VIP IMOS operational database to on-premise?

On-premise database replication lets clients run queries and reports directly against the replicated (on-premise) database to feed BI and data warehouse solutions or other in-house systems. However, replication processes can be labor-intensive and unreliable for large data sets, and the client must keep modifying their reporting and data transformation processes whenever the source Veson IMOS Platform database schema changes.

Data Lake addresses these issues in two ways. First, Data Lake transforms the Veson IMOS Platform database extracts into Report Designer format before making them available to clients, which makes the extracts immune to schema changes. Second, the extracts are sent to the client on a daily or hourly basis, making database replication unnecessary.

Replication is also traditionally used for business continuity: when one data center is unavailable, clients can continue to access data in a replica. Amazon Web Services (AWS) provides redundancy that removes the likelihood of a single point of failure in any part of the Veson IMOS Platform stack (storage, compute, networking), so traditional on-premise replication for business continuity and disaster recovery is not necessary.

How is the Data Lake different than taking a backup of the VIP IMOS operational database?

Backup and restore operations are typically undertaken in response to the corruption or loss of a server, or to archive the database data at regular intervals for analysis and/or compliance requirements.

Data Lake provides similar functionality by extracting the Veson IMOS Platform database tables on a daily or hourly basis and making them available to clients to import into their reporting, BI, or data warehouse solution. AWS provides the redundancy that removes the likelihood of a single point of failure in any part of the Veson IMOS Platform stack (storage, compute, networking), rendering backup and restore for data recovery unnecessary.

Does Data Lake extract the entire VIP IMOS operational database each time it runs, or just incremental changes?

Each extraction from the Veson IMOS Platform operational database is a full load of the current database tables. After transformation, the extract files are compressed to ~90% of their original size, then delivered to the client’s directory for download.
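Because every extract is a full load, a client-side import can replace the previous copy of each table instead of merging changes. The following is a minimal sketch of that pattern, not the actual Data Lake delivery format: it assumes, purely for illustration, that an extract arrives as a gzip-compressed CSV file with a header row, and it loads into a local SQLite database. The function name `replace_table_from_extract` and the table name passed to it are hypothetical.

```python
import csv
import gzip
import io
import sqlite3


def replace_table_from_extract(conn, table, gz_bytes):
    """Replace `table` with the rows of a gzip-compressed CSV extract.

    Assumption: the first CSV row is a header naming the columns.
    Returns the number of data rows loaded.
    """
    text = gzip.decompress(gz_bytes).decode("utf-8")
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    with conn:  # commits when the block exits without an error
        # Full load: discard the previous snapshot and rebuild the table.
        conn.execute(f'DROP TABLE IF EXISTS "{table}"')
        conn.execute(f'CREATE TABLE "{table}" ({columns})')
        conn.executemany(
            f'INSERT INTO "{table}" VALUES ({placeholders})', data
        )
    return len(data)
```

Running the function once per delivered file keeps the local copy identical to the latest extract, with no change tracking on the client side.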

...

Does the Data Lake extract represent a complete copy of the VIP IMOS Data Dictionary?

No. The Data Dictionary is another name for our Report Designer schema (also known as the Data Map); it is the complete schema for BI reporting. The Data Lake uses a subset of this schema. Data Lake and Data Dictionary are different by design, since the Report Designer has predefined joins that are not relevant to a BI reporting scenario where you join the data yourself, outside of the system.
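To illustrate what joining the data yourself can look like once extracts are loaded into a client-side store, here is a small sketch using an in-memory SQLite database. The table names (`vessel`, `voyage`), their columns, and the sample rows are hypothetical and are not taken from the Data Lake schema; substitute the tables and keys your extracts actually contain.

```python
import sqlite3

# Hypothetical tables standing in for two loaded extracts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vessel (vessel_id INTEGER, vessel_name TEXT);
    CREATE TABLE voyage (voyage_id INTEGER, vessel_id INTEGER, status TEXT);
    INSERT INTO vessel VALUES (1, 'Example A'), (2, 'Example B');
    INSERT INTO voyage VALUES
        (10, 1, 'Commenced'), (11, 2, 'Completed'), (12, 1, 'Completed');
""")


def voyages_with_vessel_names(conn):
    """Join voyages to vessels on the shared key, outside the source system."""
    return conn.execute("""
        SELECT v.voyage_id, s.vessel_name, v.status
        FROM voyage AS v
        JOIN vessel AS s ON s.vessel_id = v.vessel_id
        ORDER BY v.voyage_id
    """).fetchall()


for row in voyages_with_vessel_names(conn):
    print(row)
```

The join condition lives in the client's own query here, which is why predefined Report Designer joins are not needed in the extract.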