Mainframe Data Virtualization

Mainframes have been, and continue to be, the backbone of many mission-critical workloads. There is a high chance that the backend systems that get invoked when you withdraw money from an ATM, make a hotel booking, reserve a flight, submit your medical bills, or file a tax claim are none other than these legacy systems. As a result, a wide variety of information is stored on them. Organizations have relied on these reliable and secure systems for decades, which is why the applications and data on them are voluminous, complex, diverse, and indispensable.

Over time, enterprises have stored data on multiple platforms and in multiple locations and formats, which has led to data fragmentation. The significant increase in the types and amount of data stored in different systems makes it difficult to get a holistic view of the data estate. Typical data sources include mainframe, on-premises systems, IoT, Big Data, and Cloud.

Mainframe data predates most other data and applications and even uses a different character set: EBCDIC rather than ASCII. These differences (data distribution, character set, data dependency) have made mainframe data hard and costly to access, and can turn it into a barrier to innovation, enhanced customer experience, analytics, insights, reporting, auditing, and informed decision-making. To access or integrate the Mainframe data, you can either keep the data in place or offload or replicate it. In this technical blog we will focus on data virtualization, where the data does not leave the Mainframe and can be accessed in multiple ways. We’ll start off by introducing what data virtualization is, followed by its benefits and common use cases.
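To make the character-set difference mentioned above concrete, here is a minimal Python sketch that encodes the same text in ASCII and in EBCDIC. It assumes EBCDIC code page 037 (Python's built-in cp037 codec), which is only one of several EBCDIC code pages, and the sample text is made up for illustration.

```python
# Illustrative only: the same text has different byte values in ASCII and EBCDIC.
# Assumption: EBCDIC code page 037; a real dataset may use a different code page.
text = "HELLO 123"

ascii_bytes = text.encode("ascii")
ebcdic_bytes = text.encode("cp037")

# Print both encodings as hex so the differing byte values are visible.
print("ASCII :", ascii_bytes.hex(" "))
print("EBCDIC:", ebcdic_bytes.hex(" "))

# The bytes only turn back into readable text when decoded with the matching code page.
round_trip = ebcdic_bytes.decode("cp037")
print(round_trip)
```

Any tool that reads mainframe records from a distributed platform has to perform this kind of conversion somewhere, which is part of what a virtualization layer hides from the consumer.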

What is Data Virtualization:

Data virtualization simplifies access to information on the mainframe and makes it appear to external systems as just another database or web service call. In general terms, data virtualization provides a unified virtual representation of data by integrating it from multiple disparate sources.

This allows data to be queried and manipulated through consolidated views, on demand or in batch. Data virtualization software simplifies access to data, messages, and even transactions by abstracting the storage, formats, and management complexities, without moving the data from its original source system. Using this approach, we can enable external clients to integrate with mainframe systems using products like IBM DVM (Data Virtualization Manager) or Rocket RDV (Rocket Data Virtualization). These products provide an extensive list of data access and transformation capabilities that can be customized based on customer requirements.
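The idea of a consolidated view over separate sources can be sketched in a few lines of Python. This is only an analogy, not the API of IBM DVM or Rocket RDV: two SQLite in-memory databases stand in for two independent data sources, and every table, column, and value below is hypothetical.

```python
import sqlite3

# Hypothetical stand-ins: two separate SQLite databases play the role of two data sources.
conn = sqlite3.connect(":memory:")                         # "main" is the first source
conn.execute("ATTACH DATABASE ':memory:' AS claims_src")   # second, independent source

conn.execute("CREATE TABLE main.customers (cust_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE claims_src.claims (claim_id INTEGER, cust_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO main.customers VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
conn.executemany(
    "INSERT INTO claims_src.claims VALUES (?, ?, ?)",
    [(10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.5)],
)

# The view stores no rows of its own; it is evaluated against both sources
# each time it is queried, so nothing is copied out of either one.
conn.execute("""
    CREATE TEMP VIEW customer_claims AS
    SELECT c.cust_id, c.name, SUM(k.amount) AS total_claimed
    FROM main.customers AS c
    JOIN claims_src.claims AS k ON k.cust_id = c.cust_id
    GROUP BY c.cust_id, c.name
""")

# The consumer issues ordinary SQL against the view and never sees the two sources.
rows = conn.execute(
    "SELECT name, total_claimed FROM customer_claims ORDER BY name"
).fetchall()
print(rows)
```

The consumer only knows about customer_claims. In a real deployment the virtualization server would play the role of the view definition, mapping mainframe sources to virtual tables.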

So, imagine data from multiple siloed sources combined and transformed (if needed) for deeper and broader analysis, data insights, and accessibility to distributed applications, driving business decisions, revenue, and competitive edge. Users don’t have to request data through multiple channels such as web portals, mobile apps, emails, or service tickets; instead, they get a 360-degree view with the information at their fingertips. The user need not care where the data comes from: it can be a combination of structured and unstructured data from sources like DB2 z/OS, IMS DB, SQL databases, Big Data, flat files, Cloud data, and Data Lakes. Let’s discuss some of the benefits in more detail below.

Benefits:

Simplify data access.

Virtualization products offer a wide variety of access methods and drivers (JSON, ODBC, JDBC, REST, SOAP, SQL, etc.) to simplify Mainframe data access, as well as to capture and publish mainframe data events. Once the Mainframe data sources are defined and declared, the virtualization server can run simple to complex queries across multiple data types and make the results available to external applications.

Cost effective.

There are many ways data virtualization can offer cost savings. For example, customers can save cost by replacing or optimizing ETL processes. The data virtualization solution also offloads processing to *zIIPs (IBM z Integrated Information Processors) wherever applicable, which reduces the MIPS (millions of instructions per second) consumed on the general-purpose processors (GPPs). Lower GPP MIPS consumption generally means lower cost. Processes running on GPPs are not affected by this offloading. Additionally, if the solution is used for native applications, it can help reduce their current MIPS consumption.

*zIIP is a mainframe specialty engine that can take eligible work off the mainframe general-purpose processor engines, depending on the operation. zIIP engines can deliver high performance at lower cost by handling specialized workloads like complex data queries over Java, Linux, and DB2.
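As a back-of-the-envelope illustration of the offload argument above, the sketch below computes how much of a workload stays on the general-purpose processors once a given share runs on zIIP. The function name and all numbers are hypothetical. Actual zIIP eligibility depends on the workload and the product, and real pricing models are more involved than a single MIPS figure.

```python
def gpp_mips_after_offload(workload_mips: float, ziip_eligible_fraction: float) -> float:
    """Return the MIPS left on general-purpose processors after the
    zIIP-eligible share of a workload has been offloaded."""
    if not 0.0 <= ziip_eligible_fraction <= 1.0:
        raise ValueError("ziip_eligible_fraction must be between 0 and 1")
    return workload_mips * (1.0 - ziip_eligible_fraction)


# Hypothetical figures for illustration only, not vendor or benchmark numbers.
query_workload_mips = 400.0
eligible_fraction = 0.75

remaining = gpp_mips_after_offload(query_workload_mips, eligible_fraction)
print(remaining)  # 100.0
```

With these made-up inputs, three quarters of the query workload no longer counts against GPP capacity, which is the effect the cost argument relies on.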

Secure data access.

Data is secured at rest and in transit. In transit, the data is encrypted using SSL/TLS, including TLS 1.2. For access to data at rest, the security and access credentials of the calling user are checked against the enterprise security manager in use (IBM RACF®, ACF2, or Top Secret). A request is granted or denied depending on the operations and roles the user is permitted to perform.
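On the client side, the in-transit requirement typically translates into refusing connections below TLS 1.2. The following is a minimal sketch using Python's standard ssl module. It only builds the TLS settings; how they are handed to a particular driver or HTTP client depends on that client and is not shown here.

```python
import ssl

# Default client context: certificate verification and hostname checking are enabled.
context = ssl.create_default_context()

# Refuse any protocol version older than TLS 1.2.
context.minimum_version = ssl.TLSVersion.TLSv1_2

print(context.minimum_version.name)
```

Server-side authorization (RACF, ACF2, or Top Secret) still decides what the authenticated user may do; the TLS settings only protect the data on the wire.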

Supports multiple Mainframe data types.

The data virtualization product can be used to process, format, and combine data from different sources on Mainframe. Some of those data sources are listed below.

Structured: DB2, Adabas/Natural, IDMS, IMS
Semi-structured: MQ, VSAM, Sequential
Operational: Syslog, SMF (System Management Facilities), Tape

Data abstraction for external users

Once the data is virtualized and formatted for the consuming applications, users do not need to know about the underlying data sources, the complex processing and formatting, or the infrastructure. Applications connect to the virtualized data as they would to any native data source. To create virtual tables and views, query data, create data, and perform other management tasks, users with the required level of access can use IDEs like Rocket Data Virtualization Studio and IBM Data Virtualization Manager Studio.

Join data on and off Mainframe.

Some data virtualization tools allow you to join sources close to the Mainframe. The data can be merged, formatted, or transformed before it is sent out, or it can be sent in its original form and later combined with other data sources outside the Mainframe. This can help optimize or reduce current ETL processes, in which data is first collected through long-running batch jobs and only later shared externally, an approach that can be time-consuming, inconsistent, and costly. The resulting data can then be used for reports, analytics, and complex queries.

Achieve high/similar performance.

Data virtualization products are designed to mitigate the performance issues that may arise from the volume and variety of data. Data optimization features such as parallel I/O enable multithreading, so that the results of complex queries are received as chunks of data in parallel, which reduces the elapsed time.
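The chunked, parallel retrieval described above can be illustrated with a small Python sketch. It is not how any specific product implements parallel I/O: the in-memory list stands in for a large result set, and fetch_chunk is a hypothetical placeholder for a real read against the source.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a large result set held on the source system.
RECORDS = list(range(1, 1001))


def chunk_ranges(total: int, chunk_size: int) -> list[tuple[int, int]]:
    """Split [0, total) into consecutive (start, end) ranges of at most chunk_size."""
    return [(start, min(start + chunk_size, total)) for start in range(0, total, chunk_size)]


def fetch_chunk(bounds: tuple[int, int]) -> list[int]:
    """Simulated I/O: return one slice of the result set."""
    start, end = bounds
    return RECORDS[start:end]


def parallel_fetch(total: int, chunk_size: int, workers: int = 4) -> list[int]:
    """Fetch all chunks on a thread pool and reassemble them in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order even if chunks complete out of order.
        chunks = pool.map(fetch_chunk, chunk_ranges(total, chunk_size))
        return [record for chunk in chunks for record in chunk]


result = parallel_fetch(len(RECORDS), chunk_size=128)
print(len(result))
```

Threads help here because each chunk request spends most of its time waiting on I/O, so several requests can be in flight at once while the caller still sees one ordered result.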

Use cases:

Now that we have covered what data virtualization (DV) is and what its advantages are, let’s touch upon the possible scenarios where DV might be a great fit. This is not a complete list; organizations often get creative and find unique business cases for these tools.

Common scenarios:

Application-first modernization
Security or compliance requirements where the data can’t leave the Mainframe or needs to meet certain security standards
Transactional data access
ETL replacement/optimization
Data warehouse augmentation
Near-real-time or real-time analytics and data insights

Conclusion:

Data virtualization unlocks mainframe data by providing simplified and secure data access. It abstracts the data and its management complexities without data movement or replication. The data becomes available to forecast or capture current market and business trends, to equip your workforce to make more data-driven decisions and personalize experiences, and to support your modernization and migration decisions. Using this capability, enterprises can enable types of solutions and access to mainframe data that were previously hard or impossible. The mainframe can now participate in your corporate data ecosystem and help enterprises deliver a stronger and more fulfilling experience. To learn more please reach out to [email protected].

Published on: July 07, 2023

Azure Migration and Modernization Blog articles
