

AWS Mainframe Modernization Service (Managed Runtime Environment experience) is no longer open to new customers. For capabilities similar to AWS Mainframe Modernization Service (Managed Runtime Environment experience), explore AWS Mainframe Modernization Service (Self-Managed Experience). Existing customers can continue to use the service as normal. For more information, see [AWS Mainframe Modernization availability change](https://docs.aws.amazon.com/m2/latest/userguide/mainframe-modernization-availability-change.html).

# AWS Transform for mainframe Runtime concepts
<a name="ba-shared-concept"></a>

Understanding the basic concepts of the AWS Transform for mainframe Runtime helps you see how your applications are modernized with automated refactoring.

**Topics**
+ [AWS Transform for mainframe Runtime high level architecture](ba-shared-architecture.md)
+ [AWS Transform for mainframe structure of a modernized application](ba-shared-structure.md)
+ [What are data simplifiers in AWS Transform for mainframe](ba-shared-data.md)
+ [AWS Transform for mainframe Blusam](ba-shared-blusam.md)
+ [AWS Transform for mainframe Blusam Administration Console](ba-shared-bac-userguide.md)

# AWS Transform for mainframe Runtime high level architecture
<a name="ba-shared-architecture"></a>

As part of the AWS Transform for mainframe solution for modernizing legacy programs to Java, the AWS Transform for mainframe Runtime provides a unified, REST-based entry point for modernized applications, and an execution framework for those applications, through libraries that supply legacy constructs and a standardized organization of program code.

Such modernized applications are the result of the AWS Transform for mainframe Automated Refactor process, which modernizes mainframe and midrange programs (referred to in this document as "legacy") to a web-based architecture.

The goals of the AWS Transform for mainframe Runtime are to reproduce the behavior of legacy programs (isofunctionality) and their performance (in terms of program execution time and resource consumption), and to make modernized programs easy for Java developers to maintain, through the use of familiar environments and idioms such as Tomcat, Spring, getters/setters, and fluent APIs.

**Topics**
+ [AWS Transform for mainframe runtime components](#ba-shared-architecture-components)
+ [Execution environments](#ba-shared-architecture-environments)
+ [Statelessness and session handling](#ba-shared-architecture-stateless)
+ [High availability and statelessness](#ba-shared-architecture-stateless-ha)

## AWS Transform for mainframe runtime components
<a name="ba-shared-architecture-components"></a>

The AWS Transform for mainframe Runtime environment is composed of two kinds of components:
+ A set of Java libraries (JAR files), often referred to as "the shared folder", that provide legacy constructs and statements.
+ A set of web application archives (WAR files) containing Spring-based web applications that provide a common set of frameworks and services to modernized programs.

The following sections detail the role of both of these components.

### AWS Transform for mainframe libraries
<a name="shared-architecture-components-libraries"></a>

The AWS Transform for mainframe libraries are a set of JAR files stored in a `shared/` subfolder added to the standard Tomcat classpath, so that they are available to all modernized Java programs. Their goal is to provide features that are neither natively nor easily available in the Java programming environment, but that are typical of legacy development environments. Those features are exposed in a way that is as familiar as possible to Java developers (getters/setters, class-based, fluent APIs). An important example is the **Data Simplifier** library, which provides legacy memory layout and manipulation constructs (encountered in the COBOL, PL/I, or RPG languages) to Java programs. These JARs are a core dependency of the modernized Java code generated from legacy programs. For more information about the Data Simplifier, see [What are data simplifiers in AWS Transform for mainframe](ba-shared-data.md).
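
To illustrate the kind of construct the Data Simplifier provides, the following is a hypothetical sketch (not the actual library API; all names are invented) of a fixed-layout, byte-buffer-backed record with fluent accessors, similar in spirit to a COBOL structure such as `01 CUSTOMER` with two `PIC X` fields:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: a fixed-layout record backed by a byte array,
// mimicking a COBOL structure such as:
//   01 CUSTOMER.  05 CUST-ID PIC X(5).  05 CUST-NAME PIC X(10).
public class CustomerRecord {
    private final byte[] buffer = new byte[15];   // total record length

    public CustomerRecord() {
        Arrays.fill(buffer, (byte) ' ');          // default padding byte
    }

    // Accessors read/write fixed offsets, as generated getters/setters
    // over a shared byte layout would.
    public String getCustId() { return field(0, 5); }
    public CustomerRecord setCustId(String v) { return set(0, 5, v); }   // fluent setter
    public String getCustName() { return field(5, 10); }
    public CustomerRecord setCustName(String v) { return set(5, 10, v); }

    private String field(int off, int len) {
        return new String(buffer, off, len, StandardCharsets.US_ASCII);
    }

    private CustomerRecord set(int off, int len, String v) {
        byte[] b = Arrays.copyOf(v.getBytes(StandardCharsets.US_ASCII), len);
        for (int i = v.length(); i < len; i++) b[i] = ' ';  // space-pad like PIC X
        System.arraycopy(b, 0, buffer, off, len);
        return this;
    }

    public static void main(String[] args) {
        CustomerRecord r = new CustomerRecord().setCustId("00042").setCustName("SMITH");
        System.out.println(r.getCustId() + "/" + r.getCustName().trim());
    }
}
```

The point of the sketch is the programming style: class-based, getter/setter-driven, fluent, while the underlying data still lives in a contiguous byte layout as it did on the legacy system.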

### Web application
<a name="ba-shared-architecture-components-webapp"></a>

Web Application Archives (WARs) are a standard way of deploying code and applications to the Tomcat application server. The ones provided as part of the AWS Transform for mainframe runtime supply a set of execution frameworks that reproduce legacy environments and transaction monitors (JCL batches, CICS, IMS, and so on), together with the services they require.

The most important one is `gapwalk-application` (often shortened to "Gapwalk"), which provides a unified set of REST-based entry points to trigger and control the execution of transactions, programs, and batches. For more information, see [AWS Transform for mainframe Runtime APIs](ba-runtime-endpoints.md).

This web application allocates Java execution threads and resources to run modernized programs in the context for which they were designed. Examples of such reproduced environments are detailed in the following section.

Other web applications add to the execution environment (more precisely, to the "Programs Registry" described below) programs emulating the ones available to, and callable from, legacy programs. Two important categories are:
+ Emulation of OS-provided programs: JCL-driven batches especially expect to be able to call a variety of file and database manipulating programs as part of their standard environment. Examples include `SORT`/`DFSORT` or `IDCAMS`. For this purpose, Java programs are provided that reproduce such behavior, and are callable using the same conventions as the legacy ones.
+ "Drivers", which are specialized programs provided by the execution framework or middleware as entry points. An example is `CBLTDLI`, which COBOL programs executing in the IMS environment depend on to access IMS-related services (IMS DB, user dialog through MFS, etc.).

### Programs registry
<a name="ba-shared-architecture-components-registry"></a>

To participate in and take advantage of those constructs, frameworks, and services, Java programs modernized from legacy ones adhere to a specific structure documented in [AWS Transform for mainframe structure of a modernized application](ba-shared-structure.md). At startup, the AWS Transform for mainframe Runtime collects all such programs in a common "Programs Registry" so that they can be invoked (and call each other) afterwards. The Programs Registry provides loose coupling and opens possibilities for decomposition, since programs that call each other do not have to be modernized simultaneously.
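
As an illustration of the loose-coupling idea (this is not the actual runtime API; all names are invented), a minimal registry could look like this:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: a minimal "programs registry" showing the
// loose-coupling idea. Names and signatures are simplified, not the
// actual AWS Transform for mainframe runtime API.
public class RegistryDemo {
    interface Program { String run(String input); }

    static final Map<String, Program> REGISTRY = new HashMap<>();

    static void register(String identifier, Program program) {
        REGISTRY.put(identifier, program);
    }

    // Callers resolve the callee by identifier at call time, so the two
    // programs never reference each other's classes directly.
    static String call(String identifier, String input) {
        Program p = REGISTRY.get(identifier);
        if (p == null) throw new IllegalStateException("Unknown program: " + identifier);
        return p.run(input);
    }

    public static void main(String[] args) {
        register("CBACT04C", in -> "processed:" + in);
        // The caller only knows the legacy identifier, not the Java class.
        System.out.println(call("CBACT04C", "ACCT-1"));
    }
}
```

Because resolution happens by identifier at call time, a registered Java program and a still-legacy caller (routed through the runtime) never need to know each other's implementation, which is what makes incremental decomposition possible.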

## Execution environments
<a name="ba-shared-architecture-environments"></a>

Frequently encountered legacy environments and choreographies are available:
+ JCL-driven batches, once modernized to Java programs and Groovy scripts, can be started in a synchronous (blocking) or asynchronous (detached) way. In the latter case, their execution can be monitored through REST endpoints.
+ An AWS Transform for mainframe subsystem provides an execution environment similar to CICS through:
  + an entry point used to start a CICS transaction and run associated programs while respecting CICS "run levels" choreography,
  + an external storage for Resource Definitions,
  + a homogeneous set of Java fluent APIs reproducing `EXEC CICS` statements,
  + a set of pluggable classes reproducing CICS services, such as Temporary Storage Queues, Temporary Data Queues or files access (multiple implementations are usually available, such as Amazon Managed Service for Apache Flink, Amazon Simple Queue Service, or RabbitMQ for TD Queues),
  + for user-facing applications, the BMS screen description format is modernized to an Angular web application, and the corresponding "pseudo-conversational" dialog is supported.
+ Similarly, another subsystem provides IMS message-based choreography, and supports modernization of UI screens in the MFS format.
+ In addition, a third subsystem allows execution of programs in an iSeries-like environment, including modernization of DSPF (Display File)-specified screens.

All of those environments build upon common OS-level services such as:
+ the emulation of legacy memory allocation and layout (**Data Simplifier**),
+ Java thread-based reproduction of the COBOL "run units" execution and parameters passing mechanism (`CALL` statement),
+ emulation of flat, concatenated, VSAM (through the **Blusam** set of libraries), and GDG Data Set organizations,
+ access to data stores, such as RDBMS (`EXEC SQL` statements).

## Statelessness and session handling
<a name="ba-shared-architecture-stateless"></a>

An important feature of the AWS Transform for mainframe Runtime is to enable High Availability (HA) and horizontal scalability scenarios when executing modernized programs.

The cornerstone for this is statelessness, an important example of which is HTTP session handling.

### Session handling
<a name="ba-shared-architecture-stateless-session"></a>

Because Tomcat is web-based, an important mechanism for this is HTTP session handling (as provided by Tomcat and Spring) combined with stateless design. This stateless design is based on the following:
+ users connect through HTTPS,
+ application servers are deployed behind a load balancer,
+ when a user first connects to the application, they are authenticated, and the application server creates an identifier (typically stored in a cookie),
+ this identifier is used as a key to save and retrieve the user context to/from an external cache (data store).

Cookie management is handled automatically by the AWS Transform for mainframe framework, the underlying Tomcat server, and the user's browser; it is transparent to the user.

The Gapwalk web application may store the session state (the context) in various data stores:
+ Amazon ElastiCache (Redis OSS)
+ Redis cluster
+ an in-memory map (only for development and standalone environments; not suitable for HA).
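
The pattern above can be sketched in plain Java. This is an illustrative sketch only: the map stands in for an external store such as ElastiCache (Redis OSS), and all names are invented:

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the stateless session pattern: the server keeps no local user
// state; a cookie-style identifier keys the user context in an external
// store (here a map stands in for Redis/ElastiCache).
public class SessionStoreDemo {
    static final Map<String, Map<String, String>> EXTERNAL_STORE = new ConcurrentHashMap<>();

    // On first connection: authenticate, then mint the identifier the
    // server hands back to the browser in a cookie.
    static String openSession() {
        String sessionId = UUID.randomUUID().toString();
        EXTERNAL_STORE.put(sessionId, new ConcurrentHashMap<>());
        return sessionId;
    }

    // Any node can serve the next request: it only needs the identifier
    // to fetch the user context from the shared store.
    static Map<String, String> context(String sessionId) {
        return EXTERNAL_STORE.get(sessionId);
    }

    public static void main(String[] args) {
        String id = openSession();                        // "node A" creates the session
        context(id).put("lastTransaction", "COSGN00C");   // "node A" writes the context
        System.out.println(context(id).get("lastTransaction")); // "node B" reads it back
    }
}
```

Because the context lives outside the server, any node holding the identifier can reconstruct the user's state, which is exactly what allows the load balancer to route consecutive requests to different Tomcat instances.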

## High availability and statelessness
<a name="ba-shared-architecture-stateless-ha"></a>

More generally, a design tenet of the AWS Transform for mainframe framework is statelessness: most non-transient state required to reproduce legacy program behavior is not stored inside the application servers, but shared through an external, common "single source of truth".

Examples of such states are CICS's Temporary Storage Queues or Resource Definitions, and typical external storages for those are Redis-compatible servers or relational databases.

This design, combined with load balancing and shared sessions, means that most user-facing dialog (OLTP, "Online Transactional Processing") can be distributed between multiple "nodes" (here, Tomcat instances).

Indeed, a user may execute a transaction on any server without caring whether the next transaction call is performed on a different server. When a new server is spawned (because of auto scaling, or to replace an unhealthy server), any reachable and healthy server is guaranteed to run the transaction as expected, with the proper results (expected return value, expected data changes in the database, and so on).

# AWS Transform for mainframe structure of a modernized application
<a name="ba-shared-structure"></a>

This document provides details about the structure of modernized applications (produced using the AWS Transform for mainframe refactoring tools), so that developers can accomplish various tasks, such as:
+ navigating applications smoothly.
+ developing custom programs that can be called from the modernized applications.
+ safely refactoring modernized applications.

We assume that you already have basic knowledge about the following:
+ common legacy coding concepts, such as records, data sets and their record access modes (indexed, sequential), VSAM, run units, JCL scripts, CICS concepts, and so on.
+ Java coding using the [Spring framework](https://spring.io/projects/spring-framework).
+ Throughout the document, we use `short class names` for readability. For more information, see [AWS Transform for mainframe fully qualified name mappings](#ba-shared-structure-fqn-table) to retrieve the corresponding fully qualified names for AWS Transform for mainframe runtime elements and [Third party fully qualified name mappings](#ba-shared-structure-3pfqn-table) to retrieve the corresponding fully qualified names for third party elements.
+ All artifacts and samples are taken from the modernization process outputs of the sample COBOL/CICS [CardDemo application](https://github.com/aws-samples/aws-mainframe-modernization-carddemo).

**Topics**
+ [Artifacts organization](#ba-shared-structure-org)
+ [Running and calling programs](#ba-shared-structure-run-call)
+ [Write your own program](#ba-shared-structure-write)
+ [Fully qualified name mappings](#ba-shared-structure-fqn)

## Artifacts organization
<a name="ba-shared-structure-org"></a>

AWS Transform for mainframe modernized applications are packaged as Java web applications (.war) that you can deploy on a JEE server. Typically, the server is a [Tomcat](https://tomcat.apache.org/) instance that embeds the AWS Transform for mainframe Runtime, which is currently built upon the [Spring Boot](https://spring.io/projects/spring-boot) and [Angular](https://angular.io/) (for the UI part) frameworks.

The WAR aggregates several component artifacts (.jar). Each JAR is the result of compiling (using the [Maven](https://maven.apache.org/) tool) a dedicated Java project whose elements are produced by the modernization process.

![\[Sample modernized application artifacts.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/modernized_application_artifacts.png)


The basic organization relies on the following structure:
+ Entities project: contains business model and context elements. The project name generally ends with "-entities". Typically, for a given legacy COBOL program, this corresponds to the modernization of the I/O section (data sets) and the data division. You can have more than one entities project.
+ Service project: contains legacy business logic modernization elements. Typically, the procedure division of a COBOL program. You can have more than one service project.
+ Utility project: contains shared common tools and utilities, used by other projects.
+ Web project: contains the modernization of UI-related elements when applicable. Not used for batch-only modernization projects. These UI elements could come from CICS BMS maps, IMS MFS components, and other mainframe UI sources. You can have more than one Web project.

### Entities project contents
<a name="ba-shared-structure-org-entities"></a>

**Note**  
The following descriptions only apply to COBOL and PL/I modernization outputs. RPG modernization outputs are based on a different layout.

Before any refactoring, the package organization in the entities project is tied to the modernized programs. You can refactor in a couple of different ways. The preferred way is to use the Refactoring toolbox, which operates before you trigger the code generation mechanism. This is an advanced operation, which is explained in the AWS Transform for mainframe trainings. For more information, see [Refactoring workshop](https://catalog.workshops.aws/aws-blu-age-l3-certification-workshop/en-US/refactoring). This approach preserves the capability to re-generate the Java code later (to benefit from further improvements in the future, for instance). The other way is to do regular Java refactoring, directly on the generated source code, using any Java refactoring approach you might like to apply -- at your own risk.

![\[Sample program CBACT04C entities packages.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/entities_packages.png)


#### Program related classes
<a name="ba-shared-structure-org-entities-program"></a>

Each modernized program is related to two packages, a business.context and a business.model package.
+ `base package.program.business.context`

  The business.context sub-package contains two classes, a configuration class and a context class.
  + One configuration class for the program, which contains specific configuration details for the given program, such as the character set to use to represent character-based data elements, the default byte value for padding data structure elements and so on. The class name ends with "Configuration". It is marked with the `@org.springframework.context.annotation.Configuration` annotation and contains a single method that must return a properly setup `Configuration` object.  
![\[Sample configuration in Java.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/sample_configuration.png)
  + One context class, which serves as a bridge between the program service classes (see below) and the data structures (`Record`) and data sets (`File`) from the model sub-package (see below). The class name ends with "Context" and is a subclass of the `RuntimeContext` class.  
![\[Sample context class (partial view)\]](http://docs.aws.amazon.com/m2/latest/userguide/images/sample_context.png)
+ `base package.program.business.model`

  The model sub-package contains all the data structures that the given program can use. For instance, any 01 level COBOL data structure corresponds to a class in the model sub-package (lower level data structures are properties of their owning 01 level structure). For more information about how we modernize 01 data structures, see [What are data simplifiers in AWS Transform for mainframe](ba-shared-data.md).  
![\[Sample record entity (partial view)\]](http://docs.aws.amazon.com/m2/latest/userguide/images/sample_record_entity.png)

All classes extend the `RecordEntity` class, which represents the access to a business record representation. Some of the records have a special purpose, as they're bound to a `File`. The binding between a `Record` and a `File` is made in the corresponding \$1FileHandler methods found in the context class when creating the file object. For example, the following listing shows how the TransactfileFile `File` is bound to the transactFile `Record` (from the model sub-package).

![\[Sample record to file binding.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/sample_record_file_binding.png)
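
The binding idea can be sketched in plain Java. The class and method names here are invented for illustration; the real runtime types (`Record`, `File`, the file handler methods) have richer APIs:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of the record-to-file binding idea: the file object
// is created against one record instance, and each read refreshes that
// record's content. All names are illustrative, not the runtime API.
public class FileBindingDemo {
    static class Record {
        String value = "";
    }

    static class File {
        private final Record boundRecord;             // the binding made at construction
        private final Deque<String> storage = new ArrayDeque<>();

        File(Record record) { this.boundRecord = record; }

        void write(String data) { storage.addLast(data); }

        // Reading loads the next data set record into the bound Record.
        boolean readNext() {
            if (storage.isEmpty()) return false;
            boundRecord.value = storage.removeFirst();
            return true;
        }
    }

    public static void main(String[] args) {
        Record transactRecord = new Record();
        File transactFile = new File(transactRecord);  // bind record to file
        transactFile.write("TXN-0001");
        transactFile.readNext();
        System.out.println(transactRecord.value);      // record now holds the read data
    }
}
```

This mirrors the legacy COBOL pattern where a READ on a data set populates the record described in the FD entry, rather than returning a fresh object.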


### Service project contents
<a name="ba-shared-structure-org-service"></a>

Every service project comes with a dedicated [Spring Boot](https://spring.io/projects/spring-boot) application, which is used as the backbone of the architecture. This is materialized through the class named `SpringBootLauncher`, located in the base package of the service Java sources:

![\[Service project SpringBoot application.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/springbootlauncher.png)


This class is notably responsible for:
+ linking the program classes and managed resources (data sources, transaction managers, data set mappings, and so on).
+ providing a `ConfigurableApplicationContext` to programs.
+ discovering all classes marked as Spring components (`@Component`).
+ ensuring programs are properly registered in the `ProgramRegistry` -- see the `initialize` method in charge of this registration.

![\[Programs registration.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/programs_registration.png)


#### Program related artifacts
<a name="ba-shared-structure-org-service-program"></a>

Without prior refactoring, the business logic modernization outputs are organized into two or three packages per legacy program:

![\[Sample program packages.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/sample_program_packages.png)


The most exhaustive case will have three packages:
+ `base package.program.service`: contains an interface named *Program*Process, which has business methods to handle the business logic, preserving the legacy execution control flow.
+ `base package.program.service.impl`: contains a class named *Program*ProcessImpl, which is the implementation of the Process interface described previously. This is where the legacy statements are "translated" to java statements, relying on the AWS Transform for mainframe framework:  
![\[Sample modernized CICS statements (SEND MAP, RECEIVE MAP)\]](http://docs.aws.amazon.com/m2/latest/userguide/images/sample_cics_statements.png)
+ `base package.program.statemachine`: this package might not always be present. It is required when the modernization of the legacy control flow has to use a state machine approach (namely using the [Spring StateMachine framework](https://spring.io/projects/spring-statemachine)) to properly cover the legacy execution flow.

  In that case, the statemachine sub-package contains two classes:
  + `ProgramProcedureDivisionStateMachineController`: a class that extends a class implementing the `StateMachineController` (defines the operations needed to control the execution of a state machine) and `StateMachineRunner` (defines the operations required to run a state machine) interfaces, used to drive the Spring state machine mechanics; for instance, the `SimpleStateMachineController` in the sample case.  
![\[Sample state machine controller.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/sample_statemachine_controller.png)

    The state machine controller defines the possible different states and the transitions between them, which reproduce the legacy execution control flow for the given program.

    When building the state machine, the controller refers to methods that are defined in the associated service class located in the state machine package and described below:

    ```
    subConfigurer.state(States._0000_MAIN, buildAction(() -> {stateProcess._0000Main(lctx, ctrl);}), null);
    subConfigurer.state(States.ABEND_ROUTINE, buildAction(() -> {stateProcess.abendRoutine(lctx, ctrl);}), null);
    ```
  + `ProgramProcedureDivisionStateMachineService`: this service class represents some business logic that is required to be bound with the state machine that the state machine controller creates, as described previously.

    The code in the methods of this class uses the events defined in the state machine controller:  
![\[Statemachine service using a statemachine controller event.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/service_using_event_1.png)  
![\[Statemachine service using a statemachine controller event.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/service_using_event_2.png)

    The statemachine service also makes calls to the process service implementation described previously:  
![\[.statemachine service making calls to the process implementation\]](http://docs.aws.amazon.com/m2/latest/userguide/images/service_using_processimpl.png)
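
As a rough sketch of the controller/service split described above (plain Java rather than Spring StateMachine; the states, events, and transitions are illustrative):

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.function.Supplier;

// Minimal sketch (not Spring StateMachine): states and transitions
// reproducing a legacy control flow, with the "service" side supplying
// the event that drives each transition. All names are illustrative.
public class StateMachineSketch {
    enum State { _0000_MAIN, ABEND_ROUTINE, DONE }
    enum Event { OK, ABEND }

    // "Controller" side: for each state, which state each event leads to.
    static final Map<State, Map<Event, State>> TRANSITIONS = new EnumMap<>(State.class);
    static {
        TRANSITIONS.put(State._0000_MAIN,
                Map.of(Event.OK, State.DONE, Event.ABEND, State.ABEND_ROUTINE));
        TRANSITIONS.put(State.ABEND_ROUTINE, Map.of(Event.OK, State.DONE));
    }

    // "Service" side: the business method bound to each state returns an event.
    static State run(Map<State, Supplier<Event>> actions) {
        State current = State._0000_MAIN;
        while (current != State.DONE) {
            Event e = actions.get(current).get();        // run the state's logic
            current = TRANSITIONS.get(current).get(e);   // follow the transition
        }
        return current;
    }

    public static void main(String[] args) {
        Map<State, Supplier<Event>> actions = new EnumMap<>(State.class);
        actions.put(State._0000_MAIN, () -> Event.ABEND);  // main paragraph abends
        actions.put(State.ABEND_ROUTINE, () -> Event.OK);  // abend routine recovers
        System.out.println(run(actions));
    }
}
```

The separation mirrors the generated classes: the transition table plays the role of the state machine controller, while the per-state suppliers play the role of the state machine service methods.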

In addition, a package named `base package.program` plays a significant role, as it gathers one class per program, which serves as the program entry point (more details about this later on). Each class implements the `Program` interface, the marker for a program entry point.

![\[Program entry points\]](http://docs.aws.amazon.com/m2/latest/userguide/images/programs.png)


#### Other artifacts
<a name="ba-shared-structure-org-service-other"></a>
+ BMS MAPs companions

  In addition to program related artifacts, the service project can contain other artifacts for various purposes. In the case of the modernization of a CICS online application, the modernization process produces JSON files and puts them in the map folder of the /src/main/resources folder:  
![\[BMS MAPs json files in resources folder.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/maps_json_files.png)

  The AWS Transform for mainframe runtime consumes those JSON files to bind the records used by the SEND MAP statement with the screen fields.
+ Groovy Scripts

  If the legacy application had JCL scripts, those have been modernized as [Groovy](https://groovy-lang.org/) scripts, stored in the /src/main/resources/scripts folder (more on that specific location later on):  
![\[groovy scripts (JCL modernization)\]](http://docs.aws.amazon.com/m2/latest/userguide/images/groovy_scripts.png)

  Those scripts are used to launch batch jobs (dedicated, non-interactive, CPU-intensive data processing workloads).
+ SQL files

  If the legacy application used SQL queries, the corresponding modernized SQL queries have been gathered in dedicated properties files, with the naming pattern *program*.sql, where *program* is the name of the program using those queries.  
![\[SQL files in the resources folder\]](http://docs.aws.amazon.com/m2/latest/userguide/images/sql_files.png)

  The contents of those SQL files are a collection of (key=query) entries, where each query is associated with a unique key that the modernized program uses to run the given query:  
![\[Sample sql file that the modernized program uses.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/sample_sql_file.png)

  For instance, the COSGN00C program executes the query with key "COSGN00C\$11" (the first entry in the SQL file):  
![\[sample query usage by program\]](http://docs.aws.amazon.com/m2/latest/userguide/images/sample_sql_query_usage.png)
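
The (key=query) lookup pattern can be sketched with the standard `java.util.Properties` format. The key and query text below are illustrative, not taken from the actual CardDemo outputs:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Sketch of the (key=query) lookup pattern: the program loads its
// program.sql properties file and retrieves a query by its unique key.
// The key and query below are illustrative.
public class SqlFileDemo {
    static String lookup(String sqlFileContent, String key) {
        Properties queries = new Properties();
        try {
            queries.load(new StringReader(sqlFileContent)); // one key=query per entry
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return queries.getProperty(key);
    }

    public static void main(String[] args) {
        // Hypothetical entry; the first '=' separates key from query text.
        String sqlFile = "COSGN00C$11=SELECT SEC_USR_PWD FROM USRSEC WHERE SEC_USR_ID = ?";
        System.out.println(lookup(sqlFile, "COSGN00C$11"));
    }
}
```

Note that in the properties format only the first `=` delimits the key; any later `=` characters belong to the query text, which is why SQL predicates survive the round trip.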

### Utilities project contents
<a name="ba-shared-structure-org-utilities"></a>

The utilities project, whose name ends with "-tools", contains a set of technical utilities, which might be used by all the other projects.

![\[Utilities project contents\]](http://docs.aws.amazon.com/m2/latest/userguide/images/tools_project.png)


### Web project(s) contents
<a name="ba-shared-structure-org-web"></a>

The web project is only present when modernizing legacy UI elements. The modern UI elements used to build the modernized application front-end are based on [Angular](https://angular.io/). The sample application used to show the modernization artifacts is a COBOL/CICS application, running on a mainframe. The CICS system uses MAPs to represent the UI screens. The corresponding modern elements will be, for every map, a html file accompanied by [Typescript](https://www.typescriptlang.org/) files:

![\[Sample CICS maps modernized to Angular\]](http://docs.aws.amazon.com/m2/latest/userguide/images/sample_cics_maps_angular.png)


The web project only takes care of the front-end aspect of the application. The service project, which relies on the utility and entities projects, provides the backend services. The link between the front end and the backend is made through the web application named Gapwalk-Application, which is part of the standard AWS Transform for mainframe runtime distribution.

## Running and calling programs
<a name="ba-shared-structure-run-call"></a>

On legacy systems, programs are compiled as stand-alone executables that can call each other through a CALL mechanism, such as the COBOL CALL statement, passing arguments when needed. The modernized applications offer the same capability but use a different approach, because the nature of the involved artifacts differs from the legacy ones.

On the modernized side, program entry points are specific classes that implement the `Program` interface, are Spring components (`@Component`), and are located in service projects, in a package named `base package.program`.

### Programs registration
<a name="ba-shared-structure-run-call-register-programs"></a>

Each time the [Tomcat](https://tomcat.apache.org/) server that hosts modernized applications is started, the service Spring Boot application is also started, which triggers program registration. A dedicated registry named `ProgramRegistry` is populated with program entries. Each program is registered under its identifiers, one entry per known program identifier: if a program is known by several different identifiers, the registry contains as many entries as there are identifiers.

The registration for a given program relies on the collection of identifiers returned by the `getProgramIdentifiers()` method:

![\[sample program (partial view)\]](http://docs.aws.amazon.com/m2/latest/userguide/images/sample_program.png)


In this example, the program is registered once, under the name 'CBACT04C' (look at the contents of the programIdentifiers collection). The Tomcat logs show every program registration. The program registration only depends on the declared program identifiers, not on the program class name itself (though typically the program identifiers and program class names are aligned).
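
The identifier-based registration described above can be sketched as follows (class and method names are simplified stand-ins, not the actual runtime API):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of identifier-based registration (simplified names): one
// registry entry is created per identifier the program declares.
public class RegistrationDemo {
    interface Program {
        List<String> getProgramIdentifiers();
    }

    static Map<String, Program> register(List<Program> discovered) {
        Map<String, Program> registry = new HashMap<>();
        for (Program p : discovered) {
            // A program known under several identifiers yields several
            // entries, all pointing at the same program instance.
            for (String id : p.getProgramIdentifiers()) {
                registry.put(id, p);
            }
        }
        return registry;
    }

    public static void main(String[] args) {
        Program cbact04c = () -> List.of("CBACT04C");
        Map<String, Program> registry = register(List.of(cbact04c));
        System.out.println(registry.size() + " entry: " + registry.containsKey("CBACT04C"));
    }
}
```

A program declaring two identifiers would produce two entries resolving to the same instance, which is why the registry size is tied to identifiers rather than to classes.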

The same registration mechanism applies to utility programs brought by the various utility AWS Transform for mainframe web applications, which are part of the AWS Transform for mainframe runtime distribution. For instance, the Gapwalk-Utility-Pgm webapp provides the functional equivalents of the z/OS system utilities (IDCAMS, ICEGENER, SORT, and so on) and can be called by modernized programs or scripts. All available utility programs that are registered at Tomcat startup are logged in the Tomcat logs.

### Scripts and daemons registration
<a name="ba-shared-structure-run-call-register-scripts"></a>

A similar registration process occurs at Tomcat startup time for Groovy scripts located in the /src/main/resources/scripts folder hierarchy. The scripts folder hierarchy is traversed, and all Groovy scripts that are discovered (except the reserved functions.groovy script) are registered in the `ScriptRegistry`, using their short name (the part of the script file name before the first dot character) as the key for retrieval.

**Note**  
If several scripts have file names that produce the same registration key, only the latest one is registered, overwriting any previously encountered registration for that key.
Because the registration mechanism flattens the folder hierarchy, pay attention when using sub-folders: the hierarchy does not count in the registration process, so /scripts/A/myscript.groovy and /scripts/B/myscript.groovy produce the same key, and /scripts/B/myscript.groovy overwrites /scripts/A/myscript.groovy.
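
The key derivation and the overwrite effect can be sketched as follows (a plain-Java illustration, not the actual `ScriptRegistry` implementation):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the script registration key: the short name is the file name
// part before the first dot, and the folder hierarchy is ignored, so two
// scripts with the same short name collide.
public class ScriptKeyDemo {
    static String registrationKey(String path) {
        String fileName = path.substring(path.lastIndexOf('/') + 1); // drop folders
        int dot = fileName.indexOf('.');
        return dot < 0 ? fileName : fileName.substring(0, dot);      // before first dot
    }

    public static void main(String[] args) {
        Map<String, String> scriptRegistry = new HashMap<>();
        // Same short name: the second registration overwrites the first.
        scriptRegistry.put(registrationKey("/scripts/A/myscript.groovy"), "/scripts/A/myscript.groovy");
        scriptRegistry.put(registrationKey("/scripts/B/myscript.groovy"), "/scripts/B/myscript.groovy");
        System.out.println(scriptRegistry.get("myscript"));
    }
}
```

Running the example shows that only the /scripts/B variant remains retrievable under the key `myscript`.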

The Groovy scripts in the /src/main/resources/daemons folder are handled slightly differently. They're still registered as regular scripts, but in addition, they are launched once, asynchronously, at Tomcat startup time.

After scripts are registered in the `ScriptRegistry`, a REST call can launch them, using the dedicated endpoints that the Gapwalk-Application exposes. For more information, see the corresponding documentation.

### Programs calling programs
<a name="ba-shared-structure-run-call-programs"></a>

Each program can call another program as a subprogram, passing parameters to it. Programs use an implementation of the `ExecutionController` interface to do so (most of the time, this will be an `ExecutionControllerImpl` instance), along with a fluent API mechanism named the `CallBuilder` to build the program call arguments.

All program methods take both a `RuntimeContext` and an `ExecutionController` as arguments, so an `ExecutionController` is always available to call other programs.

See, for instance, the following diagram, which shows how the CBSTM03A program calls the CBSTM03B program as a sub-program, passing parameters to it:

![\[.sub-program call sample\]](http://docs.aws.amazon.com/m2/latest/userguide/images/subprogram_call_sample.png)

+ The first argument of `ExecutionController.callSubProgram` is an identifier of the program to call (that is, one of the identifiers used for the program registration -- see the paragraphs above).
+ The second argument, which is the result of the build on the `CallBuilder`, is an array of `Record`, corresponding to the data passed from caller to callee.
+ The third and last argument is the caller `RuntimeContext` instance.

All three arguments are mandatory and cannot be null, but the second argument can be an empty array.
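
The fluent pattern can be sketched in simplified form (plain `Object` stands in for `Record`/`RecordAdaptable`; this is not the actual `CallBuilder` API):

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of the fluent argument-building pattern. The real
// CallBuilder works with Record/RecordAdaptable instances; plain Object
// stands in for them here.
public class CallBuilderSketch {
    static class CallBuilder {
        private final List<Object> args = new ArrayList<>();

        CallBuilder byReference(Object record) {
            args.add(record);                 // callee mutations visible to caller
            return this;                      // fluent: each method returns the builder
        }

        CallBuilder byValue(Object value) {
            args.add(String.valueOf(value));  // stand-in for a defensive copy
            return this;
        }

        Object[] getArguments() {
            return args.toArray();            // ordered array handed to the callee
        }
    }

    public static void main(String[] args) {
        Object sharedRecord = new StringBuilder("CUSTREC");
        Object[] built = new CallBuilder()
                .byReference(sharedRecord)    // caller and callee share this one
                .byValue(42)                  // callee gets a copy
                .getArguments();
        System.out.println(built.length);
    }
}
```

The order in which the fluent methods are chained is the order of the resulting array, which must match the callee's linkage layout.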

The callee will be able to deal with passed parameters only if it was originally designed to do so. For a legacy COBOL program, that means having a LINKAGE section and a USING clause for the procedure division to make use of the LINKAGE elements.

For instance, see the corresponding [CBSTM03B.CBL](https://github.com/aws-samples/aws-mainframe-modernization-carddemo/blob/main/app/cbl/CBSTM03B.CBL) COBOL source file:

![\[Sample linkage in a COBOL source file\]](http://docs.aws.amazon.com/m2/latest/userguide/images/linkage_sample.png)


So the CBSTM03B program takes a single `Record` as a parameter (an array of size 1). This is what the `CallBuilder` builds, by chaining the byReference() and getArguments() methods.

The `CallBuilder` fluent API class has several methods available to populate the array of arguments to pass to a callee:
+ `asPointer(RecordAdaptable)`: adds an argument of pointer kind, by reference. The pointer represents the address of a target data structure.
+ `byReference(RecordAdaptable)`: adds an argument by reference. The caller will see the modifications that the callee performs.
+ `byReference(RecordAdaptable...)`: varargs variant of the previous method.
+ `byValue(Object)`: adds an argument, transformed to a `Record`, by value. The caller won't see the modifications that the callee performs.
+ `byValue(RecordAdaptable)`: same as the previous method, but the argument is directly available as a `RecordAdaptable`.
+ `byValueWithBounds(Object, int, int)`: adds an argument, transformed to a `Record`, by value, extracting the byte array portion defined by the given bounds.

Finally, the getArguments method will collect all added arguments and return them as an array of `Record`.

**Note**  
It is the responsibility of the caller to make sure that the arguments array has the required size, and that the items are properly ordered and compatible, in terms of memory layout, with the expected layouts of the linkage elements.
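The by-reference and by-value semantics described above can be illustrated with a self-contained sketch. The `MiniRecord` and `MiniCallBuilder` classes below are hypothetical miniatures, not the actual gapwalk `Record` and `CallBuilder` classes; they only demonstrate why a caller sees by-reference modifications but not by-value ones.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CallSemanticsSketch {

    /** Minimal stand-in for a Record: a fixed-size byte array. */
    public static final class MiniRecord {
        private final byte[] bytes;
        public MiniRecord(byte[] bytes) { this.bytes = bytes; }
        public byte[] bytes() { return bytes; }
    }

    /** Fluent builder collecting arguments, mimicking the documented chaining style. */
    public static final class MiniCallBuilder {
        private final List<MiniRecord> arguments = new ArrayList<>();

        /** By reference: the callee works on the caller's byte array. */
        public MiniCallBuilder byReference(MiniRecord record) {
            arguments.add(record);
            return this;
        }

        /** By value: the callee works on a copy; the caller won't see modifications. */
        public MiniCallBuilder byValue(MiniRecord record) {
            arguments.add(new MiniRecord(Arrays.copyOf(record.bytes(), record.bytes().length)));
            return this;
        }

        public MiniRecord[] getArguments() {
            return arguments.toArray(new MiniRecord[0]);
        }
    }

    /** A "callee" that overwrites the first byte of each received record. */
    public static void callee(MiniRecord[] args) {
        for (MiniRecord arg : args) {
            arg.bytes()[0] = (byte) 'X';
        }
    }

    public static void main(String[] args) {
        MiniRecord byRef = new MiniRecord("AAAA".getBytes());
        MiniRecord byVal = new MiniRecord("BBBB".getBytes());

        callee(new MiniCallBuilder().byReference(byRef).byValue(byVal).getArguments());

        // The caller sees the by-reference modification only.
        System.out.println(new String(byRef.bytes())); // XAAA
        System.out.println(new String(byVal.bytes())); // BBBB
    }
}
```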

### Scripts calling programs
<a name="ba-shared-structure-run-call-scripts"></a>

Calling registered programs from Groovy scripts requires a class instance implementing the `MainProgramRunner` interface. Usually, such an instance is obtained through Spring's ApplicationContext:

![\[MainProgramRunner: getting an instance\]](http://docs.aws.amazon.com/m2/latest/userguide/images/mpr.png)


After a `MainProgramRunner` instance is available, use the runProgram method to call a program, passing the identifier of the target program as a parameter:

![\[MainProgramRunner : running a program\]](http://docs.aws.amazon.com/m2/latest/userguide/images/mpr_runprogram.png)


In the previous example, a job step calls IDCAMS (a file-handling utility program), providing a mapping between actual data set definitions and their logical identifiers.

Legacy programs mostly use logical names to identify data sets. When a program is called from a script, the script must map the logical names to actual physical data sets. These data sets could be on the filesystem, in Blusam storage, or even defined by an inline stream, the concatenation of several data sets, or the generation of a GDG.

Use the withFileConfiguration method to build a logical to physical map of data sets and make it available to the called program.
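As an illustration of such a logical-to-physical map, consider the plain Java sketch below. The real runtime passes this mapping through the withFileConfiguration method with runtime-specific types; the DD names and file paths used here are hypothetical.

```java
import java.util.Map;

public class FileConfigSketch {

    /** Builds a logical-to-physical data set mapping like the one a script passes to a utility program. */
    public static Map<String, String> fileConfiguration() {
        return Map.of(
            "INFILE", "/work/datasets/input.dat",   // logical DD name -> physical file
            "OUTFILE", "/work/datasets/output.dat"
        );
    }

    public static void main(String[] args) {
        fileConfiguration().forEach((logical, physical) ->
            System.out.println(logical + " -> " + physical));
    }
}
```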

## Write your own program
<a name="ba-shared-structure-write"></a>

Writing your own program for scripts or other modernized programs to call is a common task. Typically, on modernization projects, you write your own programs when an executable legacy program is written in a language that the modernization process doesn't support, when the sources have been lost (yes, that can happen), or when the program is a utility whose sources are not available.

In that case, you might have to write the missing program yourself, in Java (assuming you have enough knowledge about the program's expected behavior, the memory layout of the program's arguments if any, and so on). Your Java program must comply with the program mechanics described in this document, so that other programs and scripts can run it.

To make sure the program is usable, you must complete two mandatory steps:
+ Write a class that implements the `Program` interface properly, so that it can be registered and called.
+ Make sure your program is registered properly, so that it is visible from other programs/scripts.
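The register-then-call mechanics behind those two steps can be sketched with a self-contained miniature. `MiniProgram` and `MiniRegistry` below are hypothetical stand-ins, not the actual gapwalk `Program` interface and `ProgramRegistry`, which carry more methods (run, getContext, getSpringApplication, and so on).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class RegistrationSketch {

    /** Minimal stand-in for the Program interface. */
    public interface MiniProgram {
        Set<String> getIdentifiers();
        String run(String input);
    }

    /** Minimal stand-in for the ProgramRegistry: identifier -> program. */
    public static final class MiniRegistry {
        private final Map<String, MiniProgram> programs = new HashMap<>();

        public void register(MiniProgram program) {
            // A program is reachable through every one of its identifiers.
            program.getIdentifiers().forEach(id -> programs.put(id, program));
        }

        public String call(String identifier, String input) {
            MiniProgram program = programs.get(identifier);
            if (program == null) {
                throw new IllegalArgumentException("No program registered as " + identifier);
            }
            return program.run(input);
        }
    }

    public static void main(String[] args) {
        MiniRegistry registry = new MiniRegistry();
        registry.register(new MiniProgram() {
            @Override public Set<String> getIdentifiers() { return Set.of("MYUTILPG"); }
            @Override public String run(String input) { return input.toUpperCase(); }
        });
        System.out.println(registry.call("MYUTILPG", "hello")); // HELLO
    }
}
```

A program that is not registered is simply unreachable, which is why the registration step is as important as the implementation itself.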

### Writing the program implementation
<a name="ba-shared-structure-write-implementation"></a>

Use your IDE to create a new Java class that implements the `Program` interface:

![\[Creating a new java Program class\]](http://docs.aws.amazon.com/m2/latest/userguide/images/new_program.png)


The following image shows the Eclipse IDE, which takes care of creating all mandatory methods to be implemented:

![\[Creating a new java Program class - editing source\]](http://docs.aws.amazon.com/m2/latest/userguide/images/new_program_ide.png)


### Spring integration
<a name="ba-shared-structure-write-spring"></a>

First, the class must be declared as a Spring component. Annotate the class with the `@Component` annotation:

![\[Using the spring @Component annotation\]](http://docs.aws.amazon.com/m2/latest/userguide/images/program_component.png)


Next, implement the required methods properly. In this sample, we added the `MyUtilityProgram` to the package that already contains all modernized programs. That placement permits the program to use the existing Spring Boot application to provide the required `ConfigurableApplicationContext` for the getSpringApplication method implementation:

![\[Implementing the getSpringApplication method.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/getSpringApplication.png)


You might choose a different location for your own program. For instance, you might locate it in another dedicated service project. Make sure that the given service project has its own Spring Boot application, which makes it possible to retrieve the ApplicationContext (which should be a `ConfigurableApplicationContext`).

### Giving an identity to the program
<a name="ba-shared-structure-write-identity"></a>

To be callable by other programs and scripts, the program must be given at least one identifier that does not collide with any other registered program in the system. The choice of identifier might be driven by the need to replace an existing legacy program; in that case, you'll have to use the expected identifier, as found in CALL occurrences throughout the legacy programs. Most program identifiers are 8 characters long in legacy systems.

Creating an unmodifiable set of identifiers in the program is one way of doing this. The following example shows choosing "MYUTILPG" as the single identifier:

![\[Program identifier example\]](http://docs.aws.amazon.com/m2/latest/userguide/images/program_identifier.png)

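A minimal sketch of such an identifier set follows; the accessor name is assumed for illustration, since the exact method comes from the `Program` interface.

```java
import java.util.Set;

public class MyUtilityProgramIds {

    // Set.of returns an unmodifiable set, matching the advice above.
    private static final Set<String> IDENTIFIERS = Set.of("MYUTILPG");

    public static Set<String> identifiers() {
        return IDENTIFIERS;
    }

    public static void main(String[] args) {
        System.out.println(identifiers()); // [MYUTILPG]
    }
}
```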

### Associate the program to a context
<a name="ba-shared-structure-write-context"></a>

The program needs a companion `RuntimeContext` instance. For modernized programs, AWS Transform for mainframe automatically generates the companion context, using the data structures that are part of the legacy program.

If you're writing your own program, you must write the companion context as well.

Referring to [Program related classes](#ba-shared-structure-org-entities-program), you can see that a program requires at least two companion classes:
+ a configuration class.
+ a context class that uses the configuration.

If the utility program uses any extra data structure, it should be written as well and used by the context.

Those classes should be in a package hierarchy that is scanned at application startup, so that the Spring framework handles the context component and configuration.

Let's write a minimal configuration and context, in the `base package.myutilityprogram.business.context` package, freshly created in the entities project:

![\[New dedicated configuration and context for the new utility program\]](http://docs.aws.amazon.com/m2/latest/userguide/images/new_program_context_package.png)


Here is the configuration content. It uses a configuration build similar to that of the nearby modernized programs. You'll probably have to customize this for your specific needs.

![\[New program configuration\]](http://docs.aws.amazon.com/m2/latest/userguide/images/new_program_configuration.png)


Notes:
+ General naming convention is *ProgramName*Configuration.
+ It must use the `@org.springframework.context.annotation.Configuration` and `@Lazy` annotations.
+ The bean name usually follows the *ProgramName*ContextConfiguration convention, but this is not mandatory. Make sure to avoid bean name collisions across the project.
+ The single method to implement must return a `Configuration` object. Use the `ConfigurationBuilder` fluent API to help you build one.

And the associated context:

![\[New program context in a Java file.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/new_program_context.png)


Notes:
+ The context class should extend an existing `Context` interface implementation (either `RuntimeContext` or `JicsRuntimeContext`, which is an enhanced `RuntimeContext` with JICS specifics items).
+ General naming convention is *ProgramName*Context.
+ You must declare it as a Prototype component, and use the @Lazy annotation.
+ The constructor refers to the associated configuration, using the @Qualifier annotation to target the proper configuration class.
+ If the utility program uses some extra data structures, they should be:
  + written and added to the `base package.business.model` package.
  + referenced in the context. Take a look at other existing context classes to see how to reference data structure classes, and adapt the context methods (constructor / clean-up / reset) as needed.

Now that a dedicated context is available, let the new program use it:

![\[The new program uses the freshly created context.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/new_program_uses_context.png)


Notes:
+ The getContext method must be implemented strictly as shown, delegating to the getOrCreate method of the `ProgramContextStore` class with the autowired Spring `BeanFactory`. A single program identifier is used to store the program context in the `ProgramContextStore`; this identifier is referred to as the 'program main identifier'.
+ The companion configuration and context classes must be referenced using the `@Import` spring annotation.

### Implementing the business logic
<a name="ba-shared-structure-write-business-logic"></a>

When the program skeleton is complete, implement the business logic for the new utility program.

Do this in the `run` method of the program. This method will be executed anytime the program is called, either by another program or by a script.

Happy coding!

### Handling the program registration
<a name="ba-shared-structure-write-registration"></a>

Finally, make sure the new program is properly registered in the `ProgramRegistry`. If you added the new program to the package that already contains other programs, there's nothing more to do. The new program is picked up and registered with all its neighbor programs at application startup.

If you chose another location for the program, you must make sure the program is properly registered at Tomcat startup. For some inspiration about how to do that, look at the initialize method of the generated SpringbootLauncher classes in the service project(s) (see [Service project contents](#ba-shared-structure-org-service)).

Check the Tomcat startup logs. Every program registration is logged. If your program is successfully registered, you'll find the matching log entry.

When you're sure that your program is properly registered, you can start iterating on the business logic coding.

## Fully qualified name mappings
<a name="ba-shared-structure-fqn"></a>

This section contains lists of the AWS Transform for mainframe and third-party fully qualified name mappings for use in your modernized applications.

### AWS Transform for mainframe fully qualified name mappings
<a name="ba-shared-structure-fqn-table"></a>


| Short name | Fully qualified name | 
| --- | --- | 
|  `CallBuilder`  |  `com.netfective.bluage.gapwalk.runtime.statements.CallBuilder`  | 
|  `Configuration`  |  `com.netfective.bluage.gapwalk.datasimplifier.configuration.Configuration`  | 
|  `ConfigurationBuilder`  |  `com.netfective.bluage.gapwalk.datasimplifier.configuration.ConfigurationBuilder`  | 
|  `ExecutionController`  |  `com.netfective.bluage.gapwalk.rt.call.ExecutionController`  | 
|  `ExecutionControllerImpl`  |  `com.netfective.bluage.gapwalk.rt.call.internal.ExecutionControllerImpl`  | 
|  `File`  |  `com.netfective.bluage.gapwalk.rt.io.File`  | 
|  `MainProgramRunner`  |  `com.netfective.bluage.gapwalk.rt.call.MainProgramRunner`  | 
|  `Program`  |  `com.netfective.bluage.gapwalk.rt.provider.Program`  | 
|  `ProgramContextStore`  |  `com.netfective.bluage.gapwalk.rt.context.ProgramContextStore`  | 
|  `ProgramRegistry`  |  `com.netfective.bluage.gapwalk.rt.provider.ProgramRegistry`  | 
|  `Record`  |  `com.netfective.bluage.gapwalk.datasimplifier.data.Record`  | 
|  `RecordEntity`  |  `com.netfective.bluage.gapwalk.datasimplifier.entity.RecordEntity`  | 
|  `RuntimeContext`  |  `com.netfective.bluage.gapwalk.rt.context.RuntimeContext`  | 
|  `SimpleStateMachineController`  |  `com.netfective.bluage.gapwalk.rt.statemachine.SimpleStateMachineController`  | 
|  `StateMachineController`  |  `com.netfective.bluage.gapwalk.rt.statemachine.StateMachineController`  | 
|  `StateMachineRunner`  |  `com.netfective.bluage.gapwalk.rt.statemachine.StateMachineRunner`  | 

### Third party fully qualified name mappings
<a name="ba-shared-structure-3pfqn-table"></a>


| Short name | Fully qualified name | 
| --- | --- | 
|  `@Autowired`  |  `org.springframework.beans.factory.annotation.Autowired`  | 
|  `@Bean`  |  `org.springframework.context.annotation.Bean`  | 
|  `BeanFactory`  |  `org.springframework.beans.factory.BeanFactory`  | 
|  `@Component`  |  `org.springframework.stereotype.Component`  | 
|  `ConfigurableApplicationContext`  |  `org.springframework.context.ConfigurableApplicationContext`  | 
|  `@Import`  |  `org.springframework.context.annotation.Import`  | 
|  `@Lazy`  |  `org.springframework.context.annotation.Lazy`  | 

# What are data simplifiers in AWS Transform for mainframe
<a name="ba-shared-data"></a>

On mainframe and midrange systems (referred to in the following topic as "legacy" systems), frequently used programming languages such as COBOL, PL/I or RPG provide low-level access to memory. This access focuses on memory layout accessed through native types such as zoned, packed, or alphanumeric, possibly also aggregated through groups or arrays.

Accesses to a given piece of memory through typed fields coexist, within a given program, with direct access to its bytes (raw memory). For example, COBOL programs pass arguments between caller and callee as contiguous sets of bytes (LINKAGE), or read and write data from files in the same manner (records), while interpreting such memory ranges with typed fields organized in copybooks.

Such combinations of raw and structured access to memory, the reliance on precise, byte-level memory layout, and legacy types, such as zoned or packed, are features that are neither natively nor easily available in the Java programming environment.

As a part of the AWS Transform for mainframe solution for modernizing legacy programs to Java, the **Data Simplifier** library provides such constructs to modernized Java programs, and exposes those in a way that is as familiar as possible to Java developers (getters/setters, byte arrays, class-based). It is a core dependency of the modernized Java code generated from such programs.

For simplicity, most of the following explanations are based on COBOL constructs, but you can use the same API for both PL/I and RPG data layout modernization, since most of the concepts are similar.

**Topics**
+ [Main classes](#ba-shared-data-main-classes)
+ [Data binding and access](#ba-shared-data-binding-access)
+ [FQN of discussed Java types](#ba-shared-data-fqn)

## Main classes
<a name="ba-shared-data-main-classes"></a>

For easier reading, this document uses the Java short names of the AWS Transform for mainframe API interfaces and classes. For more information, see [FQN of discussed Java types](#ba-shared-data-fqn).

### Low level memory representation
<a name="ba-shared-data-main-classes-memory"></a>

At the lowest level, memory (a contiguous range of bytes with fast random access) is represented by the `Record` interface. This interface is essentially an abstraction of a byte array of fixed size. As such, it provides setters and getters to access or modify the underlying bytes.

### Structured data representation
<a name="ba-shared-data-main-classes-structured"></a>

To represent structured data, such as "01 data items", or "01 copybooks", as found in COBOL DATA DIVISION, subclasses of the `RecordEntity` class are used. Those are normally not written by hand, but generated by the AWS Transform for mainframe modernization tools from the corresponding legacy constructs. It is still useful to know about their main structure and API, so you can understand how the code in a modernized program uses them. In the case of COBOL, that code is Java generated from their PROCEDURE DIVISION.

Generated code represents each "01 data item" with a `RecordEntity` subclass; each elementary field or aggregate composing it is represented as a private Java field, organized as a tree (each item has a parent, except for the root one).

For illustration purposes, here is an example COBOL data item, followed by the corresponding AWS Transform for mainframe generated code that modernizes it:

```
01 TST2.
 02 FILLER PIC X(4).
 02 F1     PIC 9(2) VALUE 42.
 02 FILLER PIC X.
 02        PIC 9(3) VALUE 123.
 02 F2     PIC X VALUE 'A'.
```

```
public class Tst2 extends RecordEntity {

    private final Group root = new Group(getData()).named("TST2");
    private final Filler filler = new Filler(root, new AlphanumericType(4));
    private final Elementary f1 = new Elementary(root, new ZonedType(2, 0, false), new BigDecimal("42")).named("F1");
    private final Filler filler1 = new Filler(root, new AlphanumericType(1));
    private final Filler filler2 = new Filler(root, new ZonedType(3, 0, false), new BigDecimal("123"));
    private final Elementary f2 = new Elementary(root, new AlphanumericType(1), "A").named("F2");

    /**
     * Instantiate a new Tst2 with a default record.
     * @param configuration the configuration
     */
    public Tst2(Configuration configuration) {
        super(configuration);
        setupRoot(root);
    }

    /**
     * Instantiate a new Tst2 bound to the provided record.
     * @param configuration the configuration
     * @param record the existing record to bind
     */
    public Tst2(Configuration configuration, RecordAdaptable record) {
        super(configuration);
        setupRoot(root, record);
    }

    /**
     * Gets the reference for attribute f1.
     * @return the f1 attribute reference
     */
    public ElementaryRangeReference getF1Reference() {
        return f1.getReference();
    }

    /**
     * Getter for f1 attribute.
     * @return f1 attribute
     */
    public int getF1() {
        return f1.getValue();
    }

    /**
     * Setter for f1 attribute.
     * @param f1 the new value of f1
     */
    public void setF1(int f1) {
        this.f1.setValue(f1);
    }

    /**
     * Gets the reference for attribute f2.
     * @return the f2 attribute reference
     */
    public ElementaryRangeReference getF2Reference() {
        return f2.getReference();
    }

    /**
     * Getter for f2 attribute.
     * @return f2 attribute
     */
    public String getF2() {
        return f2.getValue();
    }

    /**
     * Setter for f2 attribute.
     * @param f2 the new value of f2
     */
    public void setF2(String f2) {
        this.f2.setValue(f2);
    }
}
```
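The byte layout behind the generated Tst2 class can be computed by hand: `PIC X(n)` and `PIC 9(n)` zoned fields each occupy n bytes, and a `Group` lays its children out contiguously. The following self-contained sketch (not generated code) derives the field offsets of TST2 from the declared sizes.

```java
public class Tst2Layout {

    /** Field sizes in declaration order: X(4), 9(2), X, 9(3), X. */
    private static final int[] SIZES = {4, 2, 1, 3, 1};

    /** Offset of the i-th field: the sum of the sizes of the preceding fields. */
    public static int offsetOf(int index) {
        int offset = 0;
        for (int i = 0; i < index; i++) {
            offset += SIZES[i];
        }
        return offset;
    }

    /** Total record size: the sum of all field sizes. */
    public static int totalSize() {
        return offsetOf(SIZES.length);
    }

    public static void main(String[] args) {
        System.out.println("F1 offset: " + offsetOf(1)); // 4
        System.out.println("F2 offset: " + offsetOf(4)); // 10
        System.out.println("TST2 size: " + totalSize()); // 11
    }
}
```

So the underlying `Record` for a Tst2 instance is 11 bytes long, with F1 starting at byte 4 and F2 at byte 10.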

#### Elementary fields
<a name="ba-shared-data-main-classes-structured-elementary"></a>

Fields of class `Elementary` (or `Filler`, when unnamed) represent a "leaf" of the legacy data structure. They are associated with a contiguous span of underlying bytes ("range") and commonly have a type (possibly parameterized) expressing how to interpret and modify those bytes (by respectively "decoding" and "encoding" a value from/to a byte array).

All elementary types are subclasses of `RangeType`. Common types are:


| COBOL Type | Data Simplifier Type | 
| --- | --- | 
|  `PIC X(n)`  |  `AlphanumericType`  | 
|  `PIC 9(n)`  |  `ZonedType`  | 
|  `PIC 9(n) COMP-3`  |  `PackedType`  | 
|  `PIC 9(n) COMP-5`  |  `BinaryType`  | 

#### Aggregate fields
<a name="ba-shared-data-main-classes-structured-aggregate"></a>

Aggregate fields organize the memory layout of their contents (other aggregates or elementary fields). They do not have an elementary type themselves.

`Group` fields represent contiguous fields in memory. Each contained field is laid out in order: the first field at offset `0` with respect to the group field position in memory, the second at offset `0 + (size in bytes of first field)`, and so on. They are used to represent sequences of COBOL fields under the same containing field.

`Union` fields represent multiple fields accessing the same memory. Each contained field is laid out at offset `0` with respect to the union field position in memory. They are used, for example, to represent the COBOL "REDEFINES" construct (the first Union child being the redefined data item, the second child being its first redefinition, and so on).

Array fields (subclasses of `Repetition`) represent the repetition, in memory, of the layout of their child field (be it an aggregate itself or an elementary item). They lay out a given number of such child layouts in memory, each being at offset `index * (size in bytes of child)`. They are used to represent COBOL "OCCURS" constructs.
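The three layout rules above reduce to simple offset arithmetic, sketched here in a self-contained form (these are illustrative helpers, not Data Simplifier APIs):

```java
public class AggregateLayout {

    /** Group: child i starts after all previous children. */
    public static int groupOffset(int[] childSizes, int index) {
        int offset = 0;
        for (int i = 0; i < index; i++) {
            offset += childSizes[i];
        }
        return offset;
    }

    /** Union (REDEFINES): every child starts at offset 0. */
    public static int unionOffset(int index) {
        return 0;
    }

    /** Repetition (OCCURS): occurrence i starts at i * (size in bytes of child). */
    public static int repetitionOffset(int childSize, int index) {
        return index * childSize;
    }

    public static void main(String[] args) {
        int[] children = {4, 2, 5};
        System.out.println(groupOffset(children, 2));  // 6
        System.out.println(unionOffset(2));            // 0
        System.out.println(repetitionOffset(3, 4));    // 12
    }
}
```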

#### Primitives
<a name="ba-shared-data-main-classes-structured-primitive"></a>

In some modernization cases, "Primitives" may also be used to represent independent, "root" data items. They are very similar in use to `RecordEntity`, but they don't derive from it, nor are they based on generated code. Instead, they are directly provided by the AWS Transform for mainframe runtime as implementations of the `Primitive` interface. Examples of such provided classes are `Alphanumeric` and `ZonedDecimal`.

## Data binding and access
<a name="ba-shared-data-binding-access"></a>

Association between structured data and underlying data can be done in multiple ways.

An important interface for this purpose is `RecordAdaptable`, which is used to obtain a `Record` providing a "writable view" on the `RecordAdaptable` underlying data. As we will see below, multiple classes implement `RecordAdaptable`. Reciprocally, AWS Transform for mainframe APIs and code manipulating low-level memory (such as program arguments, file I/O records, the CICS comm area, or allocated memory) will often expect a `RecordAdaptable` as a handle to that memory.

In the COBOL modernization case, most data items are associated with memory that is fixed for the lifetime of the corresponding program execution. For this purpose, `RecordEntity` subclasses are instantiated once in a generated parent object (the program Context), and take care of instantiating their underlying `Record`, based on the `RecordEntity` byte size.

In other COBOL cases, such as associating LINKAGE elements with program arguments, or modernizing the SET ADDRESS OF construct, a `RecordEntity` instance must be associated with a provided `RecordAdaptable`. For this purpose, two mechanisms exist:
+ if the `RecordEntity` instance already exists, the `RecordEntity.bind(RecordAdaptable)` method (inherited from `Bindable`) can be used to make this instance "point" to this `RecordAdaptable`. Any getter or setter called on the `RecordEntity` will then be backed (bytes reading or writing) by the underlying `RecordAdaptable` bytes.
+ if the `RecordEntity` is to be instantiated, a generated constructor accepting a `RecordAdaptable` is available.

Conversely, the `Record` currently bound to structured data can be accessed. For this, `RecordEntity` implements `RecordAdaptable`, so `getRecord()` can be called on any such instance.

Finally, many COBOL or CICS verbs require access to a single field, for reading or writing purpose. The `RangeReference` class is used to represent such access. Its instances can be obtained from `RecordEntity` generated `getXXXReference()` methods (`XXX` being the accessed field), and passed to runtime methods. `RangeReference` is typically used to access whole `RecordEntity` or `Group`, while its subclass `ElementaryRangeReference` represents accesses to `Elementary` fields.

Note that most of the observations above also apply to `Primitive` subclasses, since they strive to implement the same behavior as `RecordEntity` while being provided by the AWS Transform for mainframe runtime (instead of generated code). For this purpose, all subclasses of `Primitive` implement the `RecordAdaptable`, `ElementaryRangeReference`, and `Bindable` interfaces, so they are usable in place of both `RecordEntity` subclasses and elementary fields.

## FQN of discussed Java types
<a name="ba-shared-data-fqn"></a>

The following table shows the fully qualified names of the Java types discussed in this section.


| Short Name | Fully Qualified Name | 
| --- | --- | 
|  `Alphanumeric`  |  `com.netfective.bluage.gapwalk.datasimplifier.elementary.Alphanumeric`  | 
|  `AlphanumericType`  |  `com.netfective.bluage.gapwalk.datasimplifier.metadata.type.AlphanumericType`  | 
|  `BinaryType`  |  `com.netfective.bluage.gapwalk.datasimplifier.metadata.type.BinaryType`  | 
|  `Bindable`  |  `com.netfective.bluage.gapwalk.datasimplifier.data.Bindable`  | 
|  `Elementary`  |  `com.netfective.bluage.gapwalk.datasimplifier.data.structure.Elementary`  | 
|  `ElementaryRangeReference`  |  `com.netfective.bluage.gapwalk.datasimplifier.entity.ElementaryRangeReference`  | 
|  `Filler`  |  `com.netfective.bluage.gapwalk.datasimplifier.data.structure.Filler`  | 
|  `Group`  |  `com.netfective.bluage.gapwalk.datasimplifier.data.structure.Group`  | 
|  `PackedType`  |  `com.netfective.bluage.gapwalk.datasimplifier.metadata.type.PackedType`  | 
|  `Primitive`  |  `com.netfective.bluage.gapwalk.datasimplifier.elementary.Primitive`  | 
|  `RangeReference`  |  `com.netfective.bluage.gapwalk.datasimplifier.entity.RangeReference`  | 
|  `RangeType`  |  `com.netfective.bluage.gapwalk.datasimplifier.metadata.type.RangeType`  | 
|  `Record`  |  `com.netfective.bluage.gapwalk.datasimplifier.data.Record`  | 
|  `RecordAdaptable`  |  `com.netfective.bluage.gapwalk.datasimplifier.data.RecordAdaptable`  | 
|  `RecordEntity`  |  `com.netfective.bluage.gapwalk.datasimplifier.entity.RecordEntity`  | 
|  `Repetition`  |  `com.netfective.bluage.gapwalk.datasimplifier.data.structure.Repetition`  | 
|  `Union`  |  `com.netfective.bluage.gapwalk.datasimplifier.data.structure.Union`  | 
|  `ZonedDecimal`  |  `com.netfective.bluage.gapwalk.datasimplifier.elementary.ZonedDecimal`  | 
|  `ZonedType`  |  `com.netfective.bluage.gapwalk.datasimplifier.metadata.type.ZonedType`  | 

# AWS Transform for mainframe Blusam
<a name="ba-shared-blusam"></a>

On mainframe systems (referred to in the following topic as "legacy"), business data is often stored using VSAM (Virtual Storage Access Method). The data is stored in "records" (byte arrays), belonging to a "data set".

There are four data set organizations:
+ **KSDS**: Key-Sequenced data sets - records are indexed by a primary key (no duplicate keys allowed) and optionally, additional "alternate" keys. All key values are subsets of the record byte array, each key being defined by: 
  + an offset (0-based, 0 being the start of the record byte array content, measured in bytes)
  + a length (expressed in bytes)
  + whether it tolerates duplicate values or not
+ **ESDS**: Entry-Sequenced data sets - records are accessed mostly sequentially (based on their insertion order in the data set) but can be accessed using additional alternate keys;
+ **RRDS**: Relative Record data sets - records are accessed using "jumps", based on relative record numbers. The jumps can be made either forward or backward;
+ **LDS**: Linear data sets - no records there, simply a stream of bytes, organized in pages. Mainly used for internal purposes on legacy platforms.
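Since a KSDS key is simply a subset of the record byte array defined by a 0-based offset and a length, extracting it amounts to copying a byte range. The following self-contained sketch illustrates this (the record content and key position are hypothetical):

```java
import java.util.Arrays;

public class KsdsKeySketch {

    /** Extracts the key bytes from a record, given the key's 0-based offset and length in bytes. */
    public static byte[] extractKey(byte[] record, int offset, int length) {
        return Arrays.copyOfRange(record, offset, offset + length);
    }

    public static void main(String[] args) {
        byte[] record = "0001SMITH   JOHN    ".getBytes();
        // A hypothetical primary key: 4 bytes starting at offset 0.
        System.out.println(new String(extractKey(record, 0, 4))); // 0001
    }
}
```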

When legacy applications are modernized using the AWS Transform for mainframe refactoring approach, the modernized applications no longer access VSAM-stored data, yet the data access logic must be preserved. The Blusam component is the answer: it allows importing data from legacy VSAM data set exports, and it provides an API for modernized applications to manipulate that data, along with a dedicated administration web application. See [AWS Transform for mainframe Blusam Administration Console](ba-shared-bac-userguide.md).

**Note**  
Blusam only supports KSDS, ESDS, and RRDS.

The Blusam API makes it possible to preserve the data access logic (sequential, random, and relative reads; insert, update, and delete records), while the component's architecture, relying on a mix of caching strategies and RDBMS-based storage, permits high-throughput I/O operations with limited resources.

## Blusam infrastructure
<a name="ba-shared-blusam-infrastructure"></a>

Blusam relies on the PostgreSQL RDBMS for data set storage, both for the raw record data and the key indexes (when applicable). The preferred option is to use the Amazon Aurora PostgreSQL-compatible engine. The examples and illustrations in this topic are based on this engine.

**Note**  
At server startup, the Blusam runtime checks for the presence of some mandatory technical tables and creates them if they are not found. As a consequence, the role used in the configuration to access the Blusam database must be granted the rights to create, update, and delete database tables (both the rows and the table definitions themselves). For information about how to disable Blusam, see [Blusam configuration](#ba-shared-blusam-configuration).

### Caching
<a name="ba-shared-blusam-caching"></a>

In addition to the storage itself, Blusam operates faster when coupled with a cache implementation.

Two cache engines are currently supported, EhCache and Redis, each having its own use case:
+ EhCache: standalone embedded volatile local cache 
  + **NOT** eligible for AWS Mainframe Modernization managed environment deployment.
  + Typically used when a single node, like a single Apache Tomcat server, runs the modernized applications. For instance, the node might be dedicated to hosting batch job tasks.
  + **Volatile**: the EhCache cache instance is volatile; its content is lost on server shutdown.
  + **Embedded**: EhCache and the server share the same JVM memory space (to be taken into account when defining the specifications of the hosting machine).
+ Redis: shared persistent cache 
  + Eligible for AWS Mainframe Modernization managed environment deployment.
  + Typically used in multi-node situations, in particular when several servers sit behind a load balancer. The cache content is shared amongst all nodes.
  + The Redis cache is persistent and not bound to the node life cycles. It runs on its own dedicated machine or service (for example, Amazon ElastiCache) and is remote to all nodes.

### Locking
<a name="ba-shared-blusam-locking"></a>

To deal with concurrent access to data sets and records, Blusam relies on a configurable locking system. Locking can be applied at both levels, data sets and records:
+ Locking a data set for write purposes prevents all other clients from performing write operations on it, at any level (data set or record).
+ Locking at the record level for write prevents other clients from performing write operations on the given record only.

Configuring the Blusam locking system should be done accordingly to the cache configuration:
+ If EhCache is chosen as cache implementation, no further locking configuration is required as the default in-memory locking system should be used.
+ If Redis is chosen as cache implementation, then a Redis-based locking configuration is required, to allow concurrent access from multiple nodes. The Redis cache used for locks does not have to be the same as the one used for data sets. For information about configuring a Redis-based locking system, see [Blusam configuration](#ba-shared-blusam-configuration).
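
As an illustrative sketch, a Redis-based locking setup in `application-main.yml` might look like the following. The property names are the ones documented in [Blusam configuration](#ba-shared-blusam-configuration); the values shown are examples only, not recommendations:

```yaml
bluesam:
  locks:
    cache: redis        # use the Redis-based locking mechanism
    lockTimeOut: 500    # ms before a lock attempt on a locked element fails
    locksDeadTime: 1000 # ms after which a held lock expires and is released
    locksCheck: reboot  # release expired locks at application start
```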

## Blusam intrinsics and data migration from legacy
<a name="ba-shared-blusam-intrinsics"></a>



### Storing data sets: records and indexes
<a name="ba-shared-blusam-storage"></a>

Each legacy data set, when imported to Blusam, is stored in a dedicated table; each row of the table represents a record, using two columns:
+ The numeric ID column, of big integer type, which is the table primary key and stores the Relative Byte Address (RBA) of the record. The RBA represents the offset in bytes from the start of the data set, and begins at 0.
+ The byte array record column, which stores the raw record content.

See for example the content of a KSDS data set used in the CardDemo application:

![\[SQL query result showing KSDS data set with id and record bytes columns for CardDemo application.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/sample_dataset_storage.png)

+ This particular data set has fixed-length records of 300 bytes (hence the ids are multiples of 300).
+ By default, the pgAdmin tool used to query PostgreSQL databases doesn't show byte array column contents, but prints a [binary data] label instead.
+ The raw record content matches the raw data set export from the legacy system, without any conversion. In particular, no character set conversion occurs; alphanumeric portions of the record must therefore be decoded by modernized applications using the legacy character set, most likely an EBCDIC variant.
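
For illustration, decoding such an alphanumeric portion in modernized Java code might look like the following minimal sketch. The IBM-1047 code page is an assumption here (a common EBCDIC variant); the actual code page depends on the legacy system:

```java
import java.nio.charset.Charset;

public class EbcdicDecode {
    public static void main(String[] args) {
        // Assumed code page: IBM-1047, a common EBCDIC variant (adjust to your legacy system).
        Charset ebcdic = Charset.forName("Cp1047");

        // Simulate a raw record fragment as it would arrive from a legacy binary export.
        byte[] rawField = "ACCT0001".getBytes(ebcdic);

        // Alphanumeric portions must be decoded with the legacy charset,
        // not the JVM default charset.
        String decoded = new String(rawField, ebcdic);
        System.out.println(decoded); // ACCT0001
    }
}
```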

Regarding the data set metadata and keys indexes: each data set is associated with two rows in the table named `metadata`. This is the default naming convention. To learn how to customize it, see [Blusam configuration](#ba-shared-blusam-configuration).

![\[Table showing two rows of metadata with names and IDs for AWS M2 CARDDEMO ACCTDATA VSAM KSDS datasets.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/sample_dataset_metadata_rows.png)

+ The first row has the data set name as the value of the *name* column. The *metadata* column is a binary column that contains a binary serialization of the general metadata of the given data set. For details, see [General data set metadata attributes](#ba-shared-blusam-metadata).
+ The second row has the data set name with the suffix `__internal` as the value of the *name* column. The *metadata* column binary content depends on the "weight" of the data set.
  + For small/medium data sets, the content is a compressed serialization of: 
    + the definition of the keys used by the data set: the primary key definition (for KSDS) and alternate key definitions if applicable (KSDS/ESDS);
    + the key indexes if applicable (KSDS/ESDS with alternate key definitions), used for indexed browsing of records; a key index maps a key value to the RBA of a record;
    + the records length map, used for sequential/relative browsing of records.
  + For large/very large data sets, the content is a compressed serialization of: 
    + the definition of the keys used by the data set: the primary key definition (for KSDS) and alternate key definitions if applicable (KSDS/ESDS)

Additionally, large/very large data sets indexes (if applicable) are stored using a pagination mechanism; index pages binary serializations are stored as rows of a dedicated table (one table per data set key). Each page of indexes is stored in a row, having the following columns:
+ id: technical identifier of the indexes page (numeric primary key);
+ firstkey: binary value of the first (lowest) key value stored in the indexes page;
+ lastkey: binary value of the last (highest) key value stored in the indexes page;
+ metadata: binary compressed serialization of the indexes page (mapping key values to records RBAs).

![\[Database table showing columns for id, firstkey, lastkey, and metadata with sample rows.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/sample_index_pages.png)


The table name is a concatenation of the data set name and the key internal name, which encodes information about the key, such as the key offset, whether the key accepts duplicates, and the key length. For example, consider a data set named `AWS_LARGE_KSDS` that has the following two defined keys:
+ primary key [offset: 0, duplicates: false, length:18]
+ alternate key [offset: 3, duplicates: true, length: 6]

In this case, the following tables store the indexes related to the two keys.

![\[Two tables showing index storage for large_ksds_0f18 and large_ksds_3f6 keys.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/sample_large_dataset_indexes_tables.png)


### Optimizing I/O throughput using write-behind mechanism
<a name="ba-shared-blusam-write-behind"></a>

To optimize insert/update/delete performance, the Blusam engine relies on a configurable write-behind mechanism. The mechanism is built upon a pool of dedicated threads that handle persistence operations using bulk update queries, to maximize I/O throughput towards the Blusam storage.

The Blusam engine collects the update operations performed on records by applications and builds lots of records that are dispatched to the dedicated threads for treatment. The lots are then persisted to the Blusam storage using bulk update queries, avoiding atomic persistence operations and ensuring the best possible usage of network bandwidth.

The mechanism uses a configurable delay (defaults to one second) and a configurable lot size (defaults to 10,000 records). The persistence queries are executed as soon as the first of the two following conditions is met:
+ The configured delay has elapsed and the lot is not empty
+ The number of records in the lot to be treated reaches the configured limit
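
The two trigger conditions above can be sketched as follows. This is an illustrative model only, not Blusam's actual implementation; `shouldFlush` and the constant names are hypothetical:

```java
public class WriteBehindSketch {
    // Illustrative defaults from the documentation: 1-second delay, 10,000-record lots.
    static final long MAX_DELAY_MS = 1000L;
    static final int LOT_SIZE = 10_000;

    // Hypothetical helper: decides whether the pending lot must be flushed now.
    static boolean shouldFlush(int lotSize, long elapsedMs) {
        boolean delayElapsed = elapsedMs >= MAX_DELAY_MS && lotSize > 0;
        boolean lotFull = lotSize >= LOT_SIZE;
        return delayElapsed || lotFull;
    }

    public static void main(String[] args) {
        System.out.println(shouldFlush(0, 2000));     // false: delay elapsed but lot empty
        System.out.println(shouldFlush(10, 2000));    // true: delay elapsed, lot not empty
        System.out.println(shouldFlush(10_000, 100)); // true: lot reached configured size
    }
}
```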

To learn how to configure the write-behind mechanism, see [Optional properties](#ba-shared-blusam-configuration-engine-optional-properties).

### Picking up the proper storage scheme
<a name="ba-shared-blusam-storage-scheme"></a>

As shown in the previous section, the way data sets are stored depends on their "weight". But what counts as small, medium, or large for a data set? When should you pick the paginated storage strategy rather than the regular one?

The answer to that question depends on the following.
+ The amount of available memory on each of the servers hosting the modernized applications that will use those data sets.
+ The amount of available memory on cache infrastructure (if any).

When using the non-paginated index storage scheme, the full key indexes and record sizes collections are loaded into server memory at data set opening time, for each data set. In addition, if caching is involved, all data set records might be pre-loaded into the cache with the regular approach, which might lead to memory resource exhaustion on the cache infrastructure side.

Depending on the number of defined keys, the length of the key values, the number of records, and the number of data sets opened at the same time, the amount of consumed memory can be roughly evaluated for known use cases.

To learn more, see [Estimating the memory footprint for a given data set](#ba-shared-blusam-memory).

### Blusam migration
<a name="ba-shared-blusam-migration"></a>

Once the proper storage scheme has been selected for a given data set, the Blusam storage must be populated by migrating legacy data sets.

To achieve this, one has to use **raw binary exports** of the legacy data sets, without any charset conversion being used during the export process. When transferring data set exports from the legacy system, make sure not to corrupt the binary format. For example, enforce binary mode when using FTP.

The **raw binary exports** contain only the records. The import mechanism does not need key/index exports, as all keys and indexes are re-computed on the fly during the import.

Once a data set binary export is available, several options to migrate it to Blusam exist:

On AWS Mainframe Modernization managed environment:
+ Import data sets by using the dedicated feature. See [Import data sets for AWS Mainframe Modernization applications](applications-m2-dataset.md).

or
+ Use the data set bulk import facility. See [AWS Mainframe Modernization data set definition reference](datasets-m2-definition.md) and [Sample data set request format for VSAM](datasets-m2-definition.md#datasets-m2-definition-vsam).

or
+ Use a Groovy script to import data sets, using the dedicated loading services.

**Note**  
Importing LargeKSDS and LargeESDS data sets on AWS Mainframe Modernization managed environments is currently only possible using Groovy scripts.

On AWS Transform for mainframe Runtime on Amazon EC2:
+ Import data set by using the [AWS Transform for mainframe Blusam Administration Console](ba-shared-bac-userguide.md).

or
+ Use a Groovy script to import data sets, using the dedicated loading services.

#### Import data sets using Groovy scripts
<a name="ba-shared-blusam-migration-groovy"></a>

This section helps you write Groovy scripts to import legacy data sets into Blusam.

It starts with some mandatory imports:

```
import com.netfective.bluage.gapwalk.bluesam.BluesamManager
import com.netfective.bluage.gapwalk.bluesam.metadata.Key;
import com.netfective.bluage.gapwalk.rt.provider.ServiceRegistry
import java.util.ArrayList; //used for alternate keys if any
```

After that, for each data set to import, the code follows the given pattern:

1. create or clear a map object

1. fill the map with the required properties (this varies with the data set kind -- see below for details)

1. retrieve the proper loading service for the data set kind from the service registry

1. run the service, using the map as argument

There are five service implementations that can be retrieved from the service registry, using the following identifiers:
+ `"BluesamKSDSFileLoader"`: for small/medium-sized KSDS
+ `"BluesamESDSFileLoader"`: for small/medium-sized ESDS
+ `"BluesamRRDSFileLoader"`: for RRDS
+ `"BluesamLargeKSDSFileLoader"`: for large KSDS
+ `"BluesamLargeESDSFileLoader"`: for large ESDS

Whether to pick the regular or large version of the service for KSDS/ESDS depends on the size of the data set and the storage strategy you want to apply to it. To learn how to pick the proper storage strategy, see [Picking up the proper storage scheme](#ba-shared-blusam-storage-scheme).

To successfully import a data set into Blusam, the proper properties must be provided to the loading service.

Common properties:
+ Mandatory (for all kinds of data sets)
  + "bluesamManager" : expected value is `applicationContext.getBean(BluesamManager.class)`
  + "datasetName" : name of the data set, as a String
  + "inFilePath" : path to the legacy data set export, as a String
  + "recordLength": the fixed record length or 0 for variable record length data set, as an integer
+ Optional
  + **Not supported for Large data sets:**
    + "isAppend" : a boolean flag, indicating that the import is happening in append mode (appending records to an existing Blusam data set).
    + "useCompression" : a boolean flag, indicating that compression will be used to store metadata.
  + **Only for Large data sets:**
    + "indexingPageSizeInMb" : the size in megabytes of each index page, for each of the keys of the data set, as a strictly positive integer

Data set kind-dependent properties:
+ KSDS/Large KSDS:
  + mandatory 
    + "primaryKey" : the primary key definition, using a `com.netfective.bluage.gapwalk.bluesam.metadata.Key` constructor call.
  + optional: 
    + "alternateKeys" : a List ( `java.util.List` ) of alternate key definitions, built using `com.netfective.bluage.gapwalk.bluesam.metadata.Key` constructor calls.
+ ESDS/Large ESDS:
  + optional: 
    + "alternateKeys" : a List ( `java.util.List` ) of alternate key definitions, built using `com.netfective.bluage.gapwalk.bluesam.metadata.Key` constructor calls.
+ RRDS:
  + none.

Key constructor calls:
+ `new Key(int offset, int length)`: creates a Key object with the given key attributes (offset and length) and no duplicates allowed. This variant should be used to define a primary key.
+ `new Key(boolean allowDuplicates, int offset, int length)`: creates a Key object with the given key attributes (offset and length) and a duplicates-allowed flag.

The following Groovy samples illustrate various loading scenarios.

Loading a large KSDS, with two alternate keys:

```
import com.netfective.bluage.gapwalk.bluesam.BluesamManager
import com.netfective.bluage.gapwalk.bluesam.metadata.Key;
import com.netfective.bluage.gapwalk.rt.provider.ServiceRegistry
import java.util.ArrayList;

// Loading a large KSDS into Blusam
def map = [:]
map.put("bluesamManager", applicationContext.getBean(BluesamManager.class));
map.put("datasetName", "largeKsdsSample");
map.put("inFilePath", "/work/samples/largeKsdsSampleExport");
map.put("recordLength", 49);
map.put("primaryKey", new Key(0, 18));
ArrayList altKeys = [new Key(true, 10, 8), new Key(false, 0, 9)]
map.put("alternateKeys", altKeys);
map.put("indexingPageSizeInMb", 25);
def service = ServiceRegistry.getService("BluesamLargeKSDSFileLoader");
service.runService(map);
```

Loading a variable record length ESDS, with no alternate keys:

```
import com.netfective.bluage.gapwalk.bluesam.BluesamManager
import com.netfective.bluage.gapwalk.bluesam.metadata.Key;
import com.netfective.bluage.gapwalk.rt.provider.ServiceRegistry

// Loading an ESDS into Blusam
def map = [:]
map.put("bluesamManager", applicationContext.getBean(BluesamManager.class));
map.put("datasetName", "esdsSample");
map.put("inFilePath", "/work/samples/esdsSampleExport");
map.put("recordLength", 0);
def service = ServiceRegistry.getService("BluesamESDSFileLoader");
service.runService(map);
```

Variable record length data set exports must contain the mandatory Record Descriptor Word (RDW) information to allow records to be split at reading time.
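
As a hedged illustration of how such exports are structured, the following sketch splits a raw export into records using RDWs, assuming the standard IBM layout (a 2-byte big-endian length that includes the 4-byte RDW itself, followed by two reserved bytes). This is not Blusam's actual import code:

```java
import java.util.ArrayList;
import java.util.List;

public class RdwSplit {
    // Splits a raw variable-length export into records using Record Descriptor Words.
    // Assumed RDW layout: bytes 0-1 hold the record length (big-endian, RDW included),
    // bytes 2-3 are reserved.
    static List<byte[]> split(byte[] export) {
        List<byte[]> records = new ArrayList<>();
        int pos = 0;
        while (pos + 4 <= export.length) {
            int total = ((export[pos] & 0xFF) << 8) | (export[pos + 1] & 0xFF);
            byte[] record = new byte[total - 4]; // payload without the 4-byte RDW
            System.arraycopy(export, pos + 4, record, 0, total - 4);
            records.add(record);
            pos += total;
        }
        return records;
    }

    public static void main(String[] args) {
        // Two records with 3-byte and 2-byte payloads: RDW lengths are 7 and 6.
        byte[] export = {0, 7, 0, 0, 'A', 'B', 'C', 0, 6, 0, 0, 'X', 'Y'};
        for (byte[] r : split(export)) {
            System.out.println(r.length);
        }
    }
}
```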

Loading a fixed record length RRDS:

```
import com.netfective.bluage.gapwalk.bluesam.BluesamManager
import com.netfective.bluage.gapwalk.bluesam.metadata.Key;
import com.netfective.bluage.gapwalk.rt.provider.ServiceRegistry

// Loading a RRDS into Blusam
def map = [:]
map.put("bluesamManager", applicationContext.getBean(BluesamManager.class));
map.put("datasetName", "rrdsSample");
map.put("inFilePath", "/work/samples/rrdsSampleExport");
map.put("recordLength", 180);
def service = ServiceRegistry.getService("BluesamRRDSFileLoader");
service.runService(map);
```

Loading data sets in **Multi-schema mode**:

**Multi-schema mode**: In some legacy systems, VSAM files are organized into file sets, allowing programs to access and modify data within specified partitions. Modern systems treat each file set as a schema, enabling similar data partitioning and access control.

To enable **Multi-schema mode** in the `application-main.yml` file, refer to [Blusam configuration](#ba-shared-blusam-configuration). In this mode, data sets can be loaded into a specific schema using a Shared Context, which is an in-memory registry for runtime information. To load a data set into a specific schema, prefix the data set name with the relevant schema name.

Loading a KSDS file into a specific schema for **Multi-schema mode**:

```
import com.netfective.bluage.gapwalk.bluesam.BluesamManager
import com.netfective.bluage.gapwalk.bluesam.metadata.Key;
import com.netfective.bluage.gapwalk.rt.provider.ServiceRegistry
import java.util.ArrayList;
import com.netfective.bluage.gapwalk.rt.shared.SharedContext;

// Loading a KSDS into Blusam
def map = [:]
String schema = "schema1";
String datasetName = schema+"|"+"ksdsSample";
SharedContext.get().setCurrentBlusamSchema(schema);
schema = SharedContext.get().getCurrentBlusamSchema();
map.put("bluesamManager", applicationContext.getBean(BluesamManager.class));
map.put("datasetName", datasetName);
map.put("inFilePath", "/work/samples/ksdsSampleExport");
map.put("recordLength", 49);
map.put("primaryKey", new Key(0, 18));
map.put("indexingPageSizeInMb", 25);
def service = ServiceRegistry.getService("BluesamKSDSFileLoader");
service.runService(map);
```

Loading a Large KSDS file into a specific schema for **Multi-schema mode**: 

```
import com.netfective.bluage.gapwalk.bluesam.BluesamManager
import com.netfective.bluage.gapwalk.bluesam.metadata.Key;
import com.netfective.bluage.gapwalk.rt.provider.ServiceRegistry
import java.util.ArrayList;
import com.netfective.bluage.gapwalk.rt.shared.SharedContext;

// Loading a Large KSDS into Blusam
def map = [:]
String schema = "schema1";
String datasetName = schema+"|"+"largeKsdsSample";
SharedContext.get().setCurrentBlusamSchema(schema);
schema = SharedContext.get().getCurrentBlusamSchema();
map.put("bluesamManager", applicationContext.getBean(BluesamManager.class));
map.put("datasetName", datasetName);
map.put("inFilePath", "/work/samples/LargeKsdsSampleExport");
map.put("recordLength", 49);
map.put("primaryKey", new Key(0, 18));
map.put("indexingPageSizeInMb", 25);
def service = ServiceRegistry.getService("BluesamLargeKSDSFileLoader");
service.runService(map);
```

In addition, a configuration entry (to be set in the `application-main.yml` configuration file) can be used to fine tune the import process:
+ `bluesam.fileLoading.commitInterval`: a strictly positive integer, defining the commit interval for the regular ESDS/KSDS/RRDS import mechanism. **Does not apply to Large data set imports.** Defaults to 100000.
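
For example, the commit interval might be tuned as follows in `application-main.yml` (the value shown is illustrative only):

```yaml
bluesam:
  fileLoading:
    commitInterval: 50000 # commit every 50,000 records during regular imports
```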

## Blusam configuration
<a name="ba-shared-blusam-configuration"></a>

Configuring Blusam happens in the `application-main.yml` configuration file (or in the `application-bac.yml` configuration file for the stand-alone deployment of the Blusam Administration Console -- BAC -- application).

Two aspects of Blusam must be configured:
+ Blusam storage and caches access configuration
+ Blusam engine configuration

### Blusam storage and caches access configuration
<a name="ba-shared-blusam-configuration-storage"></a>

For information about how to configure access to Blusam storage and caches using either secrets managers or datasources, see [Set up configuration for AWS Transform for mainframe Runtime](ba-runtime-config.md).

**Note**  
Regarding access to the Blusam storage, the credentials used point to a connection role with the appropriate privileges. For the Blusam engine to operate as expected, the connection role must have the following privileges:
+ connect to the database
+ create / delete / alter / truncate tables and views
+ select / insert / delete / update rows in tables and views
+ execute functions or procedures

### Blusam engine configuration
<a name="ba-shared-blusam-configuration-engine"></a>



#### Disabling Blusam support
<a name="ba-shared-blusam-configuration-disabling"></a>

First, note that it is possible to completely disable Blusam support by setting the `bluesam.disabled` property to `true`. An information message is displayed in the server logs at application startup as a reminder that Blusam is disabled:

```
BLUESAM is disabled. No operations allowed.
```

No further Blusam configuration is required in that case, and any attempt to use Blusam-related features (either programmatically or through REST calls) will raise an `UnsupportedOperationException` in the Java code execution, with a relevant message explaining that Blusam is disabled.

#### Blusam engine properties
<a name="ba-shared-blusam-configuration-engine-properties"></a>

The Blusam engine configuration properties are grouped under the `bluesam` key prefix:

##### Mandatory properties
<a name="ba-shared-blusam-configuration-engine-mandatory-properties"></a>


+ `cache`: to be valued with the chosen cache implementation. Valid values are: 
  + `ehcache`: For local embedded ehcache usage. See the related use case restrictions above.
  + `redis`: For shared remote redis cache usage. This is the preferred option for the AWS Mainframe Modernization managed use case.
  + `none`: To disable storage caching
+ `persistence`: to be valued with `pgsql` (PostgreSQL engine; minimal version 10.0, recommended version >= 14.0).
+ datasource reference: `<persistence engine>.dataSource` points to the dataSource definition for the connection to the Blusam storage, defined elsewhere in the configuration file. It is commonly named `bluesamDs`.

**Note**  
Whenever Redis is used as the cache mechanism, either for data or for locks (see below), access to the Redis instances must be configured. For details, see [Available Redis cache properties in AWS Transform for mainframe Runtime](ba-runtime-redis-configuration.md).

##### Optional properties
<a name="ba-shared-blusam-configuration-engine-optional-properties"></a>

Blusam Locks: the properties are prefixed with `locks`
+ `cache`: only usable value is `redis` , to specify that the redis-based locking mechanism will be used (to be used when Blusam storage cache is redis-based as well). If the property is missing or not set to `redis` , the default in-memory locks mechanism will be used instead.
+ `lockTimeOut`: a positive long integer value, giving the timeout expressed in milliseconds before an attempt to lock an already locked element is marked as failed. Defaults to `500` .
+ `locksDeadTime`: a positive long integer value, representing the maximum time, expressed in milliseconds, that an application can hold a lock. Locks are automatically marked as expired and released after that time. Defaults to `1000`.
+ `locksCheck`: a string, used to define the locking check strategy used by the current Blusam lock manager, about expired locks removal. To be picked amongst the following values: 
  + `off`: no checks are performed. Discouraged, as dead locks might happen.
  + `reboot`: checks are performed at reboot or application start time. All expired locks are released at that time. This is the default.
  + `timeout`: checks are performed at reboot or application start time, or when a timeout expires during an attempt to lock a data set. Expired locks are released immediately.

Write-behind mechanism: the properties are prefixed with `write-behind` key:
+ `enabled`: `true` (default and recommended value) or `false` , to enable or disable the write-behind mechanism. Disabling the mechanism will greatly impact write performance and is discouraged.
+ `maxDelay`: the maximal delay before the write-behind threads are triggered. Defaults to `"1s"` (one second). Keeping the default value is generally a good idea, unless specific conditions require tuning. In any case, the value should be kept low (under 3 seconds). The format of the delay string is: `<integer value><optional whitespace><time unit>` where `<time unit>` is one of the following values: 
  + `"ns"`: nanoseconds
  + `"µs"`: microseconds
  + `"ms"`: milliseconds
  + `"s"`: seconds
+ `threads`: the number of dedicated write-behind threads. Defaults to `5`. Adjust this value according to the computing power of the host running the Blusam engine. Using a much higher value in the hope of a performance increase is not relevant, as the limiting factor becomes the ability of the storage RDBMS to deal with numerous concurrent batch queries. Recommended values are usually in the range 4-8.
+ `batchSize`: a positive integer representing the maximal number of records in a lot that will be dispatched for bulk treatment to a thread. Its value must be between 1 and 32767. Defaults to `10000` . Using `1` as value defeats the purpose of the mechanism which is to avoid using atomic update queries; the suitable minimal value to use is around `1000` .

Embedded EhCache fine-tuning: the properties are prefixed with `ehcache` key:
+ `resource-pool`: 
  + `size`: allocated memory size for the embedded cache, expressed as a string. Defaults to `"1024MB"` (1 gigabyte). To be adjusted with regard to the available memory of the machine hosting the Blusam engine and the size of the data sets used by the application. The format of the size string is: `<integer value><optional whitespace><memory unit>` where `<memory unit>` is one of the following values: 
    + `B`: bytes
    + `KB`: kilobytes
    + `MB`: megabytes
    +  `GB`: gigabytes
    + `TB`: terabytes
  + `heap`: `true` or `false`, to indicate whether the cache consumes JVM heap memory or not. Defaults to `true` (the fastest option for cache performance, but cache storage consumes on-heap JVM memory). Setting this property to `false` switches to off-heap memory, which is slower due to the required exchanges with the JVM heap.
+ `timeToLiveMillis`: the duration (in milliseconds) for which a cache entry remains in the cache before being considered expired and removed. If this property is not specified, cache entries do not automatically expire by default.

##### Optional properties for large data sets:
<a name="ba-shared-blusam-optional-properties-large-data-sets"></a>

Localized in-memory caching for Paginated Indexes:
+ `indexesPrefetchWindowSize`: this property applies to Blusam Large data sets when Redis caching is enabled. It specifies the maximum in-memory cache size (in MB) available for storing paginated indexes. The default value is 0. This value may be adjusted depending on the available system memory and the size of the data sets being processed.

Sample configuration snippet:

```
dataSource:
  bluesamDs:
    driver-class-name: org.postgresql.Driver
    ...
    ...
bluesam:
  locks:
    lockTimeOut: 700
  cache: ehcache
  persistence: pgsql
  ehcache:
    resource-pool:
      size: 8GB
  write-behind:
    enabled: true
    threads: 8
    batchSize: 5000
    indexesPrefetchWindowSize: 25  
  pgsql:
    dataSource : bluesamDs
```

##### Multi-schema configuration properties
<a name="ba-shared-blusam-configuration-multi-schema"></a>
+ `multiSchema`: `false` (default value) or `true`, to disable or enable Multi-schema mode for Blusam. Available starting with version 4.4.0.
+ `pgsql`: 
  + `schemas`: A list of schema names that the application will utilize in Multi-schema mode for Blusam.
  + `fallbackSchema`: The fallback schema name for use in Multi-schema mode. If a data set is not found in the current schema context, this schema will be used for Blusam-related operations on that data set.

Sample configuration snippet (with Multi-schema mode enabled for Blusam):

```
dataSource:
  bluesamDs:
    driver-class-name: org.postgresql.Driver
    ...
    ...
bluesam:
  locks:
    lockTimeOut: 700
  cache: ehcache
  persistence: pgsql
  ehcache:
    resource-pool:
      size: 8GB
  write-behind:
    enabled: true
    threads: 8
    batchSize: 5000
  multiSchema: true 
  pgsql:
    dataSource : bluesamDs
    schemas: 
      - "schema1"
      - "schema2" 
      - "schema3"
    fallbackSchema: schema3
```

**Note**  
Blusam metadata schemas, including schemas listed in the `application-main.yml` file for Multi-schema mode, are created in the Blusam database if they don't exist and the user has sufficient privileges.

## Blusam DD File Configuration
<a name="ba-shared-blusam-dd-file-configuration"></a>

These options can be used for Blusam file configuration (data definition in JCL) in groovy:
+ `largeKSDS()`: marks the file as a large KSDS. When used, if the specified file is missing and should be created (depending on the OPEN mode and optional file options), a large KSDS file and its index table are created.

  ```
  .bluesam("TESTFILE").dataset("TESTFILE").largeKSDS().build()
  ```
+ `indexPageSize(Integer param)`: specifies the page size (in MB) of the index table to be created. Applicable to files with the `largeKSDS()` option. The value of `param` should be strictly positive. A default value of 15 is used for invalid `param` values.

  ```
  .bluesam("TESTFILE").dataset("TESTFILE").largeKSDS().indexPageSize(15).build()
  ```

You can set default file configurations in the `ds-config.yml` file, located at `entities/src/main/resources/ds-config.yml` (or other locations for configuration files). These settings serve as fallback configurations when no specific file configuration is provided to the run unit. All Blusam file configuration options are supported in `ds-config.yml`.

```
TESTFILE:
 provider: Bluesam
 dataset: TESTFILE
 largeKSDS: true
 indexPageSize: 15
```

## Blusam Administration Console
<a name="ba-shared-blusam-administration-console"></a>

The Blusam Administration Console (BAC) is a web application used to administer the Blusam storage. For information about the BAC, see [AWS Transform for mainframe Blusam Administration Console](ba-shared-bac-userguide.md).

## Appendix
<a name="ba-shared-blusam-appendix"></a>



### General data set metadata attributes
<a name="ba-shared-blusam-metadata"></a>

General data set metadata serialization attributes list:
+ name (of the data set)
+ type (KSDS, LargeKSDS, ESDS, LargeESDS or RRDS)
+ cache warm-up flag (whether the data set should be preloaded in cache at server startup or not)
+ compression usage flag (whether to store records in a compressed or raw format)
+ creation date
+ last modification date
+ fixed length record flag (whether the data set records are all having the same length or not)
+ record length -- only meaningful for fixed record length
+ page size (used to customize the paginated sql queries used to preload cache when required)
+ size (size of the data set, that is, the cumulated length of the records)
+ last offset (offset, i.e. RBA, of the latest record added to the data set)
+ next offset (next available offset for adding a new record to the data set)
+ if meaningful, definition of the keys used by the data set; each key being defined by its kind (primary or part of the alternate keys collection) and three attributes: 
  + offset : position in the record of the starting byte of the key value;
  + length : length in bytes of the key value. Thus the key value is the byte array which is the subset of the record starting at `key offset` and ending at position `key offset + length - 1` ;
  + duplicates allowed flag: whether the key accepts duplicates or not (set to true to allow duplicates).
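
The key value extraction described above can be sketched in Java as follows; the record content and key attributes are hypothetical examples:

```java
import java.util.Arrays;

public class KeyExtraction {
    // Extracts a key value from a raw record: the bytes from `offset`
    // to `offset + length - 1`, as described above.
    static byte[] keyValue(byte[] record, int offset, int length) {
        return Arrays.copyOfRange(record, offset, offset + length);
    }

    public static void main(String[] args) {
        byte[] record = "AWS00042SOMEPAYLOAD".getBytes();
        // Hypothetical key definition: offset 3, length 5.
        byte[] key = keyValue(record, 3, 5);
        System.out.println(new String(key)); // 00042
    }
}
```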

### Estimating the memory footprint for a given data set
<a name="ba-shared-blusam-memory"></a>

For small to medium-sized data sets, the metadata (sizes and indexes for the various keys) is fully loaded into memory. Allocating proper resources for the machine hosting the server used to run modernized applications requires figuring out the memory consumption induced by the Blusam data sets, in particular regarding metadata. This section gives practical answers for the operators concerned.

The given formulas only apply to Blusam small to medium data sets that do not use the "Large" storage strategy.

#### Blusam data set metadata
<a name="ba-shared-blusam-metadata-parts"></a>

For a Blusam data set, metadata is split into two parts:
+ core metadata: holds global information about the data set. Its memory footprint can be considered negligible compared to the internal metadata.
+ internal metadata: holds information about the record sizes and key indexes; when a data set is not empty, this is what consumes memory when loaded into the application server hosting modernized applications. The sections below detail how the consumed memory grows with the number of records.

#### Calculating the internal metadata footprint
<a name="ba-shared-blusam-footprint-calculation"></a>



##### Records sizes map
<a name="ba-shared-blusam-footprint-sizesmap"></a>

First, the internal metadata stores a map to hold the size of every record (as an integer) given its RBA (relative byte address, stored as a long number).

The memory footprint of that data structure is, in bytes: `80 * number of records`

This applies to all data set kinds.

##### Indexes
<a name="ba-shared-blusam-footprint-indexes"></a>

Regarding the indexes, for either the primary key of a KSDS or alternate keys on both ESDS and KSDS, the calculation of the footprint depends on two factors:
+ the number of records in the data set;
+ the size of the key, in bytes.

The graphic below shows the size of the key index per record (y-axis) based on the size of the key (x-axis).

![\[Graph showing step-wise increase in index size per record as key size increases.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/indexes_size_per_record.png)


The corresponding formula for evaluating the footprint for a given key index of a data set is:

```
index footprint = number of records * (83 + 8 * (key length / 8))
```

where ' / ' stands for integer division.

Examples:
+ data set 1:
  + number of records = 459 996
  + key length = 15, therefore (key length / 8) = 1
  + index footprint = 459 996 * (83 + (8 * 1)) = 41 859 636 bytes (39 MB approx.)
+ data set 2:
  + number of records = 13 095 783
  + key length = 18, therefore (key length / 8) = 2
  + index footprint = 13 095 783 * (83 + (8 * 2)) = 1 296 482 517 bytes (1.2 GB approx.)

The total footprint for a given data set is the sum of all the footprints for all keys indexes and the footprint for the records sizes map.

For instance, taking the example data set 2, which has only a single key, the global footprint is:
+ Records sizes map: 13 095 783 * 80 = 1 047 662 640 bytes
+ Key indexes: 1 296 482 517 bytes (see above)
+ Total footprint = 2 344 145 157 bytes (2.18 GB approx.)
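The formulas above can be double-checked with a short plain Java sketch (the method names are illustrative, not a Blusam API; `/` is Java integer division, as in the formula):

```java
public class BlusamFootprint {
    // Records sizes map: 80 bytes per record (all data set kinds).
    static long sizesMapFootprint(long records) {
        return 80L * records;
    }

    // Key index: number of records * (83 + 8 * (key length / 8)),
    // where '/' is integer division.
    static long indexFootprint(long records, int keyLength) {
        return records * (83L + 8L * (keyLength / 8));
    }

    public static void main(String[] args) {
        // Example data set 2: 13 095 783 records, single key of length 18.
        long records = 13_095_783L;
        long index = indexFootprint(records, 18);  // 1 296 482 517 bytes
        long sizes = sizesMapFootprint(records);   // 1 047 662 640 bytes
        System.out.println(index + sizes);         // 2 344 145 157 bytes total
    }
}
```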

# AWS Transform for mainframe Blusam Administration Console
<a name="ba-shared-bac-userguide"></a>

The Blusam Administration Console (BAC) is a secure web-application for handling Blusam data sets. This guide covers the BAC user interface. For remote management through REST endpoints, see [Blusam application console REST endpoints](ba-endpoints-bac.md).

**Topics**
+ [Deploying the BAC](bac-deployment.md)
+ [Using the BAC](bac-usage.md)
+ [LISTCAT JSON format](ba-shared-bac-listcat-json-format.md)

# Deploying the BAC
<a name="bac-deployment"></a>

The BAC is available as a single secured web application, using the web-archive format (.war). It is intended to be deployed alongside the AWS Transform for mainframe Gapwalk-Application, in an Apache Tomcat application server, but can also be deployed as a standalone application. If the Gapwalk-Application is present, the BAC inherits its access to the Blusam storage from the Gapwalk-Application configuration.

The BAC has its own dedicated configuration file, named `application-bac.yml`. For configuration details, see [BAC dedicated configuration file](#ba-shared-bac-configuration-file).

The BAC is secured. For details about security configuration, see [Configuring security for the BAC](#ba-shared-bac-securing).

## BAC dedicated configuration file
<a name="ba-shared-bac-configuration-file"></a>

Standalone deployment: if the BAC is deployed without the Gapwalk-Application, the connection to the Blusam storage must be configured in the `application-bac.yml` configuration file.

Default values for the data set configuration used to browse data set records must be set in the configuration file. See [Browsing records from a data set](bac-usage.md#ba-shared-bac-read-dataset). The records browsing page can use an optional mask mechanism that makes it possible to show a structured view of a record's content. Some properties affect the records view when masks are used.

The following configurable properties must be set in the configuration file. The BAC application does not assume any default value for these properties.


| Key | Type | Description | 
| --- | --- | --- | 
| bac.crud.limit | integer | A positive integer value giving the maximum number of records returned when browsing records. Using 0 means unlimited. Recommended value: 10 (then adjust the value data set by data set on the browsing page, to fit your needs). | 
| bac.crud.encoding | string | The default character set name, used to decode record bytes as alphanumeric content. The provided charset name must be Java compatible (see the Java documentation for supported charsets). Recommended value: the legacy charset used on the legacy platform the data sets come from; this is an EBCDIC variant most of the time. | 
| bac.crud.initCharacter | string | The default character (byte) used to initialize data items. Two special values can be used: "LOW-VALUE", the 0x00 byte (recommended value), and "HI-VALUE", the 0xFF byte. Used when masks are applied. | 
| bac.crud.defaultCharacter | string | The default character (byte), as a one character string, used for padding records (on the right). Recommended value: " " (space). Used when masks are applied. | 
| bac.crud.blankCharacter | string | The default character (byte), as a one character string, used to represent blanks in records. Recommended value: " " (space). Used when masks are applied. | 
| bac.crud.strictZoned | boolean | A flag to indicate which zoned mode is used for the record. If true, the Strict zone mode will be used; if false, the Modified zoned mode will be used. Recommended value: true. Used when masks are applied. | 
| bac.crud.decimalSeparator | string | The character used as decimal separator in numeric edited fields (used when masks are applied). | 
| bac.crud.currencySign | string | The default character, as a one character string, used to represent currency in numeric edited fields, when formatting is applied (used when masks are applied). | 
| bac.crud.pictureCurrencySign | string | The default character, as a one character string, used to represent currency in numeric edited fields pictures (used when masks are applied). | 

The following sample is a configuration file snippet.

```
bac.crud.limit: 10
bac.crud.encoding: ascii
bac.crud.initCharacter: "LOW-VALUE"
bac.crud.defaultCharacter: " "
bac.crud.blankCharacter: " "
bac.crud.strictZoned: true
bac.crud.decimalSeparator: "."
bac.crud.currencySign: "$"
bac.crud.pictureCurrencySign: "$"
```

## Configuring security for the BAC
<a name="ba-shared-bac-securing"></a>

Configuring security for the BAC relies on the mechanisms detailed in this documentation page. The authentication scheme is OAuth2, and configuration details for Amazon Cognito or Keycloak are provided.

While the general setup applies, some specifics about the BAC need to be detailed here. Access to the BAC features is protected using a role-based policy that relies on the following roles.
+ ROLE_USER:
  + Basic user role
  + No import, export, creation, or deletion of data sets allowed
  + No control over caching policies
  + No administration features allowed
+ ROLE_ADMIN:
  + Inherits ROLE_USER permissions
  + All data set operations allowed
  + Caching policies administration allowed

## Installing the masks
<a name="ba-shared-bac-masks"></a>

In Blusam storage, data set records are stored in a byte array column in the database, for versatility and performance reasons. A convenient feature of the BAC is a structured, field-based view of the business records, from the application's point of view. This view relies on the SQL masks produced during the AWS Transform for mainframe driven modernization process.

For the SQL masks to be generated, make sure to set the relevant option (`export.sql.masks`) to true in the configuration of the AWS Transform for mainframe refactor Transformation Center:

![\[Property set configuration with export.sql.masks option set to true and boolean type.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-bluinsights-generate-masks-option.png)


The masks are part of the modernization artifacts that can be downloaded from AWS Transform for mainframe refactor for a given project. They are SQL scripts, organized by modernized programs, giving the applicative point of view on data sets records.

For example, using the [AWS CardDemo sample application](https://github.com/aws-samples/aws-mainframe-modernization-carddemo/tree/main/app/cbl), the artifacts downloaded from the modernization of this application contain the following SQL masks for the program CBACT04C.cbl:

![\[List of SQL mask files for CBACT04C program, including account, discrep, and transaction records.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-sample-masks.png)


Each SQL mask name is the concatenation of the program name and the record structure name for a given data set within the program.

For example, looking at the [CBACT04C.cbl](https://github.com/aws-samples/aws-mainframe-modernization-carddemo/blob/main/app/cbl/CBACT04C.cbl) program, the given file control entry:

```
    FILE-CONTROL.      
        SELECT TCATBAL-FILE ASSIGN TO TCATBALF   
               ORGANIZATION IS INDEXED
               ACCESS MODE  IS SEQUENTIAL
               RECORD KEY   IS FD-TRAN-CAT-KEY
               FILE STATUS  IS TCATBALF-STATUS.
```

is associated with the given FD record definition

```
       FILE SECTION. 
       FD  TCATBAL-FILE.  
       01  FD-TRAN-CAT-BAL-RECORD.  
           05 FD-TRAN-CAT-KEY.  
              10 FD-TRANCAT-ACCT-ID             PIC 9(11).  
              10 FD-TRANCAT-TYPE-CD             PIC X(02).
              10 FD-TRANCAT-CD                  PIC 9(04).  
           05 FD-FD-TRAN-CAT-DATA               PIC X(33).
```

The matching SQL mask named `cbact04c_fd_tran_cat_bal_record.SQL` is the mask that gives the point of view of the program CBACT04C.cbl on the FD record named `FD-TRAN-CAT-BAL-RECORD`.

Its content is:

```
-- Generated by AWS Transform for mainframe Velocity
-- Mask : cbact04c_fd_tran_cat_bal_record

INSERT INTO mask (name, length) VALUES ('cbact04c_fd_tran_cat_bal_record', 50);
  INSERT INTO mask_item (name, c_offset, length, skip, type, options, mask_fk) VALUES ('fd_trancat_acct_id', 1, 11, false, 'zoned', 'integerSize=11!fractionalSize=0!signed=false', (SELECT MAX(id) FROM mask));
  INSERT INTO mask_item (name, c_offset, length, skip, type, options, mask_fk) VALUES ('fd_trancat_type_cd', 12, 2, false, 'alphanumeric', 'length=2', (SELECT MAX(id) FROM mask));
  INSERT INTO mask_item (name, c_offset, length, skip, type, options, mask_fk) VALUES ('fd_trancat_cd', 14, 4, false, 'zoned', 'integerSize=4!fractionalSize=0!signed=false', (SELECT MAX(id) FROM mask));
  INSERT INTO mask_item (name, c_offset, length, skip, type, options, mask_fk) VALUES ('fd_fd_tran_cat_data', 18, 33, false, 'alphanumeric', 'length=33', (SELECT MAX(id) FROM mask));
```

Masks are stored in the Blusam storage using two tables:
+ mask: used to identify masks. The columns of the mask table are: 
  + name: used to store the mask identification (used as primary key, so must be unique)
  + length: size in bytes of the record mask
+ mask_item: used to store mask details. Every elementary field from a FD record definition produces a row in the mask_item table, with details on how to interpret the given record part. The columns of the mask_item table are: 
  + name: name of the record field, based on the elementary name, using lowercase and replacing dashes with underscores
  + c_offset: 1-based offset of the record sub-part used for the field content
  + length: length in bytes of the record sub-part used for the field content
  + skip: flag to indicate whether the given record part should be skipped in the view presentation
  + type: the field kind (based on its legacy picture clause)
  + options: additional type options (type-dependent)
  + mask_fk: reference to the mask identifier to attach this item to
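As an illustration of how a mask item row maps onto a record's bytes, here is a minimal, hypothetical Java sketch for an alphanumeric item (not a Blusam API; decoding zoned numeric items is more involved and omitted here):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class MaskItemDecode {
    // Hypothetical helper: interpret an 'alphanumeric' mask item against a
    // record's bytes, using the configured encoding.
    static String decodeAlphanumeric(byte[] record, int cOffset, int length,
                                     Charset encoding) {
        // c_offset is 1-based in the mask_item table; Java arrays are 0-based.
        int start = cOffset - 1;
        return new String(Arrays.copyOfRange(record, start, start + length), encoding);
    }

    public static void main(String[] args) {
        // fd_trancat_type_cd from the sample mask: c_offset = 12, length = 2.
        byte[] record = "00000000123CD0001somedata".getBytes(StandardCharsets.US_ASCII);
        System.out.println(decodeAlphanumeric(record, 12, 2, StandardCharsets.US_ASCII)); // prints: CD
    }
}
```

In a real deployment the encoding would come from `bac.crud.encoding` (typically an EBCDIC variant), not US-ASCII.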

Note the following:
+ SQL masks represent a program's point of view on records from a data set: several programs might have different points of view on a given data set; only install the masks that are relevant for your purpose.
+ A SQL mask can also represent the point of view of a program based on a 01 data structure from the WORKING STORAGE section, not only on a FD record. The SQL masks are organized into sub-folders according to their nature:
  + FD record based masks are located in the sub-folder named `file`
  + 01 data structure based masks are located in the sub-folder named `working` 

  While FD record definitions always match the record content from a data set, 01 data structures might not be aligned or might only represent a subset of a data set record. Before you use them, inspect the code and understand the possible shortcomings.

# Using the BAC
<a name="bac-usage"></a>

Because the BAC is secured and delivers permissions to use features based on the user role, the first step to access the application is to authenticate yourself. After the authentication step, you'll be redirected to the home page. The home page presents the paginated list of data sets found in the Blusam storage:

![\[Blusam Administration Console showing configuration settings and a table of data sets.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-list-datasets.png)


To return to the home page with the data sets listing, choose the AWS Transform for mainframe logo in the upper left corner of any page of the application. The following image shows the logo.

![\[Blu Age logo with stylized blue text and orange hyphen.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/logo_blu_age_aws_console_s.png)


The foldable header, labelled "Blusam configuration", contains information about the used Blusam storage configuration:
+ `Persistence`: the persistent storage engine (PostgreSQL)
+ `Cache Enabled`: whether the storage cache is enabled

On the right side of the header are two drop-down lists, each listing operations related to data sets:
+ **Bulk actions**
+ **Create actions**

To learn about the detailed contents of these lists, see [Existing data set operations](#ba-shared-bac-usage-datasets).

The **Bulk Actions** button is disabled when no data set selection has been made.

You can use the search field to filter the list based on the data set names:

![\[Search field and table showing KSDS data sets with details like keys, records, and dates.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-filtered-list-datasets.png)


The paginated list that follows shows one data set per table row, with the following columns:
+ Selection checkbox: A checkbox to select the current data set.
+ Name: The name of the data set.
+ Type: The type of the data set, one of the following:
  + KSDS
  + ESDS
  + RRDS
+ Keys: A link to show or hide details about the keys (if any). For example, the given KSDS has the mandatory primary key and one alternative key.   
![\[Key details table showing primary and alternative keys with their names, uniqueness, offsets, and lengths.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-shared-bac-keys-details.png)

  There is one row per key, with the following columns. None of the fields are editable.
  + Key nature: either a primary key or an alternative key
  + Name: the name of the key
  + Unique: whether the key accepts duplicate entries
  + Offset: offset of the key start within the record
  + Length: length in bytes of the key portion in the record
+ Records: The total number of records in the data set.
+ Record size max: The maximal size for records, expressed in bytes.
+ Fixed record length: A checkbox that indicates whether the records are fixed length (selected) or variable length (unselected).
+ Compression: A checkbox that indicates whether compression is applied (selected) or not (unselected) to stored indexes.
+ Creation date: The date when the data set was created in the Blusam storage.
+ Last modification date: The date when the data set was last updated in the Blusam storage.
+ Cache: A link to show or hide details about the caching strategy applied to this dataset.   
![\[Cache details section with options to enable cache at startup and warm up cache.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-shared-bac-cache-details.png)
  + Enable cache at startup: A checkbox to specify the startup caching strategy for this data set. If selected, the data set will be loaded into cache at startup time.
  + Warm up cache: A button to load the given data set into cache, starting immediately (but hydrating the cache takes some time, depending on the data set size and number of keys). After the data set gets loaded into cache, a notification like the following one appears.  
![\[Green box indicating successful achievement of DataSet AWS.M2.CARDDEMO.CUSTDATA.V SAM.KSDS cache warm up.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-warmed-up-cache-notification.png)
+ Actions: A drop-down list of possible data sets operations. For details, see [Existing data set operations](#ba-shared-bac-usage-datasets).

At the bottom of the page, there is a regular paginated navigation widget for browsing through the pages of the list of data sets.

## Existing data set operations
<a name="ba-shared-bac-usage-datasets"></a>

For each data set in the paginated list, there is an **Actions** drop-down list with the following content:

![\[Dropdown menu showing options: Read, Load, Export, Clear, and Delete.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-actions-dropdown.png)


Each item in the list is an active link that makes it possible to perform the specified action on the data set:
+ Read: browse records from the data set
+ Load: import records from a legacy data set file
+ Export: export records to a flat file (compatible with legacy systems)
+ Clear: remove all records from the data set
+ Delete: remove the data set from the storage

Details for each action are provided in the following sections.

### Browsing records from a data set
<a name="ba-shared-bac-read-dataset"></a>

When you choose the **Read** action for a given data set, you get the following page.

![\[Blusam Administration Console interface for dataset management with search and filter options.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-browse-empty.png)


The page is made of:
+ a header, with: 
  + Dataset: the data set name
  + Record size: the fixed record length, expressed in bytes
  + Total Records: the total number of records stored for this data set
  + Show configuration button (on the right side): a toggle button to show/hide the data set configuration. At first, the configuration is hidden. When you choose the button, the configuration appears, as shown in the following image.  
![\[Dataset configuration panel with fields for encoding, characters, separators, and currency signs.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-configuration.png)

    When the configuration is shown, two new buttons appear, Save and Reset, used respectively to:
    + save the configuration for this data set and current work session
    + reset the configuration to default values for all fields.
  + A list of configurable properties to tailor the browsing experience for the given data set.

The configurable properties match the configuration properties described in [BAC dedicated configuration file](bac-deployment.md#ba-shared-bac-configuration-file). Refer to that section to understand the meaning of each column and applicable values. Each value can be redefined here for the data set and saved for the work session (using the Save button). After you save the configuration, a banner similar to the one shown in the following image appears.

![\[Success message indicating configuration has been saved for the current dataset view session.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-configuration-saved-banner.png)


The banner states that the work session ends when you leave the current page.

There is an extra configurable property that is not documented in the configuration section: Record size. This is used to specify a given record size, expressed in bytes, that will filter the applicable masks to this data set: only masks whose total length matches the given record size will be listed in the Data mask drop-down list.

Retrieving records from the data set is triggered by the Search button, using all options and filters nearby.

First line of options:
+ the Data mask drop-down list shows the applicable masks (respecting the record size). Note that matching the record size is not enough for a mask to be effectively applicable: the mask definition must also be compatible with the records contents.
+ Max results: limits the number of records retrieved by the search. Set to 0 for unlimited (paginated) results from the data set.
+ Search button: launch the records retrieval using filters and options
+ Clear mask button: will clear the used mask if any and switch back the results page to a raw key/data presentation.
+ Clear filter button: will clear the used filter(s) if any and update the results page accordingly.
+ All fields toggle: When selected, mask items defined with `skip = true` are shown anyway, otherwise mask items with `skip = true` are hidden.

Next lines of filters: It is possible to define a list of filters, based on the usage of filtering conditions applied to fields (columns) from a given mask, as shown in the following image.
+ Filter mask: The name of the mask to pick the filtering column from. When you choose the field, the list of applicable masks appears. You can choose the mask you want from that list.  
![\[Text input field labeled "Filter mask" with a dropdown arrow and placeholder text.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-mask-quick-select.png)
+ Filter column: The name of the field (column) from the mask, used to filter records. When you choose the field, the list of mask columns appears. To fill the **Filter column** field, choose the desired cell.  
![\[Dropdown menu showing filter column options for a data mask, including transaction and account IDs.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-filter-column.png)
+ Filter operator: An operator to apply to the selected column. The following operators are available.
  + equals to: the column value for the record must be equal to the filter value
  + starts with: the column value for the record must start with the filter value
  + ends with: the column value for the record must end with the filter value
  + contains: the column value for the record must contain the filter value
+ Filter options:
  + Inverse: apply the inverse condition for the filter operator; for instance, 'equals to' is replaced by 'not equals to';
  + Ignore case: ignore case on alphanumeric comparisons for the filter operator
+ Filter value: The value used for comparison by the filter operator with the filter column.

Once the minimal set of filter items is set (at least Filter mask, Filter column, Filter operator, and Filter value), the Add Filter button is enabled. Choosing it creates a new filter condition on the retrieved records. Another empty filter condition row is added at the top, and the added filter condition has a Remove filter button that you can use to suppress the given filter condition:

![\[Filter configuration interface with options for mask, column, operator, and value.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-added-filter.png)


When you launch the search, the filtered results appear in a paginated table.

**Note**
+ Successive filters are linked by an **and** or an **or**. Every new filter definition starts by setting the link operator, as shown in the following image.  
![\[Dropdown menu showing options for filter link operator: "and" or "or".\]](http://docs.aws.amazon.com/m2/latest/userguide/images/bac-bac-filter-link-operator.png)
+ There might not be any records that match the given filter conditions.

Otherwise, the results table looks like the one in the following image.

![\[Data table showing transaction records with account IDs, types, and numerical data.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-filtered-results.png)


A header indicates the total number of records that match the filter conditions. After the header, you see the following.
+ Reminder of the used data mask (if any) and the filter conditions.
+ A refresh button that you can use to refresh the whole results table with the latest values from the Blusam storage (which might have been updated by another user, for instance).

For each retrieved record, the table has a row that shows the result of applying the data mask to the records' contents. Each column is the interpretation of the record sub-portion according to the column's type (and using the selected encoding). To the left of each row, there are three buttons:
+ a magnifying glass button: leads to a dedicated page showing the detailed record's contents
+ a pen button: leads to a dedicated edit page for the record's contents
+ a trashcan button: used to delete the given record from the Blusam storage

Viewing the record's contents in detail:

![\[Data mask table showing fields for a transaction record with name, type, options, and value columns.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/bac-bac-record-ro-details.png)

+ Three toggle buttons for hiding or showing some columns: 
  + Hide/show the type
  + Hide/show the display flag
  + Hide/show the range
+ To leave this dedicated page and go back to the results table, choose **Close**.
+ Each row represents a column from the data mask, with the following columns: 
  + Name: the column's name
  + Type: the column's type
  + Display: the display indicator; a green check will be displayed if the matching mask item is defined with `skip = false`, otherwise a red cross will be displayed
  + From & To: the 0-based range for the record sub-portion
  + Value: the interpreted value of the record sub-portion, using type and encoding

Editing the record's contents:

![\[Data record editor showing fields for transaction account details and data.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/bac-bac-record-rw-details.png)


The editing page is similar to the view page described above, except that the mask items values are editable. Three buttons control the update process:
+ Reset: resets the editable values to the initial record values (prior to any editing).
+ Validate: validates the input with regard to the mask item type. For each mask item, the result of the validation is shown with visual labels (`OK` and a check mark if validation succeeded, `ERROR` and a red cross if validation failed, alongside an error message giving hints about the failure). If validation succeeds, two new buttons appear:
  + Save: attempt to update the existing record into Blusam storage
  + Save a copy: attempt to create a new record into Blusam storage  
![\[Data record form with fields for transaction account details and validation status.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/bac-bac-record-rw-valid-details.png)
  + If saving the record to the storage is successful, a message is displayed and the page will switch to a read-only mode (mask items values cannot be edited anymore):   
![\[Data mask record details showing fields, types, options, and values in a table format.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-record-updated.png)
  + If for any reason the record persistence to the storage fails, an error message is displayed in red, providing a failure reason. The most common cause of failure is that storing the record would lead to a key corruption (invalid or duplicate key). For an illustration, see the following note. 
  + To exit, choose the **Close** button.
+ Cancel: Ends the editing session, closes the page, and takes you back to the records list page.

**Note:**
+ The validation mechanism only checks that the mask item value is formally compatible with the mask item type. For example, see this failed validation on a numeric mask item:  
![\[Data entry form with validation error on numeric field, showing incompatible value.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/bac-bac-record-rw-invalid-format.png)
+ The validation mechanism might try to auto-correct invalid input, displaying an informational message in blue to indicate that the value has been automatically corrected, according to its type. For example, inputting 7XX0 as the numeric value in the numeric `fd_trncat_cd` mask item:   
![\[Data mask interface showing auto-correction of numeric value 7XX0 in fd_trncat_cd field.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/bac-bac-record-rw-half-invalid-format.png)

  Calling validation leads to the following:  
![\[Data mask interface showing record fields, types, options, and values for a transaction category.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/bac-bac-record-rw-half-invalid-format-autofix.png)
+ The validation mechanism does not check whether the given value is valid in terms of key integrity (if any unique key is involved for the given data set). For instance, despite validation being successful, if provided values lead to an invalid or duplicate key situation, the persistence will fail and an error message will be displayed:  
![\[Data entry form with error message and fields for transaction details.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/bac-bac-record-rw-invalid-key.png)

Deleting a record:

To delete a record, choose the trashcan button:

![\[Confirmation dialog for deleting a record, with Cancel and Confirm options.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-record-deletion-confirmation-popup.png)


### Loading records into a data set
<a name="ba-shared-bac-load-dataset"></a>

To load records into a data set, choose **Actions**, then choose **Load**.

![\[Dropdown menu showing options: Read, Load, Export, Clear, and Delete.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-load-cmd.png)


A window with load options appears.

![\[Data set loading interface with reading parameters and file selection options.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-load-popup.png)


At first, both the **Load on server** and **Load on Blusam** buttons are disabled.

Reading parameters:
+ Record length kind:
  + Fixed or Variable record length: use the radio-button to specify whether the legacy data set export uses fixed length records or variable length records (the records are expected to start with RDW bytes). If you choose Fixed, the record length must be specified (in bytes) as a positive integer value in the input field. The value should be pre-filled by the information coming from the data set. If you choose Variable, the given input field disappears.
  + File selection: 
    + Local: choose the data set file from your local computer, using the file selector below. (Note: the file selector uses your browser's locale for its messages; it is shown here in French, but it might look different on your side, which is expected.) After you make the selection, the window is updated with the data file name and the **Load on server** button is enabled:   
![\[File selection interface with Local and Server options, Browse button, and Load on server button.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-load-selection.png)

      Choose **Load on server**. After the progress bar reaches its end, the **Load on Blusam** button gets enabled:  
![\[Progress bar fully loaded, with "Load on Blusam" button enabled.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-load-selection-uploaded.png)

      To complete the load process to the Blusam storage, choose the **Load on Blusam**. Otherwise, choose **Cancel**. If you choose to go on with the load process, a notification will appear in the lower right corner after the loading process is completed:  
![\[Green success notification indicating file loading completed successfully.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-load-notification.png)
    + Server: choosing this option makes an input field appear while the **Load on server** button disappears. The input field is where you must specify the path to the data set file on the Blusam server (this assumes that you have transferred the given file to the Blusam server first). After you specify the path, **Load on Blusam** gets enabled:   
![\[File selection interface with server option and file path input field.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-load-from-server.png)

      To complete the loading process, Choose **Load on Blusam**. Otherwise, choose **Cancel**. If you choose to proceed with the loading, a notification appears after the loading process is complete. The notification is different from the load from the browser as it displays the data file server path followed by the words **from server**:  
![\[Green success notification showing file loaded from server path.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-load-from-server-notification.png)
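If you need to inspect a variable-length export outside of the console, the records can be split by hand. The sketch below is purely illustrative (not part of BAC) and assumes the common IBM layout where each record starts with a 4-byte RDW whose first two bytes hold the total record length (RDW included) as a big-endian integer; verify this against your own exports.

```python
import struct

def split_variable_records(data: bytes):
    """Split a byte stream of variable-length records into individual payloads.

    Assumption: each record starts with a 4-byte RDW whose first two bytes
    hold the record length (including the 4 RDW bytes) in big-endian order.
    """
    records, pos = [], 0
    while pos < len(data):
        (length,) = struct.unpack(">H", data[pos:pos + 2])
        records.append(data[pos + 4:pos + length])  # payload without the RDW
        pos += length
    return records
```

For example, a stream holding a 2-byte record followed by a 3-byte record would yield two payloads.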

### Exporting records from a data set
<a name="ba-shared-bac-export-dataset"></a>

To export data set records, choose **Actions** in the current data set row, then choose **Export**:

![\[Dropdown menu showing options: Read, Load, Export, Clear, and Delete.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-export-cmd.png)


The following pop-up window appears.

![\[Data dump configuration window with options for local or server storage and zip dump.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-export-popup.png)


Options:

**To**: a radio button choice for picking the export destination, either as a download in the browser (**Local (on browser)**) or to a given folder on the **Server** hosting the BAC application. If you choose the **Server** option, a new input field is displayed: 

![\[Radio button for selecting Server as the export destination, with an input field for target folder.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-export-server-folder-location.png)


As the red asterisk to the right of the input field indicates, you must provide a valid folder location on the server (the **Dump** button remains inactive until a folder location is provided).

If you plan to manipulate the exported data set file after the export, make sure that you have sufficient access rights to the server file system.

**Zip dump**: a checkbox that produces a zipped archive instead of a raw file.

**Options**: To include a Record Descriptor Word (RDW) at the beginning of each record in the exported data set (for variable-length record data sets), choose **Include RDW fields**.

To launch the data set export process, choose **Dump**. If you export to the browser, check your download folder for the exported data set file. The file has the same name as the data set:

![\[File name AWS.M2.CARDDEMO.CARDXREF.VSAM.KSDS with details on size and type.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-export-result-file.png)


**Note:**
+ For KSDS, the records are exported following the primary key order.
+ For ESDS and RRDS, the records are exported following the RBA (Relative Byte Address) order.
+ For all data set kinds, records are exported as raw binary arrays (no conversion of any kind is performed), ensuring direct compatibility with legacy platforms.
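Because records are exported as raw binary with no separators, a fixed-length export can be split back into records with a few lines of code. The helper below is an illustrative sketch, not part of the console:

```python
def split_fixed_records(data: bytes, record_length: int):
    """Split a raw exported data set file into fixed-length records.

    The export holds raw binary records back to back, so the file size
    should be an exact multiple of the record length.
    """
    if record_length <= 0:
        raise ValueError("record length must be positive")
    if len(data) % record_length != 0:
        raise ValueError("file size is not a multiple of the record length")
    return [data[i:i + record_length]
            for i in range(0, len(data), record_length)]
```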

### Clearing records from a data set
<a name="ba-shared-bac-clear-dataset"></a>

To clear all records from a data set, choose **Actions**, then choose **Clear**:

![\[Dropdown menu showing options: Read, Load, Export, Clear, and Delete.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-clear-cmd.png)


After all records are removed from a data set, the following notification appears.

![\[Green success notification showing "Succeeded" with a checkmark and data set details.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-clear-notification.png)


### Deleting a data set
<a name="ba-shared-bac-delete-dataset"></a>

To delete a data set, choose **Actions**, then choose **Delete**:

![\[Dropdown menu showing options: Read, Load, Export, Clear, and Delete.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-delete-cmd.png)


After you delete a data set, the following notification appears:

![\[Green success notification with checkmark indicating data set deletion completed.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-delete-notification.png)


### Bulk operations
<a name="ba-shared-bac-bulk-usage-existing-datasets"></a>

Three bulk operations are available on data sets:
+ Export
+ Clear
+ Delete

Bulk operations apply only to a selection of data sets (at least one data set must be selected). To select data sets, select the checkboxes to the left of the data set rows in the data sets list table. Selecting at least one data set enables the **Bulk Actions** drop-down list:

![\[Dropdown menu showing Bulk Actions options: Export, Clear, and Delete.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-bulk-actions-dropdown.png)


Apart from the fact that they apply to a selection of data sets rather than a single one, these actions are similar to those described previously; refer to the dedicated action documentation for details. The pop-up window text is slightly different to reflect the bulk nature. For example, when you delete several data sets, the pop-up window looks like the following:

![\[Confirmation dialog asking if user wants to delete all selected data sets.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-delete-bulk-popup.png)


## Creating operations
<a name="ba-shared-bac-usage-creating-datasets"></a>

### Create a single data set
<a name="ba-shared-bac-create-single-dataset"></a>

Choose **Actions**, then choose **Create single data set**:

![\[Dropdown menu showing "Bulk Actions" and "Create Actions" buttons with options.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-single-create.png)


The data set creation form is displayed as a pop-up window:

![\[Data set creation form with fields for name, record size, type, and other configuration options.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-creation-form-window.png)


You can specify the following attributes for the data set definition:
+ Enabling and disabling naming rules: Use the **Disable naming rules**/**Enable naming rules** toggle to disable or enable data set naming conventions. We recommend that you leave the toggle at its default value, with data set naming rules enabled (the toggle should display **Disable naming rules**):  
![\[Toggle switch for disabling or enabling naming rules, currently set to "Disable naming rules".\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-disable-dataset-naming-rules.png)  
![\[Toggle switch for enabling naming rules, shown in the off position.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-enable-dataset-naming-rules.png)
+ Data Set name: The name for the data set. If you specify a name that is already in use, the following error message appears.  
![\[Error message indicating dataset name already exists, prompting user to choose another.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/bac-bac-dataset-name-already-used-err-msg.png)

  The name must also respect the naming convention if it is enabled:  
![\[Input field with naming convention rule for dataset names using alphabetic or national characters.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-name-segment-convention-err-msg.png)  
![\[Text field labeled "DataSet Name" with input validation instructions for allowed characters.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-name-segment-characters-err-msg.png)  
![\[Input field for dataset name with character limit instruction in red text.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-name-segment-length-err-msg.png)  
![\[Input field with error message indicating dataset name must not end with a period.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-name-ends-with-period-err-msg.png)
+ Record size max: This must be a positive integer representing the record size for a data set with fixed-length records. You can leave it blank for data sets with variable-length records.
+ Fixed length record: A checkbox to specify whether the record length is fixed or variable. If selected, the data set will have fixed-length records; otherwise, the record length will be variable.

  When you import legacy data into a variable-length record data set, the provided legacy records must contain the Record Descriptor Word (RDW) that gives the length of each record.
+ Data set Type: A drop-down list for specifying the current data set type. The following types are supported.
  + ESDS
  + LargeESDS
  + KSDS

  For KSDS, you must specify the primary key:  
![\[Form fields for KSDS dataset configuration, including Primary Key, Offset, Length, and Unique option.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-creation-ksds.png)

  For the primary key, specify the following:
  + Name: This field is optional. The default is **PK**.
  + Offset: The 0-based offset of the primary key within the record. The offset must be a non-negative integer. This field is required.
  + Length: The length of the primary key. The length must be a positive integer. This field is required.

  For KSDS and ESDS, you can optionally define a collection of alternate keys by choosing the plus button in front of the **Alternate Keys** label. Each time you choose that button, a new alternate key definition section appears in the data set creation form:  
![\[Form fields for defining alternate keys with options for key name, offset, length, and uniqueness.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-altkey-definition.png)

  For each alternate key, you must provide the following:
  + Name: This field is optional. The default value is **ALTK\$1**, where \$1 represents an auto-incremented counter that starts at 0.
  + Offset: The 0-based offset of the alternate key within the record. Must be a non-negative integer. This field is required.
  + Length: The length of the alternate key. The length must be a positive integer. This field is required.
  + Unique: A checkbox to indicate whether the alternate key accepts duplicate entries. If selected, the alternate key is defined as unique (NOT accepting duplicate key entries). This field is required.

  To remove an alternate key definition, use the trashcan button on the left.
+ Compression: A checkbox to specify whether compression will be used to store the data set.
+ Enable cache at startup: A checkbox to specify whether the data set should be loaded into cache at application startup.
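The offset and length fields above can be pictured as byte slicing: a key defined by a 0-based offset and a length selects exactly that slice of each record. The following helper is purely illustrative (not part of the console):

```python
def extract_key(record: bytes, offset: int, length: int) -> bytes:
    """Return the key bytes of a record, given the key's 0-based offset and length.

    Mirrors how the primary and alternate key definitions in the creation
    form map onto record contents.
    """
    if offset < 0 or length <= 0 or offset + length > len(record):
        raise ValueError("key does not fit within the record")
    return record[offset:offset + length]
```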

After you specify the attribute definitions, choose **Create** to proceed:

![\[Data set creation form with fields for name, size, type, keys, and other settings.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-form-complete-sample.png)


The creation window closes, and the home page showing the list of data sets is displayed. You can view the details of the newly created data set.

![\[Data set details showing primary and alternative keys with their properties.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-freshly-created.png)


### Create a single data set in Multi-schema mode
<a name="ba-shared-bac-create-single-dataset-Multi-schema"></a>

A data set can be created in Multi-schema mode by prefixing the data set name with the schema name followed by a pipe (`|`) symbol (for example, `schema1|AWS.M2.CARDDEMO.ACCTDATA.VSAM.KSDS`).

**Note**  
The schema used for creating the data set should be specified in the `application-main.yml` configuration file. For more information, see [Multi-schema configuration properties](ba-shared-blusam.md#ba-shared-blusam-configuration-multi-schema).

![\[Data set creation form with fields for name, size, type, and other configuration options.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-create-single-dataset-Multi-schema.png)


If no schema prefix is provided, the data set is created in the default schema specified in the Blusam datasource URL (see [Blusam Datasource configuration](ba-shared-blusam.md#ba-shared-blusam-configuration-multi-schema)). If no schema is specified in the Blusam datasource URL, the `public` schema is used by default.
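The schema-prefix convention is easy to mimic when scripting against data set names. The helper below is purely illustrative; the function name and the `public` fallback are assumptions standing in for whatever default schema your Blusam datasource URL specifies:

```python
def parse_data_set_name(name: str, default_schema: str = "public"):
    """Split a data set name into (schema, data set name).

    Names may carry a schema prefix separated by a pipe, e.g.
    'schema1|AWS.M2.CARDDEMO.ACCTDATA.VSAM.KSDS'. Without a prefix,
    the default schema applies.
    """
    schema, sep, rest = name.partition("|")
    return (schema, rest) if sep else (default_schema, name)
```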

**Note**  
In Multi-schema mode, the BAC console displays the schema information of the data set in the first column.

![\[Blusam Administration Console showing configuration details and dataset information.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-create-display-datasets-Multi-schema.png)


### Create data sets from LISTCAT
<a name="ba-shared-bac-create-datasets-from-listcat"></a>

This feature lets you take advantage of the LISTCAT JSON files created during the AWS Transform for mainframe transformation process, using the AWS Transform for mainframe refactor Transformation Center. LISTCAT exports from the legacy platform are parsed and transformed into JSON files that hold the data set definitions: names, data set types, key definitions, and whether the record length is fixed or variable.

Having the LISTCAT JSON files makes it possible to create data sets directly, without having to manually enter all the required information. You can also create a collection of data sets at once instead of creating them one by one.

If no LISTCAT JSON file is available for your project (for example, because no LISTCAT export file was available at transformation time), you can always create one manually, provided you adhere to the LISTCAT JSON format detailed in the appendix.

From the Create Actions drop-down list, choose **Create data sets from LISTCAT**.

The following dedicated page will be displayed:

![\[Interface for creating datasets from LISTCAT files, with options for file source and folder path.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-load-LISTCAT.png)


At this stage, the **Load** button is disabled, which is expected.

Use the radio buttons to specify how you want to provide the LISTCAT JSON files. There are two options:
+ You can use your browser to upload the JSON files.
+ You can select the JSON files from a folder location on the server. To choose this option, you must first copy the JSON files to the given folder path on the server with proper access rights.

**To use JSON files on the server**

1. Set the folder path on the server, pointing at the folder containing the LISTCAT JSON files:  
![\[Text input field for server folder path with a "Load" button below.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-creation-from-server-listcat-files.png)

1. Choose the **Load** button. All recognized data set definitions will be listed in a table:  
![\[List of AWS_M2_CARDDEMO data set definitions from LISTCAT, showing various VSAM_KSDS types.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-creation-from-server-listcat-files-list.png)

   Each row represents a data set definition. You can use the trashcan button to remove a data set definition from the list.
**Important**  
The removal from the list is immediate, with no warning message.

1. The name on the left is a link. Choose it to show or hide the details of the data set definition, which is editable. You can freely modify the definition, starting from the values parsed from the JSON file.  
![\[Data set configuration form with fields for name, record size, type, and key settings.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-creation-definition-edit-form.png)

1. To create all data sets, choose **Create**. All data sets will be created, and will be displayed on the data sets results page. The newly created data sets will all have 0 records.  
![\[Data sets results page showing newly created AWS M2 CARDDEMO data sets with 0 records.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-freshly-created-datasets-from-listcat.png)

**To upload files to the server**

1. This option is similar to using the files from the server folder path, but in this case you must first upload the files using the file selector. Select all files to upload from your local machine, then choose **Load on server**.  
![\[File upload interface with Browse, Load on server, and Remove all buttons, and a progress bar.\]](http://docs.aws.amazon.com/m2/latest/userguide/images/ba-bac-dataset-creation-from-uploaded-listcat-files.png)

1. When the progress bar reaches the end, all files have been successfully uploaded to the server and the **Load** button is enabled. Choose the **Load** button and use the discovered data set definitions as explained previously.

# LISTCAT JSON format
<a name="ba-shared-bac-listcat-json-format"></a>

The LISTCAT JSON format is defined by the following attributes:
+ "catalogId" (optional): the identifier of the legacy catalog, as a String, or "default" for the default catalog.
+ "identifier": the data set name, as a String.
+ "isIndexed": a boolean flag to indicate KSDS: true for KSDS, false otherwise.
+ "isLinear": a boolean flag to indicate ESDS: true for ESDS, false otherwise.
+ "isRelative": a boolean flag to indicate RRDS: true for RRDS, false otherwise.
+ **Note**: "isIndexed", "isLinear", and "isRelative" are mutually exclusive.
+ "isFixedLengthRecord": a boolean flag: true for a fixed-length record data set, false otherwise.
+ "avgRecordSize": the average record size in bytes, expressed as a positive integer.
+ "maxRecordSize": the maximum record size in bytes, expressed as a positive integer. Should be equal to "avgRecordSize" for fixed-length records.
+ For KSDS only: a mandatory primary key definition (as a nested object):
  + labelled "primaryKey"
  + "offset": the 0-based byte offset of the primary key in the record.
  + "length": the length in bytes of the primary key.
  + "unique": must be set to true for the primary key.
+ For KSDS/ESDS: a collection of alternate keys (as a collection of nested objects):
  + labelled "alternateKeys"
  + For each alternate key:
    + "offset": the 0-based byte offset of the alternate key in the record.
    + "length": the length in bytes of the alternate key.
    + "unique": set to true if the alternate key does not accept duplicate entries, false otherwise.
+ If no alternate keys are present, provide an empty collection:

  ```
  "alternateKeys": []
  ```

The following is a sample KSDS LISTCAT JSON file.

```
{
  "catalogId": "default",
  "identifier": "AWS_M2_CARDDEMO_CARDXREF_VSAM_KSDS",
  "isIndexed": true,
  "isLinear": false,
  "isRelative": false,
  "isFixedLengthRecord": true,
  "avgRecordSize": 50,
  "maxRecordSize": 50,
  "primaryKey": {
    "offset": 0,
    "length": 16,
    "unique": true
  },
  "alternateKeys": [
    {
      "offset": 25,
      "length": 11,
      "unique": false
    }
  ]
}
```
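A definition that violates the rules above can be caught before loading it into the console. The following sketch is not part of BAC; it simply checks the constraints listed in this appendix against one parsed LISTCAT JSON entry:

```python
def validate_listcat_entry(entry: dict) -> list:
    """Return a list of problems found in one LISTCAT JSON data set definition.

    Checks: exactly one of the type flags is true, KSDS entries carry a
    primary key, record sizes are positive integers, and fixed-length
    records have matching average and maximum sizes.
    """
    problems = []
    flags = [entry.get("isIndexed"), entry.get("isLinear"), entry.get("isRelative")]
    if sum(bool(f) for f in flags) != 1:
        problems.append("exactly one of isIndexed/isLinear/isRelative must be true")
    if entry.get("isIndexed") and "primaryKey" not in entry:
        problems.append("KSDS definitions require a primaryKey object")
    for field in ("avgRecordSize", "maxRecordSize"):
        if not isinstance(entry.get(field), int) or entry[field] <= 0:
            problems.append(f"{field} must be a positive integer")
    if entry.get("isFixedLengthRecord") and entry.get("avgRecordSize") != entry.get("maxRecordSize"):
        problems.append("fixed-length records should have avgRecordSize == maxRecordSize")
    return problems
```

Running it on the sample KSDS definition above would report no problems, while an entry with two type flags set or a missing primary key would be flagged.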