Automatic data migration

TopLogic offers the possibility to perform an automatic data migration when changing framework versions and when importing new application versions. When a framework version is changed, the new framework version provides corresponding migration descriptions. The same mechanism can also be used in application development to migrate existing data when a new application version is imported.

Such an automatic data migration can consist of several successive steps that operate on the different levels of the persistence layer:

  • SQL directly on the database
  • Replay of the history
  • versioned changes at KnowledgeBase level
  • versioned model updates

schema version

A migration is automatically triggered at application startup if the schema version of the deployed application is greater than the schema version found in the database. The schema version does not consist of a simple version number, but of the set of migration descriptions that have already been imported into the dataset of the application. Using this system, the schema version can be composed modularly from a framework version and one or more application schema versions.

Technically, the schema version is stored in the table TL_PROPERTIES in keys that begin with the prefix databaseVersion.. For each framework or application module, it is stored here which migration description was last imported into the application's dataset. In an application consisting of only one module my-app for which a migration my-migration was performed last, the schema version could be composed as follows:

SELECT * FROM TL_PROPERTIES

node propKey propValue
__global__ databaseVersion.tl-model-search Upgrade_layout
__global__ databaseVersion.tl Multiple_forms_inapp
__global__ databaseVersion.tl-element Move_Roles_from_Singletons_to_Modules
__global__ databaseVersion.my-app My_migration

Each module of an application can provide its contribution to the schema version of the application. The schema version of a module is equivalent to the last migration description that was imported into the database for this module.

In the above example, the framework modules tl-model-search, tl, and tl-element have corresponding versions that were delivered with the framework version on which the application is based. The application itself consists of only one module my-app and has the schema version my-migration. There is no corresponding entry in the schema version in an initially set up application for which no migration has yet been performed.

Migration description

Migrations are described in XML files. These end with .migration.xml. To create a migration file, the Ant target create_migration_script_template must be called from the build.xml of the respective module for which the migration is to be performed. The target requires you to enter the name of the migration description. The script automatically calculates the dependencies and creates the new migration file in the folder WEB-INF/kbase/migration/[module-name]/. Migrations can be defined modularly, but their order must be defined globally. For this, each migration description must contain all its base versions on which it is based. When the migration template is created, this base version is entered into the dependencies of the new migration description.

In order for an application module to contribute to the schema version at all, this must be made known once in the application configuration:

<config service-class="com.top_logic.knowledge.service.migration.MigrationService">
   <instance>
      <modules>
         <module name="[module-name]>" />
      </modules>
   </instance>
</config>

A corresponding entry is automatically created by the application template when a new application is set up. With this setting, migration descriptions are then searched for in the folder WEB-INF/kbase/migration/[module-name]/ when the application is started and imported for this module.

A migration file has the following structure:

<migration>
   <version name="[migration-name]"
      module="[module-name]"
   />
   <dependencies>
      <dependency name="[migration-name-of-dependency]"
         module="[module-name-of-dependency]"
      />
      ...
   </dependencies>
   <processors>
      ...
   </processors>
   <migration>
      ...
   </migration>
   <post-processors>
      ...
   </post-processor>
</migration>

The version and dependencies sections are automatically populated at creation and must not be customized. The latest versions of the dependent modules are listed at dependencies. Before this migration can be executed, all migrations of the dependent modules up to the listed version must have been performed. The other sections are explained below.

SQL Migration

The processors in the processors section are executed at the beginning of a migration.

<processors>
   <processor class="[migration-processor-class]"/>
   ...
</processors>

The migration-processor-class must be an implementation ofcom.top_logic.knowledge.service.migration.MigrationProcessor. The processor works in its interface directly on a connection to the database and can execute any SQL commands over it. This makes it possible to make unversioned changes to the application data. If multiple migrations are executed at once, all processors are executed first before the replay migration begins.

There are a number of predefined processors that are defined by their own tag names. Examples include:

  • Updating the table definition. If a database table is created or deleted manually, you can use
    <store-type-configuration />
    
    to make this new configuration known to the application.
  • If only one SQL script is to be executed, this can be done with the SQL processor. For this, a script is expected in the dialect of the database. With the processor
    <sql file-name-prefix="webinf://kbase/migration/scripts/migration4711" />
    
    the SQL script to be executed is expected e.g. in the folder "WEB-INF/kbase/migration/scripts". For example, if the target database is "Oracle" then the file "migration4711.oracle.sql" is executed, on a "MySQL" the file "migration4711.mysql.sql" and on a "MSSQL" database "migration4711.mssql.sql".
  • For unversioned model changes (e.g., creating a compartment object or a property of a compartment object as if it always existed), there are a set of pre-built processors described in Unversioned Model Adjustment. The advantage of these processors is that the developer can use them independently of the database.

Replay

If the subsection migration is filled, a replay of the application history is performed. In this process, the changes are extracted from the persistence layer into change sets, rewritten and reimported.

<migration>
   <steps>
      <step name="[name-of-migration-step]">
         <rewriter class="[rewriter-class]"/>
         ...
      </step>
  </steps>		
</migration>

The replay migration consists of potentially several groups of named migration steps (steps). A migration step consists of executing a series of change-set rewrites (rewriter). The rewriter-class must implement thecom.top_logic.knowledge.event.convert.EventRewriter interface. A rewriter get passed a ChangeSet, extracted from the persistence layer. This can be modified or removed. Furthermore, new ChangeSets can also be created. These modified or newly created ChangeSets are then injected into the target database.

There are a number of predefined rewriters that can be defined using custom tag names. Examples of these are:

  • Changing the annotations of a TLModelPart: This replaces the annotations of a TLModelPart in the database with the configured value. E.g. in the module DemoTypes in the type DemoTypes.A the attribute date2:
    <set-part-annotation  module="DemoTypes" type="DemoTypes.A" attribute="date2">
     <annotations>
    
    <!-- Annotation des Attributes wie es z.B. in der Datenbank oder einer `*.model.xml` steht. -->
      <annotations>
       <format format="d. MMMMM y"/>
      </annotations>
    
     </annotations>
    </set-part-annotation>
    
    Note the additional annotations tag.
  • Changing parts of the configuration of a TLModelPart: Here not the whole annotation is replaced but only special parts. E.g. replace in all configurations <table name="OldTable"/> by <table name="NewTable"/>:
    <attribute-value-rewrite
     types="MetaElement"
     source-attribute="annotations"
     target-attribute="annotations"
    >
     <config-as-string>
      <regex-replace>
       <pattern><![CDATA[<table name="OldTable"/>]]></pattern>
       <replacement><![CDATA[<table name="NewTable"/>]]></replacement>
      </regex-replace>
     </config-as-string>
    </attribute-value-rewrite>
    
  • Transformation of a String- or ConfigurationItem-valued attribute: If the value of a database attribute is XML-valued, it can be changed using an XSL transformation. E.g. in the database type "A" the attribute "attr" with the help of the XSLT script "xslt-script.xls":
    <rewriter class="com.top_logic.knowledge.service.db2.migration.rewriters.AttributeRewriter"
     types="A"
    >
     <algorithm class="com.top_logic.knowledge.service.db2.migration.rewriters.XsltValueTransform"
      attribute="attr"
      transform="WEB-INF/.../xslt-script.xls"
     />
    </rewriter>
    
  • Removing Attribute Values: Here database attributes whose names satisfy a given Regular Expression can be removed. This is e.g. useful to remove the data of removed attributes as well. E.g. in the database type "A" the values of the attributes whose names end with "Legacy":
    <rewriter class="com.top_logic.knowledge.service.db2.migration.rewriters.AttributeRewriter"
     types="A"
    >
     <algorithm class="com.top_logic.knowledge.service.db2.migration.rewriters.AttributeNameFilter"
      exclude-pattern=".*Legacy"
     />
    </rewriter>
    
  • Renaming Database Types. E.g. rename the database type "A" to "B":
    <type-renaming types="A" target-type="B"/>
    

Versioned Changes

After a potential replay migration, versioned changes can be made to the dataset directly at the persistence layer level. Such changes happen even before the application and thus the dynamic type system starts. Therefore, no application functionality other than access to the versioning persistence layer (knowledge base) is available for this part of a migration.

Corresponding processors are configured in the post-processors section. A post-processor-class must implement thecom.top_logic.knowledge.service.migration.MigrationPostProcessor interface.

<post-processors>
   <post-processor class="[post-procesor-class]"/>
</post-processors>

Such a post-processor can directly perform operations on the KnowledgeBase. In doing so, unlike SQL migration or rewrite, no change can be made to the application history. Instead, one-time versioned changes can be made to the application dataset.