Upgrade from 4.5 and earlier to 5.0

Prior to upgrade, please review the Release Model. The Release Model describes the various release statuses: Preview, Early Review, and Generally Available. Refer to the Release Model to identify the release version and status that is most suitable for upgrading your Sandbox, Developer, User Acceptance Testing (UAT), or Production environment.

Note

If you are using a version of Incorta that is earlier than Release 4.2, you must first upgrade to Release 4.2.


Upgrade from 4.2, 4.3, 4.4, or 4.5 to 5.0

This guide details how to upgrade a standalone Incorta cluster. Upgrading your Incorta cluster to Release 5.0 requires team resources:

  • a System Administrator with root access to the host or hosts running Incorta Nodes, the host running the Cluster Management Console (CMC), and the host or hosts running Apache Spark
  • a CMC Administrator
  • a Database Administrator
  • a SuperUser that can access each tenant in the Incorta environment
  • an Incorta Developer to resolve identified issues with formula expressions, schema alias, joins between tables, and dependencies between objects such as dashboards and business schemas

It also requires time, as the following estimates for each stage indicate:

Stage                                   Estimated Time
Prepare for Upgrade Readiness           15 minutes to 3 hours
Achieve Upgrade Readiness               2 hours to 3 days
Stop the Incorta cluster                5 minutes to 30 minutes
Create backups                          15 minutes to 3 hours
Upgrade the Incorta cluster             15 minutes to 3 hours
Start the Incorta cluster               5 minutes to 5 hours
Upgrade the Incorta Metadata database   5 minutes to 3 hours
Update Snapshot files                   15 minutes to 1 day
Verify the successful upgrade           15 minutes to 1 day

Prepare for Upgrade Readiness

Preparing for Upgrade Readiness requires:

  • a System Administrator with root access to the host or hosts running Incorta Nodes as well as the host running the Cluster Management Console (CMC)
  • a CMC Administrator
  • a Database Administrator

The estimated time to complete the following is from 15 minutes to 3 hours:

  • Pause all scheduled jobs
  • Export all tenants
  • Add a Create View database grant

Pause all scheduled jobs in the CMC

Enable this setting to pause active scheduled schema loads, dashboards, and data alerts. This is helpful when importing or exporting an existing tenant. Here are the steps to enable this option in the default tenant configuration:

  • In the Navigation bar, select Clusters.
  • In the cluster list, select a Cluster name.
  • In the canvas tabs, select Cluster Configurations.
  • In the panel tabs, select Default Tenant Configurations.
  • In the left pane, select Data Loading.
  • Enable the Pause Scheduled Jobs setting.
  • Select Save.

Export all tenants with the Tenant Management Tool

A System Administrator with root access to the host running the Cluster Management Console (CMC) is able to run the Tenant Management Tool (TMT). Here are the steps:

  • Secure shell in to the CMC host.
  • As the incorta user, navigate to the installation path of the TMT. The default installation path for the TMT is:

<CMC_INSTALLATION_PATH>/IncortaAnalytics/cmc/tmt

  • For each tenant in the cluster, create a tenant export file.
./tmt.sh -clnm <CLUSTER_NAME> --export <TENANT_NAME> <TENANT_EXPORT>.zip

Example

./tmt.sh -clnm myIncortaCluster --export myTenant /tmp/myTenant.zip
  • As desired, secure copy each tenant export file to your local host.
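
The per-tenant export step above can be looped. The following is a minimal sketch, not part of the TMT itself: the function name export_tenants and the TMT_CMD variable are hypothetical, and the only TMT invocation used is the documented one (-clnm and --export). It assumes you run it from the TMT directory and that each tenant name is safe to use as a file name.

```shell
# Hypothetical helper: export every named tenant with the TMT.
# TMT_CMD is an assumption of this sketch; it defaults to ./tmt.sh,
# so run this from the TMT installation directory.
TMT_CMD="${TMT_CMD:-./tmt.sh}"

export_tenants() {
  cluster="$1"   # cluster name as known to the CMC
  out_dir="$2"   # directory that receives one <tenant>.zip per tenant
  shift 2
  for tenant in "$@"; do
    # Same invocation as the documented single-tenant export.
    "$TMT_CMD" -clnm "$cluster" --export "$tenant" "$out_dir/$tenant.zip" || {
      echo "export failed for tenant: $tenant" >&2
      return 1
    }
  done
}

# Usage (tenant names are placeholders):
#   export_tenants myIncortaCluster /tmp myTenant anotherTenant
```

The loop stops at the first failed export so that a missing tenant export file is noticed before the upgrade continues.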

Add a Create View database grant

A Database Administrator with root access to the MySQL or Oracle database server that runs the Incorta Metadata database is able to add the Create View database grant. The estimated time to complete the following is at least 5 minutes.

MySQL

Here are the steps for MySQL:

  • Sign in to the MySQL Incorta metadata database as the root user.
mysql -h0 -uroot -proot_password incorta_metadata
Note

-h = host, where 0 is a shorthand reference for localhost
-u = user, where root is the user
-p = password, where the password is root_password
incorta_metadata is the database

  • Verify the incorta database user for the incorta_metadata database.
SELECT User, Host FROM mysql.user WHERE user = 'incorta';
  • Verify the current grants for all users.
SHOW GRANTS for 'incorta'@'localhost';
SHOW GRANTS for 'incorta'@'127.0.0.1';
SHOW GRANTS for 'incorta'@'192.168.128.101';
  • If needed, add the CREATE VIEW grant to all incorta users.
GRANT CREATE VIEW ON `incorta_metadata`.* TO 'incorta'@'localhost';
GRANT CREATE VIEW ON `incorta_metadata`.* TO 'incorta'@'127.0.0.1';
GRANT CREATE VIEW ON `incorta_metadata`.* TO 'incorta'@'192.168.128.101';
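
When the incorta user exists for several hosts, the GRANT statements can be generated rather than typed by hand. This is a minimal sketch under stated assumptions: the function name print_create_view_grants is hypothetical, and the host list should come from the SELECT User, Host query above. It only prints SQL; nothing is sent to the database unless you pipe the output into the mysql client.

```shell
# Hypothetical helper: print one GRANT CREATE VIEW statement per host.
print_create_view_grants() {
  db="$1"     # database name, for example incorta_metadata
  user="$2"   # database user, for example incorta
  shift 2
  for host in "$@"; do
    printf "GRANT CREATE VIEW ON \`%s\`.* TO '%s'@'%s';\n" "$db" "$user" "$host"
  done
}

# Usage: review the statements first, then pipe them to mysql if they look right.
#   print_create_view_grants incorta_metadata incorta localhost 127.0.0.1
#   print_create_view_grants incorta_metadata incorta localhost 127.0.0.1 \
#     | mysql -h0 -uroot -p incorta_metadata
```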

Oracle

To add grants for a user in an Oracle database, please refer to Oracle Database SQL Language Reference.


Achieve Upgrade Readiness

Please review Concepts → Upgrade Readiness. Achieving Upgrade Readiness requires:

  • a System Administrator with root access to the host or hosts running Incorta Nodes as well as the host running the Cluster Management Console (CMC)
  • a CMC Administrator
  • a SuperUser that can access each tenant in the Incorta environment
  • an Incorta Developer to resolve identified issues with formula expressions

The estimated time to complete the following is from 2 hours to 3 days:

  • Resolve alias issues with the Alias Sync Tool
  • Resolve issues with formula expressions that the Formula Validation Tool identifies
  • Resolve Severity-1 issues that the Inspector Tool identifies

Resolve alias issues with the Alias Sync Tool

Here are the resources required to run the Alias Sync Tool:

  • A System Administrator with root access to the host running an Incorta Node is able to run the Alias Sync Tool.

To resolve issues with Alias tables, you must download the alias_sync.py file, secure copy the file to the IncortaNode/bin directory, and then run the script for each tenant in your cluster.

To learn more, please review Tools → Alias Sync Tool.

Resolve issues with formula expressions that the Formula Validation Tool identifies

Here are the resources required to run the Formula Validation Tool and identify outstanding issues with formula expressions:

  • A CMC Administrator to export tenants in a cluster or a System Administrator who can export tenants using the Tenant Management Tool (TMT).
  • A System Administrator with root access to the host running an Incorta Node will need to run the Formula Validation Tool.
  • A SuperUser that can access each tenant in the Incorta environment.
  • An Incorta Developer to resolve the identified issues with formula expressions.

For a given tenant export, the Formula Validation Tool checks the formula syntax in dashboards, business schemas, and schemas. For example, the tool identifies issues with formula columns and runtime security filters in a schema table. One such issue is an aggregation formula expression that is missing commas between input values of type integer, long, or double. In Release 5.0, commas must separate input parameter values for built-in functions. In previous releases, the Formula Builder accepted spaces between input values. The Formula Validation Tool identifies this and other issues with formula syntax.

Important

Resolving issues with formula expressions is an iterative process. The Formula Validation Tool requires a tenant export file. You must resolve all issues in the failedFormulas.tsv file. In many cases, resolving one instance of an issue will resolve issues in dependent objects. After resolving issues with a formula expression, you will need to repeat the process. This means exporting the tenant again, running the Formula Validation Tool using the new tenant file, and again resolving outstanding issues. This iterative approach may take several hours or days.

To learn more, please review Tools → Formula Validation Tool.

Identify issues with the Inspector Tool

The Incorta Inspector Tool is a lineage and a consistency check tool. Follow these steps to download and run the Inspector Tool for each tenant in your cluster.

  • Download the Inspector Tool
  • Unzip the file
  • In the Inspector_1_0_63_Build_971 directory, edit the config.properties file as required:
incortaDir=./example
enginePort=5436
port=8080
offlineDataDir=./example.zip
caseSensitiveWords=
username=admin
sparkPort=5442
timeout=600000
tenant=demo
password=admin
host=http://localhost
  • Next, run the Inspector_1_0_63_Build_971.jar to create the related Inspector CSV files for each tenant:
java -jar Inspector_1_0_63_Build_971.jar -v -l -p <tenantname>/data
  • From the Inspector_1_0_63_Build_971/import directory, import the following into the tenant:

    • Import Schema.zip into Schema
    • Import business_schema.zip into Business Schemas
    • Import Dashboards.zip into Content
  • For each Inspector Schema in each tenant, perform a Full Load.
  • In the Inspector dashboard folder, view the 1-Validation UseCases dashboard.
  • In Error & Warning Codes, identify all the Severity-1 issues.
  • Resolve all the Severity-1 issues

Stop the Incorta cluster

Here are the resources required to stop all the services in the Incorta cluster:

  • a System Administrator with root access to the host or hosts running Incorta Nodes, the host running the Cluster Management Console (CMC), and the host or hosts running Apache Spark

The estimated time to stop the Incorta cluster and all related services is from 5 minutes to 30 minutes. Here are the steps involved in stopping the Incorta cluster:

Stop the Loader Service

In order to stop the Loader Service, you need to know the name of the service. You can read the services.index file to find the names of the services running on an Incorta Node.

cat <INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/services/services.index

Once you know the name of the Loader Service, you can then execute the following:

LOADER_SERVICE=<SERVICE_NAME>
<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/stopService.sh ${LOADER_SERVICE}

Stop Apache Spark

You can stop Apache Spark using the stopSpark.sh shell script:

<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/stopSpark.sh

Stop the CMC

The default directory for the CMC is ~/IncortaAnalytics/cmc. Stop the CMC with the stop-cmc.sh shell script:

<CMC_INSTALLATION_PATH>/cmc/stop-cmc.sh

Stop the Node Agent

For each Incorta Node, run the following:

<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/nodeAgent/agent.sh stop

Stop the Export Server

To stop the Export Server, run the following:

<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/stop-exportserver.sh

Stop Apache Zookeeper

To stop Apache Zookeeper, run the following:

<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/stop-zookeeper.sh
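
The stop steps above can be chained in the documented order. The following is a minimal sketch for a single host, assuming the script paths shown in this section; the function name stop_cluster and the RUN variable are hypothetical. On a multi-host cluster, the Node Agent step still has to run on every Incorta Node.

```shell
# Sketch: stop the services in the order this guide lists them.
# RUN is an optional command prefix chosen for this example;
# set RUN=echo to print the commands instead of running them.
stop_cluster() {
  node_home="$1"       # <INCORTA_NODE_INSTALLATION_PATH>/IncortaNode
  cmc_home="$2"        # <CMC_INSTALLATION_PATH>/cmc
  loader_service="$3"  # Loader Service name from services.index

  # Each step runs only if the previous one succeeded.
  ${RUN:-} "$node_home/stopService.sh" "$loader_service" &&
    ${RUN:-} "$node_home/stopSpark.sh" &&
    ${RUN:-} "$cmc_home/stop-cmc.sh" &&
    ${RUN:-} "$node_home/nodeAgent/agent.sh" stop &&
    ${RUN:-} "$node_home/stop-exportserver.sh" &&
    ${RUN:-} "$node_home/stop-zookeeper.sh"
}

# Dry run (paths and service name are placeholders):
#   RUN=echo stop_cluster /home/incorta/IncortaAnalytics/IncortaNode \
#     /home/incorta/IncortaAnalytics/cmc loaderService
```

Stopping at the first failure is a design choice of this sketch: it leaves the cluster in a known state so the failing service can be inspected before the remaining services go down.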

Create backups

Here are the resources required to create the various backups:

  • A Database Administrator with root access to the MySQL or Oracle database server that runs the Incorta Metadata database.
  • A System Administrator with root access to the host or hosts running Incorta Nodes, the Cluster Management Console (CMC), and Apache Spark.

The estimated time to complete the following is from 30 minutes to 3 hours:

  • Create a backup of the Incorta Metadata database
  • Create a backup of the IncortaAnalytics directory
  • Create a backup of the Apache Spark configuration files

Create a backup of the Incorta Metadata database

Here are the resources required to create a backup of the Incorta Metadata database:

  • A Database Administrator with root access to the MySQL or Oracle database server that runs the Incorta Metadata database.

MySQL

To create a backup of the incorta metadata database, use the mysqldump command-line utility:

mysqldump -u [user] -p [database_name] > [filename].sql
Example

Here is an example with the MySQL user root and the password incorta_root:

mysqldump -uroot -pincorta_root incorta_metadata > /tmp/incorta_metadata.sql

Oracle

To create a backup of the incorta metadata database, please refer to Oracle documentation.

Create a backup of the Incorta installation directory

To create a backup of the Incorta installation directory, use the following command:

zip -r IncortaAnalytics_Backup.zip <INCORTA_NODE_INSTALLATION_PATH>

Create a backup of the Apache Spark configuration files

Create a backup of the following spark configuration files present in the $SPARK_HOME/conf directory:

  • spark-defaults.conf
  • spark-env.sh
SPARK_HOME=<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/spark
cd $SPARK_HOME/conf
zip -r Spark_Conf_Backup.zip spark-defaults.conf spark-env.sh
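
Before continuing to the upgrade, it may be worth confirming that every backup file exists and is not empty. This is a minimal sketch; the function name verify_backups is hypothetical, and the check only proves that each file has content, not that the archive or dump can be restored.

```shell
# Hypothetical helper: fail if any expected backup file is missing or empty.
verify_backups() {
  status=0
  for f in "$@"; do
    if [ -s "$f" ]; then        # -s: file exists and has a size greater than zero
      echo "ok: $f"
    else
      echo "missing or empty: $f" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage with the file names from this section (adjust the paths to where you ran the commands):
#   verify_backups /tmp/incorta_metadata.sql IncortaAnalytics_Backup.zip \
#     "$SPARK_HOME/conf/Spark_Conf_Backup.zip"
```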

Upgrade the Incorta cluster

Here are the resources required to upgrade the Incorta cluster:

  • a System Administrator with root access to the host or hosts running Incorta Nodes, the host running the Cluster Management Console (CMC), and the host or hosts running Apache Spark

To begin, run the incorta-installer.jar file from the shell:

java -jar incorta-installer.jar -i console

In the Incorta Installer console, enter these values for a standalone (Typical) upgrade:

Welcome                     : Enter
License Agreement/Copyright : Enter
License Agreement/Copyright : Y
Installation Type           : 2- Upgrade
Installation Set            : 1- Typical
Choose Installation Folder  : Enter- Default
Installation Status         : Enter
Start CMC                   : 3- Finish without starting CMC

Kill unwanted processes

After upgrading, kill any remaining processes related to Incorta, because you will start Incorta manually. To kill any unwanted processes, run the following commands:

sudo kill -9 $(ps -aux | grep '[n]odeAgent.jar' | awk '{print $2}')
sudo kill -9 $(ps -aux | grep '[d]erby' | awk '{print $2}')
sudo kill -9 $(ps -aux | grep '[e]xportServer' | awk '{print $2}')
sudo kill -9 $(ps -aux | grep '[z]ookeeper' | awk '{print $2}')
sudo kill -9 $(ps -aux | grep '[c]mc' | awk '{print $2}')
sudo kill -9 $(ps -aux | grep '[s]park' | awk '{print $2}')
sudo kill -9 $(ps -aux | grep '[h]adoop' | awk '{print $2}')
sudo kill -9 $(ps -aux | grep '[p]ostgres' | awk '{print $2}')
sudo kill -9 $(ps -aux | grep '[I]ncortaNode' | awk '{print $2}')
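
Because several of these patterns are short (for example cmc or spark), it can help to review which processes match before sending a kill signal. The following is a minimal sketch; the function name pids_matching is hypothetical. It reads ps output on standard input and prints the PID column for matching lines, using the same bracket trick as above so that the filter does not match its own command line.

```shell
# Hypothetical helper: read `ps aux`-style lines on stdin and print the PID
# (second column) of every line that matches the given regular expression.
pids_matching() {
  # NR > 1 skips the header line that ps prints.
  awk -v pat="$1" 'NR > 1 && $0 ~ pat { print $2 }'
}

# Usage: review first, then kill only if something matched.
#   ps aux | grep '[n]odeAgent.jar'
#   pids=$(ps aux | pids_matching '[n]odeAgent.jar')
#   [ -n "$pids" ] && sudo kill -9 $pids
```

The guard on an empty PID list avoids calling kill with no arguments when a service is already stopped.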

Upgrade an external Apache Spark environment

If the Incorta Cluster is using an external Apache Spark environment, you must also upgrade the Apache Spark environment by following these steps:

  • Zip the bundled spark directory under IncortaNode:
zip -r Incorta-Bundled-Spark.zip <INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/spark
  • Zip the bundled hadoop directory under IncortaNode:
zip -r Incorta-Bundled-Hadoop.zip <INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/hadoop
  • Copy Incorta-Bundled-Spark.zip and Incorta-Bundled-Hadoop.zip to the external Apache Spark environment.
  • In the external Apache Spark environment, remove the spark directory.
  • Unzip Incorta-Bundled-Spark.zip to recreate the Spark environment.
  • Unzip Incorta-Bundled-Hadoop.zip to recreate the Hadoop environment.

Review the Upgrade logs

Check to see if there are any critical errors with the upgrade in the following log files and directories:

  • Installer log
cat /tmp/DebuggingLog.log
  • Incorta Node upgrade logs
cd <INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/logs/
  • CMC logs
ls -l <CMC>/logs/
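
A quick way to triage these logs is to search them for common failure keywords. This is a minimal sketch: the function name scan_logs is hypothetical, and the keywords error, fatal, and exception are assumptions about what a critical entry looks like, not a documented Incorta log format. Treat an empty result as a hint, not as proof of a clean upgrade.

```shell
# Hypothetical helper: list log lines that contain common failure keywords.
# -r recurses into directories, -n prints line numbers,
# -i ignores case, -E enables the alternation in the pattern.
scan_logs() {
  grep -rniE 'error|fatal|exception' "$@"
}

# Usage (the exit status is 1 when nothing matches):
#   scan_logs /tmp/DebuggingLog.log
#   scan_logs <INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/logs/
```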

Remove unused SSO jars

Important: Single-Sign On (SSO) authentication

If you are using Single-Sign On (SSO) authentication for your Incorta cluster, a System Administrator with root access to the host or hosts running the Incorta Nodes must remove one of the two related SSO JAR files. The two files are:

  • incorta.onelogin.valv-1.0.jar
  • incorta-sso.jar

If your Single-Sign On provider is Okta, you need to delete the incorta.onelogin.valv-1.0.jar file. For all other SAML-compliant providers, such as OneLogin or Azure Active Directory (Azure AD), you need to delete the incorta-sso.jar file.

Here are the steps to remove the unused SSO JAR:

  • Determine the universally unique identifier (UUID) of the Analytics Service.
INCORTA_NODE_INSTALLATION_PATH=<INCORTA_NODE_INSTALLATION_PATH>
cat ${INCORTA_NODE_INSTALLATION_PATH}/IncortaNode/services/services.index
  • Once you know the UUID of the Analytics Service, grep the server.xml file for Okta.
ANALYTICS_SERVICE_UUID=<UUID>
cat ${INCORTA_NODE_INSTALLATION_PATH}/IncortaNode/services/${ANALYTICS_SERVICE_UUID}/conf/server.xml | grep 'Okta'
  • If grep finds a match for Okta in the server.xml file, follow these steps:

    • Confirm that the <Valve /> element is not commented out as in <!-- <Valve /> --> in the server.xml file.
    • Delete the incorta.onelogin.valv-1.0.jar file.
sudo rm -f ${INCORTA_NODE_INSTALLATION_PATH}/IncortaNode/runtime/lib/incorta.onelogin.valv-1.0.jar
  • If grep does not find a match for Okta, follow these steps:

    • Confirm that the <Valve /> element is not commented out as in <!-- <Valve /> -->
    • Delete the incorta-sso.jar file.
sudo rm -f ${INCORTA_NODE_INSTALLATION_PATH}/IncortaNode/runtime/lib/incorta-sso.jar
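
The decision in the steps above can be expressed as a small function that names the JAR to delete. This is a minimal sketch; the function name sso_jar_to_remove is hypothetical, and it only reproduces the grep for Okta. It does not check whether the Valve element is commented out, so that confirmation remains a manual step.

```shell
# Hypothetical helper: print the name of the SSO JAR to remove,
# based on whether server.xml mentions Okta.
sso_jar_to_remove() {
  server_xml="$1"
  if [ ! -r "$server_xml" ]; then
    echo "cannot read: $server_xml" >&2
    return 2
  fi
  if grep -q 'Okta' "$server_xml"; then
    echo 'incorta.onelogin.valv-1.0.jar'   # Okta: remove the OneLogin valve JAR
  else
    echo 'incorta-sso.jar'                 # other SAML providers: remove incorta-sso.jar
  fi
}

# Usage: print the name, confirm the Valve element by hand, then delete the JAR.
#   sso_jar_to_remove \
#     "${INCORTA_NODE_INSTALLATION_PATH}/IncortaNode/services/${ANALYTICS_SERVICE_UUID}/conf/server.xml"
```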

Start the Incorta cluster

Here are the resources required to start all the services in the Incorta cluster:

  • a System Administrator with root access to the host or hosts running Incorta Nodes, the host running the Cluster Management Console (CMC), and the host or hosts running Apache Spark

The estimated time to start the Incorta cluster and all related services is from 5 minutes to 5 hours. Depending on schema data size and various tenant configurations, it may take the Incorta Analytics Service several hours to load schemas into memory.

Here are the steps to start the Incorta cluster:

Start Apache Zookeeper

To start Apache Zookeeper, run the following:

<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/start-zookeeper.sh

Start Apache Spark

You can start Apache Spark using the startSpark.sh shell script:

<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/startSpark.sh

Start the Export Server

To start the Export Server, run the following:

<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/start-exportserver.sh

Start the Node Agent

For each Incorta Node, run the following to start the node agent:

<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/nodeAgent/agent.sh start

Start the Loader Service

In order to start the Loader Service, you need to know the name of the service. You can read the services.index file to find the names of the services running on an Incorta Node.

cat <INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/services/services.index

Once you know the name of the Loader Service, you can then execute the following:

LOADER_SERVICE=<SERVICE_NAME>
<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/startService.sh ${LOADER_SERVICE}

Start the Analytics Service

In order to start the Analytics Service, you need to know the name of the service. You can read the services.index file to find the names of the services running on an Incorta Node.

cat <INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/services/services.index

Once you know the name of the Analytics Service, you can then execute the following:

ANALYTICS_SERVICE=<SERVICE_NAME>
<INCORTA_NODE_INSTALLATION_PATH>/IncortaNode/startService.sh ${ANALYTICS_SERVICE}

Start the CMC

The default directory for the CMC is ~/IncortaAnalytics/cmc. Start the CMC with the start-cmc.sh shell script:

<CMC_INSTALLATION_PATH>/cmc/start-cmc.sh

Upgrade the Incorta metadata database

A CMC Administrator is able to upgrade the Incorta metadata database. Depending on the number of tenants and schemas in your Incorta cluster, the process can take between 5 minutes and 3 hours.

To sign in to the Cluster Management Console (CMC), visit your CMC host at one of the following:

  • http://<Public_IP>:6060/cmc
  • http://<Public_DNS>:6060/cmc
  • http://<Private_IP>:6060/cmc
  • http://<Private_DNS>:6060/cmc

The default port for the CMC is 6060. Sign in to the CMC using your CMC administrator username and password.
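
Before opening a browser, you can check from a shell that the CMC responds. This is a minimal sketch that assumes curl is installed on the machine you run it from; the function names cmc_url and cmc_is_up are hypothetical. It builds the URL in the form listed above and treats any response below HTTP 400 as reachable.

```shell
# Hypothetical helper: build the CMC URL for a host, defaulting to port 6060.
cmc_url() {
  host="$1"
  port="${2:-6060}"
  printf 'http://%s:%s/cmc\n' "$host" "$port"
}

# Hypothetical helper: succeed if the CMC answers within 10 seconds.
# -s silences progress output, -f fails on HTTP 4xx/5xx responses,
# -o /dev/null discards the body, --max-time bounds the wait.
cmc_is_up() {
  curl -sf -o /dev/null --max-time 10 "$(cmc_url "$@")"
}

# Usage (the address is a placeholder):
#   cmc_is_up 192.168.128.101 && echo "CMC is reachable"
```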

To upgrade the Cluster Metadata database, follow these steps:

  • In the Navigation bar, select Clusters.
  • For each cluster name in the Cluster list, in the Actions column, select Upgrade Cluster Metadata.
Note

A dialog prompts you to restart the Incorta Services. In the dialog, select OK.


Update Snapshot files

It is mandatory that you update the Snapshot files.

There are several options for updating snapshot files:

  • Option: Run the Migration Snapshot Tool
  • Option: Perform a Full Load
  • Option: Load from Staging

Option: Run the Migration Snapshot Tool

The Migration Snapshot Tool is a shell script. You do not need to run this script if:

  • the tenant data volume is relatively low and you are able to perform a Full Load of all tenant schemas easily and quickly
  • the tenant data volume is relatively high but you are able to perform a Load from Staging for each tenant schema
Important

Using either the CMC or shell scripts, stop the Analytics and Loader Services. The Incorta Metadata database must be available and running.

Create a backup of Parquet and Snapshot directories

Before running the Migration Snapshot script, back up the tenant parquet and snapshot folders.

Here are examples:

zip -r tenant-parquet-backup.zip /incorta/IncortaAnalytics/Tenants/ebs_cloud/parquet
zip -r tenant-snapshot-backup.zip /incorta/IncortaAnalytics/Tenants/ebs_cloud/snapshots

Run migrateSnapshotsTool.sh

To run the shell script, execute the following:

cd $INCORTA_HOME/IncortaNode
./migrateSnapshotsTool.sh

Provide the metadata database URL, username and password, and the tenants to migrate. Follow the instructions and use the defaults for the other parameters. Optionally, you can pass a migration.properties file as a parameter to the tool:

cd $INCORTA_HOME/IncortaNode
./migrateSnapshotsTool.sh migration.properties

If any table fails to migrate, the tool recommends that you take one of the following actions:

  • Load the table from staging
  • Perform a full load on the table

Review the Migration Snapshot Tool logs

The logs are available as follows:

  • Redirect of stdout screen output
$INCORTA_HOME/IncortaNode/migrationTool.20190726-151032.log
  • Log across all tenants
$INCORTA_HOME/IncortaNode/migration/incorta-migration.2019-07-26.log
  • Log for specific tenants
$INCORTA_HOME/IncortaNode/migration/<Tenant_Name>/incorta-migration.2019-07-26.log
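
The log names above embed the run date, so the most recent log in a directory can be picked by name. This is a minimal sketch; the function name latest_migration_log is hypothetical, and it assumes every log follows the incorta-migration.YYYY-MM-DD.log pattern shown in the examples, so that alphabetical order is also chronological order.

```shell
# Hypothetical helper: print the newest incorta-migration.<date>.log in a directory.
latest_migration_log() {
  latest=""
  # The shell expands the pattern in sorted order, so the last match is the newest date.
  for f in "$1"/incorta-migration.*.log; do
    if [ -e "$f" ]; then latest="$f"; fi
  done
  if [ -z "$latest" ]; then
    echo "no migration logs in $1" >&2
    return 1
  fi
  printf '%s\n' "$latest"
}

# Usage:
#   less "$(latest_migration_log "$INCORTA_HOME/IncortaNode/migration")"
```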

Review the migration snapshot tool log files for issues:

  • If an error message reports that a particular table could not be migrated, the tool asks you to load that table from staging.
  • Ignore the message saying Duplicate Joins.

Option: Perform a Full Load

This option is for a tenant with low data volume for which performing a Full Load of all tenant schemas will take a relatively short amount of time. Simply perform a Full Load for each schema in the given tenant.

Option: Load from Staging

This option is for a tenant with high data volume, where performing a Full Load of all tenant schemas would take too much time and where it is possible to perform a Load from Staging.

Here are the steps to Load from Staging:

  • Stop the Incorta Cluster including the Cluster Management Console, Apache Spark, and Incorta Nodes.
  • Stop Apache ZooKeeper.
  • For the given tenant, rename the snapshots directory:
cd /home/incorta/IncortaAnalytics/Tenants/<Tenant_Name>
mv snapshots/ snapshots_original/
  • Start Apache ZooKeeper.
  • Start the Incorta Cluster including the Cluster Management Console, Apache Spark, and Incorta Nodes.
  • For each schema in the tenant, perform a Load from Staging.
  • After confirming the success of the load, delete the snapshots_original directory when it is no longer needed.
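
The rename step above is worth guarding: if a snapshots_original directory already exists from an earlier attempt, a plain mv places the current snapshots directory inside it instead of renaming it. The following is a minimal sketch; the function name park_snapshots is hypothetical, and it refuses to run in that situation.

```shell
# Hypothetical helper: rename <tenant_dir>/snapshots to snapshots_original,
# refusing to act if the source is missing or the target already exists.
park_snapshots() {
  tenant_dir="$1"
  if [ ! -d "$tenant_dir/snapshots" ]; then
    echo "no snapshots directory in $tenant_dir" >&2
    return 1
  fi
  if [ -e "$tenant_dir/snapshots_original" ]; then
    echo "snapshots_original already exists in $tenant_dir" >&2
    return 1
  fi
  mv "$tenant_dir/snapshots" "$tenant_dir/snapshots_original"
}

# Usage, with the cluster and Apache ZooKeeper stopped as described above:
#   park_snapshots /home/incorta/IncortaAnalytics/Tenants/<Tenant_Name>
```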

Verify the successful upgrade

Next, verify the successful upgrade. Here are the resources required:

  • a System Administrator with root access to the host or hosts running Incorta Nodes, the host running the Cluster Management Console (CMC), and the host or hosts running Apache Spark
  • a CMC Administrator
  • a SuperUser that can access each tenant in the Incorta environment
  • an Incorta Developer to resolve identified issues with formula expressions, schema alias, joins between tables, and dependencies between objects such as dashboards and business schemas

Review and Monitor scheduled jobs

As the tenant SuperUser, sign in to each tenant and review the scheduled jobs.

Run the Inspector Tool

Please note that the Inspector Tool is a CMC Scheduler job in this release. For more details, see Tools → Inspector Tool.

  • As the CMC Administrator, schedule the Inspector Tool for each tenant.
  • As a SuperUser who can access each tenant in the Incorta cluster, review the Inspector Tool dashboard.
  • As an Incorta Developer, resolve the issues identified in the 1- Validation UseCases dashboard.

Export all tenants with the Tenant Management Tool

A System Administrator with root access to the host running the Cluster Management Console (CMC) is able to run the Tenant Management Tool (TMT). Here are the steps:

  • Secure shell in to the CMC host.
  • As the incorta user, navigate to the installation path of the TMT. The default installation path for the TMT is: <CMC_INSTALLATION_PATH>/IncortaAnalytics/cmc/tmt
  • Export ALL tenants:
./exportAlltenants.sh -c <CLUSTER_NAME> -f False /tmp/<TENANT_EXPORT>.zip
© Incorta, Inc. All Rights Reserved.