Release Notes 4.5

These release notes describe the new features, enhancements, and main fixes in Incorta 4.5.

Release Highlights

This release reduces the hardware that Incorta requires, with a focus on the following areas:

  • Reduce loader service start time
  • Reduce loader service memory usage to increase performance
  • Support larger data sets
  • Connect to more data sources
  • Coexist with data lakes

The following sections provide more details on these areas and on the other features in this release.

New Features

The following new features were released in Release 4.5.

Hardware Reduction

This enhancement results in the following major improvements:

  • Reduced loader memory usage for Incorta schemas. Not all columns are loaded into memory, and unused columns are evicted when memory is low.
  • Reduced loader service startup time. Only a subset of columns is loaded into memory.
  • Faster incremental updates. New data is appended to existing data rather than rewriting the data with each incremental load.

In previous releases, all columns had to be loaded into memory to create direct data mapping files (previously called snapshot files), which required a large amount of loader service memory. With Release 4.5, only columns used in joins and formulas are loaded into loader service memory. All other data columns are stored in Parquet and are not kept in memory. Data columns are loaded directly into the analytics service memory when needed. This significantly reduces the total memory footprint of the loader service.

The analytics service has now been enhanced to service queries using Parquet and direct data mapping files.

Note:

Tables with load filters and encryption are still loaded into memory to create direct data mapping files.
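The column selection rule described above can be sketched in a few lines of Python. This is an illustration only, not Incorta's implementation: the `Column` and `Table` classes, their fields, and the sample table are hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    used_in_join: bool = False
    used_in_formula: bool = False


@dataclass
class Table:
    name: str
    columns: list = field(default_factory=list)
    has_load_filter: bool = False
    is_encrypted: bool = False


def columns_for_loader_memory(table):
    """Return the names of the columns the loader would keep in memory."""
    # Tables with load filters or encryption are still loaded in full.
    if table.has_load_filter or table.is_encrypted:
        return [c.name for c in table.columns]
    # Otherwise only join and formula columns are loaded; the rest stay in Parquet.
    return [c.name for c in table.columns if c.used_in_join or c.used_in_formula]


# Hypothetical sample table for illustration.
orders = Table(
    name="orders",
    columns=[
        Column("order_id", used_in_join=True),
        Column("amount", used_in_formula=True),
        Column("comment"),
    ],
)
print(columns_for_loader_memory(orders))
```

In this sketch, the `comment` column is never loaded by the loader unless the table has a load filter or is encrypted.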

Additional memory changes in this release:

  • Analytics Service Column Warmup. This tenant configuration setting, found in the CMC under Advanced > Warmup Mode, changed in this release. Reading data columns directly from Parquet reduces the time for dashboards and insight queries to refresh after an incremental load. However, after you restart the analytics service, dashboard queries load more slowly. To decrease the time to load dashboard queries after a restart, you can choose to load and warm up specific columns first. You can choose one of the following column warmup strategies:

    • Business view columns: Load all columns referenced in business schema views, in the analytics service only.
    • None: Don’t pre-load the columns. Only load on demand. None works best for small deployments with ad-hoc queries.
    • Last used columns: Restore the state that the loader and analytics services had prior to shutdown.
    • All (replaces “Eager Load”): Pre-load all columns into memory. All works best when you need to support ad-hoc queries, when no business schemas are in place, and when the time between the analytics service startup and dashboard usage is significant.
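As a rough illustration of how the four warmup strategies differ, the following Python sketch maps each mode to the set of columns that would be pre-loaded after a restart. The function name, its parameters, and the sample column names are assumptions made for the example and do not reflect Incorta internals.

```python
def columns_to_warm_up(mode, all_columns, business_view_columns, last_used_columns):
    """Pick the columns to pre-load after a restart for a given warmup mode."""
    if mode == "None":
        # Nothing is pre-loaded; columns load on demand.
        return set()
    if mode == "Business view columns":
        return set(business_view_columns)
    if mode == "Last used columns":
        return set(last_used_columns)
    if mode == "All":
        return set(all_columns)
    raise ValueError(f"Unknown warmup mode: {mode}")


# Hypothetical column names for illustration.
all_cols = ["orders.id", "orders.amount", "orders.comment", "customers.name"]
view_cols = ["orders.amount", "customers.name"]
recent_cols = ["orders.id"]

for mode in ("None", "Business view columns", "Last used columns", "All"):
    print(mode, sorted(columns_to_warm_up(mode, all_cols, view_cols, recent_cols)))
```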

Scalability

In previous releases, you could load up to 1.7 billion records for tables with a single key column. In this release, you can load up to 3.4 billion unique values per column. If you use composite keys, you can load up to 1 trillion records.

Admin UI Settings Moved to CMC

In this release, the Admin UI was removed and all configuration options were moved to the Cluster Management Console (CMC). You can now upgrade cluster metadata from the CMC. Most options that you used in the Admin UI are now available in the CMC. For specific details on what changed, see Admin UI to CMC Details near the bottom of this page.

Enhanced User Experience: Formula Builder

The formula builder user interface changed in this release. The following is a screenshot of the new layout:

[Screenshot: formula builder, Release 4.5]

Analyze users can now:

  • See a physical table column and a business view column at the same time.
  • Search to find an entity by data type.
  • Search to find a formula function or variable.
  • Search to find a schema, table, and column.
  • See the syntax and an example of a selected formula function.
  • Modify the formulas.
  • Add comments to steps in a large formula.
  • Double click to add a column.
  • Have syntax validated.
  • See line numbers.

The following formula functions were added:

Function | Function Type
between(exp value, exp min, exp max) | Boolean
isNan(double value) | Boolean
like(field exp, string pattern) | Boolean
double(string exp) | Conversion
toChar(date exp, string format) | Conversion
rowNumber() | Misc
exp(double exp) | Arithmetic
sqrt(double exp) | Arithmetic
trunc(double exp) | Arithmetic
addHours(date exp, int hours) | Date
addMilliseconds(date exp, int milliseconds) | Date
addMonths(date exp, int months) | Date
addSeconds(date exp, int seconds) | Date
addYears(date exp, int years) | Date
dateTrunc(date exp, string part) | Date
formatDuration(int duration) | Date
repeat(string value, int count) | String
reverse(string value) | String
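To give a feel for what a few of these functions compute, the following Python analogues mirror four of the signatures above. They are not Incorta formula syntax, and the exact semantics are assumptions: in particular, this sketch treats both bounds of `between` as inclusive, which the function list does not state.

```python
import math


def between(value, minimum, maximum):
    # Assumption: both bounds are inclusive.
    return minimum <= value <= maximum


def is_nan(value):
    # Analogue of isNan(double value).
    return math.isnan(value)


def repeat(value, count):
    # Analogue of repeat(string value, int count).
    return value * count


def reverse(value):
    # Analogue of reverse(string value).
    return value[::-1]


print(between(5, 1, 10))
print(is_nan(float("nan")))
print(repeat("ab", 3))
print(reverse("abc"))
```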

Enhanced User Experience Changes: Schema, Session Variables, and Scheduler Pages

The schema listing page changed in this release. The following is a screenshot of the new layout:

[Screenshot: schema page, Release 4.5]

New features include:

  • Last Load Type
  • Next Load Time
  • Last Modified By
  • Pagination
  • Data Size is based on Parquet and direct data mapping file size, not memory size
  • Schema Description
  • Search

The session variables page changed in this release. The following is a screenshot of the new layout.

[Screenshot: session variables page, Release 4.5]

The scheduler page for dashboards, schema loads, and data alerts changed in this release. The following are screenshots of the new layout.

[Screenshot: scheduler page, Release 4.5]

[Screenshot: schema load, Release 4.5]

Data Lake Support

Incorta now supports data lakes as a host for Incorta tenants. The following data lakes are supported:

  • ADLS Gen2
  • HDFS
  • AWS S3

You can use the following file types from the data lakes as data sources:

  • Parquet
  • ORC
  • CSV
  • Excel

For all data lakes you can:

  • Place all Incorta installer files, tenants, and objects on the data lake shared storage.
  • Read from the data lake using a pre-built connector.
  • Write the output of a materialized view back into the data lake file system.

You must perform some steps to use Hadoop with Spark on Windows. See Configure Spark to Work with Hadoop on Windows for more information on how to set up Spark on Windows so you can use it with Hadoop.

New Migration Tool

To take advantage of the memory enhancements, existing customers must run a migration tool to upgrade to Incorta 4.5 from previous versions of Incorta.

For more information on how to run the migration tool to upgrade to Incorta 4.5 from versions 4.3 and later, see Migration Upgrade from 4.3 to 4.5.

For information on how to run the migration tool to upgrade to Incorta 4.5 from versions earlier than 4.3, see Migration Upgrade from 3.x to 4.5.

OpenJDK 11 and Oracle JDK 8 Support

You can now use OpenJDK 11 and Oracle JDK 1.8+ with Incorta. To use OpenJDK 11, set JAVA_HOME and JRE_HOME to the OpenJDK 11 main folder. For more information on how to use OpenJDK 11 and Oracle JDK 1.8+ with Incorta, see Install Java.
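Before you start Incorta, you may want to confirm that both variables are set and point to the same folder. The following Python sketch is one hypothetical way to check this; the function name and the messages are illustrative and are not part of Incorta.

```python
import os


def check_java_env(env):
    """Return a list of problems found with JAVA_HOME and JRE_HOME."""
    problems = []
    for name in ("JAVA_HOME", "JRE_HOME"):
        if not env.get(name):
            problems.append(f"{name} is not set")
    # Both variables are expected to point to the same OpenJDK 11 main folder.
    if not problems and env["JAVA_HOME"] != env["JRE_HOME"]:
        problems.append("JAVA_HOME and JRE_HOME point to different folders")
    return problems


if __name__ == "__main__":
    issues = check_java_env(os.environ)
    for line in issues or ["JAVA_HOME and JRE_HOME are set to the same folder"]:
        print(line)
```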

Connectors

The following new data sources and connectors are supported in this release of Incorta. You can now select them when you select a data source:

JDBC Connection Properties

You can create a new connector with JDBC connection properties using property name and value pairs. When you select JDBC as a data source, you need to add field type, value, descriptions, and properties.

Data File Enhancements

Data Type Discovery: Improved the ability to discover dates and timestamps in CSV source data and format them in Incorta.

Chunking: Added support for multi-threaded extraction and automatic chunking of large CSV files to speed up extraction from local or shared file systems.
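The general idea behind chunked, multi-threaded extraction can be sketched in Python as follows. This is a generic illustration, not Incorta's extractor: the function names and the default chunk size are assumptions, the per-chunk work is reduced to counting lines, and quoted fields that contain line breaks are not handled.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def chunk_offsets(path, chunk_size):
    """Split a file into (start, end) byte ranges that end on line boundaries."""
    size = os.path.getsize(path)
    offsets = []
    with open(path, "rb") as f:
        start = 0
        while start < size:
            f.seek(min(start + chunk_size, size))
            f.readline()  # move forward to the end of the current line
            end = f.tell()
            offsets.append((start, end))
            start = end
    return offsets


def count_rows(path, start, end):
    """Count the lines in one chunk; a real extractor would parse them instead."""
    with open(path, "rb") as f:
        f.seek(start)
        return f.read(end - start).count(b"\n")


def extract_in_parallel(path, chunk_size=1 << 20, workers=4):
    """Process all chunks of a file on a thread pool and return the total line count."""
    ranges = chunk_offsets(path, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda r: count_rows(path, *r), ranges))
```

Because every chunk ends on a line boundary, each worker can process its byte range independently without splitting a row between two threads.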

Simpler Tenant Administration

You can now list, export, and import the following in bulk from the command line using an asterisk:

  • Data sources
  • Session variables
  • Alerts
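The exact command syntax is not shown here, but the effect of an asterisk is the usual wildcard match on object names. The following Python sketch illustrates that matching with the standard `fnmatch` module; the function name and the object names are hypothetical, and the real command-line tool may match names differently.

```python
from fnmatch import fnmatchcase


def select_by_pattern(names, pattern):
    """Return the names that match a wildcard pattern such as 'sales*'."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical data source names for illustration.
data_sources = ["sales_db", "sales_files", "hr_db"]
print(select_by_pattern(data_sources, "sales*"))
print(select_by_pattern(data_sources, "*"))
```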

SQL Interface Properties

You can specify new properties for the SQL interface in the <INCORTA_INSTALLATION>/IncortaNode/services/<SERVICE_ID>/conf/services.properties file for each service or the <INCORTA_INSTALLATION>/IncortaNode/node.properties file for each node:

Property name: sql.spark.bridge.partitions.memory.percentage

Description: The percentage of memory to use in the SQL interface application to send query results to Incorta.

Property name: sql.spark.bridge.backoff.time

Description: The number of milliseconds to wait to process an incoming query if the SQL Interface does not have enough memory to process all queries. Set this property to optimize query speed.

Property name: sql.spark.bridge.backoff.retries

Description: The maximum number of times the SQL Interface attempts to send a single query result to Incorta. Set this property when you observe timeouts.

Example:

sql.spark.bridge.partitions.memory.percentage=60
sql.spark.bridge.backoff.time=100
sql.spark.bridge.backoff.retries=300
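To show how the backoff time and retry count might interact, the following Python sketch parses the example properties and applies a simple wait-and-retry loop. It is an assumption about the general pattern, not the SQL interface's actual logic; `parse_properties`, `send_with_backoff`, and the `send` callback are hypothetical names.

```python
import time


def parse_properties(text):
    """Parse name=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def send_with_backoff(send, backoff_ms, max_attempts, sleep=time.sleep):
    """Call send() until it returns True; wait backoff_ms between attempts."""
    for attempt in range(1, max_attempts + 1):
        if send():
            return attempt
        if attempt < max_attempts:
            sleep(backoff_ms / 1000.0)
    raise TimeoutError(f"gave up after {max_attempts} attempts")


props = parse_properties(
    "sql.spark.bridge.backoff.time=100\n"
    "sql.spark.bridge.backoff.retries=300\n"
)
backoff_ms = int(props["sql.spark.bridge.backoff.time"])
max_attempts = int(props["sql.spark.bridge.backoff.retries"])
```

With the example values, a result that cannot be sent would be retried for roughly 30 seconds in this sketch (300 attempts spaced 100 milliseconds apart) before it gives up.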

More Information and Implementation Notes

Admin UI to CMC Details

In the Admin UI, server configuration options were under System Configuration > Server Configs. In the CMC, server configuration options are under Local > Cluster Configurations > Server Configurations.

In the Admin UI, default tenant configuration options were under System Configuration > Default Tenant Configs. In the CMC, default tenant configuration options are under Local > Cluster Configurations > Default Tenant Configurations.

In the Admin UI, individual tenant configurations were under Tenants. In the CMC, individual tenant configurations are available when you select Configure in the Configurations column when you view the list of tenants you set up.

The following options were removed from the Admin UI and are not available in the CMC:

  • Under Default Tenant Configurations > Advanced

    • Eager Load was removed and replaced by the All option of the new Warmup Mode configuration option.
  • Under Analytics / Loader Service

    • Removed ICC port because it is not used.

The following options from the Admin UI were changed in the CMC:

  • Under Default Tenant Configs > Security

    • Minimum Password Length default was changed to 5.
  • Under Analytics / Loader Service

    • Engine CPU Utilization (%) renamed to CPU Utilization (%) to represent all the utilization assigned to Incorta.

The following new options were added to server and tenant configurations in the CMC:

  • Under Server Configurations > Spark Integration

    • Spark App control channel port. The port used to send a shutdown signal to the Spark SQL app if the Incorta server needs to shut down Spark.
  • Under Default Tenant Configurations > Advanced

    • Warmup Mode allows you to select what data to pre-load into memory after a restart. The options are None (don’t pre-load any data into memory), Business View Cols (pre-load business view columns only), Last used columns (pre-load last used columns only), All (pre-load all data). If you do not set this, Business View Cols is the default.
  • Under Default Tenant Configurations > Incorta Labs

    • Wall-E, a new Incorta Assist beta feature, allows you to preview Incorta blueprints in your environment. See blueprints.incorta.com.
    • CLAIM Server URL, a new Incorta Assist feature.

Fixed Issues

The following bug fixes and minor improvements were made in the 4.5 release.

Component | Release Note
Performance | Enhanced the compaction logic for non-key columns.
Performance | Fixed an issue where log files were large because they were flooded with Kafka warning messages.
Security | Fixed an issue where a user who signed in to tenant A, navigated to a tab, and then switched to tenant B still saw the tenant A data for which they had permissions.
Security | Fixed an issue where user names with more than one space in the first and last name columns created an error on the Security tab.
Security | Previously, Incorta Okta setup required a URL with a trailing slash. A trailing slash is no longer required.
CMC | Fixed an issue where a 404 error displayed after a user logged in to CMC > Clusters > [Cluster Name] > Services > [Service_Name], showed the logs, then clicked back to the CMC home.
CLI Tools | Added an asterisk (*) on the end of sample data files to provide users with a hint in exporting and importing session variables.
Data Sources & Data Files | Fixed an error that prevented users from adding a multi-source table when a table from an SAP ERP Connector existed.
Installer | Changed the version of Hadoop (now 3.2.0) that comes with Spark that is bundled with Incorta Analytics.
Schema | Alias tables now show the number of records from the base table.
Schema | Fixed an issue where editing a table, making a change, and then adding a join caused the change to disappear.
Schema | Fixed an issue where modifying a query with a Kafka data source and a WHERE condition created an error.
Schema | Fixed an issue where the number of columns in the table details was different from the number of columns in the schema view.
Schema | Fixed an issue where the schema page spun for 30+ minutes.
Schema | Prior to 4.5, adding or deleting a column in a materialized Incorta table removed and re-created the whole table. In Release 4.5, the materialized Incorta table is retained and only the modified columns are altered. This helps to better manage the lifecycle of an Incorta table.
Schema | Table aliases now refresh automatically when a base table changes.
Materialized Views | Fixed an issue that caused an error to display when saving a materialized view using a Python script.
Materialized Views | Fixed an issue where materialized view updates were ignored when the SQL format included leading spaces, line breaks, and tabs.
Compaction | Fixed an issue where old compacted versions were not deleted after running a new compaction.
Variables | Fixed an issue where a case-sensitive session variable did not return the login user name as expected.
Dashboards & Insights | Added support for the Yen in Incorta Analytics.
Dashboards & Insights | Fixed an issue that caused an ArrayIndexOutOfBoundsException error on the dashboard.
Dashboards & Insights | Fixed an issue where a 24-hour time format did not work.
Dashboards & Insights | Fixed an issue where a user set the number of decimal points for data fields to 4, but only 2 decimals displayed on the dashboard.
Dashboards & Insights | Fixed an issue where an incorrect date mask in a formula was allowed to save even though it was not able to run.
Dashboards & Insights | Fixed an issue where Apple Maps did not render in the Analyzer mode until after a user clicked Done to see the dashboard.
Dashboards & Insights | Fixed an issue where color format (using the Format Color Palette option) didn’t work when the coloring dimension was identified in a formula.
Dashboards & Insights | Fixed an issue where conditional formatting on a formula column did not display as expected.
Dashboards & Insights | Fixed an issue where insight rows did not align when the fixed column feature was used.
Dashboards & Insights | Fixed an issue where null values evaluation was not consistent when used in an arithmetic operation while aggregation was turned on.
Dashboards & Insights | Fixed an issue where sorting a formula column by negative, null, or positive values did not work properly.
Dashboards & Insights | Fixed an issue where the country code “CN” displayed “Cyprus No Mans Area” instead of “China.”
Dashboards & Insights | Fixed an issue where the date and timezone for the same date displayed differently in the insight filter than in the insight.
Dashboards & Insights | Fixed an issue where the error “Required columns not loaded” displayed while rendering a dashboard.
Dashboards & Insights | Fixed an issue where the parse date function did not work when a month was written in capital letters instead of title case.
Dashboards & Insights | Fixed an issue where the SchemaRefreshTime formula did not update as expected after a full load.
Dashboards & Insights | Fixed an issue where the Total columns were left-justified when transpose was enabled in a report, even though the content was centered while editing the report.
Dashboards & Insights | Fixed an issue where users couldn’t see the field attributes modal when the field area was expanded on an insight with a large number of columns.
Dashboards & Insights | Fixed an issue with formatting within a transposed insight.
Dashboards & Insights | Fixed an issue where incorrect data was displayed on a dashboard for a short period of time at the end of an incremental load cycle.
UI | Added a help button to the UI of the Incorta Analytics application.
UI | Added support for Japanese characters in schema and business view names.
UI | Added support for Japanese in Incorta Analytics.
© Incorta, Inc. All Rights Reserved.