Release Notes 4.8

Release Highlights

The goal of the Incorta 4.8 release is to enhance analytical capabilities with new analytic and date functions, empower business users with dashboard personalization, and improve data management and security administration. To that end, the 4.8 release introduces several key improvements to the Cluster Management Console (CMC), Incorta Loader Service, and the Incorta Analytics Service. In addition, the release includes an Incorta Labs offering for enabling a new Schema Diagram. The release also introduces accessibility support that includes enhanced keyboard controls and voiceover audio cues.

Important New Features and Enhancements

There are several important features in this release:

Additional Improvements and Enhancements

Upgrade to Incorta 4.8

IMPORTANT:

Prior to upgrading to Incorta 4.8, please review and follow the procedures outlined in the Upgrade to Incorta 4.8 documentation.


Cluster Management Console (CMC)

The following new configurations and enhancements are available in the Cluster Management Console (CMC) for this release:

CMC Administrator Alerts

As of this release, you can enable Administrator Alerts for a given Incorta Cluster. Administrator Alerts monitor both the On Heap and Off Heap memory usage of the loader or analytics services. The CMC sends an email alert when either the Analytics Service or the Loader Service reaches 90% of either On Heap Memory or Off Heap Memory.

To enable this option for an existing Cluster, follow these steps:

  • In the Navigation bar, select Clusters.
  • In the cluster list, select a cluster name.
  • In the canvas tabs, select Details.
  • In the view panel, to edit the cluster details, select the Pen.
  • In the edit panel, enable Admin Alerts.
  • Select Update.

Here are the steps to enable this option when creating a new Cluster using the wizard:

  • In the Monitoring & Alerts step, enable Admin Alerts.
  • Select Next.
  • In the Review step, review the new Cluster details.
  • Select Create.

Configure Email Alerts

To receive Administrator Alert emails for the Loader or Analytics Service, you must configure SMTP Email in the Cluster’s Default Tenant Configurations.

To enable SMTP Email in the CMC, follow these steps:

  • In the Navigation bar, select Clusters.
  • In the cluster list, select a Cluster name.
  • In the canvas tabs, select Cluster Configurations.
  • In the panel tabs, select Default Tenant Configurations.
  • In the left pane, select Email.
  • In the right pane, specify the SMTP details such as:

    • SMTP Host, for example, smtp.gmail.com
    • SMTP Server Port
    • Email Server Protocol
    • If required, enable Email Host Requires Authentication and specify the System Email Address, Person Name, and System Email Password.
    • If required, enable Sender’s Username Auth, and specify the System Email Username.
  • Select Save.

NotResponding Status

In certain cases, the Loader Service or the Analytics Service will not respond to a status request from the CMC. In these cases, for the given Incorta Cluster, for the specific Service (Analytics or Loader), the CMC will…

  • In Details, show a status of NotResponding
  • In Services, in State, show a status of NotResponding and show N / A for both On Heap and Off Heap memory usage.

Log Summary

The 4.8 release contains an enhancement for generating a summary of selected logs. The Logs Summary feature supports the following logs:

  • Analytics Service
  • Loader Service
  • Tenant Activity
  • SQLi (applicable only for Analytics Service)

The Analytics Service, Loader Service, Tenant, and SQLi logs all share the same naming convention: incorta.YYYY-MM-DD.log

NOTE:

In this release, the Logs Summary feature does not support catalina.logs. System is an internal tenant.

Create a Logs Summary

Here are the steps to create a Logs Summary:

  • In the Navigation bar, select Logs.
  • In the Logs tab, specify the following:

    • Date
    • Cluster Name
    • Service Name
    • Tenant Name
  • Select Get Files.
  • In the List view, for the given log files to summarize, select each checkbox.
  • In the Footer bar, select Actions.
  • In the Actions menu, select Summarize.
  • A notification confirms the successful creation of the Logs Summary file.

View a Logs Summary

In the Logs Summary dialog, you can review the Entity Mode or Service Mode summary. Here are the steps to view a Logs Summary:

  • In the Navigation bar, select Logs.
  • In the Logs Summary History tab, specify the following:

    • Date
    • Cluster Name
  • In the List view, select the Logs_summary_MM-DD-YYYY_HH-MM-SS file name.
  • In the Logs Summary dialog, select Entity Mode or Service Mode.

Download a Logs Summary

In the Logs Summary dialog, you can download a Logs Summary ZIP archive that contains the summary details as CSV or JSON files. The Logs_summary_MM-DD-YYYY_HH-MM-SS_csv.zip archive contains individual files where applicable:

  • Analytics_Dashboard_And_Queries.csv
  • Analytics_logGaps.csv
  • Analytics_Schemas_And_Jobs.csv
  • Analytics_services_startUps_and_shutdowns.csv
  • Analytics_System_errors.csv
  • Analytics_Tenants_info.csv
  • Loader_Service_Startups_and_Shutdowns.csv
  • Loader_System_errors.csv
  • Loader_Tenants_Compactions.csv
  • Loader_Tenants_info.csv
  • Loader_Tenants_loggaps.csv
  • Loader_Tenants_Schemas.csv

The Logs_summary_MM-DD-YYYY_HH-MM-SS_json.zip archive contains two files:

  • entityMode.json
  • serviceMode.json

Here are the steps to download the Logs Summary files:

  • In the Navigation bar, select Logs.
  • In the Logs Summary History tab, specify the following:

    • Date
    • Cluster Name
  • In the List view, for the given Logs_summary_MM-DD-YYYY_HH-MM-SS file name, select csv or json.

Dynamic resource allocation for materialized views

In this release, you can configure an Incorta Cluster to dynamically allocate Apache Spark executors for processing materialized views. This configuration allows Spark to dynamically size the number of executors for a given workload.

NOTE:

Dynamic resource allocation for executors requires that the Apache Spark Shuffle Service is running. To determine whether the Shuffle Service is running, on the host running Apache Spark, execute the following bash command:

ps aux | grep org.apache.spark.deploy.ExternalShuffleService

To start the Shuffle Service, run the following from the Apache Spark sbin folder:

./start-shuffle-service.sh

Here are the steps to enable dynamic resource allocation in the CMC:

  • In the Navigation bar, select Clusters.
  • In the cluster list, select a Cluster name.
  • In the canvas tabs, select Cluster Configurations.
  • In the panel tabs, select Server Configurations.
  • In the left pane, select Spark Integration.
  • In the right pane, toggle on Enable dynamic allocation in MVs.
  • Select Save.

With Dynamic Allocation enabled, the Data Source dialog for a materialized view shows this configuration:

Dynamic allocation mode is currently enabled

Scale-up and Scale-Down

With Dynamic Allocation enabled, a materialized view load job starts in Apache Spark with one executor and scales up exponentially every second. After 2 minutes of idle time, the number of executors scales down to 0. Here are the basic rules for dynamic allocation:

  • Scale-up occurs after 1 second
  • Scale-down occurs after 2 minutes of idle time
  • The default number of initial executors is 1
  • The default number of minExecutors is 0.

Additional considerations for Dynamic Allocation

Dynamic Allocation is an Incorta Cluster Server Configuration. When enabled, the Incorta Cluster dynamically allocates Apache Spark executors for processing materialized views.

In the CMC, in Server Configurations, in Spark Integration, the Materialized view application cores property defines the number of CPU cores Incorta instructs Apache Spark to use while executing a materialized view job.

With Dynamic Allocation enabled, Incorta uses the Materialized view application cores value to determine and allocate the number of CPU cores for the initial executors for the materialized view job. There is no maximum limit for CPU scale-up unless an upper limit is specified.

To limit the Dynamic Allocation scale-up, you can add and assign a value to spark.cores.max property for a given materialized view.

To set the maximum number of cores in Apache Spark for Dynamic Allocation, specify a value for the spark.cores.max property in the $SPARK_HOME/conf/spark-defaults.conf file.
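
For reference, the dynamic allocation behavior described above corresponds to standard Apache Spark properties. The following spark-defaults.conf sketch is illustrative only; the property names are standard Spark settings, but the values shown are examples rather than Incorta defaults:

spark.dynamicAllocation.enabled=true
spark.shuffle.service.enabled=true
spark.dynamicAllocation.initialExecutors=1
spark.dynamicAllocation.minExecutors=0
spark.dynamicAllocation.schedulerBacklogTimeout=1s
spark.dynamicAllocation.executorIdleTimeout=120s
spark.cores.max=8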

Existing Materialized Views with Spark Properties

When upgrading to Incorta 4.8, materialized views with existing Spark property configurations are not affected by dynamic allocation. However, the Data Source dialog will show the following: Dynamic allocation mode is currently enabled

After upgrade, you can enable Dynamic Allocation for an existing materialized view with Spark properties. In the Data Source dialog for a materialized view, in Properties, for key, enter the following key value pairs:

  • spark.dynamicAllocation.enabled = true
  • spark.shuffle.service.enabled = true

Override Dynamic Allocation for a Materialized View

For a given materialized view, in the Data Source dialog, you can specify additional properties to override dynamic allocation. In the Data Source dialog for a materialized view, in Properties, for key, enter the following key value pairs:

  • spark.dynamicAllocation.enabled = false
  • spark.shuffle.service.enabled = false

Notebook Integration with Dynamic Allocation

A notebook for a materialized view inherits the configured Apache Spark properties. With dynamic allocation enabled, however, the behavior changes:

  • A notebook is not restricted to the materialized view’s maximum memory or maximum number of cores.
  • Since dynamic allocation sets the minimum executors to zero, an idle notebook may take time to scale up to one executor or more, especially if there are numerous jobs running in Spark.
  • The cached idle time is 30 minutes for a notebook’s resources (spark.dynamicAllocation.cachedExecutorIdleTimeout=30).

After 30 minutes, Incorta releases the cached resources. However, in order for a notebook to return to normal operations, the notebook needs to recalculate the cached resources.

Enhanced LDAP authentication for the SQL Interface (SQLi)

In this release, you can now specify an authentication method for the SQL Interface (SQLi). This means it is possible to have one authentication method for Incorta and a separate authentication method for the SQL Interface.

The new property is Allow Different Authentication Type for External Tools. When enabled, you can specify the External Tools Authentication Type. There are two choices: Incorta Authentication and LDAP. All tenants will inherit changes to the Default Tenant Configuration. It is possible to override these settings for a specific tenant. When External Tools Authentication Type is set to LDAP, the following properties need to be set:

  • External Tools Base Provider URL

    Specify the LDAP server URL, for example: ldap://<LDAP_SERVER_HOST>:<LDAP_SERVER_PORT>

    Incorta requires this LDAP URL when you select LDAP authentication.

  • External Tools Base Distinguished Name

    Enter a distinguished name (DN) to specify where the search begins in the LDAP directory of information tree (DIT).

  • External Tools System User

    Specify the LDAP System User that has the proper privileges to query users. If the LDAP Server does not require authentication, this property does not need to be set.

  • External Tools System User Password

    Specify the password for the LDAP System User.

  • External Tools User Mapping Login

    The unique LDAP identifier attribute for the user, such as ID or email, that is used to sign in to Incorta.

  • External Tools User Mapping Authentication

    An optional LDAP attribute to authenticate users in Incorta. If empty, Incorta uses the External Tools User Mapping Login value.
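
For example, a hypothetical LDAP configuration might use values such as the following (all names and values are illustrative):

  • External Tools Base Provider URL: ldap://ldap.example.com:389
  • External Tools Base Distinguished Name: ou=people,dc=example,dc=com
  • External Tools System User: cn=incorta-svc,ou=services,dc=example,dc=com
  • External Tools User Mapping Login: mail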

NOTE:

Any changes to these properties require that you restart all services in the Incorta Cluster.

To enable LDAP authentication for the SQL Interface (SQLi) for all tenants, in the CMC, follow these steps:

  • In the Navigation bar, select Clusters.
  • In the cluster list, select a Cluster name.
  • In the canvas tabs, select Cluster Configurations.
  • In the panel tabs, select Default Tenant Configurations.
  • In the left pane, select Security.
  • In the right pane, enable Allow Different Authentication Type for External Tools.
  • In External Tools Authentication Type, select LDAP, then specify the following:

    • External Tools Base Provider URL
    • External Tools Base Distinguished Name
    • External Tools System User
    • External Tools System User Password
    • External Tools User Mapping Login
  • Optionally, specify a value for External Tools User Mapping Authentication
  • Select Save.

You must then restart all services in the Incorta Cluster. To restart all services in the CMC, follow these steps:

  • In the Navigation bar, select Clusters.
  • In the cluster list, select a Cluster name.
  • In the Details tab, for the given cluster, select Restart.

Incorta Labs

Incorta Labs offers experimental features and functionality that Incorta supports for non-production use. Some experimental features may become part of a future Incorta release, and others may be deprecated. Incorta Support will investigate issues with Incorta Labs features. Here are the new Incorta Labs features in this release:

Schema Diagram

In this release, as an Incorta Labs feature, you can enable Schema Diagram. When enabled, users experience an enhanced user interface for viewing schema diagrams in the Schema Designer and query plans in the Analyzer. The schema diagram visually distinguishes between a Table, Materialized View, Alias, and Incorta Table.

Here are the steps to enable this option as default tenant configuration in the CMC:

  • In the Navigation bar, select Clusters.
  • In the cluster list, select a Cluster name.
  • In the canvas tabs, select Cluster Configurations.
  • In the panel tabs, select Default Tenant Configurations.
  • In the left pane, select Incorta Labs.
  • In the right pane, enable Schema Diagram (toggle on).
  • Select Save.

Here are the steps to enable this option for a specific tenant configuration in the CMC:

  • In the Navigation bar, select Clusters.
  • In the cluster list, select a Cluster name.
  • In the canvas tabs, select the Tenants tab.
  • In the Tenant list, for the given tenant, select Configure.
  • In the left pane, select Incorta Labs.
  • In the right pane, enable Schema Diagram (toggle on).
  • Select Save.

To view the new schema diagram for a given schema in an enabled tenant, sign in to the Incorta Unified Data Analytics Platform as a user that belongs to the Schema Manager role. Then, follow these steps:

  • In the Navigation bar, select Schema
  • In the Schema Manager, in the Schema tab, select a specific schema name in the schema list.
  • In the Schema Designer, in the Action bar, select Diagram.

For a selected schema diagram, you can:

  • Search for a table
  • Zoom in or out
  • Fit the diagram to the screen
  • View the details about a table or join between tables
  • Select one of the following layout options:

    • Default Layout
    • Compact Layout
    • Circular Layout
  • Use the Overview panel to focus on a specific section of the diagram

In the Analyzer, for a measure in the measure tray, you can view a query plan using the new schema diagram. Here are the steps:

  • In the Properties menu, select Advanced.
  • Select Query Plan.

Incorta Analytics and Loader Service

The 4.8 release introduces several key improvements to the Incorta Analytics and Loader Services such as:

Support for Google Sheets as a data source

In this release, you can now create a Data Source for Google Sheets. A Google Sheet data source requires authorized access to a Google Drive with a Google Drive Client ID and Client Secret.

Application administrators need to enable the Google Drive API and the Google Sheets API for the Google Account. The Google Drive API gets the authenticated user’s list of spreadsheet files and their respective IDs. The Google Sheets API reads the spreadsheets.

To enable this feature for a tenant, you must specify the Google Drive Client ID and the Google Drive Client Secret in the tenant configuration in the Cluster Management Console.

Specify the Google Drive Client ID and Google Drive Client Secret

To specify the Google Drive Client ID and Client Secret for the Default Tenant Configuration in the CMC, follow these steps:

  • In the Navigation bar, select Clusters.
  • In the cluster list, select a Cluster name.
  • In the canvas tabs, select Cluster Configurations.
  • In the panel tabs, select Default Tenant Configurations.
  • In the left pane, select Integration.
  • In the right pane, specify the

    • Google Drive Client ID
    • Google Drive Client Secret
  • Select Save.

To specify the Google Drive Client ID and Client Secret for a specific tenant configuration in the CMC, follow these steps:

  • In the Navigation bar, select Clusters.
  • In the cluster list, select a Cluster name.
  • In the canvas tabs, select Tenant.
  • For the given tenant, select Configure.
  • In the left pane, select Integration.
  • In the right pane, specify the

    • Google Drive Client ID
    • Google Drive Client Secret
  • Select Save.

NOTE:

After specifying a Google Drive Client ID and Client Secret, you must restart all services in the Incorta cluster.

Create a Google Sheets Data Source

To create a new Google Sheet data source, follow these steps.

  • In the Navigation bar, select Data.
  • In the Data tab, for External Data Sources, in the Action bar, select + New.
  • In the Add New menu, select Add Data.
  • In the Choose a Data Source dialog, in Other, select Google Sheets.
  • In the New Data Source dialog, specify the Data Source Name
  • Select Authorize.
  • To test the connection, select Test Connection.
  • To save, select Ok.

Create a Google Sheets Data Source Schema with the Schema Wizard

Here are the steps to create a Google Sheet data source schema using the Schema Wizard:

  • In the Navigation bar, select Schema.
  • In the Action bar, select + New.
  • In the Add New Menu, select Schema Wizard.
  • In the Schema Wizard, in Choose a Source (Step 1)…

    • Enter the Schema Name
    • Select the Google Sheet data source name
    • Optionally specify a schema description
  • Select Next.
  • In Manage Tables, select the Google Sheets in the directory hierarchy.
  • In the Data Panel, for the given Google Sheet, select the individual sheets to import as tables.
  • Select Next.
  • Select Create Schema.

Create a Table using a Google Sheets Data Source

For an existing schema, here are the steps to create a schema table using a Google Sheets data source:

  • For an existing schema open in the Schema Designer, in the Action bar, select + New.
  • In the Add New menu, select Table, then select Google Sheets.
  • In the Data Source dialog, in Data Source, select the Google Sheets data source.
  • In File Name, click Select.
  • In Sheet, select the specific sheet in the file.
  • Optionally specify the:

    • Number of Rows to Skip From Header
    • Number of Rows to Skip From Footer
  • Optionally specify Incremental Load.
  • Select Add.
  • Specify a Table Name.
  • In the Action bar, select Done.

Additional considerations for Google Sheets

You can access Google Sheets found in My Drive or Shared with me. It is possible to have Google Sheets with identical names in a Google Drive. Incorta will attempt to load all identically named Google Sheets, which will most likely result in errors. The easiest way to avoid this issue is to give each Google Sheet a unique name.

Incorta discovers the data type for a column in a sheet based on the column format. Otherwise, Incorta infers the data type based on the first 50,000 rows for the column.

With Incremental Load enabled, if there is not a Key column defined, new sheet rows will be appended and no existing rows will be updated.

Cached files for Oracle Cloud Applications

In this release, for an Oracle Cloud Application (formerly Oracle Fusion) data source, you can now enable keeping cached files for a specific duration. A cache clean task runs every hour to clean all cached files that are scheduled for deletion. You can configure two data source properties: Keep cached files and Keep file(s) for.

Create an Oracle Cloud Applications Data Source

To create a new Oracle Cloud Application data source, follow these steps:

  • In the Navigation bar, select Data.
  • In the Data tab, for External Data Sources, in the Action bar, select + New.
  • In the Add New menu, select Add Data.
  • In the Choose a Data Source dialog, in Application, select Oracle Cloud Applications.
  • In the New Data Source dialog, specify the

    • Data Source Name
    • Username
    • Password
    • Oracle Cloud Applications URL
    • Root Query Text
    • Data Type Discovery Policy
    • Metadata directory
    • File Name Pattern
    • File Criteria - Last Modified Timestamp
  • Next, enable the Keep cached files toggle.
  • In Keep file(s) for, select a duration:

    • a Day
    • a Week
    • a Month
    • a Year
    • Forever
  • To test the connection, select Test Connection.
  • To save, select Ok.

Specify an Offset for an Apache Kafka Data Source

In this release, you can specify a previous offset timestamp for a new or existing Apache Kafka data source. Referred to as an offset rewind, this setting instructs the Kafka Consumer in Incorta to fetch messages from the Kafka topic that may have been retrieved previously. The offset rewind overwrites all rows, even when a Kafka table contains a key column.

An example of the supported timestamp format (yyyy-MM-dd HH:mm:ss.SSS zzz) is: 2020-04-24 04:20:09.123 UTC

To create a new Apache Kafka data source, follow these steps.

  • In the Navigation bar, select Data.
  • In the Data tab, for External Data Sources, in the Action bar, select + New.
  • In the Add New menu, select Add Data.
  • In the Choose a Data Source dialog, in Streaming data, select Kafka.
  • In the New Data Source dialog, specify the:

    • Data Source Name
    • Topic
    • Broker List
    • Message Type Field (optional)
    • Trim Message Type after dash
    • Kafka Version
    • Enable Kafka Consumer
    • Data Expiration (optional)
    • Offset Rewind to timestamp (yyyy-MM-dd HH:mm:ss.SSS zzz)
    • Mapping file (Apache Avro)
  • To test the connection, select Test Connection.
  • To save, select Ok.

To edit an existing Apache Kafka data source, follow these steps.

  • In the Navigation bar, select Data.
  • In the Data tab, for External Data Sources, select the Kafka data source.
  • In the Edit Data Source dialog, specify the:

    • Offset Rewind to timestamp (yyyy-MM-dd HH:mm:ss.SSS zzz)
  • To test the connection, select Test Connection.
  • To save, select Ok.

Specify an Incremental Column of the type double, integer, or long

In previous versions of Incorta, an incremental column supported only columns of the type date or timestamp. In this release, for an incremental table of the type SQL Database (MySQL or Oracle data source), an incremental column now also supports — in addition to date and timestamp — numeric types such as double, integer, and long.

To specify an incremental column that is of the type double, integer, or long, follow these steps:

  • In the Data Source dialog, for Incremental Extract Using, select Maximum Value of a Column.
  • In Incremental Column, select the relevant column of the type double, integer, or long.
  • In Incremental Field Type, select Numeric.
  • Select Save.

Enhanced multisource tables

In the Table Editor, you can define one or more data sources for a table. A table with two or more data sources is a multisource table. A multisource table allows schema developers to create a schema table that ingests data from multiple data sources, such as a SQL Database and a File System, into one schema destination table.

In previous versions of Incorta, a multisource table required an identical schema (matching query) for all data sources. In this version of Incorta, it is possible to manage the data type for each output column in a multisource table. This feature helps schema developers manually resolve any data type conflicts for an output column. To manage the data types for each output column of a multisource table, follow these steps:

  • For the given multisource table, in the Table Editor, in the Summary section, select Manage output.
  • In the Manage output dialog, for each Output column row, specify the

    • Incorta function (key, dimension, measure)
    • Incorta Type (date, double, integer, long, string, text, timestamp, or null)
  • Select Ok.
  • To save your changes, in the Action bar, select Done

Additional considerations for multisource tables

When you add a new data source to an existing table, all previous column changes are updated to the multisource table column output schema.

If two or more data sources contain key columns, and these key columns are similar, Incorta creates a compacted Parquet file for the last loaded data source.

For multisource tables, avoid mixing incremental enabled data sources with non-incremental data sources as this can result in unexpected results.

Parallelized ingest with Apache Spark

In this release, for a schema table that is a File System (CSV) or SQL Database (MySQL or Oracle) data source, you can enable Spark Extraction for a full load.

With Spark Extraction enabled for a supported table, the Loader Service directs Apache Spark to connect to the data source directly to ingest data. Spark Extractor appears as the job name in the Spark Master Web UI.

A distributed load with Apache Spark allows for parallelized extraction. You can specify the degree of parallelism in terms of Worker CPU and Worker Memory. For SQL Database tables, you can also specify the column for parallelized queries.

File System: Spark Extraction

Here are the steps to enable Spark Extraction for a File System (CSV) table in a given schema:

  • In the Table Editor, select the File System table.
  • In the Data Source dialog, toggle on Enable Spark Based Extraction and then…

    • Specify the Max Number of Parallel File Extractors
    • Specify the Memory Per Extractor in gigabytes.
  • Select Save.
  • To save your changes to the table, in the Action bar, select Done.

SQL Database: Spark Extraction

Here are the steps to enable Spark Extraction for a SQL Database (MySQL or Oracle) table in a given schema:

  • In the Table Editor, select the SQL Database table.
  • In the Data Source dialog, toggle on Enable Spark Based Extraction and then…

    • Specify the Max Number of Parallel Queries
    • Select the Column to Parallelize Queries on
    • Specify the Memory Per Extractor in gigabytes.
  • Select Save.
  • To save your changes to the table, in the Action bar, select Done.

Notebook and Editor Options for Materialized Views

In this release, with Notebook Integration enabled for the tenant, you can now edit a materialized view using either the Script Editor or the Notebook Editor. In the Data Source dialog for a materialized view, select Edit in Editor or Edit in Notebook.

WARNING

A materialized view only stores the declared execution language paragraphs. Reverting from a Notebook to the Editor removes the non-conforming language paragraphs.

Scala 2.12 Support for Materialized Views

Apache Spark is written in Scala. Incorta supports running Apache Spark with Scala 2.12. For the best Spark performance, most data engineers code Spark jobs in Scala for the following reasons:

  • Scala supports both functional and object oriented programming paradigms.
  • Scala offers robust exception and error handling. As a statically typed language, Scala surfaces type errors at compile time, avoiding trivial type exception bugs that are common in PySpark.
  • As compiled code, Scala’s execution is up to 10x faster than PySpark for most dataframe operations.
  • Scala offers numerous productivity classes including Type inference.

Scala Example

The following example reads ten rows from the SALES.CUSTOMERS table and then persists the results:

val df = read("SALES.CUSTOMERS").limit(10)

save(df)

Available Helper Methods

There are several helper methods available for creating a Materialized View with Scala.

  • Returns the last refresh time for the MV

    get_last_refresh_time ( ): Long
  • Required method to persist the Materialized View.

    save(dataFrame: DataFrame): Unit
  • Reads a schema table and returns it as a dataframe.

    read(tableName: String): DataFrame
  • Reads data of the given format from the given path and returns a dataframe object.

    readFormat(format: String, path: String): DataFrame
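
The following sketch combines these helpers in a Scala materialized view script. It is a minimal example, assuming a SALES.ORDERS table with a LAST_UPDATED timestamp column (hypothetical names) and that get_last_refresh_time() returns the last refresh time as epoch milliseconds:

// Minimal sketch: keep only rows changed since the last materialized view refresh.
// SALES.ORDERS and LAST_UPDATED are hypothetical names; adjust to your schema.
val lastRefresh = new java.sql.Timestamp(get_last_refresh_time())
val orders = read("SALES.ORDERS")
val changedOrders = orders.filter(orders("LAST_UPDATED") > lastRefresh)
// save() is required to persist the materialized view
save(changedOrders)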

Available Helper Methods with Notebook Integration enabled

In addition to the existing helper methods, the following helper methods are available for Scala with Notebook Integration enabled:

  • Displays the dataframe results.

    display(dataFrame: DataFrame): Unit
  • Displays the dataframe results.

    incorta.show(df: DataFrame)  
  • Prints the schema of the dataframe.

    incorta.printSchema(df: DataFrame)
  • Displays the count, mean, standard deviation, min, and max values for each column in the dataframe.

    incorta.describe(df: DataFrame): Unit
  • Displays the first n rows of the dataframe, where n is optional and defaults to 1.

    incorta.head(dataFrame: DataFrame, n: Int  = 1): Unit
  • Adds a new property to the map of properties.

    incorta.put(key: String, value: Object): Unit
  • Retrieves a property from the map of properties.

    incorta.get(key: String): Object
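
As a minimal sketch, the following notebook paragraphs show a few of these helpers together. It reuses the SALES.CUSTOMERS table from the earlier Scala example; the paragraph layout and the property key are illustrative:

%spark
// Paragraph 1: read a table and inspect it
val customers = read("SALES.CUSTOMERS")
incorta.printSchema(customers)
incorta.show(customers)

%spark
// Paragraph 2: pass a value between paragraphs through the property map,
// then persist the materialized view
incorta.put("customerCount", Long.box(customers.count()))
val storedCount = incorta.get("customerCount")
save(customers)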

Create a Materialized View with Scala using the Script Editor

With Notebook Integration disabled for a given tenant, you can only edit a materialized view using the Script Editor. With Notebook Integration enabled for the tenant, you can edit a materialized view using either the Script Editor or the Notebook Editor.

You must call the save(dataframe) method to persist the materialized view.

Here are the steps to create a Materialized View with Scala using the Script Editor:

  • For the given schema in Schema Designer, in the Action bar, select + New.
  • In the Add New menu, select Materialized View.
  • In the Data Source dialog, in Language, select SCALA.
  • In Script…

    • without Notebook Integration enabled, select the Open icon to open the Script Editor.
    • with Notebook Integration enabled, select Edit in Editor.
  • In Edit Script, enter your Scala code.
  • Select Done.
  • To specify additional materialized view Spark properties, select Add Property.
  • Select Save.
  • Specify a Table Name.
  • In the Action bar, select Done.

Create a Materialized View with Scala and Notebook Integration enabled

A Scala Notebook has the %spark declaration. You must call the save(dataframe) method to persist the materialized view.

Here are the steps to create a Materialized View with Scala:

  • For the given schema in Schema Designer, in the Action bar, select + New.
  • In the Add New menu, select Materialized View.
  • In the Data Source dialog, in Language, select SCALA.
  • In Script, select Edit in Notebook.
  • In one or more paragraphs, enter the Scala code for the materialized view.
  • Select Done.
  • To specify additional materialized view Spark properties, select Add Property.
  • Select Save.
  • Specify a Table Name.
  • In the Action bar, select Done.

Create a load filter with the Formula Builder

In this release, you can now create a load filter for a table in a schema using the Formula Builder. Here are the steps to create a table load filter:

  • For the given table in the Table Editor, in the Load Filter section, select the textbox.
  • In the Formula Builder, create a formula expression.
  • Select Validate & Save.
  • In the Action bar, select Done.

Enhanced list view for security users and groups

This release includes an improved list view of users and groups in the Security Manager. Improvements include:

  • A count of total items for Users and Groups
  • Improved Pagination
  • For Users, the ability to sort by Name, Email, and Last Signed In timestamp.
  • For Groups, the ability to sort by Name

Create a custom color palette for a dashboard

In this release, you can now create and apply a custom color palette for a given dashboard. Here are the configuration steps:

  • For the given dashboard, in the Action bar, select More Options.
  • In the More Options menu, select Configure Settings.
  • In the Configure Settings dialog, in General, select the Customize Color Palette checkbox.
  • In the Color Palette dropdown, select your default palette.
  • Select one of the 12 colors.
  • In the Color Picker dialog, select a color or enter a Hex, RGB, or HSL value.
  • To close the Color Picker dialog, select outside of the Color Picker.
  • Select Save to save your custom color settings.

Configure the order of dashboard prompts and presentation variables

In this release, for a given dashboard, you can now configure the order of prompts and presentation variables in the Filter Menu. To alphabetically order prompts and presentation variables for a given dashboard, follow these configuration steps:

  • For the given dashboard, in the Action bar, select More Options.
  • In the More Options menu, select Manage Filter & Prompts.
  • In the Dashboards filters, select Order Alphabetically.
  • In the Action bar, select Done.

To manually order prompts and presentation variables for a given dashboard, follow these configuration steps:

  • For the given dashboard, in the Action bar, select More Options.
  • In the More Options menu, select Manage Filter & Prompts.
  • In the Dashboards filters, select Order Manually.
  • In the Action bar, select Done.

Dashboard Analyzer Role

This release introduces the new Dashboard Analyzer role. Users that belong to this role are able to view shared dashboards and create a personalized view of a shared dashboard. In Incorta, you can add or remove a user from a group. You can assign one or more roles to a group. A user can belong to zero or more groups.

Dashboard Personalization

Dashboard Personalization is a new feature in this release. Personalization allows a user to customize a shared dashboard as a dashboard personalized view. A user can switch between a personalized view and an original view of a dashboard. A dashboard personalized view cannot be shared with other users or groups.

To create a personalized view, a user must have at least view access permissions to the dashboard and must belong to one or more of the following roles:

  • Analyzer User
  • Dashboard Analyzer
  • Individual Analyzer
  • SuperRole

In this release, there are three customizable options for a dashboard personalized view:

  • Select Insights
  • Edit Layout
  • Edit Insight Settings

Select Insights

A user can select one or more insights in the Select Insights Panel for the given dashboard. Here are the steps:

  • For the given dashboard, in the Action bar, select the Personalize icon.
  • In the Personalize menu, select Personalize.
  • In the Context bar, the Select Insights tab is selected by default.
  • In the Selected Insights pane, select or deselect one or more insights checkboxes.
  • In the Action bar, select Save.

Edit Layout

A user can customize the layout of the selected insights. Here are the steps:

  • For the given dashboard, in the Action bar, select the Personalize icon.
  • In the Personalize menu, select Personalize.
  • In the Context bar, select the Edit Layout tab.
  • To resize the width of a selected insight, select the Resize options (1x, 2x, 3x, 4x).
  • To resize the height of a selected insight, drag & drop the bottom and/or top resize height arrows.
  • To move a selected insight, select a Move arrow (down, up, left, right) or use the keyboard arrows.
  • In the Action bar, select Save.

Edit Insight Settings

In this release, the Edit Insight Settings option allows a user to choose and/or reorder the Table Columns for the following Table insight visualizations:

  • Table
  • Aggregated
  • Aggregated - Total
  • Aggregated - Subtotal
  • Aggregated - Subtotal - Total

Here are the steps for a user to choose and/or reorder the columns in a Table insight visualization:

  • For the given dashboard, in the Action bar, select the Personalize icon.
  • In the Personalize menu, select Personalize.
  • In the Context bar, select the Edit Insights Settings tab.
  • In the dashboard, select an applicable Table insight visualization.
  • In the Table Columns pane, select or deselect the checkbox for one or more columns.
  • To reorder a column, select the column in the Table Columns pane and move it above or below another column using drag & drop.
  • In the Action bar, select Save.

Reset to Original

Here are the steps to reset a personalized view back to the original dashboard view:

  • For the given dashboard, in the Action bar, select the Personalize icon.
  • In the Personalize menu, select Personalize.
  • In the Action bar, select Reset to original.

Switch between Original and Personalized Views

Having created a personalized view, a user can switch between their personalized view and the original view of the dashboard. Here are the steps to switch to a Personalized View:

  • For the given dashboard, in the Action bar, select the Personalize icon.
  • In the Personalize menu, select Personalized View.

Here are the steps to switch to the Original View:

  • For the given dashboard, in the Action bar, select the Personalize icon.
  • In the Personalize menu, select Original View.

Playlist of Favorite Dashboards

In the 4.8 release, you can now view your favorite dashboards as a playlist of dashboards. To play your favorite dashboards, in the Content tab, in Favorites, select the Playlist icon. To exit an active playlist, use the ESC keystroke.

An active playlist…

  • ignores the Open in Maximized View dashboard setting
  • shows each favorite dashboard in Full Screen
  • continues to play one or more visible dashboards until exited

An active playlist does not show the dashboard Navigation bar, Actions bar, or Filter bars. Instead, an active playlist shows a Playlist bar that consists of the following:

  • Current dashboard name
  • Current dashboard progress
  • Current dashboard number and the total number of dashboards in the playlist
  • The name of the next dashboard in the playlist
  • Settings (Gear) icon

An active playlist offers the following configuration options:

  • Reorder, hide, or show dashboards
  • Select the dashboard view duration in terms of speed (Slow, Medium, Fast)
  • If enabled for the Tenant, select the Theme (Light or Dark mode)
  • Select the View mode for the dashboards, either Original or Personalized

Here are the steps to configure the active playlist:

  • In the Playlist bar, select the Settings (Gear) icon.
  • In the Configure Playlist dialog, select the available options.
  • Select Play Dashboards.

NOTE:

Configurations for an active playlist cannot be saved.

New Analytic Functions

This release introduces three new Analytic Functions:

  • rank()
  • denseRank()
  • index()

The following table illustrates the behavior of the Analytic functions:

Category   Product        Min List Price   rank()   denseRank()   index()
Fruit      Apple          0.25             1        1             1
Fruit      Orange         0.35             2        2             2
Fruit      Banana         0.40             3        3             3
Fruit      Lemon          0.40             3        3             4
Fruit      Kiwi           0.75             5        4             5
Fruit      Plum           0.75             5        4             6
Fruit      Apricot        0.75             5        4             7
Fruit      Yellow Melon   3.50             8        5             8
Fruit      Cantaloupe     3.50             8        5             9
Fruit      Pineapple      5.00             10       6             10

The new Analytic functions — rank(), denseRank(), and index() — require two input functions:

  • groupBy()
  • orderBy()

groupBy()
Usage

groupBy(dimensions...)

Specify one or more dimension columns.

orderBy()
Usage

orderBy(aggregation_function(measure), true|false)

Specify an aggregation and a boolean value. True enables an ascending order and false enables a descending order.

rank()
Usage

rank(groupBy( dimension, ...), orderBy(aggregation_function( measure), true|false, ...))

Returns the rank based on the order of the grouped values. Rows with identical values share the same rank value resulting in nonconsecutive rank values. Subsequent rows account for the number of previous rows. For example, if three rows have the identical rank value of N, then the subsequent row has a rank value of N + 3.

Example
rank(
    groupBy(
        SALES.PRODUCT.PRODUCT_CATEGORY
    ),
    orderBy(
        min(
            SALES.PRODUCT.LIST_PRICE
        )
        ,true
    )
)
denseRank()
Usage

denseRank(groupBy( dimension, ...), orderBy(aggregation_function( measure), true|false, ...))

Returns a consecutive rank of each row based on the order of the grouped values. Rows with identical values share the same rank value. Subsequent rows increment the rank in consecutive sequence. For example, if three rows have the identical rank value of N, then the subsequent row has a rank value of N + 1.

Example
denseRank(
    groupBy(
        SALES.PRODUCT.PRODUCT_CATEGORY
    ),
    orderBy(
        min(
            SALES.PRODUCT.LIST_PRICE
        )
        ,true
    )
)
index()
Usage

index(groupBy( dimension, ...), orderBy(aggregation_function( measure), true|false, ...))

Returns an index of rows based on the order of the grouped values. For rows with identical values, the index order is not guaranteed.

Example
index(
    groupBy(
        SALES.PRODUCT.PRODUCT_CATEGORY
    ),
    orderBy(
        min(
            SALES.PRODUCT.LIST_PRICE
        )
        ,true
    )
)

New Date Functions

This release includes the following new Date functions:

  • addQuarters
  • monthEndDate
  • monthStartDate
  • quarterEndDate
  • quarterStartDate
  • weekEndDate
  • weekStartDate

NOTE:

Italicized function names are overloaded functions.

Function overloading allows functions to have the same name but different input signatures and return types. This document does not include existing functions from previous versions of Incorta.

addQuarters
Usage

addQuarters(date_timestamp exp, int quarters)

Returns a scalar date or timestamp incremented by the specified number of quarters.

Example
addQuarters(
    date(
        "2020-05-02"
    ),
    1
)

addQuarters(
	timestamp(
		"2020-05-02 03:09:11"
	),
	1
)
monthEndDate
Usage

monthEndDate()

Returns the last date of the month for the current date.

monthStartDate
Usage

monthStartDate()

Returns the first date of the month for the current date.

quarterEndDate
Usage

quarterEndDate()

Returns the last date of the quarter for the current date.

quarterStartDate
Usage

quarterStartDate()

Returns the first date of the quarter for the current date.

weekEndDate
Usage

weekEndDate()

Returns the last date of the week for the current date where the week ends on Saturday and begins on Sunday.

weekStartDate
Usage

weekStartDate()

Returns the first date of the week for the current date where the week ends on Saturday and begins on Sunday.

New Conditional Statement Functions

This release includes the following new Conditional Statement functions:

  • decode

NOTE:

This document does not include existing functions from previous versions of Incorta.

decode
Usage
decode(field, caseValue, thenValue, elseValue)

If the field value matches the caseValue, then the function returns the thenValue; otherwise, it returns the elseValue.

Example
decode(
	SALES.PRODUCTS.PROD_CATEGORY_ID,
	201,
	"My Electronics",
	"Other"
)

New Conversion Functions

This release includes the following new Conversion functions, including conversions for the current date and between-date calculations:

  • day
  • minute
  • month
  • monthEndDate
  • monthName
  • monthsBetween
  • monthStartDate
  • monthWeek
  • quarter
  • quarterDay
  • quarterEndDate
  • quarterMonth
  • quarterStartDate
  • quarterWeek
  • second
  • weekday
  • weekEndDate
  • weeknum
  • weekStartDate
  • year
  • yearsBetween

NOTE:

Italicized function names are overloaded functions. Function overloading allows functions to have the same name but different input signatures and return types. This document does not include existing functions from previous versions of Incorta.

day
Usage

day()

Returns the day of the current date.

minute
Usage

minute()

Returns the minute value of the current time.

month
Usage

month()

Returns the month value of the current month.

monthEndDate
Usage

monthEndDate(date_timestamp exp)

Returns the last date of the month for the given date or timestamp.

Examples
monthEndDate(
	date(
		"2020-05-02"
	)
)

monthEndDate(
	timestamp(
		"2020-05-02 03:09:11"
	)
)
monthName
Usage

monthName()

Returns the month name of the current month.

monthName
Usage

monthName(date_timestamp exp)

Returns month name for the given date or timestamp.

Examples
monthName(
	date(
		"2020-05-02"
	)
)

monthName(
	timestamp(
		"2020-05-02 03:09:11"
	)
)
monthsBetween
Usage

monthsBetween(date_timestamp startDate, date_timestamp endDate)

Returns a double for the number of months between two dates or timestamps.

Examples
monthsBetween(
	date(
		"2019-04-20"
	),
	date(
		"2020-04-19"
	)
)

monthsBetween(
	timestamp(
		"2019-04-20 04:20:00"
	),
	timestamp(
		"2020-04-19 04:19:00"
	)
)
monthStartDate
Usage

monthStartDate(date_timestamp exp)

Returns the first date of the month for the given date or timestamp.

Examples
monthStartDate(
	date(
		"2020-05-02"
	)
)

monthStartDate(
	timestamp(
		"2020-05-02 03:09:11"
	)
)
monthWeek
Usage

monthWeek()

Returns the week number of the current month where the week begins on Sunday and ends on Saturday.

monthWeek
Usage

monthWeek(date_timestamp exp)

Returns the month week of the given date or timestamp where the week begins on Sunday and ends on Saturday.

Examples
monthWeek(
	date(
		"2020-05-02"
	)
)

monthWeek(
	timestamp(
		"2020-05-31 16:20:00"
	)
)
quarter
Usage

quarter()

Returns quarter number for the current quarter.

quarter
Usage

quarter(date_timestamp exp)

Returns the quarter number for the given date or timestamp.

Examples
quarter(
	date(
		"2020-04-20"
	)
)

quarter(
	timestamp(
		"2021-01-01 00:00:00"
	)
)
quarterDay
Usage

quarterDay()

Returns the count of days from the beginning of the quarter for the current date.

quarterDay
Usage

quarterDay(date_timestamp exp)

Returns the count of days from the beginning of the quarter for the given date or timestamp.

Examples
quarterDay(
	date(
		"2020-04-20"
	)
)

quarterDay(
	timestamp(
		"2020-02-01 01:03:05"
	)
)
quarterEndDate
Usage

quarterEndDate(date_timestamp exp)

Returns the last date of the quarter for the given date or timestamp.

Examples
quarterEndDate(
	date(
		"2020-05-20"
	)
)

quarterEndDate(
	timestamp(
		"2020-07-01 11:13:15"
	)
)
quarterMonth
Usage

quarterMonth()

Returns the month of the quarter from the beginning of the quarter for the current date.

quarterMonth
Usage

quarterMonth(date_timestamp exp)

Returns the month of the quarter from the beginning of the quarter for the given date or timestamp.

Examples
quarterMonth(
	date(
		"2020-04-20"
	)
)

quarterMonth(
	timestamp(
		"2020-02-01 01:03:05"
	)
)
quarterStartDate
Usage

quarterStartDate(date_timestamp exp)

Returns the first date of the quarter for the given date or timestamp.

Examples
quarterStartDate(
	date(
		"2020-05-20"
	)
)

quarterStartDate(
	timestamp(
		"2020-07-01 11:13:15"
	)
)
quarterWeek
Usage

quarterWeek()

Returns the week of the quarter from the beginning of the quarter for the current date where the week begins on Sunday and ends on Saturday.

quarterWeek
Usage

quarterWeek(date_timestamp exp)

Returns the week of the quarter from the beginning of the quarter for the given date or timestamp where the week begins on Sunday and ends on Saturday.

Examples
quarterWeek(
	date(
		"2020-02-29"
	)
)

quarterWeek(
	timestamp(
		"2020-12-01 01:03:05"
	)
)
second
Usage

second()

Returns the seconds value of the current time as an integer from 0 to 59.

weekday
Usage

weekday()

Returns the day of the week as an integer for the current date, where the week begins on Sunday as day 1 and ends on Saturday as day 7.

weekEndDate
Usage

weekEndDate(date_timestamp exp)

Returns the last date of the week for the given date or timestamp where the week ends on Saturday and begins on Sunday.

Examples
weekEndDate(
	date(
		"2020-04-20"
	)
)

weekEndDate(
	timestamp(
		"2020-07-01 04:20:59"
	)
)
weeknum
Usage

weeknum()

Returns the week number of the current week for the current year where the week begins on Sunday and ends on Saturday.

weekStartDate
Usage

weekStartDate(date_timestamp exp)

Returns the first date of the week for the given date or timestamp where the week ends on Saturday and begins on Sunday.

Examples
weekStartDate(
	date(
		"2020-04-20"
	)
)

weekStartDate(
	timestamp(
		"2020-07-01 04:20:59"
	)
)
year
Usage

year()

Returns the year as an integer for the current year.

yearsBetween
Usage

yearsBetween(date_timestamp startDate, date_timestamp endDate)

Returns a double for the number of years between two dates or timestamps.

Examples
yearsBetween(
	date(
		"2019-04-20"
	),
	date(
		"2020-04-19"
	)
)

yearsBetween(
	timestamp(
		"2019-04-20 04:20:00"
	),
	timestamp(
		"2020-04-19 04:19:00"
	)
)

Additional considerations for new Date functions

In this release, for a given table in a schema, you can define a Load Filter using the Formula Builder. Consider using the new date functions for a Load Filter.
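
For example, a load filter that keeps only rows for the current quarter could combine quarterStartDate and quarterEndDate. This is a sketch only; SALES.SALES.SALE_DATE is a hypothetical column, and the expression assumes comparison operators are available in the Formula Builder:

and(
    SALES.SALES.SALE_DATE >= quarterStartDate(),
    SALES.SALES.SALE_DATE <= quarterEndDate()
)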


Command Line Tools

There are no new features or enhancements for command line tools in this release.


Additional Improvements and Enhancements

In the 4.8 release, there are additional improvements and enhancements:

Enhanced Accessibility Support

For users with motor impairments, this release offers enhanced keyboard controls. Users can use the Tab keystroke to navigate through all screen elements, including Navigation bar tabs, icons, buttons, dropdown lists, and textboxes.

For visually impaired users, this release supports navigation using screen readers and accessibility voice controls. Audio cues specify the currently selected item with instructions on how to navigate.

Changes to the NetSuite Suite Analytics data source connection string

In this release, for a NetSuite Suite Analytics data source, a Connection String now consists of the following properties:

  • Host:Port
  • ServerDataSource
  • Encrypted
  • CustomProperties

Embedded IFRAME property in nodes.properties

In this release, you can configure whether a dashboard embedded in an IFRAME can navigate back to the parent folder or to the Content folder. To enable navigation to the parent folder or Content folder, you must edit the ~/IncortaAnalytics/IncortaNode/node.properties file for all Incorta Nodes that run Analytics Services in the cluster and add the following setting:

iframe.navigation.enabled=true


© Incorta, Inc. All Rights Reserved.