Server Configurations
Set the following options to configure your server. These options are available in the Cluster Management Console (CMC) at Cluster Configurations > Server Configurations.
Cluster Configurations
Configuration Property: Kafka Consumer Service Name
- Analytics Service Restart Required? No
- Loader Service Restart Required? Yes
- Description: Name of the loader service that acts as the Kafka consumer. Use the format <NODE_NAME>.<SERVICE_NAME>.
You must configure the name for the node running the Kafka loader service. Otherwise, the system automatically generates a loader node name, which can result in unexpected values. When you change the Kafka consumer from one loader service (A) to another loader service (B), you must restart the current loader service (A) first, and then restart loader service (B).
SQL Interface Configurations
The SQL Interface (SQLi) enables Incorta to act as a PostgreSQL database, so that users can access Incorta’s engine performance and features through other BI tools, such as Tableau.
Configuration Property: SQL interface port
- Analytics Service Restart Required? Yes
- Loader Service Restart Required? No
- Description: Provide the port number used to connect to the Incorta engine from other BI tools and run queries against the data loaded in memory. If the Incorta engine does not support a query, the query is automatically routed to Spark for execution. To bypass the Incorta engine and run queries with Spark directly against data in the staging area, use the “Data Store (DS) port” property.
Configuration Property: Data Store (DS) port
- Analytics Service Restart Required? Yes
- Loader Service Restart Required? No
- Description: Provide the port number to use for running queries directly using Spark against data loaded in the staging area.
Configuration Property: New Metadata handler port
- Analytics Service Restart Required? Yes
- Loader Service Restart Required? No
- Description: Specifies a port for an auxiliary process that resolves metadata for SQLi clients. Changing this port requires a restart of the Analytics Service.
Configuration Property: Enable SSL for SQL interface ports
- Analytics Service Restart Required? No
- Loader Service Restart Required? No
- Description:
Configuration Property: Enable Connection Pooling
- Analytics Service Restart Required? Yes
- Loader Service Restart Required? No
- Description: Enable this option to create a pool of open connections between external BI tools and Incorta Analytics. Enabling this option creates multiple connections that improve the query response time because Incorta does not create a new connection each time the SQL Interface needs data.
Configuration Property: Connection pool size
- Analytics Service Restart Required? Yes
- Loader Service Restart Required? No
- Description: Provide the number of SQL interface connections to keep available when executing queries from external BI tools. The appropriate value depends on the following factors: multithreading support in the external BI tools, query complexity, the Incorta host machine specifications, and available resources. If you set the value too high, machine resources are reserved without being utilized. If you set the value too low, query execution can slow down.
Configuration Property: External Tables CSV File Path
- Analytics Service Restart Required? Yes
- Loader Service Restart Required? No
- Description: If you use Spark Yarn, provide the CSV file path for the tables you use.
Configuration Property: Concurrency
- Analytics Service Restart Required? Yes
- Loader Service Restart Required? No
- Description: This property sets the number of metadata gathering processes that Incorta can run in parallel when executing queries against the Incorta engine.
Configuration Property: Default Schemas
- Analytics Service Restart Required? No
- Loader Service Restart Required? No
- Description: Provide a comma-separated list of schemas to search when a SQL query uses a non-qualified table name, that is, when the query does not specify a schema.
Configuration Property: Enable Cache
- Analytics Service Restart Required? No
- Loader Service Restart Required? No
- Description: Enable this option to cache repeated SQL operations and enhance the performance of executing queries, if there is enough available cache size.
Configuration Property: Cache size (In gigabytes)
- Analytics Service Restart Required? No
- Loader Service Restart Required? No
- Description: Set the maximum cache size per user for the data returned by SQLi queries. When this size is exceeded, the least recently used (LRU) data is evicted to free space for newer entries. The appropriate value depends on the available memory on the Incorta host server and the size of the result sets of common queries. For example, a result that is larger than this value is never cached, in which case it is recommended to increase the cache size.
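The eviction behavior described above can be illustrated with a minimal sketch. This is not Incorta's implementation; the class name `LruResultCache`, its methods, and the unit-less `size` argument are hypothetical and only model the two rules stated in the description: least recently used results are evicted first, and a result larger than the whole cache is never cached.

```python
from collections import OrderedDict


class LruResultCache:
    """Hypothetical size-bounded LRU cache; sizes use any consistent unit."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self._entries = OrderedDict()  # query text -> (result, size)

    def get(self, query):
        if query not in self._entries:
            return None
        self._entries.move_to_end(query)  # mark as most recently used
        return self._entries[query][0]

    def put(self, query, result, size):
        if size > self.capacity:
            return False  # larger than the whole cache: never cached
        if query in self._entries:
            # Replace an existing entry and release its space first.
            self.used -= self._entries.pop(query)[1]
        # Evict least recently used entries until the new result fits.
        while self.used + size > self.capacity:
            _, (_, evicted_size) = self._entries.popitem(last=False)
            self.used -= evicted_size
        self._entries[query] = (result, size)
        self.used += size
        return True
```

In this model, a `put` that returns `False` corresponds to the case where increasing the cache size would be the recommended fix.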
Configuration Property: Cached query result max size
- Analytics Service Restart Required? No
- Loader Service Restart Required? No
- Description: Set the maximum size of each cached query result. The size is measured as the table cell count, that is, the number of rows multiplied by the number of columns.
Configuration Property: Enable cache auto refresh
- Analytics Service Restart Required?
- Loader Service Restart Required?
- Description: Enable this option to automatically refresh the cache at specified intervals.
Spark Integration Configurations
The Incorta Unified Data Analytics Platform utilizes Spark to:
- Execute complex queries that are not yet supported by Incorta.
- Perform queries on the data residing in the staging area without loading it into Incorta memory.
Configuration Property: Spark master URL
- Analytics Service Restart Required? Yes
- Loader Service Restart Required? Yes
- Description: Provide the Spark Master connection string for the Apache Spark instance to execute materialized views (or SQL) queries. This option is required to connect to Apache Spark. You can access this info by navigating to the Spark host server UI (from any browser), using the following format:
<SPARK_HOST_SERVER>:<SPARK_PORT_NO>
Copy the Spark Master connection string (usually found in the top center of the UI), which has the format:
spark://<CONNECTION_STRING>:<SPARK_PORT_NO>
The default port number for Spark installed with Incorta is 7077.
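Before pasting the connection string into the CMC, it can help to check that it matches the spark://<host>:<port> shape. The following is a small sketch using only the Python standard library; the function name `parse_spark_master_url` is hypothetical and is not part of Incorta or Spark.

```python
from urllib.parse import urlparse


def parse_spark_master_url(url):
    """Return (host, port) for a spark://host:port string, or raise ValueError."""
    parsed = urlparse(url.strip())
    if parsed.scheme != "spark" or not parsed.hostname:
        raise ValueError(f"expected spark://<host>:<port>, got {url!r}")
    # Accessing .port raises ValueError for a non-numeric or out-of-range port.
    if parsed.port is None:
        raise ValueError(f"missing port in {url!r}")
    return parsed.hostname, parsed.port
```

For example, `parse_spark_master_url("spark://incorta-host:7077")` yields the host name and the port 7077, where `incorta-host` is a placeholder host name.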
Configuration Property: Enable SQL App
- Analytics Service Restart Required? Yes
- Loader Service Restart Required? No
- Description: The SQL App runs within Spark to handle all incoming SQLi queries. Enable this option to start the SQL App, and keep it up and running, to execute incoming SQL queries.
Configuration Property: SQL App driver memory
- Analytics Service Restart Required?
- Loader Service Restart Required?
- Description: Allocate the memory (in GB) that the SQL interface Spark application uses to construct (not calculate) the final results. Consult with the Spark admin to set this value.
Configuration Property: Spark App Cores
- Analytics Service Restart Required?
- Loader Service Restart Required?
- Description: Set the number of dedicated CPU cores for the SQLi Spark App. Ensure that enough cores in your setup remain reserved for the operating system, applications, and other services.
Configuration Property: Spark App Memory
- Analytics Service Restart Required?
- Loader Service Restart Required?
- Description: Provide the maximum memory that will be used by SQLi Spark queries, leaving extra memory for MVs if needed. The memory required for both applications combined cannot exceed the Worker Memory.
Configuration Property: SQL App executors
- Analytics Service Restart Required?
- Loader Service Restart Required?
- Description: Provide the maximum number of executors that can be spawned on a single worker. Each executor receives a share of the cores defined in the “SQL App cores” property and a share of the memory defined in the “SQL App memory” property. Every executor is assigned the same number of cores and the same amount of memory. The number of executors should therefore be a divisor of both the SQL App cores and the SQL App memory, and must be less than or equal to each of them. If it is not a divisor, some of the cores will not be utilized.
Example:
If the SQL App cores = 7 and the SQL App executors = 3, each executor takes 2 cores, and 1 core is not used. If the number of executors is greater than the SQL App cores, the number of executors is reduced to the number of cores, and each executor uses a single core (e.g., if the SQL App cores = 7 and the SQL App executors = 10, then 7 executors are created, each using a single core). If the number of executors is greater than the SQL App memory, the executors consume the assigned memory at 1 GB per executor (e.g., if the SQL App memory = 5 and the SQL App executors = 7, then 5 executors are created, with 1 GB each).
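The arithmetic in this example can be reproduced with a short sketch. The helper `split_evenly` is hypothetical and only models the rules stated above (equal shares per executor, at most one executor per core or per GB); it treats cores and memory independently and assumes whole-number shares, which may differ from how Incorta combines the two limits.

```python
def split_evenly(total, requested_executors):
    """Split `total` whole units (cores or GB) across executors.

    Returns (executors, units_per_executor, unused_units).
    """
    if total < 1 or requested_executors < 1:
        raise ValueError("total and requested_executors must be positive")
    executors = min(requested_executors, total)  # at most one executor per unit
    per_executor = total // executors            # identical share for every executor
    unused = total - executors * per_executor    # remainder that no executor uses
    return executors, per_executor, unused
```

Calling `split_evenly(7, 3)` models the first case (3 executors with 2 cores each and 1 unused core), and `split_evenly(7, 10)` models the case where the requested executors exceed the available cores.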
Configuration Property: SQL App shuffle partitions
- Analytics Service Restart Required?
- Loader Service Restart Required?
- Description: A single shuffle partition represents a block of data processed when executing joins and/or aggregations. The size of each shuffle partition grows as the processed data size increases. The optimal shuffle partition size is approximately 128 MB. Increase this value as the processed data size increases; doing so also increases CPU utilization. If a query operates on a trivial amount of data, a large number of partitions leads to small partitions, which can increase the query execution time due to the overhead of managing needless partitions. Conversely, too few partitions can cause a query to fail.
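As a rough starting point, the partition count can be estimated by dividing the expected size of the shuffled data by the 128 MB target. The sketch below is only a rule-of-thumb calculation under that assumption; the function name `estimate_shuffle_partitions` and the idea of estimating the shuffled data size up front are illustrative and not part of Incorta.

```python
import math

TARGET_PARTITION_MB = 128  # approximate optimal partition size stated above


def estimate_shuffle_partitions(shuffled_data_mb, target_mb=TARGET_PARTITION_MB):
    """Estimate a partition count so that each partition is about target_mb."""
    if shuffled_data_mb <= 0:
        return 1  # always keep at least one partition
    return max(1, math.ceil(shuffled_data_mb / target_mb))
```

Under this estimate, about 1 GB (1024 MB) of shuffled data maps to 8 partitions, while a query that shuffles only a few megabytes needs a single partition.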
Configuration Property: SQL App extra options
- Analytics Service Restart Required?
- Loader Service Restart Required?
- Description: Extra Spark options can be passed to the SQL interface Spark application. These options can be used to override the default configurations. Sample value:
spark.sql.shuffle.partitions=8;spark.executor.memory=4g;spark.driver.memory=4g
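To double-check a value before saving it, the sample format can be split into individual options. This sketch assumes, based only on the sample value above, that options are separated by semicolons and written as key=value; the function `parse_extra_options` is hypothetical.

```python
def parse_extra_options(value):
    """Parse 'key=value;key=value' into a dict, ignoring empty segments."""
    options = {}
    for segment in value.split(";"):
        segment = segment.strip()
        if not segment:
            continue  # tolerate a trailing or doubled semicolon
        key, sep, val = segment.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"malformed option: {segment!r}")
        options[key.strip()] = val.strip()
    return options
```

Parsing the sample value returns a dictionary with the three keys `spark.sql.shuffle.partitions`, `spark.executor.memory`, and `spark.driver.memory`.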
Configuration Property: Enable SQL App Dynamic Allocation
- Analytics Service Restart Required? Yes
- Loader Service Restart Required? No
- Description: This property controls the dynamic allocation of the Data Hub Spark application. If it is enabled, Spark will dynamically allocate executors depending on the workload. This is bounded by the resources assigned in other configurations (e.g. CPUs and memory per executor). When a query gets fired, it starts with one executor and dynamically generates others if needed. This option helps optimize resource utilization as it removes idle executors to save resources. If the workload increases, Spark claims them again.
Configuration Property: Spark App port
- Analytics Service Restart Required? Yes
- Loader Service Restart Required? No
- Description: This port is used by Incorta to connect to Spark and access the Data Hub.
Configuration Property: Spark App control channel port
- Analytics Service Restart Required?
- Loader Service Restart Required?
- Description: This port sends a shutdown signal to the Spark SQL app if the Incorta server needs to shut down Spark.
Configuration Property: Spark App fetch size
- Analytics Service Restart Required? Yes
- Loader Service Restart Required? No
- Description: This property sets the number of rows that Incorta fetches at a time from the Data Hub while aggregating the query result-set.
Configuration Property: SQL App spark home (Optional)
- Analytics Service Restart Required?
- Loader Service Restart Required?
- Description: Provide the file system path for the Apache Spark instance used to execute queries that are sent to either the Incorta engine or the Data Hub. If this option is not set, the SPARK_HOME environment variable is used instead. If that variable is not set either, the Spark home value of the Spark instance used for Incorta materialized views and compaction is used.
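The fallback order described above can be sketched as a simple lookup chain. The function and parameter names below (`resolve_spark_home`, `sql_app_spark_home`, `mv_spark_home`) are hypothetical and only illustrate the order of precedence; they do not reflect Incorta's internal code.

```python
import os


def resolve_spark_home(sql_app_spark_home=None, mv_spark_home=None, environ=None):
    """Return the first non-empty Spark home, following the documented order."""
    environ = os.environ if environ is None else environ
    candidates = (
        sql_app_spark_home,          # 1. the "SQL App spark home" property
        environ.get("SPARK_HOME"),   # 2. the SPARK_HOME environment variable
        mv_spark_home,               # 3. Spark home used for materialized views and compaction
    )
    for candidate in candidates:
        if candidate:
            return candidate
    return None
```

Passing `environ` explicitly makes the lookup easy to try out without changing the real environment.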
Configuration Property: SQL App Spark master URL (Optional)
- Analytics Service Restart Required?
- Loader Service Restart Required?
- Description: Provide the Spark master server URL for executing the SQLi queries sent to the Incorta engine or Data Store. If not entered, the shipped Spark master URL will be used instead.
Tuning Configurations
Configuration Property: Enable Parquet File Compression
- Analytics Service Restart Required? Yes
- Loader Service Restart Required? Yes
- Description: Toggle to enable Parquet file compression. Compression is required if you use Materialized Views.
UI Customizations Configurations
Configuration Property: Color Palette mode
- Analytics Service Restart Required? Yes
- Loader Service Restart Required? Yes
- Description: Choose a theme to change the color palette of Incorta insights and dashboards.
Diagnostics Configurations
Logging
- Analytics Service Restart Required? Yes
- Loader Service Restart Required? Yes
- Description: Set the logging level to specify the level of detail you want to see in the log files. The available logging levels are: OFF, SEVERE, WARNING, INFO, CONFIG, FINE, FINER, FINEST, and ALL.
NOTE
After making configuration changes on any page, you must select Save before navigating away from that page to avoid losing unsaved data.