Optimize Performance
You can optimize performance in one of the following ways:
-
Load schemas into memory - Using this option, users can load all of their data into Incorta’s memory. This can be done from any table’s definition page, by disabling the Performance Optimized option.
- Load schemas partially into memory - using the Load Filter.
- Loading schemas in the staging area - This option This can be done by either Loading tables in the staging area or Selecting tables within a schema to load into memory.
Performance Testing
To determine benchmarks for various instances, Incorta ran two tests (a simple aggregated query against the number of records in the table), six times, on two different AWS machines using the following parameters:
- Turned off the cache so no data caching occured.
- Tested on one table with 2 or 3 billion records.
- Source data was in a .csv file
- Test did not change based on the AWS instance type
- Query: Group revenue by year and month
Test One
- Log into Incorta.
- Select Schema with one table and 2 billion records for the table.
- Run an aggregated query to show revenue by year and month.
Test Two
- Log into Incorta.
- Select Schema with one table and 3 billion records for the table.
- Run an aggregated query to show revenue by year and month.
The following table displays the results of the benchmark tests. The number of GB is the size of the data after it has been ingested into Incorta.
Machine types | 2 Billion Records (29.8 GB) | 3 Billion Records (41.91 GB) |
---|---|---|
AWS Win Machine m4-16x Large 256 GB RAM i-79f9fe61 | 2.46 to 2.58 seconds | 4.43 to 4.51 seconds |
AWS Linux Machine x1-32x-large | 1.28 to 1.39 seconds | 1.89 to 2.14 seconds |
Load tables in the staging area
Users can control whether to load schema tables in Incorta’s memory or the staging area, by toggling the Performance Optimized button in any table’s definition page.
To disable performance optimization and load tables in the staging area:
-
Navigate to “Schema” page.
- Select a schema.
-
Select the down arrow in the upper right-hand corner corresponding to a table.
- Select “Table Details” from the drop-down menu.
-
Disable the Performance Optimized option in the upper-right hand corner of the screen.
- Select Done, in the upper right-hand corner of the screen.
- Select Load, and choose “Full Load”. Upon successful loading, tables with disabled performance optimization get loaded into the staging area. Those tables with enabled performance optimization get loaded into Incorta’s memory.
Selecting tables within a schema to load into memory
You can load one or more tables in a schema to the staging area by using the settings menu. Loading to staging allows users to query only the needed data to be loaded into Incorta’s memory.
To select/deselect tables to load to the staging area:
- Navigate to the “Schema” page.
-
Select the schema containing the tables you of interest.
-
Select the settings menu (gear icon) in the upper right-hand corner of the screen. The schema settings window opens.
-
Deselect or search for the tables to be unloaded from memor, e.g. “Customers”. Tables in the schema will be shown selected by default as they are loaded to the engine memory.
-
Select Save Changes. A confirmation window appears.
- Select Confirm Changes.
Create a schema containing query tables
If you load your data into the staging area, you have the option to run a query and load only the returned data to Incorta’s memory. To do that, create a schema based on query tables, and load it.
Create a query table
Create a table based on the data returned by a query.
-
Choose a database schema from the left-hand side of the schema wizard, e.g. “Sales”.
-
Select one or more tables to add to your schema. Each table you add appears in the middle section of the page.
-
Select the ”+“icon in the upper right-hand corner of the schema wizard window, to create a custom query table.
-
Provide a “Table Name” and a query in the empty field.
- Select Execute.
Edit an existing query in a schema
To edit a custom query table in a schema, the following example will be used to edit a table querying the revenue per product:
-
Choose a database schema from the left-hand side of the schema wizard, e.g. “Sales”.
- Select the icon in the upper right-hand corner of the schema wizard window.
- Edit query.
- Select Execute.
Entity Relationship Diagram
Incorta Analytics provides a convenient way to show the relationship diagram of the tables used within a specific schema. Follow the steps below to learn more:
-
Navigate to the “Schemas” tab in the “Schema” page.
-
Select a schema to see its ERD by clicking on its row, not the checkbox. The schema definition page opens.
-
Select Diagram in the upper right-hand corner of the screen. Select List to exit the diagram view.
-
Add the table keys to the diagram, by enabling “Show Keys” in the settings menu at the top.
-
Add the table columns to the diagram, by enabling “Show Columns” in the settings menu at the top.
-
Change the diagram orientation, by enabling selecting the desired orientation in the settings menu at the top.
Edit schema settings from the schema page
You can edit a schema settings from the “Schemas” tab in the “Schema” page. To access it:
-
Navigate to the “Schemas” tab in the “Schema” page.
-
Select the checkbox corresponding to one schema only to edit its settings. The “Actions” menu appears in the upper right-hand corner of the screen. Note that selecting more than one checkbox affects the available items in the “Actions” menu.
-
Select “Actions” in the upper right-hand corner of the screen.
-
Select “Schema settings” from the drop-down menu. The “Schema Settings” window appears.
- Select Save Changes.
Edit schema settings from the schema definition page
You can edit a schema settings in the definition page of that schema:
-
Navigate to the “Schemas” tab in the “Schema” section.
-
Select any schema to access its definition, by clicking on its row, not the checkbox. The schema definition page opens.
-
Select the gear (settings) icon in the upper right-hand corner of the screen. The “Schema Settings” window appears.
- Select Save Changes.
Add or Edit a schema description
In the schema settings window, you can add/edit a schema description.
To add or edit a schema description, simply use the following steps:
- Select the “Settings” tab.
- Add or edit the schema description in the “Schema Description” field.
-
Select Save Changes.
Share a schema with other users and groups
In the schema settings window, you can easily share a schema with other users and groups, and set their permissions on that schema.
To share a schema with other users and groups, simply use the following steps:
-
Select the “Sharing” tab.
-
Select ”+” in the upper right-hand corner to add users/groups. The search bar appears with the least privilege (i.e. “Can View”) selected by default.
-
Enter a name, email address, or group name to share the schema with. Once you start typing, suggestions will be displayed to help you choose.
-
Select a user/group. The permissions menu appears.
- Select a permission for the selected user/group.
-
Select Save. Note that each permission includes the one on top of it (e.g. if a user has the “Can Share” permission, they will automatically have the “Can View” permission). The sharing wizard closes, returning back to the “Shared With” list, showing all the users along with their permission level.
-
Select the permission (in blue) corresponding to a user to either change their permission level, or remove them and stop sharing the schema with them.
- Select a privilege and select Save, or Cancel to ignore the changes.
Set the loading sequence for a schema table
When loading a schema, you can easily set the extraction sequence for the tables within that schema. This can be done by giving each table a number, and they will be extracted in an ascending order according to that number. For example, if there are five tables; two of which share the same load order of 1 and the rest have the load order of 2, the two tables of the order 1 will be extracted in parallel first. After loading is successful, the rest of the tables will follow.
To set the tables extraction sequence, simply use the following steps:
-
Select the “Tables load order” tab. The tables list appears with the corresponding extraction sequence number, defaulted at 10.
- Select in the “Table Load Order” column corresponding to any table and enter the extraction order for that table. The default extraction order for all tables is 10, meaning that they will all be extracted in parallel. The number 10 is set so that you can easily set other tables to be extracted first. Note that tables having the same load order will be extracted in parallel.
- Select Reset Tables Load Order to reset any values you may have changed back to the default value, i.e. 10.
- Select Save Changes.
Edit Schema Structure
You can use the schema definition page to edit a schema structure and the underlying tables for that schema.
To access a schema definition page:
- Go the “Schema” page.
- Select a schema (not the checkbox) to edit its structure. The schema definition page opens.
Add a Table using the Schema Wizard
To add tables using the schema wizard, select + New in the upper right-hand corner of the screen, and select “Schema Wizard”. Run the wizard and add the tables from the panel on the left-hand side.
Add a Table Manually
-
Select “Table” from the + New drop-down menu. A submenu appears.
-
Select a type for the new table, the “Data Source” window opens at the top of the “New Table” page. For xlsx, csv, tsv, tab and txt files, select “File System”.
- Specify a data source or data file.
- Turn on the “Incremental” switch button (where available) to enable incremental loading, in which only new data is loaded from the server. If you do not enable incremental loading, this table will always be loaded fully when data is loaded or refreshed. This can result in long loading times.
-
Specify the datasource parameters:
- For a SQL datasource, defined the SQL query in the “Query” field. If you have selected incremental loads, define an update query in the “Update Query” field, e.g. “where <LastUpdateDate>>?”, given that a Change Data Capture column exists within the table. Note that ”?” refers to the last successful load start time.
- For CSV files: Specify the character set, thousands separator and date format. If you have enabled incremental loading, specify which CSV file to use for the incremental changes.
- For Excel XLSX files: Select a Worksheet. If no Worksheet is specified, the first one will be used. If you have enabled incremental loading, specify which worksheet contains the incremental changes (the delta from the last version loaded). For password-protected Excel files, ensure that this feature is enabled in the admin UI under “Tenant Configurations”. Users will need to enter the password when creating a table based on this file.
- For SalesForce data, define an SOQL query. If you have selected incremental loads, define an update query.
- Fetch Size is 300 rows by default; leave this value unset unless you are tuning the performance.
- For query-based tables, add or delete columns as needed by modifying the SQL or SOQL (SalesForce) query and then clicking “Update Columns.”
- Customize columns in your table by modifying the Label, Type and Dimension as needed.
-
Scroll down to add any of the following:
- Formula Columns.
- Runtime Security Filters.
- Parent and Child Joins.