Concepts → Load Filter

About a load filter

A load filter is a filter expression that reduces the numbers of rows that the Analytics Service loads in to memory for a given physical schema table or materialized view from shared storage. A load filter does not affect how the Loader Service extracts data from a data source into shared storage.

A schema developer can define a load filter for a physical schema table or a materialized view. A schema developer uses the Formula Builder to create the filter expression for the load filter. A load filter only is applicable for performance optimized tables that are in the memory of the Analytics Service.

Warning: the SQL interface may ignore the Load Filter

A query executed via the SQL interface may run as an Apache Spark job using the SQLApp. Apache Spark reads from Apache Parquet files in shared storage.

Important: Full Load

After creating, modifying, or removing a load filter, you must perform a full load of the physical schema table or the materialized view.

About a filter expression for a load filter

An example of a filter expression is to return only the rows where the Product Category column has “Electronics” as a row value. In this regard, a filter expression consists of:

  • a filterable column, e.g. Product Category
  • a filter operator, e.g. equals
  • a filter value, e.g. “Electronics”

The filter expression context for a load filter typically consists of one or more filterable columns from the parent object which is either a physical schema table or a materialized view. Using the Formula Builder, you can specify a complex filter expression using mathematical operations (*,\,+,-) or certain types of built-in functions such lower(), inList(), and daysBetween().

Note

It is possible to specify a formula expression for a load filter such as 1=0. As a load filter, a formula expression will evaluate as a filter expression. A filter expression evaluates which rows return true. In the case of 1=0, the result is always false for all rows.

About a filterable column for a load filter

Both a physical schema table and a materialized view support the creation of a load filter. A filterable column for a load filter is a data-backed column from the parent object. A filterable column can be a key, measure, or dimension.

A formula column cannot be referenced in a the filter expression of a load filter. The filterable column must be either physical schema table column or a formula expression that includes one or more physical schema table columns. A filterable column cannot be a physical schema table formula column.

The filter expression requires only the column name for filterable column. Here are some examples:

Column Name
User
PROD_CATEGORY
Hire_Date
Important

The filter expression syntax for a load filter does not support a fully qualified, three-part name such as physical_schema_name.table_name.column_name. The Formula Builder will show a one-part name in red and a three-part name in violet. However, for a load filter, the Formula Builder will not throw a validation error for a one-part name, but will for a three-part name.

About a filter operator for a load filter

A filter operator defines the operation for the filter expression. The available operators depends on the data type of the filterable column. Here are equality and relational operators:

Operator Date Timestamp Integer Long Double String Text
Equals
Does Not Equal
Greater Than
Greater Than or Equals
Less Than
Less Than or Equals

A load filter also supports formula expressions that reference filterable columns. This means that you can specify a formula expression such as a mathematical operation (*,\,+,-) or a formula expression that requires using built-in functions such lower(), upper(), daysBetween(), and isNull.

A load filter supports equality and relational operators as well as the following categories of built-in functions:

  • Boolean functions
  • Conditional statements
  • Conversion functions
  • Arithmetic functions
  • Date functions
  • String functions

For a filterable column of the type double, the equality and relational operators may not function as desired. A double represents a floating point. Consider using various built-in arithmetic functions to best approximate equality comparisons.

For a filterable column of the type string or text, equality and relational operators are case sensitive. You must use the same case and the full string in order to create a valid comparison. Consider using various built-in String functions such as lower() or upper() to compare the values in a filterable column with a string literal.

Boolean functions return true or false values. A filter expression with a boolean functions exposes the filterable column within the expression itself.

About a filter value for a load filter

A filter value can be one of the following:

Examples
User = 1001
LIST_PRICE > = STANDARD_COST
inList(PROD_ID, "501,502,503")
month(TIME_ID) = $currentMonth

The following variables are not supported in filter expression for a load filter:

Examples of filter expression for a load filter

Here are some examples of a filter expression for a load filter:

Examples
User = 1001
lower(PROD_CATEGORY) = "Electronics"
month(Hire_Date) >= date("03/03/2020")
(PROD_LIST_PRICE - PROD_COST) * 2 = 200
daysBetween($currentDate,Time_ID) <= 900

Equals

The Equals != operator works for all data types. The Equals operator returns the rows where the data is equal to a specified value.

Operator Syntax
= FILTERABLE_COLUMN = FILTER_VALUE
Examples
User = 1001
PROD_CATEGORY != "Photo"
Hire_Date = date("03/03/2020")

Not Equals

The Not Equals != operator works for all data types. The Not Equals operator returns the rows where the data is not equal to a specified value.

Operator Syntax
!= FILTERABLE_COLUMN != FILTER_VALUE
Examples
User != 1001
PROD_CATEGORY != "Photo"
Hire_Date != date("03/03/2020")

Greater Than

The Greater Than > relational operator works for all data types. The Greater Than operator returns the rows where the data is greater than a specified value.

Operator Syntax
> FILTERABLE_COLUMN < FILTER_VALUE
Examples
User > 1001
PROD_CATEGORY > "Photo"
Hire_Date > date("03/03/2020")

Greater Than or Equals To

The Greater Than or Equals To >= is an equality and relational operator that works for all data types. The Greater Than or Equals To operator returns the rows where the data is greater than or equal to a specified value.

Operator Syntax
>= FILTERABLE_COLUMN >= FILTER_VALUE
Examples
User >= 1001
PROD_CATEGORY >= "Photo"
Hire_Date >= date("03/03/2020")

Less Than

The Less Than < relational operator works for all data types. The Less Than operator returns the rows where the data is less than a specified value.

Operator Syntax
< FILTERABLE_COLUMN < FILTER_VALUE
Examples
User < 1001
PROD_CATEGORY < "Photo"
Hire_Date < date("03/03/2020")

Less Than or Equals To

The Less Than or Equals To <= is an equality and relational operator that works for all data types. The Less Than or Equals To operator returns the rows where the data is less than or equal to a specified value.

Operator Syntax
<= FILTERABLE_COLUMN <= FILTER_VALUE
Examples
User <= 1001
PROD_CATEGORY <= "Photo"
Hire_Date <= date("03/03/2020")

Boolean and()

The boolean and() operator requires the evaluation of two boolean expressions. In order for the and() operator to apply the inferred filter expression, both conditional expression must return true.

Operator Syntax
and() and(EXP1 with FILTERABLE_COLUMN, EXP2 with FILTERABLE_COLUMN)
Examples
and(PROD_CATEGORY = "Electronics", PROD_LIST_PRICE < 500 )

Boolean or()

The boolean or() operator requires the evaluation of two boolean expressions. The or() operator applies the inferred filter expression for all conditional expressions that return true.

Operator Syntax
or() or(EXP1 with FILTERABLE_COLUMN, EXP2 with FILTERABLE_COLUMN)

© Incorta, Inc. All Rights Reserved.