Concepts → Join Condition
About a join condition
A join condition defines the relationship between a physical schema entity object and itself (self-join) or the relationship between two entity objects: a Child table and a Parent table.
A join can have one or more conditions. A join condition is synonymous with the ON clause in a SQL Join statement. The Join Editor only supports Left Outer Joins. This means that a query between the objects will always return all the rows from the entity object on the left side of the join (that is, the Child tablet) and only the rows from the entity object on the right side of the join (that is, the Parent table) where there is a match, if any, between the join columns. The Loader Service evaluates the join and creates a Direct Data Map file in shared storage when saving the join details or when loading the related physical schema(s) or entity object(s). When executing a query that references the columns for the joined entity objects, the Analytics Service reads from memory the Direct Data Map for the join.
When you run a query that references columns from both the Child and Parent objects; rows from the Child object with no match in the Parent object will show null values for the columns from the Parent object.
About a matching row
The join conditions, along with the join filters, determine which rows to retrieve from the Parent object. A condition in a join constitutes an expression to specify one or all of the criteria of a matching row in the Parent object, while the join filter, if any, further narrows down the retrieved rows. A row in the Parent object has to satisfy all the join conditions and join filters in order to be included in the join.
Although both the join filter and join condition are synonymous with the ON clause in a Join SQL statement, the condition expression compares the values of the two columns in the join using the condition operator; where the filter compares the value of a column from the Parent object with a specified value, except for the First Version and Last Version operators which function differently.
The join condition context
You use the Join Editor to manage conditions for a given join. You have to be in a group with the Schema Manager or the SuperRole role to access the Schema Manager. You also need to either own the physical schema of the Child object or have Edit access rights to it in order that you can access the Join Editor and manage the physical schema joins.
In the case of a cross-schema join, you need to own both physical schemas, or at least have Edit access rights to the physical schema of the Child table and View access rights to the physical schema of the Parent table.
A join that you create between entity objects in different physical schemas shows in the Schema Designer for both physical schemas. You can only edit these joins by using the Join Editor for the physical schema that has the Child table in the join.
Cyclical joins between two entity objects are not supported, that is, two joins that share the same join columns while exchanging the Child and Parent objects in both joins. An error message appears when you add a condition that results in a cyclical join.To avoid cyclical joins between two entity objects, you can use an alias.
You can use the Schema Diagram Viewer to view the joins for entity objects in a physical schema and the join conditions for each join relationship. You can also access the Joins Editor to edit or view, as applicable, a given join relationship if you use the Schema Diagram Viewer for the physical schema that has the Child object.
The join condition syntax
A join condition expression consists of:
- a column from the Child object, typically a Foreign key that references a Primary Key in the Parent object; however, it can be any column with a supported data type
- a comparison operator
- a column in the Parent object, typically a Primary key of the same data type as the column from the Child object; however, it can be any column other than the Primary key
These three elements constitute the expression that evaluates which rows return true
(that is, matching rows) in the Parent object.
About the condition Child and Parent columns
For join condition columns in the Child or Parent sides, you specify the fully qualified name of the column by selecting the physical schema, entity object, and column. For a join condition, you can reference the following types of columns, whether for the Child side or the Parent side:
- Physcial_Schema.Table.Column
- Physcial_Schema.Table.Formula_Column
- Physcial_Schema.Incorta_Analyzer_Table.Column
- Physcial_Schema.Incorta_Analyzer_Table.Formula_Column
- Physcial_Schema.Materialized_View.Column
- Physcial_Schema.Materialized_View.Formula_Column
- Physcial_Schema.Alias.Column
- Physcial_Schema.Alias.Formula_Column
- Physcial_Schema.Incorta_SQL_Table.Column (including computed columns)
It is not recommended to reference a formula column in an entity object in another schema for the join condition column.
Do not select the Child and Parent columns from the same Incorta Analyzer table or Incorta SQL table. These tables do not support self joins. When you try to load an Incorta Analyzer table or Incorta SQL table with a self join, it will throw errors.
The following table shows the supported and recommended data types for columns that you can reference in a join condition for the Child or Parent side:
Data type | Supported | Recommended |
---|---|---|
Boolean (in the case of using formula columns that return a boolean value) | ✓ | ✓ |
Date | ✓ | ✓ |
Double | ✗ | |
Integer | ✓ | ✓ |
Long | ✓ | ✓ |
Null | ✗ | |
String | ✓ | ✗ |
Text | ✓ | ✗ |
Timestamp | ✓ | ✗ |
Although both join columns have to be of the same data type, a join condition allows for the join columns to have similar, but different data types such as an Integer and a Long. Implicit conversions between types may produce false or missed joins.
It is not recommended to use a text or string column for the join filter column as string comparisons can be slow for large data volumes. Also, using a timestamp column may result in inaccurate join results unless it is intended to match the exact timestamp value.
To learn more, review Concepts → Join Condition Column.
About the join condition operator
A join operator is a comparison operator used to compare values of the join columns. The following are the available operators to select from to compare between values in the join condition columns:
<
: less than≤
: less than or equal to=
: equal to≠
,!=
, or<>
: not equal to>
: greater than≥
: greater than or equal to
In addition to numeric and date scalar values, you can use comparison operators with string or text values. In such a case, the alphabetical order of the compared strings determines which is the greater value and which is the less value.
Types of join conditions
Incorta supports two different types of join conditions depending upon the operator you use in the join:
- Equijoin condition
- Range join condition
Equijoin condition
In an Equijoin condition, you match the values in the columns on equality. You use the Equal operator (=
) for the join condition. Typically, you use a Foreign key and a Primary Key for the join columns in the Child and Parent sides respectively.
At least one of the join conditions has to be an Equijoin. An Equijoin uses the equal operator (=
). Typically, the join is between a Foreign Key in the Child table and a Primary Key in the Parent table. In the case of an Equijoin condition, using a join Parent column without unique values may return random or inconsistent values from the Parent entity object.
The following is an example of an Equijoin condition:
sch_HR.tbl_employees.DepartmentID = sch_HR.tbl_departments.DepartmentD
Range join condition
In a range join condition, you use any of the comparison operators other than the Equal one (=
):
<
: less than≤
: less than or equal to≠
,!=
, or<>
: not equal to>
: greater than≥
: greater than or equal to
Typically, for a join with range condition, you need to create an Equijoin condition to keep data consistent.
The following is an example of a range condition:
PM.Delivery.DeliveryDate > PM.Projects.PlannedDeliveryDate
Examples of join conditions
The following table shows examples of join conditions and what a matching row in the Parent object will be when running a query that references columns from both entity objects assuming that no filters are added to the join:
Join Condition | Matching Rows |
---|---|
Sales.OrderDetails.OrderID = Sales.Orders.OrderID |
For each row in the OrderDetails table, retrieve only rows from the Orders table where the values of the OrderID columns in both tables are equal (match on equality) |
HR.Employees.ManagerID = HR.Employees.EmployeeID |
Two new columns appear in the Analyzer to construct the hierarchy of the table data: _LVL ): starting from 0, the level of the row in the hierarchy (_LEAF ): true or false , determine if the row is the last node in its path (true) or not (false) For each row in the Employees table show the value for the Level and Is Leaf columns. |
HR.Employees.ManagerID = HR.EmployeesAlias.EmployeeID |
For each row in the Employees table, retrieve only rows from the EmployeesAlias table where ManagerID in the Employees table is equal to the EmployeeID in the EmployeesAlias table. |
PM.Delivery.ProjectID = PM.Projects.ProjectID and PM.Delivery.DeliveryDate > PM.Projects.PlannedDeliveryDate |
For each row in the Delivery table, retrieve only rows from the Projects table where the value of ProjectID in both tables are equal and the DeliveryDate in the Delivery table is greater than the PlannedDeliveryDate in the Projects table. This shows overdue projects. |
HR.Employees.BranchID = HR.Branches.BranchID and HR.Employees.City <> HR.Branches.City |
For each row in the Employees table, retrieve only rows from the Branches table where the Branch’s City is not equal to the Employee’s City . That is, only employees working in a city other than the one that they live in will have data from the Branches table. |
Example of the Equivalent SQL Statement
The following is an example of the SQL statement equivalent to a join with one condition and a filter. Consider the sch_HR
physical schema with the tbl_employees
and tbl_departments
tables. A join created with the following condition and filter details:
Join Element | Details |
---|---|
Condition | sch_HR.tbl_employees.DepartmentID = sch_HR.tbl_departments.DepartmentD |
Filter | sch_HR.tbl_departments.DepartmentName IN ("HR","Sales") |
SQL Statement | SELECT sch_HR.tbl_employees.Name, sch_HR.tbl_departments.DepartmentName FROM sch_HR.tbl_employees LEFT OUTER JOIN sch_HR.tbl_departments ON sch_HR.tbl_departments.DepartmentName IN (“HR”,”Sales”) AND sch_HR.tbl_employees.DepartmentID = sch_HR.tbl_departments.DepartmentID |
Query Result | A list with all employees from the Child table (tbl_employees ). For each row in the Child table where the department is “HR ” or “Sales ”, the department name is retrieved from the Parent table (tbl_departments ); otherwise, the DepartmentName column will be Null. |
To learn more, review Concepts → Join.