DataSet

```
public interface DataSet
```
A data set represents an extraction unit for a single table. It is the Incorta equivalent of a JDBC prepared statement.
During data extraction, a data set retrieves the data records from the data source.
During manual schema editing, a data set is also used to discover table columns.

Since:

1.0

Method Summary

| Modifier and Type | Method and Description |
| --- | --- |
| `void` | `cancelQuery()` This method is used during data extraction if the job is canceled while `queryData(...)` or `queryDataUpdates(...)` is running. |
| `com.incorta.io.Record.ColumnDef[]` | `discover()` This method is used during manual schema editing to discover table columns. |
| `default long` | `getTimeDifference()` This method should only be implemented when there is a data discrepancy caused by time zone difference. |
| `java.util.List<com.incorta.io.Record.RecordSet>` | `queryData(com.incorta.io.Record.ColumnDef[] columns)` This method is used during data extraction, during a full load job. |
| `java.util.List<com.incorta.io.Record.RecordSet>` | `queryDataUpdates(com.incorta.io.Record.ColumnDef[] columns, long lastUpdated)` This method is used during data extraction, during an incremental load job. |

- Method Detail
  - discover
```
com.incorta.io.Record.ColumnDef[] discover()
                                    throws ConnectorException
```
    This method is used during manual schema editing to discover table columns. For example, if a user edits a table query, this method is used to execute the query and return the columns of the result set.
    
    Returns:
    
    Array of Record.ColumnDef objects
    
    Throws:
    
    ConnectorException
    
    Since:
    
    1.0
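    
    As an illustration of the mapping described above, the following is a minimal sketch, not the SDK's implementation. The constructor of Record.ColumnDef is not shown in this reference, so `ColumnDefStub` is a hypothetical stand-in, and the query's result-set metadata is simulated with an ordered map instead of a live data source.
```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for com.incorta.io.Record.ColumnDef; the real class may differ.
final class ColumnDefStub {
    final String name;
    final String type;

    ColumnDefStub(String name, String type) {
        this.name = name;
        this.type = type;
    }
}

class DiscoverSketch {
    // Simulates discover(): converts ordered (column name -> type name) metadata,
    // as the result set of a table query might report it, into column definitions.
    static ColumnDefStub[] discover(Map<String, String> resultSetMetadata) {
        ColumnDefStub[] columns = new ColumnDefStub[resultSetMetadata.size()];
        int i = 0;
        for (Map.Entry<String, String> entry : resultSetMetadata.entrySet()) {
            columns[i++] = new ColumnDefStub(entry.getKey(), entry.getValue());
        }
        return columns;
    }

    public static void main(String[] args) {
        // LinkedHashMap preserves the column order of the simulated result set.
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("id", "long");
        metadata.put("name", "string");
        for (ColumnDefStub column : discover(metadata)) {
            System.out.println(column.name + " : " + column.type);
        }
    }
}
```
    A connector backed by JDBC would more likely read the column names and types from `java.sql.ResultSetMetaData` after executing the edited query.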
  - queryData
```
java.util.List<com.incorta.io.Record.RecordSet> queryData(com.incorta.io.Record.ColumnDef[] columns)
                                                   throws ConnectorException
```
    This method is used during data extraction, during a full load job. It is expected to return a list of Record.RecordSet objects. If this connector supports parallel extraction, it should return one record set for each parallel extraction thread. In a typical (sequential) data set implementation, the returned list should contain a single record set.
    
    Parameters:
    
    columns - Columns to be queried
    
    Returns:
    
    List of parallel Record.RecordSet objects, typically containing just one
    
    Throws:
    
    ConnectorException
    
    Since:
    
    1.0
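    
    The difference between the sequential and the parallel case can be sketched as follows. This is an illustrative example under stated assumptions: `RowBatch` is a hypothetical stand-in for Record.RecordSet, the `parallelism` parameter is invented for the sketch, and round-robin partitioning is only one possible way to divide the rows.
```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for com.incorta.io.Record.RecordSet; the real interface may differ.
final class RowBatch {
    final List<String> rows;

    RowBatch(List<String> rows) {
        this.rows = rows;
    }
}

class QueryDataSketch {
    // Returns one batch per extraction thread. With parallelism == 1 this is the
    // typical sequential case: a list containing a single record set.
    static List<RowBatch> queryData(List<String> sourceRows, int parallelism) {
        if (parallelism < 1) {
            throw new IllegalArgumentException("parallelism must be >= 1");
        }
        List<RowBatch> batches = new ArrayList<>();
        for (int p = 0; p < parallelism; p++) {
            batches.add(new RowBatch(new ArrayList<>()));
        }
        // Round-robin assignment keeps the batches roughly equal in size.
        for (int i = 0; i < sourceRows.size(); i++) {
            batches.get(i % parallelism).rows.add(sourceRows.get(i));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a", "b", "c");
        System.out.println(queryData(rows, 1).size()); // sequential: 1 record set
        System.out.println(queryData(rows, 2).size()); // parallel: 2 record sets
    }
}
```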
  - queryDataUpdates
```
java.util.List<com.incorta.io.Record.RecordSet> queryDataUpdates(com.incorta.io.Record.ColumnDef[] columns,
                                                                 long lastUpdated)
                                                          throws ConnectorException
```
    This method is used during data extraction, during an incremental load job. It is expected to return a list of Record.RecordSet objects. If this connector supports parallel extraction, it should return one record set for each parallel extraction thread. In a typical (sequential) data set implementation, the returned list should contain a single record set.
    
    Parameters:
    
    columns - Columns to be queried
    
    lastUpdated - The timestamp (as a Unix epoch in milliseconds) of the previous successful extraction
    
    Returns:
    
    List of parallel Record.RecordSet objects, typically containing just one
    
    Throws:
    
    ConnectorException
    
    Since:
    
    1.0
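    
    To make the role of lastUpdated concrete, here is a minimal sketch of an incremental filter. It rests on assumptions: `SourceRow` and its `modifiedAtMillis` field are hypothetical, an in-memory list stands in for the data source, and the strictly-greater comparison is one possible policy rather than documented SDK behavior.
```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical source row carrying its last-modified time as Unix epoch milliseconds.
final class SourceRow {
    final String key;
    final long modifiedAtMillis;

    SourceRow(String key, long modifiedAtMillis) {
        this.key = key;
        this.modifiedAtMillis = modifiedAtMillis;
    }
}

class IncrementalSketch {
    // Returns a single-element list (the sequential case) holding only the rows
    // modified after the previous successful extraction.
    static List<List<SourceRow>> queryDataUpdates(List<SourceRow> table, long lastUpdated) {
        List<SourceRow> changed = new ArrayList<>();
        for (SourceRow row : table) {
            // Assumption: strictly greater than; a real connector might use >= instead.
            if (row.modifiedAtMillis > lastUpdated) {
                changed.add(row);
            }
        }
        return Collections.singletonList(changed);
    }

    public static void main(String[] args) {
        List<SourceRow> table = Arrays.asList(
                new SourceRow("old", 1_000L),
                new SourceRow("new", 3_000L));
        List<List<SourceRow>> result = queryDataUpdates(table, 2_000L);
        System.out.println(result.get(0).get(0).key); // prints "new"
    }
}
```
    In a SQL-backed connector the same filter would normally be pushed into the query itself, for example as a bound parameter compared against the incremental column.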
  - cancelQuery
```
void cancelQuery()
          throws ConnectorException
```
    This method is used during data extraction if the job is canceled while queryData(...) or queryDataUpdates(...) is running. It notifies the DataSet object to terminate the running query, if possible.
    
    Throws:
    
    ConnectorException
    
    Since:
    
    1.0
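    
    One way to honor a cancellation request is a shared flag that the extraction loop checks between rows. The sketch below is a simplified illustration: `fetchRows` is a hypothetical stand-in for queryData(...), and the flag-based approach is an assumption, not the SDK's prescribed mechanism.
```java
import java.util.concurrent.atomic.AtomicBoolean;

class CancellableQuerySketch {
    // AtomicBoolean makes a cancellation issued from another thread visible
    // to the thread that is running the query.
    private final AtomicBoolean cancelled = new AtomicBoolean(false);

    // Hypothetical stand-in for queryData(...): reads up to totalRows rows and
    // stops early once cancellation is requested. Returns the number of rows read.
    int fetchRows(int totalRows) {
        int read = 0;
        while (read < totalRows && !cancelled.get()) {
            read++; // a real implementation would fetch one row from the source here
        }
        return read;
    }

    // Mirrors DataSet.cancelQuery(): asks the running query to stop if possible.
    void cancelQuery() {
        cancelled.set(true);
    }

    public static void main(String[] args) {
        CancellableQuerySketch query = new CancellableQuerySketch();
        System.out.println(query.fetchRows(5)); // prints 5
        query.cancelQuery();
        System.out.println(query.fetchRows(5)); // prints 0
    }
}
```
    For a JDBC-backed data set, calling `java.sql.Statement.cancel()` on the statement that is currently executing is the usual alternative, provided the driver supports it.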
  - getTimeDifference
```
default long getTimeDifference()
                        throws ConnectorException
```
    This method should only be implemented when there is a data discrepancy caused by time zone difference. Such a discrepancy can occur when the machine hosting Incorta and the machine hosting the data being extracted are in different time zones. The method returns a time offset (in milliseconds) that compensates for this difference.
    Note that a time zone difference does not automatically mean that a data discrepancy will occur or that this method must be implemented. It is needed only when the driver used to connect to the data source cannot handle the time zone difference correctly (for example, an Oracle database whose incremental column type is DATE or TIMESTAMP instead of TIMESTAMP WITH TIME ZONE).
    By default, this method returns 0.
    
    Returns:
    
    Time offset in milliseconds calculated as (data source time - Incorta time)
    
    Throws:
    
    ConnectorException
    
    Since:
    
    1.0
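    
    The offset calculation (data source time - Incorta time) can be illustrated with `java.time`. This is a sketch under assumptions: the helper name `timeDifferenceMillis` and its parameters are hypothetical, and it interprets the offset as the difference between the two machines' UTC offsets at a given instant.
```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

class TimeDifferenceSketch {
    // Computes (data source time - Incorta time) in milliseconds, taken here as
    // the difference between the two zones' UTC offsets at the given instant.
    static long timeDifferenceMillis(ZoneId dataSourceZone, ZoneId incortaZone, Instant at) {
        int sourceOffsetSeconds = dataSourceZone.getRules().getOffset(at).getTotalSeconds();
        int incortaOffsetSeconds = incortaZone.getRules().getOffset(at).getTotalSeconds();
        return (sourceOffsetSeconds - incortaOffsetSeconds) * 1000L;
    }

    public static void main(String[] args) {
        // Data source at UTC-05:00, Incorta at UTC: the source clock reads 5 hours behind.
        long offset = timeDifferenceMillis(ZoneOffset.ofHours(-5), ZoneOffset.UTC, Instant.EPOCH);
        System.out.println(offset); // prints -18000000
    }
}
```
    Passing an explicit instant matters for region-based zones, whose UTC offset changes with daylight saving time.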
