Class DeduplicateFunctionHelper
- java.lang.Object
-
- org.apache.flink.table.runtime.operators.deduplicate.utils.DeduplicateFunctionHelper
-
public class DeduplicateFunctionHelper extends Object
Utility for deduplicate functions.
TODO: utilize the respective helper classes that inherit from an abstract deduplicate function helper in each deduplicate function.
-
-
Method Summary
static void checkInsertOnly(org.apache.flink.table.data.RowData currentRow)
Checks that the message is insert-only.

static void processFirstRowOnProcTime(org.apache.flink.table.data.RowData currentRow, org.apache.flink.api.common.state.ValueState<Boolean> state, org.apache.flink.util.Collector<org.apache.flink.table.data.RowData> out)
Processes element to deduplicate on keys with processing-time semantics, sends current element if it is first row.

static void processLastRowOnChangelog(org.apache.flink.table.data.RowData currentRow, boolean generateUpdateBefore, org.apache.flink.api.common.state.ValueState<org.apache.flink.table.data.RowData> state, org.apache.flink.util.Collector<org.apache.flink.table.data.RowData> out, boolean isStateTtlEnabled, RecordEqualiser equaliser)
Processes element to deduplicate on keys, sends current element as last row, retracts previous element if needed.

static void processLastRowOnChangelogWithFilter(FilterCondition.Context context, org.apache.flink.table.data.RowData currentRow, boolean generateUpdateBefore, org.apache.flink.api.common.state.ValueState<org.apache.flink.table.data.RowData> state, org.apache.flink.util.Collector<org.apache.flink.table.data.RowData> out, boolean isStateTtlEnabled, RecordEqualiser equaliser, FilterCondition filterCondition)

static void processLastRowOnProcTime(org.apache.flink.table.data.RowData currentRow, boolean generateUpdateBefore, boolean generateInsert, org.apache.flink.api.common.state.ValueState<org.apache.flink.table.data.RowData> state, org.apache.flink.util.Collector<org.apache.flink.table.data.RowData> out, boolean isStateTtlEnabled, RecordEqualiser equaliser)
Processes element to deduplicate on keys with processing-time semantics, sends current element as last row, retracts previous element if needed.

static boolean shouldKeepCurrentRow(org.apache.flink.table.data.RowData preRow, org.apache.flink.table.data.RowData currentRow, int rowtimeIndex, boolean keepLastRow)
Returns true if currentRow should be kept.

static void updateDeduplicateResult(boolean generateUpdateBefore, boolean generateInsert, org.apache.flink.table.data.RowData preRow, org.apache.flink.table.data.RowData currentRow, org.apache.flink.util.Collector<org.apache.flink.table.data.RowData> out)
Collects the updated result for a duplicate row.
-
-
-
Method Detail
-
processLastRowOnProcTime
public static void processLastRowOnProcTime(org.apache.flink.table.data.RowData currentRow, boolean generateUpdateBefore, boolean generateInsert, org.apache.flink.api.common.state.ValueState<org.apache.flink.table.data.RowData> state, org.apache.flink.util.Collector<org.apache.flink.table.data.RowData> out, boolean isStateTtlEnabled, RecordEqualiser equaliser) throws Exception
Processes element to deduplicate on keys with processing-time semantics, sends current element as last row, retracts previous element if needed.
- Parameters:
currentRow - latest row received by deduplicate function
generateUpdateBefore - whether an UPDATE_BEFORE message needs to be sent for updates
state - state of function, null if generateUpdateBefore is false
out - underlying collector
isStateTtlEnabled - whether state TTL is enabled
equaliser - the record equaliser used to compare RowData for equality
- Throws:
Exception
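The last-row behaviour described above can be illustrated with a minimal, self-contained sketch. It is not the Flink implementation: rows are plain strings, State is a hypothetical stand-in for ValueState, a List stands in for the Collector, a BiPredicate stands in for RecordEqualiser, and the generateInsert flag is omitted. Skipping an unchanged row when state TTL is off is an assumption about how isStateTtlEnabled and equaliser interact.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

final class LastRowProcTimeSketch {
    /** Hypothetical stand-in for a keyed ValueState holding one value. */
    static final class State<T> {
        private T value;
        T value() { return value; }
        void update(T newValue) { value = newValue; }
    }

    static void processLastRow(
            String currentRow,
            boolean generateUpdateBefore,
            State<String> state,
            List<String> out,
            boolean isStateTtlEnabled,
            BiPredicate<String, String> equaliser) {
        if (!generateUpdateBefore) {
            // Stateless path: no previous row is needed, forward the row as an update.
            out.add("+U " + currentRow);
            return;
        }
        String preRow = state.value();
        state.update(currentRow);
        // Assumption: without state TTL, an identical row carries no new information.
        if (!isStateTtlEnabled && preRow != null && equaliser.test(preRow, currentRow)) {
            return;
        }
        if (preRow == null) {
            out.add("+I " + currentRow); // first row seen for this key
        } else {
            out.add("-U " + preRow);     // retract the previous last row
            out.add("+U " + currentRow); // emit the new last row
        }
    }

    public static void main(String[] args) {
        State<String> state = new State<>();
        List<String> out = new ArrayList<>();
        processLastRow("a", true, state, out, false, String::equals);
        processLastRow("a", true, state, out, false, String::equals); // duplicate, skipped
        processLastRow("b", true, state, out, false, String::equals);
        System.out.println(out); // prints [+I a, -U a, +U b]
    }
}
```

The stateless branch accepts a null state, which matches the note that state is null if generateUpdateBefore is false.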
-
processLastRowOnChangelog
public static void processLastRowOnChangelog(org.apache.flink.table.data.RowData currentRow, boolean generateUpdateBefore, org.apache.flink.api.common.state.ValueState<org.apache.flink.table.data.RowData> state, org.apache.flink.util.Collector<org.apache.flink.table.data.RowData> out, boolean isStateTtlEnabled, RecordEqualiser equaliser) throws Exception
Processes element to deduplicate on keys, sends current element as last row, retracts previous element if needed.
Note: stateless mode is not supported yet, because it is not safe for Kafka tombstone messages, which don't contain the full content. This can be a future improvement if the downstream (e.g. sink) doesn't require full content for DELETE messages.
- Parameters:
currentRow - latest row received by deduplicate function
generateUpdateBefore - whether an UPDATE_BEFORE message needs to be sent for updates
state - state of function
out - underlying collector
- Throws:
Exception
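The note about tombstone messages explains why state is required here: a retraction may arrive without its full content, so the stored row is emitted instead. The following is a rough sketch of that idea only, under stated assumptions. Kind, State, and the string-based rows are hypothetical stand-ins, and the isStateTtlEnabled and equaliser parameters are left out.

```java
import java.util.ArrayList;
import java.util.List;

final class LastRowChangelogSketch {
    /** Hypothetical stand-in for the four changelog row kinds. */
    enum Kind { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }

    /** Hypothetical stand-in for a keyed ValueState holding the last row. */
    static final class State {
        String value;
    }

    static void process(
            Kind kind, String payload, boolean generateUpdateBefore, State state, List<String> out) {
        String preRow = state.value;
        if (kind == Kind.INSERT || kind == Kind.UPDATE_AFTER) {
            if (preRow == null) {
                out.add("+I " + payload);
            } else {
                if (generateUpdateBefore) {
                    out.add("-U " + preRow);
                }
                out.add("+U " + payload);
            }
            state.value = payload;
        } else if (preRow != null) {
            // Retraction: the incoming payload may be incomplete (e.g. a tombstone),
            // so the stored full row is emitted with the incoming kind.
            out.add((kind == Kind.DELETE ? "-D " : "-U ") + preRow);
            state.value = null;
        }
        // Assumption: a retraction for a key with no stored row is ignored.
    }

    public static void main(String[] args) {
        State state = new State();
        List<String> out = new ArrayList<>();
        process(Kind.INSERT, "a", true, state, out);
        process(Kind.UPDATE_AFTER, "b", true, state, out);
        process(Kind.DELETE, null, true, state, out); // tombstone without content
        System.out.println(out); // prints [+I a, -U a, +U b, -D b]
    }
}
```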
-
processLastRowOnChangelogWithFilter
public static void processLastRowOnChangelogWithFilter(FilterCondition.Context context, org.apache.flink.table.data.RowData currentRow, boolean generateUpdateBefore, org.apache.flink.api.common.state.ValueState<org.apache.flink.table.data.RowData> state, org.apache.flink.util.Collector<org.apache.flink.table.data.RowData> out, boolean isStateTtlEnabled, RecordEqualiser equaliser, FilterCondition filterCondition) throws Exception
- Throws:
Exception
-
processFirstRowOnProcTime
public static void processFirstRowOnProcTime(org.apache.flink.table.data.RowData currentRow, org.apache.flink.api.common.state.ValueState<Boolean> state, org.apache.flink.util.Collector<org.apache.flink.table.data.RowData> out) throws Exception
Processes element to deduplicate on keys with processing-time semantics, sends current element if it is first row.
- Parameters:
currentRow - latest row received by deduplicate function
state - state of function
out - underlying collector
- Throws:
Exception
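As an illustration of the first-row semantics, the sketch below keeps a Boolean flag per key and emits a row only while that flag is unset. This is a simplified model, not the actual method body: State is a hypothetical stand-in for ValueState<Boolean>, rows are strings, and a List replaces the Collector.

```java
import java.util.ArrayList;
import java.util.List;

final class FirstRowSketch {
    /** Hypothetical stand-in for a keyed ValueState holding one value. */
    static final class State<T> {
        private T value;
        T value() { return value; }
        void update(T newValue) { value = newValue; }
    }

    static void processFirstRow(String currentRow, State<Boolean> state, List<String> out) {
        if (state.value() != null) {
            return; // a row was already emitted for this key, drop the duplicate
        }
        state.update(Boolean.TRUE);
        out.add(currentRow);
    }

    public static void main(String[] args) {
        State<Boolean> state = new State<>();
        List<String> out = new ArrayList<>();
        processFirstRow("a", state, out);
        processFirstRow("b", state, out);
        System.out.println(out); // prints [a]
    }
}
```

Only a flag is stored rather than the row itself, since the first row never needs to be retracted in this model.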
-
updateDeduplicateResult
public static void updateDeduplicateResult(boolean generateUpdateBefore, boolean generateInsert, org.apache.flink.table.data.RowData preRow, org.apache.flink.table.data.RowData currentRow, org.apache.flink.util.Collector<org.apache.flink.table.data.RowData> out)
Collects the updated result for a duplicate row.
- Parameters:
generateUpdateBefore - flag to generate UPDATE_BEFORE message or not
generateInsert - flag to generate INSERT message or not
preRow - previous row under the key
currentRow - current row under the key which is the duplicate row
out - underlying collector
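One plausible reading of how the two flags shape the output is sketched below. The branching is an assumption derived from the parameter descriptions, not a copy of the method body. Rows are strings prefixed with a change marker (+I, -U, +U), and a List stands in for the Collector.

```java
import java.util.ArrayList;
import java.util.List;

final class UpdateResultSketch {
    static void emit(
            boolean generateUpdateBefore,
            boolean generateInsert,
            String preRow,
            String currentRow,
            List<String> out) {
        // Assumption: a first row becomes an INSERT only when either flag asks for it.
        if (preRow == null && (generateUpdateBefore || generateInsert)) {
            out.add("+I " + currentRow);
            return;
        }
        // Retract the previous row only when UPDATE_BEFORE messages are requested.
        if (preRow != null && generateUpdateBefore) {
            out.add("-U " + preRow);
        }
        out.add("+U " + currentRow);
    }

    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        emit(true, false, null, "a", out);
        emit(true, false, "a", "b", out);
        System.out.println(out); // prints [+I a, -U a, +U b]
    }
}
```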
-
shouldKeepCurrentRow
public static boolean shouldKeepCurrentRow(org.apache.flink.table.data.RowData preRow, org.apache.flink.table.data.RowData currentRow, int rowtimeIndex, boolean keepLastRow)
Returns true if currentRow should be kept.
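The decision can be pictured as a comparison of the rowtime fields of the two rows. The sketch below is a guess at the rule based on the parameter names alone: a row is modelled as a long[] whose element at rowtimeIndex is its rowtime, and the handling of ties (a later row wins when keeping the last row, the stored row wins when keeping the first) is an assumption.

```java
final class KeepRowSketch {
    // Hypothetical stand-in: a row is a long[] with its rowtime at rowtimeIndex.
    static boolean shouldKeep(long[] preRow, long[] currentRow, int rowtimeIndex, boolean keepLastRow) {
        if (preRow == null) {
            return true; // nothing stored yet, the current row is kept
        }
        long preTime = preRow[rowtimeIndex];
        long currentTime = currentRow[rowtimeIndex];
        // Keep-last: accept rows that are not older than the stored one.
        // Keep-first: accept only rows that are strictly older than the stored one.
        return keepLastRow ? preTime <= currentTime : currentTime < preTime;
    }

    public static void main(String[] args) {
        long[] stored = {5L};
        System.out.println(shouldKeep(stored, new long[] {7L}, 0, true));  // prints true
        System.out.println(shouldKeep(stored, new long[] {7L}, 0, false)); // prints false
    }
}
```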
-
checkInsertOnly
public static void checkInsertOnly(org.apache.flink.table.data.RowData currentRow)
Checks that the message is insert-only.
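A minimal sketch of such a guard follows. The Kind enum is a hypothetical stand-in for the row's change kind, and throwing IllegalArgumentException is an assumption, since the signature above declares no exception type.

```java
final class InsertOnlySketch {
    /** Hypothetical stand-in for the four changelog row kinds. */
    enum Kind { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }

    static void checkInsertOnly(Kind kind) {
        if (kind != Kind.INSERT) {
            // Assumption: a non-insert input is reported as an illegal argument.
            throw new IllegalArgumentException("Expected an INSERT message but got " + kind);
        }
    }

    public static void main(String[] args) {
        checkInsertOnly(Kind.INSERT); // passes silently
        try {
            checkInsertOnly(Kind.DELETE);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints Expected an INSERT message but got DELETE
        }
    }
}
```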
-
-