Class HashJoinOperator
- java.lang.Object
-
- org.apache.flink.streaming.api.operators.AbstractStreamOperator<OUT>
-
- org.apache.flink.table.runtime.operators.TableStreamOperator<org.apache.flink.table.data.RowData>
-
- org.apache.flink.table.runtime.operators.join.HashJoinOperator
-
- All Implemented Interfaces:
Serializable,org.apache.flink.api.common.state.CheckpointListener,org.apache.flink.streaming.api.operators.BoundedMultiInput,org.apache.flink.streaming.api.operators.InputSelectable,org.apache.flink.streaming.api.operators.KeyContext,org.apache.flink.streaming.api.operators.KeyContextHandler,org.apache.flink.streaming.api.operators.SetupableStreamOperator<org.apache.flink.table.data.RowData>,org.apache.flink.streaming.api.operators.StreamOperator<org.apache.flink.table.data.RowData>,org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.CheckpointedStreamOperator,org.apache.flink.streaming.api.operators.TwoInputStreamOperator<org.apache.flink.table.data.RowData,org.apache.flink.table.data.RowData,org.apache.flink.table.data.RowData>
public abstract class HashJoinOperator extends TableStreamOperator<org.apache.flink.table.data.RowData> implements org.apache.flink.streaming.api.operators.TwoInputStreamOperator<org.apache.flink.table.data.RowData,org.apache.flink.table.data.RowData,org.apache.flink.table.data.RowData>, org.apache.flink.streaming.api.operators.BoundedMultiInput, org.apache.flink.streaming.api.operators.InputSelectable
Hash join base operator.The join operator implements the logic of a join operator at runtime. It uses a hybrid-hash-join internally to match the records with equal key. The build side of the hash is the first input of the match. It support all join type in
HashJoinType.Note: In order to solve the problem of data skew, or too much data in the hash table, the fallback to sort merge join mechanism is introduced here. If some partitions are spilled to disk more than three times in the process of hash join, it will fallback to sort merge join by default to improve stability. In the future, we will support more flexible adaptive hash join strategy, for example, in the process of building a hash table, if the size of data written to disk reaches a certain threshold, fallback to sort merge join in advance.
- See Also:
- Serialized Form
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.flink.table.runtime.operators.TableStreamOperator
TableStreamOperator.ContextImpl
-
-
Field Summary
-
Fields inherited from class org.apache.flink.table.runtime.operators.TableStreamOperator
ctx, currentWatermark
-
-
Method Summary
All Methods Static Methods Instance Methods Abstract Methods Concrete Methods Modifier and Type Method Description voidclose()voidendInput(int inputId)abstract voidjoin(RowIterator<org.apache.flink.table.data.binary.BinaryRowData> buildIter, org.apache.flink.table.data.RowData probeRow)static HashJoinOperatornewHashJoinOperator(HashJoinType type, boolean leftIsBuild, boolean compressionEnable, int compressionBlockSize, GeneratedJoinCondition condFuncCode, boolean reverseJoinFunction, boolean[] filterNullKeys, GeneratedProjection buildProjectionCode, GeneratedProjection probeProjectionCode, boolean tryDistinctBuildRow, int buildRowSize, long buildRowCount, long probeRowCount, org.apache.flink.table.types.logical.RowType keyType, SortMergeJoinFunction sortMergeJoinFunction)org.apache.flink.streaming.api.operators.InputSelectionnextSelection()voidopen()voidprocessElement1(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<org.apache.flink.table.data.RowData> element)voidprocessElement2(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<org.apache.flink.table.data.RowData> element)-
Methods inherited from class org.apache.flink.table.runtime.operators.TableStreamOperator
computeMemorySize, processWatermark
-
Methods inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
finish, getChainingStrategy, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getTimeServiceManager, getUserCodeClassloader, hasKeyContext1, hasKeyContext2, initializeState, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, notifyCheckpointComplete, prepareSnapshotPreBarrier, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processRecordAttributes, processRecordAttributes1, processRecordAttributes2, processWatermark1, processWatermark2, processWatermarkStatus, processWatermarkStatus1, processWatermarkStatus2, reportOrForwardLatencyMarker, setChainingStrategy, setCurrentKey, setKeyContextElement1, setKeyContextElement2, setProcessingTimeService, setup, snapshotState, snapshotState
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.flink.api.common.state.CheckpointListener
notifyCheckpointAborted, notifyCheckpointComplete
-
Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContext
getCurrentKey, setCurrentKey
-
Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContextHandler
hasKeyContext
-
-
-
-
Method Detail
-
open
public void open() throws Exception- Specified by:
openin interfaceorg.apache.flink.streaming.api.operators.StreamOperator<org.apache.flink.table.data.RowData>- Overrides:
openin classTableStreamOperator<org.apache.flink.table.data.RowData>- Throws:
Exception
-
processElement1
public void processElement1(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<org.apache.flink.table.data.RowData> element) throws Exception- Specified by:
processElement1in interfaceorg.apache.flink.streaming.api.operators.TwoInputStreamOperator<org.apache.flink.table.data.RowData,org.apache.flink.table.data.RowData,org.apache.flink.table.data.RowData>- Throws:
Exception
-
processElement2
public void processElement2(org.apache.flink.streaming.runtime.streamrecord.StreamRecord<org.apache.flink.table.data.RowData> element) throws Exception- Specified by:
processElement2in interfaceorg.apache.flink.streaming.api.operators.TwoInputStreamOperator<org.apache.flink.table.data.RowData,org.apache.flink.table.data.RowData,org.apache.flink.table.data.RowData>- Throws:
Exception
-
nextSelection
public org.apache.flink.streaming.api.operators.InputSelection nextSelection()
- Specified by:
nextSelectionin interfaceorg.apache.flink.streaming.api.operators.InputSelectable
-
endInput
public void endInput(int inputId) throws Exception- Specified by:
endInputin interfaceorg.apache.flink.streaming.api.operators.BoundedMultiInput- Throws:
Exception
-
join
public abstract void join(RowIterator<org.apache.flink.table.data.binary.BinaryRowData> buildIter, org.apache.flink.table.data.RowData probeRow) throws Exception
- Throws:
Exception
-
close
public void close() throws Exception- Specified by:
closein interfaceorg.apache.flink.streaming.api.operators.StreamOperator<org.apache.flink.table.data.RowData>- Overrides:
closein classorg.apache.flink.streaming.api.operators.AbstractStreamOperator<org.apache.flink.table.data.RowData>- Throws:
Exception
-
newHashJoinOperator
public static HashJoinOperator newHashJoinOperator(HashJoinType type, boolean leftIsBuild, boolean compressionEnable, int compressionBlockSize, GeneratedJoinCondition condFuncCode, boolean reverseJoinFunction, boolean[] filterNullKeys, GeneratedProjection buildProjectionCode, GeneratedProjection probeProjectionCode, boolean tryDistinctBuildRow, int buildRowSize, long buildRowCount, long probeRowCount, org.apache.flink.table.types.logical.RowType keyType, SortMergeJoinFunction sortMergeJoinFunction)
-
-