Class MultiInputSortingDataInput<IN,​K>

  • All Implemented Interfaces:
    Closeable, AutoCloseable, org.apache.flink.runtime.io.AvailabilityProvider, PushingAsyncDataInput<IN>, StreamTaskInput<IN>

    public final class MultiInputSortingDataInput<IN,​K>
    extends Object
    implements StreamTaskInput<IN>
    An input that wraps an underlying input and sorts the incoming records. It starts emitting records downstream only when all the other inputs coupled with this MultiInputSortingDataInput have finished sorting as well.

    Moreover it will report it is available or approximately available if it has some records pending only if the head of the MultiInputSortingDataInput.CommonContext.getQueueOfHeads() belongs to the input. That way there is only ever one input that reports it is available.

    The sorter uses binary comparison of keys, which are extracted and serialized when received from the chained input. Moreover the timestamps of incoming records are used for secondary ordering. For the comparison it uses either FixedLengthByteKeyComparator if the length of the serialized key is constant, or VariableLengthByteKeyComparator otherwise.

    Watermarks, watermark statuses, nor latency markers are propagated downstream as they do not make sense with buffered records. The input emits the largest watermark seen after all records.

    • Method Detail

      • wrapInputs

        public static <K> MultiInputSortingDataInput.SelectableSortingInputs wrapInputs​(org.apache.flink.runtime.jobgraph.tasks.TaskInvokable containingTask,
                                                                                        StreamTaskInput<Object>[] sortingInputs,
                                                                                        org.apache.flink.api.java.functions.KeySelector<Object,​K>[] keySelectors,
                                                                                        org.apache.flink.api.common.typeutils.TypeSerializer<Object>[] inputSerializers,
                                                                                        org.apache.flink.api.common.typeutils.TypeSerializer<K> keySerializer,
                                                                                        StreamTaskInput<Object>[] passThroughInputs,
                                                                                        org.apache.flink.runtime.memory.MemoryManager memoryManager,
                                                                                        org.apache.flink.runtime.io.disk.iomanager.IOManager ioManager,
                                                                                        boolean objectReuse,
                                                                                        double managedMemoryFraction,
                                                                                        org.apache.flink.configuration.Configuration taskManagerConfiguration,
                                                                                        org.apache.flink.api.common.ExecutionConfig executionConfig)
      • prepareSnapshot

        public CompletableFuture<Void> prepareSnapshot​(org.apache.flink.runtime.checkpoint.channel.ChannelStateWriter channelStateWriter,
                                                       long checkpointId)
        Description copied from interface: StreamTaskInput
        Prepares to spill the in-flight input buffers as checkpoint snapshot.
        Specified by:
        prepareSnapshot in interface StreamTaskInput<IN>
      • getAvailableFuture

        public CompletableFuture<?> getAvailableFuture()
        Specified by:
        getAvailableFuture in interface org.apache.flink.runtime.io.AvailabilityProvider