Class BinaryExternalSorter

  • All Implemented Interfaces:
    Closeable, AutoCloseable, org.apache.flink.runtime.operators.sort.Sorter<org.apache.flink.table.data.binary.BinaryRowData>, org.apache.flink.runtime.operators.util.CloseableInputProvider<org.apache.flink.table.data.binary.BinaryRowData>

    public class BinaryExternalSorter
    extends Object
    implements org.apache.flink.runtime.operators.sort.Sorter<org.apache.flink.table.data.binary.BinaryRowData>
    The BinaryExternalSorter is a full fledged sorter for binary format. It implements a multi-way merge sort. Internally, it has three asynchronous threads (sort, spill, merger) which communicate through a set of blocking circularQueues, forming a closed loop. Memory is allocated using the MemoryManager interface. Thus the component will not exceed the provided memory limits.
    • Constructor Detail

      • BinaryExternalSorter

        public BinaryExternalSorter​(Object owner,
                                    org.apache.flink.runtime.memory.MemoryManager memoryManager,
                                    long reservedMemorySize,
                                    org.apache.flink.runtime.io.disk.iomanager.IOManager ioManager,
                                    AbstractRowDataSerializer<org.apache.flink.table.data.RowData> inputSerializer,
                                    BinaryRowDataSerializer serializer,
                                    NormalizedKeyComputer normalizedKeyComputer,
                                    RecordComparator comparator,
                                    int maxNumFileHandles,
                                    boolean compressionEnabled,
                                    int compressionBlockSize,
                                    boolean asyncMergeEnabled)
      • BinaryExternalSorter

        public BinaryExternalSorter​(Object owner,
                                    org.apache.flink.runtime.memory.MemoryManager memoryManager,
                                    long reservedMemorySize,
                                    org.apache.flink.runtime.io.disk.iomanager.IOManager ioManager,
                                    AbstractRowDataSerializer<org.apache.flink.table.data.RowData> inputSerializer,
                                    BinaryRowDataSerializer serializer,
                                    NormalizedKeyComputer normalizedKeyComputer,
                                    RecordComparator comparator,
                                    int maxNumFileHandles,
                                    boolean compressionEnabled,
                                    int compressionBlockSize,
                                    boolean asyncMergeEnabled,
                                    float startSpillingFraction)
    • Method Detail

      • startThreads

        public void startThreads()
        Starts all the threads that are used by this sorter.
      • close

        public void close()
        Shuts down all the threads initiated by this sorter. Also releases all previously allocated memory, if it has not yet been released by the threads, and closes and deletes all channels (removing the temporary files).

        The threads are set to exit directly, but depending on their operation, it may take a while to actually happen. The sorting thread will for example not finish before the current batch is sorted. This method attempts to wait for the working thread to exit. If it is however interrupted, the method exits immediately and is not guaranteed how long the threads continue to exist and occupy resources afterwards.

        Specified by:
        close in interface AutoCloseable
        Specified by:
        close in interface Closeable
      • write

        public void write​(org.apache.flink.table.data.RowData current)
                   throws IOException
        Throws:
        IOException
      • write

        @VisibleForTesting
        public void write​(org.apache.flink.util.MutableObjectIterator<org.apache.flink.table.data.binary.BinaryRowData> iterator)
                   throws IOException
        Throws:
        IOException
      • getIterator

        public org.apache.flink.util.MutableObjectIterator<org.apache.flink.table.data.binary.BinaryRowData> getIterator()
                                                                                                                  throws InterruptedException
        Specified by:
        getIterator in interface org.apache.flink.runtime.operators.util.CloseableInputProvider<org.apache.flink.table.data.binary.BinaryRowData>
        Throws:
        InterruptedException
      • getUsedMemoryInBytes

        public long getUsedMemoryInBytes()
      • getNumSpillFiles

        public long getNumSpillFiles()
      • getSpillInBytes

        public long getSpillInBytes()