Class AbstractBytesHashMap<K>

  • Direct Known Subclasses:
    BytesHashMap, WindowBytesHashMap

    public abstract class AbstractBytesHashMap<K>
    extends BytesMap<K,​org.apache.flink.table.data.binary.BinaryRowData>
    Bytes based hash map. It can be used for performing aggregations where the aggregated values are fixed-width, because the data is stored in continuous memory, AggBuffer of variable length cannot be applied to this HashMap. The KeyValue form in hash map is designed to reduce the cost of key fetching in lookup. The memory is divided into two areas:

    Bucket area: pointer + hashcode.

    • Bytes 0 to 4: a pointer to the record in the record area
    • Bytes 4 to 8: key's full 32-bit hashcode

    Record area: the actual data in linked list records, a record has four parts:

    • Bytes 0 to 4: len(k)
    • Bytes 4 to 4 + len(k): key data
    • Bytes 4 + len(k) to 8 + len(k): len(v)
    • Bytes 8 + len(k) to 8 + len(k) + len(v): value data

    BytesHashMap are influenced by Apache Spark BytesToBytesMap.

    • Field Detail

      • keySerializer

        protected final PagedTypeSerializer<K> keySerializer
        Used to serialize map key into RecordArea's MemorySegments.
    • Constructor Detail

      • AbstractBytesHashMap

        public AbstractBytesHashMap​(Object owner,
                                    org.apache.flink.runtime.memory.MemoryManager memoryManager,
                                    long memorySize,
                                    PagedTypeSerializer<K> keySerializer,
                                    org.apache.flink.table.types.logical.LogicalType[] valueTypes)
      • AbstractBytesHashMap

        public AbstractBytesHashMap​(Object owner,
                                    org.apache.flink.runtime.memory.MemoryManager memoryManager,
                                    long memorySize,
                                    PagedTypeSerializer<K> keySerializer,
                                    int valueArity)
    • Method Detail

      • getNumKeys

        public long getNumKeys()
        Description copied from class: BytesMap
        Returns the number of keys in this map.
        Specified by:
        getNumKeys in class BytesMap<K,​org.apache.flink.table.data.binary.BinaryRowData>
      • append

        public org.apache.flink.table.data.binary.BinaryRowData append​(BytesMap.LookupInfo<K,​org.apache.flink.table.data.binary.BinaryRowData> lookupInfo,
                                                                       org.apache.flink.table.data.binary.BinaryRowData value)
                                                                throws IOException
        Append an value into the hash map's record area.
        Returns:
        An BinaryRowData mapping to the memory segments in the map's record area belonging to the newly appended value.
        Throws:
        EOFException - if the map can't allocate much more memory.
        IOException
      • getNumSpillFiles

        public long getNumSpillFiles()
        Overrides:
        getNumSpillFiles in class BytesMap<K,​org.apache.flink.table.data.binary.BinaryRowData>
      • getUsedMemoryInBytes

        public long getUsedMemoryInBytes()
      • getSpillInBytes

        public long getSpillInBytes()
        Overrides:
        getSpillInBytes in class BytesMap<K,​org.apache.flink.table.data.binary.BinaryRowData>
      • getNumElements

        public long getNumElements()
        Overrides:
        getNumElements in class BytesMap<K,​org.apache.flink.table.data.binary.BinaryRowData>
      • getEntryIterator

        public KeyValueIterator<K,​org.apache.flink.table.data.binary.BinaryRowData> getEntryIterator​(boolean requiresCopy)
        Returns an iterator for iterating over the entries of this map.
      • getRecordAreaMemorySegments

        public ArrayList<org.apache.flink.core.memory.MemorySegment> getRecordAreaMemorySegments()
        Returns:
        the underlying memory segments of the hash map's record area
      • getBucketAreaMemorySegments

        public List<org.apache.flink.core.memory.MemorySegment> getBucketAreaMemorySegments()
      • free

        public void free()
        release the map's record and bucket area's memory segments.
      • free

        public void free​(boolean reservedRecordMemory)
        Overrides:
        free in class BytesMap<K,​org.apache.flink.table.data.binary.BinaryRowData>
        Parameters:
        reservedRecordMemory - reserved fixed memory or not.
      • reset

        public void reset()
        reset the map's record and bucket area's memory segments for reusing.
        Overrides:
        reset in class BytesMap<K,​org.apache.flink.table.data.binary.BinaryRowData>