Interface BatchCreator<RequestEntryT extends Serializable>

  • Type Parameters:
    RequestEntryT - the type of the request entries to be batched
    All Known Implementing Classes:
    SimpleBatchCreator

    @PublicEvolving
    public interface BatchCreator<RequestEntryT extends Serializable>
    A pluggable interface for forming batches of request entries from a buffer. Implementations control how many entries are grouped together and in what manner before sending them downstream.

    The AsyncSinkWriter (or similar sink component) calls createNextBatch(RequestInfo, RequestBuffer) (RequestInfo, Deque)} when it decides to flush or otherwise gather a new batch of elements. For instance, a batch creator might limit the batch by the number of elements, total payload size, or any custom partition-based strategy.

    • Method Detail

      • createNextBatch

        Batch<RequestEntryT> createNextBatch​(RequestInfo requestInfo,
                                             RequestBuffer<RequestEntryT> bufferedRequestEntries)
        Creates the next batch of request entries based on the provided RequestInfo and the currently buffered entries.

        This method is expected to:

        • Mutate the bufferedRequestEntries by polling/removing elements from it.
        • Return a batch containing the selected entries.

        Thread-safety note: This method is called from flush(), which is executed on the Flink main thread. Implementations should assume single-threaded access and must not be shared across subtasks.

        Contract: Implementations must ensure that any entry removed from bufferedRequestEntries is either added to the returned batch or properly handled (e.g., retried or logged), and not silently dropped.

        Parameters:
        requestInfo - information about the desired request properties or constraints (e.g., an allowed batch size or other relevant hints)
        bufferedRequestEntries - a collection ex: Deque of all currently buffered entries waiting to be grouped into batches
        Returns:
        a Batch containing the new batch of entries along with metadata about the batch (e.g., total byte size, record count)