Class AsyncSinkWriter<InputT,​RequestEntryT extends Serializable>

  • All Implemented Interfaces:
    AutoCloseable, org.apache.flink.api.connector.sink2.SinkWriter<InputT>, org.apache.flink.api.connector.sink2.StatefulSink.StatefulSinkWriter<InputT,​BufferedRequestState<RequestEntryT>>

    @PublicEvolving
    public abstract class AsyncSinkWriter<InputT,​RequestEntryT extends Serializable>
    extends Object
    implements org.apache.flink.api.connector.sink2.StatefulSink.StatefulSinkWriter<InputT,​BufferedRequestState<RequestEntryT>>
    A generic sink writer that handles the general behaviour of a sink such as batching and flushing, and allows extenders to implement the logic for persisting individual request elements, with allowance for retries.

    At least once semantics is supported through prepareCommit as outstanding requests are flushed or completed prior to checkpointing.

    Designed to be returned at createWriter time by an AsyncSinkBase.

    There are configuration options to customize the buffer size etc.

    • Method Detail

      • submitRequestEntries

        protected abstract void submitRequestEntries​(List<RequestEntryT> requestEntries,
                                                     java.util.function.Consumer<List<RequestEntryT>> requestToRetry)
        This method specifies how to persist buffered request entries into the destination. It is implemented when support for a new destination is added.

        The method is invoked with a set of request entries according to the buffering hints (and the valid limits of the destination). The logic then needs to create and execute the request asynchronously against the destination (ideally by batching together multiple request entries to increase efficiency). The logic also needs to identify individual request entries that were not persisted successfully and resubmit them using the requestToRetry callback.

        From a threading perspective, the mailbox thread will call this method and initiate the asynchronous request to persist the requestEntries. NOTE: The client must support asynchronous requests and the method called to persist the records must asynchronously execute and return a future with the results of that request. A thread from the destination client thread pool should complete the request and submit the failed entries that should be retried. The requestToRetry will then trigger the mailbox thread to requeue the unsuccessful elements.

        An example implementation of this method is included:

        {@code
        Parameters:
        requestEntries - a set of request entries that should be sent to the destination
        requestToRetry - the accept method should be called on this Consumer once the processing of the requestEntries are complete. Any entries that encountered difficulties in persisting should be re-queued through requestToRetry by including that element in the collection of RequestEntryTs passed to the accept method. All other elements are assumed to have been successfully persisted.
      • getSizeInBytes

        protected abstract long getSizeInBytes​(RequestEntryT requestEntry)
        This method allows the getting of the size of a RequestEntryT in bytes. The size in this case is measured as the total bytes that is written to the destination as a result of persisting this particular RequestEntryT rather than the serialized length (which may be the same).
        Parameters:
        requestEntry - the requestEntry for which we want to know the size
        Returns:
        the size of the requestEntry, as defined previously
      • flush

        public void flush​(boolean flush)
                   throws InterruptedException
        In flight requests will be retried if the sink is still healthy. But if in-flight requests fail after a checkpoint has been triggered and Flink needs to recover from the checkpoint, the (failed) in-flight requests are gone and cannot be retried. Hence, there cannot be any outstanding in-flight requests when a commit is initialized.

        To this end, all in-flight requests need to completed before proceeding with the commit.

        Specified by:
        flush in interface org.apache.flink.api.connector.sink2.SinkWriter<InputT>
        Throws:
        InterruptedException
      • snapshotState

        public List<BufferedRequestState<RequestEntryT>> snapshotState​(long checkpointId)
        All in-flight requests that are relevant for the snapshot have been completed, but there may still be request entries in the internal buffers that are yet to be sent to the endpoint. These request entries are stored in the snapshot state so that they don't get lost in case of a failure/restart of the application.
        Specified by:
        snapshotState in interface org.apache.flink.api.connector.sink2.StatefulSink.StatefulSinkWriter<InputT,​RequestEntryT extends Serializable>
      • getFatalExceptionCons

        protected java.util.function.Consumer<Exception> getFatalExceptionCons()