Interface StateBackend
-
- All Superinterfaces:
Serializable
- All Known Subinterfaces:
ConfigurableStateBackend,DelegatingStateBackend
- All Known Implementing Classes:
AbstractFileStateBackend,AbstractManagedMemoryStateBackend,AbstractStateBackend,FsStateBackend,HashMapStateBackend,MemoryStateBackend
@PublicEvolving public interface StateBackend extends Serializable
A State Backend defines how the state of a streaming application is stored locally within the cluster. Different State Backends store their state in different fashions, and use different data structures to hold the state of a running application.For example, the
hashmap state backendkeeps working state in the memory of the TaskManager. The backend is lightweight and without additional dependencies.The
EmbeddedRocksDBStateBackendstores working state in an embedded RocksDB and is able to scale working state to many terabytes in size, only limited by available disk space across all task managers.Raw Bytes Storage and Backends
The
StateBackendcreates services for keyed state and operator state.The
CheckpointableKeyedStateBackendandOperatorStateBackendcreated by this state backend define how to hold the working state for keys and operators. They also define how to checkpoint that state, frequently using the raw bytes storage (via theCheckpointStreamFactory). However, it is also possible that for example a keyed state backend simply implements the bridge to a key/value store, and that it does not need to store anything in the raw byte storage upon a checkpoint.Serializability
State Backends need to be
serializable, because they distributed across parallel processes (for distributed execution) together with the streaming application code.Because of that,
StateBackendimplementations (typically subclasses ofAbstractStateBackend) are meant to be like factories that create the proper states stores that provide access to the persistent storage and hold the keyed- and operator state data structures. That way, the State Backend can be very lightweight (contain only configurations) which makes it easier to be serializable.Thread Safety
State backend implementations have to be thread-safe. Multiple threads may be creating keyed-/operator state backends concurrently.
-
-
Method Summary
All Methods Instance Methods Abstract Methods Default Methods Modifier and Type Method Description <K> CheckpointableKeyedStateBackend<K>createKeyedStateBackend(Environment env, org.apache.flink.api.common.JobID jobID, String operatorIdentifier, org.apache.flink.api.common.typeutils.TypeSerializer<K> keySerializer, int numberOfKeyGroups, KeyGroupRange keyGroupRange, TaskKvStateRegistry kvStateRegistry, TtlTimeProvider ttlTimeProvider, org.apache.flink.metrics.MetricGroup metricGroup, Collection<KeyedStateHandle> stateHandles, org.apache.flink.core.fs.CloseableRegistry cancelStreamRegistry)Creates a newCheckpointableKeyedStateBackendthat is responsible for holding keyed state and checkpointing it.default <K> CheckpointableKeyedStateBackend<K>createKeyedStateBackend(Environment env, org.apache.flink.api.common.JobID jobID, String operatorIdentifier, org.apache.flink.api.common.typeutils.TypeSerializer<K> keySerializer, int numberOfKeyGroups, KeyGroupRange keyGroupRange, TaskKvStateRegistry kvStateRegistry, TtlTimeProvider ttlTimeProvider, org.apache.flink.metrics.MetricGroup metricGroup, Collection<KeyedStateHandle> stateHandles, org.apache.flink.core.fs.CloseableRegistry cancelStreamRegistry, double managedMemoryFraction)Creates a newCheckpointableKeyedStateBackendwith the given managed memory fraction.OperatorStateBackendcreateOperatorStateBackend(Environment env, String operatorIdentifier, Collection<OperatorStateHandle> stateHandles, org.apache.flink.core.fs.CloseableRegistry cancelStreamRegistry)Creates a newOperatorStateBackendthat can be used for storing operator state.default StringgetName()Return the name of this backend, default is simple class name.default booleansupportsNoClaimRestoreMode()Tells if a state backend supports theRestoreMode.NO_CLAIMmode.default booleansupportsSavepointFormat(org.apache.flink.core.execution.SavepointFormatType formatType)default booleanuseManagedMemory()Whether the state backend uses Flink's managed memory.
-
-
-
Method Detail
-
getName
default String getName()
Return the name of this backend, default is simple class name.DelegatingStateBackendmay return the simple class name of the delegated backend.
-
createKeyedStateBackend
<K> CheckpointableKeyedStateBackend<K> createKeyedStateBackend(Environment env, org.apache.flink.api.common.JobID jobID, String operatorIdentifier, org.apache.flink.api.common.typeutils.TypeSerializer<K> keySerializer, int numberOfKeyGroups, KeyGroupRange keyGroupRange, TaskKvStateRegistry kvStateRegistry, TtlTimeProvider ttlTimeProvider, org.apache.flink.metrics.MetricGroup metricGroup, @Nonnull Collection<KeyedStateHandle> stateHandles, org.apache.flink.core.fs.CloseableRegistry cancelStreamRegistry) throws Exception
Creates a newCheckpointableKeyedStateBackendthat is responsible for holding keyed state and checkpointing it.Keyed State is state where each value is bound to a key.
- Type Parameters:
K- The type of the keys by which the state is organized.- Parameters:
env- The environment of the task.jobID- The ID of the job that the task belongs to.operatorIdentifier- The identifier text of the operator.keySerializer- The key-serializer for the operator.numberOfKeyGroups- The number of key-groups aka max parallelism.keyGroupRange- Range of key-groups for which the to-be-created backend is responsible.kvStateRegistry- KvStateRegistry helper for this task.ttlTimeProvider- Provider for TTL logic to judge about state expiration.metricGroup- The parent metric group for all state backend metrics.stateHandles- The state handles for restore.cancelStreamRegistry- The registry to which created closeable objects will be registered during restore.- Returns:
- The Keyed State Backend for the given job, operator, and key group range.
- Throws:
Exception- This method may forward all exceptions that occur while instantiating the backend.
-
createKeyedStateBackend
default <K> CheckpointableKeyedStateBackend<K> createKeyedStateBackend(Environment env, org.apache.flink.api.common.JobID jobID, String operatorIdentifier, org.apache.flink.api.common.typeutils.TypeSerializer<K> keySerializer, int numberOfKeyGroups, KeyGroupRange keyGroupRange, TaskKvStateRegistry kvStateRegistry, TtlTimeProvider ttlTimeProvider, org.apache.flink.metrics.MetricGroup metricGroup, @Nonnull Collection<KeyedStateHandle> stateHandles, org.apache.flink.core.fs.CloseableRegistry cancelStreamRegistry, double managedMemoryFraction) throws Exception
Creates a newCheckpointableKeyedStateBackendwith the given managed memory fraction. Backends that use managed memory are required to implement this interface.- Throws:
Exception
-
createOperatorStateBackend
OperatorStateBackend createOperatorStateBackend(Environment env, String operatorIdentifier, @Nonnull Collection<OperatorStateHandle> stateHandles, org.apache.flink.core.fs.CloseableRegistry cancelStreamRegistry) throws Exception
Creates a newOperatorStateBackendthat can be used for storing operator state.Operator state is state that is associated with parallel operator (or function) instances, rather than with keys.
- Parameters:
env- The runtime environment of the executing task.operatorIdentifier- The identifier of the operator whose state should be stored.stateHandles- The state handles for restore.cancelStreamRegistry- The registry to register streams to close if task canceled.- Returns:
- The OperatorStateBackend for operator identified by the job and operator identifier.
- Throws:
Exception- This method may forward all exceptions that occur while instantiating the backend.
-
useManagedMemory
default boolean useManagedMemory()
Whether the state backend uses Flink's managed memory.
-
supportsNoClaimRestoreMode
default boolean supportsNoClaimRestoreMode()
Tells if a state backend supports theRestoreMode.NO_CLAIMmode.If a state backend supports
NO_CLAIMmode, it should create an independent snapshot when it receivesCheckpointType.FULL_CHECKPOINTinSnapshotable.snapshot(long, long, CheckpointStreamFactory, CheckpointOptions).- Returns:
- If the state backend supports
RestoreMode.NO_CLAIMmode.
-
supportsSavepointFormat
default boolean supportsSavepointFormat(org.apache.flink.core.execution.SavepointFormatType formatType)
-
-