Class SampleInCoordinator<T>

  • Type Parameters:
    T - the data type wrapped in ElementWithRandom as input.
    All Implemented Interfaces:
    Serializable, org.apache.flink.api.common.functions.Function, org.apache.flink.api.common.functions.GroupReduceFunction<IntermediateSampleData<T>,​T>

    @Internal
    public class SampleInCoordinator<T>
    extends Object
    implements org.apache.flink.api.common.functions.GroupReduceFunction<IntermediateSampleData<T>,​T>
    SampleInCoordinator wraps the sample logic of the coordinator side (the second phase of distributed sample algorithm). It executes the coordinator side sample logic in an all reduce function. The user needs to make sure that the operator parallelism of this function is 1 to make sure this is a central coordinator. Besides, we do not need the task index information for random generator seed as the parallelism must be 1.
    See Also:
    Serialized Form
    • Constructor Detail

      • SampleInCoordinator

        public SampleInCoordinator​(boolean withReplacement,
                                   int numSample,
                                   long seed)
        Create a function instance of SampleInCoordinator.
        Parameters:
        withReplacement - Whether element can be selected more than once.
        numSample - Fixed sample size.
        seed - Random generator seed.