Class SampleInPartition<T>

  • Type Parameters:
    T - The type of input data
    All Implemented Interfaces:
    Serializable, org.apache.flink.api.common.functions.Function, org.apache.flink.api.common.functions.MapPartitionFunction<T, IntermediateSampleData<T>>, org.apache.flink.api.common.functions.RichFunction

    @Internal
    public class SampleInPartition<T>
    extends org.apache.flink.api.common.functions.RichMapPartitionFunction<T, IntermediateSampleData<T>>
    SampleInPartition wraps the partition-side sampling logic, which is the first phase of the distributed sampling algorithm, and runs it inside a mapPartition function.
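    A minimal sketch of what a partition-side sampling phase without replacement can look like, assuming a weight-based scheme: each element is given a random weight and the partition keeps the numSample elements with the largest weights. The class PartitionSampleSketch and its Weighted type are hypothetical stand-ins (Weighted plays the role of IntermediateSampleData<T>). The code uses only the JDK and is not the Flink implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Random;

// Hypothetical, JDK-only sketch; not the source of Flink's SampleInPartition.
final class PartitionSampleSketch {

    // Stand-in for IntermediateSampleData<T>: an element plus its random weight.
    static final class Weighted<T> {
        final double weight;
        final T element;

        Weighted(double weight, T element) {
            this.weight = weight;
            this.element = element;
        }
    }

    // Phase one: keep the numSample elements of this partition with the largest weights.
    static <T> List<Weighted<T>> sampleInPartition(Iterator<T> input, int numSample, long seed) {
        List<Weighted<T>> result = new ArrayList<>();
        if (numSample <= 0) {
            return result;
        }
        Random random = new Random(seed);
        // Min-heap on weight, so the head is the weakest candidate kept so far.
        PriorityQueue<Weighted<T>> heap = new PriorityQueue<>(
                numSample, Comparator.comparingDouble((Weighted<T> w) -> w.weight));
        while (input.hasNext()) {
            T element = input.next();
            double weight = random.nextDouble();
            if (heap.size() < numSample) {
                heap.add(new Weighted<>(weight, element));
            } else if (weight > heap.peek().weight) {
                // The new element outranks the weakest kept one, so swap them.
                heap.poll();
                heap.add(new Weighted<>(weight, element));
            }
        }
        result.addAll(heap);
        return result;
    }
}
```

    Emitting the weight together with the element is what would let a later phase merge the outputs of several partitions, for example by keeping the overall numSample largest weights. That merge step is an assumption about the second phase and is not shown here.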
    See Also:
    Serialized Form
    • Constructor Detail

      • SampleInPartition

        public SampleInPartition(boolean withReplacement,
                                 int numSample,
                                 long seed)
        Creates a SampleInPartition function instance.
        Parameters:
        withReplacement - Whether an element can be selected more than once.
        numSample - The fixed number of elements to sample.
        seed - Seed for the random number generator.
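        The effect of the three parameters can be illustrated with a small JDK-only sketch. SampleParamsSketch and its sample method are hypothetical names, and the two techniques used here (independent single-element reservoirs for sampling with replacement, classic reservoir sampling otherwise) are swapped in for illustration; they are not claimed to match what SampleInPartition does internally.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical, JDK-only sketch of the parameter semantics; not Flink code.
final class SampleParamsSketch {

    static <T> List<T> sample(Iterator<T> input, boolean withReplacement, int numSample, long seed) {
        // The same seed and input order always produce the same sample.
        Random random = new Random(seed);
        List<T> slots = new ArrayList<>(Math.max(numSample, 0));
        long seen = 0;
        while (input.hasNext()) {
            T element = input.next();
            seen++;
            if (withReplacement) {
                // numSample independent size-1 reservoirs: each slot takes the
                // current element with probability 1/seen, so repeats can occur.
                for (int i = 0; i < numSample; i++) {
                    if (seen == 1) {
                        slots.add(element);
                    } else if (random.nextDouble() < 1.0 / seen) {
                        slots.set(i, element);
                    }
                }
            } else if (slots.size() < numSample) {
                // Fill the reservoir with the first numSample elements.
                slots.add(element);
            } else {
                // Classic reservoir step: pick j uniformly in [0, seen) and
                // overwrite slot j if it falls inside the reservoir.
                long j = (long) (random.nextDouble() * seen);
                if (j < numSample) {
                    slots.set((int) j, element);
                }
            }
        }
        return slots;
    }
}
```

        With withReplacement set to false, the result never contains the same input position twice and is shorter than numSample when the input is too small. With it set to true, the result has numSample entries for any non-empty input, possibly with repeats.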