Class BeamPythonFunctionRunner

    • Field Detail

      • LOG

        protected static final org.slf4j.Logger LOG
      • resultBuffer

        @VisibleForTesting
        protected transient LinkedBlockingQueue<org.apache.flink.api.java.tuple.Tuple2<String,​byte[]>> resultBuffer
        Buffers the Python function execution result which has still not been processed.
      • mainInputReceiver

        @VisibleForTesting
        protected transient org.apache.beam.sdk.fn.data.FnDataReceiver<org.apache.beam.sdk.util.WindowedValue<byte[]>> mainInputReceiver
        The receiver which forwards the input elements to a remote environment for processing.
    • Constructor Detail

      • BeamPythonFunctionRunner

        public BeamPythonFunctionRunner​(org.apache.flink.runtime.execution.Environment environment,
                                        String taskName,
                                        ProcessPythonEnvironmentManager environmentManager,
                                        @Nullable
                                        FlinkMetricContainer flinkMetricContainer,
                                        @Nullable
                                        org.apache.flink.runtime.state.KeyedStateBackend<?> keyedStateBackend,
                                        @Nullable
                                        org.apache.flink.runtime.state.OperatorStateBackend operatorStateBackend,
                                        @Nullable
                                        org.apache.flink.api.common.typeutils.TypeSerializer<?> keySerializer,
                                        @Nullable
                                        org.apache.flink.api.common.typeutils.TypeSerializer<?> namespaceSerializer,
                                        @Nullable
                                        TimerRegistration timerRegistration,
                                        org.apache.flink.runtime.memory.MemoryManager memoryManager,
                                        double managedMemoryFraction,
                                        FlinkFnApi.CoderInfoDescriptor inputCoderDescriptor,
                                        FlinkFnApi.CoderInfoDescriptor outputCoderDescriptor,
                                        Map<String,​FlinkFnApi.CoderInfoDescriptor> sideOutputCoderDescriptors)
    • Method Detail

      • open

        public void open​(org.apache.flink.configuration.ReadableConfig config)
                  throws Exception
        Description copied from interface: PythonFunctionRunner
        Prepares the Python function runner, such as preparing the Python execution environment, etc.
        Specified by:
        open in interface PythonFunctionRunner
        Throws:
        Exception
      • startBundle

        @VisibleForTesting
        protected void startBundle()
      • pollResult

        public org.apache.flink.api.java.tuple.Tuple3<String,​byte[],​Integer> pollResult()
                                                                                             throws Exception
        Description copied from interface: PythonFunctionRunner
        Retrieves the Python function result.
        Specified by:
        pollResult in interface PythonFunctionRunner
        Returns:
        the head of he Python function result buffer, or null if the result buffer is empty. f0 means the byte array buffer which stores the Python function result. f1 means the length of the Python function result byte array.
        Throws:
        Exception
      • takeResult

        public org.apache.flink.api.java.tuple.Tuple3<String,​byte[],​Integer> takeResult()
                                                                                             throws Exception
        Description copied from interface: PythonFunctionRunner
        Retrieves the Python function result, waiting if necessary until an element becomes available.
        Specified by:
        takeResult in interface PythonFunctionRunner
        Returns:
        the head of he Python function result buffer. f0 means the byte array buffer which stores the Python function result. f1 means the length of the Python function result byte array.
        Throws:
        Exception
      • flush

        public void flush()
                   throws Exception
        Description copied from interface: PythonFunctionRunner
        Forces to finish the processing of the current bundle of elements. It will flush the data cached in the data buffer for processing and retrieves the state mutations (if exists) made by the Python function. The call blocks until all of the outputs produced by this bundle have been received.
        Specified by:
        flush in interface PythonFunctionRunner
        Throws:
        Exception
      • notifyNoMoreResults

        public void notifyNoMoreResults()
        Interrupts the progress of takeResult.
      • buildTransforms

        protected abstract void buildTransforms​(org.apache.beam.model.pipeline.v1.RunnerApi.Components.Builder componentsBuilder)
      • getTimers

        protected abstract List<org.apache.beam.runners.core.construction.graph.TimerReference> getTimers​(org.apache.beam.model.pipeline.v1.RunnerApi.Components components)
      • getOptionalTimerCoderProto

        protected abstract Optional<org.apache.beam.model.pipeline.v1.RunnerApi.Coder> getOptionalTimerCoderProto()
      • createJobBundleFactory

        @VisibleForTesting
        public org.apache.beam.runners.fnexecution.control.JobBundleFactory createJobBundleFactory​(org.apache.beam.vendor.grpc.v1p60p1.com.google.protobuf.Struct pipelineOptions)
                                                                                            throws Exception
        Throws:
        Exception