Class TwoInputUdfOperator<IN1,IN2,OUT,O extends TwoInputUdfOperator<IN1,IN2,OUT,O>>
- java.lang.Object
-
- org.apache.flink.api.java.DataSet<OUT>
-
- org.apache.flink.api.java.operators.Operator<OUT,O>
-
- org.apache.flink.api.java.operators.TwoInputOperator<IN1,IN2,OUT,O>
-
- org.apache.flink.api.java.operators.TwoInputUdfOperator<IN1,IN2,OUT,O>
-
- Type Parameters:
IN1 - The data type of the first input data set.
IN2 - The data type of the second input data set.
OUT - The data type of the returned data set.
- All Implemented Interfaces:
UdfOperator<O>
- Direct Known Subclasses:
CoGroupOperator, CoGroupRawOperator, CrossOperator, JoinOperator
@Deprecated @Public public abstract class TwoInputUdfOperator<IN1,IN2,OUT,O extends TwoInputUdfOperator<IN1,IN2,OUT,O>> extends TwoInputOperator<IN1,IN2,OUT,O> implements UdfOperator<O>
Deprecated. All Flink DataSet APIs are deprecated since Flink 1.18 and will be removed in a future Flink major version. You can still build your application in DataSet, but you should move to the DataStream API and/or the Table API.
The TwoInputUdfOperator is the base class of all binary operators that execute user-defined functions (UDFs). The UDFs encapsulated by this operator are naturally UDFs that have two inputs (such as RichJoinFunction or RichCoGroupFunction).
This class encapsulates utilities for the UDFs, such as broadcast variables, parameterization through configuration objects, and semantic properties.
-
-
Field Summary
-
Fields inherited from class org.apache.flink.api.java.operators.Operator
minResources, name, parallelism, preferredResources
-
-
Method Summary
Modifier and Type | Method | Description
protected org.apache.flink.api.common.operators.DualInputSemanticProperties | extractSemanticAnnotationsFromUdf(Class<?> udfClass) | Deprecated.
protected boolean | getAnalyzedUdfSemanticsFlag() | Deprecated.
Map<String,DataSet<?>> | getBroadcastSets() | Deprecated. Gets the broadcast sets (name and data set) that have been added to context of the UDF.
protected abstract org.apache.flink.api.common.functions.Function | getFunction() | Deprecated.
org.apache.flink.configuration.Configuration | getParameters() | Deprecated. Gets the configuration parameters that will be passed to the UDF's open method RichFunction.open(OpenContext).
org.apache.flink.api.common.operators.DualInputSemanticProperties | getSemanticProperties() | Deprecated. Gets the semantic properties that have been set for the user-defined functions (UDF).
O | returns(Class<OUT> typeClass) | Deprecated. Adds a type information hint about the return type of this operator.
O | returns(org.apache.flink.api.common.typeinfo.TypeHint<OUT> typeHint) | Deprecated. Adds a type information hint about the return type of this operator.
O | returns(org.apache.flink.api.common.typeinfo.TypeInformation<OUT> typeInfo) | Deprecated. Adds a type information hint about the return type of this operator.
protected void | setAnalyzedUdfSemanticsFlag() | Deprecated.
void | setSemanticProperties(org.apache.flink.api.common.operators.DualInputSemanticProperties properties) | Deprecated. Sets the semantic properties for the user-defined function (UDF).
protected boolean | udfWithForwardedFieldsFirstAnnotation(Class<?> udfClass) | Deprecated.
protected boolean | udfWithForwardedFieldsSecondAnnotation(Class<?> udfClass) | Deprecated.
O | withBroadcastSet(DataSet<?> data, String name) | Deprecated. Adds a certain data set as a broadcast set to this operator.
O | withForwardedFieldsFirst(String... forwardedFieldsFirst) | Deprecated. Adds semantic information about forwarded fields of the first input of the user-defined function.
O | withForwardedFieldsSecond(String... forwardedFieldsSecond) | Deprecated. Adds semantic information about forwarded fields of the second input of the user-defined function.
O | withParameters(org.apache.flink.configuration.Configuration parameters) | Deprecated. Sets the configuration parameters for the UDF.
-
Methods inherited from class org.apache.flink.api.java.operators.TwoInputOperator
getInput1, getInput1Type, getInput2, getInput2Type, translateToDataFlow
-
Methods inherited from class org.apache.flink.api.java.operators.Operator
getMinResources, getName, getParallelism, getPreferredResources, getResultType, name, setParallelism
-
Methods inherited from class org.apache.flink.api.java.DataSet
aggregate, checkSameExecutionContext, clean, coGroup, collect, combineGroup, count, cross, crossWithHuge, crossWithTiny, distinct, distinct, distinct, distinct, fillInType, filter, first, flatMap, fullOuterJoin, fullOuterJoin, getExecutionEnvironment, getType, groupBy, groupBy, groupBy, iterate, iterateDelta, join, join, joinWithHuge, joinWithTiny, leftOuterJoin, leftOuterJoin, map, mapPartition, max, maxBy, min, minBy, output, partitionByHash, partitionByHash, partitionByHash, partitionByRange, partitionByRange, partitionByRange, partitionCustom, partitionCustom, partitionCustom, print, print, printOnTaskManager, printToErr, printToErr, project, rebalance, reduce, reduceGroup, rightOuterJoin, rightOuterJoin, runOperation, sortPartition, sortPartition, sortPartition, sum, union, write, write, writeAsCsv, writeAsCsv, writeAsCsv, writeAsCsv, writeAsFormattedText, writeAsFormattedText, writeAsText, writeAsText
-
-
-
-
Constructor Detail
-
TwoInputUdfOperator
protected TwoInputUdfOperator(DataSet<IN1> input1, DataSet<IN2> input2, org.apache.flink.api.common.typeinfo.TypeInformation<OUT> resultType)
Deprecated. Creates a new operator with the two given data sets as inputs. The given result type describes the data type of the elements in the data set produced by the operator.
- Parameters:
input1 - The data set for the first input.
input2 - The data set for the second input.
resultType - The type of the elements in the resulting data set.
-
-
Method Detail
-
getFunction
protected abstract org.apache.flink.api.common.functions.Function getFunction()
Deprecated.
-
withParameters
public O withParameters(org.apache.flink.configuration.Configuration parameters)
Deprecated. Description copied from interface: UdfOperator
Sets the configuration parameters for the UDF. These are optional parameters that are passed to the UDF in the RichFunction.open(OpenContext) method.
- Specified by:
withParameters in interface UdfOperator<IN1>
- Parameters:
parameters - The configuration parameters for the UDF.
- Returns:
The operator itself, to allow chaining function calls.
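The return type O is the self-type parameter from the class declaration (O extends TwoInputUdfOperator<IN1,IN2,OUT,O>). A minimal sketch of why a self-bounded type parameter helps with chaining is shown below. FluentOperator and SketchJoinOperator are hypothetical stand-ins written for this illustration; they are not Flink classes and only model the pattern.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical base class (NOT a Flink class) that models a self-bounded type parameter.
abstract class FluentOperator<O extends FluentOperator<O>> {
    private final Map<String, String> parameters = new LinkedHashMap<>();
    private String name = "unnamed";

    // Returns O, the concrete subclass type, instead of the base type.
    @SuppressWarnings("unchecked")
    public O withParameter(String key, String value) {
        parameters.put(key, value);
        return (O) this;
    }

    @SuppressWarnings("unchecked")
    public O name(String newName) {
        this.name = newName;
        return (O) this;
    }

    public Map<String, String> getParameters() {
        return parameters;
    }

    public String getName() {
        return name;
    }
}

// Hypothetical concrete operator that binds O to itself.
final class SketchJoinOperator extends FluentOperator<SketchJoinOperator> {
    private String strategy = "default";

    public SketchJoinOperator withStrategy(String newStrategy) {
        this.strategy = newStrategy;
        return this;
    }

    public String getStrategy() {
        return strategy;
    }
}

class SelfTypeChainingDemo {
    public static void main(String[] args) {
        SketchJoinOperator op = new SketchJoinOperator()
                .withParameter("threshold", "0.5") // declared on the base class
                .name("my-join")                   // still typed as SketchJoinOperator
                .withStrategy("hash");             // so the subclass method stays reachable
        System.out.println(op.getName() + " " + op.getStrategy() + " " + op.getParameters());
    }
}
```

Without the self type, withParameter would return the base type and the call to withStrategy would not compile without a cast.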
-
withBroadcastSet
public O withBroadcastSet(DataSet<?> data, String name)
Deprecated. Description copied from interface: UdfOperator
Adds a certain data set as a broadcast set to this operator. Broadcast data sets are available at all parallel instances of this operator. A broadcast data set is registered under a certain name, and can be retrieved under that name from the operator's runtime context via RuntimeContext.getBroadcastVariable(String).
The runtime context itself is available in all UDFs via AbstractRichFunction.getRuntimeContext().
- Specified by:
withBroadcastSet in interface UdfOperator<IN1>
- Parameters:
data - The data set to be broadcast.
name - The name under which the broadcast data set is retrieved.
- Returns:
The operator itself, to allow chaining function calls.
-
withForwardedFieldsFirst
public O withForwardedFieldsFirst(String... forwardedFieldsFirst)
Deprecated. Adds semantic information about forwarded fields of the first input of the user-defined function. The forwarded fields information declares fields which are never modified by the function and which are either forwarded to the same position in the output or copied unchanged to another position in the output.
Fields that are forwarded at the same position are specified by their position. The specified position must be valid for the input and output data type and have the same type. For example, withForwardedFieldsFirst("f2") declares that the third field of a Java input tuple from the first input is copied to the third field of an output tuple.
Fields which are copied unchanged from the first input to another position in the output are declared by specifying the source field reference in the first input and the target field reference in the output. withForwardedFieldsFirst("f0->f2") denotes that the first field of the first input Java tuple is copied unchanged to the third field of the Java output tuple. When using a wildcard ("*"), ensure that the number of declared fields and their types in the first input and the output type match.
Multiple forwarded fields can be annotated in one String (withForwardedFieldsFirst("f2; f3->f0; f4")) or in separate Strings (withForwardedFieldsFirst("f2", "f3->f0", "f4")). Please refer to the JavaDoc of Function or Flink's documentation for details on field references such as nested fields and wildcards.
It is not possible to override existing semantic information about forwarded fields of the first input which was, for example, added by a FunctionAnnotation.ForwardedFieldsFirst class annotation.
NOTE: Adding semantic information for functions is optional! If used correctly, semantic information can help the Flink optimizer to generate more efficient execution plans. However, incorrect semantic information can cause the optimizer to generate incorrect execution plans which compute wrong results! So be careful when adding semantic information.
- Parameters:
forwardedFieldsFirst - A list of forwarded field expressions for the first input of the function.
- Returns:
This operator with annotated forwarded field information.
- See Also:
FunctionAnnotation, FunctionAnnotation.ForwardedFieldsFirst
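To make the expression syntax above concrete, the following sketch splits forwarded-field expressions into source and target field names. ForwardedFieldSketch is a hypothetical helper written for this illustration; it is not Flink's parser, and it assumes only the simple forms shown above ("f2" and "f3->f0", separated by ";"). Wildcards and nested fields are not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (NOT part of Flink) that reads simple forwarded-field expressions.
class ForwardedFieldSketch {

    /** One declaration: a source field in the input and a target field in the output. */
    static final class Forward {
        final String source;
        final String target;

        Forward(String source, String target) {
            this.source = source;
            this.target = target;
        }

        @Override
        public String toString() {
            return source + "->" + target;
        }
    }

    /** Accepts one combined String ("f2; f3->f0") or several separate Strings. */
    static List<Forward> parse(String... expressions) {
        List<Forward> result = new ArrayList<>();
        for (String expression : expressions) {
            for (String part : expression.split(";")) {
                String trimmed = part.trim();
                if (trimmed.isEmpty()) {
                    continue;
                }
                int arrow = trimmed.indexOf("->");
                if (arrow < 0) {
                    // "f2": the field keeps its position in the output.
                    result.add(new Forward(trimmed, trimmed));
                } else {
                    // "f3->f0": the field is copied unchanged to another position.
                    String source = trimmed.substring(0, arrow).trim();
                    String target = trimmed.substring(arrow + 2).trim();
                    result.add(new Forward(source, target));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("f2; f3->f0; f4"));
        System.out.println(parse("f2", "f3->f0", "f4"));
    }
}
```

Both calls in main describe the same three declarations, which mirrors the statement that fields can be annotated in one String or in separate Strings.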
-
withForwardedFieldsSecond
public O withForwardedFieldsSecond(String... forwardedFieldsSecond)
Deprecated. Adds semantic information about forwarded fields of the second input of the user-defined function. The forwarded fields information declares fields which are never modified by the function and which are either forwarded to the same position in the output or copied unchanged to another position in the output.
Fields that are forwarded at the same position are specified by their position. The specified position must be valid for the input and output data type and have the same type. For example, withForwardedFieldsSecond("f2") declares that the third field of a Java input tuple from the second input is copied to the third field of an output tuple.
Fields which are copied unchanged from the second input to another position in the output are declared by specifying the source field reference in the second input and the target field reference in the output. withForwardedFieldsSecond("f0->f2") denotes that the first field of the second input Java tuple is copied unchanged to the third field of the Java output tuple. When using a wildcard ("*"), ensure that the number of declared fields and their types in the second input and the output type match.
Multiple forwarded fields can be annotated in one String (withForwardedFieldsSecond("f2; f3->f0; f4")) or in separate Strings (withForwardedFieldsSecond("f2", "f3->f0", "f4")). Please refer to the JavaDoc of Function or Flink's documentation for details on field references such as nested fields and wildcards.
It is not possible to override existing semantic information about forwarded fields of the second input which was, for example, added by a FunctionAnnotation.ForwardedFieldsSecond class annotation.
NOTE: Adding semantic information for functions is optional! If used correctly, semantic information can help the Flink optimizer to generate more efficient execution plans. However, incorrect semantic information can cause the optimizer to generate incorrect execution plans which compute wrong results! So be careful when adding semantic information.
- Parameters:
forwardedFieldsSecond - A list of forwarded field expressions for the second input of the function.
- Returns:
This operator with annotated forwarded field information.
- See Also:
FunctionAnnotation, FunctionAnnotation.ForwardedFieldsSecond
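Because an incorrect declaration can lead to wrong results, it may be worth checking a forwarding claim against sample records in a unit test before declaring it. The sketch below is one possible way to do that and is not a Flink facility: ForwardingClaimCheck is a hypothetical helper, tuples are modelled as Object[] arrays, and fields are addressed by 0-based index (so index 1 corresponds to "f1").

```java
import java.util.Objects;
import java.util.function.BiFunction;

// Hypothetical test helper (NOT part of Flink). Tuples are modelled as Object[].
class ForwardingClaimCheck {

    /** Checks, for one sample pair, that second[sourceIndex] arrives unchanged at output[targetIndex]. */
    static boolean secondInputForwarded(
            BiFunction<Object[], Object[], Object[]> function,
            Object[] first,
            Object[] second,
            int sourceIndex,
            int targetIndex) {
        Object[] output = function.apply(first, second);
        return targetIndex < output.length
                && Objects.equals(second[sourceIndex], output[targetIndex]);
    }

    // Copies field 1 of the second input unchanged to output field 1.
    static final BiFunction<Object[], Object[], Object[]> COPIES_NAME =
            (first, second) -> new Object[] {first[0], second[1]};

    // Upper-cases field 1 of the second input, so the field is modified, not forwarded.
    static final BiFunction<Object[], Object[], Object[]> MODIFIES_NAME =
            (first, second) -> new Object[] {first[0], ((String) second[1]).toUpperCase()};

    public static void main(String[] args) {
        Object[] first = {42, 3.5};
        Object[] second = {42, "alice"};
        // A claim equivalent to withForwardedFieldsSecond("f1") holds for the first function only.
        System.out.println(secondInputForwarded(COPIES_NAME, first, second, 1, 1));
        System.out.println(secondInputForwarded(MODIFIES_NAME, first, second, 1, 1));
    }
}
```

A passing check on a few samples does not prove that a declaration is correct (an already upper-case name would pass for the second function), so such a test only catches obvious mistakes.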
-
returns
public O returns(Class<OUT> typeClass)
Deprecated. Adds a type information hint about the return type of this operator. This method can be used in cases where Flink cannot determine automatically what the produced type of a function is. That can be the case if the function uses generic type variables in the return type that cannot be inferred from the input type.
Classes can be used as type hints for non-generic types (classes without generic parameters), but not for generic types such as Tuples. For those generic types, please use the returns(TypeHint) method.
Use this method the following way:
    DataSet<String[]> result =
        data1.join(data2).where("id").equalTo("fieldX")
            .with(new JoinFunctionWithNonInferrableReturnType())
            .returns(String[].class);
- Parameters:
typeClass - The class of the returned data type.
- Returns:
This operator with the type information corresponding to the given type class.
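The restriction to non-generic types follows from how Java represents types at run time: a Class object carries no type arguments, while array types keep their component type. The sketch below demonstrates this with plain JDK classes only; ErasureSketch is a hypothetical name used for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-JDK illustration of type erasure; ErasureSketch is a hypothetical name.
class ErasureSketch {

    /** True if both objects have exactly the same runtime class. */
    static boolean sameRuntimeClass(Object a, Object b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        List<Double> scores = new ArrayList<>();

        // Both lists share one Class object, so a Class cannot tell them apart.
        System.out.println(sameRuntimeClass(names, scores));

        // Array classes keep their component type, which is why String[].class is a usable hint.
        System.out.println(String[].class.getComponentType() == String.class);
    }
}
```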
-
returns
public O returns(org.apache.flink.api.common.typeinfo.TypeHint<OUT> typeHint)
Deprecated. Adds a type information hint about the return type of this operator. This method can be used in cases where Flink cannot determine automatically what the produced type of a function is. That can be the case if the function uses generic type variables in the return type that cannot be inferred from the input type.
Use this method the following way:
    DataSet<Tuple2<String, Double>> result =
        data1.join(data2).where("id").equalTo("fieldX")
            .with(new JoinFunctionWithNonInferrableReturnType())
            .returns(new TypeHint<Tuple2<String, Double>>(){});
- Parameters:
typeHint - The type hint for the returned data type.
- Returns:
This operator with the type information corresponding to the given type hint.
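In the usage example above, the hint is created as an anonymous subclass (note the trailing {}). A common Java technique behind this shape is the "super type token": the anonymous subclass records its type argument in its generic superclass, where reflection can read it. The sketch below shows that technique with a hypothetical SketchTypeHint class; it is not Flink's TypeHint, and the assumption that TypeHint relies on the same mechanism is not confirmed by this page.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Map;

// Hypothetical stand-in (NOT Flink's TypeHint) that shows the super type token technique.
abstract class SketchTypeHint<T> {
    private final Type captured;

    protected SketchTypeHint() {
        // For an anonymous subclass, the generic superclass is SketchTypeHint<ActualType>.
        Type superType = getClass().getGenericSuperclass();
        if (!(superType instanceof ParameterizedType)) {
            throw new IllegalStateException(
                    "Create the hint as an anonymous subclass with a type argument");
        }
        this.captured = ((ParameterizedType) superType).getActualTypeArguments()[0];
    }

    public Type getCapturedType() {
        return captured;
    }
}

class TypeHintSketchDemo {
    public static void main(String[] args) {
        // The trailing {} creates the anonymous subclass that carries the type argument.
        Type type = new SketchTypeHint<Map<String, Double>>() {}.getCapturedType();
        System.out.println(type.getTypeName());
    }
}
```

Unlike a Class object, the captured Type keeps the full parameterization, which is what a hint for a type such as Tuple2<String, Double> needs.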
-
returns
public O returns(org.apache.flink.api.common.typeinfo.TypeInformation<OUT> typeInfo)
Deprecated. Adds a type information hint about the return type of this operator. This method can be used in cases where Flink cannot determine automatically what the produced type of a function is. That can be the case if the function uses generic type variables in the return type that cannot be inferred from the input type.
In most cases, the methods returns(Class) and returns(TypeHint) are preferable.
- Parameters:
typeInfo - The type information for the returned data type.
- Returns:
This operator using the given type information for the return type.
-
getBroadcastSets
@Internal public Map<String,DataSet<?>> getBroadcastSets()
Deprecated. Description copied from interface: UdfOperator
Gets the broadcast sets (name and data set) that have been added to the context of the UDF. Broadcast sets are added to a UDF via the method UdfOperator.withBroadcastSet(DataSet, String).
- Specified by:
getBroadcastSets in interface UdfOperator<IN1>
- Returns:
The broadcast data sets that have been added to this UDF.
-
getParameters
public org.apache.flink.configuration.Configuration getParameters()
Deprecated. Description copied from interface: UdfOperator
Gets the configuration parameters that will be passed to the UDF's open method RichFunction.open(OpenContext). The configuration is set via the UdfOperator.withParameters(Configuration) method.
- Specified by:
getParameters in interface UdfOperator<IN1>
- Returns:
The configuration parameters for the UDF.
-
getSemanticProperties
@Internal public org.apache.flink.api.common.operators.DualInputSemanticProperties getSemanticProperties()
Deprecated. Description copied from interface: UdfOperator
Gets the semantic properties that have been set for the user-defined functions (UDF).
- Specified by:
getSemanticProperties in interface UdfOperator<IN1>
- Returns:
The semantic properties of the UDF.
-
setSemanticProperties
@Internal public void setSemanticProperties(org.apache.flink.api.common.operators.DualInputSemanticProperties properties)
Deprecated. Sets the semantic properties for the user-defined function (UDF). The semantic properties define how fields of tuples and other objects are modified or preserved through this UDF. The configured properties can be retrieved via UdfOperator.getSemanticProperties().
- Parameters:
properties - The semantic properties for the UDF.
- See Also:
UdfOperator.getSemanticProperties()
-
getAnalyzedUdfSemanticsFlag
protected boolean getAnalyzedUdfSemanticsFlag()
Deprecated.
-
setAnalyzedUdfSemanticsFlag
protected void setAnalyzedUdfSemanticsFlag()
Deprecated.
-
extractSemanticAnnotationsFromUdf
protected org.apache.flink.api.common.operators.DualInputSemanticProperties extractSemanticAnnotationsFromUdf(Class<?> udfClass)
Deprecated.
-
udfWithForwardedFieldsFirstAnnotation
protected boolean udfWithForwardedFieldsFirstAnnotation(Class<?> udfClass)
Deprecated.
-
udfWithForwardedFieldsSecondAnnotation
protected boolean udfWithForwardedFieldsSecondAnnotation(Class<?> udfClass)
Deprecated.
-
-