Class SplitDataProperties<T>
- java.lang.Object
-
- org.apache.flink.api.java.io.SplitDataProperties<T>
-
- Type Parameters:
T - The type of the DataSource on which the SplitDataProperties are defined.
- All Implemented Interfaces:
org.apache.flink.api.common.operators.GenericDataSourceBase.SplitDataProperties<T>
@Deprecated @PublicEvolving public class SplitDataProperties<T> extends Object implements org.apache.flink.api.common.operators.GenericDataSourceBase.SplitDataProperties<T>
Deprecated. All Flink DataSet APIs are deprecated since Flink 1.18 and will be removed in a future Flink major version. You can still build your application in DataSet, but you should move to the DataStream API, the Table API, or both.
SplitDataProperties define data properties on the InputSplits generated by the InputFormat of a DataSource. InputSplits are units of input which are distributed among and assigned to parallel data source subtasks. SplitDataProperties can define that the elements which are generated by the associated InputFormat are:
- Partitioned on one or more fields across InputSplits, i.e., all elements with the same (combination of) key(s) are located in the same input split.
- Grouped on one or more fields within an InputSplit, i.e., all elements of an input split that have the same (combination of) key(s) are emitted in a single sequence one after the other.
- Ordered on one or more fields within an InputSplit, i.e., all elements within an input split are in the defined order.
IMPORTANT: SplitDataProperties can improve the execution of a program because certain data reorganization steps such as shuffling or sorting can be avoided. HOWEVER, if SplitDataProperties are not correctly defined, the result of the program might be wrong!
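The warning above can be illustrated without Flink. The following stand-alone sketch (plain Java; the class and method names are hypothetical and not part of the Flink API) counts keys inside each split and concatenates the partial results, which is roughly what an aggregation does when it trusts a declared partitioning and skips the shuffle. If the keys are not actually partitioned across the splits, the same key shows up more than once with partial counts.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-alone illustration; none of these names belong to the Flink API.
class SplitLocalCountSketch {

    // Counts keys inside each split independently and concatenates the
    // per-split results, i.e. an aggregation that skips the shuffle step.
    static List<Map.Entry<String, Integer>> countPerSplit(List<List<String>> splits) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (List<String> split : splits) {
            Map<String, Integer> counts = new LinkedHashMap<>();
            for (String key : split) {
                counts.merge(key, 1, Integer::sum);
            }
            out.addAll(counts.entrySet());
        }
        return out;
    }

    public static void main(String[] args) {
        // Keys are truly partitioned: "a" only in split 0, "b" only in split 1.
        List<List<String>> partitioned = List.of(List.of("a", "a"), List.of("b"));
        // Keys are NOT partitioned: "a" appears in both splits.
        List<List<String>> mixed = List.of(List.of("a", "b"), List.of("a"));

        System.out.println(countPerSplit(partitioned)); // one entry per key
        System.out.println(countPerSplit(mixed));       // "a" is reported twice
    }
}
```

In a Flink program the declaration itself is made on the properties object of a DataSource, for example source.getSplitDataProperties().splitsPartitionedBy(0), assuming source is a DataSource of tuples whose input format really produces splits with that property.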
- See Also:
InputSplit, InputFormat, DataSource, FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API)
-
-
Nested Class Summary
Nested Classes (Modifier and Type, Class: Description)
- static class SplitDataProperties.SourcePartitionerMarker<T>: Deprecated. A custom partitioner to mark compatible split partitionings.
-
Constructor Summary
Constructors (Constructor: Description)
- SplitDataProperties(org.apache.flink.api.common.typeinfo.TypeInformation<T> type): Deprecated. Creates SplitDataProperties for the given data type.
- SplitDataProperties(DataSource<T> source): Deprecated. Creates SplitDataProperties for the given DataSource.
-
Method Summary
All Methods (Modifier and Type, Method: Description)
- int[] getSplitGroupKeys(): Deprecated.
- org.apache.flink.api.common.operators.Ordering getSplitOrder(): Deprecated.
- org.apache.flink.api.common.functions.Partitioner<T> getSplitPartitioner(): Deprecated.
- int[] getSplitPartitionKeys(): Deprecated.
- SplitDataProperties<T> splitsGroupedBy(int... groupFields): Deprecated. Defines that the data within an input split is grouped on the fields defined by the field positions.
- SplitDataProperties<T> splitsGroupedBy(String groupFields): Deprecated. Defines that the data within an input split is grouped on the fields defined by the field expressions.
- SplitDataProperties<T> splitsOrderedBy(int[] orderFields, org.apache.flink.api.common.operators.Order[] orders): Deprecated. Defines that the data within an input split is sorted on the fields defined by the field positions in the specified orders.
- SplitDataProperties<T> splitsOrderedBy(String orderFields, org.apache.flink.api.common.operators.Order[] orders): Deprecated. Defines that the data within an input split is sorted on the fields defined by the field expressions in the specified orders.
- SplitDataProperties<T> splitsPartitionedBy(int... partitionFields): Deprecated. Defines that data is partitioned across input splits on the fields defined by field positions.
- SplitDataProperties<T> splitsPartitionedBy(String partitionFields): Deprecated. Defines that data is partitioned across input splits on the fields defined by field expressions.
- SplitDataProperties<T> splitsPartitionedBy(String partitionMethodId, int... partitionFields): Deprecated. Defines that data is partitioned using a specific partitioning method across input splits on the fields defined by field positions.
- SplitDataProperties<T> splitsPartitionedBy(String partitionMethodId, String partitionFields): Deprecated. Defines that data is partitioned using an identifiable method across input splits on the fields defined by field expressions.
-
-
-
Constructor Detail
-
SplitDataProperties
public SplitDataProperties(org.apache.flink.api.common.typeinfo.TypeInformation<T> type)
Deprecated. Creates SplitDataProperties for the given data type.
- Parameters:
type - The data type of the SplitDataProperties.
-
SplitDataProperties
public SplitDataProperties(DataSource<T> source)
Deprecated. Creates SplitDataProperties for the given DataSource.
- Parameters:
source - The DataSource for which the SplitDataProperties are created.
-
-
Method Detail
-
splitsPartitionedBy
public SplitDataProperties<T> splitsPartitionedBy(int... partitionFields)
Deprecated. Defines that data is partitioned across input splits on the fields defined by field positions. All records sharing the same key (combination) must be contained in a single input split.
IMPORTANT: Providing wrong information with SplitDataProperties can cause wrong results!
- Parameters:
partitionFields - The field positions of the partitioning keys.
- Returns:
This SplitDataProperties object.
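Before declaring a partitioning, it can help to verify the property on sample data. The sketch below is a hypothetical helper in plain Java, not part of Flink: records are modelled as lists of field values, and the check confirms that every key (the values at the given field positions) occurs in exactly one split.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Flink: a record is a List<Object> of field values.
class PartitionedSplitsCheck {

    // Returns true if every key (values at keyPositions) occurs in exactly one split.
    static boolean isPartitioned(List<List<List<Object>>> splits, int... keyPositions) {
        Map<List<Object>, Integer> splitOfKey = new HashMap<>();
        for (int s = 0; s < splits.size(); s++) {
            for (List<Object> row : splits.get(s)) {
                List<Object> key = new ArrayList<>();
                for (int p : keyPositions) {
                    key.add(row.get(p));
                }
                // Remember the first split in which the key was seen.
                Integer previous = splitOfKey.putIfAbsent(key, s);
                if (previous != null && previous != s) {
                    return false; // same key found in two different splits
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<List<Object>> split0 = List.of(List.<Object>of(1L, "x"), List.<Object>of(1L, "y"));
        List<List<Object>> split1 = List.of(List.<Object>of(2L, "z"));
        List<List<Object>> split2 = List.of(List.<Object>of(1L, "w"));

        System.out.println(isPartitioned(List.of(split0, split1), 0)); // keys 1 and 2 live in separate splits
        System.out.println(isPartitioned(List.of(split0, split2), 0)); // key 1 occurs in both splits
    }
}
```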
-
splitsPartitionedBy
public SplitDataProperties<T> splitsPartitionedBy(String partitionMethodId, int... partitionFields)
Deprecated. Defines that data is partitioned using a specific partitioning method across input splits on the fields defined by field positions. All records sharing the same key (combination) must be contained in a single input split.
IMPORTANT: Providing wrong information with SplitDataProperties can cause wrong results!
- Parameters:
partitionMethodId - An ID for the method that was used to partition the data across splits.
partitionFields - The field positions of the partitioning keys.
- Returns:
This SplitDataProperties object.
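The partitionMethodId presumably lets two partitionings on the same key fields be told apart when they were produced by different methods. A minimal sketch of that idea follows, assuming compatibility is decided by comparing the IDs as strings. The class below is a hypothetical stand-in written for illustration; it is not the actual SplitDataProperties.SourcePartitionerMarker.

```java
import java.util.Objects;

// Hypothetical stand-in for a partitioning marker; not the Flink class.
final class PartitionMethodMarkerSketch {

    private final String partitionMethodId;

    PartitionMethodMarkerSketch(String partitionMethodId) {
        this.partitionMethodId = Objects.requireNonNull(partitionMethodId);
    }

    // Two markers are equal exactly when they carry the same method ID (assumption of this sketch).
    @Override
    public boolean equals(Object o) {
        return o instanceof PartitionMethodMarkerSketch
                && partitionMethodId.equals(((PartitionMethodMarkerSketch) o).partitionMethodId);
    }

    @Override
    public int hashCode() {
        return partitionMethodId.hashCode();
    }

    // Treats two split partitionings as compatible only if their markers are equal.
    static boolean compatible(PartitionMethodMarkerSketch a, PartitionMethodMarkerSketch b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        PartitionMethodMarkerSketch hashA = new PartitionMethodMarkerSketch("byHash");
        PartitionMethodMarkerSketch hashB = new PartitionMethodMarkerSketch("byHash");
        PartitionMethodMarkerSketch range = new PartitionMethodMarkerSketch("byRange");

        System.out.println(compatible(hashA, hashB)); // same ID
        System.out.println(compatible(hashA, range)); // different IDs
    }
}
```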
-
splitsPartitionedBy
public SplitDataProperties<T> splitsPartitionedBy(String partitionFields)
Deprecated. Defines that data is partitioned across input splits on the fields defined by field expressions. Multiple field expressions must be separated by the semicolon ';' character. All records sharing the same key (combination) must be contained in a single input split.
IMPORTANT: Providing wrong information with SplitDataProperties can cause wrong results!
- Parameters:
partitionFields - The field expressions of the partitioning keys.
- Returns:
This SplitDataProperties object.
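A semicolon-separated string such as "f0; f1.name" stands for a list of individual field expressions. The following plain-Java sketch shows only the splitting step. The class name, the trimming of surrounding whitespace, and the rejection of empty entries are assumptions of this sketch, not documented Flink behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; it only illustrates the ';' separator convention.
class FieldExpressionListSketch {

    // Splits a ';'-separated string into individual, trimmed field expressions.
    static List<String> parse(String fields) {
        List<String> expressions = new ArrayList<>();
        for (String part : fields.split(";")) {
            String expression = part.trim();
            if (expression.isEmpty()) {
                throw new IllegalArgumentException("Empty field expression in: " + fields);
            }
            expressions.add(expression);
        }
        return expressions;
    }

    public static void main(String[] args) {
        // Two expressions: a top-level field and a nested field.
        System.out.println(parse("f0; f1.name"));
    }
}
```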
-
splitsPartitionedBy
public SplitDataProperties<T> splitsPartitionedBy(String partitionMethodId, String partitionFields)
Deprecated. Defines that data is partitioned using an identifiable method across input splits on the fields defined by field expressions. Multiple field expressions must be separated by the semicolon ';' character. All records sharing the same key (combination) must be contained in a single input split.
IMPORTANT: Providing wrong information with SplitDataProperties can cause wrong results!
- Parameters:
partitionMethodId - An ID for the method that was used to partition the data across splits.
partitionFields - The field expressions of the partitioning keys.
- Returns:
This SplitDataProperties object.
-
splitsGroupedBy
public SplitDataProperties<T> splitsGroupedBy(int... groupFields)
Deprecated. Defines that the data within an input split is grouped on the fields defined by the field positions. For each input split, all records sharing the same key (combination) must be emitted consecutively by the input format.
IMPORTANT: Providing wrong information with SplitDataProperties can cause wrong results!
- Parameters:
groupFields - The field positions of the grouping keys.
- Returns:
This SplitDataProperties object.
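Grouping is weaker than ordering: records with equal keys must be adjacent, but the groups themselves may appear in any order. The sketch below (a hypothetical plain-Java helper, not part of Flink) checks this for a single split, with records modelled as lists of field values.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Flink: a record is a List<Object> of field values.
class GroupedSplitCheck {

    // Returns true if all records with the same key are adjacent within the split.
    static boolean isGrouped(List<List<Object>> split, int... keyPositions) {
        Set<List<Object>> finishedKeys = new HashSet<>();
        List<Object> currentKey = null;
        for (List<Object> row : split) {
            List<Object> key = new ArrayList<>();
            for (int p : keyPositions) {
                key.add(row.get(p));
            }
            if (!key.equals(currentKey)) {
                if (currentKey != null) {
                    finishedKeys.add(currentKey); // the previous group has ended
                }
                if (finishedKeys.contains(key)) {
                    return false; // a key reappears after its group ended
                }
                currentKey = key;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<List<Object>> grouped = List.of(
                List.<Object>of("b", 1), List.<Object>of("b", 2), List.<Object>of("a", 3));
        List<List<Object>> interleaved = List.of(
                List.<Object>of("a", 1), List.<Object>of("b", 2), List.<Object>of("a", 3));

        System.out.println(isGrouped(grouped, 0));     // groups are contiguous, though not sorted
        System.out.println(isGrouped(interleaved, 0)); // key "a" is split into two runs
    }
}
```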
-
splitsGroupedBy
public SplitDataProperties<T> splitsGroupedBy(String groupFields)
Deprecated. Defines that the data within an input split is grouped on the fields defined by the field expressions. Multiple field expressions must be separated by the semicolon ';' character. For each input split, all records sharing the same key (combination) must be emitted consecutively by the input format.
IMPORTANT: Providing wrong information with SplitDataProperties can cause wrong results!
- Parameters:
groupFields - The field expressions of the grouping keys.
- Returns:
This SplitDataProperties object.
-
splitsOrderedBy
public SplitDataProperties<T> splitsOrderedBy(int[] orderFields, org.apache.flink.api.common.operators.Order[] orders)
Deprecated. Defines that the data within an input split is sorted on the fields defined by the field positions in the specified orders. All records of an input split must be emitted by the input format in the defined order.
IMPORTANT: Providing wrong information with SplitDataProperties can cause wrong results!
- Parameters:
orderFields - The field positions of the ordering keys.
orders - The orders of the fields.
- Returns:
This SplitDataProperties object.
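The pairing of field positions with per-field orders can be read as a lexicographic comparison: the first field decides, and later fields break ties. The sketch below checks that property for one split in plain Java. The Direction enum is a hypothetical stand-in for Flink's Order type, and the length check is an assumption of this sketch.

```java
import java.util.List;

// Hypothetical helper, not part of Flink: a record is a List<Object> of Comparable field values.
class OrderedSplitCheck {

    // Stand-in for org.apache.flink.api.common.operators.Order (only two directions modelled).
    enum Direction { ASCENDING, DESCENDING }

    // Compares two records field by field, honoring the direction of each field.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static int compareRows(List<Object> a, List<Object> b, int[] fields, Direction[] directions) {
        for (int i = 0; i < fields.length; i++) {
            int c = ((Comparable) a.get(fields[i])).compareTo(b.get(fields[i]));
            if (c != 0) {
                return directions[i] == Direction.ASCENDING ? c : Integer.compare(0, c);
            }
        }
        return 0;
    }

    // Returns true if consecutive records never violate the declared order.
    static boolean isOrdered(List<List<Object>> split, int[] fields, Direction[] directions) {
        if (fields.length != directions.length) {
            throw new IllegalArgumentException("Need exactly one direction per order field.");
        }
        for (int i = 1; i < split.size(); i++) {
            if (compareRows(split.get(i - 1), split.get(i), fields, directions) > 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<List<Object>> split = List.of(
                List.<Object>of(1, "b"), List.<Object>of(1, "a"), List.<Object>of(2, "z"));
        int[] fields = {0, 1};

        // Field 0 ascending, field 1 descending: matches the data.
        System.out.println(isOrdered(split, fields,
                new Direction[] {Direction.ASCENDING, Direction.DESCENDING}));
        // Both fields ascending: violated by ("b" before "a") for key 1.
        System.out.println(isOrdered(split, fields,
                new Direction[] {Direction.ASCENDING, Direction.ASCENDING}));
    }
}
```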
-
splitsOrderedBy
public SplitDataProperties<T> splitsOrderedBy(String orderFields, org.apache.flink.api.common.operators.Order[] orders)
Deprecated. Defines that the data within an input split is sorted on the fields defined by the field expressions in the specified orders. Multiple field expressions must be separated by the semicolon ';' character. All records of an input split must be emitted by the input format in the defined order.
IMPORTANT: Providing wrong information with SplitDataProperties can cause wrong results!
- Parameters:
orderFields - The field expressions of the ordering keys.
orders - The orders of the fields.
- Returns:
This SplitDataProperties object.
-
getSplitPartitionKeys
public int[] getSplitPartitionKeys()
Deprecated.
- Specified by:
getSplitPartitionKeys in interface org.apache.flink.api.common.operators.GenericDataSourceBase.SplitDataProperties<T>
-
getSplitPartitioner
public org.apache.flink.api.common.functions.Partitioner<T> getSplitPartitioner()
Deprecated.
- Specified by:
getSplitPartitioner in interface org.apache.flink.api.common.operators.GenericDataSourceBase.SplitDataProperties<T>
-
getSplitGroupKeys
public int[] getSplitGroupKeys()
Deprecated.
- Specified by:
getSplitGroupKeys in interface org.apache.flink.api.common.operators.GenericDataSourceBase.SplitDataProperties<T>
-
getSplitOrder
public org.apache.flink.api.common.operators.Ordering getSplitOrder()
Deprecated.
- Specified by:
getSplitOrder in interface org.apache.flink.api.common.operators.GenericDataSourceBase.SplitDataProperties<T>
-
-