Class ParquetSplitReaderUtil
- java.lang.Object
-
- org.apache.flink.formats.parquet.vector.ParquetSplitReaderUtil
-
public class ParquetSplitReaderUtil extends Object
Util for generatingParquetColumnarRowSplitReader.
-
-
Constructor Summary
Constructors Constructor Description ParquetSplitReaderUtil()
-
Method Summary
All Methods Static Methods Concrete Methods Modifier and Type Method Description static List<ParquetField>buildFieldsList(List<org.apache.flink.table.types.logical.RowType.RowField> childrens, List<String> fieldNames, org.apache.parquet.io.MessageColumnIO columnIO)static ColumnReadercreateColumnReader(boolean isUtcTimestamp, org.apache.flink.table.types.logical.LogicalType fieldType, org.apache.parquet.schema.Type type, List<org.apache.parquet.column.ColumnDescriptor> columnDescriptors, org.apache.parquet.column.page.PageReadStore pages, ParquetField field, int depth)static org.apache.flink.table.data.columnar.vector.ColumnVectorcreateVectorFromConstant(org.apache.flink.table.types.logical.LogicalType type, Object value, int batchSize)static org.apache.flink.table.data.columnar.vector.writable.WritableColumnVectorcreateWritableColumnVector(int batchSize, org.apache.flink.table.types.logical.LogicalType fieldType, org.apache.parquet.schema.Type type, List<org.apache.parquet.column.ColumnDescriptor> columnDescriptors, int depth)static ParquetColumnarRowSplitReadergenPartColumnarRowReader(boolean utcTimestamp, boolean caseSensitive, org.apache.hadoop.conf.Configuration conf, String[] fullFieldNames, org.apache.flink.table.types.DataType[] fullFieldTypes, Map<String,Object> partitionSpec, int[] selectedFields, int batchSize, org.apache.flink.core.fs.Path path, long splitStart, long splitLength)Util for generating partitionedParquetColumnarRowSplitReader.static org.apache.parquet.io.ColumnIOgetArrayElementColumn(org.apache.parquet.io.ColumnIO columnIO)static org.apache.parquet.io.GroupColumnIOgetMapKeyValueColumn(org.apache.parquet.io.GroupColumnIO groupColumnIO)static org.apache.parquet.io.ColumnIOlookupColumnByName(org.apache.parquet.io.GroupColumnIO groupColumnIO, String columnName)Parquet's column names are case in sensitive.
-
-
-
Method Detail
-
genPartColumnarRowReader
public static ParquetColumnarRowSplitReader genPartColumnarRowReader(boolean utcTimestamp, boolean caseSensitive, org.apache.hadoop.conf.Configuration conf, String[] fullFieldNames, org.apache.flink.table.types.DataType[] fullFieldTypes, Map<String,Object> partitionSpec, int[] selectedFields, int batchSize, org.apache.flink.core.fs.Path path, long splitStart, long splitLength) throws IOException
Util for generating partitionedParquetColumnarRowSplitReader.- Throws:
IOException
-
createVectorFromConstant
public static org.apache.flink.table.data.columnar.vector.ColumnVector createVectorFromConstant(org.apache.flink.table.types.logical.LogicalType type, Object value, int batchSize)
-
createColumnReader
public static ColumnReader createColumnReader(boolean isUtcTimestamp, org.apache.flink.table.types.logical.LogicalType fieldType, org.apache.parquet.schema.Type type, List<org.apache.parquet.column.ColumnDescriptor> columnDescriptors, org.apache.parquet.column.page.PageReadStore pages, ParquetField field, int depth) throws IOException
- Throws:
IOException
-
createWritableColumnVector
public static org.apache.flink.table.data.columnar.vector.writable.WritableColumnVector createWritableColumnVector(int batchSize, org.apache.flink.table.types.logical.LogicalType fieldType, org.apache.parquet.schema.Type type, List<org.apache.parquet.column.ColumnDescriptor> columnDescriptors, int depth)
-
buildFieldsList
public static List<ParquetField> buildFieldsList(List<org.apache.flink.table.types.logical.RowType.RowField> childrens, List<String> fieldNames, org.apache.parquet.io.MessageColumnIO columnIO)
-
lookupColumnByName
public static org.apache.parquet.io.ColumnIO lookupColumnByName(org.apache.parquet.io.GroupColumnIO groupColumnIO, String columnName)Parquet's column names are case in sensitive. So when we look up columns we first check for exact match, and if that can not find we look for a case-insensitive match.
-
getMapKeyValueColumn
public static org.apache.parquet.io.GroupColumnIO getMapKeyValueColumn(org.apache.parquet.io.GroupColumnIO groupColumnIO)
-
getArrayElementColumn
public static org.apache.parquet.io.ColumnIO getArrayElementColumn(org.apache.parquet.io.ColumnIO columnIO)
-
-