Class AbstractColumnReader<VECTOR extends org.apache.flink.table.data.columnar.vector.writable.WritableColumnVector>
- java.lang.Object
-
- org.apache.flink.formats.parquet.vector.reader.AbstractColumnReader<VECTOR>
-
- All Implemented Interfaces:
ColumnReader<VECTOR>
- Direct Known Subclasses:
BooleanColumnReader, ByteColumnReader, BytesColumnReader, DoubleColumnReader, FixedLenBytesColumnReader, FloatColumnReader, IntColumnReader, LongColumnReader, ShortColumnReader, TimestampColumnReader
public abstract class AbstractColumnReader<VECTOR extends org.apache.flink.table.data.columnar.vector.writable.WritableColumnVector> extends Object implements ColumnReader<VECTOR>
AbstractColumnReader. See ColumnReaderImpl. Parts of the code are adapted from Apache Spark and Apache Parquet.
-
-
Field Summary
Fields (modifier and type, field: description)
protected org.apache.parquet.column.ColumnDescriptor descriptor
protected org.apache.parquet.column.Dictionary dictionary: The dictionary, if this column has dictionary encoding.
protected int maxDefLevel: Maximum definition level for this column.
protected org.apache.flink.formats.parquet.vector.reader.RunLengthDecoder runLenDecoder: Run length decoder for data and dictionary.
-
Constructor Summary
Constructors
AbstractColumnReader(org.apache.parquet.column.ColumnDescriptor descriptor, org.apache.parquet.column.page.PageReader pageReader)
-
Method Summary
All Methods, Instance Methods, Abstract Methods, Concrete Methods (modifier and type, method: description)
protected void afterReadPage(): After reading a page, we may need some initialization.
protected void checkTypeName(org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName expectedName)
protected abstract void readBatch(int rowId, int num, VECTOR column): Read batch from runLenDecoder and dataInputStream.
protected abstract void readBatchFromDictionaryIds(int rowId, int num, VECTOR column, org.apache.flink.table.data.columnar.vector.writable.WritableIntVector dictionaryIds): Decode dictionary ids to data.
void readToVector(int readNumber, VECTOR vector): Reads readNumber values from this column reader into vector.
protected boolean supportLazyDecode(): Support lazy dictionary ids decode.
-
-
-
Field Detail
-
dictionary
protected final org.apache.parquet.column.Dictionary dictionary
The dictionary, if this column has dictionary encoding.
-
maxDefLevel
protected final int maxDefLevel
Maximum definition level for this column.
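As a general Parquet convention, a value in a flat (non-nested) column is present only when its definition level equals the maximum definition level; otherwise the slot is null. The following is a minimal, self-contained sketch of that rule, not this class's implementation. The class DefLevelSketch and the method materialize are hypothetical names introduced only for illustration.

```java
import java.util.Arrays;

// Hypothetical sketch: models the definition-level rule only, not Flink's reader.
class DefLevelSketch {

    /**
     * Expands densely packed non-null values into a nullable array.
     * A slot is non-null only when its definition level equals maxDefLevel.
     */
    static Integer[] materialize(int[] defLevels, int[] packedValues, int maxDefLevel) {
        Integer[] out = new Integer[defLevels.length];
        int next = 0; // index of the next unread packed value
        for (int i = 0; i < defLevels.length; i++) {
            if (defLevels[i] == maxDefLevel) {
                out[i] = packedValues[next++];
            }
            // otherwise the slot stays null
        }
        return out;
    }

    public static void main(String[] args) {
        Integer[] column = materialize(new int[] {1, 0, 1, 1, 0}, new int[] {7, 8, 9}, 1);
        System.out.println(Arrays.toString(column)); // [7, null, 8, 9, null]
    }
}
```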
-
descriptor
protected final org.apache.parquet.column.ColumnDescriptor descriptor
-
runLenDecoder
protected org.apache.flink.formats.parquet.vector.reader.RunLengthDecoder runLenDecoder
Run length decoder for data and dictionary.
-
-
Constructor Detail
-
AbstractColumnReader
public AbstractColumnReader(org.apache.parquet.column.ColumnDescriptor descriptor, org.apache.parquet.column.page.PageReader pageReader) throws IOException
- Throws:
IOException
-
-
Method Detail
-
checkTypeName
protected void checkTypeName(org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName expectedName)
-
readToVector
public final void readToVector(int readNumber, VECTOR vector) throws IOException
Reads readNumber values from this column reader into vector.
- Specified by:
readToVector in interface ColumnReader<VECTOR extends org.apache.flink.table.data.columnar.vector.writable.WritableColumnVector>
- Parameters:
readNumber - number of values to read.
vector - vector to write to.
- Throws:
IOException
-
afterReadPage
protected void afterReadPage()
After reading a page, we may need some initialization.
-
supportLazyDecode
protected boolean supportLazyDecode()
Support lazy dictionary ids decode. See more in ParquetDictionary. If this returns false, all the data is decoded first.
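To make the idea of lazy decoding concrete, the sketch below keeps the dictionary ids as read and resolves an id to its value only when a row is accessed. It is a simplified model under stated assumptions, not the ParquetDictionary mechanism itself. The class LazyDictColumnSketch and its members are hypothetical.

```java
// Hypothetical sketch of lazy dictionary decoding; all names are illustrative only.
class LazyDictColumnSketch {
    private final long[] dictionary;   // decoded dictionary page: id -> value
    private final int[] dictionaryIds; // one id per row, kept undecoded

    LazyDictColumnSketch(long[] dictionary, int[] dictionaryIds) {
        this.dictionary = dictionary;
        this.dictionaryIds = dictionaryIds;
    }

    /** Lazy path: the id is resolved only when the row is actually read. */
    long getLong(int row) {
        return dictionary[dictionaryIds[row]];
    }

    public static void main(String[] args) {
        LazyDictColumnSketch column =
                new LazyDictColumnSketch(new long[] {100L, 200L}, new int[] {1, 0, 1});
        System.out.println(column.getLong(0)); // 200
        System.out.println(column.getLong(1)); // 100
    }
}
```

The trade-off in this model is that rows which are never accessed are never decoded, at the cost of one extra array lookup per access.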
-
readBatch
protected abstract void readBatch(int rowId, int num, VECTOR column)
Read batch from runLenDecoder and dataInputStream.
-
readBatchFromDictionaryIds
protected abstract void readBatchFromDictionaryIds(int rowId, int num, VECTOR column, org.apache.flink.table.data.columnar.vector.writable.WritableIntVector dictionaryIds)
Decode dictionary ids to data. From runLenDecoder and dictionaryIdsDecoder.
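The general shape of decoding dictionary ids to data over a row range can be sketched with plain arrays. This is an assumption-laden illustration rather than the code of any subclass: DictionaryIdDecodeSketch and decodeRange are hypothetical, plain arrays stand in for the column vector and the dictionary, and null handling is omitted.

```java
import java.util.Arrays;

// Hypothetical sketch: plain arrays stand in for the vector and dictionary types.
class DictionaryIdDecodeSketch {

    /** Writes dictionary[dictionaryIds[i]] into column[i] for i in [rowId, rowId + num). */
    static void decodeRange(
            int rowId, int num, double[] column, int[] dictionaryIds, double[] dictionary) {
        for (int i = rowId; i < rowId + num; i++) {
            column[i] = dictionary[dictionaryIds[i]];
        }
    }

    public static void main(String[] args) {
        double[] dictionary = {1.5, 2.5, 4.0};
        int[] ids = {2, 0, 0, 1};
        double[] column = new double[4];
        decodeRange(1, 3, column, ids, dictionary); // row 0 is left untouched
        System.out.println(Arrays.toString(column)); // [0.0, 1.5, 1.5, 2.5]
    }
}
```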
-
-