Class BaseVectorizedColumnReader

  • All Implemented Interfaces:
    ColumnReader<org.apache.flink.table.data.columnar.vector.writable.WritableColumnVector>
    Direct Known Subclasses:
    ArrayColumnReader

    public abstract class BaseVectorizedColumnReader
    extends Object
    implements ColumnReader<org.apache.flink.table.data.columnar.vector.writable.WritableColumnVector>
    A column-level Parquet reader that reads a batch of records for a single column. Parts of the code are adapted from Apache Hive and Apache Parquet.
    • Field Detail

      • isUtcTimestamp

        protected boolean isUtcTimestamp
      • valuesRead

        protected long valuesRead
        Total number of values read.
      • endOfPageValueCount

        protected long endOfPageValueCount
        The value of valuesRead that marks the end of the current page. That is, if valuesRead == endOfPageValueCount, the reader is at the end of the page.
      • dictionary

        protected final ParquetDataColumnReader dictionary
        The dictionary, if this column has dictionary encoding.
      • isCurrentPageDictionaryEncoded

        protected boolean isCurrentPageDictionaryEncoded
        If true, the current page is dictionary encoded.
      • maxDefLevel

        protected final int maxDefLevel
        Maximum definition level for this column.
      • definitionLevel

        protected int definitionLevel
      • repetitionLevel

        protected int repetitionLevel
      • repetitionLevelColumn

        protected org.apache.flink.formats.parquet.vector.reader.BaseVectorizedColumnReader.IntIterator repetitionLevelColumn
        Repetition/Definition/Value readers.
      • definitionLevelColumn

        protected org.apache.flink.formats.parquet.vector.reader.BaseVectorizedColumnReader.IntIterator definitionLevelColumn
      • pageValueCount

        protected int pageValueCount
        Total values in the current page.
      • pageReader

        protected final org.apache.parquet.column.page.PageReader pageReader
      • descriptor

        protected final org.apache.parquet.column.ColumnDescriptor descriptor
      • type

        protected final org.apache.parquet.schema.Type type
      • logicalType

        protected final org.apache.flink.table.types.logical.LogicalType logicalType
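
The relationship between valuesRead, endOfPageValueCount and pageValueCount described above can be illustrated with a small stand-alone model. This is a minimal sketch, not the Flink implementation: the class PageCursorSketch, its pageSizes array and its read(int) method are hypothetical, and the rule that a new page sets endOfPageValueCount to valuesRead + pageValueCount is an assumption about how the counters are typically maintained.

```java
/** Hypothetical stand-in that mirrors the documented counters; not the Flink class. */
class PageCursorSketch {
    long valuesRead = 0;           // total number of values read so far
    long endOfPageValueCount = 0;  // value of valuesRead at which the current page ends
    int pageValueCount = 0;        // total values in the current page

    private final int[] pageSizes; // hypothetical: value count of each page, in order
    private int nextPage = 0;

    PageCursorSketch(int... pageSizes) {
        this.pageSizes = pageSizes;
    }

    /** Loads the next page and moves the end-of-page marker (assumed bookkeeping). */
    void readPage() {
        if (nextPage >= pageSizes.length) {
            throw new IllegalStateException("no more pages");
        }
        pageValueCount = pageSizes[nextPage++];
        endOfPageValueCount = valuesRead + pageValueCount;
    }

    /** Consumes n values and returns how many pages had to be loaded on the way. */
    int read(int n) {
        int pagesLoaded = 0;
        for (int i = 0; i < n; i++) {
            // valuesRead == endOfPageValueCount means the current page is exhausted.
            while (valuesRead == endOfPageValueCount) {
                readPage();
                pagesLoaded++;
            }
            valuesRead++;
        }
        return pagesLoaded;
    }
}
```

With pages of 3 and 2 values, reading 4 values in this model crosses one page boundary, so two pages are loaded and the end-of-page marker ends up at 5.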
    • Constructor Detail

      • BaseVectorizedColumnReader

        public BaseVectorizedColumnReader(org.apache.parquet.column.ColumnDescriptor descriptor,
                                          org.apache.parquet.column.page.PageReader pageReader,
                                          boolean isUtcTimestamp,
                                          org.apache.parquet.schema.Type parquetType,
                                          org.apache.flink.table.types.logical.LogicalType logicalType)
                                   throws IOException
        Throws:
        IOException
    • Method Detail

      • readRepetitionAndDefinitionLevels

        protected void readRepetitionAndDefinitionLevels()
      • readPage

        protected void readPage()
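
The page gives no description for readRepetitionAndDefinitionLevels(), so the following is only a sketch of how the level fields listed under Field Detail might interact. It assumes the method pulls one entry from each of repetitionLevelColumn and definitionLevelColumn into repetitionLevel and definitionLevel. The class LevelSketch, its constructor and currentValueIsDefined() are hypothetical, and java.util.PrimitiveIterator.OfInt stands in for the nested IntIterator type. The comparison itself follows the Parquet convention that a value is present only when its definition level equals the column's maximum definition level.

```java
import java.util.PrimitiveIterator;
import java.util.stream.IntStream;

/** Hypothetical model of the level fields; not the Flink class. */
class LevelSketch {
    final int maxDefLevel;   // maximum definition level for this column
    int definitionLevel;     // level of the most recently read entry
    int repetitionLevel;     // level of the most recently read entry

    // Stand-ins for the IntIterator fields in the real class.
    private final PrimitiveIterator.OfInt repetitionLevelColumn;
    private final PrimitiveIterator.OfInt definitionLevelColumn;

    LevelSketch(int maxDefLevel, int[] repLevels, int[] defLevels) {
        this.maxDefLevel = maxDefLevel;
        this.repetitionLevelColumn = IntStream.of(repLevels).iterator();
        this.definitionLevelColumn = IntStream.of(defLevels).iterator();
    }

    /** Assumed behavior: advance both level streams by one entry. */
    void readRepetitionAndDefinitionLevels() {
        repetitionLevel = repetitionLevelColumn.nextInt();
        definitionLevel = definitionLevelColumn.nextInt();
    }

    /** A value is present only when it is defined at the deepest level. */
    boolean currentValueIsDefined() {
        return definitionLevel == maxDefLevel;
    }
}
```

For an optional top-level column (maxDefLevel of 1), a definition level of 1 marks a present value and 0 marks a null.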