Class OrcShimV200

  • All Implemented Interfaces:
    Serializable, OrcShim<org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>
    Direct Known Subclasses:
    OrcShimV210

    public class OrcShimV200
    extends Object
    implements OrcShim<org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>
    ORC shim for Hive version 2.0.0 and later.
    See Also:
    Serialized Form
    • Constructor Summary

      Constructors 
      Constructor Description
      OrcShimV200()  
    • Method Summary

      Modifier and Type Method Description
      static boolean[] computeProjectionMask(org.apache.orc.TypeDescription schema, int[] selectedFields)
      Computes the ORC projection mask of the fields to include from the selected fields.
      HiveOrcBatchWrapper createBatchWrapper(org.apache.orc.TypeDescription schema, int batchSize)
      protected org.apache.orc.Reader createReader(org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration conf)
      org.apache.orc.RecordReader createRecordReader(org.apache.hadoop.conf.Configuration conf, org.apache.orc.TypeDescription schema, int[] selectedFields, List<OrcFilters.Predicate> conjunctPredicates, org.apache.flink.core.fs.Path path, long splitStart, long splitLength)
      Creates an ORC RecordReader from the configuration, schema, selected fields, predicates, path, and split range.
      protected org.apache.orc.RecordReader createRecordReader(org.apache.orc.Reader reader, org.apache.orc.Reader.Options options)
      static org.apache.flink.api.java.tuple.Tuple2<Long,Long> getOffsetAndLengthForSplit(long splitStart, long splitLength, List<org.apache.orc.StripeInformation> stripes)
      boolean nextBatch(org.apache.orc.RecordReader reader, org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch rowBatch)
      Reads the next row batch.
      protected org.apache.orc.Reader.Options readOrcConf(org.apache.orc.Reader.Options options, org.apache.hadoop.conf.Configuration conf)
    • Constructor Detail

      • OrcShimV200

        public OrcShimV200()
    • Method Detail

      • createReader

        protected org.apache.orc.Reader createReader(org.apache.hadoop.fs.Path path,
                                                     org.apache.hadoop.conf.Configuration conf)
                                              throws IOException
        Throws:
        IOException
      • createRecordReader

        protected org.apache.orc.RecordReader createRecordReader(org.apache.orc.Reader reader,
                                                                 org.apache.orc.Reader.Options options)
                                                          throws IOException
        Throws:
        IOException
      • readOrcConf

        protected org.apache.orc.Reader.Options readOrcConf(org.apache.orc.Reader.Options options,
                                                            org.apache.hadoop.conf.Configuration conf)
      • createRecordReader

        public org.apache.orc.RecordReader createRecordReader(org.apache.hadoop.conf.Configuration conf,
                                                              org.apache.orc.TypeDescription schema,
                                                              int[] selectedFields,
                                                              List<OrcFilters.Predicate> conjunctPredicates,
                                                              org.apache.flink.core.fs.Path path,
                                                              long splitStart,
                                                              long splitLength)
                                                       throws IOException
        Description copied from interface: OrcShim
        Creates an ORC RecordReader from the configuration, schema, selected fields, predicates, path, and split range.
        Specified by:
        createRecordReader in interface OrcShim<org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>
        Throws:
        IOException
      • createBatchWrapper

        public HiveOrcBatchWrapper createBatchWrapper(org.apache.orc.TypeDescription schema,
                                                      int batchSize)
        Specified by:
        createBatchWrapper in interface OrcShim<org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>
      • nextBatch

        public boolean nextBatch(org.apache.orc.RecordReader reader,
                                 org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch rowBatch)
                          throws IOException
        Description copied from interface: OrcShim
        Reads the next row batch.
        Specified by:
        nextBatch in interface OrcShim<org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>
        Throws:
        IOException
      • getOffsetAndLengthForSplit

        @VisibleForTesting
        public static org.apache.flink.api.java.tuple.Tuple2<Long,Long> getOffsetAndLengthForSplit(long splitStart,
                                                                                                         long splitLength,
                                                                                                         List<org.apache.orc.StripeInformation> stripes)
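
        This page gives no description for getOffsetAndLengthForSplit, so the following is only a sketch of one plausible rule, not the documented behavior: a stripe is assumed to belong to the split that contains its first byte, and the result is the offset and length of the byte range covering all such stripes. The Stripe class and the long[] return value are hypothetical stand-ins for org.apache.orc.StripeInformation and Tuple2<Long,Long>, used so the example compiles without ORC or Flink on the classpath.

```java
import java.util.List;

final class SplitRangeSketch {

    /** Hypothetical stand-in for org.apache.orc.StripeInformation (offset and length only). */
    static final class Stripe {
        final long offset;
        final long length;

        Stripe(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * Returns {readStart, readLength} covering every stripe whose first byte lies in
     * [splitStart, splitStart + splitLength), or {0, 0} if no stripe starts in the split.
     * ASSUMPTION: this ownership rule is illustrative; the page does not specify it.
     */
    static long[] offsetAndLength(long splitStart, long splitLength, List<Stripe> stripes) {
        long splitEnd = splitStart + splitLength;
        long readStart = Long.MAX_VALUE;
        long readEnd = Long.MIN_VALUE;

        for (Stripe s : stripes) {
            // A stripe is included only if it starts inside the split.
            if (splitStart <= s.offset && s.offset < splitEnd) {
                readStart = Math.min(readStart, s.offset);
                readEnd = Math.max(readEnd, s.offset + s.length);
            }
        }

        if (readStart == Long.MAX_VALUE) {
            // No stripe starts in this split, so there is nothing to read.
            return new long[] {0L, 0L};
        }
        return new long[] {readStart, readEnd - readStart};
    }
}
```

        Under this rule every stripe is read by exactly one split, even when split boundaries fall in the middle of a stripe, because only the stripe's starting offset decides ownership.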
      • computeProjectionMask

        public static boolean[] computeProjectionMask(org.apache.orc.TypeDescription schema,
                                                      int[] selectedFields)
        Computes the ORC projection mask of the fields to include from the selected fields.
        Returns:
        The ORC projection mask.
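
        To illustrate what a projection mask looks like, here is a minimal sketch that does not use the ORC library. It assumes ORC's pre-order column numbering (the root struct has id 0 and each column's subtree occupies a contiguous id range) and marks the whole id range of every selected top-level field. The Field class and computeMask method are hypothetical stand-ins for org.apache.orc.TypeDescription and computeProjectionMask; the actual implementation may differ.

```java
final class ProjectionMaskSketch {

    /**
     * Hypothetical stand-in for a top-level column of the schema: its own column id
     * and the largest column id inside its subtree (equal to id for primitive columns).
     */
    static final class Field {
        final int id;
        final int maximumId;

        Field(int id, int maximumId) {
            this.id = id;
            this.maximumId = maximumId;
        }
    }

    /**
     * Builds a mask with one entry per column id (0..schemaMaximumId) and sets the
     * entries of each selected top-level field and all of its nested columns to true.
     * ASSUMPTION: the root entry (id 0) is left false in this sketch.
     */
    static boolean[] computeMask(int schemaMaximumId, Field[] topLevelFields, int[] selectedFields) {
        boolean[] mask = new boolean[schemaMaximumId + 1];
        for (int fieldIndex : selectedFields) {
            Field field = topLevelFields[fieldIndex];
            for (int columnId = field.id; columnId <= field.maximumId; columnId++) {
                mask[columnId] = true;
            }
        }
        return mask;
    }
}
```

        For the schema struct<a:int,b:struct<x:int,y:string>,c:double>, the column ids under this numbering are root=0, a=1, b=2, x=3, y=4, c=5. Selecting field index 1 (b) with computeMask(5, fields, new int[] {1}) sets entries 2 through 4, so the nested columns x and y are read along with b.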