Class CsvInputFormat<OUT>

  • Type Parameters:
    OUT -
    All Implemented Interfaces:
    Serializable, org.apache.flink.api.common.io.CheckpointableInputFormat<org.apache.flink.core.fs.FileInputSplit,​Long>, org.apache.flink.api.common.io.InputFormat<OUT,​org.apache.flink.core.fs.FileInputSplit>, org.apache.flink.core.io.InputSplitSource<org.apache.flink.core.fs.FileInputSplit>
    Direct Known Subclasses:
    PojoCsvInputFormat, RowCsvInputFormat, TupleCsvInputFormat

    @Internal
    public abstract class CsvInputFormat<OUT>
    extends org.apache.flink.api.common.io.GenericCsvInputFormat<OUT>
    InputFormat that reads csv files.
    See Also:
    Serialized Form
    • Nested Class Summary

      • Nested classes/interfaces inherited from class org.apache.flink.api.common.io.FileInputFormat

        org.apache.flink.api.common.io.FileInputFormat.FileBaseStatistics, org.apache.flink.api.common.io.FileInputFormat.InputSplitOpenThread
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static String DEFAULT_FIELD_DELIMITER  
      static String DEFAULT_LINE_DELIMITER  
      protected Object[] parsedValues  
      • Fields inherited from class org.apache.flink.api.common.io.GenericCsvInputFormat

        commentCount, commentPrefix, fieldIncluded, invalidLineCount, lineDelimiterIsLinebreak
      • Fields inherited from class org.apache.flink.api.common.io.DelimitedInputFormat

        currBuffer, currLen, currOffset, RECORD_DELIMITER
      • Fields inherited from class org.apache.flink.api.common.io.FileInputFormat

        currentSplit, ENUMERATE_NESTED_FILES_FLAG, enumerateNestedFiles, filePath, INFLATER_INPUT_STREAM_FACTORIES, minSplitSize, numSplits, openTimeout, READ_WHOLE_SPLIT_FLAG, splitLength, splitStart, stream, unsplittable
    • Constructor Summary

      Constructors 
      Modifier Constructor Description
      protected CsvInputFormat​(org.apache.flink.core.fs.Path filePath)  
    • Method Summary

      All Methods Static Methods Instance Methods Abstract Methods Concrete Methods 
      Modifier and Type Method Description
      protected static boolean[] createDefaultMask​(int size)  
      protected abstract OUT fillRecord​(OUT reuse, Object[] parsedValues)  
      Class<?>[] getFieldTypes()  
      protected void initializeSplit​(org.apache.flink.core.fs.FileInputSplit split, Long offset)  
      OUT nextRecord​(OUT record)  
      OUT readRecord​(OUT reuse, byte[] bytes, int offset, int numBytes)  
      protected static boolean[] toBooleanMask​(int[] sourceFieldIndices)  
      String toString()  
      • Methods inherited from class org.apache.flink.api.common.io.GenericCsvInputFormat

        checkAndCoSort, checkForMonotonousOrder, close, enableQuotedStringParsing, getCommentPrefix, getFieldDelimiter, getFieldParsers, getGenericFieldTypes, getNumberOfFieldsTotal, getNumberOfNonNullFields, isLenient, isSkippingFirstLineAsHeader, parseRecord, setCharset, setCommentPrefix, setFieldDelimiter, setFieldsGeneric, setFieldsGeneric, setFieldTypesGeneric, setLenient, setSkipFirstLineAsHeader, skipFields, supportsMultiPaths
      • Methods inherited from class org.apache.flink.api.common.io.DelimitedInputFormat

        configure, getBufferSize, getCharset, getCurrentState, getDelimiter, getLineLengthLimit, getNumLineSamples, getStatistics, loadConfigParameters, loadGlobalConfigParams, open, reachedEnd, readLine, reopen, setBufferSize, setDelimiter, setDelimiter, setDelimiter, setLineLengthLimit, setNumLineSamples
      • Methods inherited from class org.apache.flink.api.common.io.FileInputFormat

        acceptFile, createInputSplits, decorateInputStream, extractFileExtension, getFilePath, getFilePaths, getFileStats, getFileStats, getInflaterInputStreamFactory, getInputSplitAssigner, getMinSplitSize, getNestedFileEnumeration, getNumSplits, getOpenTimeout, getSplitLength, getSplitStart, getSupportedCompressionFormats, registerInflaterInputStreamFactory, setFilePath, setFilePath, setFilePaths, setFilePaths, setFilesFilter, setMinSplitSize, setNestedFileEnumeration, setNumSplits, setOpenTimeout, testForUnsplittable
      • Methods inherited from class org.apache.flink.api.common.io.RichInputFormat

        closeInputFormat, getRuntimeContext, openInputFormat, setRuntimeContext
    • Constructor Detail

      • CsvInputFormat

        protected CsvInputFormat​(org.apache.flink.core.fs.Path filePath)
    • Method Detail

      • initializeSplit

        protected void initializeSplit​(org.apache.flink.core.fs.FileInputSplit split,
                                       Long offset)
                                throws IOException
        Overrides:
        initializeSplit in class org.apache.flink.api.common.io.GenericCsvInputFormat<OUT>
        Throws:
        IOException
      • nextRecord

        public OUT nextRecord​(OUT record)
                       throws IOException
        Specified by:
        nextRecord in interface org.apache.flink.api.common.io.InputFormat<OUT,​org.apache.flink.core.fs.FileInputSplit>
        Overrides:
        nextRecord in class org.apache.flink.api.common.io.DelimitedInputFormat<OUT>
        Throws:
        IOException
      • readRecord

        public OUT readRecord​(OUT reuse,
                              byte[] bytes,
                              int offset,
                              int numBytes)
                       throws IOException
        Specified by:
        readRecord in class org.apache.flink.api.common.io.DelimitedInputFormat<OUT>
        Throws:
        IOException
      • fillRecord

        protected abstract OUT fillRecord​(OUT reuse,
                                          Object[] parsedValues)
      • getFieldTypes

        public Class<?>[] getFieldTypes()
      • createDefaultMask

        protected static boolean[] createDefaultMask​(int size)
      • toBooleanMask

        protected static boolean[] toBooleanMask​(int[] sourceFieldIndices)
      • toString

        public String toString()
        Overrides:
        toString in class org.apache.flink.api.common.io.FileInputFormat<OUT>