Class HadoopInputs


  • public final class HadoopInputs
    extends Object
    HadoopInputs is a utility class to use Apache Hadoop InputFormats with Apache Flink.

    It provides methods to create Flink InputFormat wrappers for Hadoop InputFormat and InputFormat.

    Key value pairs produced by the Hadoop InputFormats are converted into Flink Tuple2 objects where the first field (Tuple2.f0) is the key and the second field (Tuple2.f1) is the value.

    • Constructor Detail

      • HadoopInputs

        public HadoopInputs()
    • Method Detail

      • readHadoopFile

        public static <K,​V> HadoopInputFormat<K,​V> readHadoopFile​(org.apache.hadoop.mapred.FileInputFormat<K,​V> mapredInputFormat,
                                                                              Class<K> key,
                                                                              Class<V> value,
                                                                              String inputPath,
                                                                              org.apache.hadoop.mapred.JobConf job)
        Creates a Flink InputFormat that wraps the given Hadoop FileInputFormat.
        Returns:
        A Flink InputFormat that wraps the Hadoop FileInputFormat.
      • readHadoopFile

        public static <K,​V> HadoopInputFormat<K,​V> readHadoopFile​(org.apache.hadoop.mapred.FileInputFormat<K,​V> mapredInputFormat,
                                                                              Class<K> key,
                                                                              Class<V> value,
                                                                              String inputPath)
        Creates a Flink InputFormat that wraps the given Hadoop FileInputFormat.
        Returns:
        A Flink InputFormat that wraps the Hadoop FileInputFormat.
      • readSequenceFile

        public static <K,​V> HadoopInputFormat<K,​V> readSequenceFile​(Class<K> key,
                                                                                Class<V> value,
                                                                                String inputPath)
                                                                         throws IOException
        Creates a Flink InputFormat to read a Hadoop sequence file for the given key and value classes.
        Returns:
        A Flink InputFormat that wraps a Hadoop SequenceFileInputFormat.
        Throws:
        IOException
      • createHadoopInput

        public static <K,​V> HadoopInputFormat<K,​V> createHadoopInput​(org.apache.hadoop.mapred.InputFormat<K,​V> mapredInputFormat,
                                                                                 Class<K> key,
                                                                                 Class<V> value,
                                                                                 org.apache.hadoop.mapred.JobConf job)
        Creates a Flink InputFormat that wraps the given Hadoop InputFormat.
        Returns:
        A Flink InputFormat that wraps the Hadoop InputFormat.
      • readHadoopFile

        public static <K,​V> HadoopInputFormat<K,​V> readHadoopFile​(org.apache.hadoop.mapreduce.lib.input.FileInputFormat<K,​V> mapreduceInputFormat,
                                                                              Class<K> key,
                                                                              Class<V> value,
                                                                              String inputPath,
                                                                              org.apache.hadoop.mapreduce.Job job)
                                                                       throws IOException
        Creates a Flink InputFormat that wraps the given Hadoop FileInputFormat.
        Returns:
        A Flink InputFormat that wraps the Hadoop FileInputFormat.
        Throws:
        IOException
      • readHadoopFile

        public static <K,​V> HadoopInputFormat<K,​V> readHadoopFile​(org.apache.hadoop.mapreduce.lib.input.FileInputFormat<K,​V> mapreduceInputFormat,
                                                                              Class<K> key,
                                                                              Class<V> value,
                                                                              String inputPath)
                                                                       throws IOException
        Creates a Flink InputFormat that wraps the given Hadoop FileInputFormat.
        Returns:
        A Flink InputFormat that wraps the Hadoop FileInputFormat.
        Throws:
        IOException
      • createHadoopInput

        public static <K,​V> HadoopInputFormat<K,​V> createHadoopInput​(org.apache.hadoop.mapreduce.InputFormat<K,​V> mapreduceInputFormat,
                                                                                 Class<K> key,
                                                                                 Class<V> value,
                                                                                 org.apache.hadoop.mapreduce.Job job)
        Creates a Flink InputFormat that wraps the given Hadoop InputFormat.
        Returns:
        A Flink InputFormat that wraps the Hadoop InputFormat.