Class WordCount


  • public class WordCount
    extends Object
    Implements the "WordCount" program using the DataStream API V2. The program computes a simple word occurrence histogram over text files. The job currently runs in streaming mode; support for batch mode execution is planned for the future.

    The input is one or more plain text files, with lines separated by a newline character.

    Usage:

    • --input <path>: A list of input files and/or directories to read. If no input is provided, the program is run with default data from WordCountData.
    • --discovery-interval <duration>: Turns the file reader into a continuous source that will monitor the provided input directories every interval and read any new files.
    • --output <path>: The output directory where the Job will write the results. If no output path is provided, the Job will print the results to stdout.
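The option handling described above can be sketched in plain Java. This is a minimal illustration only, not the parser the example actually uses: the class name WordCountOptions, its fields, and the convention of repeating --input once per path are all assumptions made for this sketch. The interval is kept as a raw string because the accepted duration syntax is not stated here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical holder for the three documented options; not a Flink class. */
final class WordCountOptions {
    final List<String> inputs = new ArrayList<>();
    Optional<String> discoveryInterval = Optional.empty();
    Optional<String> output = Optional.empty();

    /** Parses "--flag value" pairs; assumes --input may be repeated, once per path. */
    static WordCountOptions parse(String[] args) {
        WordCountOptions opts = new WordCountOptions();
        for (int i = 0; i < args.length; i++) {
            String flag = args[i];
            if (i + 1 >= args.length) {
                throw new IllegalArgumentException("Missing value for " + flag);
            }
            String value = args[++i];
            switch (flag) {
                case "--input":
                    opts.inputs.add(value);
                    break;
                case "--discovery-interval":
                    // Kept as text: the accepted duration format is not specified here.
                    opts.discoveryInterval = Optional.of(value);
                    break;
                case "--output":
                    opts.output = Optional.of(value);
                    break;
                default:
                    throw new IllegalArgumentException("Unknown option: " + flag);
            }
        }
        return opts;
    }

    /** No --input given: the documented fallback is the built-in default data. */
    boolean usesDefaultData() {
        return inputs.isEmpty();
    }

    /** No --output given: the documented fallback is printing to stdout. */
    boolean printsToStdout() {
        return output.isEmpty();
    }
}
```

With no arguments at all, both usesDefaultData() and printsToStdout() return true, which mirrors the two documented fallbacks.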

    This example shows how to:

    • Write a simple Flink program using the DataStream API V2
    • Use tuple data types
    • Write and use a user-defined process function
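To make the word-count logic itself concrete, here is a framework-free sketch in plain Java. It deliberately uses no Flink classes: the record WordWithCount stands in for a two-field tuple, tokenize stands in for a user-defined processing step, and the choice to emit a running total per word (as a streaming job typically would) is an assumption of this sketch rather than a statement about how this class is implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class WordCountSketch {
    /** Stand-in for a (word, count) tuple; not Flink's tuple type. */
    record WordWithCount(String word, int count) {}

    /** Splits a line into lower-cased words and pairs each with a count of 1. */
    static List<WordWithCount> tokenize(String line) {
        List<WordWithCount> out = new ArrayList<>();
        for (String token : line.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) { // split can yield a leading empty token
                out.add(new WordWithCount(token, 1));
            }
        }
        return out;
    }

    /** Emits one updated total per word occurrence, in arrival order. */
    static List<WordWithCount> runningCounts(List<String> lines) {
        Map<String, Integer> totals = new HashMap<>();
        List<WordWithCount> updates = new ArrayList<>();
        for (String line : lines) {
            for (WordWithCount pair : tokenize(line)) {
                int total = totals.merge(pair.word(), pair.count(), Integer::sum);
                updates.add(new WordWithCount(pair.word(), total));
            }
        }
        return updates;
    }
}
```

For the single input line "To be, or not to be", runningCounts yields (to, 1), (be, 1), (or, 1), (not, 1), (to, 2), (be, 2).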

    Please note that if you intend to run this example in an IDE, you must first add the following VM option: --add-opens=java.base/java.util=ALL-UNNAMED. This is necessary because the module system in JDK 17+ restricts some reflection operations.
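To check whether the VM option has taken effect in a given run configuration, a small diagnostic like the following can help. It is a sketch that is not part of the example: the class name ModuleOpenCheck is hypothetical, and it assumes the class is loaded from the class path (that is, into an unnamed module). It relies only on the standard java.lang.Module API.

```java
final class ModuleOpenCheck {
    /** Returns true if java.base opens the given package to this class's unnamed module. */
    static boolean isOpenToUnnamedModule(String packageName) {
        Module javaBase = Object.class.getModule();
        // Assumption: this class lives on the class path, so its loader's
        // unnamed module is the one targeted by ALL-UNNAMED.
        Module unnamed = ModuleOpenCheck.class.getClassLoader().getUnnamedModule();
        return javaBase.isOpen(packageName, unnamed);
    }

    public static void main(String[] args) {
        System.out.println(
                "java.util open to ALL-UNNAMED: " + isOpenToUnnamedModule("java.util"));
    }
}
```

Running main with and without the VM option should show the difference; a result of false suggests that the IDE run configuration is not passing the option to the JVM.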

    Please note that the DataStream API V2 is a new set of APIs intended to gradually replace the original DataStream API. It is currently experimental and not yet fully ready for production use.

    • Constructor Detail

      • WordCount

        public WordCount()