Class DistCp


  • public class DistCp
    extends Object
    The main class of the Flink distcp utility. It is a simple reimplementation of Hadoop distcp (see http://hadoop.apache.org/docs/r1.2.1/distcp.html) with a dynamic input format. Note that this tool does not handle retries. Additionally, empty directories are not copied over.
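
    The two limitations above (no retries, no empty directories) follow naturally from a copy that is planned per file. The sketch below illustrates that idea with plain java.nio on a local file system. It is not the Flink implementation; the class name FileOnlyCopyPlan and its methods are hypothetical and exist only for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration, not part of Flink: a copy driven by a list of files.
final class FileOnlyCopyPlan {

    // Lists every regular file under root as a path relative to root, sorted.
    // Directories never appear in the result, so an empty directory
    // contributes nothing to the copy plan.
    static List<String> listFiles(Path root) throws IOException {
        try (Stream<Path> walk = Files.walk(root)) {
            return walk.filter(Files::isRegularFile)
                    .map(p -> root.relativize(p).toString().replace('\\', '/'))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    // Copies each listed file once. Parent directories are created on demand,
    // and any IOException propagates to the caller: there is no retry logic.
    static void copyTree(Path src, Path dst) throws IOException {
        for (String rel : listFiles(src)) {
            Path target = dst.resolve(rel);
            Files.createDirectories(target.getParent());
            Files.copy(src.resolve(rel), target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

    Calling FileOnlyCopyPlan.copyTree(src, dst) on a tree that contains data/a.txt and an empty directory named empty produces dst/data/a.txt only. A caller that needs retries or empty directories would have to add them around such a copy.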

    When running locally, local file system paths can be used. However, in a distributed environment, HDFS paths must be provided for both input and output.
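
    A caller could verify this requirement before submitting a distributed job. The following is a minimal sketch, assuming that an HDFS path is recognized by the "hdfs" URI scheme. PathSchemeCheck and its methods are hypothetical helpers, not part of DistCp, and the host name in the usage note is a placeholder.

```java
import java.net.URI;

// Hypothetical pre-flight check, not part of Flink.
final class PathSchemeCheck {

    // Returns true only when the path carries an explicit "hdfs" scheme.
    // Paths without a scheme (for example "/tmp/in") are treated as non-HDFS.
    static boolean isHdfsPath(String path) {
        String scheme = URI.create(path).getScheme();
        return "hdfs".equalsIgnoreCase(scheme);
    }

    // Rejects the input/output pair unless both are HDFS paths.
    static void requireHdfs(String input, String output) {
        if (!isHdfsPath(input) || !isHdfsPath(output)) {
            throw new IllegalArgumentException(
                    "Distributed runs need HDFS paths for both input and output");
        }
    }
}
```

    For example, PathSchemeCheck.requireHdfs("hdfs://namenode:8020/in", "hdfs://namenode:8020/out") passes, while replacing either argument with "/tmp/in" throws an IllegalArgumentException.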

    Note: All Flink DataSet APIs are deprecated since Flink 1.18 and will be removed in a future Flink major version. You can still build your application with the DataSet API, but you should move to the DataStream API, the Table API, or both. This class is retained for testing purposes.