Class NumericColumnSummary<T>
- java.lang.Object
- org.apache.flink.api.java.summarize.ColumnSummary
- org.apache.flink.api.java.summarize.NumericColumnSummary<T>
- Type Parameters:
T - the numeric type, e.g. Integer, Double
- All Implemented Interfaces:
Serializable
@Deprecated @PublicEvolving public class NumericColumnSummary<T> extends ColumnSummary implements Serializable
Deprecated. All Flink DataSet APIs are deprecated since Flink 1.18 and will be removed in a future Flink major version. You can still build your application in DataSet, but you should move to the DataStream API, the Table API, or both.
Generic Column Summary for Numeric Types.
Some values are considered "missing", where "missing" is defined as null, NaN, or Infinity. These values are ignored in some calculations like mean, variance, and standardDeviation.
Uses the Kahan summation algorithm to avoid numeric instability when computing variance. The algorithm is described in: "Scalable and Numerically Stable Descriptive Statistics in SystemML", Tian et al, International Conference on Data Engineering 2012.
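As a rough illustration of the Kahan (compensated) summation idea mentioned above, the following is a minimal standalone sketch. It shows plain summation only, not the variance computation, and it is not Flink's internal implementation; the class and method names (KahanSumSketch, naiveSum, kahanSum) are hypothetical.

```java
// Illustrative sketch only; class and method names are hypothetical, not Flink API.
final class KahanSumSketch {

    /** Plain left-to-right summation, shown for comparison. */
    static double naiveSum(double[] values) {
        double sum = 0.0;
        for (double value : values) {
            sum += value;
        }
        return sum;
    }

    /** Kahan (compensated) summation: tracks the rounding error of each addition. */
    static double kahanSum(double[] values) {
        double sum = 0.0;
        double compensation = 0.0; // low-order bits lost by earlier additions
        for (double value : values) {
            double adjusted = value - compensation; // re-inject the previously lost bits
            double next = sum + adjusted;           // low-order bits of adjusted may be lost here
            compensation = (next - sum) - adjusted; // measure what was lost in this step
            sum = next;
        }
        return sum;
    }

    public static void main(String[] args) {
        // One large value followed by ten values too small to register individually.
        double[] values = new double[11];
        values[0] = 1.0;
        java.util.Arrays.fill(values, 1, values.length, 1e-16);
        System.out.println("naive: " + naiveSum(values));
        System.out.println("kahan: " + kahanSum(values));
    }
}
```

In this example each 1e-16 is smaller than half the spacing between adjacent doubles near 1.0, so the naive sum stays at 1.0, while the compensated sum carries the small terms forward until they become representable.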
-
Method Summary
Modifier and Type   Method                   Description
long                getInfinityCount()       Deprecated. Number of values that are positive or negative infinity.
T                   getMax()                 Deprecated.
Double              getMean()                Deprecated. Null, NaN, and Infinite values are ignored in this calculation.
T                   getMin()                 Deprecated.
long                getMissingCount()        Deprecated. The number of "missing" values where "missing" is defined as null, NaN, or Infinity.
long                getNanCount()            Deprecated. Number of values that are NaN.
long                getNonMissingCount()     Deprecated. The number of values that are not null, NaN, or Infinity.
long                getNonNullCount()        Deprecated. The number of non-null values in this column.
long                getNullCount()           Deprecated. The number of null values in this column.
Double              getStandardDeviation()   Deprecated. Standard Deviation is a measure of variation in a set of numbers.
T                   getSum()                 Deprecated.
Double              getVariance()            Deprecated. Variance is a measure of how far a set of numbers is spread out.
String              toString()               Deprecated.
Methods inherited from class org.apache.flink.api.java.summarize.ColumnSummary
containsNonNull, containsNull, getTotalCount
-
Method Detail
-
getMissingCount
public long getMissingCount()
Deprecated. The number of "missing" values, where "missing" is defined as null, NaN, or Infinity. These values are ignored in some calculations like mean, variance, and standardDeviation.
-
getNonMissingCount
public long getNonMissingCount()
Deprecated. The number of values that are not null, NaN, or Infinity.
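To make the relationship between the counters concrete, here is a small standalone sketch that classifies a column of boxed doubles using the definitions above. It assumes each value falls into exactly one of the four buckets (null, NaN, infinite, non-missing); the class MissingCountSketch and its members are hypothetical and are not part of Flink.

```java
// Illustrative sketch only; MissingCountSketch is a hypothetical helper, not Flink API.
final class MissingCountSketch {
    long nullCount;
    long nanCount;
    long infinityCount;
    long nonMissingCount;

    /** Classifies every value into exactly one of the four buckets. */
    static MissingCountSketch of(Double[] column) {
        MissingCountSketch counts = new MissingCountSketch();
        for (Double value : column) {
            if (value == null) {
                counts.nullCount++;
            } else if (value.isNaN()) {
                counts.nanCount++;
            } else if (value.isInfinite()) { // covers positive and negative infinity
                counts.infinityCount++;
            } else {
                counts.nonMissingCount++;
            }
        }
        return counts;
    }

    /** "Missing" means null, NaN, or Infinity. */
    long missingCount() {
        return nullCount + nanCount + infinityCount;
    }

    /** Non-null values include NaN and Infinity, which are missing but not null. */
    long nonNullCount() {
        return nanCount + infinityCount + nonMissingCount;
    }

    public static void main(String[] args) {
        Double[] column = {1.0, null, Double.NaN, Double.POSITIVE_INFINITY,
                           Double.NEGATIVE_INFINITY, 2.5};
        MissingCountSketch counts = MissingCountSketch.of(column);
        System.out.println("missing=" + counts.missingCount()
                + " nonMissing=" + counts.nonMissingCount
                + " nonNull=" + counts.nonNullCount());
    }
}
```

Under these assumptions, the missing and non-missing counts always add up to the total number of values in the column.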
-
getNonNullCount
public long getNonNullCount()
Deprecated. The number of non-null values in this column.
- Specified by:
getNonNullCount in class ColumnSummary
-
getNullCount
public long getNullCount()
Deprecated. Description copied from class: ColumnSummary
The number of null values in this column.
- Specified by:
getNullCount in class ColumnSummary
-
getNanCount
public long getNanCount()
Deprecated. Number of values that are NaN (always zero for types like Short, Integer, Long).
-
getInfinityCount
public long getInfinityCount()
Deprecated. Number of values that are positive or negative infinity (always zero for types like Short, Integer, Long).
-
getMin
public T getMin()
Deprecated.
-
getMax
public T getMax()
Deprecated.
-
getSum
public T getSum()
Deprecated.
-
getMean
public Double getMean()
Deprecated. Null, NaN, and Infinite values are ignored in this calculation.
- See Also:
- Arithmetic Mean
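The rule that null, NaN, and infinite values are ignored can be sketched as follows. This is a standalone illustration, not Flink's code: the class MeanSketch is hypothetical, and returning null when no non-missing values exist is an assumption suggested only by the boxed Double return type.

```java
// Illustrative sketch only; MeanSketch is a hypothetical helper, not Flink API.
final class MeanSketch {

    /** A value is "missing" if it is null, NaN, or positive/negative infinity. */
    static boolean isMissing(Double value) {
        return value == null || value.isNaN() || value.isInfinite();
    }

    /**
     * Arithmetic mean over the non-missing values only.
     * Assumption: returns null when there is nothing to average.
     */
    static Double mean(Double[] column) {
        long count = 0;
        double sum = 0.0;
        for (Double value : column) {
            if (!isMissing(value)) {
                count++;
                sum += value;
            }
        }
        return count == 0 ? null : sum / count;
    }

    public static void main(String[] args) {
        Double[] column = {2.0, null, Double.NaN, 4.0, Double.POSITIVE_INFINITY};
        System.out.println(mean(column)); // averages only 2.0 and 4.0
    }
}
```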
-
getVariance
public Double getVariance()
Deprecated. Variance is a measure of how far a set of numbers is spread out. Null, NaN, and Infinite values are ignored in this calculation.
- See Also:
- Variance
-
getStandardDeviation
public Double getStandardDeviation()
Deprecated. Standard Deviation is a measure of variation in a set of numbers. It is the square root of the variance. Null, NaN, and Infinite values are ignored in this calculation.
- See Also:
- Standard Deviation
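The relationship between variance and standard deviation described above can be sketched with a simple two-pass computation. The text does not say whether the variance is a sample or a population variance; this sketch assumes the sample form (dividing by n - 1) and returns null when fewer than two non-missing values exist. Both choices are assumptions, and SpreadSketch is a hypothetical class that does not use the Kahan-based approach mentioned earlier.

```java
// Illustrative sketch only; SpreadSketch is a hypothetical helper, not Flink API.
final class SpreadSketch {

    /** A value is "missing" if it is null, NaN, or positive/negative infinity. */
    static boolean isMissing(Double value) {
        return value == null || value.isNaN() || value.isInfinite();
    }

    /**
     * Two-pass variance over the non-missing values.
     * Assumptions: sample variance (divide by n - 1); null when n < 2.
     */
    static Double sampleVariance(Double[] column) {
        long count = 0;
        double sum = 0.0;
        for (Double value : column) {
            if (!isMissing(value)) {
                count++;
                sum += value;
            }
        }
        if (count < 2) {
            return null;
        }
        double mean = sum / count;
        double squaredDeviations = 0.0;
        for (Double value : column) {
            if (!isMissing(value)) {
                double deviation = value - mean;
                squaredDeviations += deviation * deviation;
            }
        }
        return squaredDeviations / (count - 1);
    }

    /** Standard deviation is the square root of the variance. */
    static Double standardDeviation(Double[] column) {
        Double variance = sampleVariance(column);
        return variance == null ? null : Math.sqrt(variance);
    }

    public static void main(String[] args) {
        Double[] column = {1.0, 2.0, 3.0, 4.0, 5.0, null, Double.NaN};
        System.out.println("variance: " + sampleVariance(column));
        System.out.println("stddev:   " + standardDeviation(column));
    }
}
```

For the values 1 through 5 the mean is 3 and the squared deviations sum to 10, so the sample variance under these assumptions is 10 / 4 = 2.5.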
-