Package org.apache.flink.types.parser
Class FieldParser<T>
- java.lang.Object
-
- org.apache.flink.types.parser.FieldParser<T>
-
- Type Parameters:
T - The type that is parsed.
- Direct Known Subclasses:
BigDecParser, BigIntParser, BooleanParser, BooleanValueParser, ByteParser, ByteValueParser, DoubleParser, DoubleValueParser, FloatParser, FloatValueParser, IntParser, IntValueParser, LongParser, LongValueParser, ShortParser, ShortValueParser, SqlDateParser, SqlTimeParser, SqlTimestampParser, StringParser, StringValueParser
@PublicEvolving public abstract class FieldParser<T> extends Object
A FieldParser is used to parse a field from a sequence of bytes. Fields occur in a byte sequence and are terminated by the end of the byte sequence or a delimiter. The parsers generally do not throw exceptions; instead, they set an error state. That way, they can be used in functions that ignore invalid lines rather than failing on them.
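The error-state convention described above can be illustrated with a small standalone sketch. It does not use Flink's classes: `SketchIntParser`, its `ErrorState` enum, and `sumValidLines` are hypothetical names introduced only for illustration, and the sketch ignores integer overflow and negative numbers.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a field parser that reports failures through an error state. */
final class SketchIntParser {
    /** Illustrative error states; these are not the constants of FieldParser.ParseErrorState. */
    enum ErrorState { NONE, EMPTY_FIELD, ILLEGAL_CHARACTER }

    private ErrorState errorState = ErrorState.NONE;
    private int lastResult;

    ErrorState getErrorState() { return errorState; }

    int getLastResult() { return lastResult; }

    /** Parses bytes[startPos, limit) as a non-negative decimal integer. Returns limit, or -1 on failure. */
    int parse(byte[] bytes, int startPos, int limit) {
        errorState = ErrorState.NONE;
        if (startPos >= limit) {
            errorState = ErrorState.EMPTY_FIELD;
            return -1;
        }
        int value = 0;
        for (int i = startPos; i < limit; i++) {
            int digit = bytes[i] - '0';
            if (digit < 0 || digit > 9) {
                errorState = ErrorState.ILLEGAL_CHARACTER;
                return -1;
            }
            value = value * 10 + digit;
        }
        lastResult = value;
        return limit;
    }

    /** Sums the lines that parse cleanly and skips the ones that do not, without any try/catch. */
    static int sumValidLines(String... lines) {
        SketchIntParser parser = new SketchIntParser();
        int sum = 0;
        for (String line : lines) {
            byte[] bytes = line.getBytes(StandardCharsets.US_ASCII);
            if (parser.parse(bytes, 0, bytes.length) >= 0) {
                sum += parser.getLastResult();
            }
        }
        return sum;
    }
}
```

A caller that wants to know why a line was skipped can inspect `getErrorState()` after a negative return value instead of catching an exception.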
-
-
Nested Class Summary
Nested Classes
static class FieldParser.ParseErrorState - An enumeration of different types of errors that may occur.
-
Constructor Summary
Constructors
FieldParser()
-
Method Summary
Methods
abstract T createValue() - Returns an instance of the parsed value type.
static boolean delimiterNext(byte[] bytes, int startPos, byte[] delim) - Checks if the delimiter starts at the given start position of the byte array.
static boolean endsWithDelimiter(byte[] bytes, int endPos, byte[] delim) - Checks if the given bytes end with the delimiter at the given end position.
Charset getCharset() - Gets the character set used for this parser.
FieldParser.ParseErrorState getErrorState() - Gets the error state of the parser, as a value of the enumeration FieldParser.ParseErrorState.
abstract T getLastResult() - Gets the parsed field.
static <T> Class<FieldParser<T>> getParserForType(Class<T> type) - Gets the parser for the type specified by the given class.
protected int nextStringEndPos(byte[] bytes, int startPos, int limit, byte[] delimiter) - Returns the end position of a string.
protected static int nextStringLength(byte[] bytes, int startPos, int length, char delimiter) - Returns the length of a string.
protected abstract int parseField(byte[] bytes, int startPos, int limit, byte[] delim, T reuse) - Each parser's logic should be implemented inside this method.
int resetErrorStateAndParse(byte[] bytes, int startPos, int limit, byte[] delim, T reuse) - Parses the value of a field from the byte array, taking care to properly reset the state of this parser.
protected void resetParserState() - Resets the state of the parser.
void setCharset(Charset charset) - Sets the character set used for this parser.
protected void setErrorState(FieldParser.ParseErrorState error) - Sets the error state of the parser.
-
-
-
Method Detail
-
resetErrorStateAndParse
public int resetErrorStateAndParse(byte[] bytes, int startPos, int limit, byte[] delim, T reuse)
Parses the value of a field from the byte array, taking care to properly reset the state of this parser. The start position within the byte array and the array's valid length are given. The content of the value is delimited by a field delimiter.
- Parameters:
bytes - The byte array that holds the value.
startPos - The index where the field starts.
limit - The limit up to which the byte contents are valid for the parser. The limit is the position one after the last valid byte.
delim - The field delimiter character.
reuse - An optional reusable field to hold the value.
- Returns:
- The index of the next delimiter, if the field was parsed correctly. A value less than 0 otherwise.
-
parseField
protected abstract int parseField(byte[] bytes, int startPos, int limit, byte[] delim, T reuse)
Each parser's logic should be implemented inside this method.
-
resetParserState
protected void resetParserState()
Resets the state of the parser. Called as the very first method inside resetErrorStateAndParse(byte[], int, int, byte[], Object). By default, it only resets the error state.
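The call order described here (reset first, then delegate to the subclass) can be sketched as a template method. This is a standalone illustration, not Flink's source: `SketchFieldParser`, its two-value `ErrorState` enum, and `NonEmptyParser` are hypothetical names.

```java
/** Hypothetical base class mirroring the documented call order. */
abstract class SketchFieldParser<T> {
    /** Illustrative error states only. */
    enum ErrorState { NONE, FAILED }

    private ErrorState errorState = ErrorState.NONE;

    /** Public entry point: clears state left over from the previous call, then delegates. */
    final int resetErrorStateAndParse(byte[] bytes, int startPos, int limit, byte[] delim, T reuse) {
        resetParserState();
        return parseField(bytes, startPos, limit, delim, reuse);
    }

    /** Subclasses put their parsing logic here. */
    protected abstract int parseField(byte[] bytes, int startPos, int limit, byte[] delim, T reuse);

    /** Default behavior: only the error state is cleared. Subclasses may reset more. */
    protected void resetParserState() {
        errorState = ErrorState.NONE;
    }

    protected void setErrorState(ErrorState error) {
        errorState = error;
    }

    ErrorState getErrorState() {
        return errorState;
    }
}

/** Toy subclass that fails on empty input, so that the reset becomes observable. */
final class NonEmptyParser extends SketchFieldParser<StringBuilder> {
    @Override
    protected int parseField(byte[] bytes, int startPos, int limit, byte[] delim, StringBuilder reuse) {
        if (startPos >= limit) {
            setErrorState(ErrorState.FAILED);
            return -1;
        }
        return limit;
    }
}
```

Because the reset happens in the public entry point, a failed parse does not leak its error state into the next call, and subclasses do not have to remember to clear it themselves.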
-
getLastResult
public abstract T getLastResult()
Gets the parsed field. This method returns the value parsed by the last successful invocation of parseField(byte[], int, int, byte[], Object). If objects are mutable and reused, it returns the object instance that was passed to the parse function.
- Returns:
- The latest parsed field.
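The reuse behavior can be made concrete with a standalone sketch. `MutableText` and `ReusingTextParser` are hypothetical names and are not part of Flink; the sketch only assumes that a parser for a mutable type writes into the caller's object and hands that same instance back.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical mutable holder, playing the role of a reusable value object. */
final class MutableText {
    String value = "";
}

/** Hypothetical parser that writes into the caller's reuse object when one is given. */
final class ReusingTextParser {
    private MutableText lastResult;

    /** Copies bytes[startPos, limit) into the target object and remembers it as the last result. */
    int parseField(byte[] bytes, int startPos, int limit, MutableText reuse) {
        MutableText target = (reuse != null) ? reuse : new MutableText();
        target.value = new String(bytes, startPos, limit - startPos, StandardCharsets.UTF_8);
        lastResult = target;
        return limit;
    }

    /** Returns the same instance that the last parse call filled in. */
    MutableText getLastResult() {
        return lastResult;
    }
}
```

The practical consequence is that a caller who keeps a reference to an earlier result will see it overwritten by the next parse, so values that must outlive the next call need to be copied.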
-
createValue
public abstract T createValue()
Returns an instance of the parsed value type.- Returns:
- An instance of the parsed value type.
-
delimiterNext
public static final boolean delimiterNext(byte[] bytes, int startPos, byte[] delim)
Checks if the delimiter starts at the given start position of the byte array.
Attention: This method assumes that enough characters follow the start position for the delimiter check!
- Parameters:
bytes - The byte array that holds the value.
startPos - The index of the byte array where the check for the delimiter starts.
delim - The delimiter to check for.
- Returns:
- true if a delimiter starts at the given start position, false otherwise.
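The following standalone sketch shows one way such a check can be written, together with a caller-side guard for the precondition in the note above. It is an illustration of the documented contract rather than Flink's code; `DelimiterNextSketch` and `delimiterNextWithin` are hypothetical names.

```java
/** Hypothetical re-implementation of the documented contract, for illustration only. */
final class DelimiterNextSketch {
    /** Byte-wise comparison. Assumes startPos + delim.length <= bytes.length. */
    static boolean delimiterNext(byte[] bytes, int startPos, byte[] delim) {
        for (int pos = 0; pos < delim.length; pos++) {
            if (delim[pos] != bytes[startPos + pos]) {
                return false;
            }
        }
        return true;
    }

    /** Guarded variant: only runs the check when the whole delimiter fits before limit. */
    static boolean delimiterNextWithin(byte[] bytes, int startPos, int limit, byte[] delim) {
        return startPos + delim.length <= limit && delimiterNext(bytes, startPos, delim);
    }
}
```

Without the guard, a multi-byte delimiter checked near the end of the array would read past the last element and raise an `ArrayIndexOutOfBoundsException`.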
-
endsWithDelimiter
public static final boolean endsWithDelimiter(byte[] bytes, int endPos, byte[] delim)
Checks if the given bytes end with the delimiter at the given end position.
- Parameters:
bytes - The byte array that holds the value.
endPos - The index of the byte array where the check for the delimiter ends.
delim - The delimiter to check for.
- Returns:
- true if a delimiter ends at the given end position, false otherwise.
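A standalone sketch of a backward-looking delimiter check follows. It assumes that endPos is the inclusive index of the delimiter's last byte, which the description above does not state explicitly; `EndsWithDelimiterSketch` is a hypothetical name and the code is not taken from Flink.

```java
/** Hypothetical re-implementation, assuming endPos is the inclusive index of the delimiter's last byte. */
final class EndsWithDelimiterSketch {
    static boolean endsWithDelimiter(byte[] bytes, int endPos, byte[] delim) {
        if (endPos < delim.length - 1) {
            return false; // not enough bytes up to endPos to hold the whole delimiter
        }
        int first = endPos - delim.length + 1; // index where the delimiter would have to start
        for (int pos = 0; pos < delim.length; pos++) {
            if (delim[pos] != bytes[first + pos]) {
                return false;
            }
        }
        return true;
    }
}
```

Unlike the forward check, this variant can guard its own lower bound, because the number of bytes available before endPos is known from endPos alone.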
-
setErrorState
protected void setErrorState(FieldParser.ParseErrorState error)
Sets the error state of the parser. Called by subclasses of the parser to set the type of error when failing a parse.
- Parameters:
error - The error state to set.
-
getErrorState
public FieldParser.ParseErrorState getErrorState()
Gets the error state of the parser, as a value of the enumeration FieldParser.ParseErrorState. If no error occurred, the error state will be FieldParser.ParseErrorState.NONE.
- Returns:
- The current error state of the parser.
-
nextStringEndPos
protected final int nextStringEndPos(byte[] bytes, int startPos, int limit, byte[] delimiter)
Returns the end position of a string. Sets the error state if the column is empty.
- Returns:
- the end position of the string or -1 if an error occurred
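A scan of this kind can be sketched as follows. The sketch is standalone and hypothetical: `StringEndPosSketch` is not a Flink class, and its boolean `emptyColumn` flag stands in for the parser's error state.

```java
/** Hypothetical re-implementation of the documented behavior, for illustration only. */
final class StringEndPosSketch {
    /** Stands in for the parser's error state; true after an empty column was seen. */
    boolean emptyColumn;

    /** Returns the index of the first delimiter at or after startPos, or limit if none is found. */
    int nextStringEndPos(byte[] bytes, int startPos, int limit, byte[] delimiter) {
        emptyColumn = false;
        int endPos = startPos;
        // Positions at or beyond delimLimit are too close to limit for the delimiter to fit.
        int delimLimit = limit - delimiter.length + 1;
        while (endPos < limit) {
            if (endPos < delimLimit && matches(bytes, endPos, delimiter)) {
                break;
            }
            endPos++;
        }
        if (endPos == startPos) {
            emptyColumn = true; // nothing between startPos and the delimiter (or limit)
            return -1;
        }
        return endPos;
    }

    private static boolean matches(byte[] bytes, int startPos, byte[] delimiter) {
        for (int pos = 0; pos < delimiter.length; pos++) {
            if (delimiter[pos] != bytes[startPos + pos]) {
                return false;
            }
        }
        return true;
    }
}
```

The `delimLimit` bound keeps the delimiter comparison from reading past `limit` when a multi-byte delimiter is used.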
-
nextStringLength
protected static final int nextStringLength(byte[] bytes, int startPos, int length, char delimiter)
Returns the length of a string. Throws an exception if the column is empty.
- Returns:
- the length of the string
-
getCharset
public Charset getCharset()
Gets the character set used for this parser.
- Returns:
- the charset used for this parser.
-
setCharset
public void setCharset(Charset charset)
Sets the character set used for this parser.
- Parameters:
charset - The charset used for this parser.
-
getParserForType
public static <T> Class<FieldParser<T>> getParserForType(Class<T> type)
Gets the parser for the type specified by the given class. Returns null if no parser for that class is known.
- Parameters:
type - The class of the type to get the parser for.
- Returns:
- The parser for the given type, or null, if no such parser exists.
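Because the lookup returns a Class rather than a parser instance, and may return null, callers typically need a null check followed by instantiation. The standalone sketch below illustrates that pattern with a hypothetical registry; `SketchParser`, `SketchIntegerParser`, `ParserRegistrySketch`, and `newParserFor` are illustrative names and not part of Flink, and the use of a no-argument constructor is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical parser interface used only by this sketch. */
interface SketchParser<T> {
    T parse(String text);
}

/** Hypothetical parser for Integer values. */
final class SketchIntegerParser implements SketchParser<Integer> {
    @Override
    public Integer parse(String text) {
        return Integer.valueOf(text);
    }
}

/** Hypothetical registry that maps a value type to the class of its parser. */
final class ParserRegistrySketch {
    private static final Map<Class<?>, Class<? extends SketchParser<?>>> PARSERS = new HashMap<>();

    static {
        PARSERS.put(Integer.class, SketchIntegerParser.class);
    }

    /** Returns the parser class for the given type, or null if none is registered. */
    @SuppressWarnings("unchecked")
    static <T> Class<SketchParser<T>> getParserForType(Class<T> type) {
        Class<?> parser = PARSERS.get(type);
        return (Class<SketchParser<T>>) parser;
    }

    /** Looks up the parser class and instantiates it; returns null for unknown types. */
    static <T> SketchParser<T> newParserFor(Class<T> type) {
        Class<SketchParser<T>> parserClass = getParserForType(type);
        if (parserClass == null) {
            return null;
        }
        try {
            return parserClass.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not instantiate " + parserClass.getName(), e);
        }
    }
}
```

Returning the class instead of an instance lets each caller create its own parser, which matters when parsers hold per-call state such as an error state or a last result.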
-
-