<Default Package>
Type string
UTF-8-encoded string type.
Enclose string literals in double quotes. Values of the string type are sequences of non-null Unicode characters encoded in UTF-8 format. Note that UTF-8 is a variable-width encoding and a character can occupy from 1 to 4 bytes of storage. The characters in the 7-bit ASCII character set are a subset of UTF-8 and occupy a single byte each.
Although string types are discussed as though they are primitive types, they are actually reference types. However, EPL's string objects are immutable. For example, a statement such as s:=s+" suffix"; creates a new string object and changes the variable s to refer to that new string object. Any other references to the old value continue to point to the old value.
Operations that can return a different string value, such as concatenation, case folding, or trimming white space, always create new strings rather than modifying the existing value in place. The previous value's storage is recovered later by the EPL runtime garbage collector.
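The reference semantics described above can be sketched as follows. This is a minimal illustration; the monitor name ImmutableStringSketch and the variable names are hypothetical.

```epl
monitor ImmutableStringSketch {
    action onload() {
        string s := "prefix";
        string t := s;           // t refers to the same string object as s
        s := s + " suffix";      // builds a new string; s now refers to it
        print s;                 // prefix suffix
        print t;                 // prefix (the old value is unchanged)
    }
}
```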
The length of a string is limited by the memory available at runtime, which can be multiple gigabytes. In practice, you are unlikely to exceed the limit in a single string.
Special characters are encoded with a backslash (\) as follows:
\" | double quote |
\\ | backslash |
\n | newline |
\r | carriage return |
\t | tab |
The following operators are supported with strings:
< | Less-than string comparison |
> | Greater-than string comparison |
<= | Less-than or equal string comparison |
>= | Greater-than or equal string comparison |
= | Equal string comparison |
!= | Not-equal string comparison |
+ | String concatenation |
When you compare two strings for equality, the result is true if the strings are the same length and each character in one string is identical to the corresponding character at the same position in the other string.
When you compare two strings for less than or greater than, the characters in the strings are compared pairwise according to the numerical values of their Unicode code points. The comparison is case-sensitive, so capital letters are not equal to their lowercase equivalents. Characters earlier in the character set sort before characters later in the character set. To order two unequal strings, the earliest difference is considered. For example, "abcXdef" sorts earlier than "abcYdef", "abc" sorts earlier than "abcXYZ"; the empty string sorts earliest of all.
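The ordering rules above can be sketched as follows. The monitor and variable names are illustrative only, and the expected results follow from the code point ordering described in the text.

```epl
monitor StringComparisonSketch {
    action onload() {
        boolean earlierChar := "abcXdef" < "abcYdef";    // earliest difference: X sorts before Y
        boolean shorterPrefix := "abc" < "abcXYZ";       // a prefix sorts before the longer string
        boolean emptyFirst := "" < "a";                  // the empty string sorts earliest of all
        boolean upperFirst := "Z" < "a";                 // U+005A precedes U+0061
        boolean sameIgnoringCase := ("Apple" = "apple"); // comparison is case-sensitive
        print earlierChar.toString();       // true
        print shorterPrefix.toString();     // true
        print emptyFirst.toString();        // true
        print upperFirst.toString();        // true
        print sameIgnoringCase.toString();  // false
    }
}
```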
The default value of a string is the empty string ("").
Strings can be parsed, routed and compared. They are not cyclic.
canParse
boolean static canParse(string s)
Check if the string argument can be parsed as a string.
- Parameters:
- s - The string to test for parseability.
- Returns:
- True if the string could be parsed as a string, false otherwise.
- See Also:
- - See the parse method for what is parseable.
parse
string static parse(string s)
Parses a quoted string, in the format used in the EPL representation of events.
The parse method takes a string in the form used for event files, for example "foo \\ bar \\n baz". The argument must be enclosed in double quotes, and any newlines, tabs and backslashes inside it must be escaped.
- Parameters:
- s - The string to parse. Must be enclosed in double quotes and follow the string escaping rules.
- Returns:
- The parsed string.
- Throws:
- ParseException if s cannot be parsed as a string.
- See Also:
- - quote - Creates a quoted string that can be parsed with this method.
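A minimal sketch of how canParse, parse and quote fit together, assuming a version that provides quote() (10.15.4.0 or later). The monitor and variable names are hypothetical, and the commented results are what the descriptions in this reference imply.

```epl
monitor ParseQuoteSketch {
    action onload() {
        string raw := "foo \\ bar";       // the text: foo \ bar (one backslash)
        string quoted := raw.quote();     // adds surrounding quotes and escapes the backslash
        print string.canParse(quoted).toString();  // true
        print string.canParse(raw).toString();     // false: not enclosed in double quotes
        string roundTrip := string.parse(quoted);
        boolean same := (roundTrip = raw);
        print same.toString();            // true
    }
}
```

Checking canParse first avoids having to handle the ParseException that parse throws for malformed input.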
clone
string clone()
Get a new reference to this string.
Because strings are immutable, clone() returns another reference to the same string and does not create a copy.
- Returns:
- A reference to the same string.
contains
boolean contains(string searchString)
Determine whether this string contains the specified substring.
- Parameters:
- searchString - The string to search for.
- Returns:
- True if this string contains searchString, false otherwise.
- Since:
- 10.15.4.0
- See Also:
- - find - To return the index at which the string is contained.
endsWith
boolean endsWith(string searchString)
Determine whether this string ends with the specified substring.
- Parameters:
- searchString - The string to search for.
- Returns:
- Returns true if this string ends with the specified searchString.
- Since:
- 10.15.4.0
find
integer find(string searchString)
Locate a string within this string.
- Parameters:
- searchString - The string to search for.
- Returns:
- The index (starting from 0) of the first character of searchString within this string. Returns -1 if searchString is not found.
- See Also:
- - If you want to search using a regular expression rather than a string literal.
- - contains - To find out whether the string is contained without returning the index.
findFrom
integer findFrom(string searchString, integer fromIndex)
Locate a string within this string from a starting index.
findFrom behaves like find, but it begins searching from the specified index.
- Parameters:
- searchString - The string to search for.
- fromIndex - The index in the string to start searching from.
- Returns:
- The index of the first character of searchString within this string. Returns -1 if searchString is not found after fromIndex.
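A short sketch of find and findFrom used together to visit every occurrence of a substring. The monitor name, variable names and sample text are illustrative only.

```epl
monitor FindSketch {
    action onload() {
        string text := "one,two,three";
        integer first := text.find(",");                  // 3
        integer second := text.findFrom(",", first + 1);  // 7
        integer missing := text.find(";");                // -1: not found
        print first.toString();
        print second.toString();
        print missing.toString();

        // Count the occurrences by restarting the search just past each hit.
        integer count := 0;
        integer pos := text.find(",");
        while pos >= 0 {
            count := count + 1;
            pos := text.findFrom(",", pos + 1);
        }
        print count.toString();                           // 2
    }
}
```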
groupSearch
sequence<string> groupSearch(string regex)
Find the first regular expression match in this string, and return a list of the matched (...) groups.
This is useful for extracting interesting data from a string, for example:
print "Today Bob met Eve".groupSearch("([a-zA-Z]+) met ([a-zA-Z]+)").toString(); // Prints ["Bob", "Eve"]
// If there are no matches the result is an empty sequence
print "Goodbye Bob".groupSearch("Hello Bob").isEmpty().toString(); // true
// The sequence.getOr() method can be used to provide a default value in case there is no match
print "Goodbye Bob".groupSearch("Hello ([a-zA-Z]+)").getOr(0, "???"); // "???"
print "Hello Bob" .groupSearch("Hello ([a-zA-Z]+)").getOr(0, "someone"); // "Bob"
// Optional ? sections produce "" if missing
print "Hello Bob and Eve" .groupSearch("Hello ([a-zA-Z]+)( and [a-zA-Z]+)?( and [a-zA-Z]+)?").toString(); // ["Bob", " and Eve", ""]
print "Hello Bob and Alice and Eve".groupSearch("Hello ([a-zA-Z]+)( and [a-zA-Z]+)?( and [a-zA-Z]+)?").toString(); // ["Bob", " and Alice", " and Eve"]
// Non-capturing (?: ) groups do not show up in the returned sequence
print "Hello Bob and Eve".groupSearch("(?:Hello) ([a-zA-Z]+)(?: and )?([a-zA-Z]+)?").toString(); // ["Bob", "Eve"]
print "Hello Bob" .groupSearch("(?:Hello) ([a-zA-Z]+)(?: and )?([a-zA-Z]+)?").toString(); // ["Bob", ""]
- Parameters:
- regex - The regular expression to search for.
- Returns:
- A sequence with an item for each (...) capturing group in the expression, or an empty sequence if there was no match. Non-capturing groups such as (?:...) are ignored. Optional groups such as (foo|bar)? are always included in the returned sequence, with an empty string if the group did not match.
- Throws:
- IllegalArgumentException if the regular expression is invalid.
- Since:
- 10.15.4.0
- See Also:
- - See the documentation on matches for regular expression syntax.
- - search - To return all matches rather than just the first (but without any group information).
hash
integer hash()
Get an integer hash representation of the underlying object.
This function returns an integer evenly distributed over the whole integer range, suitable for partitioning or indexing. Multiple different objects can resolve to the same hash value.
- Returns:
- An integer representation of the underlying object.
intern
string intern()
Mark the string it is called on as interned.
Subsequent incoming events that contain a string that is identical to an interned string use the same string object.
The benefit of using the intern() method is that it reduces the amount of memory used and the amount of work the garbage collector must do. A disadvantage is that you cannot free memory used for an interned string.
If there are a limited number of strings that will be used many times, then calling intern() on these strings speeds the handling of events that use them. You might want to call intern() on the names of products or stock symbols, which are all used frequently. For example, invoking "APMA".intern() might make sense if you are expecting a large number of incoming events of the form Tick("APMA", ...). You would not want to call intern() on order IDs because there are so many and each one is likely to be unique.
Calling intern() on a string is a global operation. That is, all contexts can then use the same string object. Any strings already in use by the correlator are not affected, even if they match the string intern() is called on.
If you use correlator persistence, the set of strings that have been interned is stored in the recovery datastore, so there is no need to call intern() again after a restart.
The interned version of the string is returned. The original will be garbage collected when all references to it have been dropped.
- Returns:
- The interned version of the string.
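A minimal sketch of interning a small, fixed set of frequently used strings at startup. The symbol list and all names here are hypothetical; only "APMA" comes from the text above.

```epl
monitor InternSketch {
    action onload() {
        // Hypothetical symbols expected to recur in many incoming events.
        sequence<string> symbols := ["APMA", "SOFT"];
        string symbol;
        for symbol in symbols {
            string interned := symbol.intern();  // keep using the returned, interned reference
            print interned;
        }
    }
}
```

Interning in onload means the strings are registered before matching events start to arrive, which is when the memory saving applies.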
isEmpty
boolean isEmpty()
Determine whether the string has no characters.
- Returns:
- true if the length is 0.
- Since:
- 10.15.4.0
join
string join(sequence<string> args)
Concatenate the sequence argument using this string as a separator.
For example:
string s := ", ".join(seq);
- Parameters:
- args - The sequence to join.
- Returns:
- A string with args joined together separated by this string.
length
integer length()
Get the length of the string.
- Returns:
- The length of the string.
ltrim
string ltrim()
Strip whitespace from the start of the string.
Whitespace characters are spaces, newlines and tabs.
- Returns:
- A copy of the string where all the whitespace characters at the start have been removed.
matches
boolean matches(string regex)
Test whether the string matches the specified regular expression.
EPL uses IBM's International Components for Unicode (ICU) to implement regular expressions. You can use all of the regular expressions that are described in the ICU User Guide with the regular expression methods of this type as described at https://unicode-org.github.io/icu/userguide/strings/regexp.html. Other than the ICU regular expression syntax, Apama provides the additional option (!g) for the replace method, which allows you to replace all matches rather than just the first one. This option must be the first part of the regular expression.
For example:
print "zeroPlusMatchesXX onePlusMatchesYY".matches("zeroPlusMatches.* onePlusMatches[a-zA-Z]+").toString(); // true
print "optionalNumber=123.456 alternative2".matches("(optionalNumber=[0-9.]+)? (alternate1|alternative2)").toString(); // true
// Regular expressions and EPL string literals both use backslash for escaping special characters, so a double backslash
// is needed. Some regular expression characters can also be escaped with [] to avoid the double backslash.
print "escaped.char escaped.char escaped\\char".matches("escaped\\.char escaped[.]char escaped\\\\char").toString(); // true
print "theStart containedAnywhereInString theEnd".matches(".*containedAnywhereInString.*").toString(); // true
print "theStart and theEnd".matches("^theStart .* theEnd$").toString(); // true
- Parameters:
- regex - The regular expression to compare with.
- Returns:
- True if the entire string matches the given regular expression, false otherwise.
- Throws:
- IllegalArgumentException if the regular expression is invalid.
- See Also:
- - groupSearch - To get each part (group) of the first match.
- - search - To get a list of all matches.
quote
string quote()
Adds quotes and escaping to this string, in the format used in the EPL representation of events.
For example, the string foo \ bar \n baz would be quoted as "foo \\ bar \\n baz".
This method is a convenient way to quote text that will be logged or displayed to the user, where it is often desirable to avoid any possibility of newline characters.
- Returns:
- The string with added backslash escape sequences and surrounded by double quotes.
- Since:
- 10.15.4.0
- See Also:
- - parse - Parses strings in the quoted string format returned by this method.
replace
string replace(string regex, string replacement)
Search and replace a regular expression within this string.
By default, only the first matching substring is replaced. If the regular expression starts with (!g), then all matching substrings are replaced instead.
The replacement string can contain references to parts of the matched expression. Matching (...) groups are referred to with $1 for the first group, $2 for the second, and so on.
For example:
// Prints "Price1 = 1.20 (USD), Price2 = USD3.50" - just replaces the first occurrence
print "Price1 = USD1.20, Price2 = USD3.50".replace("([A-Z]+)([0-9.]+)", "$2 ($1)");
// Prints "Price1 = 1.20 (USD), Price2 = 3.50 (USD)" - replaces all, due to the !g global flag
print "Price1 = USD1.20, Price2 = USD3.50".replace("(!g)([A-Z]+)([0-9.]+)", "$2 ($1)");
// Prints "myfile" - remove suffix if present
print "myfile.ext".replace("[.]ext$", "");
// Prints "myfile.ext" - remove prefix if present
print "/foo/bar/myfile.ext".replace("^.*[/]", "");
- Parameters:
- regex - The regular expression to search for.
- replacement - The replacement string, which may contain placeholders such as $1, $2, etc.
- Returns:
- A copy of the string with string(s) matching regex replaced by replacement.
- Throws:
- IllegalArgumentException if the regular expression is invalid.
- See Also:
- - See the documentation on matches for regular expression syntax.
- - replaceAll - If you wish to replace string literals rather than regular expressions.
replaceAll
string replaceAll(string searchString, string replacement)
Search and replace string literals within this string.
Searches for each instance of searchString in the string and creates a copy of the string with each one replaced with the replacement string.
- Parameters:
- searchString - The string to search for. This is a string literal not a regular expression.
- replacement - The string to replace searchString with.
- Returns:
- A copy of the string with each instance of searchString replaced with replacement.
- Throws:
- IllegalArgumentException if replacement is an empty string.
- See Also:
- - replace - If you wish to replace regular expressions rather than string literals.
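The difference between the literal replaceAll and the regular-expression replace can be sketched as follows. The monitor name and sample string are illustrative, and the commented results follow from the descriptions above.

```epl
monitor ReplaceSketch {
    action onload() {
        string path := "a.b.c";
        // replaceAll treats "." as a literal dot and replaces every occurrence.
        print path.replaceAll(".", "/");       // a/b/c
        // replace treats its first argument as a regular expression, so the dot
        // is written as [.] here; only the first match is replaced by default.
        print path.replace("[.]", "/");        // a/b.c
        // The (!g) prefix makes replace act on every match.
        print path.replace("(!g)[.]", "/");    // a/b/c
    }
}
```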
rfind
integer rfind(string searchString)
Locate a string within this string starting from the right/end of the string.
- Parameters:
- searchString - The string to search for.
- Returns:
- The index (starting from 0) of the first character of the last occurrence of searchString within this string. Returns -1 if searchString is not found.
- Since:
- 10.15.4.0
- See Also:
- - find - If you want to search from the left rather than the right.
rtrim
string rtrim()
Strip whitespace from the end of the string.
Whitespace characters are spaces, newlines and tabs.
- Returns:
- A copy of the string where all the whitespace characters at the end have been removed.
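No method listed here trims both ends in one call, so a common pattern is to chain ltrim and rtrim. A minimal sketch, with hypothetical names:

```epl
monitor TrimSketch {
    action onload() {
        string padded := "  \thello \n";           // leading spaces and tab, trailing space and newline
        string trimmed := padded.ltrim().rtrim();  // each call returns a new string
        print trimmed;                             // hello
        print trimmed.length().toString();         // 5
    }
}
```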
search
sequence<string> search(string regex)
Find all the substrings matching a specified regular expression.
- Parameters:
- regex - The regular expression to search for.
- Returns:
- A sequence of each (non-overlapping) match for the regular expression within this string.
Note that this method returns matches for the entire regex (not for any regex groups that may be present), so the length of the sequence is equal to the number of substrings that matched the regex, and is not affected by any regex groups, unlike the groupSearch() action.
For example:
print "Hello Bob, Hello Alice, Hello Eve".search("Hello ([a-zA-Z]+)").toString(); // ["Hello Bob", "Hello Alice", "Hello Eve"]
- Throws:
- IllegalArgumentException if the regular expression is invalid.
- See Also:
- - See the documentation on matches for regular expression syntax.
- - groupSearch - To return just the first match, but with extraction of each regular expression (...) group within the expression.
split
sequence<string> split(string input)
Split a string using a delimiter.
For example: sequence<string> items := ",".split("a,b,c");
Returns a sequence of the strings that result from splitting the input string on every occurrence of the delimiter string that the method is called on. The size of the returned sequence is always one more than the total number of occurrences of the delimiter string. Consecutive delimiters in the input string result in empty strings in the returned sequence. The split() method is useful for separating a string that contains newline characters into individual lines or for dividing comma-separated values in a single string into multiple strings.
- Parameters:
- input - The string which should be split.
- Returns:
- A sequence containing the input string split using this string as a delimiter.
- Throws:
- IllegalArgumentException if attempted with an empty delimiter.
- See Also:
- - This method performs the inverse of join.
- - tokenize has slightly different behavior.
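A short sketch of split and join as inverse operations, including the empty element produced by consecutive delimiters. All names and the sample input are illustrative.

```epl
monitor SplitJoinSketch {
    action onload() {
        // Two delimiters in the input, so the result has three elements: "a", "" and "b".
        sequence<string> parts := ",".split("a,,b");
        print parts.size().toString();        // 3
        print parts[1].length().toString();   // 0: the empty string between the two commas
        // Joining with the same delimiter reproduces the original input.
        print ",".join(parts);                // a,,b
        // Joining with a different separator re-delimits the values.
        print "-".join(parts);                // a--b
    }
}
```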
startsWith
boolean startsWith(string searchString)
Determine whether this string starts with the specified substring.
- Parameters:
- searchString - The string to search for.
- Returns:
- Returns true if this string starts with the specified searchString.
- Since:
- 10.15.4.0
substring
string substring(integer start, integer end)
Extract part of this string.
The parameters indicate the position of the first and last characters of the substring, the first being inclusive, while the second is exclusive. If a parameter is a positive value, it is taken to be the position of a character going from left to right, counting upwards from 0. If a parameter is a negative value, it is taken to be the position of a character going from right to left, counting downwards from -1.
Examples:
"abcde".substring( 1, 4 ); // "bcd"
"abcde".substring( 1, -1 ); // "bcd"
"abcde".substring( -2, -1 ); // "d"
"abcde".substring( 2, 6 ); // throws IndexOutOfBoundsException
"abcde".substring( -6, -3 ); // throws IndexOutOfBoundsException
- Parameters:
- start - The first character, inclusive. To start from the beginning use 0. To specify the position relative to the end of the string use a negative number e.g. -2 to begin 2 characters from the end.
- end - The last character, exclusive. To specify the position relative to the end of the string use a negative number e.g. -1 to end just before the final character.
- Returns:
- A new string containing the specified range from this string.
- Throws:
- IndexOutOfBoundsException if the magnitude of start or end is larger than the length of the original string.
- See Also:
- - substringFrom - To start from a specified character and go to the end of the string.
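Because out-of-range indexes throw IndexOutOfBoundsException, callers often check the length first. The sketch below shows one way to do that; the prefix action and all other names are hypothetical helpers, not part of the string API.

```epl
monitor SubstringSketch {
    // Hypothetical helper: returns at most the first n characters of s.
    action prefix(string s, integer n) returns string {
        if n >= s.length() then {
            return s;                  // avoids passing an out-of-range end index
        }
        return s.substring(0, n);
    }

    action onload() {
        string s := "abcde";
        print s.substring(0, 2);       // ab
        print s.substring(-3, -1);     // cd: from 'c' up to, but not including, 'e'
        print prefix(s, 3);            // abc
        print prefix(s, 10);           // abcde
    }
}
```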
substringFrom
string substringFrom(integer start)
Extract part of this string, starting at a specified character and ending at the end of the string.
The parameter indicates the position of the first character to include. If the parameter is a positive value, it is taken to be the position of a character going from left to right, counting upwards from 0. If the parameter is a negative value, it is taken to be the position of a character going from right to left, counting downwards from -1.
Examples:
"abcde".substringFrom( 1 ); // "bcde"
"abcde".substringFrom( -1 ); // "e"
"abcde".substringFrom( -3 ); // "cde"
- Parameters:
- start - The first character, inclusive. To start from the beginning use 0. To specify the position relative to the end of the string use a negative number e.g. -1 to extract just the final character.
- Returns:
- A new string containing the specified range from this string.
- Throws:
- IndexOutOfBoundsException if the magnitude of start is larger than the length of the original string.
- Since:
- 10.15.4.0
- See Also:
- - substring - To specify both the start and end characters.
toBoolean
boolean toBoolean()
Convert the string to a boolean.
This method is case-sensitive.
- Returns:
- True if the string is "true", false otherwise.
- See Also:
- - Parse as a boolean for case-insensitivity.
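A brief sketch of the case-sensitive behavior described above. The monitor name is illustrative, and the results follow from the rule that only the exact string "true" converts to true.

```epl
monitor ToBooleanSketch {
    action onload() {
        print "true".toBoolean().toString();   // true
        print "True".toBoolean().toString();   // false: the comparison is case-sensitive
        print "yes".toBoolean().toString();    // false: anything other than "true"
    }
}
```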
toDecimal
decimal toDecimal()
Convert the string to a decimal.
Returns a decimal representation of the string if the string starts with one or more numeric characters. The numeric characters can optionally include a decimal point or exponent symbol. Conversion stops at the first non-numeric character. Returns 0.0 if the string does not start with numeric characters.
- Returns:
- The decimal representation of the string, or 0.0 if the string does not conform to the above conditions.
- See Also:
- - decimal.parse for a more strict method to parse decimals.
toFloat
float toFloat()
Convert the string to a float.
Returns a float representation of the string if the string starts with one or more numeric characters. The numeric characters can optionally include a decimal point or exponent symbol. Conversion stops at the first non-numeric character. Returns 0.0 if the string does not start with numeric characters.
- Returns:
- The float representation of the string, or 0.0 if the string does not conform to the above conditions.
- See Also:
- - float.parse for a more strict method to parse floats.
toInteger
integer toInteger()
Convert the string to an integer.
The string this method is invoked on should be of the form:
[PREFIX][SIGN][BASE]INTEGER[SUFFIX]
Where:
PREFIX is zero or more whitespace characters (space, tab).
SIGN is zero or one sign characters (+ or -).
BASE is either empty (for base 10), 0b/0B for base 2 (binary) or 0x/0X for base 16 (hex).
INTEGER is a sequence of one or more digits according to the base (i.e. 0 or 1 for base 2, 0-9 for base 10 and 0-F for base 16).
SUFFIX is zero or more other characters (whitespace, letters, symbols, digits outside the allowed set).
- Returns:
- The integer representation of the string, or 0 if the string does not conform to the above conditions.
- See Also:
- - integer.parse for a more strict method to parse integers.
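The lenient format above can be sketched with a few inputs. The monitor name and sample strings are illustrative, and the commented values are what the PREFIX/SIGN/BASE/INTEGER/SUFFIX rules imply.

```epl
monitor ToIntegerSketch {
    action onload() {
        print "42".toInteger().toString();         // 42
        print "  -7".toInteger().toString();       // -7: leading whitespace, then a sign
        print "12 apples".toInteger().toString();  // 12: the suffix is ignored
        print "0x1F".toInteger().toString();       // 31: base 16
        print "0b101".toInteger().toString();      // 5: base 2
        print "apples".toInteger().toString();     // 0: does not match the format
    }
}
```

Because a non-conforming string yields 0 rather than an error, integer.parse is the better fit when invalid input must be detected.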
tokenize
sequence<string> tokenize(string input)
Split a string into a sequence using an arbitrary set of delimiters.
Returns a sequence of all the non-empty strings (tokens) that result from splitting the input string on occurrences of any character from the string that the method is called on. The returned sequence never contains any empty strings, and will have no elements if the input string is empty or contains only delimiters. The tokenize() method is useful for extracting words from whitespace-separated text.
- Parameters:
- input - The string to tokenize.
- Returns:
- A sequence of strings containing the tokenized values.
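A minimal sketch contrasting tokenize with split on the same kind of input. The names and sample text are hypothetical.

```epl
monitor TokenizeSketch {
    action onload() {
        // Each character of the receiver (" \t") is a delimiter: space or tab.
        sequence<string> words := " \t".tokenize("  the quick\tbrown  fox ");
        print words.size().toString();    // 4: the, quick, brown, fox
        print words[0];                   // the

        // split keeps the empty strings that tokenize drops.
        sequence<string> fields := " ".split("a  b");
        print fields.size().toString();   // 3: "a", "" and "b"
    }
}
```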
toLower
string toLower()
Convert the string to lowercase.
- Returns:
- A copy of the string with all the characters converted to lowercase.
toString
string toString()
Return a reference to this string.
Unlike quote(), this method does not escape or enclose the string in quotes so the output is unsuitable for passing to string.parse.
- Returns:
- The string, verbatim.
- See Also:
toUpper
string toUpper()
Convert the string to uppercase.
- Returns:
- A copy of the string with all the characters converted to uppercase.