GLOBAL
Definitions of built-in functions related to string processing and manipulation.
Namespace: | GLOBAL |
---|---|
Source File: | /scripts/base/bif/strings.bif.bro |
cat_string_array : function &deprecated |
Concatenates all elements in an array of strings. |
cat_string_array_n : function &deprecated |
Concatenates a specific range of elements in an array of strings. |
clean : function |
Replaces non-printable characters in a string with escaped sequences. |
edit : function |
Returns an edited version of a string that applies a special “backspace character” (usually \x08 for backspace or \x7f for DEL). |
escape_string : function |
Replaces non-printable characters in a string with escaped sequences. |
find_all : function |
Finds all occurrences of a pattern in a string. |
find_last : function |
Finds the last occurrence of a pattern in a string. |
gsub : function |
Substitutes a given replacement string for all occurrences of a pattern in a given string. |
hexdump : function |
Returns a hex dump for given input data. |
is_ascii : function |
Determines whether a given string contains only ASCII characters. |
join_string_array : function &deprecated |
Joins all values in the given array of strings with a separator placed between each element. |
join_string_vec : function |
Joins all values in the given vector of strings with a separator placed between each element. |
levenshtein_distance : function |
Calculates the Levenshtein distance between the two strings. |
reverse : function |
Returns a reversed copy of the string. |
sort_string_array : function &deprecated |
Sorts an array of strings. |
split : function &deprecated |
Splits a string into an array of strings according to a pattern. |
split1 : function &deprecated |
Splits a string once into a two-element array of strings according to a pattern. |
split_all : function &deprecated |
Splits a string into an array of strings according to a pattern. |
split_n : function &deprecated |
Splits a string a given number of times into an array of strings according to a pattern. |
split_string : function |
Splits a string into an array of strings according to a pattern. |
split_string1 : function |
Splits a string once into a two-element array of strings according to a pattern. |
split_string_all : function |
Splits a string into an array of strings according to a pattern. |
split_string_n : function |
Splits a string a given number of times into an array of strings according to a pattern. |
str_shell_escape : function |
Takes a string and escapes characters that would allow execution of commands at the shell level. |
str_smith_waterman : function |
Uses the Smith-Waterman algorithm to find similar/overlapping substrings. |
str_split : function |
Splits a string into substrings with the help of an index vector of cutting points. |
strcmp : function |
Lexicographically compares two strings. |
string_cat : function |
Concatenates all arguments into a single string. |
string_fill : function |
Generates a string of a given size and fills it with repetitions of a source string. |
string_to_ascii_hex : function |
Returns an ASCII hexadecimal representation of a string. |
strip : function |
Strips whitespace at both ends of a string. |
strstr : function |
Locates the first occurrence of one string in another. |
sub : function |
Substitutes a given replacement string for the first occurrence of a pattern in a given string. |
sub_bytes : function |
Get a substring from a string, given a starting position and length. |
subst_string : function |
Substitutes each (non-overlapping) appearance of a string in another. |
to_lower : function |
Replaces all uppercase letters in a string with their lowercase counterpart. |
to_string_literal : function |
Replaces non-printable characters in a string with escaped sequences. |
to_upper : function |
Replaces all lowercase letters in a string with their uppercase counterpart. |
cat_string_array
Type: | function (a: string_array ) : string |
---|---|
Attributes: | &deprecated |
Concatenates all elements in an array of strings.
A: | The string_array (table[count] of string ). |
---|---|
Returns: | The concatenation of all elements in a. |
See also: cat, cat_sep, string_cat, cat_string_array_n, fmt, join_string_vec, join_string_array
cat_string_array_n
Type: | function (a: string_array , start: count , end: count ) : string |
---|---|
Attributes: | &deprecated |
Concatenates a specific range of elements in an array of strings.
A: | The string_array (table[count] of string ). |
---|---|
Start: | The array index of the first element of the range. |
End: | The array index of the last element of the range. |
Returns: | The concatenation of the range [start, end] in a. |
See also: cat, string_cat, cat_string_array, fmt, join_string_vec, join_string_array
clean
Type: | function (str: string ) : string |
---|
Replaces non-printable characters in a string with escaped sequences. The mappings are:
- values not in [32, 126] to \xXX
If the string does not yet have a trailing NUL, one is added internally.
In contrast to escape_string, this encoding is not fully reversible.
Str: | The string to escape. |
---|---|
Returns: | The escaped string. |
See also: to_string_literal, escape_string
edit
Type: | function (arg_s: string , arg_edit_char: string ) : string |
---|
Returns an edited version of a string that applies a special “backspace character” (usually \x08 for backspace or \x7f for DEL).
For example, edit("hello there", "e") returns "llo t".
Arg_s: | The string to edit. |
---|---|
Arg_edit_char: | A string of exactly one character that represents the “backspace character”. If it is longer than one character, Bro generates a run-time error and uses only the first character in the string. |
Returns: | An edited version of arg_s where arg_edit_char triggers the deletion of the last character. |
See also: clean, to_string_literal, escape_string, strip
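The backspace behavior described for edit can be modeled in a few lines. The following is a minimal Python sketch of the documented semantics, not Bro's implementation; the assumption that an edit character with nothing before it is simply dropped is inferred from the "hello there" example.

```python
def edit(arg_s: str, arg_edit_char: str) -> str:
    """Python model of the documented behavior of edit (not Bro code)."""
    edit_char = arg_edit_char[0]  # per the docs, only the first character is used
    out = []
    for ch in arg_s:
        if ch == edit_char:
            if out:
                out.pop()  # the edit character deletes the previous character
        else:
            out.append(ch)
    return "".join(out)


print(edit("hello there", "e"))    # llo t
print(edit("ab\x08c", "\x08"))     # ac
```

Each "e" in "hello there" removes the character before it, which is why both "h" characters and the "r" disappear along with every "e".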
escape_string
Type: | function (s: string ) : string |
---|
Replaces non-printable characters in a string with escaped sequences. The mappings are:
- values not in [32, 126] to \xXX
- \ to \\
In contrast to clean, this encoding is fully reversible.
S: | The string to escape. |
---|---|
Returns: | The escaped string. |
See also: clean, to_string_literal
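To see why escaping the backslash makes the difference between a reversible and a non-reversible encoding, the two mappings can be modeled side by side. This is a Python sketch of the documented mappings only; the lowercase hex digits are an assumption, and the internal trailing-NUL handling of clean is not modeled.

```python
def clean(data: bytes) -> str:
    # Bytes outside [32, 126] become \xXX; backslashes pass through unchanged.
    return "".join(chr(b) if 32 <= b <= 126 else f"\\x{b:02x}" for b in data)


def escape_string(data: bytes) -> str:
    # Same mapping, except that a literal backslash is doubled.
    out = []
    for b in data:
        if b == 0x5C:
            out.append("\\\\")
        elif 32 <= b <= 126:
            out.append(chr(b))
        else:
            out.append(f"\\x{b:02x}")
    return "".join(out)


raw_byte = b"\x01"    # one non-printable byte
literal = b"\\x01"    # four printable characters: \ x 0 1

print(clean(raw_byte) == clean(literal))                  # True: the inputs collide
print(escape_string(raw_byte) == escape_string(literal))  # False: still distinguishable
```

Because the clean model maps two different inputs to the same output, the original cannot be recovered from it, whereas the doubled backslash keeps the two cases apart.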
find_all
Type: | function (str: string , re: pattern ) : string_set |
---|
Finds all occurrences of a pattern in a string.
Str: | The string to inspect. |
---|---|
Re: | The pattern to look for in str. |
Returns: | The set of strings in str that match re, or the empty set. |
find_last
Type: | function (str: string , re: pattern ) : string |
---|
Finds the last occurrence of a pattern in a string. This function returns the match that starts at the largest index in the string, which is not necessarily the longest match. For example, a pattern of /.*/ will return the final character in the string.
Str: | The string to inspect. |
---|---|
Re: | The pattern to look for in str. |
Returns: | The last string in str that matches re, or the empty string. |
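The "largest starting index" rule can be illustrated with a short Python model. It is only a sketch of the documented behavior: Python's re engine stands in for Bro patterns, and the two may differ in how they choose among matches at the same position.

```python
import re


def find_last(s: str, pattern: str) -> str:
    """Return the match that starts at the largest index, or "" if none."""
    regex = re.compile(pattern)
    # Try start positions from the end of the string toward the beginning.
    for start in range(len(s) - 1, -1, -1):
        m = regex.match(s, start)
        if m:
            return m.group()
    return ""


print(find_last("abc", r".*"))        # c
print(find_last("a1b22", r"[0-9]+"))  # 2
```

In the second call the result is "2" rather than "22", because the match beginning at the last character starts later than the longer one.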
gsub
Type: | function (str: string , re: pattern , repl: string ) : string |
---|
Substitutes a given replacement string for all occurrences of a pattern in a given string.
Str: | The string to perform the substitution in. |
---|---|
Re: | The pattern being replaced with repl. |
Repl: | The string that replaces re. |
Returns: | A copy of str with all occurrences of re replaced with repl. |
See also: sub, subst_string
hexdump
Type: | function (data_str: string ) : string |
---|
Returns a hex dump for given input data. The hex dump renders 16 bytes per line, with hex on the left and ASCII (where printable) on the right.
Data_str: | The string to dump in hex format. |
---|---|
Returns: | The hex dump of the given string. |
See also: string_to_ascii_hex, bytestring_to_hexstr
Note
Based on Netdude’s hex editor code.
is_ascii
Type: | function (str: string ) : bool |
---|
Determines whether a given string contains only ASCII characters.
Str: | The string to examine. |
---|---|
Returns: | False if any byte value of str is greater than 127, and true otherwise. |
join_string_array
Type: | function (sep: string , a: string_array ) : string |
---|---|
Attributes: | &deprecated |
Joins all values in the given array of strings with a separator placed between each element.
Sep: | The separator to place between each element. |
---|---|
A: | The string_array (table[count] of string ). |
Returns: | The concatenation of all elements in a, with sep placed between each element. |
See also: cat, cat_sep, string_cat, cat_string_array, cat_string_array_n, fmt, join_string_vec
join_string_vec
Type: | function (vec: string_vec , sep: string ) : string |
---|
Joins all values in the given vector of strings with a separator placed between each element.
Sep: | The separator to place between each element. |
---|---|
Vec: | The string_vec (vector of string ). |
Returns: | The concatenation of all elements in vec, with sep placed between each element. |
See also: cat, cat_sep, string_cat, cat_string_array, cat_string_array_n, fmt, join_string_array
levenshtein_distance
Type: | function (s1: string , s2: string ) : count |
---|
Calculates the Levenshtein distance between the two strings. See Wikipedia for more information.
S1: | The first string. |
---|---|
S2: | The second string. |
Returns: | The Levenshtein distance of two strings as a count. |
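For readers unfamiliar with the metric, the Levenshtein distance is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. The following Python sketch uses the standard dynamic-programming formulation; it illustrates the metric and is not taken from Bro's source.

```python
def levenshtein_distance(s1: str, s2: str) -> int:
    # prev[j] holds the distance between the processed prefix of s1 and s2[:j].
    prev = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        curr = [i]  # distance from s1[:i] to the empty string
        for j, c2 in enumerate(s2, start=1):
            cost = 0 if c1 == c2 else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


print(levenshtein_distance("kitten", "sitting"))  # 3
```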
reverse
Type: | function (str: string ) : string |
---|
Returns a reversed copy of the string.
Str: | The string to reverse. |
---|---|
Returns: | A reversed copy of str. |
sort_string_array
Type: | function (a: string_array ) : string_array |
---|---|
Attributes: | &deprecated |
Sorts an array of strings.
A: | The string_array (table[count] of string ). |
---|---|
Returns: | A sorted copy of a. |
See also: sort
split
Type: | function (str: string , re: pattern ) : string_array |
---|---|
Attributes: | &deprecated |
Splits a string into an array of strings according to a pattern.
Str: | The string to split. |
---|---|
Re: | The pattern describing the element separator in str. |
Returns: | An array of strings where each element corresponds to a substring in str separated by re. |
See also: split1, split_all, split_n, str_split, split_string1, split_string_all, split_string_n
Note
The returned table starts at index 1. Note that conceptually the return value is meant to be a vector and this might change in the future.
split1
Type: | function (str: string , re: pattern ) : string_array |
---|---|
Attributes: | &deprecated |
Splits a string once into a two-element array of strings according to a pattern. This function is the same as split, but str is only split once (if possible) at the earliest position and an array of two strings is returned.
Str: | The string to split. |
---|---|
Re: | The pattern describing the separator to split str in two pieces. |
Returns: | An array of strings with two elements in which the first represents the substring in str up to the first occurrence of re, and the second everything after re. An array of one string is returned when str cannot be split. |
See also: split, split_all, split_n, str_split, split_string, split_string_all, split_string_n
split_all
Type: | function (str: string , re: pattern ) : string_array |
---|---|
Attributes: | &deprecated |
Splits a string into an array of strings according to a pattern. This function is the same as split, except that the separators are returned as well. For example, split_all("a-b--cd", /(\-)+/) returns {"a", "-", "b", "--", "cd"}: odd-indexed elements do not match the pattern and even-indexed ones do.
Str: | The string to split. |
---|---|
Re: | The pattern describing the element separator in str. |
Returns: | An array of strings where each two successive elements correspond to a substring in str of the part not matching re (odd-indexed) and the part that matches re (even-indexed). |
See also: split, split1, split_n, str_split, split_string, split_string1, split_string_n
split_n
Type: | function (str: string , re: pattern , incl_sep: bool , max_num_sep: count ) : string_array |
---|---|
Attributes: | &deprecated |
Splits a string a given number of times into an array of strings according to a pattern. This function is similar to split1 and split_all, but with customizable behavior with respect to including separators in the result and the number of times to split.
Str: | The string to split. |
---|---|
Re: | The pattern describing the element separator in str. |
Incl_sep: | A flag indicating whether to include the separator matches in the result (as in split_all). |
Max_num_sep: | The number of times to split str. |
Returns: | An array of strings where, if incl_sep is true, each two successive elements correspond to a substring in str of the part not matching re (odd-indexed) and the part that matches re (even-indexed). |
See also: split, split1, split_all, str_split, split_string, split_string1, split_string_all
split_string
Type: | function (str: string , re: pattern ) : string_vec |
---|
Splits a string into an array of strings according to a pattern.
Str: | The string to split. |
---|---|
Re: | The pattern describing the element separator in str. |
Returns: | An array of strings where each element corresponds to a substring in str separated by re. |
See also: split_string1, split_string_all, split_string_n, str_split
split_string1
Type: | function (str: string , re: pattern ) : string_vec |
---|
Splits a string once into a two-element array of strings according to a pattern. This function is the same as split_string, but str is only split once (if possible) at the earliest position and an array of two strings is returned.
Str: | The string to split. |
---|---|
Re: | The pattern describing the separator to split str in two pieces. |
Returns: | An array of strings with two elements in which the first represents the substring in str up to the first occurrence of re, and the second everything after re. An array of one string is returned when str cannot be split. |
See also: split_string, split_string_all, split_string_n, str_split
split_string_all
Type: | function (str: string , re: pattern ) : string_vec |
---|
Splits a string into an array of strings according to a pattern. This function is the same as split_string, except that the separators are returned as well. For example, split_string_all("a-b--cd", /(\-)+/) returns {"a", "-", "b", "--", "cd"}: odd-indexed elements do match the pattern and even-indexed ones do not.
Str: | The string to split. |
---|---|
Re: | The pattern describing the element separator in str. |
Returns: | An array of strings where each two successive elements correspond to a substring in str of the part not matching re (even-indexed) and the part that matches re (odd-indexed). |
See also: split_string, split_string1, split_string_n, str_split
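The odd/even wording differs between split_all and split_string_all only because the deprecated table-based functions start at index 1 while vectors start at index 0. The Python sketch below models both results for the documented example; re.split with a capturing group stands in for the Bro pattern match, and the pattern is assumed to contain no capture groups of its own.

```python
import re


def split_string_all(s: str, pattern: str) -> list:
    # Wrapping the pattern in a capturing group makes re.split keep separators.
    return re.split(f"({pattern})", s)


vec = split_string_all("a-b--cd", r"-+")
print(vec)  # ['a', '-', 'b', '--', 'cd']

# 0-based vector (split_string_all): separators sit at odd indices.
print(vec[1::2])  # ['-', '--']

# 1-based table (split_all): the same separators sit at even indices.
table = {i + 1: v for i, v in enumerate(vec)}
print([table[k] for k in sorted(table) if k % 2 == 0])  # ['-', '--']
```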
split_string_n
Type: | function (str: string , re: pattern , incl_sep: bool , max_num_sep: count ) : string_vec |
---|
Splits a string a given number of times into an array of strings according to a pattern. This function is similar to split_string1 and split_string_all, but with customizable behavior with respect to including separators in the result and the number of times to split.
Str: | The string to split. |
---|---|
Re: | The pattern describing the element separator in str. |
Incl_sep: | A flag indicating whether to include the separator matches in the result (as in split_string_all). |
Max_num_sep: | The number of times to split str. |
Returns: | An array of strings where, if incl_sep is true, each two successive elements correspond to a substring in str of the part not matching re (even-indexed) and the part that matches re (odd-indexed). |
See also: split_string, split_string1, split_string_all, str_split
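The interaction of incl_sep and max_num_sep can be sketched with Python's re.split, which offers a comparable maxsplit argument. This is a model of the documented parameters rather than Bro's code. In this model a max_num_sep of 0 means "no limit" because that is how maxsplit behaves in Python; this reference does not state how Bro treats 0. The pattern is assumed to contain no capture groups.

```python
import re


def split_string_n(s: str, pattern: str, incl_sep: bool, max_num_sep: int) -> list:
    # A capturing group makes re.split return the separator matches as well.
    regex = re.compile(f"({pattern})" if incl_sep else pattern)
    return regex.split(s, maxsplit=max_num_sep)


print(split_string_n("a-b--cd", r"-+", False, 1))  # ['a', 'b--cd']
print(split_string_n("a-b--cd", r"-+", True, 1))   # ['a', '-', 'b--cd']
print(split_string_n("a-b--cd", r"-+", False, 2))  # ['a', 'b', 'cd']
```

With incl_sep false and one split, the result has the same shape as split_string1; with incl_sep true and enough splits, it matches split_string_all.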
str_shell_escape
Type: | function (source: string ) : string |
---|
Takes a string and escapes characters that would allow execution of commands at the shell level. Must be used before including strings in system or similar calls.
Source: | The string to escape. |
---|---|
Returns: | A shell-escaped version of source. |
See also: system
str_smith_waterman
Type: | function (s1: string , s2: string , params: sw_params ) : sw_substring_vec |
---|
Uses the Smith-Waterman algorithm to find similar/overlapping substrings. See Wikipedia.
S1: | The first string. |
---|---|
S2: | The second string. |
Params: | Parameters for the Smith-Waterman algorithm. |
Returns: | The result of the Smith-Waterman algorithm calculation. |
str_split
Type: | function (s: string , idx: index_vec ) : string_vec |
---|
Splits a string into substrings with the help of an index vector of cutting points.
S: | The string to split. |
---|---|
Idx: | The index vector (vector of count ) with the cutting points. |
Returns: | A vector of strings. |
strcmp
Type: | function (s1: string , s2: string ) : int |
---|
Lexicographically compares two strings.
S1: | The first string. |
---|---|
S2: | The second string. |
Returns: | An integer greater than, equal to, or less than 0, depending on whether s1 is greater than, equal to, or less than s2. |
string_cat
Type: | function (va_args: any ) : string |
---|
Concatenates all arguments into a single string. The function takes a variable number of arguments of type string and stitches them together.
Returns: | The concatenation of all (string) arguments. |
---|
See also: cat, cat_sep, cat_string_array, cat_string_array_n, fmt, join_string_vec, join_string_array
string_fill
Type: | function (len: int , source: string ) : string |
---|
Generates a string of a given size and fills it with repetitions of a source string.
Len: | The length of the output string. |
---|---|
Source: | The string to concatenate repeatedly until len has been reached. |
Returns: | A string of length len filled with source. |
string_to_ascii_hex
Type: | function (s: string ) : string |
---|
Returns an ASCII hexadecimal representation of a string.
S: | The string to convert to hex. |
---|---|
Returns: | A copy of s where each byte is replaced with its two-digit hexadecimal representation (one hex digit per nibble). |
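As an illustration of the conversion, each input byte yields two hex digits, so the output is twice as long as the input. The Python sketch below models this; the lowercase digits are an assumption about the output format.

```python
def string_to_ascii_hex(data: bytes) -> str:
    # Two hex digits per byte, high nibble first.
    return "".join(f"{b:02x}" for b in data)


print(string_to_ascii_hex(b"foo"))  # 666f6f
```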
strip
Type: | function (str: string ) : string |
---|
Strips whitespace at both ends of a string.
Str: | The string to strip the whitespace from. |
---|---|
Returns: | A copy of str with leading and trailing whitespace removed. |
strstr
Type: | function (big: string , little: string ) : count |
---|
Locates the first occurrence of one string in another.
Big: | The string to look in. |
---|---|
Little: | The (smaller) string to find inside big. |
Returns: | The location of little in big, or 0 if little is not found in big. |
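Because 0 signals "not found", the returned location is 1-based. A minimal Python model of that convention, for non-empty search strings, could look like this (a sketch of the documented return value, not Bro's implementation):

```python
def strstr(big: str, little: str) -> int:
    # str.find returns a 0-based index or -1, so adding 1 gives
    # a 1-based position, with 0 meaning "not found".
    return big.find(little) + 1


print(strstr("hello world", "world"))  # 7
print(strstr("hello world", "xyz"))    # 0
```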
sub
Type: | function (str: string , re: pattern , repl: string ) : string |
---|
Substitutes a given replacement string for the first occurrence of a pattern in a given string.
Str: | The string to perform the substitution in. |
---|---|
Re: | The pattern being replaced with repl. |
Repl: | The string that replaces re. |
Returns: | A copy of str with the first occurrence of re replaced with repl. |
See also: gsub, subst_string
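The difference between sub and gsub is only how many matches are replaced. The Python sketch below models both with re.sub; Python regular expressions stand in for Bro patterns, and the replacement is passed through a function so that it is inserted literally.

```python
import re


def sub(s: str, pattern: str, repl: str) -> str:
    # count=1 limits the substitution to the first match.
    return re.sub(pattern, lambda m: repl, s, count=1)


def gsub(s: str, pattern: str, repl: str) -> str:
    # Without a count, every non-overlapping match is replaced.
    return re.sub(pattern, lambda m: repl, s)


print(sub("a-b-c", r"-", "+"))   # a+b-c
print(gsub("a-b-c", r"-", "+"))  # a+b+c
```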
sub_bytes
Type: | function (s: string , start: count , n: int ) : string |
---|
Gets a substring from a string, given a starting position and length.
S: | The string to obtain a substring from. |
---|---|
Start: | The starting position of the substring in s, where 1 is the first character. As a special case, 0 also represents the first character. |
N: | The number of characters to extract, beginning at start. |
Returns: | A substring of s of length n from position start. |
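The 1-based start position, with 0 treated as an alias for the first character, can be modeled as follows. This Python sketch covers only non-negative lengths; how Bro handles a negative n is not described here, so the model rejects it.

```python
def sub_bytes(s: str, start: int, n: int) -> str:
    if n < 0:
        raise ValueError("negative lengths are not modeled in this sketch")
    # Positions are 1-based; 0 is accepted as a synonym for position 1.
    begin = max(start - 1, 0)
    return s[begin:begin + n]


print(sub_bytes("hello", 1, 2))  # he
print(sub_bytes("hello", 0, 2))  # he
print(sub_bytes("hello", 2, 3))  # ell
```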
subst_string
Type: | function (s: string , from: string , to: string ) : string |
---|
Substitutes each (non-overlapping) appearance of a string in another.
S: | The string in which to perform the substitution. |
---|---|
From: | The string to look for which is replaced with to. |
To: | The string that replaces all occurrences of from in s. |
Returns: | A copy of s where each occurrence of from is replaced with to. |
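"Non-overlapping" means that after a match is replaced, the search resumes after that match rather than inside it. Python's str.replace follows the same rule, so it serves as a small model here (a sketch of the documented behavior for a non-empty search string; from_ is used as the parameter name because from is a Python keyword):

```python
def subst_string(s: str, from_: str, to: str) -> str:
    # str.replace scans left to right and never re-examines replaced text.
    return s.replace(from_, to)


print(subst_string("a.b.c", ".", "::"))  # a::b::c
print(subst_string("aaa", "aa", "b"))    # ba
```

In the second call only one "aa" is replaced: the first two characters match, and the remaining single "a" is too short to match again.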
to_lower
Type: | function (str: string ) : string |
---|
Replaces all uppercase letters in a string with their lowercase counterpart.
Str: | The string to convert to lowercase letters. |
---|---|
Returns: | A copy of the given string with the uppercase letters (as indicated by isascii and isupper) folded to lowercase (via tolower). |
to_string_literal
Type: | function (str: string ) : string |
---|
Replaces non-printable characters in a string with escaped sequences. The mappings are:
- values not in [32, 126] to \xXX
- \ to \\
- ' and " to \' and \", respectively.
Str: | The string to escape. |
---|---|
Returns: | The escaped string. |
See also: clean, escape_string
to_upper
Type: | function (str: string ) : string |
---|
Replaces all lowercase letters in a string with their uppercase counterpart.
Str: | The string to convert to uppercase letters. |
---|---|
Returns: | A copy of the given string with the lowercase letters (as indicated by isascii and islower) folded to uppercase (via toupper). |