public final class KeyRegexpFilter extends ScanFilter
The regular expression will be applied on the server-side, on the row key. Rows for which the key doesn't match will not be returned to the scanner, which can be useful to carefully select which rows are matched when you can't just do a prefix match, and cut down the amount of data transfered on the network.
Don't use an expensive regular expression, because Java's implementation uses backtracking and matching will happen on the server side, potentially on many many row keys. See Regular Expression Matching Can Be Simple And Fast for more details on regular expression performance (or lack thereof) and what "backtracking" means.
This means you need to be careful about using regular expressions supplied by users as that would allow them to easily DDoS HBase by sending prohibitively expensive regexps that would consume all CPU cycles and cause the entire HBase node to time out.
Constructor and Description |
---|
KeyRegexpFilter(byte[] regexp)
Sets a regular expression to filter results based on the row key.
|
KeyRegexpFilter(byte[] regexp,
Charset charset)
Sets a regular expression to filter results based on the row key.
|
KeyRegexpFilter(String regexp)
Sets a regular expression to filter results based on the row key.
|
KeyRegexpFilter(String regexp,
Charset charset)
Sets a regular expression to filter results based on the row key.
|
public KeyRegexpFilter(String regexp)
This is equivalent to calling KeyRegexpFilter(String, Charset)
with the ISO-8859-1 charset in argument.
regexp
- The regular expression with which to filter the row keys.public KeyRegexpFilter(String regexp, Charset charset)
regexp
- The regular expression with which to filter the row keys.charset
- The charset used to decode the bytes of the row key into a
string. The RegionServer must support this charset, otherwise it will
unexpectedly close the connection the first time you attempt to use this
scanner.KeyRegexpFilter(byte[], Charset)
public KeyRegexpFilter(byte[] regexp)
This is equivalent to calling KeyRegexpFilter(byte[], Charset)
with the ISO-8859-1 charset in argument.
regexp
- The binary regular expression with which to filter
the row keys.public KeyRegexpFilter(byte[] regexp, Charset charset)
regexp
- The regular expression with which to filter the row keys.charset
- The charset used to decode the bytes of the row key into a
string. The RegionServer must support this charset, otherwise it will
unexpectedly close the connection the first time you attempt to use this
scanner.