NOTUVY regex is a wrapper around the standard Java java.util.regex package. It adds no new functionality, but it makes writing sophisticated regex logic simpler and cleaner than with the standard package alone.
A regex evaluation consists of two parts -- the String input, and the regular expression Pattern. We say a PatternMatcher performs a *match*. This is an operation *on* an input string *using* a Pattern.
There are several operations for which patterns are used. The standard framework supports two.
One is replacement, where the entire original string is transformed into a new result string. The other is extraction, where individual string values are parsed from the original string and returned as separate values.
These different operations are encapsulated in the ResultStrategy, where there are two replacement operations and two extraction operations.
The NOTUVY regex framework adds a third operation, composition, which is almost a hybrid of the other two. This takes the original string, extracts individual parts from it, and combines them into a single result string.
With NOTUVY regex, for both replacement and composition, the final result string is computed and stored, and can be retrieved with the result() method in PatternMatcher.
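To make the difference between extraction and composition concrete, here is a minimal sketch of both done by hand with plain java.util.regex. It does not use the NOTUVY API; the class name, method names, and date pattern are hypothetical, and returning an empty string when nothing matches is an assumption made only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration using only java.util.regex (not the NOTUVY API).
class CompositionSketch {
    private static final Pattern DATE = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");

    // Extraction: the captured parts are returned as separate values.
    static List<String> extract(String input) {
        List<String> parts = new ArrayList<>();
        Matcher m = DATE.matcher(input);
        if (m.find()) {
            for (int g = 1; g <= m.groupCount(); g++) {
                parts.add(m.group(g));
            }
        }
        return parts;
    }

    // Composition: the extracted parts are combined into one result string.
    // Text outside the match is discarded. Assumption: "" when nothing matches.
    static String compose(String input) {
        List<String> p = extract(input);
        return p.isEmpty() ? "" : p.get(2) + "/" + p.get(1) + "/" + p.get(0);
    }

    public static void main(String[] args) {
        String input = "released on 2024-03-15 (final)";
        System.out.println(extract(input)); // [2024, 03, 15]
        System.out.println(compose(input)); // 15/03/2024
    }
}
```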
In the standard framework, there are two replacement operations: replaceFirst and replaceAll. These are implemented on the Matcher class (and also in String).
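For reference, the two standard operations look like this in plain java.util.regex. The class name, helper methods, and sample input are hypothetical and serve only as an illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo of the two standard replacement operations.
class StandardReplaceDemo {
    private static final Pattern DIGITS = Pattern.compile("[0-9]+");

    // Replaces only the first run of digits.
    static String first(String input) {
        Matcher m = DIGITS.matcher(input);
        return m.replaceFirst("#");
    }

    // Replaces every run of digits.
    static String all(String input) {
        return DIGITS.matcher(input).replaceAll("#");
    }

    public static void main(String[] args) {
        System.out.println(first("a1b22c333")); // a#b22c333
        System.out.println(all("a1b22c333"));   // a#b#c#
        // String offers the same operations but compiles the pattern on every call.
        System.out.println("a1b22c333".replaceAll("[0-9]+", "#")); // a#b#c#
    }
}
```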
In NOTUVY regex, the same two operations are implemented in the Strategy class with the first() and all() methods.
All potential exceptions generated within this framework are unchecked exceptions. The majority of these come directly from java.util.regex. However, a few originate from NOTUVY regex in the form of a PatternMatcherException:
The following code shows problematic error handling:
PatternMatcher pm = PatternMatcher.createOn("input String");
if (pm.using("([A-Z]").found()) { ... }
The problem is that the pattern is malformed (no closing parenthesis matching the opening one). Because the pattern is not compiled until runtime, the resulting PatternSyntaxException will not be thrown until that line of code is executed. If the code is located in an infrequently executed branch, it may remain hidden until an inopportune time.
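The same late failure can be reproduced with plain java.util.regex. In this sketch (the class and method names are hypothetical) the malformed pattern sits in a branch that is only taken when a flag is set, so the error stays invisible until that branch runs.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo: a malformed pattern hidden in a rarely executed branch.
class LateSyntaxErrorDemo {
    static boolean rarelyTaken(boolean flag) {
        if (flag) {
            // The pattern is compiled only when this line runs, so the
            // PatternSyntaxException is thrown only when flag is true.
            return Pattern.compile("([A-Z]").matcher("input String").find();
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(rarelyTaken(false)); // false, no error reported
        try {
            rarelyTaken(true);
        } catch (PatternSyntaxException e) {
            System.out.println("Failed only now: " + e.getDescription());
        }
    }
}
```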
This can be avoided by using static compilation of the patterns. This way, the error is discovered immediately at class load time. There are two ways to achieve this. First, keep the logic the same but use a pre-compiled pattern:
private static final Pattern PAT = Pattern.compile("([A-Z]");
PatternMatcher pm = PatternMatcher.createOn("input String");
if (pm.using(PAT).found()) { ... }
Alternatively, the PatternMatcher itself can be made static:
private static final PatternMatcher PM = PatternMatcher.createUsing("([A-Z]");
if (PM.on("input String").found()) { ... }
In both of these cases the pattern syntax error still exists. The advantage is that it will be reported immediately, rather than remaining buried.
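The effect of static compilation can be seen with plain java.util.regex. In the sketch below (class names are hypothetical) the malformed pattern is a static final field of a nested class. The first use of that class triggers its static initializer, which fails with an ExceptionInInitializerError wrapping the PatternSyntaxException, no matter which method was being called.

```java
import java.util.regex.Pattern;

// Hypothetical demo: a malformed static pattern fails when its class initializes.
class EagerSyntaxErrorDemo {
    static class Holder {
        // Compiled once, when Holder is initialized.
        static final Pattern PAT = Pattern.compile("([A-Z]");

        static boolean matches(String s) {
            return PAT.matcher(s).find();
        }
    }

    // Touches Holder for the first time and returns the cause of the failure,
    // or null if initialization succeeded. Intended to be called once: later
    // uses of Holder fail with NoClassDefFoundError instead.
    static Throwable loadHolder() {
        try {
            Holder.matches("input String");
            return null;
        } catch (ExceptionInInitializerError e) {
            return e.getCause();
        }
    }

    public static void main(String[] args) {
        System.out.println(loadHolder());
    }
}
```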
Statically declared PatternMatcher instances can be problematic because they are not thread-safe.
private static final PatternMatcher PM = PatternMatcher.createUsing("([A-Z])");
if (PM.on("input String").found()) { ... }
The problem with this logic is that PatternMatcher has internal state, so if multiple threads try to access this variable, they will interfere with each other. The correct way to do this is to make the variable immutable:
private static final PatternMatcher PM_FACTORY = PatternMatcher.createUsing("([A-Z])").immutable();
PatternMatcher pattern = PM_FACTORY.on("input String");
if (pattern.found()) { ... }
This makes the logic thread-safe. It works because the call to on() now returns a clone of the original object, so each thread creates its own copy to operate on.
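The standard package follows a comparable split: Pattern is immutable and safe to share, while Matcher holds per-match state and is not. The sketch below uses only java.util.regex and java.util.concurrent; the class name, method names, and pool size are hypothetical. Each task creates its own Matcher from one shared Pattern, which is the same idea as taking a fresh clone from an immutable factory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: shared immutable Pattern, one Matcher per call.
class PerCallMatcherSketch {
    private static final Pattern UPPER = Pattern.compile("([A-Z])");

    // A new Matcher per call means no mutable state is shared between threads.
    static String firstUpper(String input) {
        Matcher m = UPPER.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    // Runs firstUpper on every input in a small thread pool, keeping input order.
    static List<String> runConcurrently(List<String> inputs) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String in : inputs) {
                futures.add(pool.submit(() -> firstUpper(in)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently(Arrays.asList("input String", "aBc", "xyZ")));
        // [S, B, Z]
    }
}
```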
Be aware that the new object should be captured as a separate variable if results are to be extracted from it in a later step. Note that the following code is erroneous because it attempts to extract the result value from the immutable instance.
private static final PatternMatcher PM = PatternMatcher.createUsing("([A-Z])", Strategy.extractGroup(1)).immutable();
if (PM.on("input String").found()) { System.out.println(PM.result()); }
In standard Java string pattern replacement, it is simple to perform extraction and replacement. Consider the following example, where the pattern searches for the first complete word that does not contain the letter "s" and places square brackets around that word in the returned string. The character class also excludes non-word characters (\W); without that, the pattern would match the first space instead of a word.
"this is the test".replaceFirst("\\b[^sS\\W]+\\b", "[$0]");
This produces the result: "this is [the] test"
This extracts a string value and uses it unchanged in the result. However, if we want to transform it (change or convert it somehow), the replacement-string syntax of the standard framework offers no way to do so. NOTUVY regex supports this through StringTransformable.
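For comparison, the closest plain java.util.regex equivalent is a hand-written appendReplacement/appendTail loop. The sketch below is not the StringTransformable API; the class and method names are hypothetical. Its pattern excludes non-word characters with \W so that it matches a whole word rather than whitespace. It upper-cases the first word that contains no "s" and wraps it in square brackets.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: transforming a matched value with plain java.util.regex.
class ManualTransformSketch {
    // A run of word characters, none of which is 's' or 'S'.
    private static final Pattern WORD_WITHOUT_S = Pattern.compile("\\b[^sS\\W]+\\b");

    static String upperCaseFirstMatch(String input) {
        Matcher m = WORD_WITHOUT_S.matcher(input);
        StringBuffer out = new StringBuffer();
        if (m.find()) {
            // Transform the extracted value before putting it back.
            String transformed = "[" + m.group().toUpperCase(Locale.ROOT) + "]";
            // quoteReplacement keeps '$' and '\' in the value from being
            // interpreted as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(transformed));
        }
        m.appendTail(out); // copies the rest (or everything, if nothing matched)
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(upperCaseFirstMatch("this is the test")); // this is [THE] test
    }
}
```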