When using RegEx in Java, you may need to treat one or more metacharacters as ordinary characters. As a reminder, the metacharacters in a Java RegEx are:
([{\^$|)?*+.
If you want to treat them as ordinary characters you have two options:
- Escape the metacharacter with a backslash (\),
- Enclose the whole string that contains metacharacters within \Q and \E.
\Q means “quote all characters until \E”, while \E ends the quoted section.
The following example will hopefully clarify the subject:
String test = "I want to replace the . with the ,";
String replaced = test.replaceAll(".", ",");
System.out.println(replaced);
What do you expect the code above to do? Do you think the following string will be displayed?
I want to replace the , with the ,
If so, you might be surprised to find that what you actually get is:
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
The problem is that the replaceAll method of the String class accepts a RegEx as its first parameter.
Since . means any character, writing test.replaceAll(".", ","); translates to: “Replace ANY character of the test string with a comma”. As I said previously, you can fix that in two ways.
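To make the behaviour of the unescaped dot concrete, here is a minimal sketch. The class name DotMatchesAnything and the helper replaceWithUnescapedDot are made up for this illustration; only String.replaceAll and Pattern.matches come from the JDK.

```java
import java.util.regex.Pattern;

class DotMatchesAnything {
    // Replaces every match of the regex "." with a comma.
    // An unescaped dot matches any character except line terminators.
    static String replaceWithUnescapedDot(String input) {
        return input.replaceAll(".", ",");
    }

    public static void main(String[] args) {
        String test = "I want to replace the . with the ,";
        String replaced = replaceWithUnescapedDot(test);

        // Every character was matched, so the result is as long as the input.
        System.out.println(replaced.length() == test.length()); // true

        // The pattern a.c accepts any single character between 'a' and 'c',
        // not only a literal dot.
        System.out.println(Pattern.matches("a.c", "abc")); // true
        System.out.println(Pattern.matches("a.c", "a.c")); // true
    }
}
```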
Either you escape the . with a \ or you enclose it within \Q and \E. What I didn’t say is that, since the \ is a metacharacter itself (and also the escape character in Java string literals), you need to escape it too by writing \\ in the source code. :-)
Translating this into Java, you have:
test.replaceAll("\\.", ",");
test.replaceAll("\\Q.\\E", ",");
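For completeness, the two fixes can be put into a small runnable program. This is only a sketch: the class name EscapeDotDemo and the two helper methods are invented for the example. Note that the regex backslashes are doubled in the Java source.

```java
class EscapeDotDemo {
    // Option 1: escape the dot. The Java literal "\\." is the regex \.
    static String withBackslash(String input) {
        return input.replaceAll("\\.", ",");
    }

    // Option 2: quote the dot. The Java literal "\\Q.\\E" is the regex \Q.\E
    static String withQuoting(String input) {
        return input.replaceAll("\\Q.\\E", ",");
    }

    public static void main(String[] args) {
        String test = "I want to replace the . with the ,";
        // Both calls print: I want to replace the , with the ,
        System.out.println(withBackslash(test));
        System.out.println(withQuoting(test));
    }
}
```

Keep in mind that replaceAll returns a new string and leaves test untouched, so the result has to be assigned or printed.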
I prefer to use the first method when there is just one metacharacter. However, when I have several metacharacters, or I don’t know at compile time what my string is going to be, I use the second method.
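When the string is only known at run time, one possible way to apply the second method is to let the JDK do the quoting. The sketch below is not from the original example: it relies on Pattern.quote, which returns a literal pattern string for its argument, and on Matcher.quoteReplacement, which does the same job for the replacement text. The class QuoteAtRuntime and the helper replaceLiteral are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteAtRuntime {
    // Hypothetical helper: replaces every literal occurrence of `literal`
    // in `input`, whatever metacharacters `literal` contains.
    static String replaceLiteral(String input, String literal, String replacement) {
        // Quote the search text so it is matched character by character.
        String quotedPattern = Pattern.quote(literal);
        // Quote the replacement so that $ and \ are not treated specially.
        String quotedReplacement = Matcher.quoteReplacement(replacement);
        return input.replaceAll(quotedPattern, quotedReplacement);
    }

    public static void main(String[] args) {
        System.out.println(replaceLiteral("price: 1+1=2 (approx.)", "1+1", "two")); // price: two=2 (approx.)
        System.out.println(replaceLiteral("a.b.c", ".", "$")); // a$b$c
    }
}
```

If no RegEx features are needed at all, String.replace(CharSequence, CharSequence) also performs a purely literal replacement.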