December 3, 2008

Java split() of String | Multiple whitespace characters


The split method of the String class is very useful when you want to tokenize a string. Its power lies in the fact that the string it accepts as a parameter is interpreted as a regular expression. However, you must be careful when you split a string using a single whitespace character as the delimiter. Consider the following snippet of code:
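A minimal sketch of such a snippet, assuming the variable names str and tokens that appear elsewhere on this page and a hypothetical class name SplitSingleWhitespace. The dashes printed around each token are there only to make an empty string visible:

```java
class SplitSingleWhitespace {
    // Splits on every single whitespace character (regex \s).
    static String[] tokenize(String str) {
        return str.split("\\s");
    }

    public static void main(String[] args) {
        // Note the two spaces between "two" and "whitespace".
        String str = "Testing split using two  whitespace characters";
        String[] tokens = tokenize(str);
        for (String token : tokens) {
            // Wrap each token in dashes so an empty token shows up as "--".
            System.out.println("-" + token + "-");
        }
    }
}
```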

What’s the output produced by the previous code? If you think it is the following, you’re wrong:

-Testing-
-split-
-using-
-two-
-whitespace-
-characters-

The actual output is instead this one:


-Testing-
-split-
-using-
-two-
--
-whitespace-
-characters-

Where in the hell did that empty string come from? It comes from the two whitespace characters between the words two and whitespace in the str string: each whitespace character is a delimiter on its own, so split returns the empty string that sits between them. If this is what you want, fine. Most of the time, however, you will want to discard that empty string from the resulting string array. You can do so by using the \\s+ regex, which matches one or more whitespace characters as a single delimiter, in place of \\s. Basically, the previous code becomes:
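A minimal sketch of the revised snippet, under the same assumptions as before (the names str and tokens come from this page; the class name SplitWhitespaceRun is hypothetical):

```java
class SplitWhitespaceRun {
    // Splits on runs of one or more whitespace characters (regex \s+).
    static String[] tokenize(String str) {
        return str.split("\\s+");
    }

    public static void main(String[] args) {
        // Same input as before: two spaces between "two" and "whitespace".
        String str = "Testing split using two  whitespace characters";
        String[] tokens = tokenize(str);
        for (String token : tokens) {
            // The double space is now consumed as one delimiter,
            // so no empty token is produced for it.
            System.out.println("-" + token + "-");
        }
    }
}
```

Note that \\s+ only collapses whitespace between words. If the string starts with whitespace, split still returns an empty first element, so calling trim() on the string beforehand may be worthwhile.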


Comments

3 Responses

  1. Rob says:

    Is there a way to capture (Define) as a value one of the words in the string?

  2. I’m not sure I understood your question, but the split method splits the string into a string array so you can point to the word you need by using common array accessing methods.

  3. Julia says:

    Sure hun, use tokens[0], tokens[1], etc. for the first, second words... It's in an array after all :)

    Julia XX
