Getting multiple substrings of a string


holypromise
April 5th, 2010, 07:07 PM
Hi,
I'm looking for a way to get multiple substrings out of a string, something like this:

Original string: cat ate mouse ->
sub1: ate mouse ->
sub2: mouse

So each substring has one token fewer than the string before it.
Does anyone have an idea how I could do this?

Thank you very much!

GenePeer
April 5th, 2010, 08:34 PM
Sorry, I'm not used to programming lingo, but I'll go ahead and assume that by token you mean word.


public class Test {
    public static void main(String[] args) {
        String str = "Cat ate mouse"; // you can put any string you want here
        String[] words = str.split(" "); // split the sentence into separate words
        // Will contain the list of strings with "one less token"
        String[] subs = new String[words.length - 1];
        for (int i = 0; i < subs.length; i++) {
            subs[i] = "";
        }
        // Word i belongs to every substring that starts at or before it
        for (int i = 1; i < words.length; i++) {
            for (int j = 0; j < i; j++) {
                subs[j] += words[i] + " ";
            }
        }
        for (String sub : subs) {
            System.out.println(sub.trim()); // trim drops the trailing space
        }
    }
}


Or better yet, here's a function that takes a string and removes the first word from it.


public static String sub(String str) {
    String[] words = str.split(" ");
    String sub = "";
    for (int i = 1; i < words.length; i++) {
        if (i == words.length - 1) {
            sub += words[i]; // last word: no trailing space
        } else {
            sub += words[i] + " ";
        }
    }
    return sub;
}
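
A possible alternative, not something posted in the thread: since each substring is just the previous one with its first word cut off, indexOf and substring can do the job without rebuilding the string word by word. This is only a sketch. The names TokenSuffixes, dropFirstWord and suffixes are made up for illustration, and it assumes the words are separated by single spaces.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenSuffixes {
    // Returns str without its first word, or "" if there is only one word.
    // Assumption: words are separated by single spaces.
    public static String dropFirstWord(String str) {
        int space = str.indexOf(' ');
        return space < 0 ? "" : str.substring(space + 1);
    }

    // Collects every shorter suffix, longest first:
    // "cat ate mouse" gives "ate mouse" and then "mouse".
    public static List<String> suffixes(String str) {
        List<String> result = new ArrayList<>();
        String rest = dropFirstWord(str);
        while (!rest.isEmpty()) {
            result.add(rest);
            rest = dropFirstWord(rest);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String s : suffixes("cat ate mouse")) {
            System.out.println(s);
        }
    }
}
```

Calling suffixes("cat ate mouse") should give the two substrings from the original question, in the same order.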

holypromise
April 5th, 2010, 09:18 PM
That's exactly what I'm looking for!
Thank you very much!

keang
April 6th, 2010, 03:34 AM
Some additional advice: when concatenating strings in a loop, always use a StringBuilder (or StringBuffer if you need it to be thread-safe). The performance cost of repeatedly concatenating strings in a loop, especially large ones, can be enormous.

If you want more info on this read this (http://www.keang.co.uk/tutorial/stringBuilder.html).
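
To illustrate, here is a rough sketch of the "remove the first word" function from earlier in the thread, rewritten around a StringBuilder. The class name SubDemo is invented for the example, and like the original it assumes single spaces between words.

```java
public class SubDemo {
    // Same idea as the sub() function earlier in the thread, but the words
    // are appended to one StringBuilder instead of creating a new String
    // on every pass through the loop.
    public static String sub(String str) {
        String[] words = str.split(" ");
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i < words.length; i++) {
            if (i > 1) {
                sb.append(' '); // separator between words, none at the end
            }
            sb.append(words[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(sub("Cat ate mouse")); // prints "ate mouse"
    }
}
```

For a three-word string the difference is negligible. It starts to matter when the loop runs over long inputs, since each += on a String copies everything built so far.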

holypromise
April 6th, 2010, 06:14 AM
Thanks for the advice, Keang! I definitely need to use StringBuilder on this one.

I'm trying to implement my own suffix tree that can take in a phrase rather than
just one string, so it needs to read a heap of documents in one go.