« Converting documents to word vectors | Main | Week of stuff »

Text to vectors progress

I finished the basic text to vector, but in its current form it is kinda useless. It only outputs to system.out in a format that isn't usefull yet. Also it doesn't save the dictionary that it uses to generate the vectors. So you can't generate any new word vectors to run comparisons with. I will be adding in features to save the generation dictionary, to import a saved dictionary, and to output the feature word vectors in the SVM light format. I also have recently found some cool java stemming software that I will add in to my project after I get the first usable version out. If your interested in the stemming software here are the links:

Porter Stemming Algorythm

Lancaster Stemmer

Post a comment


Type the characters you see in the picture above.

Web 2.0 craziness

View Dan Mayer's profile on LinkedIn


I Power Seekler
I Power Seekler

www.flickr.com
This is a Flickr badge showing public photos and videos from mayer_dan. Make your own badge here.

Creative Commons License
This weblog is licensed under a Creative Commons License.