After going through a little less than 3000 emails. I have finished testing nad doing any work on SVMMail. It still is sitting at a 97.5% accuracy. I am sure this could be increased, but I need to move all of my focus back to my primary project, News Shaker. On the News Shaker front. I have added about 350 new manually categorized sites across the database. I am going to rebuild all of the models and see if the increased training data brings my percents up to a more reasonable level. Then once I have a little better percent accuracy I will begin all of the auto categorization code and just start to let the system go crazy and see how many sites it can categorize correctly when left to its own devices. Should be an interesting time next week. That is if the machine boots up. Someone was working on my system and now it freezes on boot up. I am sure all my data is still there and I have a fairly recent back up but hopefully this can be sorted out before the begining of next week. I also have began reading Managing Gigabytes which is about compressing and indexing documents and images. I also ordered a new book about machine learning and artifical intellegence that i will begin reading soon. Perhaps they will provide me with some new ideas on how to improve my system.

blog comments powered by Disqus
Dan Mayer Profile Pic

I primary write about Ruby development, distributed teams, & dev/PM process. The archives go back to my first CS classes during college when I was learning to write software. I contribute to a few OSS projects and often work on my own projects. You can find my code on github.

Twitter @danmayer

Github @danmayer