I need to add the following features to news shaker to make it more useful. Done items are marked with an X:

X * Delete a category and all related sites
X * Delete a category and place all remaining sites in another category
X * Ability to have one category be a subcategory of another
* 2-level categorization (related to the subcategory idea above)
X * Automated "real world" testing with accuracy for all categories after a new model build. It should consist of 20 unseen and unmodeled sites that are categorized by hand and then run through the categorizer.
X * A way to save the results from the real world testing in the database and display them
* A way to post articles that aren't links but are actual HTML files into the system. (This also allows visitors to view the file.)
X * Ability for people to vote for a file that is in the wrong category to be recategorized
X * Increased categorization speed
* Start a test from a new UID, then track where all the results go and view each result individually
* Making sure that the same site is never added to the database twice
* Checking and updating sites, and getting rid of ones that no longer exist
X * Ability for users to report errors and for admins to view and delete them
X * Ability for users to request categories and for admins to view and delete the requests
X * Ability for an administrator to recategorize based on users' votes to recategorize

On another note, I have increased testing accuracy to 94% overall on known documents, and I am getting an average of 25% on unknown documents, which isn't horrible, but I would like to do much better. I have now begun to study a transductive approach that I might start using, depending on the results of the next rounds of testing.
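To make the "real world" test idea concrete, here is a minimal sketch of how per-category and overall accuracy could be computed from a hand-categorized holdout set. It is not the actual news shaker code: the function name `per_category_accuracy`, the `toy_categorize` stand-in classifier, and the example URLs and categories are all hypothetical, made up for illustration.

```python
from collections import defaultdict


def per_category_accuracy(labeled_sites, categorize):
    """Score a categorizer against hand-labeled (url, expected_category) pairs.

    Returns (overall_accuracy, {category: accuracy}).
    `categorize` is any callable that maps a URL to a category name.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for url, expected in labeled_sites:
        total[expected] += 1
        if categorize(url) == expected:
            correct[expected] += 1
    by_category = {cat: correct[cat] / total[cat] for cat in total}
    overall = sum(correct.values()) / sum(total.values()) if total else 0.0
    return overall, by_category


# Hypothetical holdout set: sites the model has never seen, labeled by hand.
holdout = [
    ("http://example.com/a", "sports"),
    ("http://example.com/b", "sports"),
    ("http://example.com/c", "tech"),
    ("http://example.com/d", "tech"),
]


def toy_categorize(url):
    # Stand-in for the real model: always answers "sports".
    return "sports"


overall, by_category = per_category_accuracy(holdout, toy_categorize)
print(overall, by_category)
```

Keeping the per-category numbers separate from the overall figure matters here, since a single weak category can hide behind a decent average. The returned dictionary is also a convenient shape to store in the database after each model build.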

Welcome to Dan Mayer's development blog. I primarily write about Ruby development, distributed teams, and dev/PM process. The archives go back to my first CS classes during college, when I was first learning programming. I contribute to a few OSS projects and often work on my own projects. You can find my code on GitHub.

Twitter @danmayer

Github @danmayer