Here is a quick and interesting project for you algorithm lovers.
You are given a set of keywords (loaded from a file). For example:
banana cake recipe
blue cake recipe
salmon pattie recipe
monkey bread recipe
corned beef and bread recipe
blue punch recipe
jello poke cake recipe
You need to output a list of keyword patterns, where [...] stands for the words that vary, sorted by the number of keywords each pattern appears in (most frequent first):
[...] recipe 7
[...] cake recipe 3
[...] bread recipe 2
blue [...] 2
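
One way to read the example above, sketched in Python: a pattern is a run of leading or trailing words shared by at least two keywords, with [...] standing for the rest. The name count_patterns, the min_count threshold of 2 and the alphabetical tie-break are assumptions made for illustration, not part of the brief, and the sketch does no input cleaning.

```python
from collections import Counter

def count_patterns(keywords, min_count=2):
    """Count word-level prefixes and suffixes that several keywords share."""
    counts = Counter()
    for keyword in keywords:
        words = keyword.split()
        # Proper prefixes and suffixes only: the whole keyword is not a pattern.
        for i in range(1, len(words)):
            counts[" ".join(words[:i]) + " [...]"] += 1
            counts["[...] " + " ".join(words[i:])] += 1
    # Assumption: a pattern must occur in at least min_count keywords.
    kept = [(p, n) for p, n in counts.items() if n >= min_count]
    # Most frequent first; ties broken alphabetically (an assumption).
    return sorted(kept, key=lambda item: (-item[1], item[0]))

keywords = [
    "banana cake recipe",
    "blue cake recipe",
    "salmon pattie recipe",
    "monkey bread recipe",
    "corned beef and bread recipe",
    "blue punch recipe",
    "jello poke cake recipe",
]
for pattern, count in count_patterns(keywords):
    print(pattern, count)
```

On the seven sample keywords this prints the four lines listed above. The work is proportional to the total number of words generated, which is roughly quadratic in keyword length, so short keywords keep it cheap even for a large file.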
An additional input parameter sets the minimum number of words a pattern must have to be considered. For example, if it were set to 2, the output would become:
[...] cake recipe 3
[...] bread recipe 2
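
The minimum-words rule can be sketched as a filter over (pattern, count) pairs. This assumes the [...] placeholder does not count as a word, which is what the example suggests; the names pattern_word_count and filter_by_min_words are made up for illustration.

```python
def pattern_word_count(pattern):
    """Number of real words in a pattern, not counting the [...] placeholder."""
    return sum(1 for token in pattern.split() if token != "[...]")

def filter_by_min_words(patterns, min_words):
    """Keep only patterns with at least min_words real words, preserving order."""
    return [(p, n) for p, n in patterns if pattern_word_count(p) >= min_words]

patterns = [
    ("[...] recipe", 7),
    ("[...] cake recipe", 3),
    ("[...] bread recipe", 2),
    ("blue [...]", 2),
]
for pattern, count in filter_by_min_words(patterns, 2):
    print(pattern, count)
```

A faster variant would probably skip generating the short patterns in the first place rather than filtering them afterwards.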
Ignore any character that is not a letter or a digit. For example, "cake recipe ! " is the same as "cake recipe".
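
A minimal cleaning sketch for this rule. It assumes that ignored characters act as word separators (so "blue-cake" becomes two words) and that letter case is left unchanged; neither is stated in the brief, and the name normalize is hypothetical.

```python
def normalize(keyword):
    """Replace every non-alphanumeric character with a space, then collapse spaces."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in keyword)
    # split() with no arguments drops leading, trailing and repeated whitespace.
    return " ".join(cleaned.split())

print(normalize("cake recipe ! "))  # cake recipe
```

Each keyword would be passed through this before any pattern counting, so that variants differing only in punctuation are treated as the same keyword.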
The real trick is to write a good algorithm that runs fast: my test file contains nearly 200,000 keywords!
Sounds interesting? Apply!