VSM for text normalization

Structured data often contains textual labels. The case we consider in this paper is the industry designations assigned to companies. While there are standards governing industry designation, their use in practice is found to be arbitrary. [more]

Numbers for machine learning

How much data is enough? This has long been the question for any statistical exercise, such as experiments, simulations, and surveys. Nowadays, it is also the question for machine learning. [more]