Monthly Archives: November 2018

Clustering Text with Transformed Document Vectors

Clustering is a sister task to classification in machine learning. While classification requires labeling the training data with class information up front, clustering is unsupervised. Grouping text on disk without supervision is of great benefit, and we would like to know whether word embeddings can help. In fact, once identified, these… Read more »

Want Clusters? How Many Will You Have?

OK, that was in jest, my apologies! But it is a question we should ask ourselves before embarking on a clustering exercise. Clustering hinges on the notion of distance. The members of a cluster are expected to be closer to that cluster’s centroid than they are to the centroids of other clusters… Read more »