Wednesday, April 16, 2014

Similarity Matrix


http://bellm.org/blog/2013/02/10/tracing-the-changing-state-of-the-union-with-text-analysis/

A similarity matrix is a matrix of scores that represent the similarity between a number of data points.  In the figure above, it measures similarities between presidential State of the Union addresses: blue indicates few words in common, white indicates many, and red marks an exact match.  The matrix shows that presidents of the recent past use wording more similar to one another than to presidents of the distant past.  Overall, the further apart in time two presidents served, the less vocabulary their addresses share.  Washington and Obama, for example, are very dissimilar.
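To make the idea concrete, here is a minimal sketch of how such a matrix could be built.  It assumes cosine similarity over raw word counts, which is only one common choice and not necessarily the method behind the linked figure.  The function names and the three short texts are made up for illustration and are not real speech excerpts.

```python
import math
from collections import Counter


def word_counts(text):
    """Lowercase the text and count whitespace-separated words."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count Counters (0.0 to 1.0)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text shares nothing with any other text
    return dot / (norm_a * norm_b)


def similarity_matrix(texts):
    """Score every text against every other text, including itself."""
    counts = [word_counts(t) for t in texts]
    return [[cosine_similarity(a, b) for b in counts] for a in counts]


# Hypothetical placeholder texts (assumption: NOT real addresses).
speeches = [
    "the union is strong and the people are free",
    "the union is strong and the economy is growing",
    "jobs and the economy are growing fast",
]

matrix = similarity_matrix(speeches)
for row in matrix:
    print(" ".join(f"{score:.2f}" for score in row))
```

Each row and column stands for one text.  The diagonal compares a text with itself, so it always holds the maximum score, and the matrix is symmetric because comparing A with B gives the same score as comparing B with A.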
