Arabic Word Semantic Similarity

Our new Arabic Word Semantic Similarity benchmark dataset is now available. It is contained in the paper of the same name on my publications page. This work was carried out with my PhD student Faaza Almarsoomi. We expect it to be of use to scientists who wish to evaluate and compare Arabic Word Semantic Similarity measures.


Seminal Papers #1: Features of Similarity

This is the first in a series of posts on seminal papers:

Tversky, A., Features of Similarity. Psychological Review, 1977. 84(4): p. 327-352.

Tversky’s paper (Tversky, 1977) is fundamentally important as it set out to unify the existing work on set-theoretical models of similarity into a single model. The dominant models of similarity at the time were “geometric”, measuring distance rather than similarity, but always on the assumption that distance could be converted to (or negatively correlated with) similarity.

The paper includes an analysis using measurement theory (axiomatic measurement), which appealed to me because of my background in Software Engineering, a field that makes use of these axioms (Minimality, Symmetry, The Triangle Inequality).
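As a rough illustration of what those three axioms require of a distance function, here is a minimal Python sketch that checks them over a small set of items. The function name `check_metric_axioms`, the sample points and the `lopsided` distance are all invented for this example and do not come from Tversky's paper; Euclidean distance simply stands in for a typical "geometric" model.

```python
from itertools import product
from math import dist, isclose


def check_metric_axioms(items, distance, tol=1e-9):
    """Report which of the three metric axioms hold for `distance` on `items`."""
    pairs = list(product(items, repeat=2))
    # Minimality: d(a, a) = 0, and no pair is closer than an item is to itself.
    minimality = all(
        isclose(distance(a, a), 0.0, abs_tol=tol)
        and distance(a, b) >= distance(a, a) - tol
        for a, b in pairs
    )
    # Symmetry: d(a, b) = d(b, a).
    symmetry = all(
        isclose(distance(a, b), distance(b, a), abs_tol=tol) for a, b in pairs
    )
    # Triangle inequality: d(a, c) <= d(a, b) + d(b, c).
    triangle = all(
        distance(a, c) <= distance(a, b) + distance(b, c) + tol
        for a, b, c in product(items, repeat=3)
    )
    return {
        "minimality": minimality,
        "symmetry": symmetry,
        "triangle_inequality": triangle,
    }


def lopsided(p, q):
    """Hypothetical direction-dependent 'distance', used only to break symmetry."""
    return dist(p, q) * (1.0 if p <= q else 2.0)


points = [(0.0, 0.0), (3.0, 4.0), (6.0, 0.0)]  # illustrative data only
euclidean_report = check_metric_axioms(points, dist)
lopsided_report = check_metric_axioms(points, lopsided)
print(euclidean_report)
print(lopsided_report)
```

Euclidean distance satisfies all three checks on these points, whereas the direction-dependent measure fails the symmetry check. A check like this only tests the axioms on the items supplied; it is not a proof that a measure satisfies them in general.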

The paper contains lots of interesting ideas, for example, practical implications for the collection of similarity judgements from humans.

All of these seminal papers are widely cited, but sometimes at second or third hand, and I recommend checking the original source if you are going to use one of them.

To the best of my knowledge, this paper is not available online. I got my copy through inter-library loan. If you know of a copy legitimately available online please post a comment to this blog entry.

Welcome to Semantic Similarity

This website is intended to disseminate my own findings in the fields of Text Processing, Text Understanding and Text Mining. Because I am particularly interested in the application of Short Text Semantic Similarity in these fields, I have called the site “Semantic Similarity” (the main focus of my PhD Thesis).

Apart from my work e-mail address at Manchester Metropolitan University, I have also set up an e-mail account specifically for contacts from this website:

drjamesdoshea <at> gmail <dot> com