Solving word analogy problems by constructing linear algebra programmes

30th October 2018 : 16:30 - 17:30

Category: Seminar

Speaker: Dr Ken Kahn, Department of Education

Location: Department of Education, Seminar Room E

Convener: Professor Gabriel Stylianides & Dr James Robson

Word embeddings (https://en.wikipedia.org/wiki/Word_embedding) are a way to place words in a very high-dimensional space. Given the ability to obtain a vector of numbers from a word, one can compute nearby words, find the closest word to the average of several words, and even discover word analogies.

For example, vec(‘woman’)-vec(‘man’)+vec(‘father’) should be very close to vec(‘mother’), and vec(‘faster’)-vec(‘fast’)+vec(‘slow’) should yield ‘slower’ as the closest word. Word embeddings have been used in question-answering systems, sentiment detection, paraphrase detection, and translation between languages.

Advanced mathematical topics occur naturally as one works with word embeddings. Examples include (1) generalising Euclidean distance to high dimensions and comparing it with cosine similarity, (2) using a rotation matrix to align word embeddings from different languages, and (3) ways of mapping high-dimensional spaces to two or three dimensions for visualising relations between words. Demonstrations will include a word embedding programming library for school students and an application of sentence embedding to find the closest matching question.
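As a rough illustration of the analogy arithmetic described above, here is a minimal Python sketch. It is not the library to be demonstrated in the seminar: the three-dimensional vectors in TOY_VECTORS are invented for the example (real embeddings typically have hundreds of dimensions and are learned from text), and the function names cosine_similarity and analogy are hypothetical.

```python
import math

# Invented toy vectors, for illustration only. The axes loosely stand for
# (gender, parent/child, constant offset); real embeddings are learned.
TOY_VECTORS = {
    "man":    [1.0, 0.0, 0.2],
    "woman":  [-1.0, 0.0, 0.2],
    "father": [1.0, 1.0, 0.2],
    "mother": [-1.0, 1.0, 0.2],
    "boy":    [1.0, -1.0, 0.2],
    "girl":   [-1.0, -1.0, 0.2],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, vectors):
    """Answer 'a is to b as c is to ?' using vec(b) - vec(a) + vec(c).

    The three query words are excluded from the candidates, since one of
    them would otherwise often be the nearest word to the target.
    """
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates,
               key=lambda w: cosine_similarity(vectors[w], target))


if __name__ == "__main__":
    # man is to woman as father is to ...
    print(analogy("man", "woman", "father", TOY_VECTORS))
```

With these toy vectors, vec(‘woman’)-vec(‘man’)+vec(‘father’) lands exactly on vec(‘mother’). With real embeddings the match is only approximate, which is why the sketch searches for the closest remaining word instead of testing for equality.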