Vocabulary of natural language processing (POC)

Concept information

Preferred term

word embedding  

Definition

  • Process by which words are mapped into real-valued vectors. (ARTES)

Broader concept

Entry terms

  • word embeddings

Note

  • Each word in the vocabulary is represented by a vector w ∈ ℝ^D, where D is the dimension fixed in advance. One of the major advantages of representing words as vectors is the fact that standard similarity measures such as cosine similarity or Euclidean distance can be used, enabling semantic distances to be calculated between words. Contrary to what we may be led to think by the recent popularity surge for word embeddings, the use of compact, vectorial word representations is by no means new, and the theoretical underpinnings can be traced back at least to the 1950s and the theory of distributional semantics. The distributional hypothesis, the idea that you can define a word by the company it keeps (Harris, 1954), popularised in the 1950s by philosophers and linguists such as Harris (1954), Firth (1957) and Wittgenstein (1953), has been influential in the way textual input is represented in NLP [...] (Bawden, Rachel, Going beyond the sentence: Contextual Machine Translation of Dialogue, Université Paris-Saclay, 2018)
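
A minimal sketch of the vector-similarity computations described in this note, in Python with NumPy. The 4-dimensional vectors and their values below are hypothetical toy examples, not learned embeddings; in practice the vectors are learned from corpora and D is typically in the hundreds:

    import numpy as np

    # Hypothetical toy embeddings: each word maps to a vector w in R^D (here D = 4).
    embeddings = {
        "king":  np.array([0.8, 0.1, 0.7, 0.2]),
        "queen": np.array([0.7, 0.2, 0.8, 0.1]),
        "apple": np.array([0.1, 0.9, 0.0, 0.6]),
    }

    def cosine_similarity(u, v):
        """Cosine of the angle between two vectors; near 1 for similar directions."""
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

    def euclidean_distance(u, v):
        """Standard Euclidean (L2) distance between two vectors."""
        return float(np.linalg.norm(u - v))

    print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # ~0.98: semantically close
    print(cosine_similarity(embeddings["king"], embeddings["apple"]))  # ~0.25: semantically distant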

In other languages

  • French

  • plongement de mot
  • plongement de mots
  • représentation distribuée de mots

URI

http://data.loterre.fr/ark:/67375/8LP-M1NFTGZ7-1
