The Definitive Guide to Embeddings | FeatureForm
“To understand embeddings, we must first understand the basic requirements of a machine learning model. Specifically, most machine learning algorithms can only take low-dimensional numerical data as inputs.
In a neural network, each input feature must be numeric. That means that in domains such as recommender systems, we must transform non-numeric variables (e.g., items and users) into numbers and vectors. We could try to represent items by a product ID; however, neural networks treat numerical inputs as continuous variables. That means higher numbers are treated as “greater than” lower numbers, and IDs that are numerically close are treated as similar items. This makes perfect sense for a field like “age” but is nonsensical when the numbers represent a categorical variable. Prior to embeddings, one of the most common methods used was one-hot encoding…”
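The problem with raw product IDs, and the way one-hot encoding sidesteps it, can be illustrated with a minimal sketch. The catalog, the ID assignment, and the helper names (`id_distance`, `one_hot`, `euclidean`) are hypothetical and chosen only for illustration; they do not come from the original article.

```python
# Hypothetical three-item catalog. The integer IDs are arbitrary labels,
# not quantities, so their ordering carries no real meaning.
ITEMS = ["toaster", "blender", "novel"]
PRODUCT_ID = {name: i + 1 for i, name in enumerate(ITEMS)}  # toaster=1, blender=2, novel=3


def id_distance(a, b):
    """Distance a model would 'see' if raw IDs were fed in as a continuous feature."""
    return abs(PRODUCT_ID[a] - PRODUCT_ID[b])


def one_hot(item):
    """Encode an item as a vector with a single 1 at the item's own position."""
    vec = [0] * len(ITEMS)
    vec[PRODUCT_ID[item] - 1] = 1
    return vec


def euclidean(u, v):
    """Plain Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v)) ** 0.5


# Raw IDs impose a spurious ordering: the toaster looks closer to the
# blender (distance 1) than to the novel (distance 2) purely because of
# how the IDs happened to be assigned.
print(id_distance("toaster", "blender"), id_distance("toaster", "novel"))

# One-hot vectors remove that artifact: every pair of distinct items is
# equally far apart.
print(one_hot("blender"))
print(euclidean(one_hot("toaster"), one_hot("blender")))
print(euclidean(one_hot("toaster"), one_hot("novel")))
```

In this sketch the one-hot vector has one dimension per item, so its length grows with the size of the catalog, which is at odds with the preference for low-dimensional inputs noted above.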