De-Biasing Word Embeddings

Word embeddings can reflect the gender, ethnicity, age, sexual-orientation, and other biases present in the text used to train the model.

For example, an embedding may complete the analogy Man->Doctor as Woman->Nurse, a mapping that reflects gender bias rather than meaning.

The following is an overview of the steps to address this bias:

  1. Identify the bias direction

  2. Neutralize: for every word that is not definitional (i.e. whose meaning is not inherently tied to the bias attribute — 'father' is definitional because it is defined as male, but 'doctor' is not tied to any gender), project its embedding onto the subspace orthogonal to the bias direction, removing the bias component

  3. Equalize pairs of definitional words (e.g. 'grandmother' and 'grandfather') so that they differ only along the bias direction and are equidistant from every neutralized, non-definitional word
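The steps above can be sketched in NumPy. This is a minimal illustration, not a production implementation: the bias direction `g` is assumed to come from a difference of definitional embeddings (e.g. `e_woman - e_man`, or an average over several such pairs), the toy 2-D vectors below are hypothetical, and real embeddings would be high-dimensional and typically normalized.

```python
import numpy as np

def neutralize(e, g):
    """Remove the component of word vector e along bias direction g."""
    e_bias = (e @ g) / (g @ g) * g  # projection of e onto g
    return e - e_bias

def equalize(e_w1, e_w2, g):
    """Make a definitional pair differ only along g, sharing the
    same non-bias component, so both are equidistant from any
    neutralized word."""
    mu = (e_w1 + e_w2) / 2
    mu_b = (mu @ g) / (g @ g) * g      # bias component of the midpoint
    mu_orth = mu - mu_b                # shared non-bias component
    e1_b = (e_w1 @ g) / (g @ g) * g    # bias component of each word
    e2_b = (e_w2 @ g) / (g @ g) * g
    scale = np.sqrt(abs(1 - mu_orth @ mu_orth))
    e1_eq = scale * (e1_b - mu_b) / np.linalg.norm(e1_b - mu_b)
    e2_eq = scale * (e2_b - mu_b) / np.linalg.norm(e2_b - mu_b)
    return e1_eq + mu_orth, e2_eq + mu_orth

# Toy 2-D example: assume the x-axis encodes the gender direction.
g = np.array([1.0, 0.0])
doctor = np.array([0.3, 0.8])               # hypothetical biased embedding
doctor_n = neutralize(doctor, g)            # no remaining component along g
man = np.array([0.9, 0.1])
woman = np.array([-0.7, 0.2])
man_eq, woman_eq = equalize(man, woman, g)  # now equidistant from doctor_n
```

After these steps, `doctor_n` is orthogonal to `g`, and `man_eq`/`woman_eq` have equal and opposite bias components, so the debiased 'doctor' is the same distance from both.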
