Word embeddings trained for natural language processing have been observed to carry gender bias. Tolga Bolukbasi and co-authors pointed this out in their paper "Man is to Computer Programmer as Woman is to Homemaker?". The obvious source of the bias is the text corpus used to train the word embedding matrix (the word-to-vector mapping). One solution would be to retrain the models on text that is not biased, but that is not feasible: nearly all the available text, the huge piles of stories, novels and literature, would become useless, because, not so surprisingly, most of it is full of such biases. In the same paper, Bolukbasi et al. propose a very simple mathematical approach to correct the gender bias in the word embedding matrix. (We can only wish there were such an innocent-looking mathematical approach to correct bias in human behaviour.)
Here I try to visualise the algorithm that fixes the problem.
Visualisation of Neutralisation
The neutralisation trick, as per the paper, is simple: compute the bias component, which is the projection of the biased word vector (word) onto the bias axis (biax), and subtract it from the word vector.
This is how it works. The word technology leans more towards male than female. The orange line in the figure is the projection of technology onto the gender axis. After subtracting this bias component, the resulting vector for technology has no component along the gender axis.
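The projection-and-subtract step can be sketched in a few lines of plain Python. This is a minimal illustration, not the paper's code: the helper names `dot` and `neutralize` are made up for this post, and the 2-dimensional vector for technology is a hypothetical value chosen only to make the arithmetic easy to follow.

```python
def dot(a, b):
    """Dot product of two equal-length vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def neutralize(word, biax):
    """Remove the component of `word` that lies along the bias axis `biax`."""
    # Projection of word onto biax: (word . biax / ||biax||^2) * biax
    scale = dot(word, biax) / dot(biax, biax)
    bias_component = [scale * g for g in biax]
    # Subtracting the projection leaves only the part orthogonal to biax.
    return [w - b for w, b in zip(word, bias_component)]


# Hypothetical 2-D values, for illustration only.
technology = [3.0, 2.0]
gender_axis = [7.0, 0.0]

debiased = neutralize(technology, gender_axis)
print(debiased)  # the first coordinate (along the gender axis) is removed
```

Because the gender axis here points along the first coordinate, the result keeps only the second coordinate of technology, which is what the figure shows geometrically.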
Visualisation of Equalisation (for pair of words)
Equalisation is explained in the paper; however, because of the many equations, it is a bit less intuitive what exactly it is doing. Here are the equations, followed by figures plotted for 2-dimensional vectors to ease understanding.
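As a reference while reading the figures, the equalisation step for a pair of words can be written roughly as below. This is a restatement based on the equalise step in Bolukbasi et al., using this post's names for the inputs and outputs; the symbol g for the bias axis and the intermediate symbols \mu, \mu_B and \nu are introduced here for convenience. The square root assumes unit-length embeddings, so that 1 - \|\nu\|^2 is non-negative.

```latex
\begin{align*}
\mu   &= \frac{e_{w1} + e_{w2}}{2} \\
\mu_B &= \frac{\mu \cdot g}{\lVert g \rVert^2}\, g \\
\nu   &= \mu - \mu_B \\
e_{wiB} &= \frac{e_{wi} \cdot g}{\lVert g \rVert^2}\, g, \qquad i = 1, 2 \\
e_i   &= \nu + \sqrt{1 - \lVert \nu \rVert^2}\;
         \frac{e_{wiB} - \mu_B}{\lVert e_{wiB} - \mu_B \rVert}, \qquad i = 1, 2
\end{align*}
```

In words: both output vectors share the same part \nu orthogonal to the bias axis, and differ only by equal and opposite steps along the bias axis.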
Considering embedding vectors ew1 = [5, 1.2], ew2 = [-2.5, 0.5] and bias axis biax = [7, 0], and following the equations shown above, this is what we get. The resulting vectors are e1 and e2, which are symmetric about the axis orthogonal to the bias axis.
Trying out ew1 = [2, 1.2], ew2 = [-2.5, 0.5] and bias axis biax = [3, 0.5]
and ew1 = [5, 1.2], ew2 = [-2.5, 0.5] and bias axis biax = [4, -3]
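The three configurations above can be run through a small plain-Python sketch of equalisation. It follows the form of the equalise step in Bolukbasi et al. and is not necessarily the exact computation behind the figures. The helper names `dot`, `project` and `equalize` are hypothetical. One assumption is worth flagging: the paper works with unit-length embeddings, while the example vectors here are not normalised, so the sketch wraps the term under the square root in `abs()` to keep it non-negative.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def project(v, axis):
    """Projection of v onto axis: (v . axis / ||axis||^2) * axis."""
    scale = dot(v, axis) / dot(axis, axis)
    return [scale * a for a in axis]


def equalize(ew1, ew2, biax):
    """Return (e1, e2): same off-axis part, opposite parts along biax."""
    mu = [(a + b) / 2 for a, b in zip(ew1, ew2)]   # mean of the pair
    mu_b = project(mu, biax)                       # mean's part along the axis
    nu = [m - b for m, b in zip(mu, mu_b)]         # shared off-axis part
    # ASSUMPTION: abs() guards against a negative value, which can occur
    # because these example vectors are not unit length.
    length = math.sqrt(abs(1 - dot(nu, nu)))
    result = []
    for w in (ew1, ew2):
        w_b = project(w, biax)
        diff = [a - b for a, b in zip(w_b, mu_b)]
        # norm is zero if both words project to the same point on the axis;
        # that degenerate case is not handled in this sketch.
        norm = math.sqrt(dot(diff, diff))
        result.append([n + length * d / norm for n, d in zip(nu, diff)])
    return result[0], result[1]


examples = [
    ([5, 1.2], [-2.5, 0.5], [7, 0]),
    ([2, 1.2], [-2.5, 0.5], [3, 0.5]),
    ([5, 1.2], [-2.5, 0.5], [4, -3]),
]

for ew1, ew2, biax in examples:
    e1, e2 = equalize(ew1, ew2, biax)
    print("biax =", biax, "e1 =", e1, "e2 =", e2)
```

In each case e1 and e2 end up with the same component orthogonal to the bias axis and equal, opposite components along it, which is the symmetry the figures show.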
I still wonder whether this neutralisation affects the embedding's performance on the other tasks it is meant for.
Bolukbasi, Tolga, et al. “Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings.” Advances in Neural Information Processing Systems. 2016.