word embedding

Explaining word embeddings via automatic rule learning in text classification

In this project, we introduce a novel methodology for identifying task-related dimensions within word embeddings. Using automatic rule learning, we extract the dimensions that are relevant to a particular task.

Rule-based Representation Learner (RRL) is a classifier that automatically learns interpretable non-fuzzy rules for data representation and classification.
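To make "non-fuzzy rule" concrete, here is a minimal sketch of how such a rule could be evaluated on an embedding vector. It assumes a rule is a conjunction of threshold tests on individual dimensions; the `Predicate` class, the `rule_fires` function, and the thresholds are hypothetical illustrations, not part of the RRL code base.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Predicate:
    """One threshold test on a single embedding dimension (hypothetical format)."""
    dim: int          # index of the embedding dimension being tested
    op: str           # ">" or "<="
    threshold: float

    def holds(self, x: Sequence[float]) -> bool:
        value = x[self.dim]
        if self.op == ">":
            return value > self.threshold
        if self.op == "<=":
            return value <= self.threshold
        raise ValueError(f"unsupported operator: {self.op}")


def rule_fires(rule: List[Predicate], x: Sequence[float]) -> bool:
    """A rule is treated as a conjunction: it fires only if every predicate holds.

    The result is strictly True or False, with no degree of membership,
    which is what "non-fuzzy" means in this sketch.
    """
    return all(p.holds(x) for p in rule)


# Illustrative rule: dimension 2 is large AND dimension 7 is small.
rule = [Predicate(2, ">", 0.5), Predicate(7, "<=", -0.1)]

x = [0.0] * 8
x[2], x[7] = 0.9, -0.3
fires = rule_fires(rule, x)
```

Because each predicate names the dimension it tests, the set of dimensions a rule depends on can be read directly from the rule.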

The architecture of RRL

In this project, the word embeddings of a text are first fed into the RRL for gender classification. The gender-related dimensions (identified as the “red dimensions”) are then derived from the rules the RRL learns, and the values of these dimensions are removed before subsequent tasks.
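The extraction and removal steps described above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the rule format (lists of `(dimension, operator, threshold)` tuples), the example rules, and the helper names `dimensions_used` and `remove_dimensions` are all hypothetical, and "removal" is assumed here to mean overwriting the values with zero rather than dropping the dimensions.

```python
from typing import List, Sequence, Set, Tuple

# Hypothetical rule format: each rule is a list of
# (dimension index, operator, threshold) tests.
Rule = List[Tuple[int, str, float]]


def dimensions_used(rules: Sequence[Rule]) -> Set[int]:
    """Collect every embedding dimension referenced by at least one rule."""
    return {dim for rule in rules for dim, _op, _threshold in rule}


def remove_dimensions(
    embedding: Sequence[float], dims: Set[int], fill: float = 0.0
) -> List[float]:
    """Return a copy of the embedding with the given dimensions overwritten.

    Assumption: removal means replacing the value with `fill`, so the
    vector keeps its original length for downstream models.
    """
    return [fill if i in dims else value for i, value in enumerate(embedding)]


# Illustrative rules, standing in for rules learned on gender classification.
learned_rules: List[Rule] = [
    [(1, ">", 0.30), (4, "<=", -0.20)],
    [(4, ">", 0.10)],
]

gender_dims = dimensions_used(learned_rules)

embedding = [0.12, 0.85, -0.40, 0.07, 0.33, -0.91]
cleaned = remove_dimensions(embedding, gender_dims)
```

Zeroing keeps the embedding shape unchanged, which is convenient when the cleaned vectors are passed to a model that expects the original dimensionality. Deleting the columns instead would be an equally plausible reading of "removed".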

Word embedding explanation process
