How can I handle rare levels in my categorical variables?

Answer

You can use the 'GROUPRARE' method. When the 'preprocessRare' parameter is set to True, rare levels are grouped at the start of processing. A level counts as rare when its frequency falls below 'rareThreshold' (an absolute frequency count) or 'rareThresholdPercent' (a percentage of the total number of observations). All levels that fall below the threshold are combined into a single group.
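
To illustrate the idea only, here is a minimal plain-Python sketch of rare-level grouping. It is not the catTrans implementation or its API: the function name `group_rare_levels`, its snake_case arguments, and the `_RARE_` label are hypothetical, and treating "below the threshold" as a strict comparison is an assumption.

```python
from collections import Counter

def group_rare_levels(values, rare_threshold=None,
                      rare_threshold_percent=None, rare_label="_RARE_"):
    """Replace rare levels with a single label (illustrative sketch only)."""
    counts = Counter(values)  # frequency of each level
    if rare_threshold is not None:
        # Absolute cutoff: a frequency count.
        cutoff = rare_threshold
    elif rare_threshold_percent is not None:
        # Relative cutoff: a percentage of the total number of observations.
        cutoff = len(values) * rare_threshold_percent / 100.0
    else:
        return list(values)  # no threshold given, nothing to group
    # Assumption: a level is rare when its count is strictly below the cutoff.
    return [rare_label if counts[v] < cutoff else v for v in values]

colors = ["red"] * 6 + ["blue"] * 3 + ["green", "teal"]
grouped = group_rare_levels(colors, rare_threshold=2)
by_percent = group_rare_levels(colors, rare_threshold_percent=20)
print(Counter(grouped))
```

In this example 'green' and 'teal' each appear once, so both thresholds (a count of 2, or 20% of 11 observations) fold them into the same single group while 'red' and 'blue' are left unchanged.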
catTrans

dataPreprocess