How can I handle rare levels in my categorical variables?
Answer
You can use the 'GROUPRARE' method. Setting the 'preprocessRare' parameter to True groups rare levels up front, before any further processing. A level counts as rare when its frequency falls below 'rareThreshold' (an absolute count) or 'rareThresholdPercent' (a percentage of total observations). All levels below the threshold are combined into a single group.
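
To make the thresholding rule concrete, here is a minimal standalone Python sketch of the same idea. It is not the tool's own API: the function name `group_rare_levels`, the snake_case parameters `rare_threshold` and `rare_threshold_percent`, and the "RARE" label are hypothetical stand-ins chosen for illustration, and the strict "below" comparison is an assumption based on the description above.

```python
from collections import Counter


def group_rare_levels(values, rare_threshold=None,
                      rare_threshold_percent=None, other_label="RARE"):
    """Replace rare levels with a single label (illustrative sketch only).

    rare_threshold: hypothetical analog of 'rareThreshold' (absolute count).
    rare_threshold_percent: hypothetical analog of 'rareThresholdPercent'
    (percentage of total observations).
    """
    counts = Counter(values)
    total = len(values)

    if rare_threshold is not None:
        # A level is rare if it occurs fewer times than the count threshold.
        rare = {lvl for lvl, n in counts.items() if n < rare_threshold}
    elif rare_threshold_percent is not None:
        # A level is rare if its share of all observations is below the
        # percentage threshold.
        rare = {lvl for lvl, n in counts.items()
                if 100.0 * n / total < rare_threshold_percent}
    else:
        # No threshold given: leave every level as it is.
        rare = set()

    return [other_label if v in rare else v for v in values]


# Made-up example data: two frequent levels and two levels seen only once.
colors = ["red"] * 5 + ["blue"] * 4 + ["green", "teal"]

by_count = group_rare_levels(colors, rare_threshold=2)
by_percent = group_rare_levels(colors, rare_threshold_percent=10)
```

In this made-up data, "green" and "teal" each appear once out of 11 observations, so either threshold folds both into the single "RARE" group while "red" and "blue" are left alone.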