A data scientist is analyzing a data set with categorical features and would like to make those features more useful when building a model. Which of the following data transformation techniques should the data scientist use? (Choose two.)
One-hot encoding creates one binary indicator column per category, so models can use nominal categories without any implied order.
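As a rough illustration of one-hot encoding, here is a minimal pure-Python sketch. The helper name `one_hot_encode` and the sample `colors` list are hypothetical and not part of the question; in practice a library routine such as pandas `get_dummies` or scikit-learn `OneHotEncoder` would normally be used.

```python
def one_hot_encode(values):
    """Return (categories, rows), where each row holds one 0/1 flag per category."""
    # Sort the distinct categories so the column order is deterministic.
    categories = sorted(set(values))
    rows = [
        [1 if value == category else 0 for category in categories]
        for value in values
    ]
    return categories, rows


# Hypothetical sample data for illustration only.
colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot_encode(colors)

for value, row in zip(colors, encoded):
    print(value, row)
```

Each row contains exactly one 1, so no category is ranked above another.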
Label encoding maps each category to an integer label, which can be useful for tree-based models or when you need a single numeric column. Because the integers imply an order that the categories may not have, you must make sure the algorithm does not treat that order as meaningful.
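A minimal pure-Python sketch of label encoding follows. The helper name `label_encode` and the sample `sizes` list are hypothetical, and assigning integers in alphabetical order is an assumption made here for determinism; scikit-learn's `LabelEncoder` is a common library alternative.

```python
def label_encode(values):
    """Return (mapping, labels), assigning integers to categories in sorted order."""
    categories = sorted(set(values))
    mapping = {category: index for index, category in enumerate(categories)}
    labels = [mapping[value] for value in values]
    return mapping, labels


# Hypothetical sample data for illustration only.
sizes = ["small", "large", "medium", "small"]
mapping, labels = label_encode(sizes)

print(mapping)
print(labels)
```

Note that the alphabetical mapping places "large" below "small", which shows why the integer order should not be read as a real ranking unless the mapping is chosen deliberately.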