You are working on an email spam filtering assignment. While working on it, you find that a new word, e.g. HadoopExam, appears in an email. Your solution has never come across this word before, so the estimated probability of this word occurring in either kind of email could be zero. Which of the following algorithms can help you avoid the zero probability?
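The answer choices are not shown here, but one common way to avoid a zero estimate for an unseen word is Laplace (add-one) smoothing in a Naive Bayes filter. The sketch below is only an illustration under that assumption: the word counts, the vocabulary, and the helper name `word_likelihood` are hypothetical and do not come from the question.

```python
from collections import Counter

def word_likelihood(word, class_counts, vocab_size, alpha=1.0):
    """Estimate P(word | class) with additive smoothing.

    alpha=0 gives the raw frequency estimate; alpha=1 is Laplace smoothing.
    """
    total = sum(class_counts.values())
    return (class_counts[word] + alpha) / (total + alpha * vocab_size)

# Hypothetical word counts observed in spam emails during training.
spam_counts = Counter({"offer": 3, "free": 2})
# Hypothetical vocabulary, including the previously unseen word.
vocab = {"offer", "free", "meeting", "HadoopExam"}

# Without smoothing, the unseen word gets probability zero.
unsmoothed = word_likelihood("HadoopExam", spam_counts, len(vocab), alpha=0.0)
# With add-one smoothing, it gets a small non-zero probability.
smoothed = word_likelihood("HadoopExam", spam_counts, len(vocab), alpha=1.0)

print(unsmoothed, smoothed)
```

Because Naive Bayes multiplies per-word likelihoods, a single zero would wipe out the whole product for that class; the smoothed estimate keeps it non-zero.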
Refer to the exhibit.
You are building a decision tree. In this exhibit, four variables are listed with their respective values of info-gain.
Based on this information, on which attribute would you expect the next split to be in the decision tree?
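The exhibit is not reproduced here, so the actual values are unknown. As a general illustration, an information-gain-based tree splits next on the attribute with the highest info-gain. The sketch below uses a small hypothetical dataset; the attribute names `outlook` and `windy`, the target `play`, and the helper functions are assumptions made for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr, target):
    """Entropy of the target minus the weighted entropy after splitting on attr."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

# Hypothetical training rows, not taken from the exhibit.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]

gains = {a: info_gain(rows, a, "play") for a in ("outlook", "windy")}
# The next split goes to the attribute with the largest information gain.
best = max(gains, key=gains.get)
print(gains, best)
```

In this toy data `outlook` separates the classes perfectly while `windy` tells us nothing, so the split would fall on `outlook`.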
In which phase of the data analytics lifecycle do Data Scientists spend the most time in a project?
Suppose there are three events E1, E2, and E3. Which formula must always be equal to P(E1|E2,E3)?
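The answer choices are not listed here, but for reference, the definition of conditional probability gives the identities below. They assume the conditioning events have non-zero probability, i.e. P(E2,E3) > 0, and P(E1,E3) > 0 for the second form.

```latex
\begin{align}
P(E_1 \mid E_2, E_3)
  &= \frac{P(E_1, E_2, E_3)}{P(E_2, E_3)} \\
  &= \frac{P(E_2 \mid E_1, E_3)\, P(E_1 \mid E_3)}{P(E_2 \mid E_3)}
\end{align}
```

The second line is Bayes' rule with every term additionally conditioned on E3; it follows because P(E2|E1,E3) P(E1|E3) = P(E1,E2|E3).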