Databricks Exam Databricks Machine Learning Associate Topic 2 Question 11 Discussion

Actual exam question for Databricks's Databricks Machine Learning Associate exam

Question #: 11
Topic #: 2

A data scientist has a Spark DataFrame spark_df. They want to create a new Spark DataFrame that contains only the rows from spark_df where the value in column price is greater than 0.

Which of the following code blocks will accomplish this task?

A. spark_df[spark_df['price'] > 0]

B. spark_df.filter(col('price') > 0)

C. SELECT * FROM spark_df WHERE price > 0

D. spark_df.loc[spark_df['price'] > 0,:]

E. spark_df.loc[:,spark_df['price'] > 0]

Suggested Answer: B

spark_df.filter(col('price') > 0) is the standard Spark DataFrame API call for keeping only the rows that satisfy a condition. It returns a new DataFrame and requires col to be imported from pyspark.sql.functions. Option C is a SQL statement rather than a DataFrame operation; it would only run if spark_df were registered as a temporary view and the query were passed to spark.sql. Options D and E use .loc, a pandas indexer that Spark DataFrames do not provide. Option A is pandas-style boolean indexing; PySpark does accept a Column condition inside brackets and treats it as a filter, but filter (or its alias where) is the documented, idiomatic method and the intended answer here.

by Tambra at Jul 17, 2024, 11:09 AM

Contribute your Thoughts:

A) spark_df[spark_df['price'] > 0]

upvoted 0 times

Josphine

3 months ago

I think the correct answer is A.

upvoted 0 times