Databricks Exam Databricks Machine Learning Associate Topic 3 Question 26 Discussion

Actual exam question for Databricks's Databricks Machine Learning Associate exam
Question #: 26
Topic #: 3

A data scientist wants to use Spark ML to one-hot encode the categorical features in their PySpark DataFrame features_df. A list of the names of the string columns is assigned to the input_columns variable.

They have developed this code block to accomplish this task:

The code block is returning an error.

Which of the following adjustments does the data scientist need to make to accomplish this task?

Suggested Answer: C

Spark ML's OneHotEncoder does not operate on string columns. It expects each input column to contain numeric category indices and maps every index to a binary vector. Because the columns listed in input_columns are string columns, fitting the encoder on them directly raises an error. The string columns first need to be converted to numeric indices with StringIndexer, and the resulting index columns are then passed to OneHotEncoder.

Reference: Apache Spark MLlib guide, "Extracting, transforming and selecting features" (StringIndexer, OneHotEncoder)

Contribute your Thoughts:

Thurman
1 month ago
Maybe the error is because they forgot to add the 'sparkly' parameter to the OneHotEncoder. You know, to make it extra fabulous.
upvoted 0 times
...
Shannan
1 month ago
I heard the data scientist tried to one-hot encode their socks. Turns out they were just a bunch of ones and zeros!
upvoted 0 times
Daniela
11 days ago
C: They need to use VectorAssembler prior to one-hot encoding the features.
upvoted 0 times
...
Delisa
12 days ago
B: They need to use StringIndexer prior to one-hot encoding the features.
upvoted 0 times
...
Hannah
16 days ago
A: They need to specify the method parameter to the OneHotEncoder.
upvoted 0 times
...
Wilburn
16 days ago
A: They need to specify the method parameter to the OneHotEncoder.
upvoted 0 times
...
...
Alfred
1 month ago
VectorAssembler? Sounds like a superhero name. Maybe that's the solution, but I'm not sure.
upvoted 0 times
Pura
6 days ago
User3: Maybe they need to use StringIndexer before one-hot encoding the features.
upvoted 0 times
...
Lonny
8 days ago
User2: No, they should specify the method parameter to the OneHotEncoder.
upvoted 0 times
...
Dick
20 days ago
User1: I think the data scientist needs to use VectorAssembler before one-hot encoding.
upvoted 0 times
...
...
Rosendo
2 months ago
Ah, I see the issue. The method parameter is missing from the OneHotEncoder. We need to specify that.
upvoted 0 times
Essie
17 days ago
User1: Let's add that parameter and see if it works.
upvoted 0 times
...
Joaquin
1 month ago
User2: Yes, that's correct. That should fix the error.
upvoted 0 times
...
Eden
1 month ago
User1: I think we need to specify the method parameter to the OneHotEncoder.
upvoted 0 times
...
...
Jenelle
2 months ago
Wait, I think we need to use StringIndexer first to convert the string columns to numerical values. Then we can use OneHotEncoder.
upvoted 0 times
Latrice
4 days ago
D: And OneHotEncoder will then encode those numerical values as binary vectors.
upvoted 0 times
...
Osvaldo
5 days ago
C: That makes sense, StringIndexer will convert the strings to numerical values.
upvoted 0 times
...
Whitley
7 days ago
B: Then we can use OneHotEncoder to encode the categorical features.
upvoted 0 times
...
Reita
1 month ago
A: I think you're right, we should use StringIndexer first.
upvoted 0 times
...
...
Ligia
2 months ago
I believe they should also use StringIndexer before one-hot encoding the features to properly encode the categorical values.
upvoted 0 times
...
Quentin
2 months ago
Hmm, the error is probably due to the fit operation. Let's try removing that line and see if it works.
upvoted 0 times
Lawrence
1 month ago
User2: Yeah, let's try that and see if it fixes the error.
upvoted 0 times
...
Adaline
2 months ago
User1: I think we should remove the line with the fit operation.
upvoted 0 times
...
...
Essie
2 months ago
I agree with Daisy. Without specifying the method parameter, the code won't work properly.
upvoted 0 times
...
Daisy
2 months ago
I think the data scientist needs to specify the method parameter to the OneHotEncoder.
upvoted 0 times
...
