Databricks Exam Databricks Certified Associate Developer for Apache Spark 3.0 Topic 1 Question 60 Discussion

Actual exam question for Databricks's Databricks Certified Associate Developer for Apache Spark 3.0 exam
Question #: 60
Topic #: 1
[All Databricks Certified Associate Developer for Apache Spark 3.0 Questions]

The code block shown below should show information about the data type that column storeId of DataFrame transactionsDf contains. Choose the answer that correctly fills the blanks in the code block to accomplish this.

Code block:

transactionsDf.__1__(__2__).__3__

Suggested Answer: B

Correct code block:

transactionsDf.select('storeId').printSchema()

The difficulty of this question is that it is hard to solve with the stepwise first-to-last-gap approach that has worked well for similar questions, since the answer options are so different from one another. Instead, you might want to eliminate answers by looking for patterns of frequently wrong answers.

A first pattern of frequently wrong answers that you may recognize by now is column names that are not wrapped in quotes. For this reason, the answer that includes the bare storeId, without quotes, should be eliminated.

By now, you may have understood that DataFrame.limit() is useful for returning a specified number of rows; it has nothing to do with specific columns. For this reason, the answer that resolves to limit('storeId') can be eliminated.

Given that we are interested in information about the data type, you should question whether the answer that resolves to limit(1).columns provides this information. While DataFrame.columns is a valid attribute, it only reports back column names, not column types. So, you can eliminate this option.

The two remaining options use either the printSchema() or the print_schema() command. You may remember that DataFrame.printSchema() is the only valid command of the two. The select('storeId') part simply returns the storeId column of transactionsDf, which works here, since we are only interested in that column's type anyway.

More info: pyspark.sql.DataFrame.printSchema (PySpark 3.1.2 documentation)

Static notebook | Dynamic notebook: See test 3, Question 57 (Databricks import instructions)


Contribute your Thoughts:

Ammie
11 months ago
I think A is incorrect because 'print_schema()' should be 'printSchema()'. So, C is the correct answer.
upvoted 0 times
...
Caprice
11 months ago
I'm not sure, but I think A could also be a possibility.
upvoted 0 times
...
Merilyn
11 months ago
I agree with Alyce, C seems correct.
upvoted 0 times
...
Barney
11 months ago
Hold up, are we sure this isn't a trick question? What if the 'storeId' column is secretly a list of cat emojis or something? I'm going with D just to be safe.
upvoted 0 times
...
Effie
11 months ago
Option E it is! Now, if only I could remember the difference between .printSchema() and .dtypes... Oh well, at least I get to use 'storeId' in my code, sounds like a fun variable name.
upvoted 0 times
...
Alyce
11 months ago
I think the answer is C.
upvoted 0 times
...
Arthur
11 months ago
Haha, good one! Though I think you're overthinking it a bit. E is definitely the way to go, unless the professor is trying to trip us up with some 'surprise' data type. Better bring my magnifying glass to the exam just in case!
upvoted 0 times
Ettie
11 months ago
Yeah, E looks like the right choice. Let's go with that.
upvoted 0 times
...
Youlanda
11 months ago
I think E is the correct answer.
upvoted 0 times
...
...
Goldie
12 months ago
Hmm, I think option E is the way to go. We want to see the data type, not just the schema, right?
upvoted 0 times
Katy
10 months ago
Sounds good to me.
upvoted 0 times
...
Ollie
11 months ago
Let's go with option E then.
upvoted 0 times
...
Shanda
11 months ago
Yeah, I agree. We need to see the data type, not just the schema.
upvoted 0 times
...
Serina
11 months ago
I think option E is the correct one.
upvoted 0 times
...
...

