Which of the following describes the relationship between native Spark DataFrames and pandas API on Spark DataFrames?
To filter rows in a Spark DataFrame based on a condition, the filter method is used. In this case, the condition is that the value in the 'discount' column should be less than or equal to 0. The correct syntax uses the filter method along with the col function from pyspark.sql.functions.
Correct code:
from pyspark.sql.functions import col
filtered_df = spark_df.filter(col('discount') <= 0)
Options A and D use pandas syntax, which is not applicable in PySpark. Option B is closer but omits the col function.