Associate-Developer-Apache-Spark試験無料問題集「Databricks Certified Associate Developer for Apache Spark 3.0 認定」

The code block shown below should return a DataFrame with only columns from DataFrame transactionsDf for which there is a corresponding transactionId in DataFrame itemsDf. DataFrame itemsDf is very small and much smaller than DataFrame transactionsDf. The query should be executed in an optimized way. Choose the answer that correctly fills the blanks in the code block to accomplish this.
__1__.__2__(__3__, __4__, __5__)

解説: (GoShiken メンバーにのみ表示されます)
The code block shown below should return a DataFrame with all columns of DataFrame transactionsDf, but only maximum 2 rows in which column productId has at least the value 2. Choose the answer that correctly fills the blanks in the code block to accomplish this.
transactionsDf.__1__(__2__).__3__

解説: (GoShiken メンバーにのみ表示されます)
Which of the following code blocks returns a DataFrame with a single column in which all items in column attributes of DataFrame itemsDf are listed that contain the letter i?
Sample of DataFrame itemsDf:
1.+------+----------------------------------+-----------------------------+-------------------+
2.|itemId|itemName |attributes |supplier |
3.+------+----------------------------------+-----------------------------+-------------------+
4.|1 |Thick Coat for Walking in the Snow|[blue, winter, cozy] |Sports Company Inc.|
5.|2 |Elegant Outdoors Summer Dress |[red, summer, fresh, cooling]|YetiX |
6.|3 |Outdoors Backpack |[green, summer, travel] |Sports Company Inc.|
7.+------+----------------------------------+-----------------------------+-------------------+

解説: (GoShiken メンバーにのみ表示されます)
Which of the following code blocks applies the Python function to_limit on column predError in table transactionsDf, returning a DataFrame with columns transactionId and result?

解説: (GoShiken メンバーにのみ表示されます)
Which of the following statements about broadcast variables is correct?

解説: (GoShiken メンバーにのみ表示されます)
Which of the following code blocks reads all CSV files in directory filePath into a single DataFrame, with column names defined in the CSV file headers?
Content of directory filePath:
1._SUCCESS
2._committed_2754546451699747124
3._started_2754546451699747124
4.part-00000-tid-2754546451699747124-10eb85bf-8d91-4dd0-b60b-2f3c02eeecaa-298-1-c000.csv.gz
5.part-00001-tid-2754546451699747124-10eb85bf-8d91-4dd0-b60b-2f3c02eeecaa-299-1-c000.csv.gz
6.part-00002-tid-2754546451699747124-10eb85bf-8d91-4dd0-b60b-2f3c02eeecaa-300-1-c000.csv.gz
7.part-00003-tid-2754546451699747124-10eb85bf-8d91-4dd0-b60b-2f3c02eeecaa-301-1-c000.csv.gz spark.option("header",True).csv(filePath)

解説: (GoShiken メンバーにのみ表示されます)
Which of the following statements about DAGs is correct?

解説: (GoShiken メンバーにのみ表示されます)
Which of the following code blocks reads in the JSON file stored at filePath, enforcing the schema expressed in JSON format in variable json_schema, shown in the code block below?
Code block:
1.json_schema = """
2.{"type": "struct",
3. "fields": [
4. {
5. "name": "itemId",
6. "type": "integer",
7. "nullable": true,
8. "metadata": {}
9. },
10. {
11. "name": "supplier",
12. "type": "string",
13. "nullable": true,
14. "metadata": {}
15. }
16. ]
17.}
18."""

解説: (GoShiken メンバーにのみ表示されます)
Which of the following describes properties of a shuffle?

解説: (GoShiken メンバーにのみ表示されます)
Which of the following describes the role of the cluster manager?

解説: (GoShiken メンバーにのみ表示されます)