Databricks-Certified-Data-Engineer-Associate Exam Free Question Set (102 Questions): Databricks Certified Data Engineer Associate Certification

Question 1

A new data engineering team has been assigned to work on a project. The team will need access to the database customers in order to see what tables already exist. The team has its own group, team.
Which of the following commands can be used to grant the necessary permission on the entire database to the new team?

A. GRANT USAGE ON DATABASE customers TO team;

B. GRANT VIEW ON CATALOG customers TO team;

C. GRANT CREATE ON DATABASE team TO customers;

D. GRANT CREATE ON DATABASE customers TO team;

E. GRANT USAGE ON CATALOG team TO customers;

Correct answer: A

Question 2

A data engineer has a Python variable table_name that they would like to use in a SQL query. They want to construct a Python code block that will run the query using table_name.
They have the following incomplete code block:
____(f"SELECT customer_id, spend FROM {table_name}")
Which of the following can be used to fill in the blank to successfully complete the task?

A. spark.delta.sql

B. spark.delta.table

C. spark.sql

D. spark.table

E. dbutils.sql

Correct answer: C
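
The completed code block from this question can be sketched as follows. This is a minimal illustration that assumes a SparkSession is passed in as `spark` (as it is predefined in a Databricks notebook); the helper names `build_query` and `run_query` are hypothetical and not part of the question.

```python
def build_query(table_name: str) -> str:
    """Interpolate the Python variable into the SQL text with an f-string."""
    return f"SELECT customer_id, spend FROM {table_name}"


def run_query(spark, table_name: str):
    """Pass the finished SQL string to spark.sql, which returns a DataFrame.

    `spark` is assumed to be an existing SparkSession.
    """
    return spark.sql(build_query(table_name))
```

In a notebook, `run_query(spark, table_name)` would be equivalent to filling the blank with `spark.sql`.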

Question 3

Which file format is used for storing Delta Lake tables?

A. Parquet

B. Delta

C. CSV

D. JSON

Correct answer: A

Question 4

A data analyst has a series of queries in a SQL program. The data analyst wants this program to run every day. They only want the final query in the program to run on Sundays. They ask for help from the data engineering team to complete this task.
Which of the following approaches could be used by the data engineering team to complete this task?

A. They could redesign the data model to separate the data used in the final query into a new table.

B. They could wrap the queries using PySpark and use Python's control flow system to determine when to run the final query.

C. They could submit a feature request with Databricks to add this functionality.

D. They could automatically restrict access to the source table in the final query so that it is only accessible on Sundays.

E. They could only run the entire program on Sundays.

Correct answer: B
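
One way the PySpark wrapper described in option B might look is sketched below. It assumes a SparkSession is passed in as `spark`; the function names and the query lists are hypothetical placeholders, not taken from the question.

```python
import datetime

SUNDAY = 6  # date.weekday() numbers Monday as 0 and Sunday as 6


def queries_to_run(daily_queries, final_query, today):
    """Return the daily queries, plus the final query only on Sundays."""
    queries = list(daily_queries)
    if today.weekday() == SUNDAY:
        queries.append(final_query)
    return queries


def run_program(spark, daily_queries, final_query, today=None):
    """Run each selected query through spark.sql (spark is an assumed SparkSession)."""
    today = today or datetime.date.today()
    for query in queries_to_run(daily_queries, final_query, today):
        spark.sql(query)
```

Scheduling this wrapper to run daily leaves the decision about the final query to ordinary Python control flow.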

Question 5

A data engineer wants to schedule their Databricks SQL dashboard to refresh every hour, but they only want the associated SQL endpoint to be running when it is necessary. The dashboard has multiple queries on multiple datasets associated with it. The data that feeds the dashboard is automatically processed using a Databricks Job.
Which approach can the data engineer use to minimize the total running time of the SQL endpoint used in the refresh schedule of their dashboard?

A. They can reduce the cluster size of the SQL endpoint.

B. They can ensure the dashboard's SQL endpoint matches each of the queries' SQL endpoints.

C. They can set up the dashboard's SQL endpoint to be serverless.

D. They can turn on the Auto Stop feature for the SQL endpoint.

Correct answer: D

Question 6

A data engineer and data analyst are working together on a data pipeline. The data engineer is working on the raw, bronze, and silver layers of the pipeline using Python, and the data analyst is working on the gold layer of the pipeline using SQL. The raw source of the pipeline is a streaming input. They now want to migrate their pipeline to use Delta Live Tables.
Which of the following changes will need to be made to the pipeline when migrating to Delta Live Tables?

A. The pipeline will need to stop using the medallion-based multi-hop architecture

B. The pipeline will need to be written entirely in Python

C. The pipeline will need to be written entirely in SQL

D. None of these changes will need to be made

E. The pipeline will need to use a batch source in place of a streaming source

Correct answer: D

Question 7

A data engineer is maintaining a data pipeline. Upon data ingestion, the data engineer notices that the source data is starting to have a lower level of quality. The data engineer would like to automate the process of monitoring the quality level.
Which of the following tools can the data engineer use to solve this problem?

A. Auto Loader

B. Unity Catalog

C. Data Explorer

D. Delta Lake

E. Delta Live Tables

Correct answer: E

Question 8

A data analyst has created a Delta table sales that is used by the entire data analysis team. They want help from the data engineering team to implement a series of tests to ensure the data is clean. However, the data engineering team uses Python for its tests rather than SQL.
Which of the following commands could the data engineering team use to access sales in PySpark?

A. There is no way to share data between PySpark and SQL.

B. spark.table("sales")

C. SELECT * FROM sales

D. spark.delta.table("sales")

E. spark.sql("sales")

Correct answer: B
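
A minimal sketch of how a Python test might reach the table is shown below. It assumes a SparkSession is passed in as `spark`; the helper names and the non-empty check are hypothetical examples of a test, not something the question specifies.

```python
def load_table(spark, name="sales"):
    """Return a table registered in the metastore as a PySpark DataFrame."""
    return spark.table(name)


def table_is_non_empty(spark, name="sales"):
    """Example data check: the table contains at least one row."""
    return load_table(spark, name).count() > 0
```

Because `spark.table` reads the same registered table that SQL users query, no data has to be copied between the two languages.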

Question 9

A dataset has been defined using Delta Live Tables and includes an expectations clause:
CONSTRAINT valid_timestamp EXPECT (timestamp > '2020-01-01') ON VIOLATION DROP ROW
What is the expected behavior when a batch of data containing data that violates these constraints is processed?

A. Records that violate the expectation cause the job to fail.

B. Records that violate the expectation are added to the target dataset and flagged as invalid in a field added to the target dataset.

C. Records that violate the expectation are dropped from the target dataset and loaded into a quarantine table.

D. Records that violate the expectation are added to the target dataset and recorded as invalid in the event log.

E. Records that violate the expectation are dropped from the target dataset and recorded as invalid in the event log.

Correct answer: E
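
The drop-and-record behavior can be illustrated with a plain-Python simulation. This is not the Delta Live Tables API or its implementation; the function names and sample rows are made up for illustration, and the returned count merely stands in for what the event log would record.

```python
import datetime

CUTOFF = datetime.date(2020, 1, 1)


def valid_timestamp(row):
    """Mirror of the expectation condition: timestamp > '2020-01-01'."""
    return row["timestamp"] > CUTOFF


def apply_drop_row(rows, expectation):
    """Keep rows that satisfy the expectation; count the ones that do not."""
    kept = [row for row in rows if expectation(row)]
    dropped = len(rows) - len(kept)
    return kept, dropped


# Hypothetical batch: two of the three rows violate the constraint.
batch = [
    {"id": 1, "timestamp": datetime.date(2021, 6, 1)},
    {"id": 2, "timestamp": datetime.date(2019, 12, 31)},
    {"id": 3, "timestamp": datetime.date(2020, 1, 1)},
]
target_rows, dropped_count = apply_drop_row(batch, valid_timestamp)
```

Only the row with id 1 reaches the simulated target; the other two are excluded and tallied, and the run itself does not fail.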

Question 10

Which of the following SQL keywords can be used to convert a table from a long format to a wide format?

A. WHERE

B. PIVOT

C. TRANSFORM

D. SUM

E. CONVERT

Correct answer: B
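
To make "long to wide" concrete, here is a plain-Python sketch of the reshaping that a PIVOT with a SUM aggregate performs. It is an illustration only, not Spark's implementation; the column names and sample rows are hypothetical.

```python
from collections import defaultdict


def pivot_long_to_wide(rows, index_key, column_key, value_key):
    """Sum value_key per (index_key, column_key) and spread column_key values into columns."""
    wide = defaultdict(dict)
    for row in rows:
        cells = wide[row[index_key]]
        cells[row[column_key]] = cells.get(row[column_key], 0) + row[value_key]
    return {key: dict(cells) for key, cells in wide.items()}


# Long format: one row per (store, quarter) observation.
long_rows = [
    {"store": "north", "quarter": "Q1", "amount": 10},
    {"store": "north", "quarter": "Q2", "amount": 15},
    {"store": "south", "quarter": "Q1", "amount": 7},
    {"store": "north", "quarter": "Q1", "amount": 5},
]

# Wide format: one entry per store, one column per quarter.
wide_rows = pivot_long_to_wide(long_rows, "store", "quarter", "amount")
```

Each distinct value of the pivoted column becomes its own column, which is why the wide result has fewer rows than the long input.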
