Databricks-Certified-Data-Engineer-Associate試験無料問題集「Databricks Certified Data Engineer Associate 認定」

A data engineer has configured a Structured Streaming job to read from a table, manipulate the data, and then perform a streaming write into a new table.
The code block used by the data engineer is below:

If the data engineer only wants the query to process all of the available data in as many batches as required, which of the following lines of code should the data engineer use to fill in the blank?

解説: (GoShiken メンバーにのみ表示されます)
A data engineer has developed a data pipeline to ingest data from a JSON source using Auto Loader, but the engineer has not provided any type inference or schema hints in their pipeline. Upon reviewing the data, the data engineer has noticed that all of the columns in the target table are of the string type despite some of the fields only including float or boolean values.
Which of the following describes why Auto Loader inferred all of the columns to be of the string type?

解説: (GoShiken メンバーにのみ表示されます)
Which of the following describes the type of workloads that are always compatible with Auto Loader?

解説: (GoShiken メンバーにのみ表示されます)
A data analysis team has noticed that their Databricks SQL queries are running too slowly when connected to their always-on SQL endpoint. They claim that this issue is present when many members of the team are running small queries simultaneously. They ask the data engineering team for help. The data engineering team notices that each of the team's queries uses the same SQL endpoint.
Which of the following approaches can the data engineering team use to improve the latency of the team's queries?

解説: (GoShiken メンバーにのみ表示されます)
In which of the following file formats is data from Delta Lake tables primarily stored?

解説: (GoShiken メンバーにのみ表示されます)
An engineering manager wants to monitor the performance of a recent project using a Databricks SQL query. For the first week following the project's release, the manager wants the query results to be updated every minute. However, the manager is concerned that the compute resources used for the query will be left running and cost the organization a lot of money beyond the first week of the project's release.
Which of the following approaches can the engineering team use to ensure the query does not cost the organization any money beyond the first week of the project's release?

解説: (GoShiken メンバーにのみ表示されます)
Which of the following describes the storage organization of a Delta table?

解説: (GoShiken メンバーにのみ表示されます)
A data engineer wants to create a new table containing the names of customers who live in France.
They have written the following command:
CREATE TABLE customersInFrance
_____ AS
SELECT id,
firstName,
lastName
FROM customerLocations
WHERE country = 'FRANCE';
A senior data engineer mentions that it is organization policy to include a table property indicating that the new table includes personally identifiable information (Pll).
Which line of code fills in the above blank to successfully complete the task?

解説: (GoShiken メンバーにのみ表示されます)
Which of the following describes a scenario in which a data engineer will want to use a single-node cluster?

解説: (GoShiken メンバーにのみ表示されます)
A data engineer is running code in a Databricks Repo that is cloned from a central Git repository. A colleague of the data engineer informs them that changes have been made and synced to the central Git repository. The data engineer now needs to sync their Databricks Repo to get the changes from the central Git repository.
Which of the following Git operations does the data engineer need to run to accomplish this task?

解説: (GoShiken メンバーにのみ表示されます)