Databricks Certified Professional Data Engineer Exam Free Practice Questions ("Databricks Certified Professional Data Engineer" Certification)

Which statement characterizes the general programming model used by Spark Structured Streaming?

Explanation: (visible to GoShiken members only)
A Delta Lake table was created with the below query:

Consider the following query:
DROP TABLE prod.sales_by_store
If this statement is executed by a workspace admin, which result will occur?

A junior member of the data engineering team is exploring the language interoperability of Databricks notebooks. The intended outcome of the below code is to register a view of all sales that occurred in countries on the continent of Africa that appear in the geo_lookup table.
Before executing the code, running SHOW TABLES on the current database indicates the database contains only two tables: geo_lookup and sales.

Which statement correctly describes the outcome of executing these command cells in order in an interactive notebook?
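The notebook cells themselves are not reproduced above. A hypothetical two-cell sketch of the pattern the question describes — a Python cell collecting the African country names from geo_lookup, followed by a SQL cell that registers the view — might look like the following (the column names country and continent, the variable countries_af, and the view name sales_af are all assumptions, not the question's actual code):

```
# Cmd 1 (Python)
countries_af = [row.country for row in
                spark.table("geo_lookup").filter("continent = 'AF'").collect()]

-- Cmd 2 (SQL)
%sql
CREATE OR REPLACE TEMP VIEW sales_af AS
SELECT * FROM sales
WHERE country IN countries_af
```

The point being tested is language interoperability: a plain Python variable such as countries_af is not automatically visible to a SQL cell, so the behavior of the second cell is exactly what the answer choices probe.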

The data governance team is reviewing code used for deleting records for compliance with GDPR. They note the following logic is used to delete records from the Delta Lake table named users.

Assuming that user_id is a unique identifying key and that delete_requests contains all users that have requested deletion, which statement describes whether successfully executing the above logic guarantees that the records to be deleted are no longer accessible and why?
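The delete logic is not shown above; one plausible shape, assuming the table and column names given in the question, is a Delta Lake DELETE driven by the delete_requests table. The key nuance the question is probing: a DELETE only removes rows from the current table version, while the underlying Parquet files remain accessible via time travel until a VACUUM removes them:

```sql
DELETE FROM users
WHERE user_id IN (SELECT user_id FROM delete_requests);

-- The deleted rows remain recoverable via time travel until old
-- data files are physically removed:
VACUUM users RETAIN 168 HOURS;
```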

The data science team has created and logged a production model using MLflow. The model accepts a list of column names and returns a new column of type DOUBLE.
The following code correctly imports the production model, loads the customers table containing the customer_id key column into a DataFrame, and defines the feature columns needed for the model.

Which code block will output a DataFrame with the schema `customer_id LONG, predictions DOUBLE`?
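For context, the standard way to apply a logged MLflow model to a Spark DataFrame is `mlflow.pyfunc.spark_udf`. A sketch under assumed names (the model URI `models:/churn/Production`, the table `customers`, and a previously defined `columns` list are all hypothetical; this requires a Spark runtime, so it is shown as an untested sketch):

```
import mlflow.pyfunc

# Load the logged model as a Spark UDF returning DOUBLE.
predict_udf = mlflow.pyfunc.spark_udf(
    spark, "models:/churn/Production",  # hypothetical model URI
    result_type="double")

# Select only the key column plus the prediction, matching
# the target schema "customer_id LONG, predictions DOUBLE".
preds_df = (spark.table("customers")
            .select("customer_id",
                    predict_udf(*columns).alias("predictions")))
```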

Which configuration parameter directly affects the size of a Spark partition upon ingestion of data into Spark?
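The parameter in play here is `spark.sql.files.maxPartitionBytes` (128 MB by default), which caps how much file data is packed into each input partition. A rough back-of-the-envelope sketch of the resulting partition count, ignoring `spark.sql.files.openCostInBytes` and file boundaries for simplicity (an assumption, not the full Spark algorithm):

```python
def approx_input_partitions(total_bytes: int,
                            max_partition_bytes: int = 128 * 1024 * 1024) -> int:
    """Rough estimate of input partitions when reading total_bytes of
    splittable file data under spark.sql.files.maxPartitionBytes."""
    return -(-total_bytes // max_partition_bytes)  # ceiling division

# A 1 GiB dataset with the 128 MiB default yields about 8 input partitions.
print(approx_input_partitions(1024 * 1024 * 1024))  # → 8
```

Lowering the setting produces more, smaller partitions; raising it produces fewer, larger ones.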

When scheduling Structured Streaming jobs for production, which configuration automatically recovers from query failures and keeps costs low?
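The commonly recommended pattern is a Databricks Job running on a new job cluster, with retries for automatic recovery and max concurrent runs limited to 1. A hypothetical job-settings fragment sketching those options (field names follow the Databricks Jobs API; the name, path, and cluster values are assumptions):

```json
{
  "name": "streaming-job",
  "max_concurrent_runs": 1,
  "tasks": [{
    "task_key": "stream",
    "notebook_task": { "notebook_path": "/Jobs/stream" },
    "new_cluster": { "spark_version": "13.3.x-scala2.12", "num_workers": 2 },
    "max_retries": -1
  }]
}
```

Job clusters keep costs lower than all-purpose clusters, and `"max_retries": -1` retries indefinitely on query failure.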

A Delta Lake table was created with the below query:

Realizing that the original query had a typographical error, the below code was executed:
ALTER TABLE prod.sales_by_stor RENAME TO prod.sales_by_store
Which result will occur after running the second command?

A data engineer is using Spark's MEMORY_ONLY storage level.
Which indicators should the data engineer look for in the Spark UI's Storage tab to signal that a cached table is not performing optimally?

The Databricks CLI is used to trigger a run of an existing job by passing the job_id parameter. The response indicating that the job run request has been submitted successfully includes a field named run_id.
Which statement describes what the number alongside this field represents?
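An illustrative CLI exchange, with made-up IDs (the job ID 329 and run ID 12345 are examples only):

```
$ databricks jobs run-now --job-id 329
{ "run_id": 12345 }
```

The run_id in the response identifies this particular run, as opposed to the job_id, which identifies the job definition across all of its runs.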
