Professional-Data-Engineer Free Practice Questions: Google Certified Professional Data Engineer Certification

Page: 1 / 28
Total: 380 questions


Question 1

An aerospace company uses a proprietary data format to store its flight data. You need to connect this new data source to BigQuery and stream the data into BigQuery. You want to efficiently import the data into BigQuery while consuming as few resources as possible. What should you do?

A. Use Apache Hive to write a Dataproc job that streams the data into BigQuery in CSV format.

B. Use a standard Dataflow pipeline to store the raw data in BigQuery, and then transform the format later when the data is used.

C. Use an Apache Beam custom connector to write a Dataflow pipeline that streams the data into BigQuery in Avro format.

D. Write a shell script that triggers a Cloud Function that performs periodic ETL batch jobs on the new data source.

Correct answer: C

Question 2

Your chemical company needs to manually check documentation for customer orders. You use a pull subscription in Pub/Sub so that sales agents get details from the order. You must ensure that you do not process orders twice with different sales agents and that you do not add more complexity to this workflow.
What should you do?

A. Use Pub/Sub exactly-once delivery in your pull subscription.

B. Create a transactional database that monitors the pending messages.

C. Use a Deduplicate PTransform in Dataflow before sending the messages to the sales agents.

D. Create a new Pub/Sub push subscription to monitor the orders processed in the agent's system.

Correct answer: A


Question 3

You have data located in BigQuery that is used to generate reports for your company. You have noticed that some weekly executive report fields do not conform to company formatting standards. For example, report errors include different telephone formats and different country code identifiers. This is a frequent issue, so you need to create a recurring job to normalize the data. You want a quick solution that requires no coding. What should you do?

A. Use Cloud Data Fusion and Wrangler to normalize the data, and set up a recurring job.

B. Create a Spark job and submit it to Dataproc Serverless.

C. Use Dataflow SQL to create a job that normalizes the data, and after the first run of the job, schedule the pipeline to execute recurrently.

D. Use BigQuery and GoogleSQL to normalize the data, and schedule recurring queries in BigQuery.

Correct answer: A


Question 4

You are designing a messaging system by using Pub/Sub to process clickstream data with an event-driven consumer app that relies on a push subscription. You need to configure a messaging system that is reliable enough to handle temporary downtime of the consumer app. You also need the messaging system to store the input messages that cannot be consumed by the subscriber. The system needs to retry failed messages gradually, avoiding overloading the consumer app, and store the failed messages after a maximum of 10 retries in a topic. How should you configure the Pub/Sub subscription?

A. Increase the acknowledgement deadline to 10 minutes.

B. Use exponential backoff as the subscription retry policy, and configure dead lettering to the same source topic with maximum delivery attempts set to 10.

C. Use immediate redelivery as the subscription retry policy, and configure dead lettering to a different topic with maximum delivery attempts set to 10.

D. Use exponential backoff as the subscription retry policy, and configure dead lettering to a different topic with maximum delivery attempts set to 10.

Correct answer: D
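
The behaviour described in option D can be sketched as a small, self-contained Python simulation. It does not call Pub/Sub. The function names (`backoff_delay`, `deliver_with_dead_letter`), the 10-second and 600-second backoff bounds, and the doubling schedule are assumptions chosen for illustration; the real service decides its own retry timing within the configured bounds.

```python
from typing import Callable, List, Tuple

MAX_DELIVERY_ATTEMPTS = 10  # from the question: dead-letter after 10 attempts
MIN_BACKOFF_S = 10.0        # assumed lower bound, for illustration only
MAX_BACKOFF_S = 600.0       # assumed upper bound, for illustration only


def backoff_delay(attempt: int) -> float:
    """Delay in seconds before the retry that follows failed attempt `attempt` (1-based).

    The delay doubles after each failure and is capped at MAX_BACKOFF_S.
    """
    return min(MAX_BACKOFF_S, MIN_BACKOFF_S * 2 ** (attempt - 1))


def deliver_with_dead_letter(
    message: str,
    consumer: Callable[[str], bool],
    dead_letter_topic: List[str],
) -> Tuple[bool, List[float]]:
    """Try to deliver `message`; after 10 failed attempts, store it in a separate topic.

    Returns (delivered, delays), where `delays` lists the waits between attempts.
    """
    delays: List[float] = []
    for attempt in range(1, MAX_DELIVERY_ATTEMPTS + 1):
        if consumer(message):
            return True, delays
        if attempt < MAX_DELIVERY_ATTEMPTS:
            delays.append(backoff_delay(attempt))
    # Delivery attempts are exhausted: forward to a *different* topic (a list here).
    dead_letter_topic.append(message)
    return False, delays


# Hypothetical consumer that is down for the whole retry window.
dead_letters: List[str] = []
delivered, waits = deliver_with_dead_letter("click-1", lambda m: False, dead_letters)
print(delivered, waits, dead_letters)
```

The growing gaps between attempts are what keep a recovering consumer from being flooded, and the separate dead-letter list stands in for a topic that is distinct from the source topic, so failed messages are stored rather than fed back into the same subscription.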

Question 5

Your company currently runs a large on-premises cluster using Spark, Hive, and Hadoop Distributed File System (HDFS) in a colocation facility. The cluster is designed to support peak usage on the system; however, many jobs are batch in nature, and usage of the cluster fluctuates quite dramatically.
Your company is eager to move to the cloud to reduce the overhead associated with on-premises infrastructure and maintenance and to benefit from the cost savings. They are also hoping to modernize their existing infrastructure to use more serverless offerings in order to take advantage of the cloud. Because of the timing of their contract renewal with the colocation facility, they have only 2 months for their initial migration. How should you recommend they approach their upcoming migration strategy so they can maximize their cost savings in the cloud while still executing the migration in time?

A. Migrate the workloads to Dataproc plus Cloud Storage; modernize later.

B. Migrate the Spark workload to Dataproc plus HDFS, and modernize the Hive workload for BigQuery.

C. Migrate the workloads to Dataproc plus HDFS; modernize later.

D. Modernize the Spark workload for Dataflow and the Hive workload for BigQuery.

Correct answer: D

Question 6

Your startup has a web application that currently serves customers out of a single region in Asia. You are targeting funding that will allow your startup to serve customers globally. Your current goal is to optimize for cost, and your post-funding goal is to optimize for global presence and performance. You must use a native JDBC driver. What should you do?

A. Use a Cloud SQL for PostgreSQL highly available instance first, and Bigtable with US, Europe, and Asia replication after securing funding.

B. Use a Cloud SQL for PostgreSQL zonal instance first, and Bigtable with US, Europe, and Asia after securing funding.

C. Use Cloud Spanner to configure a single-region instance initially, and then configure multi-region Cloud Spanner instances after securing funding.

D. Use a Cloud SQL for PostgreSQL zonal instance first, and Cloud SQL for PostgreSQL with highly available configuration after securing funding.

Correct answer: C


Question 7

You want to use a BigQuery table as a data sink. In which writing mode(s) can you use BigQuery as a sink?

A. Both batch and streaming

B. Only batch

C. Only streaming

D. BigQuery cannot be used as a sink

Correct answer: A


Question 8

When using Cloud Dataproc clusters, you can access the YARN web interface by configuring a browser to connect through a ____ proxy.

A. VPN

B. SOCKS

C. HTTPS

D. HTTP

Correct answer: B


Question 9

Your company is streaming real-time sensor data from their factory floor into Bigtable and they have noticed extremely poor performance. How should the row key be redesigned to improve Bigtable performance on queries that populate real-time dashboards?

A. Use a row key of the form <timestamp>.

B. Use a row key of the form >#<sensorid>#<timestamp>.

C. Use a row key of the form <sensorid>.

D. Use a row key of the form <timestamp>#<sensorid>.

Correct answer: A

Question 10

You have a data processing application that runs on Google Kubernetes Engine (GKE). Containers need to be launched with their latest available configurations from a container registry. Your GKE nodes need to have GPUs, local SSDs, and 8 Gbps bandwidth. You want to efficiently provision the data processing infrastructure and manage the deployment process. What should you do?

A. Use GKE to autoscale containers, and use gcloud commands to provision the infrastructure.

B. Use Compute Engine startup scripts to pull container images, and use gcloud commands to provision the infrastructure.

C. Use Cloud Build to schedule a job using Terraform build to provision the infrastructure and launch with the most current container images.

D. Use Dataflow to provision the data pipeline, and use Cloud Scheduler to run the job.

Correct answer: C


Question 11

You are updating the code for a subscriber to a Pub/Sub feed. You are concerned that upon deployment the subscriber may erroneously acknowledge messages, leading to message loss. Your subscriber is not set up to retain acknowledged messages. What should you do to ensure that you can recover from errors after deployment?

A. Create a Pub/Sub snapshot before deploying new subscriber code. Use a Seek operation to re-deliver messages that became available after the snapshot was created.

B. Enable dead-lettering on the Pub/Sub topic to capture messages that aren't successfully acknowledged. If an error occurs after deployment, re-deliver any messages captured by the dead-letter queue.

C. Set up the Pub/Sub emulator on your local machine. Validate the behavior of your new subscriber logic before deploying it to production.

D. Use Cloud Build for your deployment. If an error occurs after deployment, use a Seek operation to locate a timestamp logged by Cloud Build at the start of the deployment.

Correct answer: A
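
The snapshot-then-seek recovery in option A can be illustrated with a toy in-memory model. This is a minimal sketch, not the Pub/Sub client API: the `ToySubscription` class and its methods are hypothetical, and the model simply keeps every message, whereas in the real service the snapshot is what preserves the unacknowledged backlog.

```python
from typing import FrozenSet, List, Set


class ToySubscription:
    """In-memory stand-in for a subscription (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._messages: List[str] = []  # every published message id, in order
        self._acked: Set[str] = set()   # ids that have been acknowledged

    def publish(self, message_id: str) -> None:
        self._messages.append(message_id)

    def ack(self, message_id: str) -> None:
        self._acked.add(message_id)

    def create_snapshot(self) -> FrozenSet[str]:
        # A snapshot records the acknowledgement state at this moment.
        return frozenset(self._acked)

    def seek(self, snapshot: FrozenSet[str]) -> None:
        # Seeking to a snapshot restores the recorded acknowledgement state.
        self._acked = set(snapshot)

    def pending(self) -> List[str]:
        return [m for m in self._messages if m not in self._acked]


subscription = ToySubscription()
subscription.publish("m1")
subscription.publish("m2")
subscription.ack("m1")                      # processed correctly by the old code

snapshot = subscription.create_snapshot()   # taken before deploying new code
subscription.publish("m3")                  # arrives after the snapshot

# The new (buggy) subscriber acknowledges everything without processing it.
for message_id in subscription.pending():
    subscription.ack(message_id)
lost = subscription.pending()

subscription.seek(snapshot)                 # roll the acknowledgement state back
recovered = subscription.pending()
print(lost, recovered)
```

After the seek, both the message that was still unacknowledged when the snapshot was taken and the message published afterwards are pending again, which is the recovery path the question asks for.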

Question 12

Your company's data platform ingests CSV file dumps of booking and user profile data from upstream sources into Cloud Storage. The data analyst team wants to join these datasets on the email field available in both datasets to perform analysis. However, personally identifiable information (PII) should not be accessible to the analysts. You need to de-identify the email field in both datasets before loading them into BigQuery for analysts. What should you do?

A. 1. Load the CSV files from Cloud Storage into a BigQuery table, and enable dynamic data masking.
2. Create a policy tag with the default masking value as the data masking rule.
3. Assign the policy to the email field in both tables.
4. Assign the Identity and Access Management bigquerydatapolicy.maskedReader role for the BigQuery tables to the analysts.

B. 1. Load the CSV files from Cloud Storage into a BigQuery table, and enable dynamic data masking.
2. Create a policy tag with the email mask as the data masking rule.
3. Assign the policy to the email field in both tables.
4. Assign the Identity and Access Management bigquerydatapolicy.maskedReader role for the BigQuery tables to the analysts.

C. 1. Create a pipeline to de-identify the email field by using recordTransformations in Cloud DLP with format-preserving encryption with FFX as the de-identification transformation type.
2. Load the booking and user profile data into a BigQuery table.

D. 1. Create a pipeline to de-identify the email field by using recordTransformations in Cloud Data Loss Prevention (Cloud DLP) with masking as the de-identification transformation type.
2. Load the booking and user profile data into a BigQuery table.

Correct answer: C
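
The property that matters for the join is that de-identification is deterministic: the same email must map to the same token in both datasets. The sketch below illustrates that property only. It swaps in a keyed HMAC-SHA256 from the Python standard library as a stand-in and is not Cloud DLP or format-preserving encryption with FFX. The key, the sample rows, the helper names, and the lowercase normalization step are all assumptions made for illustration.

```python
import hashlib
import hmac
from typing import Dict, List

SECRET_KEY = b"demo-key-not-for-production"  # hypothetical key, illustration only


def pseudonymize_email(email: str) -> str:
    """Map an email to a deterministic token (HMAC-SHA256 stand-in, not FFX)."""
    normalized = email.strip().lower()  # assumed normalization step
    return hmac.new(SECRET_KEY, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(rows: List[Dict]) -> List[Dict]:
    """Return copies of the rows with the email field replaced by its token."""
    return [{**row, "email": pseudonymize_email(row["email"])} for row in rows]


# Hypothetical sample data standing in for the two CSV dumps.
bookings = [
    {"email": "ana@example.com", "booking_id": 1},
    {"email": "bo@example.com", "booking_id": 2},
]
profiles = [
    {"email": "ana@example.com", "country": "PT"},
    {"email": "bo@example.com", "country": "SE"},
    {"email": "cy@example.com", "country": "JP"},
]

safe_bookings = deidentify(bookings)
safe_profiles = deidentify(profiles)

# Join on the token: equal emails produce equal tokens, so the join still works.
profiles_by_token = {row["email"]: row for row in safe_profiles}
joined = [
    {**booking, **profiles_by_token[booking["email"]]}
    for booking in safe_bookings
    if booking["email"] in profiles_by_token
]
print(joined)
```

Character masking, by contrast, overwrites characters with a fixed symbol, so distinct emails can collapse to the same masked value and the join key stops being reliable.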


Question 13

You have 100 GB of data stored in a BigQuery table. This data is outdated and will only be accessed one or two times a year for analytics with SQL. For backup purposes, you want to store this data to be immutable for 3 years. You want to minimize storage costs. What should you do?

A. 1. Create a BigQuery table snapshot.
2. Restore the snapshot when you need to perform analytics.

B. 1. Create a BigQuery table clone.
2. Query the clone when you need to perform analytics.

C. 1. Perform a BigQuery export to a Cloud Storage bucket with archive storage class.
2. Set a locked retention policy on the bucket.
3. Create a BigQuery external table on the exported files.

D. 1. Perform a BigQuery export to a Cloud Storage bucket with archive storage class.
2. Enable versioning on the bucket.
3. Create a BigQuery external table on the exported files.

Correct answer: C


Question 14

You are building a new real-time data warehouse for your company and will use Google BigQuery streaming inserts. There is no guarantee that data will be sent only once, but you do have a unique ID for each row of data and an event timestamp. You want to ensure that duplicates are not included while interactively querying data. Which query type should you use?

A. Use the ROW_NUMBER window function with PARTITION BY unique ID along with WHERE row equals 1.

B. Use GROUP BY on the unique ID column and timestamp column and SUM on the values.

C. Use the LAG window function with PARTITION by unique ID along with WHERE LAG IS NOT NULL.

D. Include ORDER BY DESC on timestamp column and LIMIT to 1.

Correct answer: A
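
The query shape in option A can be tried locally. The sketch below runs it against an in-memory SQLite database from the Python standard library as a stand-in for BigQuery, so it assumes a SQLite build with window-function support (3.25 or later). The table name `events`, its columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (unique_id TEXT, event_ts INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("a", 1, 10.0),
        ("a", 1, 10.0),  # duplicate delivery of row "a"
        ("b", 2, 20.0),
        ("c", 3, 30.0),
        ("c", 3, 30.0),  # duplicate delivery of row "c"
    ],
)

# Number the rows within each unique_id and keep only the first one.
DEDUP_QUERY = """
SELECT unique_id, event_ts, value
FROM (
  SELECT
    unique_id,
    event_ts,
    value,
    ROW_NUMBER() OVER (PARTITION BY unique_id ORDER BY event_ts DESC) AS row_num
  FROM events
) AS ranked
WHERE row_num = 1
ORDER BY unique_id
"""

deduplicated = conn.execute(DEDUP_QUERY).fetchall()
print(deduplicated)
```

Because each partition contains only copies of one logical row, keeping `row_num = 1` returns every unique ID exactly once without aggregating or altering the values.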
