DEA-7TT2試験無料問題集「EMC Associate - Data Science and Big Data Analytics v2 認定」

Consider these itemsets:
(hat, scarf, coat)
(hat, scarf, coat, gloves)
(hat, scarf, gloves)
(hat, gloves)
(scarf, coat, gloves)
What is the confidence of the rule (gloves -> hat)?
Response:

Which data asset is an example of unstructured data?
Response:

Which word or phrase completes the statement? Data-ink ratio is to data visualization as _________.
Response:

What is the primary function of the NameNode in Hadoop?
Response:

Which of the following is an example of quasi-structured data?
Response:

Data has been collected on visitors' viewing habits at a bank's website. Which technique is used to identify pages commonly viewed during the same visit to the website?
Response:

You are using k-means clustering to classify heart patients for a hospital. You have chosen Patient Sex, Height, Weight, Age and Income as measures and have used 3 clusters.
When you create a pair-wise plot of the clusters, you notice that there is significant overlap between the clusters. What should you do?
Response:

Refer to the Exhibit.

You are going into a meeting where you anticipate your manager will have a question on your dataset. Specifically, your manager will want to know about customers that are classified as renters with a good credit status. In order to prepare for the meeting, you create a rule: RENTER => GOOD CREDIT.
What is the confidence of this rule?
Response:

You have been assigned to perform a study of the daily revenue effect of a pricing model of online transactions. All data currently available to you has been loaded into your analytics database. This includes revenue data, pricing data, and online transaction data.
You discover that all data comes in different levels of granularity. The transaction data has timestamps consisting of day, hour, minutes, and seconds. Pricing is stored at the daily level and revenue data is only reported monthly.
What is the next step?
Response:

You have been assigned to do a study of the daily revenue effect of a pricing model of online transactions. All the data currently available to you has been loaded into your analytics database; revenue data, pricing data, and online transaction data.
You find that all the data comes in different levels of granularity. The transaction data has timestamps (day, hour, minutes, seconds), pricing is stored at the daily level, and revenue data is only reported monthly.
What is your next step?
Response:

Which chart type is intended to display correlations between sets of numeric data?
Response:

You have scored your Naive Bayesian Classifier model on "hold out" test data for cross validation. You have determined the way the samples scored and have tabulated them as shown in the exhibit.

What are the Precision and Recall rates of the model?
Response:

A data scientist plans to classify the sentiment polarity of 10, 000 product reviews collected from the Internet. What is the most appropriate model to use? Suppose labeled training data is available.
Response:
Naive Bayesian classifier

During a study to understand the population growth of a certain bacterial culture, you plot the data and identify a quadratic growth trend over time. Which transformation should you apply to linearize the data?
Response:

To ensure a successful analytic project, which key role can provide business domain expertise with a deep understanding of the data and key performance indicators?
Response:

Refer to the exhibit.

To predict whether or not a customer will renew their annual property insurance policy, an insurance company built and operationalized a naive Bayes classification model.
In the model, there are two class labels, renewal and non-renewal, that are assigned to each customer based on their attributes. A subset of the key attributes, their values, and corresponding conditional probabilities are provided in the exhibit.
A customer has the following attributes:
- Age is greater than 65 years
- Owns their own home
- Renewal month is August
If 20% of customers do not renew their policies every year, what is the score for a non-renewal in the naive Bayesian model for the customer described above?
Response:

To ensure a successful analytic project, which key role can consult and advise the project team on the value of end results and how these will be used on a daily basis?
Response:

Which data asset is an example of semi-structured data?
Response:

Consider this SQL statement:
SELECT product, prod_cost, avg(prod_cost) OVER (PARTITION BY product)
FROM product_detail
The OVER clause makes this what type of function?
Response:

Consider the following itemsets:
(hat, scarf, coat)
(hat, scarf, coat, gloves)
(hat, scarf, gloves)
(hat, gloves)
(scarf, coat, gloves)
If the minimum support is 50%, what represents the complete list of frequent 2-itemsets?
Response: