DEA-7TT2 Exam Free Question Set (233 questions): "EMC Associate - Data Science and Big Data Analytics v2 Certification"

Question: 1

Which word or phrase completes the statement: "Excessive emphasis color is to bar chart as __________________"?
Response:

A. Confidence interval is to regression

B. Multicollinearity is to OLS

C. Multicollinearity is to serial correlation

D. Confidence is to leverage

Correct answer: B

Question: 2

You have created a scatter plot using R from household income and education data as shown in the graphic. What can be done to improve the visualization?
Response:

A. Add a rug to the plot

B. Add a Box and Whisker overlay to the plot

C. Recreate the plot with a Bar plot

D. Recreate the plot with a hexbin plot

Correct answer: D

Question: 3

Which characteristic applies only to Business Intelligence as opposed to Data Science?
Response:

A. Uses large data sets

B. Supports solving "what if" scenarios

C. Uses only structured data

D. Uses predictive modeling techniques

Correct answer: C

Question: 4

Refer to the exhibit, which shows pairwise counts for items purchased together.

Consider the following association rules:
- Milk -> Eggs
- Eggs -> Milk
- Bread -> Milk
- Milk -> Bread
Which rule has a confidence higher than 70%?
Response:

A. Milk -> Bread

B. Eggs -> Milk

C. Bread -> Milk

D. Milk -> Eggs

Correct answer: B

Question: 5

Which SQL OLAP extension provides all possible grouping combinations?
Response:

A. UNION ALL

B. ROLLUP

C. CROSS JOIN

D. CUBE

Correct answer: D
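
To see why CUBE fits the wording "all possible grouping combinations", the following minimal Python sketch enumerates the grouping sets each extension produces. The column names (`region`, `product`, `year`) are hypothetical. The enumeration follows standard SQL semantics: CUBE over n columns yields every subset (2^n grouping sets), while ROLLUP yields only the n + 1 hierarchical prefixes.

```python
from itertools import combinations

def cube_grouping_sets(columns):
    """All subsets of the columns, as SQL CUBE would group them (2**n sets)."""
    return [combo
            for size in range(len(columns), -1, -1)
            for combo in combinations(columns, size)]

def rollup_grouping_sets(columns):
    """Hierarchical prefixes only, as SQL ROLLUP would group them (n + 1 sets)."""
    return [tuple(columns[:size]) for size in range(len(columns), -1, -1)]

# Hypothetical column names, for illustration only.
cols = ["region", "product", "year"]
cube_sets = cube_grouping_sets(cols)
rollup_sets = rollup_grouping_sets(cols)

print(len(cube_sets), "CUBE grouping sets")
print(len(rollup_sets), "ROLLUP grouping sets")
```

A combination such as (`region`, `year`) appears in the CUBE list but not in the ROLLUP list, which is the practical difference between the two.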

Question: 6

You have an automotive database containing numeric characteristics such as engine size, horsepower, and top speed. Which technique could you use to group similar cars together?
Response:

A. K-means clustering

B. Naive Bayes classifier

C. Association rules

D. Logistic regression

Correct answer: A
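
As a rough illustration of how k-means groups unlabeled numeric records, here is a minimal sketch of Lloyd's algorithm in plain Python. The car data is hypothetical, and seeding the centroids with the first k points is a simplification chosen to keep the run deterministic. The features are deliberately left unscaled; in practice they would usually be standardized first so that horsepower does not dominate the distance.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids

# Hypothetical cars: (engine size in litres, horsepower, top speed in km/h).
cars = [
    (1.2, 75, 160), (5.0, 450, 290), (1.4, 90, 170),
    (4.4, 400, 280), (1.0, 65, 155), (6.2, 480, 300),
]
labels, centroids = kmeans(cars, k=2)
print(labels)
```

No class labels are supplied anywhere, which is what separates this from the classifier options in the question.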

Question: 7

Consider a database with 4 transactions:
Transaction 1: {cheese, bread, milk}
Transaction 2: {soda, bread, milk}
Transaction 3: {cheese, bread}
Transaction 4: {cheese, soda, juice}
The minimum support is 25%. Which rule has a confidence equal to 50%?
Response:

A. {bread} => {cheese}

B. {bread, milk} => {cheese}

C. {juice} => {soda}

D. {bread} => {milk}

Correct answer: B
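
The confidence of each option can be checked directly from the four transactions in the question. The sketch below uses the usual definition, confidence(X => Y) = support(X and Y together) / support(X); the helper names are illustrative.

```python
transactions = [
    {"cheese", "bread", "milk"},
    {"soda", "bread", "milk"},
    {"cheese", "bread"},
    {"cheese", "soda", "juice"},
]

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(lhs, rhs):
    """confidence(lhs => rhs) = support(lhs and rhs together) / support(lhs)."""
    return support(lhs | rhs) / support(lhs)

rules = {
    "A": ({"bread"}, {"cheese"}),
    "B": ({"bread", "milk"}, {"cheese"}),
    "C": ({"juice"}, {"soda"}),
    "D": ({"bread"}, {"milk"}),
}
results = {label: confidence(lhs, rhs) for label, (lhs, rhs) in rules.items()}
for label, value in results.items():
    print(label, f"{value:.0%}")
```

Only option B comes out at exactly 50%: {bread, milk} occurs in two transactions, and one of them also contains cheese.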

Question: 8

A call center for a large electronics company handles an average of 35,000 support calls a day. The head of the call center would like to optimize the staffing of the call center during the rollout of a new product due to recent customer complaints of long wait times.
You have been asked to create a model to optimize call center costs and customer wait times. The goals for this project include:
1. Relative to the release of a product, how does the call volume change over time?
2. How to best optimize staffing based on the call volume for the newly released product, relative to old products.
3. Historically, what time of day does the call center need to be most heavily staffed?
4. Determine the frequency of calls by both product type and customer language.
Which goals are suitable to be completed with MapReduce?
Response:

A. Goals 1 and 3

B. Goals 2 and 4

C. Goals 1, 2, 3, 4

D. Goals 2, 3, 4

Correct answer: B

Question: 9

Refer to the exhibit.

Click on the calculator icon in the upper left corner. An analyst is searching a corpus of documents for the topic "solid state disk".
In the exhibit, Table A provides the inverse document frequency for each term across the corpus. Table B provides each term's frequency in four documents selected from the corpus.
Which of the four documents is most relevant to the analyst's search?
Response:

A. Document B

B. Document D

C. Document C

D. Document A

Correct answer: C
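
The exhibit's tables are not reproduced in this text, so the numbers below are entirely hypothetical and say nothing about which document the exhibit favors. The sketch only shows the mechanics such a question relies on: score each document by summing term frequency times inverse document frequency over the query terms, then pick the highest score. The document names `doc_1` to `doc_4` are placeholders.

```python
# Hypothetical IDF values and term counts; NOT the exhibit's data.
idf = {"solid": 1.2, "state": 0.6, "disk": 1.5}
term_counts = {
    "doc_1": {"solid": 1, "state": 4, "disk": 0},
    "doc_2": {"solid": 3, "state": 2, "disk": 4},
    "doc_3": {"solid": 0, "state": 2, "disk": 1},
    "doc_4": {"solid": 2, "state": 0, "disk": 1},
}

def tfidf_score(counts, idf_table, query_terms):
    """Sum of (term frequency * inverse document frequency) over the query."""
    return sum(counts.get(term, 0) * idf_table[term] for term in query_terms)

query = ["solid", "state", "disk"]
scores = {doc: tfidf_score(counts, idf, query)
          for doc, counts in term_counts.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 2))
```

Note that a common term with a low IDF (here `state`) contributes little even when it appears often, which is why raw term counts alone can mislead.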

Question: 10

Consider these itemsets:
(hat, scarf, coat)
(hat, scarf, coat, gloves)
(hat, scarf, gloves)
(hat, gloves)
(scarf, coat, gloves)
What is the confidence of the rule (hat, scarf) => gloves?
Response:

A. 66%

B. 40%

C. 60%

D. 50%

Correct answer: A
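
The arithmetic can be worked from the five itemsets listed in the question. The pair (hat, scarf) appears in three of them (the first, second, and third), and two of those three also contain gloves (the second and third):

```latex
\[
\operatorname{confidence}\bigl(\{\text{hat},\text{scarf}\} \Rightarrow \{\text{gloves}\}\bigr)
= \frac{\operatorname{support}(\{\text{hat},\text{scarf},\text{gloves}\})}
       {\operatorname{support}(\{\text{hat},\text{scarf}\})}
= \frac{2/5}{3/5}
= \frac{2}{3} \approx 66.7\%
\]
```

The option list truncates this value to 66%.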

Question: 11

Refer to the exhibit.

You have plotted the distribution of savings account sizes for your bank. How would you proceed, based on this distribution?
Response:

A. The accounts of size greater than 2500 are rare, and probably outliers. Eliminate them from your future analysis.

B. The data is extremely skewed. Replot the data on a logarithmic scale to get a better sense of it.

C. The data is extremely skewed, but looks bimodal; replot the data in the range 2,500-10,000 to be sure.

D. The data is extremely skewed. Split your analysis into two cohorts: accounts less than 2500, and accounts greater than 2500

Correct answer: B

Question: 12

In a fitted ARIMA(1,2,3) model, how many differences are applied?
Response:

A. 2

B. 1

C. 0

D. 3

Correct answer: A
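
In ARIMA(p, d, q) notation, p is the autoregressive order, d the number of differences, and q the moving-average order, so ARIMA(1,2,3) differences the series twice. The sketch below shows what differencing twice does to a hypothetical series with a quadratic trend; the `difference` helper is illustrative, not a library function.

```python
def difference(series, d):
    """Apply first-order differencing d times (the 'I' step of ARIMA)."""
    for _ in range(d):
        series = [curr - prev for prev, curr in zip(series, series[1:])]
    return series

# Hypothetical series with a quadratic trend: y_t = t**2.
y = [t ** 2 for t in range(8)]

p, d, q = 1, 2, 3  # ARIMA(p, d, q): d is the number of differences
twice_differenced = difference(y, d)
print(twice_differenced)
```

Each round of differencing shortens the series by one observation; two rounds turn the quadratic trend into a constant.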

Question: 13

Refer to the exhibit.

You are using k-means clustering to discover groupings within a data set. You plot the within-sum-of-squares (wss) for multiple cluster sizes. Based on the exhibit, how many clusters should you use in your analysis?
Response:

A. 8

B. 2

C. 4

D. 10

Correct answer: C

Question: 14

You have created a Linear Regression model to predict total sales based on variables M, N, P and Q as shown in the graphic. You originally expected all variables to have positive coefficients. Which action would you take?
Response:

A. Accept only statistically significant variables and investigate correlated independent variables

B. Accept none of the variables and investigate correlations between all variables

C. Accept all variables and begin model validation steps against holdout data

D. Accept only positive variables and investigate potential correlation with the dependent variable

Correct answer: B

Question: 15

What is the reason for using LOESS?
Response:

A. Plots a continuous variable versus a discrete variable; comparing distributions across classes

B. Runs after a one-way ANOVA; determining which population has the highest mean value

C. Fits a smoothed curve to scatterplot data; providing a general idea of the data's behavior

D. Significance test for the correlation between two variables

Correct answer: C

Question: 16

You have just completed the Discovery phase of a project and finished interviewing the main stakeholders. You have identified the necessary data feeds and are now beginning to set up the analytic sandbox. What is the next step?
Response:

A. Run descriptive statistics for several data sets

B. Perform ELT / ETL

C. Assess data quality

D. Create data visualizations

Correct answer: B

Question: 17

Consider a database with 4 transactions:
Transaction 1: {cheese, bread, milk}
Transaction 2: {soda, bread, milk}
Transaction 3: {cheese, bread}
Transaction 4: {cheese, soda, juice}
You decide to run the association rules algorithm where minimum support is 50%. Which rule has a confidence at least 50%?
Response:

A. {soda} => {milk}

B. {juice} => {cheese}

C. {cheese} => {bread}

D. {milk} => {soda}

Correct answer: C
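
With a minimum support of 50%, a rule is only considered if the items on both sides appear together in at least two of the four transactions. The following sketch applies that filter first and computes confidence only for the rules that survive; the variable names are illustrative.

```python
transactions = [
    {"cheese", "bread", "milk"},
    {"soda", "bread", "milk"},
    {"cheese", "bread"},
    {"cheese", "soda", "juice"},
]
MIN_SUPPORT = 0.5

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

candidates = {
    "A": ({"soda"}, {"milk"}),
    "B": ({"juice"}, {"cheese"}),
    "C": ({"cheese"}, {"bread"}),
    "D": ({"milk"}, {"soda"}),
}

qualifying = {}
for label, (lhs, rhs) in candidates.items():
    rule_support = support(lhs | rhs)
    if rule_support < MIN_SUPPORT:
        continue  # the combined itemset is not frequent, so drop the rule
    qualifying[label] = rule_support / support(lhs)  # confidence

print(qualifying)
```

Only option C passes the support threshold ({cheese, bread} occurs in 2 of 4 transactions), and its confidence is 2/3, which is above 50%.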

Question: 18

You are provided with the following list. Which window function is missing?
cume_dist()
dense_rank()
rank()
percent_rank()
first_value()
last_value()
lag()
lead()
ntile()
Response:

A. median()

B. cumulative_sum()

C. row_preceding()

D. row_number()

Correct answer: D
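
For readers unsure how `row_number()` differs from the `rank()` and `dense_rank()` functions already in the list, this plain-Python sketch mimics the three over a single descending ordering with a tie. It is an emulation for illustration, not SQL, and the input scores are hypothetical.

```python
def ranking_functions(values):
    """Return (value, row_number, rank, dense_rank) tuples, ordered descending."""
    ordered = sorted(values, reverse=True)
    rows = []
    rank = dense_rank = 0
    previous = object()  # sentinel that equals no real value
    for row_number, value in enumerate(ordered, start=1):
        if value != previous:
            rank = row_number   # rank jumps past tied rows
            dense_rank += 1     # dense_rank never leaves gaps
            previous = value
        rows.append((value, row_number, rank, dense_rank))
    return rows

scores = [90, 85, 90, 70]  # hypothetical data with one tie
rows = ranking_functions(scores)
for row in rows:
    print(row)
```

`row_number` is the only one of the three that stays unique within the partition: the two tied rows share rank 1 and dense rank 1 but receive row numbers 1 and 2.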

Question: 19

You have been assigned to run a linear regression model for each of 5,000 distinct districts, and all the data is currently stored in a PostgreSQL database. Which tool/library would you use to produce these models with the least effort?
Response:

A. R

B. MADlib

C. HBase

D. Mahout

Correct answer: B

Question: 20

What describes a true limitation of a Logistic Regression method?
Response:

A. Does not handle correlated variables well

B. Does not have explanatory values

C. Does not handle missing values well

D. Does not handle redundant variables well

Correct answer: C
