Sample questions
CompTIA Data+ DA0-001 practice questions
Drag and drop the steps to clean a dataset with missing values in the correct order.
Drag steps to the numbered slots on the right, or tap a step then tap a slot.
Drag and drop the steps to normalize a database table from 1NF to 3NF in the correct order.
Drag and drop the steps to create a data visualization dashboard in the correct order.
Drag and drop the steps to implement a data classification policy in the correct order.
Drag and drop the steps for the ETL (Extract, Transform, Load) process in the correct order.
Drag and drop the steps to perform a data backup using the 3-2-1 rule in the correct order.
Drag and drop the steps to conduct a hypothesis test in the correct order.
Drag and drop the steps to perform a data audit in the correct order.
Drag and drop the steps to resolve data integration conflicts in the correct order.
Drag and drop the steps to perform a root cause analysis on data quality issues in the correct order.
A data analyst is using SQL to extract data. The analyst wants to retrieve all records from a table named 'sales' where the 'amount' column is greater than 100. Which SQL clause should be used?
Trap 1: ORDER BY
ORDER BY is for sorting, not filtering.
Trap 2: GROUP BY
GROUP BY is for grouping rows.
Trap 3: HAVING
HAVING filters groups, not individual rows.
- A
WHERE
WHERE clause filters rows based on a condition.
- B
ORDER BY
Why wrong: ORDER BY is for sorting, not filtering.
- C
GROUP BY
Why wrong: GROUP BY is for grouping rows.
- D
HAVING
Why wrong: HAVING filters groups, not individual rows.
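As a minimal sketch of the row-level filter described above, the following uses Python's built-in `sqlite3` module with an in-memory database. The sample rows are made up for illustration; only the table name `sales` and the column `amount` come from the question.

```python
import sqlite3

# In-memory database with a 'sales' table; the rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO sales (amount) VALUES (?)",
    [(50.0,), (100.0,), (150.0,), (300.0,)],
)

# WHERE keeps only the rows that satisfy the condition.
# ORDER BY is included only to make the output order predictable; it does no filtering.
rows = conn.execute(
    "SELECT id, amount FROM sales WHERE amount > 100 ORDER BY id"
).fetchall()
print(rows)
conn.close()
```

Note that the row with an amount of exactly 100 is excluded, because the condition is strictly greater than 100.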
A company is analyzing customer feedback sentiment. The dataset is highly imbalanced with 95% positive and 5% negative comments. Which technique should the analyst use to address class imbalance before modeling?
Trap 1: Use accuracy as the evaluation metric
Accuracy is misleading for imbalanced data; F1 or AUC are better.
Trap 2: Undersample the majority class
Undersampling can help but discards majority-class data; SMOTE balances the classes without that loss.
Trap 3: Oversample the majority class
This would increase imbalance further.
- A
Use accuracy as the evaluation metric
Why wrong: Accuracy is misleading for imbalanced data; F1 or AUC are better.
- B
Undersample the majority class
Why wrong: Undersampling can help but discards majority-class data; SMOTE balances the classes without that loss.
- C
Oversample the majority class
Why wrong: This would increase imbalance further.
- D
Use SMOTE
SMOTE generates synthetic minority samples to balance classes.
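To see why accuracy is a misleading metric on this kind of dataset, the sketch below scores a naive model that labels every comment as positive. The labels are hypothetical and simply mirror the 95%/5% split from the question; it does not implement SMOTE itself.

```python
# Hypothetical labels: 95 positive (1) and 5 negative (0) comments.
y_true = [1] * 95 + [0] * 5
# A naive model that predicts "positive" for every comment.
y_pred = [1] * 100

# Accuracy: share of predictions that match the true label.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Recall for the minority (negative) class: negatives caught / all negatives.
caught_negatives = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
negative_recall = caught_negatives / y_true.count(0)

print(f"accuracy={accuracy:.2f}, negative recall={negative_recall:.2f}")
```

The model reaches 95% accuracy while never identifying a single negative comment, which is the gap that class-aware metrics and resampling techniques are meant to expose and address.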
A data analyst creates a bubble chart showing country GDP (x-axis), life expectancy (y-axis), and population (bubble size). However, large bubbles overlap and obscure many data points. Which corrective action should the analyst take?
Trap 1: Increase the chart canvas size
Larger canvas doesn't prevent overlap if bubbles are large.
Trap 2: Reduce all bubble sizes uniformly
Smaller bubbles may make small populations invisible.
Trap 3: Remove outlier countries with large populations
Removing data is not recommended; it skews the analysis.
- A
Increase the chart canvas size
Why wrong: Larger canvas doesn't prevent overlap if bubbles are large.
- B
Set bubble opacity to 70%
Transparency allows seeing through overlapping bubbles.
- C
Reduce all bubble sizes uniformly
Why wrong: Smaller bubbles may make small populations invisible.
- D
Remove outlier countries with large populations
Why wrong: Removing data is not recommended; it skews the analysis.
Which THREE factors should be considered when choosing a chart type for a dataset?
Trap 1: The animation capabilities of the software
Animation is not a primary consideration for chart selection.
Trap 2: The color scheme of the company logo
Color scheme is aesthetic, not a deciding factor for chart type.
- A
The animation capabilities of the software
Why wrong: Animation is not a primary consideration for chart selection.
- B
The data types (categorical, numerical, time series)
Data type determines suitable chart types.
- C
The number of variables to display
More variables may require advanced charts like bubble or multi-line.
- D
The key insight or message to convey
The chart should highlight the intended message.
- E
The color scheme of the company logo
Why wrong: Color scheme is aesthetic, not a deciding factor for chart type.
A data analyst creates a heatmap to show website click-through rates by hour and day of week. The heatmap uses a green-to-red gradient, but users cannot distinguish between moderate values. What is the best fix?
Trap 1: Remove all but the highest and lowest values
Removing data discards information.
Trap 2: Add black borders around each cell
Borders can create visual noise without improving color distinction.
Trap 3: Increase the size of each heatmap cell
Cell size doesn't improve color perception.
- A
Switch to a diverging color scheme with a neutral center
A diverging palette with a neutral midpoint makes moderate values visibly distinct from both extremes.
- B
Remove all but the highest and lowest values
Why wrong: Removing data discards information.
- C
Add black borders around each cell
Why wrong: Borders can create visual noise without improving color distinction.
- D
Increase the size of each heatmap cell
Why wrong: Cell size doesn't improve color perception.
A company's database has a table 'orders' with columns: order_id, customer_id, order_date, and total_amount. A data analyst needs to identify customers who have placed more than 5 orders in the past year. Which data concept should be used to group orders by customer and count them?
Trap 1: Joining with other tables
Joining combines tables; it is not needed here because all required columns are in 'orders'.
Trap 2: Filtering with WHERE clause
Filtering selects rows; it does not group them.
Trap 3: Sorting with ORDER BY
Sorting orders the results; it does not group them.
- A
Joining with other tables
Why wrong: Joining combines tables; it is not needed here because all required columns are in 'orders'.
- B
Filtering with WHERE clause
Why wrong: Filtering selects rows; it does not group them.
- C
Sorting with ORDER BY
Why wrong: Sorting orders results, does not group.
- D
Aggregation with GROUP BY
GROUP BY groups rows and aggregation functions compute counts.
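The grouping and counting described above can be sketched with Python's built-in `sqlite3` module. The table and column names come from the question; the sample rows and the fixed cutoff date standing in for "the past year" are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER, "
    "order_date TEXT, total_amount REAL)"
)

# Hypothetical data: customer 1 places six orders, customer 2 places two.
sample = [(1, f"2024-01-{day:02d}", 20.0) for day in range(1, 7)]
sample += [(2, "2024-02-01", 35.0), (2, "2024-03-01", 40.0)]
conn.executemany(
    "INSERT INTO orders (customer_id, order_date, total_amount) VALUES (?, ?, ?)",
    sample,
)

# WHERE restricts rows to the period, GROUP BY forms one group per customer,
# COUNT(*) counts the orders in each group, and HAVING filters the groups.
query = """
SELECT customer_id, COUNT(*) AS order_count
FROM orders
WHERE order_date >= ?
GROUP BY customer_id
HAVING COUNT(*) > 5
"""
frequent = conn.execute(query, ("2024-01-01",)).fetchall()
print(frequent)
conn.close()
```

The condition on the count goes in HAVING rather than WHERE because it applies to each group after aggregation, not to individual rows.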
Which TWO of the following are characteristics of structured data? (Choose TWO.)
Trap 1: Requires NoSQL databases for storage
Structured data is typically stored in relational databases; NoSQL is not required.
Trap 2: Often contains natural language text
Natural language is typical of unstructured data.
Trap 3: Cannot be queried using SQL
SQL is used for structured data.
- A
Has a defined schema
Schema defines structure.
- B
Requires NoSQL databases for storage
Why wrong: Structured data is typically stored in relational databases; NoSQL is not required.
- C
Often contains natural language text
Why wrong: Natural language is typical of unstructured data.
- D
Cannot be queried using SQL
Why wrong: SQL is used for structured data.
- E
Organized in rows and columns
Structured data is tabular.
A healthcare database stores patient records. Each patient has a unique patient_id, and the database includes a table 'visits' with visit_id, patient_id, visit_date, and diagnosis_code. To ensure data integrity, which constraint should be applied to the patient_id column in the 'visits' table?
Trap 1: Unique constraint
Unique constraint prevents duplicate values, not referential integrity.
Trap 2: Primary key
Primary key is for the table's own unique identifier.
Trap 3: Check constraint
Check constraint validates data range or format.
- A
Unique constraint
Why wrong: Unique constraint prevents duplicate values, not referential integrity.
- B
Foreign key
Foreign key enforces referential integrity.
- C
Primary key
Why wrong: Primary key is for the table's own unique identifier.
- D
Check constraint
Why wrong: Check constraint validates data range or format.
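A minimal sketch of referential integrity in practice, using Python's built-in `sqlite3` module. The 'visits' columns come from the question; the 'patients' table layout, the sample dates, and the placeholder diagnosis code are assumptions. SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

# Assumed parent table holding the unique patient_id values.
conn.execute("CREATE TABLE patients (patient_id INTEGER PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE visits (
        visit_id INTEGER PRIMARY KEY,
        patient_id INTEGER NOT NULL REFERENCES patients(patient_id),
        visit_date TEXT,
        diagnosis_code TEXT
    )
    """
)

conn.execute("INSERT INTO patients (patient_id) VALUES (1)")
# Valid: patient 1 exists in the parent table.
conn.execute(
    "INSERT INTO visits (patient_id, visit_date, diagnosis_code) "
    "VALUES (1, '2024-03-15', 'J10')"
)

# Invalid: patient 99 does not exist, so the foreign key rejects the row.
try:
    conn.execute(
        "INSERT INTO visits (patient_id, visit_date, diagnosis_code) "
        "VALUES (99, '2024-03-16', 'J10')"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

visit_count = conn.execute("SELECT COUNT(*) FROM visits").fetchone()[0]
print(rejected, visit_count)
conn.close()
```

A unique or check constraint on `visits.patient_id` would not stop the second insert, since neither one compares the value against the parent table.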
A data engineer is designing a data warehouse for a multinational corporation. The company has sales data from different regions with varying currencies and date formats. To ensure consistency, which data concept should be applied to standardize the data before loading into the warehouse?
Trap 1: Data cleansing
Cleansing focuses on correcting errors; converting currencies and date formats to a common standard is a transformation step.
Trap 2: Data profiling
Profiling is for assessment, not transformation.
Trap 3: Data masking
Masking hides sensitive data; it does not standardize it.
- A
Data cleansing
Why wrong: Cleansing focuses on correcting errors; converting currencies and date formats to a common standard is a transformation step.
- B
Data transformation
Transformation includes standardization of formats.
- C
Data profiling
Why wrong: Profiling is for assessment, not transformation.
- D
Data masking
Why wrong: Masking hides sensitive data; it does not standardize it.
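As a rough sketch of the standardization step in this scenario, the function below converts regional date formats to ISO 8601 and amounts to a single currency. The region codes, date formats, field names, and exchange rates are all hypothetical and chosen only for illustration.

```python
from datetime import datetime

# Hypothetical lookup tables; real rates and formats would come from reference data.
RATES_TO_USD = {"USD": 1.0, "EUR": 1.10, "GBP": 1.25}
DATE_FORMATS = {"US": "%m/%d/%Y", "EU": "%d.%m.%Y", "UK": "%d/%m/%Y"}


def standardize(record):
    """Return the record with an ISO date and an amount converted to USD."""
    parsed = datetime.strptime(record["date"], DATE_FORMATS[record["region"]])
    return {
        "date": parsed.strftime("%Y-%m-%d"),
        "amount_usd": round(record["amount"] * RATES_TO_USD[record["currency"]], 2),
    }


raw = [
    {"region": "US", "date": "03/15/2024", "amount": 100.0, "currency": "USD"},
    {"region": "EU", "date": "15.03.2024", "amount": 200.0, "currency": "EUR"},
]
clean = [standardize(r) for r in raw]
print(clean)
```

Both records describe the same calendar day in different notations; after transformation they share one date format and one currency, which is what makes them comparable in the warehouse.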
A data analyst wants to compare the sales performance of four different stores over the same time period. Which chart type is most suitable?
Trap 1: Line chart with multiple lines
Multiple lines can become confusing when there are many categories.
Trap 2: Stacked bar chart
Stacked bars show composition, not direct comparison.
Trap 3: Pie chart with multiple pies
Multiple pies are hard to compare.
- A
Line chart with multiple lines
Why wrong: Multiple lines can become confusing when there are many categories.
- B
Grouped bar chart
Grouped bars allow side-by-side comparison of stores.
- C
Stacked bar chart
Why wrong: Stacked bars show composition, not direct comparison.
- D
Pie chart with multiple pies
Why wrong: Multiple pies are hard to compare.
A data team is preparing a dashboard for executives. The team wants to highlight key performance indicators (KPIs) that are below target. Which of the following visualization techniques would most effectively draw attention to underperforming metrics without causing confusion?
Trap 1: Remove underperforming KPIs from the dashboard to avoid confusion.
Hiding underperformance prevents action.
Trap 2: Use a scatter plot to show the relationship between KPIs.
Scatter plots do not directly indicate target achievement.
Trap 3: Use a pie chart showing the proportion of each KPI.
Pie charts do not effectively show performance against targets.
- A
Remove underperforming KPIs from the dashboard to avoid confusion.
Why wrong: Hiding underperformance prevents action.
- B
Use a scatter plot to show the relationship between KPIs.
Why wrong: Scatter plots do not directly indicate target achievement.
- C
Apply conditional formatting to turn KPI values red when below target.
Red highlights call attention to issues immediately.
- D
Use a pie chart showing the proportion of each KPI.
Why wrong: Pie charts do not effectively show performance against targets.
A data analyst creates a scatter plot showing the relationship between advertising spend and revenue. The plot shows a strong positive correlation. Which of the following should the analyst include in the report to ensure accurate communication?
Trap 1: Replace the scatter plot with a bar chart.
Bar chart is not appropriate for showing correlation.
Trap 2: Remove any outliers from the plot.
Removing outliers may distort the true relationship.
Trap 3: Add a trend line to the scatter plot.
A trend line is optional and does not address causation.
- A
Include a note that correlation does not imply causation.
This prevents misinterpretation of the relationship.
- B
Replace the scatter plot with a bar chart.
Why wrong: Bar chart is not appropriate for showing correlation.
- C
Remove any outliers from the plot.
Why wrong: Removing outliers may distort the true relationship.
- D
Add a trend line to the scatter plot.
Why wrong: A trend line is optional and does not address causation.
A data analyst encounters the error log shown in the exhibit when trying to connect to a database. The analyst needs to explain the issue to the database administrator. Which of the following correctly describes the problem?
Exhibit
Refer to the exhibit.
```
[2024-03-15 14:23:45] ERROR: Connection pool exhausted. Retry attempt 1...
[2024-03-15 14:23:46] ERROR: Connection pool exhausted. Retry attempt 2...
[2024-03-15 14:23:47] ERROR: Connection pool exhausted. Retry attempt 3...
[2024-03-15 14:23:48] ERROR: Connection pool exhausted. Max retries exceeded.
```
Trap 1: The database table is corrupted.
The error does not indicate data corruption.
Trap 2: The database server is out of disk space.
No mention of disk space issues.
Trap 3: The database authentication credentials are invalid.
The error does not mention authentication failure.
- A
The database connection pool has reached its maximum limit.
The log explicitly says 'Connection pool exhausted'.
- B
The database table is corrupted.
Why wrong: The error does not indicate data corruption.
- C
The database server is out of disk space.
Why wrong: No mention of disk space issues.
- D
The database authentication credentials are invalid.
Why wrong: The error does not mention authentication failure.
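To illustrate what an exhausted connection pool means, here is a toy fixed-size pool built on Python's `queue` module. `TinyPool` and its connection names are hypothetical and do not represent any real database driver; actual pools add timeouts, retries, and health checks.

```python
import queue


class TinyPool:
    """Hypothetical fixed-size pool, for illustration only."""

    def __init__(self, size):
        self._free = queue.Queue(maxsize=size)
        for i in range(size):
            self._free.put(f"conn-{i}")  # stand-ins for real connections

    def acquire(self):
        try:
            return self._free.get_nowait()
        except queue.Empty:
            # Every connection is checked out: the pool is at its limit.
            raise RuntimeError("Connection pool exhausted") from None

    def release(self, conn):
        self._free.put_nowait(conn)


pool = TinyPool(size=2)
a = pool.acquire()
b = pool.acquire()

try:
    pool.acquire()  # a third request while both connections are in use
    exhausted = False
except RuntimeError:
    exhausted = True

pool.release(a)
c = pool.acquire()  # succeeds again once a connection is returned
print(exhausted, c)
```

The failure clears as soon as a connection is released, which is why this kind of error points to pool sizing or connections not being returned rather than to corrupted data, disk space, or credentials.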
A data analyst creates a report showing sales by product category. The analyst notices that one category has a very high sales figure due to a one-time bulk order. Which of the following is the best way to communicate this insight to stakeholders?
Trap 1: Delete the bulk order from the dataset.
Deleting data without reason is not transparent.
Trap 2: Remove the category with the bulk order from the report.
Omitting data hides valuable information.
Trap 3: Use a pie chart to show the proportion of each category.
A pie chart would still be distorted by the outlier and gives no context for it.
- A
Delete the bulk order from the dataset.
Why wrong: Deleting data without reason is not transparent.
- B
Add a note to the chart explaining the bulk order.
Annotation provides context for the anomaly.
- C
Remove the category with the bulk order from the report.
Why wrong: Omitting data hides valuable information.
- D
Use a pie chart to show the proportion of each category.
Why wrong: A pie chart would still be distorted by the outlier and gives no context for it.