Courseiva
Knowledge + Practice

Free IT certification practice questions with explained answers for CCNA, CompTIA, AWS, Azure, Google Cloud, and more.

Courseiva is a free IT certification practice platform offering original exam-style practice questions, detailed explanations, topic-based practice, mock exams, readiness tracking, and study analytics for Cisco, CompTIA, Microsoft, AWS, and other technology certifications.

© 2026 Courseiva. Courseiva is operated by JTNetSolutions Ltd. All rights reserved.

Courseiva is an independent certification practice platform and is not affiliated with, endorsed by, or sponsored by Cisco, Microsoft, AWS, CompTIA, Google, ISC2, ISACA, or any other certification vendor. Vendor names and certification marks are used only to identify the exams learners are preparing for.

DA0-001 · Free · No signup

Mining Data

Practice DA0-001 Mining Data questions with full explanations on every answer.

113 questions

Mining Data — choose a session length

10 questions (~10 min) · 20 questions (~20 min) · 30 questions (~30 min) · 50 questions (~50 min)

Free · No account required

DA0-001 Domains

Data Concepts and Environments · Analysing Data · Visualising Data · Reporting Insights · Mining Data · Comparing and Contrasting Data Concepts · Mining and Acquiring Data · Analyzing and Modeling Data · Visualizing Data · Communicating Data Insights

All DA0-001 Mining Data questions (113)

Click any question to see the full explanation and answer options, or start a focused practice session above.

1

A data analyst needs to retrieve all unique job titles from an employees table. Which SQL keyword should be used in the SELECT clause?

2

A data analyst is profiling a new dataset containing customer information. When assessing data quality, which metric would be most appropriate to determine if the 'email' column contains valid email addresses?

3

A data analyst is working with a sales table that contains columns: sale_id, product_id, sale_date, and amount. They need to calculate a 7-day moving average of sales amount for each product, ordered by sale_date. Which window function syntax should they use?

4

A data analyst needs to extract the year from a column named 'order_date' in a SQL database. The database supports standard SQL functions. Which function should they use?

5

A data analyst wants to retrieve the top 5 highest-paid employees from a table named 'employees' that has columns 'employee_id', 'salary', and 'name'. Which SQL query should they use?

6

A data analyst is performing exploratory data analysis on a dataset containing house prices. They want to identify outliers in the 'price' column using the IQR method. The first quartile (Q1) is $200,000, the third quartile (Q3) is $350,000, and the IQR is $150,000. What is the upper bound for identifying outliers?

7

A data analyst needs to perform stratified sampling on a customer database to ensure proportional representation across three regions: North (40%), South (30%), and West (30%). The total sample size required is 1,000. How many customers should be sampled from the North region?

8

A data analyst uses a CTE to find employees who earn more than the average salary in their department. Which SQL clause is used to define the CTE?

9

A data analyst is using pandas in Python to merge two DataFrames: sales (columns: sale_id, product_id, amount) and products (columns: product_id, product_name). Which pandas function should they use to combine these DataFrames on the 'product_id' column?

10

In SQL, which string function would you use to remove leading and trailing spaces from a column named 'city'?

11

A data analyst is writing a query to rank products by total sales amount within each category. They want ties to have the same rank and no gaps in the ranking sequence. Which window function should they use?

12

A data analyst needs to count the number of orders placed by each customer, but only for customers who have placed more than 5 orders. Which SQL clause should be used to filter the aggregated results?

13

A data analyst is performing data profiling on a customer table. Which TWO of the following are key metrics to assess data quality? (Select TWO.)

14

A data analyst is using a recursive CTE to traverse a hierarchical organizational chart. Which THREE components are required to define a recursive CTE? (Select THREE.)

15

A data analyst wants to sample a large dataset of customer transactions. Which TWO sampling methods are probability-based and ensure every element has a known chance of being selected? (Select TWO.)

16

In a dataset of customer orders, you need to count the number of distinct customers who have placed orders. Which SQL aggregate function should you use?

17

You are using pandas in Python to clean a dataset. You notice several rows with missing values in the 'age' column. Which method would you use to remove those rows?

18

A data analyst wants to retrieve the top 5 highest-paid employees from an 'employees' table, including ties. Which SQL clause should be used?

19

You are analyzing sales data and need to calculate the moving average of monthly sales over the previous 3 months for each month. Which type of function is best suited for this task?

20

A data quality assessment reveals that a column named 'email' contains values like 'user@example' (missing domain extension). Which data profiling technique would best identify such pattern violations?

21

A data analyst needs to sample 10% of customers from each of three regions (North, South, Central) to ensure proportional representation. Which sampling method should be used?

22

In SQL, you want to retrieve all products whose names start with 'Pro'. Which WHERE clause should you use?

23

You have a table 'Orders' with columns order_id, customer_id, order_date, and amount. You need to write a query that returns each customer's most recent order date and the amount for that order. Which approach is correct?

24

During exploratory data analysis, you calculate the IQR for a numeric column and find that several data points fall below Q1 - 1.5*IQR. These points are likely:

25

You have a hierarchical table 'Employees' with columns emp_id, emp_name, manager_id (referencing emp_id). You need to generate a full reporting chain from a given employee up to the CEO. Which SQL construct is most appropriate?

26

A data analyst wants to extract the year from a date column 'order_date' in a SQL database. Which function should be used?

27

In pandas, you have a DataFrame 'df' with columns 'product' and 'sales'. You want to calculate the total sales per product. Which method should you use?

28

A data analyst is performing data profiling on a customer table. Which TWO metrics are most useful for understanding the completeness of the data? (Choose two.)

29

A data analyst needs to identify the top 3 most frequent product categories from a sales table. Which SQL techniques can be used to achieve this? (Choose two.)

30

A data analyst is cleaning text data in a SQL database. Which THREE string functions are commonly used to standardize and clean text? (Choose three.)

31

A data analyst needs to retrieve only unique job titles from the 'employees' table. Which SQL keyword should be used in the SELECT clause?

32

A dataset contains sales transactions with columns 'order_date', 'amount', and 'region'. The analyst wants to calculate the total sales per region for orders placed in 2023, but only include regions where total sales exceed $10,000. Which SQL clause should be used to filter the aggregated results?

33

In a table 'employee_hierarchy' with columns 'employee_id', 'manager_id', and 'employee_name', an analyst needs to generate a list of all employees under a specific manager, including multiple levels of subordinates. Which SQL construct is most appropriate for querying this hierarchical data efficiently?

34

A data analyst runs a query to count the number of customers in each city. The query uses COUNT(*) and GROUP BY city. However, the result includes NULL for some cities. What will COUNT(*) return for a group where the city is NULL?

35

A table 'orders' contains columns 'order_id', 'customer_id', 'order_date', and 'total'. An analyst needs to find orders placed between January 1, 2023 and December 31, 2023. Which WHERE clause is correct?

36

A data analyst is performing data profiling on a customer dataset. Which metric would best reveal the number of distinct values in the 'state' column?

37

An analyst is using Python pandas and has a DataFrame 'sales' with columns 'date', 'product', 'revenue'. They need to create a pivot table showing total revenue per product per month. Which pandas function is most appropriate?

38

A dataset contains a 'salary' column. The analyst wants to identify outliers using the IQR method. If Q1 = 40,000 and Q3 = 70,000, what is the upper threshold for a non-outlier?

39

An analyst is sampling a large customer database to estimate the average purchase amount. To ensure that the sample proportionally represents different customer segments (e.g., age groups), which sampling method should be used?

40

Which SQL aggregate function would an analyst use to calculate the average value of a numeric column?

41

A data analyst is cleaning a dataset and finds that some cells in the 'email' column contain leading spaces. Which string function should be used to remove these spaces?

42

In a table 'sales_team' with columns 'salesperson', 'quarter', and 'revenue', an analyst wants to assign a rank to each salesperson within their quarter based on revenue, with the highest revenue getting rank 1. However, if two salespeople have the same revenue, they should receive the same rank, and the next rank should be the next consecutive integer (no gaps). Which window function should be used?

43

A data analyst is conducting exploratory data analysis (EDA) on a dataset. Which TWO tasks are typically performed during EDA? (Select two.)

44

An analyst is using SQL to analyze employee data. Which THREE of the following are valid uses of the WHERE clause? (Select three.)

45

A data analyst needs to perform a stratified random sample of a customer database. Which TWO steps are essential for this sampling method? (Select two.)

46

A data analyst needs to count the number of distinct product categories in a table named 'products'. Which SQL function should be used in the SELECT clause?

47

A data analyst wants to identify customers whose last name starts with 'Mc' from the 'customers' table. Which WHERE clause condition should be used?

48

A data analyst is reviewing sales data and wants to find orders where the order total is between $100 and $500, inclusive. Which WHERE clause is correct?

49

A data analyst runs the query: SELECT AVG(salary) FROM employees GROUP BY department HAVING AVG(salary) > 60000. What is the purpose of the HAVING clause?

50

Which SQL function can be used to extract the year from a date column 'order_date'?

51

A data analyst wants to assign a unique sequential integer to each row in a result set, starting at 1, based on the order of the 'sales_amount' column descending. Which window function should be used?

52

A data analyst is using a recursive CTE to traverse an organizational hierarchy. What is the purpose of the anchor member in the recursive CTE?

53

A data analyst is profiling a dataset and finds that the 'email' column contains some NULL values. Which SQL query can be used to count how many rows have a NULL email?

54

Which data sampling method involves selecting every k-th element from a list after a random start?

55

A data analyst is using pandas to read a CSV file named 'sales.csv'. Which line of code correctly reads the file into a DataFrame?

56

A data analyst is using the IQR method to identify outliers in a dataset. The first quartile (Q1) is 25 and the third quartile (Q3) is 45. What is the upper bound for identifying outliers?

57

A data analyst needs to create a new column 'full_name' by concatenating 'first_name' and 'last_name' with a space. Which SQL function should be used in the SELECT clause?

58

A data analyst is performing data profiling on a customer table. Which TWO metrics are commonly used to assess the completeness of a column? (Select TWO.)

59

A data analyst is investigating a correlation between two continuous variables. Which THREE of the following are appropriate steps in this exploratory data analysis? (Select THREE.)

60

A data analyst wants to retrieve the top 5 highest-paid employees from the 'employees' table. Which SQL clauses could be used to achieve this? (Select TWO.)

61

A data analyst needs to count the number of customers who have placed at least one order. Which SQL query should be used?

62

A data analyst wants to find the top 5 products by total sales amount, but only for products that have been sold more than 50 times. Which SQL query accomplishes this?

63

An analyst needs to compute a running total of sales for each department, ordered by date. Which window function is most appropriate?

64

A data analyst wants to generate a report showing employee names and their department names, but some employees are not assigned to any department. The analyst wants to include all employees. Which JOIN type should be used?

65

A dataset contains a column 'birthdate' in 'YYYY-MM-DD' format. The analyst needs to calculate the average age of customers as of today. Which combination of functions is most appropriate?

66

A data analyst is performing data profiling on a customer table. Which metric would best help identify missing values in the 'phone' column?

67

A data analyst wants to randomly select 100 customers from a database for a survey, ensuring that the sample reflects the proportion of male and female customers in the population. Which sampling method is most appropriate?

68

In a table with columns 'employee_id' and 'manager_id', a data analyst needs to retrieve the hierarchy level of each employee, where the top manager has manager_id NULL. Which SQL feature is best suited?

69

A data analyst is using pandas in Python to clean a dataset. Which method is most appropriate to replace missing numerical values with the median of the column?

70

An analyst wants to identify outliers in a dataset using the IQR method. Which values are typically considered outliers?

71

A dataset contains transaction amounts with a few extremely high values. The analyst wants to reduce the impact of these outliers on the average. Which measure of central tendency is most robust?

72

A data analyst uses a CTE to simplify a complex query. Which keyword is used to define a CTE?

73

A data analyst is exploring a sales dataset and wants to identify columns that are likely to be foreign keys. Which TWO characteristics would indicate a foreign key?

74

A data analyst is using Python pandas to perform exploratory data analysis. Which THREE methods are commonly used to assess data quality and distributions?

75

A data analyst needs to sample records from a large dataset for a quick analysis. Which TWO sampling methods are examples of probability sampling?

76

A data analyst runs the following query: SELECT DISTINCT city FROM customers. What is the primary purpose of using the DISTINCT keyword in this query?

77

In a sales database, an analyst needs to retrieve all orders where the order amount is between $100 and $500. Which WHERE clause should be used?

78

A data analyst uses the following query: SELECT department, AVG(salary) AS avg_salary FROM employees GROUP BY department HAVING AVG(salary) > 50000. What is the purpose of the HAVING clause in this query?

79

A data analyst wants to concatenate first_name and last_name columns with a space in between. Which string function combination should be used in SQL?

80

An analyst needs to retrieve the year from an order_date column (datetime type). Which function should be used in SQL?

81

A data analyst is using a window function to assign a rank to each employee within their department based on salary, with ties receiving the same rank and leaving gaps in the sequence. Which function should be used?

82

A data analyst needs to create a recursive CTE to traverse a hierarchical employee-manager table. Which of the following is a key requirement for a recursive CTE?

83

A data analyst is performing data profiling on a customer table. Which metric provides the number of unique values in a column?

84

A data analyst wants to ensure a sample proportionally represents different regions in a population. Which sampling method should be used?

85

During EDA, an analyst calculates the Z-score for each data point in a dataset. A data point with a Z-score of 3.5 is identified. What does this indicate?

86

A data analyst uses Python's pandas library to read a CSV file into a DataFrame. Which function is used to read the file?

87

An analyst writes a SQL query that uses a window function: SELECT employee_id, salary, LAG(salary, 1) OVER (ORDER BY salary DESC) AS prev_salary FROM employees. What does the LAG function return for the row with the highest salary?

88

A data analyst needs to identify duplicate customer records based on email and phone number. Which SQL techniques can be used to find duplicates? (Select TWO).

89

A data analyst is performing EDA on a dataset with numerical features. Which methods are appropriate for identifying outliers? (Select TWO).

90

An analyst needs to aggregate sales data by region and product, then sort the results by total sales in descending order. Which SQL clauses are required? (Select THREE).

91

A data analyst needs to retrieve all unique job titles from the employees table. Which SQL clause should be used with the SELECT statement?

92

A data analyst wants to find customers whose last name starts with 'Mc' and have made purchases in 2023. The purchase table has a purchase_date column. Which SQL query accomplishes this?

93

A data analyst is writing a query to rank products by total sales within each category, showing dense rank and avoiding gaps. Which window function should be used?

94

An analyst needs to count the number of orders per customer but only for customers who have placed more than 5 orders. Which SQL construct allows filtering after aggregation?

95

A data analyst wants to create a temporary result set that can be referenced within a single SQL statement. Which feature should be used?

96

A data analyst is using pandas to clean a DataFrame. They need to replace missing values in the 'age' column with the median age. Which method should they use?

97

During data profiling, an analyst wants to identify the number of distinct values in a column. Which SQL function should be used?

98

A data analyst needs to sample 1000 customers from a database of 100,000 customers for a survey, ensuring every customer has an equal chance of selection. Which sampling method is most appropriate?

99

In a dataset of employee salaries, the analyst notices one value that is significantly higher than the rest. Using the IQR method, which values are typically considered outliers?

100

A data analyst wants to combine first_name and last_name columns into a single full_name column in a SQL query. Which string function should be used?

101

A data analyst needs to calculate the running total of sales for each product over time. Which window function clause is essential for this calculation?

102

An analyst is performing EDA and wants to measure the strength and direction of the linear relationship between two continuous variables. Which statistical measure should they compute?

103

A data analyst is validating referential integrity between orders and customers tables. Which TWO of the following checks should the analyst perform?

104

An analyst is using a CTE to compute hierarchical data. Which TWO statements about recursive CTEs are true?

105

A data analyst wants to export a summary report from a DataFrame in pandas. Which THREE methods are commonly used for data export?

106

A data analyst is using SQL to filter a sales table for transactions that occurred in either 'Q1' or 'Q3' of 2023 and have a sale amount greater than $100. Which WHERE clause correctly implements this condition?

107

A data analyst is reviewing a dataset of customer transactions and wants to assess data quality by profiling the 'order_date' column. Which TWO profiling tasks are most appropriate for this date column? (Select TWO).

108

An e-commerce company wants to analyze sales performance across product categories. The dataset includes transaction amounts and a column 'category' with values (Electronics, Clothing, Home). The analyst decides to use stratified sampling to ensure proportional representation. Which THREE steps are required to implement this? (Select THREE).

109

A data analyst is using pandas to clean a DataFrame that contains missing values in the 'age' and 'income' columns. Which THREE pandas methods are appropriate for handling missing data? (Select THREE).

110

An analyst needs to identify outliers in a numeric column 'transaction_amount' using the interquartile range (IQR) method. Which TWO steps are part of this process? (Select TWO).

111

A data analyst uses a Common Table Expression (CTE) to query hierarchical employee data (manager_id references employee_id). Which THREE statements about recursive CTEs are correct? (Select THREE).

112

A data analyst needs to retrieve the top 5 most expensive products from a 'products' table sorted by price descending. Which TWO SQL clauses are required to achieve this? (Select TWO).

113

An analyst wants to use Python (pandas) to compute the average sales amount per region from a DataFrame 'df' with columns 'region' and 'sales'. Which TWO pandas operations are needed? (Select TWO).
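
Several questions in this list rely on the IQR rule for outliers, where values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR are flagged. The sketch below works through that arithmetic. The function names `iqr_fences` and `flag_outliers` and the sample quartiles are hypothetical, chosen for illustration only and not taken from any question in the bank.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) outlier fences for the IQR rule."""
    iqr = q3 - q1                      # interquartile range
    return q1 - k * iqr, q3 + k * iqr  # values outside these fences are outliers


def flag_outliers(values, q1, q3, k=1.5):
    """Return the values that fall outside the IQR fences."""
    lower, upper = iqr_fences(q1, q3, k)
    return [v for v in values if v < lower or v > upper]


# Hypothetical quartiles: Q1 = 10, Q3 = 20, so IQR = 10
lower, upper = iqr_fences(q1=10.0, q3=20.0)
print(lower, upper)  # -5.0 35.0

# Hypothetical data points checked against those fences
print(flag_outliers([-8, 0, 12, 19, 36], q1=10.0, q3=20.0))  # [-8, 36]
```

The multiplier `k` is left as a parameter because 1.5 is the usual convention rather than a fixed requirement.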

Other DA0-001 exam domains

Data Concepts and Environments · Analysing Data · Visualising Data · Reporting Insights · Comparing and Contrasting Data Concepts · Mining and Acquiring Data · Analyzing and Modeling Data · Visualizing Data · Communicating Data Insights

Frequently asked questions

What does the Mining Data domain cover on the DA0-001 exam?

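The Mining Data domain corresponds to this area of the DA0-001 exam blueprint published by CompTIA. The questions on this page cover SQL querying (DISTINCT, WHERE, GROUP BY and HAVING, joins, window functions, and CTEs, including recursive ones), data profiling and quality metrics, sampling methods, exploratory data analysis such as IQR and Z-score outlier detection, and data cleaning with pandas. Courseiva provides free domain-focused practice, mock exams, missed-question review, and readiness tracking across all DA0-001 domains, with no account required.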

How many Mining Data questions are in the DA0-001 question bank?

The Courseiva DA0-001 question bank contains 113 questions in the Mining Data domain. Click any question to see the full explanation and answer breakdown.

What is the best way to practice Mining Data for DA0-001?

Start with a 10-question focused session to identify your baseline accuracy in this domain. Read every explanation — even for questions you answer correctly — to understand the reasoning. Once you score consistently above 80%, move to a 20–30 question session to confirm depth before moving to the next domain.

Can I practice only Mining Data questions for DA0-001?

Yes — the session launcher on this page draws questions exclusively from the Mining Data domain. Choose 10, 20, 30, or 50 questions for a focused session, or click individual questions to review them one by one.
