Courseiva
Knowledge + Practice

Free IT certification practice questions with explained answers for CCNA, CompTIA, AWS, Azure, Google Cloud, and more.


Courseiva is a free IT certification practice platform offering original exam-style practice questions, detailed explanations, topic-based practice, mock exams, readiness tracking, and study analytics for Cisco, CompTIA, Microsoft, AWS, and other technology certifications.

© 2026 Courseiva. Courseiva is operated by JTNetSolutions Ltd. All rights reserved.

Courseiva is an independent certification practice platform and is not affiliated with, endorsed by, or sponsored by Cisco, Microsoft, AWS, CompTIA, Google, ISC2, ISACA, or any other certification vendor. Vendor names and certification marks are used only to identify the exams learners are preparing for.


Mining and Acquiring Data

Practice DA0-001 Mining and Acquiring Data questions with full explanations on every answer.

99 questions

Mining and Acquiring Data: choose a session length

10 questions (~10 min)
20 questions (~20 min)
30 questions (~30 min)
50 questions (~50 min)

Free · No account required

DA0-001 Domains

Comparing and Contrasting Data Concepts
Mining and Acquiring Data
Analyzing and Modeling Data
Visualizing Data
Communicating Data Insights

All DA0-001 Mining and Acquiring Data questions (99)

Click any question to see the full explanation and answer options, or start a focused practice session above.

1

A data analyst is pulling data from a production database for a report. The database contains customer orders with a column 'order_date'. The analyst notices that some orders have dates in the future. Which data quality issue does this represent?

2

A data engineer is designing a data pipeline to ingest streaming data from IoT sensors. The sensors send data every second, and the pipeline must handle bursts of up to 10,000 messages per second. Which approach is most appropriate for capturing this data before processing?

3

A data analyst needs to combine two datasets: one contains customer information (customer_id, name, address) and the other contains order information (order_id, customer_id, order_date). The analyst wants to include all customers, even those who have not placed orders. Which type of join should be used?
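As a hedged illustration of this scenario: one common way to keep every customer, including those with no orders, is a left outer join from the customer table to the order table. The sketch below uses Python's built-in sqlite3 module. The table names and sample rows are assumptions made for illustration; only the column names come from the question.

```python
import sqlite3

# In-memory database populated with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT);
    INSERT INTO customers VALUES (1, 'Ada', '1 Main St'), (2, 'Ben', '2 Oak Ave');
    INSERT INTO orders VALUES (100, 1, '2024-01-15');
""")

# LEFT JOIN keeps every customers row; order columns are NULL when no match exists.
rows = conn.execute("""
    SELECT c.customer_id, c.name, o.order_id, o.order_date
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY c.customer_id
""").fetchall()

for row in rows:
    print(row)
conn.close()
```

Here the customer with no orders still appears, with None in the order columns. An inner join would have dropped that row.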

4

A data analyst is tasked with extracting data from a legacy system that outputs fixed-width text files. The analyst needs to parse these files into a structured format. Which tool or method is most appropriate for this task?
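To make the fixed-width idea concrete, here is a minimal sketch in plain Python that slices each record by character offsets. The layout (field names and column positions) is entirely hypothetical; a real legacy export would come with its own record specification.

```python
# Hypothetical layout: (field name, start offset, end offset), end exclusive.
LAYOUT = [("customer_id", 0, 6), ("name", 6, 26), ("balance", 26, 36)]


def parse_fixed_width(line, layout=LAYOUT):
    """Slice one fixed-width record into a dict, trimming the padding."""
    return {name: line[start:end].strip() for name, start, end in layout}


# Build a sample 36-character record that matches the assumed layout.
sample = "000042" + "Jane Doe".ljust(20) + "1234.50".rjust(10)
record = parse_fixed_width(sample)
print(record)
```

Every field comes back as a string, so type conversion (for example, turning the balance into a number) would be a separate step.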

5

A company is merging two databases from different departments. In Database A, customer IDs are integers. In Database B, customer IDs are alphanumeric strings. To merge, the data analyst must reconcile these differences. Which step should be taken first?

6

A data analyst needs to extract data from an API that returns JSON. The analyst wants to convert the JSON output into a tabular format for analysis. Which function in a scripting language is commonly used for this purpose?

7

A data analyst is building a dataset from multiple sources and needs to ensure data quality. During the data acquisition phase, which activity is most important to perform?

8

An organization needs to acquire data from a third-party vendor. The data will be used for regulatory reporting. Which of the following should be the primary consideration before acquiring the data?

9

A data analyst is using SQL to extract data. The analyst wants to retrieve all records from a table named 'sales' where the 'amount' column is greater than 100. Which SQL clause should be used?

10

Which TWO of the following are common methods for acquiring data from external sources?

11

Which THREE of the following are best practices when performing data extraction for a data pipeline?

12

Which TWO of the following are valid SQL clauses used to filter and sort data?

13

What is the primary purpose of the HAVING clause in the query shown?

14

A data analyst sees this error in the ETL logs. What is the most likely cause?

15

A data engineer is configuring access to a data lake in Amazon S3. What does the JSON policy shown allow?

16

A healthcare organization is building a data warehouse to support population health analytics. The data sources include: (1) an electronic health record (EHR) system with a relational database containing patient demographics, diagnoses, and medications; (2) a claims system that generates CSV files daily; (3) patient-generated health data from mobile apps via a REST API returning JSON. The data engineer needs to design a data acquisition process that runs nightly. The EHR system has a change tracking mechanism that logs changes with timestamps. The claims CSV files are appended daily. The API supports filtering by date. The data warehouse uses a star schema with fact and dimension tables. The engineer must ensure data consistency and minimize load times. Which approach should the engineer take?

17

A retail company is migrating its on-premises data warehouse to a cloud data warehouse. The current ETL process extracts data from a transactional database (SQL Server) and a web analytics system (JSON logs). The ETL runs nightly and takes 6 hours. The business requires that the new cloud warehouse support real-time reporting with data latency of less than 15 minutes. The data engineer proposes using change data capture (CDC) from the SQL Server database and streaming the JSON logs via a message queue. However, management is concerned about cost and complexity. The engineer must design a solution that meets the latency requirement while minimizing operational overhead. Which approach should the engineer recommend?

18

A data analyst is merging two datasets from different departments. The analyst notices that the 'CustomerID' field in the first dataset is stored as an integer, while in the second dataset it is stored as a string with leading zeros. Which TWO steps should the analyst take to ensure successful data integration?
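One possible way to reconcile the two key formats, sketched under assumptions: convert both fields to a single zero-padded string form before matching. The padding width of 8, the helper name, and the sample rows are all hypothetical. Casting both sides to integers would be an equally reasonable normalization when the leading zeros carry no meaning.

```python
def normalize_customer_id(value, width=8):
    """Return the ID as a zero-padded string so both sources share one key format."""
    text = str(value).strip()
    if not text.isdigit():
        raise ValueError(f"unexpected CustomerID: {value!r}")
    return text.zfill(width)


dataset_a = [{"CustomerID": 123, "region": "East"}]          # integer IDs
dataset_b = [{"CustomerID": "00000123", "segment": "Gold"}]  # strings with leading zeros

# Index the first dataset by the normalized key, then look up rows from the second.
by_id = {normalize_customer_id(r["CustomerID"]): dict(r) for r in dataset_a}
merged = []
for row in dataset_b:
    key = normalize_customer_id(row["CustomerID"])
    if key in by_id:
        merged.append({**by_id[key], **row, "CustomerID": key})
print(merged)
```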

19

Based on the exhibit, what is the most likely cause of the import failure?

20

A marketing company is building a customer segmentation model. The data team has access to two sources: a CRM database with customer demographics and purchase history, and a third-party data provider that offers social media activity scores. The CRM data is updated daily, while the third-party data is refreshed weekly on Sundays. The analyst needs to create a unified dataset for the model training scheduled for Wednesday morning. The analyst runs a SQL query to join the two tables on CustomerID, but the resulting dataset has far fewer rows than expected. Upon investigation, the analyst finds that many customers in the CRM do not have matching records in the third-party data. Additionally, some customers in the third-party data have multiple entries due to unresolved duplicates. The analyst must produce the most complete dataset possible while maintaining data quality. Which course of action should the analyst take?

21

Drag and drop the steps to perform a data backup using the 3-2-1 rule in the correct order.

22

Drag and drop the steps to perform a data audit in the correct order.

23

Match each data analysis technique to its primary purpose.

24

Match each database concept to its definition.

25

A data analyst needs to collect customer sentiment data from social media platforms. Which data acquisition method is most appropriate?

26

A company is merging two customer databases from different acquisitions. They need to identify duplicate records. Which data profiling technique is most effective?

27

A data architect is designing an ETL pipeline to ingest streaming data from IoT sensors. The data must be available for real-time analytics. Which acquisition method is best?

28

A marketing team wants to collect data on competitor pricing for similar products. Which data source is most appropriate?

29

During data acquisition, an analyst notices that the data from an external vendor has inconsistent date formats. What is the first step the analyst should take?

30

A data engineer needs to acquire data from a legacy mainframe system that does not support modern APIs or direct database connectivity. Which approach is most feasible?

31

A small business wants to acquire customer feedback through a short questionnaire emailed after purchase. Which data acquisition method does this represent?

32

An organization is integrating data from multiple sources into a data warehouse. They need to handle differences in data granularity (e.g., daily vs. hourly sales data). Which technique is most appropriate?

33

A data analyst is using a public API to collect historical weather data. The API has a rate limit of 100 requests per minute, but the analyst needs to retrieve 10,000 records as quickly as possible. What strategy should be used?

34

Which TWO are common methods for acquiring internal data? (Choose two.)

35

Which THREE are best practices for data profiling during acquisition? (Choose three.)

36

Which THREE are common challenges when acquiring data from external APIs? (Choose three.)

37

Refer to the exhibit. An analyst runs this query before acquiring data from a PostgreSQL database. What is the primary purpose of this query?

38

Refer to the exhibit. A data engineer is setting up data acquisition from an S3 bucket with this policy. What does the policy enforce?

39

Refer to the exhibit. An analyst sees this log during data acquisition. What action should be taken first?

40

A data analyst is tasked with combining customer data from a CRM system and a billing system. The CRM uses a GUID for customer ID, while billing uses an integer. Which approach should the analyst use to ensure a reliable merge?

41

A data team needs to extract data from a legacy system that only supports flat file exports. Which data acquisition method is most appropriate?

42

During a data mining project, an analyst discovers that a significant number of records have a negative value for the age field. What is the most appropriate first step?

43

Refer to the exhibit. What does the query return?

44

Refer to the exhibit. What data quality issue is indicated?

45

Refer to the exhibit. If the date column is stored as a string in 'MM/DD/YYYY' format, what will be the result?

46

A data analyst needs to identify duplicate customer records. Which TWO methods are commonly used? (Select two.)

47

After merging two datasets, an analyst finds that the resulting dataset has many null values in some columns. Which TWO steps should the analyst take to address this? (Select two.)

48

Which THREE data sources are suitable for web scraping? (Select three.)

49

A retail company wants to analyze customer purchase patterns to identify products frequently bought together. Which data mining technique is most appropriate?

50

A data analyst is importing a CSV file that contains a mixture of numeric and text fields. What is the most common issue when importing?

51

During data acquisition, a data engineer uses a tool to extract data from a source system incrementally based on a timestamp column. Which method is being used?
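The timestamp-based pattern described here can be sketched as follows: each run pulls only rows changed since a stored high-water mark, then advances that mark. This is a minimal illustration with Python's sqlite3 module. The table name, column names, and sample timestamps are assumptions, and the timestamps are ISO 8601 strings so that text comparison matches chronological order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_orders (order_id INTEGER, updated_at TEXT);
    INSERT INTO source_orders VALUES
        (1, '2024-03-01T08:00:00'),
        (2, '2024-03-02T09:30:00'),
        (3, '2024-03-03T11:45:00');
""")


def extract_since(conn, watermark):
    """Return rows changed after the watermark, plus the new watermark."""
    rows = conn.execute(
        "SELECT order_id, updated_at FROM source_orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    # Keep the old watermark when nothing changed, so the next run repeats the window.
    new_watermark = rows[-1][1] if rows else watermark
    return rows, new_watermark


rows, watermark = extract_since(conn, "2024-03-01T23:59:59")
print(rows, watermark)
```

Only the two rows modified after the previous watermark are returned, which is what keeps each run small compared with a full reload.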

52

A data analyst discovers that a dataset contains multiple records for the same customer with different spellings (e.g., 'Jon' vs 'John'). Which data preparation step should be applied first?

53

A financial institution is merging transaction data from two different systems. System A stores currency amounts as integers in cents, and System B stores them as decimals in dollars. What is the best way to integrate the data?
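A short worked sketch of the unit mismatch: before combining the two feeds, both can be converted to one unit and one numeric type. The example below converts everything to dollars using Python's decimal module to avoid binary floating-point rounding. The function names and sample amounts are hypothetical, and converting everything to integer cents instead would work just as well.

```python
from decimal import Decimal

CENT = Decimal("0.01")


def cents_to_dollars(cents):
    """System A style: integer cents -> Decimal dollars with two places."""
    return (Decimal(cents) / Decimal(100)).quantize(CENT)


def parse_dollars(amount):
    """System B style: decimal dollars, parsed from text to avoid float error."""
    return Decimal(str(amount)).quantize(CENT)


system_a = [1999, 250]       # cents
system_b = ["19.99", "2.5"]  # dollars

unified = [cents_to_dollars(c) for c in system_a] + [parse_dollars(d) for d in system_b]
print(unified)
```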

54

A data team is integrating customer data from three sources. After joining, they find that the count of unique customers is lower than expected. What is the most likely cause?

55

A data analyst needs to merge two customer tables from different sources. One table uses 'CUST_ID' as the primary key, the other uses 'CustomerID'. To ensure accurate merging, the analyst should first:

56

A company receives daily sales data in CSV format. The data includes a 'Date' column in MM/DD/YYYY format. To load this into a database that expects YYYY-MM-DD, the analyst should:
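For illustration, a date reformatting step like this one can be sketched with Python's standard datetime module: parse the text using the source pattern, then emit it in the target pattern. The function name is hypothetical.

```python
from datetime import datetime


def to_iso_date(us_date):
    """Convert 'MM/DD/YYYY' text to 'YYYY-MM-DD'; raises ValueError on bad input."""
    return datetime.strptime(us_date, "%m/%d/%Y").strftime("%Y-%m-%d")


print(to_iso_date("03/07/2024"))
```

Parsing rather than rearranging substrings has the side benefit of rejecting impossible dates such as 13/40/2024 before they reach the database.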

57

A data analyst is tasked with collecting data from a web API that returns JSON. The API requires an API key in the header. Which method should be used to authenticate?

58

An analyst needs to combine two datasets from different sources that share a common key but have different levels of granularity. Dataset A has daily sales per store, Dataset B has hourly foot traffic per store. The analyst wants to analyze correlation. Which approach is appropriate?
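One way to line up the two granularities, shown as a sketch with made-up numbers: roll the hourly foot traffic up to daily totals per store, then pair each total with the matching daily sales figure. The store IDs, dates, and values below are all hypothetical.

```python
from collections import defaultdict

# Dataset A: daily sales keyed by (store, day).
daily_sales = {("S1", "2024-05-01"): 1200.0, ("S1", "2024-05-02"): 900.0}

# Dataset B: hourly foot traffic as (store, ISO timestamp, visitors).
hourly_traffic = [
    ("S1", "2024-05-01T09:00", 40),
    ("S1", "2024-05-01T10:00", 60),
    ("S1", "2024-05-02T09:00", 30),
]

# Aggregate the finer-grained data up to the coarser (daily) level.
daily_traffic = defaultdict(int)
for store, timestamp, visitors in hourly_traffic:
    day = timestamp[:10]  # the 'YYYY-MM-DD' prefix of the timestamp
    daily_traffic[(store, day)] += visitors

# Pair each day's sales with that day's total traffic.
combined = [
    (store, day, sales, daily_traffic.get((store, day), 0))
    for (store, day), sales in sorted(daily_sales.items())
]
print(combined)
```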

59

A data team is designing an ETL process to extract data from an operational database daily. The database experiences heavy write loads during business hours. What is the best practice to minimize impact on operations?

60

A healthcare organization acquires data from multiple hospitals with different patient record systems. The data includes patient IDs but no common identifier across systems. Which technique should be used to link records?

61

A financial analyst is integrating data from multiple stock exchanges. One exchange provides trade timestamps in UTC, another in Eastern Time. The analyst needs accurate time synchronization for time-series analysis. What is the best approach?

62

An e-commerce company is merging customer data from three legacy systems. Two systems use email as the unique identifier, but one of them allows multiple customers per email. The third uses a phone number. To create a unified customer view, the analyst should first:

63

A data engineer is tasked with acquiring data from a third-party vendor that provides daily file drops via SFTP. The files are large (10 GB each). The pipeline must load data into a data warehouse. Which approach optimizes for speed and reliability?

64

A data analyst is validating a dataset acquired from an external source. Which TWO actions are appropriate for data quality assessment?

65

A company is acquiring social media data via a public API. Which TWO considerations are important for ensuring ethical and legal compliance?

66

A data scientist is merging retail transaction data from online and in-store sources. Which THREE steps are required to ensure data consistency?

67

A data analyst receives the above JSON snippet from a web API. The analyst needs to extract the email addresses for all customers. Which JSONPath expression should be used?

68

An analyst is reviewing the above SQL query used to acquire data. What does this query retrieve?

69

A data pipeline log shows the above error. Which data transformation should be applied during acquisition?

70

A marketing team wants to analyze customer sentiment from social media posts. Which data acquisition method is most appropriate?

71

A data analyst needs to combine sales data from multiple regional databases with different schemas. Which process is best?

72

An organization is acquiring data from an external vendor. The vendor provides a flat file with inconsistent delimiters and missing values. Which step should be performed first in data acquisition?

73

A data analyst is tasked with gathering data from a legacy system that only exports CSV files. The files contain headers but no data types. Which tool would best facilitate initial data exploration?

74

A company wants to collect real-time clickstream data from its website. Which acquisition method is most suitable?

75

A financial institution needs to acquire credit transaction data from multiple sources while ensuring compliance with data privacy regulations. What is the most critical step?

76

A data analyst is extracting data from a relational database using SQL. Which clause is essential for limiting the rows retrieved to only those needed?

77

An e-commerce company is acquiring product data from multiple supplier APIs. The APIs return JSON with inconsistent field naming conventions. Which data acquisition technique should be applied?

78

A data team is using web scraping to collect competitor pricing data. The target website has anti-scraping measures like CAPTCHAs and rate limiting. Which approach is most effective?

79

Which TWO are examples of internal data sources? (Select exactly 2)

80

A data analyst is evaluating data quality issues during acquisition. Which TWO issues are most likely to arise from merging data from different sources? (Select exactly 2)

81

Which THREE are best practices for acquiring data via web scraping? (Select exactly 3)

82

Refer to the exhibit. What is the most likely issue causing the unexpectedly low count?

83

Refer to the exhibit. What is the most likely cause of the extraction failure?

84

A retail company is acquiring sales data from 150 stores worldwide. Each store sends daily CSV files via email to a central email address. The data acquisition process is manual: an intern downloads each attachment and copies it into a shared folder. The shared folder is then accessed by an ETL tool that loads data into a data warehouse. Recently, the data warehouse has been missing records for several stores. The intern reports that some emails are not being received or are delayed. The company needs to improve the reliability and timeliness of data acquisition. Which course of action should be taken first?

85

A marketing analyst needs to combine customer data from a CRM database with social media engagement data from a third-party API. Which data acquisition method is most appropriate?

86

A data analyst is tasked with collecting data from multiple spreadsheets provided by different departments. Each spreadsheet has different column names and formats. What is the best first step?

87

A data engineer is designing an ETL pipeline to extract sales data from a legacy on-premises database and load it into a cloud data warehouse. The database is slow, and queries during business hours affect performance. Which extraction strategy minimizes impact?

88

A research firm is acquiring data from public government databases via API. The API rate limits at 100 requests per minute. They need to download 10,000 records, but each request returns a maximum of 100 records. What is the most efficient approach to ensure complete acquisition without being blocked?
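The numbers in this scenario can be worked through directly: 10,000 records at 100 records per request is 100 requests, which fits within a single one-minute window at 100 requests per minute. The small sketch below does that arithmetic. The function name is hypothetical, and spacing requests evenly is just one simple pacing assumption.

```python
import math


def plan_requests(total_records, page_size, requests_per_minute):
    """Return (requests needed, one-minute windows needed, seconds between requests)."""
    requests_needed = math.ceil(total_records / page_size)
    minutes_needed = math.ceil(requests_needed / requests_per_minute)
    seconds_between = 60 / requests_per_minute  # even spacing that stays at the limit
    return requests_needed, minutes_needed, seconds_between


plan = plan_requests(10_000, 100, 100)
print(plan)
```

With a smaller page size the picture changes: at 10 records per request the same download would need 1,000 requests and at least ten one-minute windows.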

89

Which TWO are valid data acquisition methods? (Select two.)

90

Which THREE are challenges in acquiring data from external sources? (Select three.)

91

A retail company's data analytics team needs to acquire point-of-sale (POS) transaction data from 200 stores daily. Each store sends a CSV file via email at the end of the day. The files often arrive late, have inconsistent column names (e.g., "StoreID", "Store_ID", "store_id"), and occasionally contain corrupted rows. The team manually processes these files, leading to frequent errors and delays. The company wants to automate the acquisition process to ensure data is available by 9 AM the next business day with high quality. Which approach best addresses these issues?

92

A healthcare organization collects patient questionnaire data via paper forms at clinics. The forms are scanned and sent to a central office, where staff manually enter data into an electronic system. This process is slow and error-prone. The organization wants to reduce manual entry errors and speed up data availability. Which method should they adopt?

93

A logistics company receives GPS tracking data from fleet vehicles at 1-second intervals via a cellular network. The data is used to optimize routes and monitor driver behavior. Recently, the data acquisition system has been missing updates for some vehicles when they pass through tunnels or remote areas. The data team notices gaps during these periods. The company needs a solution to ensure near-real-time data continuity. What should they do?

94

An e-commerce company wants to integrate product pricing data from competitor websites to adjust its own prices dynamically. They plan to scrape pricing pages every hour. However, the competitors' websites have anti-scraping measures such as IP blocking and CAPTCHAs. The company's legal team also advises caution regarding terms of service. Which data acquisition strategy is both effective and compliant?

95

A financial analytics firm needs to acquire historical stock market tick data (millions of records per day) from a data vendor. The vendor provides data via FTP in binary format. The firm's existing infrastructure uses on-premises servers with limited storage and processing power. They need to stream the data into a cloud data lake for analysis. However, the binary format is proprietary and requires a licensed decoder. The budget is constrained. Which approach best meets the data acquisition requirements?

96

A social media monitoring company collects public tweets using the Twitter API. The API has tiered access: the free tier allows 500,000 tweets per month, and the paid tier allows 2 million tweets per month. The company needs to collect 1.5 million tweets per month for analysis. They are on the free tier but have been exceeding the limit, causing account suspension. They need a sustainable solution without significantly increasing costs. What should they do?

97

A data analyst is performing data acquisition from multiple source files. Which TWO data profiling tasks should the analyst complete before loading the data into the target system?

98

Refer to the exhibit. A data analyst is trying to extract data from a SQL Server database but receives the error. Which configuration change should the analyst recommend to the database administrator?

99

A large retail company is integrating customer data from two separate CRM systems into a new data warehouse. System A stores customer IDs as integers (e.g., 12345), while System B stores them as alphanumeric strings (e.g., 'CUST-12345-X'). Additionally, some customers exist in both systems but with slight name variations (e.g., 'John Smith' vs 'Jon Smith'). The data warehouse requires a unified customer table with a single unique identifier for each customer. The analyst needs to design the data acquisition process. Which of the following is the most appropriate first step?

Frequently asked questions

What does the Mining and Acquiring Data domain cover on the DA0-001 exam?

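Mining and Acquiring Data is one of the five DA0-001 domains listed on this page, drawn from the exam blueprint published by CompTIA. Judging by the questions above, it spans data acquisition methods (APIs, web scraping, surveys, flat files, and streaming), ETL design, SQL extraction and joins, data profiling, and reconciling formats and identifiers across sources. Courseiva provides free domain-focused practice, mock exams, missed-question review, and readiness tracking across all DA0-001 domains, with no account required.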

How many Mining and Acquiring Data questions are in the DA0-001 question bank?

The Courseiva DA0-001 question bank contains 99 questions in the Mining and Acquiring Data domain. Click any question to see the full explanation and answer breakdown.

What is the best way to practice Mining and Acquiring Data for DA0-001?

Start with a 10-question focused session to identify your baseline accuracy in this domain. Read every explanation — even for questions you answer correctly — to understand the reasoning. Once you score consistently above 80%, move to a 20–30 question session to confirm depth before moving to the next domain.

Can I practice only Mining and Acquiring Data questions for DA0-001?

Yes — the session launcher on this page draws questions exclusively from the Mining and Acquiring Data domain. Choose 10, 20, 30, or 50 questions for a focused session, or click individual questions to review them one by one.
