Courseiva
Knowledge + Practice
CertificationsVendorsCareer RoadmapsLabs & ToolsStudy GuidesGlossaryPractice Questions
C
Courseiva

Free IT certification practice questions with explained answers for CCNA, CompTIA, AWS, Azure, Google Cloud, and more.

Certification Practice Questions

CCNA practice questionsSecurity+ SY0-701 practice questionsAWS SAA-C03 practice questionsAZ-104 practice questionsAZ-900 practice questionsCLF-C02 practice questionsA+ Core 1 practice questionsGoogle Cloud ACE practice questionsCySA+ CS0-003 practice questionsNetwork+ N10-009 practice questions
View all certifications →

Product

CertificationsCertification PathsExam TopicsPractice TestsExam Dumps vs Practice TestsStudy HubComparisons

Company

AboutContactEditorial PolicyQuestion Writing PolicyTrust Center

Legal

Privacy PolicyTerms of Service

Courseiva is a free IT certification practice platform offering original exam-style practice questions, detailed explanations, topic-based practice, mock exams, readiness tracking, and study analytics for Cisco, CompTIA, Microsoft, AWS, and other technology certifications.

© 2026 Courseiva. Courseiva is operated by JTNetSolutions Ltd. All rights reserved.

Courseiva is an independent certification practice platform and is not affiliated with, endorsed by, or sponsored by Cisco, Microsoft, AWS, CompTIA, Google, ISC2, ISACA, or any other certification vendor. Vendor names and certification marks are used only to identify the exams learners are preparing for.

HomeCertificationsDA0-001DomainsComparing and Contrasting Data Concepts
DA0-001Free — No Signup

Comparing and Contrasting Data Concepts

Practice DA0-001 Comparing and Contrasting Data Concepts questions with full explanations on every answer.

115questions

Start practicing

Comparing and Contrasting Data Concepts — choose a session length

10 questions~10 min20 questions~20 min30 questions~30 min50 questions~50 min

Free · No account required

DA0-001 Domains

Comparing and Contrasting Data ConceptsMining and Acquiring DataAnalyzing and Modeling DataVisualizing DataCommunicating Data Insights

Practice Comparing and Contrasting Data Concepts questions

10Q20Q30Q50Q

All DA0-001 Comparing and Contrasting Data Concepts questions (115)

Start session

Click any question to see the full explanation and answer options, or start a focused practice session above.

1

A retail company stores customer purchase history in a relational database. The database contains a table 'transactions' with columns: transaction_id, customer_id, product_id, quantity, price, and transaction_date. A data analyst needs to create a report that shows total revenue per customer for the last quarter. Which data concept describes the relationship between customer_id and total revenue?

2

A healthcare database stores patient records. Each patient has a unique patient_id, and the database includes a table 'visits' with visit_id, patient_id, visit_date, and diagnosis_code. To ensure data integrity, which constraint should be applied to the patient_id column in the 'visits' table?

3

A data engineer is designing a data warehouse for a multinational corporation. The company has sales data from different regions with varying currencies and date formats. To ensure consistency, which data concept should be applied to standardize the data before loading into the warehouse?

4

An e-commerce company uses a star schema for its data warehouse. The fact table 'sales_fact' contains foreign keys to dimension tables: customer_dim, product_dim, time_dim, and store_dim. A business user wants to know the total sales for each product category in the last month. Which join operation is required to retrieve this data?

5

A data analyst is working with a dataset containing customer information. The dataset includes a column 'full_name' which stores first and last names together. To perform analysis on first names separately, which data concept describes the process of splitting 'full_name' into 'first_name' and 'last_name'?

6

A data scientist is building a machine learning model to predict customer churn. The dataset includes both numerical features (age, income) and categorical features (gender, marital status). Which data concept describes the process of converting categorical features into numerical values that can be used by the algorithm?

7

A company's database has a table 'orders' with columns: order_id, customer_id, order_date, and total_amount. A data analyst needs to identify customers who have placed more than 5 orders in the past year. Which data concept should be used to group orders by customer and count them?

8

A data analyst receives a dataset with a column 'salary' that contains values like '45,000', '55,000', and '65,000'. The analyst notices that the values are stored as text. Which data concept should be applied to convert the salary column from text to numeric format for analysis?

9

Which TWO of the following are characteristics of structured data? (Choose TWO.)

10

Which THREE of the following are valid data quality dimensions? (Choose THREE.)

11

Which TWO of the following are examples of data transformation? (Choose TWO.)

12

A financial services company is migrating its customer data from a legacy on-premises relational database to a cloud-based data warehouse. The legacy database uses a denormalized schema with a single table 'customer_master' that contains all customer attributes, including repeated groups for multiple accounts per customer (account1_type, account1_balance, account2_type, account2_balance, etc.). The data warehouse team wants to implement a normalized star schema with separate dimension and fact tables. During the ETL process, the team encounters an error: 'Data truncation: string data right truncation' when loading account_type values into the dim_account table. The account_type column in dim_account is defined as VARCHAR(10), but the source data contains account types like 'SavingsPlus' (11 characters) and 'CheckingPremium' (15 characters). The team must resolve this issue without losing data. Which course of action should the team take?

13

A healthcare organization maintains a database of patient records. The database has a table 'patients' with columns: patient_id (primary key), first_name, last_name, date_of_birth, gender, and last_visit_date. A data analyst is tasked with creating a report that lists all patients who have not visited in the last two years. The analyst writes a query: SELECT * FROM patients WHERE last_visit_date < DATEADD(year, -2, GETDATE()); However, the query returns zero rows, even though the analyst knows there are patients who have not visited for over two years. Upon inspection, the analyst discovers that the last_visit_date column contains NULL values for patients who have never visited. Which modification to the query should the analyst make to include patients with NULL last_visit_date?

14

A data analyst needs to ensure that a customer's address is stored in a consistent format across multiple databases. Which data quality dimension is the analyst primarily concerned with?

15

A data engineer is designing a data warehouse for a retail company. The fact table must record each sale transaction, including product ID, store ID, date, and quantity sold. The product details (name, category, price) are stored in a separate table. This design is an example of which data modeling concept?

16

A data analyst is troubleshooting a report that shows unusually high sales for a specific product. Upon investigation, the analyst finds that the product was returned by several customers, but the returns were recorded in a separate system and not reflected in the sales data. Which data integration concept was likely missing?

17

When the analyst runs the query, it fails. What is the most likely reason?

18

A data analyst is reviewing the error log from a nightly batch load. What is the most likely cause of the error?

19

A data analyst is creating a report for a marketing campaign. The campaign data includes customer names, email addresses, and purchase history. Which of the following best describes the 'customer name' data type?

20

Which TWO of the following are considered structured data?

21

Which TWO of the following are examples of data governance best practices?

22

Which THREE of the following are characteristics of a relational database?

23

Which THREE of the following are valid methods for handling missing data?

24

A mid-sized e-commerce company stores customer data in a relational database. The database has a table named 'Customers' with columns: CustomerID (primary key), FirstName, LastName, Email, Phone, Address, City, State, ZipCode, and SignUpDate. The company is migrating to a new CRM system that requires a denormalized structure for performance reasons. The new system expects a single table 'CustomerDetails' with columns: CustomerID, FullName (concatenation of first and last name), ContactInfo (JSON object containing email, phone, and address), SignUpDate, and Region (derived from state). The data analyst must design an ETL process to transform the data. During a test run, the analyst notices that some records have missing Phone or Address values. Which of the following is the best approach to handle missing data in the ContactInfo JSON object?

25

Drag and drop the steps for the ETL (Extract, Transform, Load) process in the correct order.

26

Drag and drop the steps to create a data visualization dashboard in the correct order.

27

Match each data quality dimension to its description.

28

Match each data visualization type to its best use case.

29

A company needs to store raw, unprocessed data from IoT sensors for future machine learning experiments. The data is in various formats and schemas are not yet defined. Which storage solution is most appropriate?

30

A financial application requires fast query performance for aggregations on large historical datasets. The schema has many lookup tables. Which schema design is most efficient for this workload?

31

A database table has columns: OrderID (primary key), ProductID, CustomerID, CustomerName, OrderDate, ProductName. All products are purchased only by the customer who placed the order. Which normal form violation exists if CustomerName depends on CustomerID?

32

A data engineer needs to store logs from web servers that have varying fields. The logs are in JSON format. Which data type describes this JSON data?

33

A retail company processes daily transactions. The current system transforms data before loading it into the data warehouse. The volume is growing rapidly, and they want to load raw data first to reduce processing time. Which approach should they adopt?

34

A table Orders has OrderID (primary key), CustomerID, and CustomerEmail. During analysis, it is found that CustomerID uniquely identifies CustomerEmail. Which normal form is violated if both CustomerID and CustomerEmail are stored in this table?

35

A company requires real-time masking of credit card numbers for customer support agents while allowing full access for accountants. Which technique should be implemented?

36

A hospital's patient records system must process thousands of small transactions per second. Which type of database system is best suited for this workload?

37

A data quality report shows that 95% of records have all required fields completed, but 20% of the completed fields contain values that are outside valid ranges. Which data quality dimension is most affected?

38

Which TWO roles are primarily responsible for defining and enforcing data governance policies within an organization?

39

Which THREE of the following are NoSQL database types?

40

A data team must implement a data retention policy to reduce storage costs while meeting legal requirements. Which TWO actions best achieve this?

41

Refer to the exhibit. A data analyst is trying to understand access permissions for the company-data bucket. Which statement accurately describes the effective permissions?

42

Refer to the exhibit. A database administrator notices that queries filtering on both CustomerID and OrderDate are slow. Which single change would most likely improve performance for such queries?

43

Refer to the exhibit. A data pipeline is failing to parse this log entry. What is the most likely cause of the error?

44

A retail analyst needs to determine the most popular product category. The dataset includes columns: ProductID, Category, SalesDate, QuantitySold, UnitPrice. Which column contains qualitative data?

45

A data scientist is building a model to predict customer churn. The company's internal CRM system provides customer demographics and transaction history. They also purchase demographic data from a third-party vendor. How should the purchased data be classified?

46

A logistics company is analyzing truck delivery times. Which variable is discrete?

47

A hospital wants to analyze patient readmission rates. The data contains daily patient visits. What is the level of granularity?

48

A data analyst finds that the "Age" column contains values like "N/A", "unknown", and negative numbers. Which data quality dimension is primarily affected?

49

An e-commerce company stores customer support emails in a text database, product images in a blob store, and sales transactions in a SQL table. Which data store holds only structured data?

50

A market researcher conducts a survey with questions like "What is your favorite brand?" and "How many units do you purchase per year?" Which data types correspond?

51

In a customer database, each row represents a customer with columns: CustomerID, Name, Address, Phone. What does the column "Name" represent?

52

A data audit reveals that some numbers in the "Revenue" column were manually entered from PDF invoices. This introduces potential errors. Which data concept is being addressed?

53

Which TWO data types are considered quantitative? (Select two.)

54

Which THREE characteristics describe unstructured data? (Select three.)

55

Which TWO are examples of primary data? (Select two.)

56

Refer to the exhibit. Which type of data is the field "region"?

57

Refer to the exhibit. Based on the data profiling results, what is a likely data quality issue?

58

Refer to the exhibit. Which data quality dimension is compromised by the missing value for Charlie's salary?

59

A retail company stores customer transaction data in a relational database. They want to analyze purchasing patterns over time. Which type of data structure best supports this analysis?

60

A data analyst receives a dataset with inconsistent date formats (e.g., "01/02/2023", "2023-01-02", "Jan 2, 2023"). Which data quality dimension is most directly affected?

61

A company is designing a data lake to store raw sensor data from IoT devices. The data arrives as JSON objects with varying schemas. Which storage approach is most appropriate?

62

A healthcare provider needs to integrate patient data from multiple clinics into a single data warehouse. Which process is used to extract, transform, and load the data?

63

A data analyst is working with a dataset that contains customer names and addresses. Some records have missing state codes. Which data quality issue is this?

64

A financial institution wants to analyze transaction networks to detect fraud rings. Which database type is best suited for this analysis?

65

A data architect is designing a schema for a product catalog where each product has a variable number of attributes. Which NoSQL database type is most appropriate?

66

A data engineer is comparing data warehouses and data lakes. Which statement accurately describes a data warehouse?

67

During an ETL process, a data quality check fails due to duplicate customer IDs. Which data quality dimension is violated?

68

Which TWO of the following are examples of semi-structured data?

69

Which THREE data quality dimensions are commonly assessed in a data profiling task?

70

Which TWO of the following are characteristics of a data lake?

71

Refer to the exhibit. A data analyst notices that direct S3 access to files outside the "incoming/" prefix is blocked. Which data governance principle does this policy enforce?

72

Refer to the exhibit. A data analyst runs this query to identify high-value customers. However, the result does not include customers with exactly 5 orders. Which data concept does the HAVING clause illustrate?

73

Refer to the exhibit. An Avro schema is defined as shown. Which data design concept does this represent?

74

A marketing team needs to store customer feedback from social media posts, including text, images, and emojis. Which data concept is most appropriate for this storage?

75

An organization wants to assign responsibility for data quality and metadata management. Which role is primarily accountable for defining data standards and ensuring data quality across a specific domain?

76

A data analyst notices that customer addresses in the database contain invalid ZIP codes. Which data quality dimension is being violated?

77

A company is implementing a data lifecycle management policy. Which stage occurs immediately after data is created?

78

To consolidate data from multiple operational databases into a central repository for reporting, a company decides to transform data before loading it into the target system. Which data integration approach is being used?

79

A business needs to store large volumes of raw data in its native format for future analytics. Which storage architecture is most appropriate?

80

A data modeler is designing a dimensional model for a sales analytics system. The fact table contains sales transactions, and the dimension tables include product, customer, and time. To reduce data redundancy, the modeler normalizes the dimension tables into multiple related tables. Which schema is being implemented?

81

An organization has multiple systems that store customer information inconsistently. To create a single authoritative view of customer data, they implement a process that identifies and merges duplicate records. This is an example of which data management discipline?

82

A data governance team is drafting a policy for handling personally identifiable information (PII). According to data governance best practices, which document should define the classification levels and handling procedures?

83

Which TWO of the following are considered internal data sources within an organization?

84

Which THREE of the following are common characteristics of unstructured data?

85

Which TWO of the following are primary benefits of implementing a data governance program?

86

Refer to the exhibit. The data shown is an example of which data concept?

87

Refer to the exhibit. Which data concept does this exhibit best represent?

88

Refer to the exhibit. Which conclusion can be drawn from this data quality report?

89

A market research firm collects survey responses where customers rate satisfaction on a scale of 'Very Unsatisfied', 'Unsatisfied', 'Neutral', 'Satisfied', 'Very Satisfied'. What type of data is being collected?

90

A data analyst needs to compare sales data from the company's internal CRM with public demographic data from a government census. Which data concept best describes this scenario?

91

A sensor records temperature readings in Celsius and a separate sensor records wind speed in meters per second. A data scientist wants to combine these datasets for analysis. Which statement accurately compares these data types?

92

A company stores customer data in a relational database with tables for orders, products, and customers. Which type of data best describes this?

93

A researcher wants to study the effect of a new drug. She collects data directly from clinical trial participants. Later, she compares her findings with historical data from medical journals. Which contrast best describes her data sources?

94

A data analyst notices that a column labeled 'Income' contains values like '$50,000' and '$75,000', but also 'High' and 'Low'. What data concept issue is occurring?

95

Which of the following is an example of qualitative data?

96

A data team is building a predictive model. They have data on 'Number of employees' (whole numbers) and 'Revenue' (currency). Which statement correctly compares these data types?

97

A dataset contains a column 'Education Level' with values: 'High School', 'Bachelor', 'Master', 'PhD'. An analyst computes the average by assigning numbers 1-4. Which data concept is being violated?

98

Which TWO of the following are characteristics of structured data? (Choose TWO.)

99

Which TWO of the following are examples of quantitative data? (Choose TWO.)

100

Which THREE of the following are properties of ratio data? (Choose THREE.)

101

The exhibit shows a JSON schema for a dataset. Which statement correctly describes the data types represented?

102

A retail company has merged with another firm and now needs to create a unified customer data warehouse. The existing systems use different data classification methods: System A stores customer income as a categorical range (e.g., '$0-$50k', '$50k-$100k', '$100k+') while System B stores exact income as a decimal number. A data analyst must combine these into a single table. The goal is to perform statistical analysis that includes calculating average income, but the categorical data from System A loses precision. The analyst proposes converting System B's exact values into the same ranges as System A to ensure consistency. However, the data governance team wants to preserve as much detail as possible. Which course of action should the analyst recommend?

103

A healthcare analytics team is building a dashboard to monitor patient vitals. They receive data from two sources: Source 1 provides 'heart rate' as an integer (beats per minute), and Source 2 provides 'blood pressure' as a ratio (systolic/diastolic, e.g., 120/80). The team wants to create a combined metric called 'cardiac stress index' that uses both heart rate and systolic blood pressure. However, they notice that heart rate data occasionally contains negative values due to sensor errors. The data governance policy requires that all data be valid and meaningful. Which action best addresses the data quality issue while preserving the data types?

104

A retail company analyzes customer purchase data to improve inventory management. They store daily transaction records in a relational database and monthly aggregate reports in a data warehouse. Which difference between these storage methods best explains why the warehouse is more suitable for trend analysis?

105

A data analyst is evaluating data quality issues in a customer database. Which TWO actions are best practices for ensuring data consistency?

106

An organization is implementing a data lake to store raw data from various sources. Which THREE characteristics are typically associated with a data lake compared to a data warehouse?

107

A manufacturing company has two primary data systems: an ERP system that stores production orders with fields like OrderID, ProductID, Quantity, and ProductionDate, and a CRM system that stores customer sales with fields like SaleID, CustomerID, ProductID, SaleDate, and Amount. The data analyst needs to create a unified view of product performance by joining these tables. However, the ProductID field in the ERP uses a 5-character alphanumeric code (e.g., 'P1234'), while the CRM uses a 6-character code (e.g., 'PR1234'). Additionally, some products have multiple entries due to slight variations in naming. The analyst wants to ensure accurate matching without losing data. Which action should the analyst take first to address the data inconsistency?

108

A large financial institution is implementing a data governance framework to comply with new regulations requiring strict control over sensitive customer data. The data governance committee has identified several domains, including customer master data, transaction data, and risk assessment data. They need to decide on a master data management (MDM) approach that ensures a single, authoritative source of customer information across all systems. However, the current environment has multiple legacy systems with conflicting customer records. The committee is concerned about downtime and business disruption during the transition. Which MDM approach best balances data consistency with minimal operational impact?

109

A data analyst at a marketing agency is working with a dataset containing customer demographics, purchase history, and social media engagement metrics. The agency wants to perform sentiment analysis on unstructured social media comments to identify brand perception. The dataset also includes structured fields like age, income, and purchase amounts. The analyst needs to choose a storage and processing platform that can handle both structured and unstructured data efficiently without requiring extensive schema definition upfront. Which platform should the analyst recommend?

110

A telecommunications company is experiencing issues with its customer satisfaction survey data. The data is collected from multiple channels: phone, email, and web forms. Each channel uses a different scale for ratings: phone uses 1-10, email uses 1-5, and web uses 1-7. Additionally, some survey responses contain missing values for demographic fields. The data analyst needs to calculate an overall satisfaction score that is comparable across all channels. The company's leadership wants a single metric that minimizes distortion from the different scales. Which approach should the analyst use to standardize the ratings?

111

A healthcare organization is subject to strict data privacy regulations requiring the classification of all data assets. The data governance team has identified three data sensitivity levels: Public, Internal, and Restricted. They have a new data pipeline importing patient health records from multiple clinics. The records include patient names, diagnoses, treatment codes, and insurance information. The team must ensure that the classification is applied correctly and that restricted data (e.g., diagnoses) is not exposed to unauthorized personnel. However, the pipeline uses automated tagging based on metadata rules, and some fields are misclassified. What is the most effective immediate action to improve classification accuracy?

112

An e-commerce company wants to provide real-time personalized product recommendations based on customer browsing behavior. Currently, they have a traditional data warehouse that processes batch updates every night. The marketing team complains that recommendations are outdated within hours because customers see yesterday's data. The data engineer needs to modify the architecture to support near-real-time analytics. The budget is limited, and the existing warehouse infrastructure must be reused as much as possible. Which architectural change would best meet the requirement?

113

A data analyst is comparing characteristics of structured and unstructured data. Which TWO of the following are characteristics of structured data? (Choose two.)

114

Refer to the exhibit. A data architect is designing a data dictionary for a relational database. Based on the exhibit, which data concept is being illustrated?

115

A retail company is merging customer data from three separate systems: an e-commerce platform, a point-of-sale (POS) system, and a loyalty program. The e-commerce platform stores customer names in "FirstName LastName" format, the POS system stores names as "LastName, FirstName", and the loyalty program stores names in separate "first_name" and "last_name" fields. The data analyst needs to create a unified customer master table. After initial merging, there are 20% more records than expected, including duplicates with slight name variations (e.g., "John Smith" vs "John A. Smith"). To ensure accurate consolidation, which data concept should the analyst prioritize applying first?

Practice all 115 Comparing and Contrasting Data Concepts questions

Other DA0-001 exam domains

Mining and Acquiring DataAnalyzing and Modeling DataVisualizing DataCommunicating Data Insights

Frequently asked questions

What does the Comparing and Contrasting Data Concepts domain cover on the DA0-001 exam?

The Comparing and Contrasting Data Concepts domain covers the key concepts tested in this area of the DA0-001 exam blueprint published by CompTIA. Courseiva provides free domain-focused practice, mock exams, missed-question review, and readiness tracking across all DA0-001 domains — no account required.

How many Comparing and Contrasting Data Concepts questions are in the DA0-001 question bank?

The Courseiva DA0-001 question bank contains 115 questions in the Comparing and Contrasting Data Concepts domain. Click any question to see the full explanation and answer breakdown.

What is the best way to practice Comparing and Contrasting Data Concepts for DA0-001?

Start with a 10-question focused session to identify your baseline accuracy in this domain. Read every explanation — even for questions you answer correctly — to understand the reasoning. Once you score consistently above 80%, move to a 20–30 question session to confirm depth before moving to the next domain.

Can I practice only Comparing and Contrasting Data Concepts questions for DA0-001?

Yes — the session launcher on this page draws questions exclusively from the Comparing and Contrasting Data Concepts domain. Choose 10, 20, 30, or 50 questions for a focused session, or click individual questions to review them one by one.

Free forever · No credit card required

Track your DA0-001 domain progress

Save your results, see per-domain analytics, and get readiness scores — free, for every certification.

Sign Up Free

Free forever · Every certification included

Practice Session

10 questions20 questions30 questions50 questions

Study Resources

All DomainsPractice TestMock ExamFlashcardsStudy Guide