This chapter covers SQL data types and constraints, which are fundamental to designing relational database schemas in Azure SQL Database, Azure SQL Managed Instance, and SQL Server on Azure VMs. For the DP-900 exam, approximately 10-15% of questions relate to data types and constraints, often testing your ability to choose the correct type or constraint for a given scenario. You will need to understand the differences between exact and approximate numeric types, character string types, date/time types, and the purpose of PRIMARY KEY, FOREIGN KEY, UNIQUE, CHECK, and DEFAULT constraints. Mastering these concepts ensures you can design efficient, consistent relational databases.
Jump to a section
Think of SQL data types as different types of storage containers in a warehouse. An INT is like a small, fixed-size bin that can only hold whole numbers from -2^31 to 2^31-1. If you try to put a decimal or a text label in that bin, the warehouse system rejects it. A VARCHAR(50) is like a flexible plastic tote that can hold up to 50 characters of text, but it expands only to the length needed. A DECIMAL(10,2) is like a precision scale that stores numbers with exactly 2 decimal places, like monetary amounts. Constraints are the rules for what can go into each container. A PRIMARY KEY constraint is like a unique barcode on a bin that also forces the bin to be non-empty — no two bins can have the same barcode. A FOREIGN KEY constraint is like a cross-reference between bins in different aisles: if you put a product in aisle B that refers to a bin in aisle A, that bin in aisle A must exist. A CHECK constraint is like a quality gate that only allows items that meet certain criteria, like "quantity must be >= 0". The database engine enforces these rules automatically, just as a warehouse management system would reject any item that violates the container rules. This prevents data corruption and ensures consistency.
What Are SQL Data Types?
SQL data types define the kind of data that can be stored in a column, table variable, or parameter. They enforce data integrity by restricting the format, range, and precision of values. In Microsoft SQL Server (the engine behind Azure SQL Database), data types are grouped into categories: exact numeric, approximate numeric, date and time, character strings, Unicode character strings, binary strings, and others.
Exact Numeric Types
Exact numeric types store numbers with precise precision and scale. The main types are: - INT: 4-byte integer, range -2,147,483,648 to 2,147,483,647. - BIGINT: 8-byte integer, range -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. - SMALLINT: 2-byte integer, range -32,768 to 32,767. - TINYINT: 1-byte integer, range 0 to 255. - DECIMAL(p, s) and NUMERIC(p, s): Fixed precision and scale. p (precision) is the total number of digits, s (scale) is the number of digits after the decimal point. Maximum precision is 38. Storage depends on precision: up to 9 digits use 5 bytes, up to 19 use 9 bytes, up to 28 use 13 bytes, up to 38 use 17 bytes. - MONEY and SMALLMONEY: Store currency values with fixed precision. MONEY uses 8 bytes, range -922,337,203,685,477.5808 to 922,337,203,685,477.5807. SMALLMONEY uses 4 bytes, range -214,748.3648 to 214,748.3647. These are exact numeric types but not commonly used; DECIMAL is preferred.
Approximate Numeric Types
- FLOAT(n): Approximate number with n bits of mantissa. n can be 1-53. Default is 53 (8 bytes). Range: -1.79E+308 to 1.79E+308. - REAL: Equivalent to FLOAT(24), 4 bytes. Range: -3.40E+38 to 3.40E+38. Approximate types should not be used for financial calculations because they can introduce rounding errors. Use DECIMAL instead.
Character String Types
CHAR(n): Fixed-length non-Unicode string. n is the number of bytes (1 to 8000). If the string is shorter, it is padded with spaces. Storage is always n bytes.
VARCHAR(n | MAX): Variable-length non-Unicode string. n is max bytes (1 to 8000). MAX allows up to 2^31-1 bytes (2 GB). Storage is actual length + 2 bytes overhead.
TEXT: Deprecated, use VARCHAR(MAX).
For Unicode data (e.g., storing multiple languages), use NCHAR and NVARCHAR. NVARCHAR(MAX) stores up to 2 GB of Unicode text.
Date and Time Types
DATE: Stores date only (0001-01-01 to 9999-12-31), 3 bytes.
TIME(p): Stores time only (00:00:00.0000000 to 23:59:59.9999999), precision p (0-7), 3-5 bytes.
DATETIME: Stores date and time with 3.33 ms accuracy, range 1753-01-01 to 9999-12-31, 8 bytes.
DATETIME2(p): More precise, range 0001-01-01 to 9999-12-31, precision 0-7, 6-8 bytes.
SMALLDATETIME: Date and time with 1 minute accuracy, range 1900-01-01 to 2079-06-06, 4 bytes.
DATETIMEOFFSET(p): Like DATETIME2 but includes time zone offset, 8-10 bytes.
Binary Types
BINARY(n): Fixed-length binary data, n bytes.
VARBINARY(n | MAX): Variable-length binary data. MAX allows up to 2 GB.
IMAGE: Deprecated, use VARBINARY(MAX).
Other Types
UNIQUEIDENTIFIER: 16-byte GUID.
XML: Stores XML data up to 2 GB.
SQL_VARIANT: Can store any SQL Server data type except TEXT, NTEXT, IMAGE, timestamp, and itself.
HIERARCHYID: For hierarchical data.
GEOGRAPHY and GEOMETRY: Spatial data types.
What Are Constraints?
Constraints are rules enforced on data columns to ensure data integrity. They are defined at table creation or added later with ALTER TABLE. The main types are:
PRIMARY KEY
FOREIGN KEY
UNIQUE
CHECK
DEFAULT
PRIMARY KEY Constraint
A PRIMARY KEY uniquely identifies each row in a table. It enforces both UNIQUE and NOT NULL. A table can have only one PRIMARY KEY, but it can consist of multiple columns (composite key). The database automatically creates a unique clustered index on the primary key columns by default (unless specified otherwise).
Example:
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
Name VARCHAR(100)
);FOREIGN KEY Constraint
A FOREIGN KEY links two tables. It ensures that values in a column (or columns) match values in the primary key or unique constraint of another table. This enforces referential integrity. Foreign keys can also reference the same table (self-referencing).
Example:
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
CustomerID INT FOREIGN KEY REFERENCES Customers(CustomerID)
);UNIQUE Constraint
A UNIQUE constraint ensures that all values in a column are distinct. Unlike PRIMARY KEY, a table can have multiple UNIQUE constraints, and they allow one NULL value (in SQL Server). UNIQUE constraints create a unique nonclustered index by default.
CHECK Constraint
A CHECK constraint validates data based on a logical expression. It can reference multiple columns and can include subqueries in some cases.
Example:
CREATE TABLE Products (
ProductID INT PRIMARY KEY,
Price DECIMAL(10,2) CHECK (Price >= 0),
Quantity INT CHECK (Quantity >= 0)
);DEFAULT Constraint
A DEFAULT constraint provides a default value for a column when no value is specified in an INSERT. It can be a constant, a system function (e.g., GETDATE()), or a scalar expression.
Example:
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
HireDate DATETIME DEFAULT GETDATE()
);How Constraints Work Internally
When a DML statement (INSERT, UPDATE, DELETE) is executed, the SQL Server engine checks each constraint in a specific order: 1. DEFAULT constraints are applied first (for INSERT). 2. NULL/NOT NULL checks. 3. CHECK constraints are validated. 4. UNIQUE and PRIMARY KEY constraints check for duplicates. 5. FOREIGN KEY constraints check referential integrity.
If any constraint is violated, the entire statement is rolled back. For bulk operations, constraints can be disabled (WITH NOCHECK) but this is not recommended for production.
Interaction with Indexes
PRIMARY KEY and UNIQUE constraints automatically create indexes (clustered for PK, nonclustered for UNIQUE by default). Foreign keys do not create indexes, but it is best practice to index foreign key columns to improve join performance.
Data Type Precedence and Conversion
When comparing values of different data types, SQL Server uses data type precedence. The type with higher precedence is converted to. For example, INT has higher precedence than SMALLINT. Implicit conversions are allowed between compatible types (e.g., INT to DECIMAL). Explicit conversion requires CAST or CONVERT.
Choosing the Right Data Type
Use the smallest data type that can hold your data to save storage and improve performance.
For monetary values, use DECIMAL(19,4) or similar, not FLOAT.
For large text, use VARCHAR(MAX) or NVARCHAR(MAX).
For dates, prefer DATETIME2 over DATETIME for better precision and range.
Common Mistakes in DP-900
Confusing DECIMAL and FLOAT: DECIMAL is exact, FLOAT is approximate.
Thinking VARCHAR(MAX) is always better than VARCHAR(n): VARCHAR(MAX) has overhead and cannot be indexed.
Believing UNIQUE allows multiple NULLs: In SQL Server, UNIQUE allows only one NULL.
Assuming FOREIGN KEY automatically creates an index: It does not.
Using MONEY type: It is less flexible than DECIMAL and can cause issues with rounding.
Verifying Constraints
Use system views:
SELECT * FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS;
SELECT * FROM INFORMATION_SCHEMA.CHECK_CONSTRAINTS;
SELECT * FROM sys.check_constraints;Summary
Data types and constraints are the backbone of relational database schema design. They enforce data integrity at the database level, reducing application complexity. On the DP-900 exam, focus on understanding the purpose of each constraint and the characteristics of common data types, especially exact vs. approximate numeric, fixed vs. variable-length strings, and date/time types.
Define Table and Columns
Start by creating a table using CREATE TABLE. Specify each column name and its data type. For example, CREATE TABLE Customers (CustomerID INT, Name VARCHAR(100), Email VARCHAR(255)). At this point, no constraints are applied, so data integrity is not enforced. The database engine allocates storage based on data types: INT uses 4 bytes, VARCHAR(100) uses up to 100 bytes plus overhead. This step is purely structural.
Add Primary Key Constraint
Add a PRIMARY KEY constraint on CustomerID: ALTER TABLE Customers ADD CONSTRAINT PK_Customers PRIMARY KEY (CustomerID). The engine immediately checks that no existing rows have NULL or duplicate CustomerID. It then creates a unique clustered index on CustomerID, physically ordering the table by that column. Any future INSERT or UPDATE that violates uniqueness or nullability is rejected with error 2627.
Add Foreign Key Constraint
Create an Orders table with a foreign key referencing Customers: CREATE TABLE Orders (OrderID INT PRIMARY KEY, CustomerID INT, FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)). The engine verifies that all existing CustomerID values in Orders exist in Customers. During INSERT/UPDATE on Orders, it checks that the new CustomerID exists in Customers. If a DELETE on Customers would leave orphaned rows in Orders, it is blocked unless ON DELETE CASCADE is specified.
Add Check Constraint
Add a CHECK constraint to ensure Age is non-negative: ALTER TABLE Customers ADD CONSTRAINT CHK_Age CHECK (Age >= 0). The engine validates all existing rows; if any violate, the constraint is not added. On INSERT/UPDATE, the expression is evaluated. If Age is NULL, the constraint is not enforced (unless WITH NOCHECK is used). The expression can be complex, e.g., CHECK (Age BETWEEN 0 AND 150).
Add Default Constraint
Add a DEFAULT constraint for CreateDate: ALTER TABLE Customers ADD CONSTRAINT DF_CreateDate DEFAULT GETDATE() FOR CreateDate. When inserting a row without specifying CreateDate, the engine substitutes the default value. If a column has a default and is explicitly set to NULL (and allows NULL), the default is not used. Defaults can be scalar expressions or system functions.
In a production e-commerce system using Azure SQL Database, data types and constraints are critical for data integrity. For example, the Products table uses DECIMAL(10,2) for Price to ensure exact monetary calculations. A CHECK constraint ensures Price >= 0. The table has a PRIMARY KEY on ProductID (INT) and a UNIQUE constraint on ProductName to prevent duplicates. The Orders table has a FOREIGN KEY referencing Customers, with ON DELETE CASCADE to automatically remove orders when a customer is deleted (though this is often avoided in practice). A common misconfiguration is using FLOAT for Price, leading to rounding errors in financial reports. Another scenario is a logging table that stores timestamps. Using DATETIME2(3) instead of DATETIME provides better precision and avoids the 1753 year limit. For large text fields like product descriptions, NVARCHAR(MAX) is used to support Unicode. However, indexing such columns is problematic; a full-text index might be used instead. In a data warehouse scenario, dimension tables use surrogate keys (INT IDENTITY) as PRIMARY KEY, while fact tables use composite foreign keys referencing dimensions. Constraints ensure that no fact row references a non-existent dimension. When loading data with Azure Data Factory, constraints may be temporarily disabled for bulk inserts to improve performance, but this risks data quality. A real-world issue: forgetting to add a UNIQUE constraint on email columns in a Customer table can lead to duplicate customer records, causing confusion in CRM systems. Azure SQL Database provides tools like SQL Server Management Studio or Azure Data Studio to visually manage constraints. Performance-wise, too many constraints (especially CHECK constraints with complex expressions) can slow down DML operations. Best practice is to use constraints for critical business rules and handle complex validation in application logic. Lastly, when migrating on-premises SQL Server databases to Azure SQL, data type compatibility must be checked; for example, the deprecated TEXT type must be converted to VARCHAR(MAX).
The DP-900 exam tests your ability to identify the correct data type or constraint for a given scenario, not to write complex SQL. Key objectives are under "Describe relational data" (2.3) and "Describe how to work with relational data on Azure" (2.4). Common wrong answers include: choosing FLOAT for currency (it's approximate), using VARCHAR without considering Unicode (use NVARCHAR for multilingual), or thinking PRIMARY KEY allows NULLs (it does not). The exam loves to test edge cases: UNIQUE constraint allows only one NULL in SQL Server, but this is not standard across all databases. Also, DATETIME2 has a larger range and precision than DATETIME; SMALLDATETIME has minute-level accuracy. Another trap: MONEY type is exact but not recommended; DECIMAL is preferred. For constraints, remember that a table can have only one PRIMARY KEY but multiple UNIQUE constraints. FOREIGN KEY does not create an index automatically. CHECK constraints can reference multiple columns. DEFAULT constraints use functions like GETDATE() or CURRENT_TIMESTAMP. The exam may ask which constraint enforces entity integrity (PRIMARY KEY) or referential integrity (FOREIGN KEY). Numbers to memorize: INT range (-2^31 to 2^31-1), TINYINT (0-255), DECIMAL max precision 38, VARCHAR max 8000 (or MAX for 2GB). Also know that NVARCHAR uses 2 bytes per character. In Azure SQL Database, data types are the same as SQL Server. A common scenario question: "You need to store product prices up to $999,999.99 with two decimal places. Which data type?" Answer: DECIMAL(9,2) — 7 digits before decimal, 2 after, total 9. Another: "Which constraint ensures a column has unique values and no NULLs?" PRIMARY KEY. Also, be aware that adding a constraint with NOCHECK allows existing data to violate it, but future changes are checked. The exam may test that WITH NOCHECK is used for existing data only.
INT uses 4 bytes; TINYINT uses 1 byte (0-255); SMALLINT uses 2 bytes; BIGINT uses 8 bytes.
DECIMAL(p,s) is exact; FLOAT is approximate. Use DECIMAL for monetary values.
VARCHAR(MAX) stores up to 2 GB but has overhead and cannot be indexed; prefer VARCHAR(n) for fixed max length.
PRIMARY KEY enforces UNIQUE and NOT NULL; a table can have only one PRIMARY KEY.
UNIQUE constraint allows one NULL in SQL Server; multiple UNIQUE constraints per table are allowed.
FOREIGN KEY ensures referential integrity but does not create an index; index manually.
CHECK constraint validates data based on a logical expression; can reference multiple columns.
DEFAULT constraint provides a value when none is specified; can use functions like GETDATE().
DATETIME2 has larger range (0001-9999) and higher precision than DATETIME.
NVARCHAR uses 2 bytes per character; VARCHAR uses 1 byte per ASCII character.
These come up on the exam all the time. Here's how to tell them apart.
DECIMAL
Exact numeric type with fixed precision and scale.
No rounding errors; ideal for financial data.
Storage varies with precision (5-17 bytes).
Can specify precision up to 38 digits.
Slower arithmetic than FLOAT in some cases.
FLOAT
Approximate numeric type with floating-point representation.
Can introduce rounding errors; not for exact calculations.
Storage fixed at 4 bytes (REAL) or 8 bytes (FLOAT).
Range is larger: up to 1.79E+308.
Faster arithmetic for scientific computations.
VARCHAR(n)
Non-Unicode variable-length string.
1 byte per character (ASCII).
Maximum n is 8000 bytes.
Cannot store characters from multiple languages reliably.
Use when data is known to be ASCII only.
NVARCHAR(n)
Unicode variable-length string.
2 bytes per character (UTF-16).
Maximum n is 4000 characters (8000 bytes).
Can store any Unicode character (e.g., Arabic, Chinese).
Use for multilingual applications.
Mistake
VARCHAR(MAX) is always better than VARCHAR(n) because it can hold any length.
Correct
VARCHAR(MAX) has overhead (24 bytes for the pointer) and cannot be indexed like VARCHAR(n). Use VARCHAR(n) when the maximum length is known and under 8000 characters.
Mistake
FLOAT is fine for storing currency values.
Correct
FLOAT is approximate and can cause rounding errors (e.g., 0.1 + 0.2 != 0.3). Use DECIMAL for exact numeric values like money.
Mistake
UNIQUE constraint allows multiple NULL values.
Correct
In SQL Server, a UNIQUE constraint allows only one NULL value. This is implementation-specific; some databases allow multiple NULLs.
Mistake
FOREIGN KEY automatically creates an index on the referencing column.
Correct
FOREIGN KEY does not create an index. You should manually create an index on the foreign key column(s) to improve join performance.
Mistake
PRIMARY KEY and UNIQUE constraints are functionally identical.
Correct
PRIMARY KEY enforces NOT NULL and UNIQUE, and a table can have only one. UNIQUE allows one NULL and a table can have multiple UNIQUE constraints. Also, PRIMARY KEY creates a clustered index by default; UNIQUE creates a nonclustered index.
Reveal each answer, then mark whether you got it right. Score 60%+ to unlock the next chapter.
CHAR(n) is a fixed-length string; it always uses n bytes, padding with spaces if the string is shorter. VARCHAR(n) is variable-length; it uses actual length + 2 bytes overhead. Use CHAR when the length is fixed (e.g., state abbreviations), and VARCHAR when length varies. On the exam, know that CHAR can be more efficient for fixed-length data.
Yes, a FOREIGN KEY can reference a column that has a UNIQUE constraint. It does not have to be a PRIMARY KEY, but the referenced column(s) must have a unique constraint (PRIMARY KEY or UNIQUE) to ensure each row is uniquely identifiable.
In SQL Server (and Azure SQL Database), a UNIQUE constraint allows only one NULL value. This is a common exam trap. Other database systems like PostgreSQL allow multiple NULLs.
DATETIME has a range of 1753-01-01 to 9999-12-31 with 3.33 ms precision, using 8 bytes. DATETIME2 has a range of 0001-01-01 to 9999-12-31 with up to 100 ns precision (7 decimal places), using 6-8 bytes. DATETIME2 is recommended for new work.
The DEFAULT constraint provides a default value for a column when no value is specified in an INSERT statement. It can be a constant (e.g., 0), a system function (e.g., GETDATE()), or a scalar expression. If a column has a default and is explicitly set to NULL, the default is not used.
No, MONEY is not recommended because it can cause rounding issues and has limited precision. Use DECIMAL(19,4) or similar for monetary values. The exam may test that MONEY is exact but less flexible.
PRIMARY KEY creates a clustered index by default, which physically orders the table data. UNIQUE constraint creates a nonclustered index by default, which is a separate structure that points to the data rows. You can change the index type when creating the constraint.
You've just covered SQL Data Types and Constraints — now see how well it sticks with free DP-900 practice questions. Full explanations included, no account needed.
Done with this chapter?