Skip to main content
Database Administration & Troubleshooting

Union In SQL: Syntax, Examples, Flags & Production Guide

union in sql is the set operator that combines result sets from two or more SELECT statements into a single result set, removing duplicates by default. It requires matching column counts and compatible data types across all queries.

SELECT city FROM Table1 UNION SELECT city FROM Table2;

This operator is part of the ANSI SQL standard and is implemented across major database engines. Use it to merge data from separate tables or partitions into one unified result set.

union in sql Syntax Reference

-- Basic UNION (removes duplicates)
SELECT column1, column2, ...
FROM table1
UNION
SELECT column1, column2, ...
FROM table2;

-- UNION ALL (preserves all duplicates)
SELECT column1, column2, ...
FROM table1
UNION ALL
SELECT column1, column2, ...
FROM table2;

-- UNION with ORDER BY (applied to final result set)
SELECT city FROM Table1
UNION
SELECT city FROM Table2
ORDER BY city;

-- UNION with SELECT INTO (creates new table from combined results)
SELECT column1, column2
INTO new_table
FROM table1
UNION
SELECT column1, column2
FROM table2;

Tested on SQL Server 2022 (16.x) and Azure SQL Database. Syntax applies to T-SQL, PostgreSQL, MySQL 8.x, and most ANSI SQL-compliant engines.

union in sql Rapid Reference Cheat Sheet

Action SQL Clause Provider/Context Key Flag/Modifier Impact/Result
Combine with deduplication UNION All ANSI SQL N/A Removes duplicate rows; performs a DISTINCT operation on the combined set
Combine with all rows UNION ALL All ANSI SQL ALL Preserves duplicates; no sort or dedup overhead — faster on large sets
Sort final result ORDER BY All ANSI SQL ASC / DESC Must reference column names or ordinals from the first SELECT; applied after union
Create table from union SELECT … INTO SQL Server, Azure SQL, PostgreSQL INTO destination_table Materializes the union result into a new permanent or temp table
Filter individual queries WHERE per SELECT All SQL dialects Comparison operators Reduces per-query rows before the union; improves performance vs. filtering after
See also  Concatenation In SQL Query — Syntax & Examples

Advanced Implementation & Parameters

Data Type Compatibility and Precedence

Column definitions across SELECT statements in a UNION do not have to be identical, but they must be implicitly convertible. When data types differ, SQL Server determines the resulting type using data type precedence rules. For example:

SELECT 1 AS value
UNION
SELECT 1.5;
-- Result column type is NUMERIC(2,1) because NUMERIC has higher precedence than INT

Column Names and Aliases

The final result set inherits column names from the first SELECT statement. Use aliases there to control output column names:

SELECT city AS location FROM customers
UNION
SELECT city FROM suppliers;  -- output column will be 'location'

UNION with ORDER BY – Critical Restriction

ORDER BY can only appear once, at the end of the entire UNION query. It applies to the combined result set, not individual SELECT statements. You may reference either a column name from the first SELECT or an ordinal position (1-based):

SELECT city, state FROM Table1
UNION
SELECT city, state FROM Table2
ORDER BY state DESC, city ASC;
-- ORDER BY applies to the final combined set

Effect of ALL and Parentheses – Multi-Set Combinations

Parentheses control evaluation order when combining three or more sets. UNION ALL preserves all rows from its left and right operands, while UNION deduplicates its combined result. Mixing them without parentheses can yield unexpected results:

-- Example D from official docs: three SELECTs with ALL and parentheses
SELECT 'A' AS col
UNION ALL
SELECT 'B'
UNION
SELECT 'A';
-- The inner UNION ALL produces ('A','B'), then UNION deduplicates: ('A','B')

-- With parentheses to force evaluation order
SELECT 'A' AS col
UNION ALL
(SELECT 'B'
 UNION
 SELECT 'A');
-- Inner UNION produces ('A','B'), then UNION ALL keeps both: ('A','A','B')

UNION in Subqueries and CTEs

A UNION can appear inside a Common Table Expression (CTE) or derived table, enabling multi-source aggregations before further processing:

WITH combined AS (
    SELECT order_id, amount FROM orders_2023
    UNION ALL
    SELECT order_id, amount FROM orders_2024
)
SELECT customer_id, SUM(amount) AS total_spent
FROM combined
GROUP BY customer_id;

Error Resolution & Troubleshooting

Error Code / Signal Root Cause Remediation Command
Msg 205 (SQL Server)
Column count mismatch
Number of columns in each SELECT is not equal
-- Ensure both SELECTs have same number of columns
SELECT col1, col2 FROM t1
UNION
SELECT col1 FROM t2;  -- ERROR
-- Fix: include a placeholder or adjust
SELECT col1, col2 FROM t1
UNION
SELECT col1, NULL AS col2 FROM t2;
Msg 257 (SQL Server)
Implicit conversion error
Data types are not compatible (e.g., INT vs VARCHAR)
-- Convert to a common type explicitly
SELECT CAST(id AS VARCHAR(10)) AS id FROM t1
UNION
SELECT id FROM t2;  -- if t2.id is VARCHAR
ORA-01790 (Oracle)
Expression must have same data type
Corresponding columns have incompatible types
-- Use CAST or CONVERT to align types
SELECT CAST(emp_id AS NUMBER) FROM employees
UNION
SELECT manager_id FROM departments;
ORDER BY not allowed in subquery of UNION ORDER BY placed inside an individual SELECT within a UNION (except with TOP/OFFSET)
-- Remove ORDER BY from inner SELECT; apply only at the end
SELECT TOP 5 name FROM t1 ORDER BY name   -- ERROR inside UNION
UNION
SELECT name FROM t2
ORDER BY name;  -- correct
Duplicate column name (MySQL)
Ambiguous column in ORDER BY
ORDER BY references a column name that appears with different values across the union
-- Use column alias from first SELECT or ordinal
SELECT city AS location FROM t1
UNION
SELECT town AS location FROM t2
ORDER BY location;

Performance Pitfalls

  • Unnecessary DISTINCT: UNION (without ALL) forces a sort + distinct. If duplicates are acceptable or impossible, always use UNION ALL to avoid the sort overhead.
  • Large intermediate sorts: When combining millions of rows, UNION may spill to disk. Monitor tempdb (SQL Server) or work_mem (PostgreSQL). Increase sort memory or switch to UNION ALL + application-level dedup.
  • Indexing strategy: Each SELECT in a UNION can use separate indexes. Ensure covering indexes exist on the filter columns of each table.
See also  SQL Server DATEADD CLI Reference: Syntax & Troubleshooting

Production-Grade Implementation

When to Use UNION vs UNION ALL

  • UNION ALL: Use when you know duplicates are impossible (e.g., non-overlapping date partitions) or when duplicates are acceptable. Almost always preferred for performance.
  • UNION: Use only when you need a distinct set and the data sources genuinely overlap. Avoid as a substitute for a proper JOIN — if you’re combining rows from the same table, a GROUP BY or DISTINCT on a single query is more efficient.

Multi-Source Data Consolidation Pattern

-- Consolidating sales data from regional tables with deduplication
SELECT order_id, amount, 'East' AS region FROM sales_east
UNION ALL
SELECT order_id, amount, 'West' AS region FROM sales_west
UNION ALL
SELECT order_id, amount, 'Central' AS region FROM sales_central;

Security and Least Privilege

Each SELECT in a UNION executes under the permissions of the calling user. To reduce surface area:

  • Grant SELECT only on required columns — avoid SELECT * inside unions.
  • Use views to expose a limited set of columns from each source table, then UNION the views.
  • In Azure Synapse or SQL Server with row-level security, each SELECT respects the predicate — test with overlapping users.

Automation and CI/CD

When deploying UNION queries as part of stored procedures or views, version-control the DDL and include a test that validates:

  • Column count parity between all SELECT statements.
  • Data type compatibility (use sys.columns / information_schema.columns to compare).
  • No accidental ORDER BY inside subqueries.
-- Automated validation query (SQL Server)
SELECT 
    s.name AS schema_name,
    v.name AS view_name,
    c.column_id,
    c.name AS column_name,
    TYPE_NAME(c.user_type_id) AS data_type
FROM sys.views v
JOIN sys.schemas s ON v.schema_id = s.schema_id
JOIN sys.columns c ON c.object_id = v.object_id
WHERE v.name = 'ConsolidatedSalesView'
ORDER BY c.column_id;

Frequently Asked Questions

What is the difference between UNION and UNION ALL in SQL?

Answer: UNION removes duplicate rows from the combined result set; UNION ALL returns all rows including duplicates.

See also  Select Distinct Count SQL — Complete CLI Reference, Syntax

UNION performs an implicit DISTINCT operation requiring extra sorting and de-duplication, making it slower on large datasets. UNION ALL skips dedup, ideal when duplicates are acceptable or impossible. Use UNION only when uniqueness is required. Example:

SELECT city FROM customers UNION SELECT city FROM suppliers;

When should I use the ORDER BY clause with UNION?

Answer: Apply ORDER BY only after the final SELECT statement, not within individual queries.

Putting ORDER BY on a subquery in UNION can cause syntax errors in most SQL engines (except MySQL with passthrough). Correct usage:

SELECT col FROM t1 UNION SELECT col FROM t2 ORDER BY col;

Each subquery can use ORDER BY if wrapped in parentheses with LIMIT (e.g., PostgreSQL), but final output sorting must be at the end.

How do I fix “ORA-00932: inconsistent datatypes” when using UNION?

Answer: Ensure all selected columns in each UNION branch have compatible or explicitly cast data types.

This Oracle error occurs when column types mismatch (e.g., VARCHAR2 vs NUMBER). Cast explicitly:

SELECT CAST(id AS VARCHAR2(10)) AS id FROM t1 UNION SELECT CAST(id AS VARCHAR2(10)) FROM t2;

For other engines, similar logic applies: use CAST or CONVERT to align types across all queries.

Does the UNION operator work on all major cloud database services (AWS RDS, Azure SQL, GCP Cloud SQL)?

Answer: Yes, UNION is ANSI SQL standard and supported across AWS RDS (MySQL, PostgreSQL, SQL Server), Azure SQL Database, and GCP Cloud SQL.

However, behavior differs: MySQL has an optimizer hint for UNION; Azure SQL supports UNION in columnstore indexes; GCP Cloud SQL for PostgreSQL treats NULL ordering consistently. Always test for engine-specific limits (e.g., column count, max statement size). No cross-cloud performance guarantees.

What is the fastest way to combine large datasets with UNION in SQL?

Answer: Use UNION ALL instead of UNION when duplicates are irrelevant, and ensure proper indexing on all source SELECT columns.

For extremely large tables, avoid UNION’s implicit DISTINCT. If dedup is required, use UNION ALL and then apply DISTINCT via

SELECT DISTINCT col FROM (SELECT col FROM t1 UNION ALL SELECT col FROM t2) AS u;

Additionally, filter early with WHERE clauses and consider using temporary tables or EXCEPT/INTERSECT operators in databases that support them.