SQL has a set of statements that are used to manipulate data called DML (Data Manipulation Language). I used to think that DML involves statements that only modify data however, DML also involves data retrieval. In this article, I'll be discussing data modification statements that are used to insert data namely INSERT and SELECT.
Inserting data
In this section, I'll discuss the different statements that are used to insert data into tables like INSERT VALUES, INSERT SELECT, INSERT EXEC, SELECT INTO, and BULK INSERT.
The INSERT VALUES statement
The INSERT VALUES statement is used to insert rows into a table based on the values you provide. Let's take a look at how the statement works using the snippets below (assuming you've set up your server). In this example, I'll be using a database called BikeStores
USE BikeStores
IF OBJECT_ID('dbo.Orders', 'U') IS NOT NULL
DROP TABLE dbo.Orders;
-- Create the Orders table in the dbo schema
CREATE TABLE dbo.Orders
(
ordered INT NOT NULL,
CONSTRAINT PK_Orders PRIMARY KEY,
orderdate DATE NOT NULL,
CONSTRAINT DFT_orderdate DEFAULT(SYSDATETIME()),
emptied INT NOT NULL,
custom VARCHAR(10) NOT NUL
)
We can now insert some data into the created table using INSERT VALUES statement as shown below.
INSERT INTO dbo.Orders (ordered, orderdate, empid, custid)
VALUES(10001, '20090314', 3, 'A')
Specifying the column names is optional but this allows you to control the value-column associations instead of relying on the order in which the columns were presented when the table was defined.
If you specify a value for a particular column then SQL Server will use that value. If you don't, SQL Server will first check if the column has a default value defined for that column and if it does, it'll use that value. If it doesn't and the column accepts NULL marks, then a NULL mark will be used. If a NULL mark is used in a column that does not accept NULL marks then the whole insert operation will fail. In the snippet below, data is inserted into dbo.Orders without the specifying any value for the orderdate column because a default value is specified for that column.
INSERT INTO dbo.Orders(ordered, emptied, custid)
VALUES (10002, 5, 'B');
The Enhanced VALUES clause
SQL Server 2008 and SQL Server 2012 support an enhanced VALUES clause that allows you to specify multiple rows separated by commas. For example, let's insert 4 rows into dbo.Orders.
INSERT INTO dbo.Orders
(ordered, orderdate, empid, custid)
VALUES
(10003, '20090213', 4, 'B'),
(10004, '20090214', 1, 'A'),
(10005, '20090213', 1, 'C'),
(10006, '20090215', 3, 'C');
The statement is processed as an atomic operation, therefore if any row isn't successfully entered into the table, the whole operation fails.
Here's another feature of the enhanced VALUES clause. You can use it create a derived table.
SELECT *
FROM ( VALUES
(10003, '20090213', 4, 'B'),
(10004, '20090214', 1, 'A'),
(10005, '20090213', 1, 'C'),
(10006, '20090215', 3, 'C'))
AS O(orderid, orderdate, empid, custid) -- An alias MUST be assigned to the table
The query produces the following output
orderid orderdate empid custid
--------- ---------- ------- ------
10003 20090213 4 B
10004 20090214 1 A
10005 20090213 1 C
10006 20090215 3 C
The INSERT SELECT statement
The INSERT SELECT statement is used to insert a set of rows into a target table with a SELECT statement. Let's take a look at an example.
INSERT INTO dbo.Orders(orderid, orderdate, empid, custid)
SELECT
FIRST_VALUE(order_id)
OVER(PARTITION BY order_id, customer_id, order_date
ORDER BY customer_id, order_id
ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING) + 4000 AS orderid,
FIRST_VALUE(order_date)
OVER(PARTITION BY order_id, customer_id, order_date
ORDER BY customer_id, order_id
ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING) AS orderdate,
2 AS empid,
FIRST_VALUE(customer_id)
OVER(PARTITION BY order_id, customer_id, order_date
ORDER BY customer_id, order_id
ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING) AS custid
FROM sales.joinOrdersAndItems
GROUP BY order_id, customer_id, order_date
ORDER BY orderid, custid, orderdate
OFFSET 0 ROWS FETCH FIRST 10 ROWS ONLY
The INSERT SELECT statement also allows you the option of specifying the target column names, and the recommendations provided earlier regarding specifying those names remains the same with this statement. This statement is also processed as an atomic operation.
Prior to the implementation of the enhanced feature of the VALUES clause, if you wanted a virtual table based on values, you'd have to use multiple SELECT statements, each returning a single row based on values, and unify the rows with the UNION ALL set operator. For example, the query below uses the SELECT statement and UNION ALL set operator to insert rows into dbo.Orders.
INSERT INTO dbo.Orders(orderid, orderdate, empid, custid)
SELECT 10007, '20090215', 2, 'B' UNION ALL
SELECT 10008, '20090215', 1, 'C' UNION ALL
SELECT 10009, '20090216', 2, 'C' UNION ALL
SELECT 10010, '20090216', 3, 'A';
The use of the SELECT statement without the FROM clause is non-standard. It's better to use a table constructor based on the VALUES clause.
The INSERT EXEC statement
This statement is used to insert the resultset of a stored procedure into a target table. It's very similar to the INSERT SELECT statement as shown in the query below.
-- Delete the procedure if it already exists
IF OBJECT_ID('sales.usp_getorders', 'P') IS NOT NULL
DROP PROC sales.usp_getorders;
GO
CREATE PROC sales.usp_getorders
@orderid AS INT
AS
SELECT orderid + 1 AS orderid, orderdate, empid, custid, qty
FROM dbo.Orders
WHERE orderid = @orderid
GO
INSERT INTO dbo.Orders (orderid, orderdate, empid, custid, qty)
EXEC sales.usp_getorders @orderid=4001;
By using this the INSERT EXEC statement, you can direct the results of the procedure to the dbo.Orders table.
The SELECT INTO statement
The SELECT INTO statement is a non-standard T-SQL statement that is used to create a new table that is populated by the result set of a query. Non-standard means that it is not part of the ISO and ANSI SQL standards. This statement cannot be used to insert data into an existing table. Use INSERT SELECT for that instead. Here's how to use the SELECT INTO statement to create a copy of dbo.Orders.
IF OBJECT_ID('dbo.newOrders', 'U') IS NOT NULL
DROP TABLE dbo.Orders;
SELECT orderid, orderdate, empid, custid
INTO dbo.newOrders
FROM dbo.Orders;
-- Clean up
DROP TABLE dbo.newOrders
The structure of the target table and the data it contains is based on the source table. The SELECT INTO statement copies the following from the source table to the target.
- Column names
- Types
- Nullability
- Identity Property
- The data
There are 4 things that the statement does not copy, namely:
- Constraints
- Triggers
- Permissions
- Indexes
You'll have to implement these features manually if required.
Logging
Minimal logging involves logging only the information that is required to recover a transaction without supporting point-in-time recovery. Check out this blog post to read more.
One of the benefits of the SELECT INTO statement is that you can run it in minimally logged mode by setting a database property called Recovery Model to FULL. The INSERT SELECT statement can also benefit from minimal logging based on certain criteria. For details see Prerequisites for Minimal Logging in Bulk Import
SELECT INTO with SETS
There's a slight variation in the syntax but the query does the same thing.
IF OBJECT_ID('dbo.uniontable') IS NOT NULL
DROP TABLE dbo.People;
SELECT first_name, last_name, email
INTO dbo.People
FROM sales.customers
UNION
select first_name, last_name, email
from sales.staffs
The BULK INSERT statement
The BULK INSERT statement is used to insert data from a file into an existing table. In the statement, you specify the target table, the source file, and options. Check out the documentation to find all the options. Let's add an order to dbo.Orders from a file called orders.txt.
# Create the file
echo "4022,'20160101',2,600,500" > orders.txt
Now run the following query to insert the order
BULK INSERT dbo.Orders FROM './orders.txt'
WITH
(
DATAFILETYPE = 'char',
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
);
The BULK INSERT statement can be run in minimally logged mode in certain scenarios provided some requirements are met.
Conclusion
In this article I discussed the use of the INSERT and SELECT statements to add data to tables. Furthermore, I highlighted a few standard and non-standard ways of adding data.