<<< These scripts were tested on MS SQL Server 2019 >>>
From time to time we need to generate huge data for testing. Without any data generating tool we can achieve it by using some SQL Scripts. these scripts are different from regular query scripts.
Table of Contents
Generate list of data
CREATE TABLE #accounts ( fname VARCHAR(20), lname VARCHAR(20))
GO
INSERT #accounts VALUES ('Fred', 'Flintstone')
GO 1000
SELECT TOP 10 * FROM #accounts
fname lname
Fred Flintstone
Fred Flintstone
Fred Flintstone
Fred Flintstone
Fred Flintstone
Fred Flintstone
Fred Flintstone
Fred Flintstone
Fred Flintstone
Fred Flintstone
ALTER TABLE #accounts ADD id INT IDENTITY(1,1)
SELECT TOP 10 * FROM #accounts
GO
fname lname id
Fred Flintstone 1
Fred Flintstone 2
Fred Flintstone 3
Fred Flintstone 4
Fred Flintstone 5
Fred Flintstone 6
Fred Flintstone 7
Fred Flintstone 8
Fred Flintstone 9
Fred Flintstone 10
ALTER TABLE #accounts ADD id INT
DECLARE @id INT
SET @id = 0
UPDATE #accounts
SET @id = id = @id + 1
OPTION ( MAXDOP 1 )
GO
SELECT * FROM #accounts
GO
fname lname id Fred Flintstone 1 Fred Flintstone 2 Fred Flintstone 3 Fred Flintstone 4 Fred Flintstone 5 Fred Flintstone 6 Fred Flintstone 7 Fred Flintstone 8 Fred Flintstone 9 Fred Flintstone 10
Note: MAXDOP is max degree of parallelism, set it to 1 to avoid duplicate value if running on multiple processors machine. Read article 1 for detail.
Generate sequence numbers
Using sys.all_columns
table
SELECT TOP (1000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM sys.all_columns
Generate huge Rows using CROSS JOIN
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
Note: the number of rows being generated are depending on the sys.all_columns
table size. It took 1 minute to generate 30 million rows in my machine. You could use top n to limit the number of rows being generated.
Direct generation
SELECT n FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) v(n)
select * from x
WITH x AS (SELECT n FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) v(n))
SELECT ones.n + 10*tens.n + 100*hundreds.n + 1000*thousands.n
FROM x ones, x tens, x hundreds, x thousands
ORDER BY 1
SELECT ones.n + 10*tens.n + 100*hundreds.n + 1000*thousands.n
FROM (VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) ones(n),
(VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) tens(n),
(VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) hundreds(n),
(VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) thousands(n)
WHERE ones.n + 10*tens.n + 100*hundreds.n + 1000*thousands.n BETWEEN 1000 AND 2000
ORDER BY 1
DECLARE @startnum INT=1000, @endnum INT=1050
;
WITH gen AS (
SELECT @startnum AS num
UNION ALL
SELECT num+1 FROM gen WHERE num+1<=@endnum
)
SELECT * FROM gen
option (maxrecursion 10000)
Generate random numbers
IF OBJECT_ID('tempdb..#SomeTestTable','U') IS NOT NULL DROP TABLE #SomeTestTable
DECLARE @NumberOfRows INT = 1000000,
@StartValue INT = 400,
@EndValue INT = 500
DECLARE @Range INT = @EndValue - @StartValue + 1;
SELECT TOP (@NumberOfRows)
SomeRandomInteger = ABS(CHECKSUM(NEWID())) % @Range + @StartValue,
SomeRandomFloat = RAND(CHECKSUM(NEWID())) * @Range + @StartValue
INTO #SomeTestTable
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
select * from #SomeTestTable order by 1
Summary
We could use system table to generate huge number of custom data by using CROSS JOIN on system tables. We also can generate huge number of random data by just using a formula. To limit the number of data being generated we can use SELECT TOP clause.
By the way, "Tally" table is another good way to generate data, more detail read this article: The "Numbers" or "Tally" Table: What it is and how it replaces a loop
Reference
1. Populate a SQL Server column with a sequential number not using an identity
2. What MAXDOP setting should be used for SQL Server
3. How to generate a range of numbers between two numbers?
Comments