I've written an improved version of the random string generator and have blogged about it here:
1: This article's code samples are all T-SQL. For other SQL dialects, some changes need to be done in the code, but the principles still hold.
2: An article called How to pre-populate a random strings pool was just published on my blog.
This post is a little bit of self-promotion, and a little bit of knowledge sharing. With that out of the way, let's begin.
Random strings can be very useful in many different use cases – from shortened URLs to encryption keys and salts, we use them everywhere. But why would we want to pre-populate a pool of them instead of generating them on the fly when you need them? Well, if you just want your strings to be random, then by all means, generate them on the fly. but if you want them to be both random and unique – then you need a pool of them. Why? because of what’s known as the birthday problem.
In a nutshell, the birthday problem concerns the probability that two people in a group of x people will share their birthday.
Surprisingly, that probability is a lot higher than most people would assume.
In a group of just 23 people, the chance of two people to share a birthday is 50% - and it climbs to 99.9% when the group has just 70 people.
This question has an answer by SQL Server expert Martin Smith.
I've edited his code a bit to also allow numbers in the random strings, and to pack it into a user defined inline table function. To do that, I had to first create a view, because you can't use the
newid operator in a user defined function. The view is very simple:
CREATE VIEW GuidGenerator AS SELECT Newid() As NewGuid
and the function based on Martin's code that use it is this:
CREATE FUNCTION dbo.RandomStringGenerator ( @Length int, @Count int -- Note: up to 1,000,000 rows ) RETURNS TABLE AS RETURN WITH E1(N) AS (SELECT N FROM (VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10)) V(N)), -- 10 E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100 E3(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000 E4(N) AS (SELECT 1 FROM E3 a, E2 b), --1,000,000 Tally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY @@SPID) FROM E4) SELECT TOP(@Count) ( SELECT TOP (@Length) CHAR( CASE Abs(Checksum(NewGuid)) % 3 WHEN 0 THEN 65 + Abs(Checksum(NewGuid)) % 26 -- Random upper case letter WHEN 1 THEN 97 + Abs(Checksum(NewGuid)) % 26 -- Random lower case letter ELSE 48 + Abs(Checksum(NewGuid)) % 10 -- Random digit END ) FROM Tally As t0 CROSS JOIN GuidGenerator WHERE t0.n <> -t1.n -- force SQL Server to re-evalulate the subquery for each row in the main query. FOR XML PATH('') ) As RandomString FROM Tally As t1
Basically, the idea is that you create an inline tally table using stacked common table expressions. You use this tally table as a row multiplier - in the inner query for the number of chars in the generated string (the
@Length argument), and in the outer query for the number of strings to return (the
Inside the inner query, you use
Abs(Checksum(NewGuid)) as a random number, and you base your return value on the ascii codes of chars - numbers from 65 to 90 are ascii codes for upper case letters, from 97 to 122 are lower case, and from 48 to 57 are for digits. You also use the same technique to determine whether to return a digit, an upper case letter or a lower case letter.
You use this function to generate random strings like this:
SELECT RandomString FROM dbo.RandomStringGenerator(20, 100);
This will return a list of 100 random strings of 20 chars each.
If you want to guarantee your random strings will also be unique, you have to pre-populate a table with them, and have a unique constraint/index on the column holding the strings.
You can read more about it in How to pre-populate a random strings pool over on my blog. There's also a link there to an online demo on rextester.