Sometimes you need to have hundreds of thousands or even millions of rows to test the performance of your database and queries because we can't explore performance if we only have 10 rows, so before playing with performance we need to make some test data.
But can't we just use a shell script or a (your favorite language) script, yes you can but try generating 1M rows with your script then save to the database but your script memory will not be able to take 1M rows in memory so you will have to create a stream or generate chunks of data then save to the database or a CSV file then import it to the database, but why all that while you have amazing functions from PostgreSQL to help you.
The repeat()
Function
A function that takes two parameters repeat(string to repeat, number of times to repeat)
to repeat a string.
Example:
INSERT INTO test (stringColumn) VALUES (repeat("Wow ", 3));
This will insert a row with a stringColumn with a value of Wow Wow Wow
The generate_series()
Function
A function that takes two required parameters and an optional one generate_series(start, stop, increment by [default is 1])
to generate a series of numbers each in a row.
Example:
INSERT INTO test (id) VALUES (generate_series(1, 1000));
This will insert 1000 rows each with an id equal to the previous id incremented by one.
The random()
Function
A function that generates a random float from 0 to 1.
Example:
INSERT INTO test (id) VALUES (trunc(random()*100));
This will generate an id of three digits from 0 to 100 (trunc to make the float an integer).
ALL TOGETHER:
Let's create a players table:
CREATE TABLE players (id INT, about TEXT, age INT);
Then let's insert 1000 rows with some random data:
INSERT INTO players (id, about, age) VALUES (generate_series(1, 1000), repeat('A cool player. ', 2) || 'My number is ' || trunc(random()*1000), trunc(random()*10 * 2 + 10));
I hope you like this article, you can follow my LinkedIn at:
https://www.linkedin.com/in/amr-elmohamady
Oldest comments (0)