Hey folks! If you're like me, at some point you've probably experienced the frustration of sluggish database performance with PL/pgSQL functions in PostgreSQL. It's a common struggle, but don't worry. I've been down that road and come out the other side with some valuable insights. Today, I want to share some of the most common bottlenecks I've encountered and how I managed to fix them. We'll look at brief code examples of both the pitfalls and the solutions, so you can learn from my missteps and successes!
1. Row-by-Row Processing (Slow Loops)
The Issue:
Using loops (FOR, WHILE) to iterate over large datasets is like walking when you could be driving. It processes one row at a time, which is a massive time sink.
-- Inefficient Loop
FOR rec IN SELECT * FROM large_table LOOP
  PERFORM some_function(rec);
END LOOP;
The Solution:
Enter set-based operations! SQL is designed to handle entire sets of data in one fell swoop.
-- Efficient Set-Based Operation
INSERT INTO results_table
SELECT some_function(large_table.*)
FROM large_table;
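To make that concrete, here's a sketch of the rewrite I end up doing most often: replacing a per-row UPDATE loop with a single set-based UPDATE ... FROM. The orders and order_totals names are made up for illustration.
-- Row-by-row (slow): one UPDATE statement per iteration
FOR rec IN SELECT order_id, total FROM order_totals LOOP
  UPDATE orders SET total = rec.total WHERE id = rec.order_id;
END LOOP;
-- Set-based (fast): one UPDATE for the whole set
UPDATE orders o
SET total = t.total
FROM order_totals t
WHERE o.id = t.order_id;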
2. Unoptimized SQL Queries Within Functions
The Issue:
Ever written a query that just drags its feet? It might be missing indexes, full of unnecessary joins, or pulling more data than it needs.
-- Unoptimized Query
SELECT *
FROM big_table
WHERE some_column = 'value';
The Solution:
Always optimize your queries. Index those columns, select only what you need, and keep computations minimal.
-- Optimized Query
CREATE INDEX idx_some_column ON big_table(some_column);
SELECT specific_column
FROM big_table
WHERE some_column = 'value';
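A quick way to confirm the index is actually being used is EXPLAIN (or EXPLAIN ANALYZE, which really executes the query). Using the same big_table example from above:
-- Look for an Index Scan or Bitmap Index Scan instead of a Seq Scan
EXPLAIN ANALYZE
SELECT specific_column
FROM big_table
WHERE some_column = 'value';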
3. Excessive Context Switching Between PL/pgSQL and SQL
The Issue:
Switching back and forth between procedural code and SQL statements? That overhead piles up faster than you might think.
-- Excessive Context Switching
FOR rec IN SELECT * FROM big_table LOOP
  EXECUTE 'INSERT INTO results_table VALUES ($1)' USING rec.id;
END LOOP;
The Solution:
Reduce context switches by combining SQL operations and trimming down procedural code.
-- Reduced Context Switching
INSERT INTO results_table
SELECT id
FROM big_table;
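One more trick for cutting context switches: if a function is just a thin wrapper around a single query, write it in LANGUAGE sql instead of plpgsql. There's no procedural interpreter involved, and simple SQL functions can often be inlined by the planner. A minimal sketch, assuming a hypothetical users table with an active flag:
-- A thin wrapper as a plain SQL function instead of PL/pgSQL
CREATE FUNCTION active_user_ids() RETURNS SETOF integer
LANGUAGE sql STABLE AS $$
  SELECT id FROM users WHERE active;
$$;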
4. Not Utilizing Function Volatility Properly
The Issue:
Function volatility (IMMUTABLE, STABLE, VOLATILE) isn't just a buzzword. Declaring a function more volatile than it really is blocks planner optimizations, and declaring it less volatile than it really is can give you stale or plain wrong results.
-- Incorrect Volatility: this function reads table data, so it is NOT immutable
CREATE FUNCTION slow_func() RETURNS integer
LANGUAGE plpgsql
IMMUTABLE AS $$
BEGIN
  RETURN (SELECT count(*) FROM big_table); -- result depends on table contents
END;
$$;
The Solution:
Make sure you correctly declare the function's volatility so the query planner can do its thing.
-- Correct Volatility: the function only reads data, so STABLE is accurate
CREATE FUNCTION optimized_func() RETURNS integer
LANGUAGE plpgsql
STABLE AS $$
BEGIN
  RETURN (SELECT count(*) FROM big_table); -- reads data but never modifies it
END;
$$;
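For contrast, IMMUTABLE is the right label when the result depends only on the arguments and never on table data or settings. A minimal sketch:
-- Truly IMMUTABLE: same inputs always give the same output, no table access
CREATE FUNCTION add_tax(amount numeric, rate numeric) RETURNS numeric
LANGUAGE plpgsql
IMMUTABLE AS $$
BEGIN
  RETURN amount * (1 + rate);
END;
$$;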
5. Overuse of Exception Handling
The Issue:
Too many exception blocks? They're dragging you down. Every BEGIN ... EXCEPTION block sets up a subtransaction, which adds overhead, especially if you're using it for things that should rarely happen.
-- Overused Exception Handling
BEGIN
-- some code
EXCEPTION
WHEN others THEN
-- handle exception
END;
The Solution:
Be sparing with exception handling. Keep those blocks as tight as a drum.
-- Optimized Exception Handling
BEGIN
-- some code that seldom fails
EXCEPTION
WHEN unique_violation THEN
-- handle specific exception
END;
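For the unique_violation case in particular, you can often skip the exception block (and its subtransaction) entirely with INSERT ... ON CONFLICT, available since PostgreSQL 9.5. A sketch with a hypothetical users table:
-- No exception handling needed for duplicate keys
INSERT INTO users (email, name)
VALUES ('ann@example.com', 'Ann')
ON CONFLICT (email) DO NOTHING;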
6. Inefficient Data Type Casting
The Issue:
Frequent casting—especially within loops or big datasets—can feel like dragging weights around.
-- Inefficient Data Type Casting
FOR rec IN SELECT * FROM some_table LOOP
  PERFORM CAST(rec.some_field AS text);
END LOOP;
The Solution:
Keep declared types consistent with your column types so conversions aren't needed at all; when a cast is unavoidable, do it in the SQL statement rather than row by row in procedural code.
-- Efficient Data Type Usage: let the query do the conversion
FOR rec IN SELECT some_field::text AS some_field_text FROM some_table LOOP
  PERFORM rec.some_field_text;
END LOOP;
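And for the "keep it consistent" part, the cheapest cast is the one you never do: declare variables with the same type as the columns they hold. A minimal sketch with a hypothetical orders table:
DO $$
DECLARE
  v_amount numeric; -- same type as orders.amount, so no cast is needed
BEGIN
  SELECT amount INTO v_amount FROM orders WHERE id = 1;
END;
$$;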
7. Lack of Proper Indexing on Referenced Tables
The Issue:
Your queries are only as fast as your indexes. No index? Get ready for those full table scans to bog you down.
-- No Index
SELECT *
FROM big_table
WHERE some_column = 'value';
The Solution:
Create indexes on columns you frequently query.
-- Indexed Query
CREATE INDEX idx_some_column ON big_table(some_column);
SELECT *
FROM big_table
WHERE some_column = 'value';
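Not sure which tables are getting hammered by sequential scans? pg_stat_user_tables will tell you:
-- Tables with many sequential scans and few index scans are indexing candidates
SELECT relname, seq_scan, idx_scan
FROM pg_stat_user_tables
ORDER BY seq_scan DESC
LIMIT 10;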
8. Unnecessary Data Retrieval
The Issue:
Are you guilty of selecting more data than you need? Join the club. It's a major performance killer.
-- Unnecessary Data Retrieval
SELECT *
FROM users;
The Solution:
Only grab what you need to get the job done.
-- Streamlined Data Retrieval
SELECT id, name
FROM users;
9. Neglecting to Use Prepared Statements for Dynamic SQL
The Issue:
If you're running the same dynamic SQL over and over, that constant parsing and planning hurts performance.
-- Dynamic SQL Without Preparation
EXECUTE 'SELECT * FROM ' || table_name;
The Solution:
Inside PL/pgSQL, static SQL statements are already prepared and their plans cached automatically, so prefer them when you can. When you genuinely need dynamic SQL, bind values as parameters with USING instead of concatenating them into the string, and build identifiers with format() and %I (which also guards against SQL injection).
-- Parameterized Dynamic SQL
EXECUTE format('SELECT * FROM %I WHERE id = $1', table_name)
USING some_id;
10. Not Cleaning Up Temporary Data
The Issue:
Temporary tables and variables are useful, but if they hang around too long, they start to eat up your resources.
-- Unmanaged Temporary Data
CREATE TEMP TABLE temp_table AS
SELECT * FROM big_table;
The Solution:
Clean up when you’re done.
-- Properly Managed Temporary Data
CREATE TEMP TABLE temp_table AS
SELECT * FROM big_table;
-- Do something with temp_table
DROP TABLE temp_table;
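If the temp table only needs to live for the length of a transaction, you can let PostgreSQL clean up for you with ON COMMIT DROP:
-- Automatic cleanup at the end of the transaction
BEGIN;
CREATE TEMP TABLE temp_table ON COMMIT DROP AS
SELECT * FROM big_table;
-- Do something with temp_table
COMMIT; -- temp_table is dropped here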
11. Ignoring Transaction Scope
The Issue:
Long-running transactions can hold locks and bloat your transaction logs, especially when things get busy.
-- Long Running Transaction
BEGIN;
-- lots of operations
COMMIT;
The Solution:
Keep your transactions short and sweet.
-- Short Transaction
BEGIN;
-- minimal operations
COMMIT;
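One related gotcha: a PL/pgSQL function always runs inside its caller's transaction, so it can't commit partway through. Since PostgreSQL 11, procedures can, which makes batched commits possible. A sketch assuming a hypothetical big_table with a processed flag:
CREATE PROCEDURE process_in_batches()
LANGUAGE plpgsql AS $$
BEGIN
  LOOP
    UPDATE big_table SET processed = true
    WHERE id IN (SELECT id FROM big_table WHERE NOT processed LIMIT 1000);
    EXIT WHEN NOT FOUND;
    COMMIT; -- release locks and keep each transaction short
  END LOOP;
END;
$$;
Call it with CALL process_in_batches(); outside an explicit transaction block, otherwise the COMMIT will fail.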
12. Failure to Analyze and Vacuum Tables Regularly
The Issue:
Outdated statistics lead to poor query plans, and dead tuples bloat your tables. It's not pretty.
The Solution:
Regularly run ANALYZE and VACUUM to keep things humming along smoothly. (Autovacuum normally handles this, but make sure it's enabled and keeping up with your busiest tables.)
-- Regular Maintenance
ANALYZE;
VACUUM;
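For a specific hot table, you can do both in one pass and check when maintenance last ran:
-- Maintain one table and see when it was last auto-vacuumed/analyzed
VACUUM (ANALYZE) big_table;
SELECT relname, last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'big_table';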
13. Insufficient Hardware Resources
The Issue:
Overloaded CPU, memory, or disk I/O can slow things to a crawl, especially under heavy loads.
The Solution:
Monitor your resource usage and scale up your hardware as needed.
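Before blaming the hardware, it's worth checking what's actually running; pg_stat_activity gives a quick picture of long-running statements:
-- Long-running statements are often the real culprit
SELECT pid, state, now() - query_start AS runtime, left(query, 80) AS query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY runtime DESC NULLS LAST;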
By staying mindful of these common performance bottlenecks and applying the fixes, I've significantly boosted the efficiency of my PL/pgSQL functions. I hope this helps you as much as it helped me. Happy coding, and may your queries always be fast! 🚀
Top comments (2)
Fantastic resource! I found EXISTS can be used in very creative/efficient ways. INTERSECT/EXCEPT can also be extremely useful since they do null-safe comparisons.
Thanks Martin!
And I agree, it can indeed! I often use it for single-shot checks at the beginning of functions to verify preconditions before proceeding with the rest of the function. I probably could have worked that into another issue/solution, but I haven't come across one that's very common.