
We are converting a database from SQL Server 2022 to Postgres 18. Our system has a lot of code in the database, and many of our queries use local temp tables far more than CTEs. This is because the queries are extremely large and complex, and they are broken down into stages so that the SQL Server optimizer has a better chance of choosing an optimal execution plan. I have read that in Postgres, temp tables can cause performance issues under heavy load due to design limitations: the system catalogs can become bloated, which then requires frequent VACUUM FULL runs. I have no idea how heavy the load has to be to cause this. Has anyone experienced this in a heavily used Postgres system, and if so, was the solution to replace the temp tables with CTEs?

We are at the start of the migration process, which may take another year, but we want to know whether our current reliance on temp tables over CTEs for complex queries means there are problems ahead.
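For concreteness, here is a minimal sketch of the two query shapes under discussion; the table and column names are invented for illustration:

```sql
-- Temp-table style: each stage is its own statement, mirroring the
-- existing SQL Server code.
CREATE TEMP TABLE big_orders AS
SELECT customer_id, sum(amount) AS total
FROM orders
GROUP BY customer_id
HAVING sum(amount) > 1000;

SELECT c.region, b.total
FROM big_orders b
JOIN customers c USING (customer_id);

-- CTE style: a single statement. AS MATERIALIZED forces the CTE to be
-- computed once, much like a temp table, but the planner has no
-- statistics for its result.
WITH big_orders AS MATERIALIZED (
    SELECT customer_id, sum(amount) AS total
    FROM orders
    GROUP BY customer_id
    HAVING sum(amount) > 1000
)
SELECT c.region, b.total
FROM big_orders b
JOIN customers c USING (customer_id);
```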

  • Note that local temp tables work the same way, but the global variant currently requires the pgtt extension, which emulates them using unlogged tables. Having worked with fairly large, temp-heavy OLAP workloads on Postgres: you just tune your setup to minimise bloat and refresh more often (see the examples from @Laurenz Albe), and it won't clog up. Depending on how you use the temp tables, you might also want to run VACUUM/ANALYZE/REINDEX/CLUSTER cycles directly from the app or operator. Commented Oct 9 at 12:07
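A rough sketch of such a manual maintenance cycle, assuming a hot working table; the table and index names here are hypothetical:

```sql
-- Run periodically from the application or an operator job.
VACUUM (ANALYZE) work_queue;               -- reclaim dead rows, refresh stats
REINDEX TABLE work_queue;                  -- rebuild bloated indexes
CLUSTER work_queue USING work_queue_pkey;  -- rewrite the table in index order
```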

1 Answer


Whenever a temporary table is created, its metadata are stored in the system catalogs pg_class, pg_attribute and pg_depend. Whenever a temporary table is dropped or the database session terminates, these entries get deleted. A deleted row then has to be cleaned up by the garbage collection process “autovacuum”, so if that background process is not tuned to run fast enough, those system catalog tables can become bloated and inefficient.
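You can watch this happen directly; a small sketch (the table name is just an example):

```sql
-- Creating a temp table inserts rows into the system catalogs.
CREATE TEMP TABLE scratch (id int, payload text);

-- Its metadata now appears in pg_class, under a per-session temp schema.
SELECT c.relname, n.nspname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = 'scratch';

-- Dropping the table deletes those catalog rows; the dead row versions
-- linger until autovacuum cleans them up.
DROP TABLE scratch;
```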

If you create dozens or hundreds of temporary tables per second, make sure that you have no long-running database transactions, raise the parameter autovacuum_vacuum_cost_limit, and lower autovacuum_vacuum_cost_delay to make autovacuum run faster.
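As a rough sketch, that tuning could look like this; the values are illustrative only and should be adapted to your hardware and workload:

```sql
-- Let autovacuum do more work per round (the effective default cost
-- limit is 200, via vacuum_cost_limit) and pause less between rounds
-- (the default delay is 2ms).
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 2000;
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '1ms';
SELECT pg_reload_conf();  -- both settings take effect on a reload
```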

Other than that, there is no problem with that strategy. Make sure that you run an explicit ANALYZE on each temporary table after populating it. Using CTEs may be an alternative, but PostgreSQL cannot gather optimizer statistics for a CTE.
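A minimal example of that pattern, with invented table and column names:

```sql
CREATE TEMP TABLE stage1 (customer_id int, total numeric);

INSERT INTO stage1
SELECT customer_id, sum(amount)
FROM orders
GROUP BY customer_id;

CREATE INDEX ON stage1 (customer_id);

-- autovacuum cannot access temporary tables, so they are never
-- auto-analyzed; gather statistics explicitly before querying.
ANALYZE stage1;
```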

  • I did not know about running ANALYZE after temp table population and index creation. I have a lot of places to add that. It is not something required in SQL Server because of auto-update statistics. Commented Oct 13 at 8:00
