4

I am trying to create a function that takes a table and a variable number of columns as arguments, and then returns a table without rows that have duplicates on all of those columns. I am trying to figure out how to have a variable number of columns as arguments, and I have gathered that I will probably need a VARIADIC argument, but I am not sure how to implement it. What I have so far:

CREATE FUNCTION remove_duplicates(orig_table, VARIADIC sel_columns column)
RETURNS table AS $$
    SELECT * FROM 
        (SELECT *,
            count(*) over (partition by sel_columns) AS count
        FROM orig_table)
    WHERE count = 1;
$$ LANGUAGE SQL;

As an example, if I had a table like this:

cola | colb | colc
-------------------
a    | b    | 1
a    | b    | 2
a    | c    | 3
a    | d    | 4

I would like to run SELECT * FROM remove_duplicates(mytable, cola, colb) and get this result:

cola | colb | colc
-------------------
a    | c    | 3
a    | d    | 4

Thank you for the help. I'm using postgresql 9.4.9

1 Answer 1

3

You'll cannot get what you want with a simple SQL function, you need the power of a procedural language. A possible solution is:

CREATE OR REPLACE FUNCTION remove_duplicates(orig_table anyelement, VARIADIC sel_columns text[])
RETURNS SETOF anyelement AS $$
DECLARE
    orig_table_columns TEXT;
BEGIN
    SELECT array_to_string(array_agg(quote_ident(column_name)),',') INTO orig_table_columns FROM information_schema.columns WHERE table_name = CAST(pg_typeof(orig_table) AS TEXT);
    RETURN QUERY EXECUTE 'SELECT ' || orig_table_columns || ' FROM '
        || '(SELECT *, '
        || '    count(*) over (partition by ' || array_to_string(sel_columns, ',') || ') AS count '
        || 'FROM ' || pg_typeof(orig_table) || ') AS tmp '
        || ' WHERE count = 1 ';
END
$$ LANGUAGE PLPGSQL;

SELECT * FROM remove_duplicates(NULL::tests, 'cola', 'colb');

Don't forget to do your changes to avoid SQL Injection.

EDIT: For a very good explanation about functions with dynamic return types see Erwin's answer here.

Sign up to request clarification or add additional context in comments.

3 Comments

Thanks, this works! I'm a little confused as to why it's necessary to make orig_table type anyelement, and why the table argument has to be NULL that is cast to a specific table.
@AmadouKone anyelement because we're taking advantage of Polymorphic Types. Casting to null is just a trick to pass the type of desired table.
@AmadouKone Also note the return type is anyelement too. This makes the function polymorphic.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.