6

I Have a problem with oracle split query.

While splitting comma separated data into multiple rows using connect by and regular expression in oracle query I am getting more duplicate rows. for example actually my table having 150 rows in that one two rows having comma separated strings so overall i have to get only 155 rows but i am getting 2000 rows. If i use distinct its working fine but i dont want duplicate rows in query result.

I tried the following query however it's generating duplicate rows in query result:

WITH CTE AS (SELECT 'a,b,c,d,e' temp,1 slno  FROM DUAL
              UNION 
              SELECT 'f,g',2 from dual
              UNION 
               SELECT 'h',3 FROM DUAL)

SELECT TRIM(REGEXP_SUBSTR( TEMP, '[^,]+', 1, LEVEL)) ,SLNO FROM CTE 
CONNECT BY LEVEL <= LENGTH(REGEXP_REPLACE(temp, '[^,]+')) + 1

EDIT

The above select query is only able to split a single comma delimited string, however, it produces duplicate rows when executed on a table with multiple rows. How to restrict the duplicate rows?

2
  • This question is not a duplicate of the other question marked as duplicate. The other question is splitting a single row, and this question is about multiple rows. Commented Mar 12, 2015 at 8:20
  • The accepted answer uses the condition DBMS_RANDOM.VALUE IS NOT NULL which is inappropriate. It just prevents the cyclic loop, however a straight forward question would come up as How and when dbms_random.VALUE can be null? Logically, it will never be NULL. More appropriate solution is to use sys.odciNumberList and prevent the cyclic loop. Please see my answer below. Commented Mar 25, 2015 at 12:11

6 Answers 6

9

Finally I came up with this answer

WITH CTE AS (SELECT 'a,b,c,d,e' temp, 1 slno FROM DUAL
              UNION
              SELECT 'f,g' temp, 2 slno FROM DUAL
              UNION
              SELECT 'h' temp, 3 slno FROM DUAL)
SELECT TRIM(REGEXP_SUBSTR(temp, '[^,]+', 1, level)), slno
FROM CTE
CONNECT BY level <= REGEXP_COUNT(temp, '[^,]+')
    AND PRIOR slno = slno
    AND PRIOR DBMS_RANDOM.VALUE IS NOT NULL
Sign up to request clarification or add additional context in comments.

2 Comments

Regarding your comment here, why does this work and does not produce dupliate lines? Ther random seems to prevent a cyclic loop.
Logically, PRIOR DBMS_RANDOM.VALUE IS NOT NULL makes no sense. How and when dbms_random.VALUE can be null? The better way is using sys.odciNumberList. See few demos here lalitkumarb.wordpress.com/2015/03/04/…
5

Try like this,

WITH CTE AS (SELECT 'a,b,c,d,e' temp,1 slno  FROM DUAL
              UNION 
              SELECT 'f,g',2 from dual
              UNION 
              SELECT 'h',3 FROM DUAL)
     SELECT regexp_substr (temp, '[^,]+', 1, rn)temp, slno
     FROM   cte
     CROSS JOIN 
     (
          SELECT ROWNUM rn
          FROM  (SELECT MAX (LENGTH (regexp_replace (temp, '[^,]+'))) + 1 max_l
                 from cte
                 )
          connect by level <= max_l
         )
     WHERE regexp_substr (temp, '[^,]+', 1, rn) IS NOT NULL
     order by temp;

Comments

2

without using connect by:

WITH CTE AS (SELECT 'a,b,c,d,e' temp,1 slno  FROM DUAL
      UNION 
      SELECT 'f,g',2 from dual
      UNION 
       SELECT 'h',3 FROM DUAL
)
,x as (
  select
  ','||temp||',' temp
  ,slno
  from CTE
)
,iter as (SELECT rownum AS pos
    FROM all_objects
)
select
SUBSTR(x.temp
  ,INSTR(x.temp, ',', 1, iter.pos) + 1
  ,INSTR(x.temp, ',', 1, iter.pos + 1)-INSTR(x.temp, ',', 1, iter.pos)-1
) temp
,x.slno
from x, iter
where iter.pos < = (LENGTH(x.temp) - LENGTH(REPLACE(x.temp, ','))) - 1;

1 Comment

+1 much more simple and understandable - join each comma-separated row with a dummy table big enough to contain more rows than values in each of your comma-separated lists (so you get cartesian result), and use either substr as above or regexp_substr. great.
2

The accepted answer uses the condition DBMS_RANDOM.VALUE IS NOT NULL which is inappropriate. It just prevents the cyclic loop, however a straight forward question would come up as How and when dbms_random.VALUE can be null? Logically, it will never be NULL.

More appropriate solution is to use sys.odciNumberList and prevent the cyclic loop.

For example,

Set up

SQL> CREATE TABLE t (
  2    ID          NUMBER GENERATED ALWAYS AS IDENTITY,
  3    text        VARCHAR2(100)
  4  );

Table created.

SQL>
SQL> INSERT INTO t (text) VALUES ('word1, word2, word3');

1 row created.

SQL> INSERT INTO t (text) VALUES ('word4, word5, word6');

1 row created.

SQL> INSERT INTO t (text) VALUES ('word7, word8, word9');

1 row created.

SQL> COMMIT;

Commit complete.

SQL>
SQL> SELECT * FROM t;

        ID TEXT
---------- ----------------------------------------------
         1 word1, word2, word3
         2 word4, word5, word6
         3 word7, word8, word9

SQL>

Required query:

SQL> SELECT t.id,
  2         trim(regexp_substr(t.text, '[^,]+', 1, lines.column_value)) text
  3  FROM t,
  4    TABLE (CAST (MULTISET
  5    (SELECT LEVEL FROM dual CONNECT BY LEVEL <= regexp_count(t.text, ',')+1)
  6                 AS sys.odciNumberList
  7                )
  8          ) lines
  9  ORDER BY id
 10  /

        ID TEXT
---------- --------------------------------------------------
         1 word1
         1 word2
         1 word3
         2 word4
         2 word5
         2 word6
         3 word7
         3 word8
         3 word9

9 rows selected.

An alternate solution using XMLTABLE:

SQL> SELECT id,
  2         trim(COLUMN_VALUE) text
  3  FROM t,
  4    xmltable(('"'
  5    || REPLACE(text, ',', '","')
  6    || '"'))
  7  /

        ID TEXT
---------- ------------------------
         1 word1
         1 word2
         1 word3
         2 word4
         2 word5
         2 word6
         3 word7
         3 word8
         3 word9

9 rows selected.

SQL>

There are many ways to achieve the task, like a MODEL clause. For more examples see Split comma delimited strings in a table

4 Comments

can this be extended to multiple columns ..? I have a table with 3 columns in which 2 of them are , separated , I need to convert them to multiple rows . COL1 = a,b,c COL2 = x,y,z and I need 9 rows a-x , a-y , a-z, b-x, b-y,b-z , x-z,c-y,c-z -- Is this something possible ?
@NuthanKumar Just concatenate the two columns in the regular expression of select.
I just came across this answer. I don't understand your objection to using prior dbms_random.value() is not null. (Note that the whole point of this is the PRIOR operator.) Of course that value is never NULL; the point of this condition is not to provide a way to break out of the recursion (other conditions play that role), but to add a non-repeating pseudo-column to the ones Oracle uses to find cycles in the hierarchical query. These days PRIOR sys_guid() is not null is seen more often, playing exactly the same role.
@mathguy Because logically it is never NULL, sys.odciNumberList makes more sense.
0

Can use the below query to convert comma separated values in rows

 SELECT trim(x.column_value.extract('e/text()')) COLUMNS
 from t t, table (xmlsequence(xmltype('<e><e>' || replace(valuestring,':','</e><e>')||  
 '</e></e>').extract('e/e'))) x   );  

Comments

0

Adding an unique clause does the trick:

    WITH cte AS (
        SELECT 'a,b,c,d,e' temp, 1 slno FROM DUAL UNION 
        SELECT 'f,g',2 FROM DUAL UNION SELECT 'h',3 FROM DUAL
    ) SELECT UNIQUE(slno),REGEXP_SUBSTR(temp,'[^,]+', 1, LEVEL)temp FROM cte
    CONNECT BY LEVEL<=REGEXP_COUNT(temp, '[^,]+') ORDER BY slno;

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.