Snowflake: Replace multiple patterns using single Regex_replace
Andrew Henderson
Requirement: Want to achieve all below into one regex_replace
- Replace comma, tab and new line character to empty
- Replace double quotes into a single quote
- Replace pipe into a dash
Also would like to know if replace performance is faster than regex_replace
Below is what I tried out
with c as (select ' abc \n def\n' as t union select 'abc\tdef' as t) select t,regexp_replace(t, '[,\t\n]','') from c; with c as (select ' abc de"f\n' as t union select 'abc\td|ef' as t) select t,replace(replace(replace(t, '\n',' '),'"','\''),'|','-') from c; 0 1 Answer
The regexp_replace function will only match one pattern and replace it with a literal value, so you can't make it selectively replace depending on what it matches. You can write a UDF to do that in one step, which will make your SQL a lot neater, particularly if you have to put it in a lot of statements. This one does all three replacements:
create or replace function CLEANUP_STRING(STR string)
returns string
language javascript as
$$ var s = STR.replace(/[,\t\n]/g, ""); // Replace comma, tab and new line character to empty s = s.replace(/"/g, "'"); // Replace double quote with single quote s = s.replace(/\|/g, "-"); // Replace pipe with dash return s;
$$;
-- This will display in black font, showing that the line break is gone.
select cleanup_string('This is a "double quoted string". This has a | pipe. This has a new \n line. This has a, comma.'); 2