How to escape non-escaped characters with sed?
Matthew Harrington
I would like to use sed to escape all non-escaped occurrences of a character, say "&", in a string contained in variable text. What I do is
text='one&two\&three'
sed 's/\([^\\]\)&/\1\\&/g' <<< "${text}"
and I expect the output to be one\&two\&three. However, what I get is
one\e&two\&three
What I (try to) do:
- the search pattern \([^\\]\)& should match any occurrence of & not preceded by a backslash, and store the character that precedes & in \1
- the replace pattern \1\\& should put a backslash in between & and the previous character, however it acts as \\\1& for some strange reason
What am I doing wrong here?
1 Answer
Why your command fails:
You did:
sed 's/\([^\\]\)&/\1\\&/g' <<< "${text}"
\([^\\]\) matches any character except \ and puts it in captured group 1; then & matches a literal &. So for one&two\&three, this matches the e before the first & and puts it in captured group 1. The & before three is not matched, because a \ precedes it.
In the replacement you have used \1\\&, so the output becomes one\e&two\&three because:
- \1 is replaced by e
- then the two \s (\\) are treated as a single \, which gives e\ so far
- then & stands for the full match, i.e. e&, so the & is not inserted as a literal & as you were expecting
So the matched portion, i.e. e&, is replaced with e\e&.
You would get the desired result if you put another \ before the & (as two \\ make one \, you need one more before & to make it literal):
sed 's/\([^\\]\)&/\1\\\&/g' <<<"${text}"
As Ubuntu's sed supports ERE (Extended Regular Expression), you can use the -E or -r option to enable it and drop the backslashes in front of the ( and ) of the capture group:
sed -E 's/([^\\])&/\1\\\&/g' <<<"${text}"
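To make the difference between & and \& in the replacement concrete, here is a minimal sketch that runs the original and the corrected command side by side. The variable names broken and fixed are only illustrative, and the input is piped in with printf instead of a here-string on the assumption that this is slightly more portable across shells.

```shell
text='one&two\&three'

# Unescaped & in the replacement stands for the whole match ("e&" here),
# so "e&" becomes: \1 (e) + one backslash + the whole match (e&).
broken=$(printf '%s\n' "$text" | sed 's/\([^\\]\)&/\1\\&/g')

# \& in the replacement is a literal ampersand,
# so "e&" becomes: \1 (e) + one backslash + &.
fixed=$(printf '%s\n' "$text" | sed 's/\([^\\]\)&/\1\\\&/g')

printf '%s\n' "$broken"   # one\e&two\&three
printf '%s\n' "$fixed"    # one\&two\&three
```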
Alternate approach:
First, removing \s before all &s, and then adding \ before all &:
sed -E 's/[\]+(&)/\1/g; s/&/\\&/g'
This is composed of two sed statements:
- s/[\]+(&)/\1/g removes all \s before & in the string (line)
- s/&/\\&/g adds a \ before every & in the string (line)
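The two-step approach can be wrapped in a small shell function, sketched below. The name escape_amp is hypothetical and not part of sed or of the answer; the sed script itself is the one shown above.

```shell
# escape_amp: hypothetical helper that escapes every & in its first argument.
escape_amp() {
    # Step 1: drop any run of backslashes directly before an &.
    # Step 2: put exactly one backslash before every &.
    printf '%s\n' "$1" | sed -E 's/[\]+(&)/\1/g; s/&/\\&/g'
}

escape_amp 'one&two\&three'   # one\&two\&three
```

Note that in the second statement the unescaped & in the replacement is again the whole match, but since the match is just &, the result is still \&. One caveat of this approach: because step 1 removes every backslash before an &, an input such as a\\&b (an escaped backslash followed by &) also ends up as a\&b.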
Example:
% text='one&two\&three'
% sed 's/\([^\\]\)&/\1\\\&/g' <<< "${text}"
one\&two\&three
% sed -E 's/([^\\])&/\1\\\&/g' <<< "${text}"
one\&two\&three
% sed -E 's/[\]+(&)/\1/g; s/&/\\&/g' <<<"$text"
one\&two\&three
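The two approaches agree on the sample text, but they can differ on other inputs. The sketch below compares them on a string with a leading & and two consecutive &s. The function names escape_capture and escape_twostep are hypothetical wrappers around the two commands above, and the second test string is an assumption chosen for illustration.

```shell
# Hypothetical wrappers around the two commands from the answer.
escape_capture() {
    printf '%s\n' "$1" | sed -E 's/([^\\])&/\1\\\&/g'
}
escape_twostep() {
    printf '%s\n' "$1" | sed -E 's/[\]+(&)/\1/g; s/&/\\&/g'
}

# Print: input -> capture-group result | two-step result
for s in 'one&two\&three' '&a&&b'; do
    printf '%s -> %s | %s\n' "$s" "$(escape_capture "$s")" "$(escape_twostep "$s")"
done
```

The capture-group version needs a character before each &, and each match consumes that character, so it leaves a leading & and the second of two adjacent &s untouched (&a&&b becomes &a\&&b), while the two-step version escapes all of them (\&a\&\&b).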