Velvet Star Monitor


How to escape non-escaped characters with sed?

Writer Matthew Harrington

I would like to use sed to escape all non-escaped occurrences of a character, say "&", in a string contained in variable text. What I do is

text='one&two\&three'
sed 's/\([^\\]\)&/\1\\&/g' <<< "${text}"

and I expect the output to be one\&two\&three. However, what I get is

one\e&two\&three

What I (try to) do:

  • the search pattern \([^\\]\)& should match any occurrence of & not preceded by a backslash, and store the character that precedes & in \1
  • the replace pattern \1\\& should put a backslash in between & and the previous character, however it acts as \\\1& for some strange reason

What am I doing wrong here?

1 Answer

Why your command fails:

You did:

sed 's/\([^\\]\)&/\1\\&/g' <<< "${text}"
  • \([^\\]\) matches any single character except \ and stores it in capture group 1; the & that follows matches a literal &. For one&two\&three, this matches the e before the first & and stores e in group 1. The & before three is not matched, because the character before it is \.

  • In the replacement you have used \1\\&, so the output becomes one\e&two\&three because:

    • \1 is replaced by e
    • then \\ is treated as a single literal \, which gives us e\ so far
    • then the unescaped & is replaced by the whole match, i.e. e&, so it does not insert a literal & as you expected
  • So, the matched portion i.e. e& is replaced with e\e&
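The difference between & and \& in the replacement can be seen in isolation. The following is a minimal sketch, not part of the original answer: it wraps the replacement in square brackets purely to make the substituted text visible, and the variable names whole and literal are made up for illustration.

```shell
text='one&two\&three'

# Unescaped & in the replacement: sed inserts the whole match ("e&").
whole=$(printf '%s\n' "$text" | sed 's/\([^\\]\)&/[&]/')

# Escaped \& in the replacement: sed inserts a literal ampersand.
literal=$(printf '%s\n' "$text" | sed 's/\([^\\]\)&/[\&]/')

printf '%s\n' "$whole" "$literal"
# on[e&]two\&three
# on[&]two\&three
```
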

    You would get the desired result if you put another \ before the & in the replacement. Two backslashes (\\) produce one literal \, and \& produces a literal &:

    sed 's/\([^\\]\)&/\1\\\&/g' <<<"${text}"

    As Ubuntu's sed supports ERE (Extended Regular Expressions), you can enable it with the -E or -r option and drop the backslashes before the capturing parentheses:

    sed -E 's/([^\\])&/\1\\\&/g' <<<"${text}"

Alternate approach:

First remove any \s that already precede an &, then add a \ before every &:

sed -E 's/[\]+(&)/\1/g; s/&/\\&/g'

This is composed of two sed statements:

  • s/[\]+(&)/\1/g removes all \s before & in the string (line)

  • s/&/\\&/g adds a \ to all & in the string (line); here \\ is a literal \ and & is the whole match, which is just &
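The two steps can be wrapped in a small shell function. This is only a sketch: escape_amp is a hypothetical name, and the first expression is written as \\+& with \& in the replacement, which should behave the same as the [\]+(&) and \1 form above.

```shell
# escape_amp is a hypothetical helper name, not part of the original answer.
escape_amp() {
    # Step 1: drop any run of backslashes directly before an ampersand.
    # Step 2: put exactly one backslash before every ampersand.
    printf '%s\n' "$1" | sed -E 's/\\+&/\&/g; s/&/\\&/g'
}

escape_amp 'one&two\&three'   # one\&two\&three
escape_amp '&lead&&double'    # \&lead\&\&double
```

Because every & is normalised first, this variant also handles an & at the very start of the line and two & in a row, and running it on already-escaped text leaves it unchanged. The single-pass pattern requires an unconsumed character before each &, so it would skip both of those cases. The trade-off is that a sequence such as \\& is collapsed to \&, so the sketch assumes that backslashes directly before an & are only ever meant as escapes for it.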


Example:

% text='one&two\&three'
% sed 's/\([^\\]\)&/\1\\\&/g' <<< "${text}"
one\&two\&three
% sed -E 's/([^\\])&/\1\\\&/g' <<< "${text}"
one\&two\&three
% sed -E 's/[\]+(&)/\1/g; s/&/\\&/g' <<<"$text"
one\&two\&three