What is the meaning of :a;$!N; in a sed command?
Emily Wong
$ (echo hello; echo there) | sed ':a;$!N;s/\n/string/;ta'
hellostringthere
The sed command above replaces the newline character with the string "string". But I don't understand :a;$!N;s/\n/string/;ta within the single quotes. I understand the middle part, s/\n/string/, but not the function of the first part (:a;$!N;) or the last part (ta).
2 Answers
These are the, admittedly cryptic, sed commands. Specifically (from man sed):
: label
    Label for b and t commands.
t label
    If a s/// has done a successful substitution since the last input line was read and since the last t or T command, then branch to label; if label is omitted, branch to end of script.
n N
    Read/append the next line of input into the pattern space.
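To see these commands in isolation before combining them, here is a minimal sketch. The inputs, the label name loop and the shell variable names are made up for illustration; the script fragments are passed with separate -e options so that the label is not tied to any one sed implementation's handling of semicolons.

```shell
# N alone: each cycle reads one line and N appends the next one,
# so the four input lines are processed as two pairs.
pairs=$(printf '1\n2\n3\n4\n' | sed 'N;s/\n/-/')
printf '%s\n' "$pairs"

# : and t together: s/a/b/ replaces only the first "a" on each pass;
# t jumps back to the label "loop" as long as that substitution succeeded.
looped=$(echo 'aaa' | sed -e ':loop' -e 's/a/b/' -e 't loop')
printf '%s\n' "$looped"
```

The first command should print 1-2 and 3-4 on separate lines, and the second should print bbb, because the loop keeps substituting until no "a" is left.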
So, the script you posted can be broken down into (spaces added for readability):
sed ':a;  $!N;  s/\n/string/;  ta'
     ---  ----  -------------  --
      |    |          |        |--> go back (`t`) to `a`
      |    |          |-----------> substitute newlines with `string`
      |    |----------------------> If this is not the last line (`$!`), append the
      |                             next line to the pattern space.
      |---------------------------> Create the label `a`.
Basically, what this is doing could be written in pseudocode as
while (not end of input) {
    append the next line to the current one and replace \n with 'string'
}
You can understand this a bit better with a more complex input example:
$ printf "line1\nline2\nline3\nline4\nline5\n" | sed ':a;$!N;s/\n/string/;ta'
line1stringline2stringline3stringline4stringline5
I am not really sure why the $! is needed. As far as I can tell, you can get the same output with
printf "line1\nline2\nline3\nline4\nline5\n" | sed ':a;N;s/\n/string/;ta'
I post this answer since I see a lot of confusion about why the last line is excluded when executing N (through the line address $!) and because the OP was confused about the meaning of :a;$!N; in a sed command in general, not only in the specific one posted.
Well, the benefit of using $!N instead of N is not evident in the examples proposed (by the OP and by @terdon), since no "important" (keep reading) command is performed on the last line after the N command. (Indeed, the result is the same if one strips that line address off.)
In a more complex example (for instance, substituting "this sentence" in a file, where the two words sometimes appear on one line and sometimes span two lines), excluding the last line from the N command can be crucial! If the last line is not excluded, then upon executing N on it, sed hits EOF and exits immediately, preventing all subsequent commands (including the branching commands t and b) from being executed.
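A small sketch of a case where the address matters. The three-line input, the "+" and "!" markers and the variable names are invented for illustration: the script joins lines in pairs and then appends "!" to every line it outputs, so the last command is one that should also run on the final line.

```shell
# With the $! address, N is skipped on the last line, so the rest of the
# script (including the final s/$/!/) still runs on the lone line "c".
guarded=$(printf 'a\nb\nc\n' | sed '$!N;s/\n/+/;s/$/!/')
printf '%s\n' "$guarded"

# Without the address, N runs on the lone last line "c" and finds no next
# line. sed stops there, so s/$/!/ never runs on "c". GNU sed still prints
# "c" on the way out; other implementations may print nothing for it.
unguarded=$(printf 'a\nb\nc\n' | sed 'N;s/\n/+/;s/$/!/')
printf '%s\n' "$unguarded"
```

The guarded version should print a+b! followed by c!. The unguarded version leaves the last line without its "!", or drops that line entirely, depending on the sed implementation.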
In the overly simple examples shown, we can safely remove $! and let sed fail to execute N on the last line and exit (GNU sed still prints the pattern space when it does): the s command that gets skipped would have done nothing anyway, since there is no \n left to match.