Regex: Find all html tags that contain some other particular html tags

Writer Andrew Henderson

I have some HTML elements that start with <p> and end with </p>. The first one contains other tags such as </li> </ul> </div>, along with spaces and \n, as you can see.

<p></g></svg> </a> </li> </ul> </div> </p> <p>Foarte frumos lucru</p>
<p>I love cars</p>

I want to find and delete every <p> element like the first one, i.e. those that contain </li> </ul> </div>.

The output should be:

 <p>Foarte frumos lucru</p>
<p>I love cars</p>

My solution is not good:

FIND: (?=<p>)[\s\S]*?</li></div>|</ul>[\s\S]*?</p>

REPLACE BY: LEAVE EMPTY
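
To see what this attempt actually matches, it can help to run it outside the editor. The sketch below uses Python's re module, which is an assumption (the thread is about an editor's Replace dialog), and the names ATTEMPT, SAMPLE, and attempt_matches are made up for illustration. Because | has the lowest precedence, the pattern splits into two independent alternatives, and the first one needs </li></div> with nothing in between, which the sample never contains.

```python
import re

# The pattern from the question, unchanged. It reads as:
#   (?=<p>)[\s\S]*?</li></div>   OR   </ul>[\s\S]*?</p>
ATTEMPT = re.compile(r"(?=<p>)[\s\S]*?</li></div>|</ul>[\s\S]*?</p>")

# Sample input, reproduced as it appears in the question.
SAMPLE = (
    "<p></g></svg> </a> </li> </ul> </div> </p> <p>Foarte frumos lucru</p>\n"
    "<p>I love cars</p>"
)


def attempt_matches(text):
    """Return every substring the question's pattern would delete."""
    return ATTEMPT.findall(text)


print(attempt_matches(SAMPLE))
```

On the sample only the second alternative fires, and it matches just `</ul> </div> </p>`, so the opening <p> and the tags before </ul> are left behind.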


3 Answers

This removes every <p> element that contains only other tags or whitespace:

  • Ctrl+H
  • Find what: <p>(?:<.+?>|\s)+?</p>\R*
  • Replace with: LEAVE EMPTY
  • CHECK Wrap around
  • CHECK Regular expression
  • Replace all

Explanation:

<p>         # start tag
(?:         # non capture group
  <.+?>     # any tag
  |         # OR
  \s        # any kind of space
)+?         # end group, must appear 1 or more times, not greedy
</p>        # end tag
\R*         # 0 or more any kind of linebreak
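
For anyone who wants to try the same idea in a script rather than in the Replace dialog, here is a minimal sketch in Python. Two things differ from the pattern above and are assumptions of this sketch: Python's re module has no \R, so the line breaks are spelled out, and <[^<>]+> is used in place of <.+?> so that one "tag" match cannot stretch across text sitting between two tags. The names EMPTY_P, SAMPLE, and remove_tag_only_paragraphs are made up for illustration.

```python
import re

# <p>, then one or more tags or whitespace characters, then </p>,
# then any trailing line breaks (\R is not available in Python's re).
EMPTY_P = re.compile(r"<p>(?:<[^<>]+>|\s)+?</p>(?:\r\n|\r|\n)*")

# Sample input, reproduced as it appears in the question.
SAMPLE = (
    "<p></g></svg> </a> </li> </ul> </div> </p> <p>Foarte frumos lucru</p>\n"
    "<p>I love cars</p>"
)


def remove_tag_only_paragraphs(html):
    """Delete <p>...</p> blocks that hold nothing but tags and whitespace."""
    return EMPTY_P.sub("", html)


print(repr(remove_tag_only_paragraphs(SAMPLE)))
```

With the stricter tag pattern, a paragraph such as <p><b>bold</b></p> is left alone because it contains text as well as tags.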


Use the following:

  • Ctrl+H
  • Find what: ^(?!.*(</p>)).*|\s+</p>
  • Replace with: LEAVE EMPTY
  • CHECK Match case
  • CHECK Wrap around
  • CHECK Regular expression
  • UNCHECK . matches newline
  • Replace all
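
A rough Python equivalent of these settings, as a sketch only: re.MULTILINE stands in for the editor's line-based ^, and leaving out re.DOTALL corresponds to unchecking ". matches newline". The multi-line input is an assumption based on the question's remark that the first element contains \n; LINE_FILTER, MULTILINE_SAMPLE, and drop_lines_without_closing_p are hypothetical names.

```python
import re

# First alternative: a whole line that has no </p> anywhere on it.
# Second alternative: whitespace (including line breaks) right before </p>.
LINE_FILTER = re.compile(r"^(?!.*(</p>)).*|\s+</p>", re.MULTILINE)

# Assumed layout: the unwanted element is spread over several lines.
MULTILINE_SAMPLE = "\n".join([
    "<p></g></svg>",
    "</a>",
    "</li>",
    "</ul>",
    "</div>",
    "</p>",
    "<p>Foarte frumos lucru</p>",
    "<p>I love cars</p>",
])


def drop_lines_without_closing_p(text):
    """Empty every line lacking </p> and strip whitespace plus a dangling </p>."""
    return LINE_FILTER.sub("", text)


print(drop_lines_without_closing_p(MULTILINE_SAMPLE).strip())
```

Note that this approach is line-oriented: it empties any line that lacks </p>, whatever that line contains, and the emptied lines stay behind as blank lines.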

Another, more complete solution that can deal with other tags inside the <p> ... </p>:

 <p>[^<>]*</p>\R?(*SKIP)(*F)|.


Be aware that this deletes everything in your file except the <p> ... </p> elements that contain no other tags!
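
The (*SKIP)(*F) verbs are not available in Python's re module, so the sketch below swaps in a different technique that gives the same "keep this, delete everything else" effect: the part to keep goes into a capture group, and a replacement function puts it back. The use of re.DOTALL (so that stray line breaks are deleted too) is an assumption, and KEEP_OR_DROP, SAMPLE, and keep_plain_paragraphs are made-up names.

```python
import re

# Group 1: a <p>...</p> with no tags inside, plus one optional line break.
# Otherwise: any single character, which gets dropped.
KEEP_OR_DROP = re.compile(r"(<p>[^<>]*</p>(?:\r\n|\r|\n)?)|.", re.DOTALL)

# Sample input, reproduced as it appears in the question.
SAMPLE = (
    "<p></g></svg> </a> </li> </ul> </div> </p> <p>Foarte frumos lucru</p>\n"
    "<p>I love cars</p>"
)


def keep_plain_paragraphs(html):
    """Keep only <p>...</p> elements without nested tags; delete the rest."""
    # group(1) is None when the single-character alternative matched.
    return KEEP_OR_DROP.sub(lambda m: m.group(1) or "", html)


print(keep_plain_paragraphs(SAMPLE))
```

As with the editor version, anything outside a tag-free <p> ... </p> is removed, so this is only suitable when the paragraphs are all you want to keep.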
