How can I match the start and end in Python's regex?

Writer Mia Lopez

I have a string and I want to match something at the start and end with a single search pattern. How can this be done?

Let's say we have a string like:

 string = "ftp://"

I want to do something like this:

 re.search("^ftp:// & .jpg$" ,string)

Obviously, it's incorrect, but I hope it gets my point across. Is this possible?


6 Answers

How about not using a regular expression at all?

if string.startswith("ftp://") and string.endswith(".jpg"):

Don't you think this reads nicer?

You can also support multiple options for start and end:

if (string.startswith(("ftp://", "http://")) and string.endswith((".jpg", ".png"))):
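Wrapped in a function, the same idea might look like the sketch below. The function name `is_image_url`, the allow-lists, and the sample URLs are made up for illustration.

```python
def is_image_url(url):
    """Return True if url starts with an allowed prefix and ends with an allowed suffix."""
    # Hypothetical allow-lists; startswith/endswith accept a tuple of alternatives
    prefixes = ("ftp://", "http://")
    suffixes = (".jpg", ".png")
    return url.startswith(prefixes) and url.endswith(suffixes)


# Hypothetical sample URLs
print(is_image_url("ftp://example.com/photo.jpg"))
print(is_image_url("ftp://example.com/notes.txt"))
```

Passing a tuple keeps the check to one call per end of the string, however many options you allow.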

re.match will match the string at the beginning, in contrast to re.search:

re.match(r'(ftp|http)://.*\.(jpg|png)$', s)

Three things to note here:

  • r'' is used for the string literal to make it trivial to have backslashes inside the regex
  • string is a standard module, so I chose s as a variable
  • If you use a regex more than once, you can use r = re.compile(...) to build the state machine once and then use r.match(s) afterwards to match the strings
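As a rough illustration of the last point, here is a minimal sketch that compiles the pattern once and reuses it. The name `IMAGE_URL` and the sample URLs are assumptions, not part of the original answer.

```python
import re

# Compile once, reuse many times; same pattern as in the re.match call above
IMAGE_URL = re.compile(r'(ftp|http)://.*\.(jpg|png)$')

# Hypothetical sample URLs
urls = [
    "ftp://example.com/a.jpg",
    "http://example.com/b.png",
    "https://example.com/c.jpg",  # scheme is neither ftp nor http
    "ftp://example.com/d.gif",    # extension is neither jpg nor png
]

# match() anchors at the start of the string, and the pattern's $ anchors the end
matching = [u for u in urls if IMAGE_URL.match(u)]
print(matching)
```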

If you want, you can also use the urlparse module (urllib.parse in Python 3) to parse the URL for you (though you still need to extract the extension):

>>> allowed_schemes = ('http', 'ftp')
>>> allowed_exts = ('png', 'jpg')
>>> from urllib.parse import urlparse  # Python 2: from urlparse import urlparse
>>> url = urlparse("ftp://")
>>> url.scheme in allowed_schemes
True
>>> url.path.rsplit('.', 1)[1] in allowed_exts
True
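A slightly more defensive variant, sketched for Python 3, uses `posixpath.splitext` so that a path without any extension does not raise an `IndexError`. The function name `is_allowed` and the sample URLs are assumptions for illustration.

```python
from posixpath import splitext
from urllib.parse import urlparse

ALLOWED_SCHEMES = ("http", "ftp")
ALLOWED_EXTS = (".png", ".jpg")


def is_allowed(url):
    """Check the scheme and the file extension of the URL's path component."""
    parts = urlparse(url)
    # splitext returns an empty extension when the path has none
    ext = splitext(parts.path)[1]
    return parts.scheme in ALLOWED_SCHEMES and ext in ALLOWED_EXTS


# Hypothetical sample URLs
print(is_allowed("ftp://example.com/photo.jpg"))
print(is_allowed("http://example.com/a.png?size=2"))
print(is_allowed("ftp://example.com/"))
```

Because only the path is inspected, a query string after the file name does not affect the extension check.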

Don't be greedy, use ^ftp://(.*?)\.jpg$
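To see what the capture group in that pattern holds, here is a small sketch with a made-up URL. Because the pattern is anchored at both ends, the lazy and the greedy form end up capturing the same text in this case.

```python
import re

url = "ftp://example.com/photo.jpg"  # hypothetical example URL

lazy = re.search(r'^ftp://(.*?)\.jpg$', url)
greedy = re.search(r'^ftp://(.*)\.jpg$', url)

# Group 1 is everything between the "ftp://" prefix and the ".jpg" suffix
print(lazy.group(1))
print(greedy.group(1))
```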

Try

 re.search(r'^ftp://.*\.jpg$', string)

if you want a regular expression search. Note that you have to escape the period because it has a special meaning in regular expressions.
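A quick way to see why the escape matters is to compare both patterns against a string that ends in "_jpg" rather than ".jpg". The test string here is invented for the demonstration.

```python
import re

unescaped = r'^ftp://.*.jpg$'   # "." matches any character
escaped = r'^ftp://.*\.jpg$'    # "\." matches only a literal period

url = "ftp://example.com/photo_jpg"  # hypothetical, no real extension

unescaped_hit = re.search(unescaped, url) is not None
escaped_hit = re.search(escaped, url) is not None

print(unescaped_hit)
print(escaped_hit)
```

The unescaped pattern accepts the string because its dot happily matches the underscore.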

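import re
s = "ftp://"
# Raw string, so the backslash reaches the regex engine unchanged
match = re.search(r"^ftp://.*\.jpg$", s)
# re.search returns None when nothing matches, so check before calling group()
print(match.group(0) if match else None)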

I wanted to extract all the numbers from a string, both ints and floats, and this works for me:

import re
s = ('[11-09 22:55:41] [INFO ] [ 4560] source_loss: 0.717, target_loss: 1.279, '
     'transfer_loss: 0.001, total_loss: 0.718')
# Convert each numeric token to float if it contains a decimal point, otherwise to int
print([float(tok) if '.' in tok else int(tok) for tok in re.findall(r'-?\d+\.?\d*', s)])
