Regular Expression


Capturing Group


import re

expr_1 = r'(\w+)(\W+)'
line = 'abcabcdefabc*,.zyw'
print(re.findall(expr_1, line))
    

[('abcabcdefabc', '*,.')]
    

If \W+ is not wrapped in parentheses, it is still matched but not captured, so findall returns only the first group: ['abcabcdefabc'].
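
To make the effect of the parentheses concrete, here is a small sketch that runs the same input through three variants of the pattern. The variable names (`both_groups`, `one_group`, `no_groups`) are only illustrative.

```python
import re

line = 'abcabcdefabc*,.zyw'

# Two capturing groups: each match is a tuple with one entry per group.
both_groups = re.findall(r'(\w+)(\W+)', line)

# Only \w+ is captured; \W+ must still match but is not returned.
one_group = re.findall(r'(\w+)\W+', line)

# No groups at all: findall returns the whole matched text.
no_groups = re.findall(r'\w+\W+', line)

print(both_groups)  # [('abcabcdefabc', '*,.')]
print(one_group)    # ['abcabcdefabc']
print(no_groups)    # ['abcabcdefabc*,.']
```

The trailing 'zyw' never shows up because no non-word character follows it, so \W+ has nothing to match there.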

Ignoring case


expr_1 = r'(?i)leo leung'
line = 'Leo Leung'
print(re.findall(expr_1,line))
print(re.findall(expr_1,'LEO LEUNG'))
    

['Leo Leung']
['LEO LEUNG']
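
As an alternative sketch, the same behaviour can be requested with the `re.IGNORECASE` flag instead of the inline `(?i)` prefix. The names `pattern`, `mixed`, and `upper` are illustrative.

```python
import re

# re.IGNORECASE (alias re.I) has the same effect as a leading (?i).
pattern = re.compile(r'leo leung', re.IGNORECASE)

mixed = pattern.findall('Leo Leung')
upper = pattern.findall('LEO LEUNG')

print(mixed)  # ['Leo Leung']
print(upper)  # ['LEO LEUNG']
```

Passing the flag keeps the pattern string itself free of modifiers, which can be easier to read when the same pattern is reused in several places.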
    

Lookahead assertion: positive (?=x), negative (?!x)


expr_1 = r'(?i).* (?=leung)'
line = 'Leo Leung'
print(re.findall(expr_1,line))
print(re.findall(expr_1,'ok LEUNG'))
    

['Leo ']
['ok ']
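
The example above covers only the positive form. A minimal sketch of the negative form (?!x) follows; it uses \w+ rather than .* so that the result is easy to trace by hand, and the name `expr_neg` is illustrative.

```python
import re

# Match a word and its trailing space only if the next text is NOT "leung".
expr_neg = r'(?i)\b\w+ (?!leung)'

followed_by_leung = re.findall(expr_neg, 'Leo Leung')
followed_by_other = re.findall(expr_neg, 'Leo Patrick')

print(followed_by_leung)  # []
print(followed_by_other)  # ['Leo ']
```

In both forms the lookahead itself consumes nothing, which is why the matched text stops at the space.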
    

Lookbehind assertion


expr_1 = r'(?i)(?<=leo) .*'
line = 'Leo Leung'
print(re.findall(expr_1,line))
print(re.findall(expr_1,'ok LEUNG'))
    

[' Leung']
[]
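
A short sketch showing that the lookbehind text is checked but not consumed, together with the negative form (?<!x). The variable names are illustrative.

```python
import re

# Positive lookbehind: the match starts after "Leo", which is not included.
m = re.search(r'(?i)(?<=leo) .*', 'Leo Leung')
print(m.group())  # ' Leung'
print(m.start())  # 3

# Negative lookbehind: a space plus a word NOT preceded by "leo".
after_leo = re.findall(r'(?i)(?<!leo) \w+', 'Leo Leung')
after_ok = re.findall(r'(?i)(?<!leo) \w+', 'ok LEUNG')

print(after_leo)  # []
print(after_ok)   # [' LEUNG']
```

Note that Python's `re` module requires the pattern inside a lookbehind to have a fixed width, so something like (?<=a+) is rejected at compile time.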
    

A mix of operations

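The difference between (?=x) and (?:x) is that a lookahead (?=x) only checks that x follows and leaves it out of the match, whereas a non-capturing group (?:x) consumes x, so x appears in the matched text. The scoped-flag form (?i:x) used below is also a non-capturing group, with case-insensitive matching applied only inside it.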
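
A minimal side-by-side sketch of the two constructs on the same input, with a capturing group added for contrast. The variable names are illustrative.

```python
import re

line = 'Leo Leung'

# Lookahead: "Leung" must follow, but it is not part of the match.
lookahead = re.findall(r'Leo (?=Leung)', line)

# Non-capturing group: "Leung" is consumed and appears in the match.
non_capturing = re.findall(r'Leo (?:Leung)', line)

# Capturing group, for contrast: findall returns only the group.
capturing = re.findall(r'Leo (Leung)', line)

print(lookahead)      # ['Leo ']
print(non_capturing)  # ['Leo Leung']
print(capturing)      # ['Leung']
```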


expr_1 = r'(?i:leo) .* (?i:patrick) abc'
line = 'I am Leo Leung Patrick abc, 12, Leo aaa patrick abc'
print(re.findall(expr_1,line))
print(re.findall(expr_1,'ok LEUNG'))
    

['Leo Leung Patrick abc, 12, Leo aaa patrick abc']
[]
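
The first result is a single long match because .* is greedy: it extends to the last ' patrick abc' in the line. As a hedged sketch of one way to get each occurrence separately, the lazy quantifier .*? can be used instead. The names `greedy` and `lazy` are illustrative.

```python
import re

line = 'I am Leo Leung Patrick abc, 12, Leo aaa patrick abc'

# Greedy .* runs to the last possible " patrick abc".
greedy = re.findall(r'(?i:leo) .* (?i:patrick) abc', line)

# Lazy .*? stops at the first possible " patrick abc".
lazy = re.findall(r'(?i:leo) .*? (?i:patrick) abc', line)

print(greedy)  # ['Leo Leung Patrick abc, 12, Leo aaa patrick abc']
print(lazy)    # ['Leo Leung Patrick abc', 'Leo aaa patrick abc']
```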
    
