
Asterisk ( * ) quantifier –
The asterisk quantifier matches zero or more occurrences of the character or group immediately to its left.
In [1]: import re
In [2]: re.findall('python*', 'pytho')
Out[2]: ['pytho']
In [3]: re.findall('python*', 'python')
Out[3]: ['python']
In [4]: re.findall('python*', 'pythonnnn')
Out[4]: ['pythonnnn']
The pattern python* matches pytho followed by zero or more occurrences of n in the text.
Let’s say you want to match all the words that start with p.
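As a minimal sketch of how far the asterisk reaches, the snippet below uses re.fullmatch to test whole strings. The non-capturing group (?:ab) and the extra sample strings are illustrative assumptions, not part of the examples above.

```python
import re

# The asterisk binds only to the single character before it (the final n).
zero_n = bool(re.fullmatch('python*', 'pytho'))             # True
many_n = bool(re.fullmatch('python*', 'pythonnnn'))         # True
whole_word = bool(re.fullmatch('python*', 'pythonpython'))  # False

# Wrapping a sequence in a group makes the asterisk repeat the whole sequence.
grouped = bool(re.fullmatch('(?:ab)*', 'ababab'))  # True
empty = bool(re.fullmatch('(?:ab)*', ''))          # True: zero repetitions

print(zero_n, many_n, whole_word, grouped, empty)
```

The third check fails because python* never repeats the whole word, only the trailing n.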
In [5]: text = 'Python is cool. I practice python everyday.'
In [6]: re.findall('p[a-z]* ', text, flags=re.IGNORECASE)
Out[6]: ['Python ', 'practice ', 'python ']
Here, the pattern says that the match starts with a p, followed by zero or more characters between a and z (this is what the asterisk quantifier asks for), followed by a single space. Because of the trailing space, each match includes the space after the word. We also used the re.IGNORECASE flag to make the pattern case insensitive, so it matches uppercase as well as lowercase characters. Without the flag, the capitalized Python is not matched:
In [7]: re.findall('p[a-z]* ', text)
Out[7]: ['practice ', 'python ']
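The trailing space in the pattern means a word is found only when a space follows it, and the match can also begin in the middle of a word. One possible alternative, sketched here with a made-up sentence, is to anchor the pattern with the word boundary \b. The sample text and variable names are assumptions chosen for illustration.

```python
import re

sample = 'An apple pie. Pass the pepper'

# Original style: needs a trailing space and may start inside a word.
with_space = re.findall('p[a-z]* ', sample, flags=re.IGNORECASE)

# \b requires the p to sit at the start of a word; no trailing space needed.
with_boundary = re.findall(r'\bp[a-z]*', sample, flags=re.IGNORECASE)

print(with_space)     # ['pple ', 'Pass ']
print(with_boundary)  # ['pie', 'Pass', 'pepper']
```

With \b, the words before a period and at the end of the text are found, and the p inside apple is skipped.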
If you want to match everything from the first p onward, you can write:
In [8]: re.findall('p.*', text, flags=re.IGNORECASE)
Out[8]: ['Python is cool. I practice python everyday.']
The pattern says that the match starts with a p. The dot ( . ) matches any character except a newline, and the asterisk quantifier asks for zero or more occurrences of such a character. Because the quantifier is greedy, the match runs from the first p to the end of the line.
How to match an asterisk character?
To match an asterisk character, escape the asterisk with a backslash. Writing the pattern as a raw string (r'...') keeps Python from treating the backslash as a string escape.
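To illustrate the newline rule, here is a small sketch using a two-line string made up for this example. It also shows the re.DOTALL flag, which lets the dot match newlines as well.

```python
import re

two_lines = 'python is fun\nperl too'

# By default the dot stops at a newline, so each line yields its own match.
per_line = re.findall('p.*', two_lines)

# With re.DOTALL the dot also matches the newline, giving one long match.
across_lines = re.findall('p.*', two_lines, flags=re.DOTALL)

print(per_line)      # ['python is fun', 'perl too']
print(across_lines)  # ['python is fun\nperl too']
```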
In [9]: re.findall(r'\*', '***python***')
Out[9]: ['*', '*', '*', '*', '*', '*']
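A few other ways to match a literal asterisk are sketched below, as a rough illustration rather than a complete list. The sample strings are made up for this example.

```python
import re

stars = '***python***'

# An escaped asterisk followed by + matches each run of asterisks as one item.
runs = re.findall(r'\*+', stars)

# Inside a character class the asterisk loses its special meaning.
in_class = re.findall('[*]', 'a*b')

# re.escape adds the backslash for you, handy when the text comes from a variable.
escaped = re.escape('*')
via_escape = re.findall(escaped, 'a*b*c')

print(runs)        # ['***', '***']
print(in_class)    # ['*']
print(via_escape)  # ['*', '*']
```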