Asterisk ( * ) quantifier –
The asterisk quantifier matches zero or more occurrences of the element immediately to its left (a single character, a character class, or a group).
In : import re
In : re.findall('python*', 'pytho')
Out: ['pytho']
In : re.findall('python*', 'python')
Out: ['python']
In : re.findall('python*', 'pythonnnn')
Out: ['pythonnnn']
Here the asterisk applies only to the n, so python* matches pytho followed by zero or more n characters.
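To make the scope of the asterisk concrete, here is a small sketch. The sample strings and variable names are made up for illustration; the non-capturing group (?:...) is standard re syntax for applying a quantifier to more than one character.

```python
import re

# The asterisk binds only to the element right before it: here, the letter "n".
single = re.findall(r'python*', 'pytho python pythonnn')
print(single)  # ['pytho', 'python', 'pythonnn']

# "pytho" itself is still required, so a shorter string does not match.
too_short = re.findall(r'python*', 'pyth')
print(too_short)  # []

# A non-capturing group (?:...) makes the asterisk repeat the whole group.
grouped = re.fullmatch(r'(?:ab)*', 'ababab') is not None
ungrouped = re.fullmatch(r'ab*', 'ababab') is not None
print(grouped, ungrouped)  # True False
```

Without the group, ab* means an a followed by zero or more b characters, which is why it cannot match ababab as a whole.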
Let’s say you want to match all the words that start with p.
In : text = 'Python is cool. I practice python everyday.'
In : re.findall('p[a-z]* ', text, flags=re.IGNORECASE)
Out: ['Python ', 'practice ', 'python ']
Here, the pattern matches a p followed by zero or more characters in the range a to z (that is what the asterisk applied to [a-z] asks for), followed by a single space. The trailing space is part of the pattern, so it appears in every match. We also used the re.IGNORECASE flag to make the pattern case-insensitive, so it matches uppercase as well as lowercase characters. Without the flag, the capitalized Python is not matched:
In : re.findall('p[a-z]* ', text)
Out: ['practice ', 'python ']
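The pattern above relies on a trailing space, so it can also match from a p in the middle of a word and it misses a word followed by punctuation. A possible alternative, shown here only as a sketch, is the word-boundary anchor \b from the re module; the second sample string is invented for the comparison.

```python
import re

text = 'Python is cool. I practice python everyday.'

# \b anchors the match at the start of a word; no trailing space is needed.
words = re.findall(r'\bp[a-z]*', text, flags=re.IGNORECASE)
print(words)  # ['Python', 'practice', 'python']

# Hypothetical sample showing where the two patterns differ.
sample = 'happy puppy.'
with_space = re.findall(r'p[a-z]* ', sample)
with_boundary = re.findall(r'\bp[a-z]*', sample)
print(with_space)     # ['ppy ']
print(with_boundary)  # ['puppy']
```

The matches returned with \b contain only the word itself, without the trailing space.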
If you want to match everything from the first p onward, you can write:
In : re.findall('p.*', text, flags=re.IGNORECASE)
Out: ['Python is cool. I practice python everyday.']
The pattern matches a p followed by anything. The dot ( . ) matches any character except a newline, and the asterisk quantifier asks for zero or more of them. Because the asterisk is greedy, the match starts at the first p and runs to the end of the line.
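Two consequences of this behaviour can be sketched as follows: the match stops at a newline, and adding a ? after the asterisk (the non-greedy form supported by re) makes it stop as early as possible. The multi-line sample string and the variable names are assumptions made for the example.

```python
import re

# The dot does not match a newline, so each line yields its own match.
per_line = re.findall(r'p.*', 'python\nperl')
print(per_line)  # ['python', 'perl']

text = 'Python is cool. I practice python everyday.'

# Greedy: .* runs to the end of the line.
greedy = re.findall(r'p.*', text, flags=re.IGNORECASE)
print(greedy)  # ['Python is cool. I practice python everyday.']

# Non-greedy: .*? stops at the first literal period it can reach.
lazy = re.findall(r'p.*?\.', text, flags=re.IGNORECASE)
print(lazy)  # ['Python is cool.', 'practice python everyday.']
```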
How to match an asterisk character?
To match an asterisk character, escape the asterisk with a backslash. Use a raw string (the r prefix) so that Python passes the backslash to the regex engine unchanged.
In : re.findall(r'\*', '***python***')
Out: ['*', '*', '*', '*', '*', '*']
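Besides a backslash, there are two other common ways to treat the asterisk literally, sketched here: placing it inside a character class, or letting re.escape build the escaped pattern. The variable names are illustrative only.

```python
import re

text = '***python***'

# Inside a character class, the asterisk has no special meaning.
in_class = re.findall(r'[*]', text)
print(in_class)  # ['*', '*', '*', '*', '*', '*']

# re.escape adds the backslash for you, which helps when the literal
# text comes from a variable rather than being typed into the pattern.
escaped = re.escape('*')
via_escape = re.findall(escaped, text)
print(via_escape)  # ['*', '*', '*', '*', '*', '*']
```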