Regular expressions introduction

Regular expressions can be loosely compared to string formatting operations. In a sense, both describe a template for a string. A formatting string describes how a new string should be created, whereas a regular expression describes (to a first approximation) how an existing string is laid out and how to decompose it into specific parts.
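The symmetry between the two kinds of template can be sketched as follows. This is a minimal illustration written for Python 3; the template, the pattern, and the variable names are assumptions chosen for the example and do not come from the reference itself.

```python
import re

# A formatting template describes how to build a new string from parts.
template = "%d dogs and %d cats"
text = template % (2, 3)

# A regular expression describes how an existing string is laid out,
# so the same parts can be recovered from it.
pattern = r"([0-9]+) dogs and ([0-9]+) cats"
m = re.match(pattern, text)

dogs, cats = int(m.group(1)), int(m.group(2))
print(text)        # the string built from the template
print(dogs, cats)  # the numbers recovered by the pattern
```

The parentheses in the pattern play roughly the role that the %d placeholders play in the template: they mark the variable parts of the string.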
Regular expressions are a powerful tool for text processing. They enable searching, parsing, and modifying strings in ways that would otherwise require considerable programming effort.
The following examples briefly show some of what regular expressions can do.
The code snippet below shows a regular expression used to find a particular type of information in a string.
Source: (regexp1-1.py)
import re

dogs = "I have 2 dogs."
# [0-9]+ matches a run of one or more consecutive digits
result = re.search(r"[0-9]+", dogs)

# group(0) is the whole text matched by the pattern
print(result.group(0))
stdout:
2
Run time: 22.9 ms
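Note that re.search returns None when the pattern is not found anywhere in the string, so calling group(0) on the result without a check raises an AttributeError. A minimal sketch of a guarded search, written for Python 3, might look like this; the helper name first_number is hypothetical and used only for illustration.

```python
import re

def first_number(text):
    """Return the first run of digits in text, or None if there is none."""
    result = re.search(r"[0-9]+", text)
    if result is None:
        # No digits anywhere in the string
        return None
    return result.group(0)

print(first_number("I have 2 dogs."))
print(first_number("I have no dogs."))
```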
The following code checks whether a string conforms to a specified format.
Source: (regexp1-2.py)
import re

lines = ["1.23 cm", "0.256 mm", "6.2", "x cm", "-2.64 cm"]

for line in lines:
    # [] specifies a set or range of characters that match at this position
    # . has a special meaning in a regexp, so it is escaped as \.
    # + means one or more repetitions of the preceding expression
    if re.match(r"[0-9]+\.[0-9]+ [a-z]+", line):
        print("right formatting", line)
    else:
        print("wrong formatting", line)
stdout:
right formatting 1.23 cm
right formatting 0.256 mm
wrong formatting 6.2
wrong formatting x cm
wrong formatting -2.64 cm
Run time: 22.8 ms
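One caveat about the regexp1-2.py check: re.match only requires the pattern to match at the beginning of the string, so trailing text after the unit is silently accepted. If the whole string must conform, re.fullmatch (available in Python 3.4 and later) is one option. The sketch below assumes Python 3; the helper names is_prefix_ok and is_whole_ok are hypothetical.

```python
import re

pattern = r"[0-9]+\.[0-9]+ [a-z]+"

def is_prefix_ok(line):
    # re.match only requires the pattern to match at the start of the string
    return re.match(pattern, line) is not None

def is_whole_ok(line):
    # re.fullmatch requires the pattern to cover the entire string
    return re.fullmatch(pattern, line) is not None

line = "1.23 cm and some trailing text"
print(is_prefix_ok(line), is_whole_ok(line))
```

Which behaviour is appropriate depends on the input: prefix matching is convenient when trailing text is expected, while full matching is the stricter validation.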
A slight variation on the regexp1-2.py example adds groups to the pattern, which makes it possible to parse the string into its components.
Source: (regexp1-3.py)
import re

lines = ["1.23 cm", "0.256 mm", "6.2", "x cm", "-2.64 cm"]

for line in lines:
    # [] specifies a set or range of characters that match at this position
    # . has a special meaning in a regexp, so it is escaped as \.
    # + means one or more repetitions of the preceding expression
    # () specifies a group whose matched text can be retrieved later
    m = re.match(r"([0-9]+\.[0-9]+) ([a-z]+)", line)
    if m:
        number = float(m.group(1))  # first group: the decimal number
        unit = m.group(2)           # second group: the unit
        print("number=%.3f, unit=%s" % (number, unit))
    else:
        print("wrong formatting", line)
stdout:
number=1.230, unit=cm
number=0.256, unit=mm
wrong formatting 6.2
wrong formatting x cm
wrong formatting -2.64 cm
Run time: 23.1 ms
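The examples so far cover searching and parsing. Modification, the third use mentioned in the introduction, can be sketched with re.sub, which replaces every match of a pattern in a string. This is a minimal illustration for Python 3; the sample strings and variable names are assumptions made for the example.

```python
import re

# Replace every run of digits with a placeholder character.
masked = re.sub(r"[0-9]+", "#", "I have 2 dogs and 13 cats.")

# Groups can be referred to in the replacement as \1, \2, ...
# Here the number and the unit swap places.
swapped = re.sub(r"([0-9]+\.[0-9]+) ([a-z]+)", r"\2: \1", "1.23 cm")

print(masked)
print(swapped)
```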