
Regular Expression

A regular expression is a sequence of characters or symbols used to find a specific pattern in text.


  • A string literal is the simplest possible regular expression. "str" matches the text str (s, followed by t, followed by r).
  • Regular expressions are generally case sensitive.
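A minimal sketch of both points, assuming Python's standard `re` module (the sample text and variable names are made up for illustration):

```python
import re

text = "a string of Strings"

# A literal pattern matches the same characters in the same order.
literal_hits = re.findall(r"str", text)

# Matching is case sensitive by default, so "Str" is a different pattern.
capital_hits = re.findall(r"Str", text)

# The re.IGNORECASE flag opts in to case-insensitive matching.
any_case_hits = re.findall(r"str", text, flags=re.IGNORECASE)

print(literal_hits, capital_hits, any_case_hits)
```

Other regex engines usually offer a similar switch for case-insensitive matching, though the syntax differs.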

Meta Characters

  • Meta characters are special characters. They do not stand for themselves but are interpreted in a special way.
  • Some meta characters have a different meaning when written inside square brackets.
meta char   desc
.           matches any single character except a line break
[]          char class. matches any char contained between the brackets
[^ ]        negated char class. matches any char NOT contained between the brackets
*           matches 0 or more repetitions of the preceding symbol
+           matches 1 or more repetitions of the preceding symbol
?           makes the preceding symbol optional
{n,m}       matches at least n but not more than m repetitions of the preceding symbol
(abc)       char group. matches the chars abc in that exact order
|           alternation. matches either the chars before or the chars after the symbol
\           escapes the next char, so a reserved char can be matched literally
^           matches the beginning of the input
$           matches the end of the input
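The rows above can be tried out one by one. This is an illustrative sketch using Python's `re` module; the sample strings and variable names are assumptions chosen for the demo, and other engines may differ in details.

```python
import re

# .  any single character except a line break
dot = re.findall(r"c.t", "cat cot c\nt")              # ['cat', 'cot']

# []  any one character listed between the brackets
char_class = re.findall(r"[Tt]he", "The theme")       # ['The', 'the']

# [^ ]  any one character NOT listed between the brackets
negated = re.findall(r"[^c]ar", "car bar far")        # ['bar', 'far']

# *  zero or more, +  one or more, ?  optional
star = re.fullmatch(r"ab*", "a") is not None          # True: zero b's is fine
plus = re.fullmatch(r"ab+", "a") is not None          # False: needs at least one b
optional = re.findall(r"colou?r", "color colour")     # ['color', 'colour']

# (abc)  group: a quantifier after it applies to the whole group
group = re.fullmatch(r"(ab)+", "ababab") is not None  # True

# |  alternation: either side may match
either = re.findall(r"cat|dog", "cat, dog, cow")      # ['cat', 'dog']

# \  escape: match a reserved character literally
escaped = re.findall(r"\.", "a.b")                    # ['.']

# ^ and $  anchor the match to the start and end of the input
starts = re.search(r"^The", "The end") is not None    # True
ends = re.search(r"end$", "The end") is not None      # True
```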

Things to Remember

Set a character range by using a hyphen inside a character class. Example: /[a-z0-9]/.

A period inside a character class means a literal period.
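A short sketch of ranges and of the period inside a character class, assuming Python's `re` module (inputs and names are illustrative only):

```python
import re

# [a-z0-9] accepts any one lowercase letter or digit.
allowed = re.findall(r"[a-z0-9]", "aZ9_")      # ['a', '9']

# Outside a class the dot is a wildcard ...
wildcard = re.findall(r"a.c", "abc a.c")       # ['abc', 'a.c']

# ... inside a class it is just a literal period.
literal_dot = re.findall(r"a[.]c", "abc a.c")  # ['a.c']
```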

The * (star) combined with a . (dot) matches any string of characters other than line breaks. Example: .*
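As a rough illustration in Python's `re` module (the sample strings are made up), `.*` fills whatever sits between two fixed parts, also matches nothing at all, and stops at a line break by default:

```python
import re

text = "start, anything at all, end"

# .* absorbs everything between the two literal parts.
whole = re.fullmatch(r"start.*end", text) is not None  # True

# Zero repetitions are allowed, so .* also matches the empty string.
matches_empty = re.fullmatch(r".*", "") is not None    # True

# By default the dot does not match a line break, so .* stops there.
first_line = re.match(r".*", "one\ntwo").group(0)      # 'one'
```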

Braces (also called quantifiers) specify the number of times a character or a group of characters can be repeated. Examples:

  • [0-9]{3} - matches exactly 3 digits.
  • [a-z]{2,} - matches 2 or more times
  • [A-Z]{2,5} - matches between 2 and 5 times
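The three quantifier forms can be checked against whole strings. This sketch assumes Python's `re` module and uses `fullmatch` so the repetition count applies to the entire input; the variable names are illustrative. Note that in Python a space inside the braces, as in `{2, 5}`, is not read as a quantifier but as literal text.

```python
import re

three_digits = re.compile(r"[0-9]{3}")   # exactly 3
at_least_two = re.compile(r"[a-z]{2,}")  # 2 or more
two_to_five = re.compile(r"[A-Z]{2,5}")  # between 2 and 5

exactly = [bool(three_digits.fullmatch(s)) for s in ("12", "123", "1234")]
# [False, True, False]

open_ended = [bool(at_least_two.fullmatch(s)) for s in ("a", "ab", "abcdefgh")]
# [False, True, True]

bounded = [bool(two_to_five.fullmatch(s)) for s in ("A", "AB", "ABCDE", "ABCDEF")]
# [False, True, True, False]

# With a space, Python treats the braces as ordinary characters.
spaced = re.compile(r"[A-Z]{2, 5}")
spaced_as_quantifier = spaced.fullmatch("AB") is not None    # False
spaced_as_literal = spaced.fullmatch("A{2, 5}") is not None  # True
```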