Regular expressions are a fundamental concept in theoretical computer science. A look at the Wikipedia page for “Regular expression” shows how important they are for understanding computer science.
But for a beginner web engineer, the simple explanation is that a regular expression is, in a nutshell, just a “pattern”.
Regular expression is often abbreviated to regex.
(To be updated…) Here is some commonly used regex syntax.
\d
: A digit.
\w
: An alphanumeric character or underscore (a “word” character).
{}
: Repetition. {2} means “repeat twice”, and {3,5} means “repeat 3 to 5 times.”
[6-9]
: A digit from 6 to 9.
()
: Makes a group.
(a|b)
: The character a or b.
.oo
: Any one character (.) followed by oo, like “foo”.
^The
: Starts with the characters “The”. ^ anchors the start; $ anchors the end.
A backslash is used for escaping.
\.
: The literal dot character “.”.
Regular expressions are not good at expressing “does not contain”.
But we can exclude specific characters with a negated character class, [^...].
In most cases the details depend on the implementation, I think.
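To make the syntax above concrete, here is a minimal sketch using Python's built-in re module. The helper name first_match and all sample strings are made up for illustration; other regex engines may behave differently on some of these patterns.

```python
import re

def first_match(pattern, text):
    """Return the first substring of text that matches pattern, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# (pattern, made-up sample text) pairs covering the syntax listed above.
samples = [
    (r"\d", "a1b"),              # a digit
    (r"\w{2}", "ab-cd"),         # exactly two word characters
    (r"\d{3,5}", "12 1234567"),  # 3 to 5 digits (greedy, so it takes 5)
    (r"[6-9]", "1289"),          # a digit from 6 to 9
    (r"(a|b)", "cab"),           # a or b
    (r".oo", "foo"),             # any one character followed by "oo"
    (r"^The", "The end"),        # "The" at the start of the string
    (r"\.", "a.b"),              # a literal dot
    (r"[^0-9]+", "abc123"),      # one or more characters that are not digits
]

for pattern, text in samples:
    print(pattern, repr(text), "->", first_match(pattern, text))
```

The patterns are written as raw strings (r"...") so that Python itself does not interpret the backslashes before the regex engine sees them.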
In NLP, for example, we sometimes want to remove punctuation. If you use Python, the following page is very helpful.
https://stackoverflow.com/questions/265960/best-way-to-strip-punctuation-from-a-string
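As a rough sketch of what punctuation stripping can look like in Python, the two functions below remove ASCII punctuation, one with a regex character class and one with str.translate. The function names are hypothetical, and only the characters in string.punctuation are covered, so Unicode punctuation such as curly quotes is left untouched.

```python
import re
import string

# Character class built from ASCII punctuation; re.escape keeps
# characters such as ], ^, - and \ literal inside the brackets.
PUNCT_RE = re.compile("[" + re.escape(string.punctuation) + "]")

def strip_punctuation_regex(text):
    """Remove ASCII punctuation using a regular expression."""
    return PUNCT_RE.sub("", text)

def strip_punctuation_translate(text):
    """Remove ASCII punctuation without regex, via a translation table."""
    return text.translate(str.maketrans("", "", string.punctuation))

sentence = "Hello, world! It's 5:30."
print(strip_punctuation_regex(sentence))      # Hello world Its 530
print(strip_punctuation_translate(sentence))  # Hello world Its 530
```

Both give the same result here; the translate version avoids the regex engine entirely, which may matter on large corpora.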
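To extract IPv4-like addresses from a file with grep (POSIX extended regex does not define \d, so [0-9] is used here):
grep -E "([0-9]{1,3}\.){3}[0-9]{1,3}" --only-matching subnet.txt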
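For comparison, the same idea as the grep command above can be sketched in Python. The function name find_ipv4_like and the sample text are assumptions for illustration. The pattern only checks the shape "1 to 3 digits, dot, repeated", so it does not validate that each number is in the range 0 to 255.

```python
import re

# (?: ... ) is a non-capturing group, so findall returns whole matches
# instead of only the last repeated group.
IPV4_LIKE = re.compile(r"(?:\d{1,3}\.){3}\d{1,3}")

def find_ipv4_like(text):
    """Return every substring shaped like a dotted IPv4 address."""
    return IPV4_LIKE.findall(text)

sample = "gateway 192.168.0.1 mask 255.255.255.0"
print(find_ipv4_like(sample))  # ['192.168.0.1', '255.255.255.0']
```

This plays the role of grep's --only-matching option: only the matched parts are returned, not the whole line.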
https://www.regular-expressions.info/wordboundaries.html
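A small sketch of the word boundary \b described on the page above, using Python's re module. The function names and the sample sentence are made up for illustration.

```python
import re

def count_whole_word(word, text):
    """Count occurrences of word that are not part of a longer word."""
    # \b matches the position between a word character and a non-word one.
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text))

def count_substring(word, text):
    """Count every occurrence of word, including inside longer words."""
    return len(re.findall(re.escape(word), text))

text = "cat concatenate cat's category"
print(count_whole_word("cat", text))  # 2: "cat" and the "cat" in "cat's"
print(count_substring("cat", text))   # 4
```

The apostrophe is not a word character, so "cat's" still counts as a whole-word match for "cat".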
[^/]+
: Matches up to the next slash /: any character that is not /, repeated one or more times.
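As a quick illustration of [^/]+, the sketch below splits a path into its segments with Python's re module. The function names and sample paths are hypothetical.

```python
import re

def path_segments(path):
    """Return every run of one or more non-slash characters in path."""
    return re.findall(r"[^/]+", path)

def first_segment(path):
    """Return the text before the first slash, or None if path starts with /."""
    m = re.match(r"[^/]+", path)
    return m.group(0) if m else None

print(path_segments("/usr/local/bin"))  # ['usr', 'local', 'bin']
print(first_segment("usr/local/bin"))   # usr
```

Because + requires at least one character, empty segments (for example between two consecutive slashes) are never returned.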
https://www.oreilly.com/library/view/regular-expressions-cookbook/9781449327453/