Regular expressions

Regular expressions are a powerful way of manipulating text. Advanced Renamer supports regular expressions for pattern searching and replacing in several methods. They are aimed primarily at power users and people with programming experience, but learning the basics is rewarding for anyone. Advanced Renamer uses a standard library called PCRE, so people who already know this library will feel right at home, and those learning it for the first time will be able to use the same skills in other similar tools.

This page gives you a basic introduction to using regular expressions in the context of file renaming.

A simple regular expression

A regular expression contains normal characters and metacharacters. Normal characters are matched literally, while metacharacters have a special meaning. Let's start out with a simple expression:

zip_\d\d\d\d

Given the expression above, the match in the filename "BayTower_zip_4500.txt" will be "zip_4500". The "\d" is a metacharacter that represents any digit from 0 to 9. The expression matches any text consisting of "zip_" followed by 4 digits.
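The match described above can be tried outside Advanced Renamer. The following is a minimal sketch in Python, whose re module is not PCRE but treats this simple pattern the same way. The helper name find_zip is made up for this illustration.

```python
import re

# The pattern from the text: "zip_" followed by exactly four digits.
ZIP_PATTERN = re.compile(r"zip_\d\d\d\d")

def find_zip(filename):
    """Return the part of the filename that matches, or None if nothing matches."""
    match = ZIP_PATTERN.search(filename)
    return match.group(0) if match else None

print(find_zip("BayTower_zip_4500.txt"))  # the matched text
print(find_zip("BayTower_zip_45.txt"))    # only two digits, so no match
```

The second call returns None because the pattern needs four digits after "zip_".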

How can we use this for renaming?

The most common method with regex support is the Replace method. If you type the above expression "zip_\d\d\d\d" in the first text field and "zip_unknown" in the second text field, then in any filename containing "zip_" followed by 4 digits, that text will be replaced by "zip_unknown".
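The effect of such a replacement can be sketched in Python. This is only an approximation of what the Replace method does, using Python's re module rather than PCRE. The function name replace_zip is hypothetical.

```python
import re

def replace_zip(filename):
    """Replace "zip_" plus four digits with the fixed text "zip_unknown"."""
    # First argument: the text to be replaced. Second argument: the replacement.
    return re.sub(r"zip_\d\d\d\d", "zip_unknown", filename)

print(replace_zip("BayTower_zip_4500.txt"))
print(replace_zip("BayTower_zip_45.txt"))  # no match, name is left unchanged
```

A filename that does not contain the pattern is returned as it was.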

Sequences

What if we don't know how many digits a zip code consists of? What if some files contain "zip_123" and others "zip_384739"? The above expression always matches exactly 4 digits: it will not match "zip_123" at all, and in "zip_384739" it will match only "zip_3847". If we don't know how many digits there are, we can use another metacharacter. Consider this expression:

zip_\d+

The plus sign + matches the previous character or metacharacter 1 or more times, which means that this expression will match "zip_123" and "zip_1234" and "zip_8000000".
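A short Python sketch shows how the plus sign adapts to the number of digits. Python's re module stands in for PCRE here, and the helper name find_zip_any_length is an assumption made for this example.

```python
import re

def find_zip_any_length(filename):
    """Return "zip_" plus all digits that directly follow it, or None."""
    match = re.search(r"zip_\d+", filename)
    return match.group(0) if match else None

for name in ["a_zip_123.txt", "b_zip_1234.txt", "c_zip_8000000.txt", "d_zip_.txt"]:
    print(name, "->", find_zip_any_length(name))
```

The last name has no digit after "zip_", so it does not match: the plus sign requires at least one.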

Grouping

It is possible to define subpatterns within the pattern itself, which proves very useful when using the Replace method. A group is defined by enclosing a part of the pattern in parentheses (). Given this pattern:

zip_(\d+)

The match is exactly the same as before, except that now we can access the value of the subpattern when replacing. If we put "\1_zip" in the second text field of the Replace method, the two parts of the matched text change places. If the filename contains "zip_123", it will contain "123_zip" after the method has been applied. The value of the special metacharacter \1 is in this case "123". If more than one group is used, the next group is referred to as \2.
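The group reference can be tried with a small Python sketch. Python's re module also writes the first group as \1 in the replacement text, but it is not the PCRE library, so treat this as an illustration only. The function name move_zip_suffix is hypothetical.

```python
import re

def move_zip_suffix(filename):
    """Turn "zip_<digits>" into "<digits>_zip" using the captured group."""
    # (\d+) captures the digits; \1 in the replacement inserts them again.
    return re.sub(r"zip_(\d+)", r"\1_zip", filename)

print(move_zip_suffix("BayTower_zip_123.txt"))
```

Only the matched part is rewritten. The rest of the filename stays where it is.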

A more advanced example

If we have a filename like "Michael Jackson - Thriller.mp3" and we want to change it to "Thriller - Michael Jackson.mp3", we can apply a regular expression in the Replace method like this:

Text to be replaced: (.*) - (.*)
Replace with: \2 - \1

In this case some familiar notation is used, but also two new metacharacters. The dot matches any character, digit or non-digit. The star * matches the previous character 0 or more times. The pattern is built of two similar groups separated by " - ". The first group matches "Michael Jackson" and puts it into \1, while the second group matches "Thriller" and puts it into \2. Because the first group is replaced by the value of the second and vice versa, the two parts of the filename change places.
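The swap can be sketched in Python as follows. Two assumptions are made here: the pattern is applied to the name without its extension (the extension is split off with os.path.splitext), and Python's re module is used in place of PCRE. The function name swap_artist_title is made up for the example.

```python
import os
import re

def swap_artist_title(filename):
    """Swap the two parts around " - ", leaving the extension untouched."""
    # Assumption: the pattern is applied to the name without the extension.
    stem, extension = os.path.splitext(filename)
    swapped = re.sub(r"(.*) - (.*)", r"\2 - \1", stem)
    return swapped + extension

print(swap_artist_title("Michael Jackson - Thriller.mp3"))
print(swap_artist_title("A - B - C.mp3"))
```

The star is greedy, so when the name contains " - " more than once, the first group takes everything up to the last " - ". In the second call the first group is "A - B" and the second is "C".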

Metacharacters

\w Word character [a-zA-Z0-9_]
\W Non-word character [^a-zA-Z0-9_]
\d Digit character [0-9]
\D Non-digit character [^0-9]
. Any character
^ Start of line (start of filename)
$ End of line (end of filename)
[...] Characters contained in list. E.g. [abcd]
[^...] Characters not contained in list. E.g. [^abcd]
(...) Subpattern group. E.g. (.*)
(...|...) Subpattern group alternation. E.g. (\d*|\D*)
* Match previous character or metacharacter 0 or more times
+ Match previous character or metacharacter 1 or more times
? Match previous character or metacharacter 0 or 1 times
{n} Match previous character or metacharacter exactly n times. E.g. \d{4}
{n,} Match previous character or metacharacter at least n times. E.g. \d{3,}
{n,m} Match previous character or metacharacter at least n times but no more than m times. E.g. \d{3,5}
\1 Value of subpattern 1, for use in the replacement text (\2 for the second subpattern, and so on)
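A few of the entries above can be checked with a short Python sketch. Python's re module is used here as a stand-in for PCRE. The re.ASCII flag is passed so that \d and \w are limited to the ASCII ranges shown in the table, which is an assumption about how the table is meant to be read. The helper name first_match and the sample names are hypothetical.

```python
import re

def first_match(pattern, text):
    """Return the first text matched by pattern, or None if there is no match."""
    match = re.search(pattern, text, flags=re.ASCII)
    return match.group(0) if match else None

examples = [
    (r"^\d{4}", "2021_report"),      # four digits at the start of the name
    (r"\d{3,5}", "id_1234567"),      # three to five digits, as many as possible
    (r"colou?r", "color_chart"),     # the "u" may occur 0 or 1 times
    (r"(jpg|png)$", "holiday.png"),  # either alternative at the end of the name
    (r"[^abcd]+", "abc_xyz"),        # a run of characters not in the list
    (r"\w+", "my-file"),             # word characters stop at the hyphen
]

for pattern, text in examples:
    print(pattern, "on", text, "->", first_match(pattern, text))
```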