Using RegEx (Regular Expression Extractor) with JMeter
JMeter, one of the most popular open source performance testing tools, can work with regular expressions through the Regular Expression Extractor. A regular expression is a pattern that describes a piece of text, which makes it possible to find and extract exactly the part of a text you need. Regular expressions are popular when testing web applications because they can be used to validate a response from a web application and to perform operations on it.
In JMeter, the Regular Expression Extractor is useful for extracting information from the response. For example, you may request a page and then need to get a link from the page that was downloaded. Another use case is saving the extracted information to a variable so it can be used later in the performance test, for example when testing an application that uses token authentication, such as CSRF/XSRF tokens.
In this article I’m going to share how to use the Regular Expression Extractor in JMeter.
I created a very simple test plan, shown in Figure 1:
You may notice one unfamiliar element in the image; it is the Regular Expression Extractor post-processor. Let's look at it more closely, in Figure 2:
Regular Expression Extractor Syntax
When configuring regular expressions in JMeter, use the same syntax as Perl5. There is, however, one very important difference between how JMeter and Perl process regular expressions. In Perl you put the pattern between “//” delimiters, so a match might appear like this: =~ /regular_expression/. In JMeter you must not wrap the pattern in “//”; if you do, the slashes are treated as literal characters that have to appear in the text, not as delimiters. Grouping works the same way as in Perl: use “()” parentheses to separate one group from another.
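As a rough illustration outside JMeter, the sketch below uses Python's re module, which handles the basic constructs shown here (literal characters and groups) the same way Perl5 syntax does. The response fragment and the pattern are made up for this example. It shows that the bare pattern finds the group, while the same pattern wrapped in slashes looks for literal slash characters and fails.

```python
import re

# Hypothetical response fragment, invented for this illustration.
response = '<a href="/economics/">Economics</a>'

# Bare pattern, the way it would be typed into the extractor field.
bare = r'href="/(\w+)/"'
# The same pattern wrapped in Perl-style delimiters: the outer slashes
# are now ordinary characters that must be present in the text.
delimited = '/' + bare + '/'

bare_match = re.search(bare, response)
delimited_match = re.search(delimited, response)

print(bare_match.group(1))  # text captured by the first group
print(delimited_match)      # no match, because of the literal slashes
```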
Configuring the Regular Expression Extractor
Now I will briefly describe each of this element's fields.
“Apply to” radio button
You can choose whether the regular expression will be applied to the main sample results, to the sub-samples/embedded resources, or both.
The possible options are:
- Main sample only - only applies to the main sample
- Sub-samples only - only applies to the sub-samples
- Main sample and sub-samples - applies to both main sample and sub-samples
- JMeter Variable - applies the expression to the contents of the JMeter variable whose name you fill in
“Field to check” radio button
You can choose which field the regular expression is applied to. The possible options are:
- Body - the body of the response. The content of your web-page, excluding headers, will be parsed with the regular expression.
- Body (unescaped) - the body of the response, with all HTML escape codes replaced. Note that HTML escapes are processed without regard to context, so some incorrect substitutions may be made.
- Headers - the headers of the response or the request
- URL - the URL of the request
- Response Code - e.g. 200
- Response Message - e.g. OK
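To make the difference between Body and Body (unescaped) more concrete, here is a small sketch. It uses Python's html.unescape as a stand-in for JMeter's own unescaping, which is an assumption: the two may differ in details. The response text and the pattern are invented for the example.

```python
import html
import re

# Hypothetical response body containing HTML escape codes.
raw_body = '<p>Price: 10 &lt; 20 &amp;&amp; &quot;cheap&quot;</p>'

# Stand-in for the "Body (unescaped)" option: every escape code is
# replaced, regardless of where it appears in the document.
unescaped_body = html.unescape(raw_body)

# A pattern written against the text as a user would read it.
pattern = r'10 < 20 && "(\w+)"'

raw_match = re.search(pattern, raw_body)
unescaped_match = re.search(pattern, unescaped_body)

print(raw_match)                 # None: the raw body still holds &lt; etc.
print(unescaped_match.group(1))  # captured word from the unescaped body
```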
Name of Created Variable - the name of the variable where the parsing results will be saved in JMeter.
Regular Expression - fill in the regular expression to test.
Template - choose the group you would like to extract from the regular expression. '$1$' will extract group 1, '$2$' will extract group 2, and so on. $0$ will extract the entire expression. For example, if you have the word “economics” in your response and you search for the regular expression “(ec)(onomics)” and apply template $2$$1$, then in the output variable you will receive “onomicsec”. If you apply template $0$, then in the output variable you will receive "economics".
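The template rule can be sketched in a few lines of Python. This is only an illustration of the substitution described above, not JMeter's implementation; the helper name apply_template and the sample text are made up for the example.

```python
import re


def apply_template(template, match):
    """Replace every $n$ placeholder with the text of group n of the match."""
    return re.sub(
        r'\$(\d+)\$',
        lambda placeholder: match.group(int(placeholder.group(1))),
        template,
    )


# Hypothetical response text containing the word from the example above.
match = re.search(r'(ec)(onomics)', 'world economics news')

swapped = apply_template('$2$$1$', match)  # group 2 followed by group 1
whole = apply_template('$0$', match)       # the entire matched text

print(swapped)  # onomicsec
print(whole)    # economics
```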
Match No. - if the regular expression matches several character sequences, this field specifies which of the matches should be used. An important note: if you set “Apply to” to “Main sample and sub-samples” and specify Match No. = 3, then JMeter will select the matching sequence from the 2nd sub-sample, because the 1st match comes from the main sample (this assumes each sample contains exactly one match). If zero is specified, JMeter will choose a match at random.
If the match number is set to a negative number, then all the possible matches in the sampler data are processed. The variables are set as follows:
- refName_matchNr - the number of matches found; could be 0
- refName_n, where n = 1, 2, 3, etc. - the strings as generated by the template
- refName_n_gm, where m = 0, 1, 2 - the groups for match n
- refName - always set to the default value
- refName_gn - not set
To summarize, Match No. indicates which match to use when the regular expression matches more than once. A value of zero tells JMeter to choose a match at random, a positive number N selects the Nth match, and a negative number processes all matches.
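The match number rules can be mimicked with a short Python sketch. It is a simplified model, not JMeter's code: the function name extract, the default value "NOT_FOUND" and the sample text are assumptions made for the example, and the template is fixed to $1$ (group 1). For non-negative match numbers only the main variable is modelled.

```python
import random
import re


def extract(ref_name, pattern, text, match_no, default="NOT_FOUND"):
    """Return a dict of variables following the match number rules above.

    Simplification: the template is always $1$, i.e. group 1.
    """
    matches = list(re.finditer(pattern, text))
    variables = {ref_name: default}

    if match_no < 0:
        # Negative: process every match; refName keeps the default value.
        variables[f"{ref_name}_matchNr"] = str(len(matches))
        for n, m in enumerate(matches, start=1):
            variables[f"{ref_name}_{n}"] = m.group(1)
            for g in range(m.re.groups + 1):
                variables[f"{ref_name}_{n}_g{g}"] = m.group(g)
    elif match_no == 0 and matches:
        # Zero: pick one of the matches at random.
        variables[ref_name] = random.choice(matches).group(1)
    elif 0 < match_no <= len(matches):
        # Positive N: take the Nth match (1-based).
        variables[ref_name] = matches[match_no - 1].group(1)
    return variables


# Hypothetical response text with three matches.
text = "id=11 id=22 id=33"
pattern = r"id=(\d+)"

third = extract("id", pattern, text, 3)
all_vars = extract("id", pattern, text, -1)

print(third)
print(all_vars)
```

With this sample text, a match number larger than 3 leaves the variable at its default value, because there is no such match.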
That covers all the options of the Regular Expression Extractor. Now I will show a couple of practical examples. In all of them I will use the same URL as the source for extracting a string with a regular expression, see Figure 3.
After extraction, the string is stored in the variable pageLink, which is referenced as ${pageLink} in the “pageLink” HTTP Request, as displayed in Figure 4.
Searching by word. If the regular expression you need to extract is a single word, then fill in the Regular Expression Extractor as in Figure 5.
After executing the “tut.by” request and applying the regular expression, we get ${pageLink} = economics, and that value is used in the “pageLink” request, Figure 6.
Using groups. You can rearrange parts of a match using groups. For example, you need to find the word “economics”, but before putting it into ${pageLink} you want to rearrange the parts of the word. Look at Figure 7 for the syntax.
And here is what we get in View Results Tree.
Using classes in regexps. Regular expressions can use classes of characters. For example, [0-9] means “any single digit”. If I set the regexp as in Figure 9, then I will receive the 3rd match from the response body.
“{5,6}” means that the preceding element must be repeated no fewer than 5 and no more than 6 times, so the result contains 5 or 6 characters. What we get in View Results Tree is shown in Figure 10.
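A small Python sketch of a digit class combined with the “{5,6}” quantifier follows. The exact expression from Figure 9 is not reproduced here; the pattern [0-9]{5,6} and the sample text are assumptions chosen to show the same idea, including how the third match is picked.

```python
import re

# Hypothetical response body with numbers of different lengths.
body = "ids: 123, 48213, 7, 990017, 55555, 1234567"

# Runs of 5 or 6 digits. Without word boundaries, the first six digits
# of a longer number also count as a match.
matches = re.findall(r"[0-9]{5,6}", body)

# Match No. = 3 corresponds to the third element (1-based numbering).
third = matches[2]

print(matches)  # ['48213', '990017', '55555', '123456']
print(third)    # 55555
```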
Using “^”. Inside square brackets, a leading “^” means negation, e.g. the regular expression [^0-9] matches any non-numeric character. (Outside square brackets, “^” anchors the match to the start of the text instead.) So, I’ll set the regexp as in Figure 11.
And in View Results Tree I get a very interesting situation, Figure 12.
What happened? Look at Figure 13.
We caught a “carriage return” character, and this is the reason for the java.net.MalformedURLException. To repair the regexp, I’ll add “<” before it and restart the test. Now it’s OK.
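The effect can be reproduced in a small Python sketch. The expression from Figure 11 is not reproduced here; the pattern and the response text below are assumptions that show the same kind of problem. The tighter pattern, which also excludes whitespace from the class, is just one possible alternative and is not the repair used in this article.

```python
import re

# Hypothetical response: the first "economics" is followed by a line break.
body = 'section: economics\r\n<a href="/economics/">Economics</a>'

# A negated class accepts any non-digit, including the carriage return.
loose = re.search(r"(economics[^0-9])", body)

# Alternative sketch: also exclude whitespace, so the first occurrence
# is skipped and the match comes from the link instead.
tight = re.search(r"(economics[^0-9\s])", body)

print(repr(loose.group(1)))  # the captured text ends with '\r'
print(repr(tight.group(1)))  # the captured text ends with '/'
```

A value that ends in a carriage return is not a valid part of a URL, which is consistent with the exception described above.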
Of course, I cannot cover every possible case of using regular expressions in one article. For more information, refer to the JMeter Regular Expressions Tutorial, which covers the topic in much more detail.
JMeter uses Jakarta ORO for regular expression processing. You can quickly test your regular expressions with the Jakarta ORO Demonstration Applet, which is the fastest way to see the resulting matches, groups, etc.