rex - Splunk Documentation (2024)

Description

Use this command to either extract fields using regular expression named groups, or replace or substitute characters in a field using sed expressions.

The rex command matches the value of the specified field against the unanchored regular expression and extracts the named groups into fields of the corresponding names.
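The matching behavior described above can be sketched outside of Splunk. The following is a minimal illustration in Python, not Splunk's implementation: it assumes Python's re module as a stand-in for PCRE (Python writes named groups as (?P<name>...) where SPL uses (?<name>...)), and the event text and the rex helper are hypothetical.

```python
import re

def rex(value, pattern):
    """Return the named groups from the first unanchored match, or {} if none."""
    match = re.search(pattern, value)  # search() scans the whole value (unanchored)
    return match.groupdict() if match else {}

# Hypothetical event text, not taken from the documentation.
event = "user=alice action=login status=200"
fields = rex(event, r"action=(?P<action>\w+) status=(?P<status>\d+)")
print(fields)  # {'action': 'login', 'status': '200'}
```

Because the match is unanchored, the pattern does not need to start at the beginning of the value.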

When mode=sed, the given sed expression used to replace or substitute characters is applied to the value of the chosen field. This sed-syntax is also used to mask, or anonymize, sensitive data at index-time. Read about using sed to anonymize data in the Getting Data In Manual.

If a field is not specified, the regular expression or sed expression is applied to the _raw field. Running the rex command against the _raw field might have a performance impact.

Use the rex command for search-time field extraction or string replacement and character substitution.

Syntax

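The required syntax is rex followed by either <regex-expression> or mode=sed <sed-expression>. Arguments in square brackets are optional.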

rex [field=<field>]
( <regex-expression> [max_match=<int>] [offset_field=<string>] ) | (mode=sed <sed-expression>)

Required arguments

You must specify either <regex-expression> or mode=sed <sed-expression>.

regex-expression
Syntax: "<string>"
Description: The PCRE regular expression that defines the information to match and extract from the specified field.
mode
Syntax: mode=sed
Description: Specify to indicate that you are using a sed (UNIX stream editor) expression.
sed-expression
Syntax: "<string>"
Description: When mode=sed, specify whether to replace strings (s) or substitute characters (y) in the matching regular expression. No other sed commands are implemented. Sed mode supports the following flags: global (g) and Nth occurrence (N), where N is a number that is the character location in the string.

Optional arguments

field
Syntax: field=<field>
Description: The field that you want to extract information from.
Default: _raw
max_match
Syntax: max_match=<int>
Description: Controls the number of times the regex is matched. If greater than 1, the resulting fields are multivalued fields. Use 0 to specify unlimited matches. Multiple matches apply to the repeated application of the whole pattern. If your regex contains a capture group that can match multiple times within your pattern, only the last capture group is used for multiple matches.
Default: 1
offset_field
Syntax: offset_field=<string>
Description: Creates a field that lists the position of certain values in the field argument, based on the regular expression specified in regex-expression. For example, if the rex expression is "(?<tenchars>.{10})" the first ten characters of the field argument are matched. The offset_field shows tenchars=0-9. The offset calculation always uses zero ( 0 ) for the first position. For another example, see Examples.
Default: No default

Usage

The rex command is a distributable streaming command. See Command types.

rex command or regex command?

Use the rex command to either extract fields using regular expression named groups, or replace or substitute characters in a field using sed expressions.

Use the regex command to remove results that do not match the specified regular expression.
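The difference between the two commands can be sketched in Python. This is only an analogy under stated assumptions: the events, the pattern, and the rex_like helper are hypothetical, and Python's re module stands in for PCRE.

```python
import re

# Hypothetical events for illustration only.
events = ["status=200 path=/home", "status=404 path=/missing", "heartbeat"]
pattern = re.compile(r"status=(?P<status>\d+)")

def rex_like(event):
    """Keep the event and add named-group fields when the pattern matches."""
    match = pattern.search(event)
    fields = match.groupdict() if match else {}
    return {"_raw": event, **fields}

# rex-like: every event is kept; matching events gain a status field.
extracted = [rex_like(e) for e in events]

# regex-like: events that do not match are removed; no fields are added.
filtered = [e for e in events if pattern.search(e)]

print(extracted)
print(filtered)
```

In this sketch the rex-like step returns all three events, while the regex-like step returns only the two that contain a status value.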

Regular expressions

Splunk SPL supports perl-compatible regular expressions (PCRE).

When you use regular expressions in searches, you need to be aware of how characters such as pipe ( | ) and backslash ( \ ) are handled. See SPL and regular expressions in the Search Manual.

For general information about regular expressions, see About Splunk regular expressions in the Knowledge Manager Manual.

Sed expressions

When using the rex command in sed mode, you have two options: replace (s) or character substitution (y).

The syntax for using sed to replace (s) text in your data is: "s/<regex>/<replacement>/<flags>"

  • <regex> is a PCRE regular expression, which can include capturing groups.
  • <replacement> is a string to replace the regex match. Use \n for back references, where "n" is a single digit.
  • <flags> can be either g to replace all matches, or a number to replace a specified match.

The syntax for using sed to substitute characters is: "y/<string1>/<string2>/"

  • This substitutes the characters that match <string1> with the characters in <string2>.
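As a rough analogy for the two sed options, the sketch below uses Python's re.sub for s/.../.../g and str.translate for y/.../.../. The helper names and sample strings are hypothetical, and the numeric flag is not modeled here.

```python
import re

def sed_replace_all(value, regex, replacement):
    """Analogue of s/<regex>/<replacement>/g: replace every match."""
    return re.sub(regex, replacement, value)

def sed_translate(value, string1, string2):
    """Analogue of y/<string1>/<string2>/: map characters one-to-one."""
    return value.translate(str.maketrans(string1, string2))

# \1 in the replacement refers back to the first capturing group.
print(sed_replace_all("id=123 id=456", r"id=(\d+)", r"id=<\1>"))  # id=<123> id=<456>
print(sed_translate("hello world", "lo", "01"))                   # he001 w1r0d
```

The y form works character by character, so the two strings must be the same length in this sketch.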

When using the rex command in sed mode, the rex command supports the same sed expressions as the SEDCMD setting in the props.conf file.

Anonymize multiline text using sed expressions

The Splunk platform doesn't support applying sed expressions in multiline mode. To use a sed expression to anonymize multiline events, use 2 sed expressions in succession by first removing the newlines and then performing additional replacements. For example, the following search uses the rex command to replace all newline characters in a multiline event containing HTML content, and then redacts all of the HTML content.

index=main html | rex mode=sed field=_raw "s/\\n/NEWLINE_REMOVED/g" | rex mode=sed field=_raw "s/<html.*html>/REDACTED/g"
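The reason for the two-step approach can be illustrated with a small Python sketch. It assumes Python's default regex behavior, where "." does not match a newline, as an analogy for the lack of multiline support; the sample event is hypothetical.

```python
import re

# Hypothetical multiline event containing HTML content.
raw = "header\n<html>\n<body>secret</body>\n</html>\nfooter"

# A single pass finds nothing to redact: "." stops at each newline.
one_pass = re.sub(r"<html.*html>", "REDACTED", raw)

# Two passes, mirroring the search above: remove newlines, then redact.
flattened = re.sub(r"\n", "NEWLINE_REMOVED", raw)
two_pass = re.sub(r"<html.*html>", "REDACTED", flattened)

print(one_pass == raw)  # True
print(two_pass)         # headerNEWLINE_REMOVEDREDACTEDNEWLINE_REMOVEDfooter
```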

Examples

1. Extract email values using regular expressions

Extract email values from events to create from and to fields in your events. For example, you have events such as:

Mon Mar 19 20:16:27 2018 Info: Bounced: DCID 8413617 MID 19338947 From: <MariaDubois@example.com> To: <zecora@buttercupgames.com> RID 0 - 5.4.7 - Delivery expired (message too old) ('000', ['timeout'])
Mon Mar 19 20:16:03 2018 Info: Delayed: DCID 8414309 MID 19410908 From: <WeiZhang@example.com> To: <mcintosh@buttercupgames.com> RID 0 - 4.3.2 - Not accepting messages at this time ('421', ['4.3.2 try again later'])
Mon Mar 19 20:16:02 2018 Info: Bounced: DCID 0 MID 19408690 From: <Exit_Desk@sample.net> To: <lyra@buttercupgames.com> RID 0 - 5.1.2 - Bad destination host ('000', ['DNS Hard Error looking up mahidnrasatyambsg.com (MX): NXDomain'])
Mon Mar 19 20:15:53 2018 Info: Delayed: DCID 8414166 MID 19410657 From: <Manish_Das@example.com> To: <dash@buttercupgames.com> RID 0 - 4.3.2 - Not accepting messages at this time ('421', ['4.3.2 try again later'])

When the events were indexed, the From and To values were not identified as fields. You can use the rex command to extract the field values and create from and to fields in your search results.

The sender and recipient values in the _raw events follow an identical pattern. Each sender address is preceded by From: and each recipient address is preceded by To:. The email addresses are enclosed in angle brackets. You can use this pattern to create a regular expression to extract the values and create the fields.

source="cisco_esa.txt" | rex field=_raw "From: <(?<from>.*)> To: <(?<to>.*)>"
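To see what this regular expression captures, the following Python sketch applies an equivalent pattern to the first sample event. It is an approximation: Python's re module stands in for PCRE and uses (?P<name>...) for named groups.

```python
import re

# Same pattern as the search above, rewritten with Python's named-group syntax.
pattern = re.compile(r"From: <(?P<from>.*)> To: <(?P<to>.*)>")

event = (
    "Mon Mar 19 20:16:27 2018 Info: Bounced: DCID 8413617 MID 19338947 "
    "From: <MariaDubois@example.com> To: <zecora@buttercupgames.com> "
    "RID 0 - 5.4.7 - Delivery expired (message too old) ('000', ['timeout'])"
)

fields = pattern.search(event).groupdict()
print(fields)
# {'from': 'MariaDubois@example.com', 'to': 'zecora@buttercupgames.com'}
```

The .* quantifier is greedy, so it works here only because no other closing angle bracket appears later in the event. A stricter character class such as [^>]* is one possible alternative if your events can contain additional angle brackets.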

You can remove duplicate values and return only the list of addresses by adding the dedup and table commands to the search.

source="cisco_esa.txt" | rex field=_raw "From: <(?<from>.*)> To: <(?<to>.*)>" | dedup from to | table from to

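For the four sample events shown earlier, the results look something like this:

from                     to
MariaDubois@example.com  zecora@buttercupgames.com
WeiZhang@example.com     mcintosh@buttercupgames.com
Exit_Desk@sample.net     lyra@buttercupgames.com
Manish_Das@example.com   dash@buttercupgames.com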

2. Extract from multi-valued fields using max_match

You can use the max_match argument to specify that the regular expression runs multiple times to extract multiple values from a field.

For example, use the makeresults command to create a field with multiple values:

| makeresults | eval test="a$1,b$2"


The results look something like this:

_time                test
2019-12-05 11:15:28  a$1,b$2

To extract each of the values in the test field separately, you use the max_match argument with the rex command. For example:

...| rex field=test max_match=0 "((?<field>[^$]*)\$(?<value>[^,]*),?)"

The results look something like this:

_time                field  test     value
2019-12-05 11:36:57  a      a$1,b$2  1
                     b               2
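The effect of max_match can be approximated in Python by applying the whole pattern repeatedly. This is a sketch, not Splunk's implementation: rex_multi is a hypothetical helper, and Python's re module with (?P<name>...) syntax stands in for PCRE.

```python
import re

# Same pattern as the search above, with Python's named-group syntax.
pattern = re.compile(r"((?P<field>[^$]*)\$(?P<value>[^,]*),?)")

def rex_multi(value, max_match=1):
    """Collect named groups over repeated matches; max_match=0 means unlimited."""
    results = {name: [] for name in pattern.groupindex}
    for count, match in enumerate(pattern.finditer(value), start=1):
        for name, captured in match.groupdict().items():
            results[name].append(captured)
        if max_match and count >= max_match:
            break
    return results

print(rex_multi("a$1,b$2", max_match=0))  # {'field': ['a', 'b'], 'value': ['1', '2']}
print(rex_multi("a$1,b$2"))               # {'field': ['a'], 'value': ['1']}
```

With the default of one match, only the first pair is extracted, which is why the search above sets max_match=0.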

3. Extract values from a field in scheduler.log events

Extract user, app, and SavedSearchName from a field called savedsearch_id in scheduler.log events. If savedsearch_id=bob;search;my_saved_search, then user=bob, app=search, and SavedSearchName=my_saved_search.

... | rex field=savedsearch_id "(?<user>\w+);(?<app>\w+);(?<SavedSearchName>\w+)"
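A quick way to check this pattern is to run an equivalent expression in Python against the sample value. This is a sketch that assumes Python's re module as a stand-in for PCRE.

```python
import re

savedsearch_id = "bob;search;my_saved_search"

# Same pattern as above, with Python's (?P<name>...) named-group syntax.
match = re.search(
    r"(?P<user>\w+);(?P<app>\w+);(?P<SavedSearchName>\w+)", savedsearch_id
)
fields = match.groupdict()
print(fields)
# {'user': 'bob', 'app': 'search', 'SavedSearchName': 'my_saved_search'}
```

Because \w matches only letters, digits, and underscores, a saved search name containing spaces or hyphens would be captured only up to the first such character.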

4. Use a sed expression

Use sed syntax to match the regex to a series of numbers and replace them with an anonymized string.

... | rex field=ccnumber mode=sed "s/(\d{4}-){3}/XXXX-XXXX-XXXX-/g"
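The replacement can be sketched in Python with re.sub. The mask_ccnumber helper and the sample number are hypothetical, and Python's re module stands in for PCRE.

```python
import re

def mask_ccnumber(ccnumber):
    """Replace the first three 4-digit groups and keep the final group."""
    return re.sub(r"(\d{4}-){3}", "XXXX-XXXX-XXXX-", ccnumber)

# Hypothetical card number for illustration only.
print(mask_ccnumber("1234-5678-9012-3456"))  # XXXX-XXXX-XXXX-3456
```

The pattern matches three repetitions of four digits followed by a hyphen, so the last four digits remain visible.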

5. Use a sed expression with capture replace for strings

This example shows how to use the rex command sed expression with capture replace using \1, \2 to reuse captured pieces of a string.

This search creates an event with three fields, _time, search, and orig_search. The regular expression removes the quotation marks and any leading or trailing spaces around the quotation marks.

| makeresults | eval orig_search="src_ip=TERM( \"10.8.2.33\" ) OR src_ip=TERM( \"172.17.154.197\" )", search=orig_search | rex mode=sed field=search "s/\s\"(\d+\.\d+\.\d+\.\d+)\"\s/\1/g"

The results look like this:

_time                orig_search                                                    search
2021-05-31 23:36:29  src_ip=TERM( "10.8.2.33" ) OR src_ip=TERM( "172.17.154.197" )  src_ip=TERM(10.8.2.33) OR src_ip=TERM(172.17.154.197)
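The same capture-and-replace step can be reproduced in Python as a sanity check. This is a sketch using re.sub, with Python's re module standing in for PCRE.

```python
import re

orig_search = 'src_ip=TERM( "10.8.2.33" ) OR src_ip=TERM( "172.17.154.197" )'

# \1 reinserts the captured IP address without the quotes and padding spaces.
search = re.sub(r'\s"(\d+\.\d+\.\d+\.\d+)"\s', r"\1", orig_search)
print(search)  # src_ip=TERM(10.8.2.33) OR src_ip=TERM(172.17.154.197)
```

Only the whitespace adjacent to the quotation marks is consumed, so the spaces around OR are preserved.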

6. Use an offset_field

To identify the position of certain values in a field, use the rex command with the offset_field argument and a regular expression.

The following example starts with the makeresults command to create a field with a value:

| makeresults | eval list="abcdefghijklmnopqrstuvwxyz"

The results look something like this:

_time                list
2022-05-21 11:36:57  abcdefghijklmnopqrstuvwxyz

Add the rex command with the offset_field argument to the search to create a field called off. You can identify the position of the first five values in the field list using the regular expression "(?<firstfive>abcde)". For example:

| makeresults | eval list="abcdefghijklmnopqrstuvwxyz" | rex offset_field=off field=list "(?<firstfive>abcde)"

The results look something like this:

_time                firstfive  list                        off
2022-05-21 11:36:57  abcde      abcdefghijklmnopqrstuvwxyz  firstfive=0-4

You can identify the position of several of the middle values in the field list using the regular expression "(?<middle>fgh)". For example:

| makeresults | eval list="abcdefghijklmnopqrstuvwxyz" | rex offset_field=off field=list "(?<middle>fgh)"

The results look something like this:

_time                list                        middle  off
2022-05-21 11:36:57  abcdefghijklmnopqrstuvwxyz  fgh     middle=5-7
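The zero-based, inclusive offsets shown in these results can be reproduced with a small Python sketch. The offset_of helper is hypothetical and handles a single named group; Python's re module stands in for PCRE.

```python
import re

def offset_of(value, pattern, name):
    """Return '<name>=<start>-<end>' using zero-based, inclusive positions."""
    match = re.search(pattern, value)
    if match is None:
        return None
    start, end = match.span(name)  # Python's end position is exclusive
    return f"{name}={start}-{end - 1}"

letters = "abcdefghijklmnopqrstuvwxyz"
print(offset_of(letters, r"(?P<firstfive>abcde)", "firstfive"))  # firstfive=0-4
print(offset_of(letters, r"(?P<middle>fgh)", "middle"))          # middle=5-7
```

The subtraction of 1 converts Python's exclusive end position to the inclusive end position that the off field displays.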

7. Display IP address and ports of potential attackers

Display the IP addresses and ports of potential attackers.

sourcetype=linux_secure port "failed password" | rex "\s+(?<ports>port \d+)" | top src_ip ports showperc=0

This search uses the rex command to extract the ports field and its values. The search returns a table that lists the top source IP addresses (src_ip) and ports of the potential attackers.
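To illustrate what the ports group captures, the following Python sketch applies an equivalent pattern to a hypothetical log line. The log line and IP address are invented for illustration, and Python's re module stands in for PCRE.

```python
import re

# Hypothetical linux_secure-style event, for illustration only.
line = "Failed password for invalid user admin from 192.0.2.10 port 4022 ssh2"

match = re.search(r"\s+(?P<ports>port \d+)", line)
ports = match.group("ports") if match else None
print(ports)  # port 4022
```

The captured value includes the literal word port as well as the number, because both are inside the named group.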

See also

Commands
extract
kvform
multikv
regex
spath
xmlkv