Do you ever need to use regex to filter and monitor a list of queries or URLs?

I do, pretty frequently.

My usual use-cases for regex are:

  • Monitoring a list of new URLs from a content gap
  • Checking performance for blog articles I've specifically created for a client
  • Extracting priority keywords for monitoring from the Google Search Console API (like I recently showed with my GSC rank tracker in Google Sheets)
  • Segmenting a clients site into different sections based upon URL patterns

Sometimes, the list of things I want to monitor can be quite large, and that is where my Google Sheets regex generator can help.

regex generator screenshot

The tool is pretty simple.

All you need to do is enter a list of the URLs, queries or whatever else you want to create a regex rule for, specify the match type and then copy and paste the rule it gives back into whatever tool you're using.

Below are a few different ideas for how you could use this sheet to generate regex patterns that you can input into other tools.

You could use the regex examples across any of these tools, so make sure to experiment.

Example 1: Google Analytics segments

One example use case is creating a segment within Google Analytics.

Say you've just completed a content gap and created a range of new categories pages targeting terms on your e-commerce site.

All you have to do is copy your list of URLs and paste them into the sheet with a match type of 'Exactly matches'.

regex generator google analytics

Once you've done that, head to Google Analytics and create a new segment.

google analytics new segment

After you've done that, name your segment, head to conditions, choose 'Landing page' as your dimension, 'matches regex' as your match type and then paste the regex outputted from the tool.

google analytics regex landing page match

You've now got a segment in Google Analytics monitoring those new URLs so you can quickly see the performance impact.

Example 2: Data studio CASE statements

One of the great things about Data Studio is CASE statements.

There are many ways you can use the CASE and REGEXP_MATCH functions, but one is to create a brand / non-brand split with the Google Search Console connector.

Here is an example CASE statement to create a brand / non-brand dimension in Data Studio.


CASE
  WHEN REGEXP_MATCH(Query, "(.*brand1.*|.*brand2.*|.*brand3.*|.*brand4.*|.*brand5.*)") THEN "Brand"
  ELSE "Non-brand"
ENDCode language: PHP (php)

To create the Regex within the Google Sheet, use a contains match type alongside entering a list of different brand variations in the table below.

brand regex match sheet

Next, create a calculated field in Data Studio by heading to your GSC data source field settings and selecting 'add a field' in the top right.

data studio add field

Then, give the field a name, enter the CASE statement in the example above with your regex pasted in over the example regex and click save.

data studio brand non brand calculated field

Example 3: Supermetrics

If you have a list of priority terms you're trying to optimise for, what you could do is utilise Supermetrics and filter for those specific terms to track performance.

To do that, enter the queries you want to pull data for in my sheet, set it to exactly matches and copy the regex rule.

regex generator search query exactly matches

Next, open up the Supermetrics sidebar.

image 34

Use the GSC data source and configure however you like.

Head to the filter area, set up a 'search query' filter with a 'matches regex' rule, and paste the rule you copied from the sheet.

supermetrics regex filter by query

Get the data to the table, and you'll have GSC data for just the queries you entered.

This is quite similar to how my GSC rank tracker works, but I also bypass the sidebar to make it easier to use.

How does it work?

The actual formula is quite simple. All it is a IFS statement combined with a JOIN and FILTER to exclude empty cells for the JOIN.

=CONCATENATE(
     IFS(
        C4 = "Exactly matches",
        "^",
        C4 = "Contains",
        ".*",
        C4 = "Ends with",
        "",
        C4 = "Starts with",
        "^"
    ),
     JOIN(
         IFS(
            C4 = "Exactly matches",
            "$|^",
            C4 = "Contains",
            ".*|.*",
            C4 = "Ends with",
            "$|",
            C4 = "Starts with",
            "|^"
        ),
         FILTER(
            B8:B,
            NOT(
                B8:B = ""
            ) 
        )
    )
)Code language: JavaScript (javascript)

I tried a few different ways of making it, but this seemed the simplest.

Summary

A short post, but hopefully, this is something someone finds useful! Any questions, feel free to tweet me @SamUnderwoodUK.