Tech Tip: Multi-line Perl regex pattern match
Yesterday, I encountered an analysis issue that appeared to be resolvable with a simple pattern replacement technique.
This turned out to be a bit more complex than estimated, as the pattern spanned multiple lines and I struggled to get the regex constructed to match this use case.
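I finally figured it out, and although the overall problem remains unresolved (it has many layers), this particular part works well. My error was in treating the /s and /m modifiers as mutually exclusive. They aren’t, of course: /s lets . match a newline, /m lets ^ and $ match at the start and end of every line within the string, and the two can be combined. I have made that mistake one other time over the years. I also hadn’t set the input record separator ($/) to paragraph mode, so each read returned a single line and a multi-line pattern could never match in full. I’m posting details here to document my specific usage.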
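To make the /s versus /m point concrete, here is a minimal sketch. The sample string and variable names are made up for illustration; only the modifiers themselves are standard Perl. It tries the same two-line pattern with /m alone, /s alone, and both together.

```perl
use strict;
use warnings;

# Hypothetical sample text: two lines of interest plus one trailing line.
my $text = "Node-path: branches/x\nNode-kind: dir\nrest\n";

# /m alone: ^ matches at each line start, but . stops at a newline,
# so .*? cannot carry the match from the first line to the second.
my $m_only = $text =~ /^Node-path:.*?^Node-kind:/m ? 1 : 0;

# /s alone: . crosses newlines, but ^ matches only at the start of
# the string, so the second ^ can never succeed.
my $s_only = $text =~ /^Node-path:.*?^Node-kind:/s ? 1 : 0;

# /s and /m together: . crosses newlines AND ^ matches at line starts.
my $both = $text =~ /^Node-path:.*?^Node-kind:/sm ? 1 : 0;

print "m only: $m_only, s only: $s_only, both: $both\n";
# prints "m only: 0, s only: 0, both: 1"
```

Only the combined form matches, which is why the two modifiers need to be used together for a pattern anchored on several lines.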
Assumptions:
- A literal pattern exists that is present in multiple places in a file
- The literal pattern spans multiple lines in the file
- Literal pattern is:
    Node-path: branches/%%BRANCHES%%
    Node-kind: dir
    Node-action: add
- Perl is being used to read the file and current line of file is $line
- Objective is to delete all occurrences of this pattern from the file one $line at a time
Here’s how I did that:
$/ = '';   # paragraph mode; must be set before the file is read
[read the file & crawl thru the lines]
$line =~ s/^Node-path:\sbranches\/%%BRANCHES%%.*?^Node-kind:\sdir.*?^Node-action:\sadd.*?//sm;
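For anyone who wants to try this end to end, here is a self-contained sketch under a few assumptions: the sample data is invented, an in-memory filehandle stands in for the real file, and $output is a hypothetical name for the collected result. It also departs from the substitution above in two small ways, adding /g and replacing the final .*? with \n? so the line ending after "Node-action: add" is removed too.

```perl
use strict;
use warnings;

# Hypothetical sample input: blank-line-separated blocks, one of which
# begins with the three-line literal pattern.
my $data = <<'END';
Revision-number: 1

Node-path: branches/%%BRANCHES%%
Node-kind: dir
Node-action: add
Prop-content-length: 10

Node-path: trunk
Node-kind: dir
Node-action: add
END

my $output = '';
{
    # Paragraph mode: each read returns one blank-line-delimited block.
    local $/ = '';

    # An in-memory filehandle stands in for the real file here.
    open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

    while (my $line = <$fh>) {
        # /s lets . cross newlines; /m lets ^ match at each line start.
        # /g and the trailing \n? are additions made for this sketch.
        $line =~ s/^Node-path:\sbranches\/%%BRANCHES%%.*?^Node-kind:\sdir.*?^Node-action:\sadd\n?//smg;
        $output .= $line;
    }
    close $fh;
}

print $output;
```

With this sample, the three pattern lines disappear from the second block while the similar-looking trunk block is left alone, because its Node-path line does not match the literal branch path.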