James E. McDonough

March 29, 2023 5 minute read

Enhancing Regex Toy – Part 4

This is the fourth in a series of six blogs describing how to enhance the regular expression tester known as Regex Toy, each blog describing a single enhancement to its capabilities.

Before applying the fourth patch

The preceding blog in this series described how to patch a copy of Regex Toy so that it identifies matches even when the matching text straddles implicit line breaks. It ended by describing a remaining issue: Regex Toy formats the text in the Matches block so that it runs beyond the visible area of the window, accompanied by a horizontal scroll bar. To illustrate this problem again, execute the enhanced Regex Toy and follow these steps:

Paste the following tongue twister into the Text block:

A skunk sat on a stump. The skunk thunk the stump stunk and the stump thunk the skunk stunk.

Place a check mark into the IN TABLE check box.
Select the All Occurrences button.
Specify the following string in the Regex slot of the Input block:

the skunk

Press enter.

As shown in the screen shot above, the text runs off to the right of the visible area of the Matches window, requiring the use of the horizontal scroll bar to see it.
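For readers who want to reproduce the same search outside the tool, the steps above correspond roughly to a FIND statement over an internal table. The following is only a sketch and is not taken from DEMO_REGEX_TOY: the report name z_regex_toy_sketch and all variable names are made up for illustration, and it assumes the search ignores case, as the two matches reported for this text suggest.

```abap
REPORT z_regex_toy_sketch.

" The tongue twister as a one-row table, mimicking the IN TABLE option.
DATA(text_lines) = VALUE string_table(
  ( `A skunk sat on a stump. The skunk thunk the stump stunk and the stump thunk the skunk stunk.` ) ).

" All Occurrences, case ignored (an assumption, see the note above).
FIND ALL OCCURRENCES OF REGEX `the skunk`
     IN TABLE text_lines
     IGNORING CASE
     RESULTS DATA(matches).

" Each result row reports where one match was found.
LOOP AT matches INTO DATA(match).
  WRITE / |line { match-line }, offset { match-offset }, length { match-length }|.
ENDLOOP.
```

The search itself is unaffected by the problem described here; the issue lies only in how the marked-up result text is laid out in the Matches block.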

Now do the same with Regex Storm, selecting the checkbox for Ignore Case:

As shown in the screen shot above, with Regex Storm the text does not run off to the right of the visible area of the Input block.

The reason for the fourth patch

Unlike Regex Storm, Regex Toy does nothing to keep the text shown in the Matches block within the visible area of the Matches window.

Applying the fourth patch

Using your favorite ABAP editor, edit the copy of ABAP repository object DEMO_REGEX_TOY containing the previous patches and apply the following 2-step change in method display:

1. Ahead of the CONCATENATE LINES OF statement, insert the following set of lines (first and last lines, shown preceding and succeeding a comment line of all hyphens, already exist in the code as lines 216 and 217, respectively):

216   APPEND '<html><body><font face="Arial monospaced for ...
      " -----------------------------------
      " DEMO_REGEX_TOY enhancement #4
      " Format result text using approximate window width
      " specified by line_len:
      constants     : html_space     type string value '&nbsp;'
                    , html_special_character_start
                                     type string value '&'
                    , html_special_character_end
                                     type string value ';'
                    , html_format_start
                                     type string value '<'
                    , html_format_end
                                     type string value '>'
                    .
      data          : visible_text_length
                                     type int4
                    , html_last_space_length
                                     type int4
                    , cause_line_break
                                     type flag
                    , html_string    type string
                    , html_excess    type string
                    , html_stack     type standard table
                                       of string
                    .
      loop at result_it
         into result_wa.
        clear: html_string, visible_text_length.
        while strlen( result_wa ) gt 00.
          case result_wa+00(01).
            when html_format_start.
              while result_wa+00(01) ne html_format_end.
                concatenate html_string
                            result_wa+00(01)
                       into html_string.
                shift result_wa left by 01 places.
              endwhile.
            when html_special_character_start.
              if strlen( result_wa ) ge 06.
                if result_wa+00(06) eq html_space.
                  if visible_text_length gt line_len.
                    cause_line_break = abap_true.
                  endif.
                endif.
              endif.
              while result_wa+00(01) ne
                    html_special_character_end.
                concatenate html_string
                            result_wa+00(01)
                       into html_string.
                shift result_wa left by 01 places.
              endwhile.
              add 01 to visible_text_length.
              if cause_line_break eq abap_false.
                html_last_space_length
                                     = strlen( html_string ) + 01.
              endif.
            when others.
              add 01 to visible_text_length.
          endcase.
          concatenate html_string
                      result_wa+00(01)
                 into html_string.
          shift result_wa left by 01 places.
          if cause_line_break eq abap_true.
            clear html_excess.
            if html_last_space_length gt 00.
              html_excess =
                     substring( val = html_string
                                off = html_last_space_length ).
              html_string =
                     substring( val = html_string
                                len = html_last_space_length ).
            endif.
            append html_string
                to html_stack.
            clear: html_string, html_last_space_length.
            visible_text_length      = strlen( html_string ).
            cause_line_break         = abap_false.
            if strlen( html_excess ) gt 00.
              concatenate html_excess
                          result_wa
                     into result_wa.
            endif.
          endif.
        endwhile.
        append html_string
            to html_stack.
      endloop.
      " -----------------------------------
217   CONCATENATE LINES OF result_it INTO result_wa ...

2. Then change the CONCATENATE LINES OF statement from

CONCATENATE LINES OF result_it INTO result_wa …

to

CONCATENATE LINES OF html_stack INTO result_wa …

This is the same fourth patch unchanged from the E-bite.
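The core idea of the patch is that line length is measured in visible characters: an HTML tag such as <b> adds nothing to visible_text_length, while an entity such as &nbsp; counts as a single character. The sketch below illustrates only that counting rule. It swaps in the built-in replace function with regular expressions in place of the character-by-character loop used by the patch, and the report name and sample markup are hypothetical.

```abap
REPORT z_visible_length_sketch.

" Hypothetical sample of marked-up result text.
DATA(html) = `<b>the&nbsp;skunk</b>`.

" Drop every tag: tags take up no room in the window.
DATA(visible) = replace( val = html regex = `<[^>]*>` with = `` occ = 0 ).

" Collapse every entity to one placeholder character.
visible = replace( val = visible regex = `&[^;]*;` with = `_` occ = 0 ).

" 'the_skunk' remains, so the visible length is 9.
DATA(visible_length) = strlen( visible ).
WRITE / visible_length.
```

The patch compares this kind of visible length against line_len each time it reaches an &nbsp; entity, and breaks the line at the last space recorded before the limit was exceeded.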

After applying the fourth patch

Now activate the program and execute it using the same process described previously:

Paste the following tongue twister into the Text block:

A skunk sat on a stump. The skunk thunk the stump stunk and the stump thunk the skunk stunk.

Place a check mark into the IN TABLE check box.
Select the All Occurrences button.
Specify the following string in the Regex slot of the Input block:

the skunk

Press enter.

As shown in the screen shot above, you should find that now the same two matches are found but the text in the Matches block no longer runs off to the right of its visible area.

What’s next?

Now try the following regular expression test:

Paste the following first sentence of the tongue twister into the Text block:

A
skunk
sat
on
a
stump.

such that each word is followed by an explicit line break placing it on its own line as shown above.

Place a check mark into the IN TABLE check box.
Select the All Occurrences button.
Specify a single dot character in the Regex slot.
Press enter.

As shown in the screen shot above, you should find that now every character of the Text matches the regular expression pattern, as it should.

Note: The dot character is the wildcard character applicable to regular expression patterns, indicating to match any character at that position in the text, providing the same functionality to regular expressions that the ‘+’ wildcard character provides with patterns specified for ranges and select-options statements in SAP.
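The parallel drawn in the note can be tried out directly in ABAP. This is a minimal sketch with a hypothetical report name: it uses the matches predicate function for the regular expression, and the CP comparison operator, which applies the same '+' and '*' wildcards used in select-options patterns.

```abap
REPORT z_wildcard_sketch.

DATA(word) = `skunk`.

" Regular expression: the dot stands for exactly one arbitrary character.
IF matches( val = word regex = `sk.nk` ).
  WRITE / 'Regex sk.nk matches skunk'.
ENDIF.

" Comparison pattern: '+' plays the same single-character role.
IF word CP 'sk+nk'.
  WRITE / 'Pattern sk+nk covers skunk'.
ENDIF.
```

Both conditions hold for this word. Note that CP compares without regard to case, whereas the regular expression here is case sensitive.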

However, notice that the text in the Matches window is no longer formatted with the explicit line breaks present in the content of the Text window.

Next, try the same with Regex Storm:

As shown in the screen shot above, Regex Storm does observe and retain explicit line breaks.

Whereas the fourth patch provides an improvement to Regex Toy by enabling it to keep the Matches text within the boundaries of its window, it no longer observes explicit line breaks. This issue is addressed in the next blog in this series, Enhancing Regex Toy – Part 5.
