Enhancing Regex Toy – Part 4

source link: https://blogs.sap.com/2023/03/29/enhancing-regex-toy-part-4/
March 29, 2023 5 minute read

This is the fourth in a series of six blogs describing how to enhance the regular expression tester known as Regex Toy, each blog describing a single enhancement to its capabilities.

Before applying the fourth patch

The preceding blog in this series described how to patch a copy of Regex Toy so that it identifies matches even when the matching text straddles implicit line breaks. It ended by describing an issue: Regex Toy formats text in the Matches block such that it runs beyond the visible area of the window, accompanied by a horizontal scroll bar. To illustrate this problem again, execute the enhanced Regex Toy and follow these steps:

  • Paste the following tongue twister into the Text block:

A skunk sat on a stump.  The skunk thunk the stump stunk and the stump thunk the skunk stunk.

  • Place a check mark into the IN TABLE check box.
  • Select the All Occurrences button.
  • Specify the following string in the Regex slot of the Input block:

the skunk

  • Press enter.
blog-image-22.png

As shown in the screen shot above, the text runs off to the right of the visible area of the Matches window, requiring the use of the horizontal scroll bar to see it.

Now do the same with Regex Storm, selecting the checkbox for Ignore Case:

blog-image-23.png

As shown in the screen shot above, with Regex Storm the text does not run off to the right of the visible area of the Input block.

The reason for the fourth patch

  • Unlike Regex Storm, Regex Toy does not keep the text shown in the Matches block within the visible area of the Matches window.

Applying the fourth patch

Using your favorite ABAP editor, edit the copy of ABAP repository object DEMO_REGEX_TOY containing the previous patches and apply the following 2-step change in method display:

1. Ahead of the CONCATENATE LINES OF statement, insert the following set of lines (first and last lines, shown preceding and succeeding a comment line of all hyphens, already exist in the code as lines 216 and 217, respectively):

216   APPEND '<html><body><font face="Arial monospaced for ...
      " -----------------------------------
      " DEMO_REGEX_TOY enhancement #4
      " Format result text using approximate window width
      " specified by line_len:
      constants     : html_space     type string value '&nbsp;'
                    , html_special_character_start
                                     type string value '&'
                    , html_special_character_end
                                     type string value ';'
                    , html_format_start
                                     type string value '<'
                    , html_format_end
                                     type string value '>'
                    .
      data          : visible_text_length
                                     type int4
                    , html_last_space_length
                                     type int4
                    , cause_line_break
                                     type flag
                    , html_string    type string
                    , html_excess    type string
                    , html_stack     type standard table
                                       of string
                    .
      loop at result_it
         into result_wa.
        clear: html_string, visible_text_length.
        while strlen( result_wa ) gt 00.
          case result_wa+00(01).
            when html_format_start.
              while result_wa+00(01) ne html_format_end.
                concatenate html_string
                            result_wa+00(01)
                       into html_string.
                shift result_wa left by 01 places.
              endwhile.
            when html_special_character_start.
              if strlen( result_wa ) ge 06.
                if result_wa+00(06) eq html_space.
                  if visible_text_length gt line_len.
                    cause_line_break = abap_true.
                  endif.
                endif.
              endif.
              while result_wa+00(01) ne
                    html_special_character_end.
                concatenate html_string
                            result_wa+00(01)
                       into html_string.
                shift result_wa left by 01 places.
              endwhile.
              add 01 to visible_text_length.
              if cause_line_break eq abap_false.
                html_last_space_length
                                     = strlen( html_string ) + 01.
              endif.
            when others.
              add 01 to visible_text_length.
          endcase.
          concatenate html_string
                      result_wa+00(01)
                 into html_string.
          shift result_wa left by 01 places.
          if cause_line_break eq abap_true.
            clear html_excess.
            if html_last_space_length gt 00.
              html_excess =
                     substring( val = html_string
                                off = html_last_space_length ).
              html_string =
                     substring( val = html_string
                                len = html_last_space_length ).
            endif.
            append html_string
                to html_stack.
            clear: html_string, html_last_space_length.
            visible_text_length      = strlen( html_string ).
            cause_line_break         = abap_false.
            if strlen( html_excess ) gt 00.
              concatenate html_excess
                          result_wa
                     into result_wa.
            endif.
          endif.
        endwhile.
        append html_string
            to html_stack.
      endloop.
      " -----------------------------------
217   CONCATENATE LINES OF result_it INTO result_wa ...

2. Then change the CONCATENATE LINES OF statement from

CONCATENATE LINES OF result_it INTO result_wa …

to

CONCATENATE LINES OF html_stack INTO result_wa …

This is the same fourth patch unchanged from the E-bite.
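
The patch counts only visible characters toward the line length: everything from an opening angle bracket through its closing angle bracket is copied without being counted, and an HTML entity (an ampersand through its semicolon) counts as a single character. As a minimal sketch of that counting rule in isolation, the following report may help. Its names (z_visible_length_sketch, lcl_html, visible_length) are hypothetical and not part of DEMO_REGEX_TOY, and it uses the built-in replace function with regular expressions instead of the character-by-character loop of the patch. On newer releases the regex parameter may draw an obsolescence warning.

```abap
REPORT z_visible_length_sketch.

" Hypothetical helper class for illustration; not part of DEMO_REGEX_TOY.
CLASS lcl_html DEFINITION FINAL.
  PUBLIC SECTION.
    " Returns the number of characters a browser would display:
    " tags count as 0, each entity counts as 1.
    CLASS-METHODS visible_length
      IMPORTING html          TYPE string
      RETURNING VALUE(result) TYPE i.
ENDCLASS.

CLASS lcl_html IMPLEMENTATION.
  METHOD visible_length.
    " Remove every tag: '<', any run of characters other than '>', '>'.
    DATA(stripped) = replace( val   = html
                              regex = `<[^>]*>`
                              with  = ``
                              occ   = 0 ).
    " Collapse every entity (& ... ;) to one placeholder character.
    stripped = replace( val   = stripped
                        regex = `&[^;]*;`
                        with  = `_`
                        occ   = 0 ).
    result = strlen( stripped ).
  ENDMETHOD.
ENDCLASS.

START-OF-SELECTION.
  " 'the', one entity, 'skunk': 3 + 1 + 5 visible characters.
  DATA(len) = lcl_html=>visible_length( `<b>the&nbsp;skunk</b>` ).
  WRITE / len.
```

Comparing such a visible length against line_len is what lets the patch decide when a line has grown wide enough to break at the next space.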

After applying the fourth patch

Now activate the program and execute it using the same process described previously:

  • Paste the following tongue twister into the Text block:

A skunk sat on a stump.  The skunk thunk the stump stunk and the stump thunk the skunk stunk.

  • Place a check mark into the IN TABLE check box.
  • Select the All Occurrences button.
  • Specify the following string in the Regex slot of the Input block:

the skunk

  • Press enter.
blog-image-24.png

As shown in the screen shot above, you should find that now the same two matches are found but the text in the Matches block no longer runs off to the right of its visible area.
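
For reference, the two matches can also be reproduced outside Regex Toy with a plain FIND statement. This is a small sketch, assuming case is ignored (as with the Ignore Case checkbox selected for Regex Storm); the report name z_skunk_matches_sketch and the variable names are hypothetical.

```abap
REPORT z_skunk_matches_sketch.

START-OF-SELECTION.
  " The tongue twister from the Text block, built from two string
  " literals; backquoted literals keep their trailing blank.
  DATA(twister) = `A skunk sat on a stump.  The skunk thunk the stump stunk ` &&
                  `and the stump thunk the skunk stunk.`.

  " Collect every occurrence of the pattern, ignoring case.
  FIND ALL OCCURRENCES OF REGEX `the skunk` IN twister
       IGNORING CASE
       RESULTS DATA(results).

  " Each entry describes one match by its offset and length.
  LOOP AT results INTO DATA(result).
    WRITE: / 'Offset', result-offset, 'length', result-length.
  ENDLOOP.
```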

What’s next?

Now try the following regular expression test:

  • Paste the following first sentence of the tongue twister into the Text block:

A
skunk
sat
on
a
stump.

such that each word is followed by an explicit line break placing it on its own line as shown above.

  • Place a check mark into the IN TABLE check box.
  • Select the All Occurrences button.
  • Specify a single dot character in the Regex slot.
  • Press enter.
blog-image-25.png

As shown in the screen shot above, you should find that now every character of the Text matches the regular expression pattern, as it should.

Note: The dot character is the wildcard character of regular expression patterns. It matches any single character at that position in the text, providing the same functionality to regular expressions that the ‘+’ wildcard character provides with patterns specified for ranges and select-options statements in SAP.
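
The comparison between the two wildcards can be tried directly. This is a brief sketch with a hypothetical report name (z_dot_wildcard_sketch), using the CP comparison operator to stand in for the pattern matching applied to ranges and select-options:

```abap
REPORT z_dot_wildcard_sketch.

START-OF-SELECTION.
  DATA(word) = `skunk`.

  " The dot matches any single character, so each of the five
  " characters of the word is a separate match.
  FIND ALL OCCURRENCES OF REGEX `.` IN word MATCH COUNT DATA(dot_matches).
  WRITE: / 'Matches for a single dot:', dot_matches.

  " In 's.unk' the dot stands in for the 'k'.
  IF matches( val = word regex = `s.unk` ).
    WRITE / 'Regex s.unk matches skunk'.
  ENDIF.

  " The comparable '+' wildcard of the CP (covers pattern) operator.
  IF word CP 's+unk'.
    WRITE / 'Pattern s+unk covers skunk'.
  ENDIF.
```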

However, notice that the text in the Matches window is no longer formatted with the explicit line breaks present in the content of the Text window.

Next, try the same with Regex Storm:

blog-image-26.png

As shown in the screen shot above, Regex Storm does observe and retain explicit line breaks.

Although the fourth patch improves Regex Toy by keeping the Matches text within the boundaries of its window, the program no longer observes explicit line breaks. This issue is addressed in the next blog in this series, Enhancing Regex Toy – Part 5.

