Regular expressions to solve programming interview riddles

Sep 30, 2016 • Yehonathan Sharvit

Acknowledgements

This article is a rewrite of the work of Mark Engelberg in his automata repository - with a tweak:

All the code snippets of this page are live and interactive powered by the klipse plugin:

Live: The code is executed in your browser
Interactive: You can modify the code and it is evaluated as you type

In his repo, Mark shows how to use regular expressions and automatas to solve programming riddles. In this article, we will focus on regular expressions.

Prelude

For the purposes of this code, it is useful to replace Clojure’s max / max-key with versions that return nil when passed no inputs to maximize. Also, we are going to use clojure.combinatorics by Mark Engelberg for generating cartesian products of sequences:

xxxxxxxxxx

(ns my.combinatorics

  (:require [clojure.math.combinatorics :refer [cartesian-product]]))

(defn max ([] nil) ([& s] (apply clojure.core/max s)))

(defn max-key ([k] nil) ([k & s] (apply clojure.core/max-key k s)))

the evaluation will appear here (soon)...

The classic interview problem - maximum segment sum

A popular problem is to find an O(n) algorithm for computing the maximum sum achievable by adding up some contiguous subsequence (aka segment) of a sequence of numbers (typical input is a mix of positive and negative integers).

For example,(maximum-segment-sum [-1 2 3 -4 5 -8 4]) should return 6 because 2+3+-4+5 is 6.

If you’ve never seen this problem before, I encourage you to go try to solve it right now. It’s a fun problem.

The trick is to keep a running sum as you traverse the sequence, never letting the running sum dip below 0. This has the effect of ignoring negative numbers that make things “too negative”. Return the highest sum you saw.

This strategy can be implemented concisely in Clojure:

xxxxxxxxxx

(defn maximum-segment-sum [s]

  (apply max (reductions (comp #(max 0 %) +) 0 s)))

xxxxxxxxxx

the evaluation will appear here (soon)...

Let’s the results of the reductions with [-1 2 3 -4 5 -8 4]:

xxxxxxxxxx

(reductions (comp #(max 0 %) +) 0 [-1 2 3 -4 5 -8 4])

xxxxxxxxxx

the evaluation will appear here (soon)...

The max is 6.

A harder problem - maximum non-segment sum

But we’re going to do something harder, we’re looking for the maximum sum among subsequences that are not a continguous segment.

For example, (maximum-non-segment-sum [-1 4 5 -3 -4]) should be 5 because 4+5+-4 = 5, and those three numbers don’t form a segment.

We can’t choose just 4, or just 5, or 4+5, because singletons and adjacent pairs are considered a segment. We can’t even choose the “empty” subsequence with a value of 0, because that is also considered a segment. We could have chosen things like -1+5 or 5+-4 or 4+-3, but they happen to be not as good.

Unfortunately, there’s no clever trick for solving this problem. We’re going to have to look for a more principled approach.

(If you don’t believe me, go spend a while trying to solve it, just so you can appreciate how hard this problem really is.)

Brute force with Regular expressions

Our strategy is going to be brute force:

xxxxxxxxxx

(defn maximum-non-segment-sum [s]

  (->> (all-non-segment-subsequences s)

    (map (partial apply +))

    (apply max)))

xxxxxxxxxx

the evaluation will appear here (soon)...

But how to write all-non-segment-subsequences?

First key insight is that you can represent a subsequence by applying a bitmask of 0s and 1s to the sequence.

xxxxxxxxxx

(defn apply-bitmask

  "Takes a sequence and a bitmask, and returns the correpsonding subsequence"

  [s bitmask]

  (for [[item bit] (map vector s bitmask) :when (= bit 1)] item))

xxxxxxxxxx

the evaluation will appear here (soon)...

Let’s see how it works:

xxxxxxxxxx

(apply-bitmask [1 2 3 4 5] [0 1 1 0 1])

xxxxxxxxxx

the evaluation will appear here (soon)...

We can describe the satisfactory bitmasks with a regex

xxxxxxxxxx

(def non-segment-regex #"0*1+0+1(0|1)*")

xxxxxxxxxx

the evaluation will appear here (soon)...

What this regex says is that a non segment bitmask is a sequence of:

0 or 0s
1 or more 1s
1 or more 0s
a single 1
0s or 1s freely

And indeed, this regex recognizes whether a bitmask represents a non-segment

xxxxxxxxxx

(re-matches non-segment-regex "011010" )

xxxxxxxxxx

the evaluation will appear here (soon)...

or not

xxxxxxxxxx

(re-matches non-segment-regex "011110")

xxxxxxxxxx

the evaluation will appear here (soon)...

Now, let’s make a function that receives a identifies non-segment bitmasks:

xxxxxxxxxx

(defn non-segment-bitmask?

  "Takes a sequence of 0's and 1's and determines whether this represents a subsequence that is not a contiguous segment"

[s]

  (not (nil? (re-matches non-segment-regex (clojure.string/join s)))))

xxxxxxxxxx

the evaluation will appear here (soon)...

It works as expected:

xxxxxxxxxx

(map non-segment-bitmask? [[0 1 1 1] [0 1 1 1 0 1]])

xxxxxxxxxx

the evaluation will appear here (soon)...

Now, we are ready to write our all-non-segment-subsequences: we will generate all the 0s and 1s sequences of the desired length and filter with non-segment-bitmask?.

We will use the cartesian-product from clojure.combinatorics:

xxxxxxxxxx

(defn all-non-segment-subsequences

  "Takes a sequence and returns all subsequences that are not a contiguous segment"

[s]

  (->> (apply cartesian-product (repeat (count s) [0 1])) ; all bitmasks matching s's length

    (filter non-segment-bitmask?)

    (map (partial apply-bitmask s))))

xxxxxxxxxx

the evaluation will appear here (soon)...

Let’s take a look at all the non-segment subsequences of [1 2 -3 4 -5]:

xxxxxxxxxx

(all-non-segment-subsequences [1 2 -3 4 -5])

xxxxxxxxxx

the evaluation will appear here (soon)...

And now, all the pieces of the puzzle are in place in order to run the maximum-non-segment-sum that we wrote above:

xxxxxxxxxx

(maximum-non-segment-sum [-1 4 5 -3 -4])

xxxxxxxxxx

the evaluation will appear here (soon)...

xxxxxxxxxx

(maximum-non-segment-sum [-1 4 5 -3 4 -9 10])

xxxxxxxxxx

the evaluation will appear here (soon)...

Please don’t try to run it with too long sequences!

(On my browser it starts to take too much time with 15 elements).

In our next article, we will show how to make our algorithm much more efficient using automatas.

If you like this article, you will enjoy a lot Mark Engelberg’s talk on youtube about automatas.

You might also enjoy this much simpler combinatorics riddle.

If you enjoy this kind of interactive articles would you consider a (small) donation💸 on Patreon or at least giving a star⭐ for the Klispe repo on Github?

to stay up-to-date with the coolest interactive articles around the world.

Discover more cool interactive articles about javascript, clojure[script], python, ruby, scheme, c++ and even brainfuck!

Give Klipse a Github star to express how much you appreciate Code Interactivity.

Subscribe to the Klipse newsletter:

Feel free to email me [email protected] for getting practical tips and tricks in writing your first interactive blog post.

Regular expressions to solve programming interview riddles

Regular expressions to solve programming interview riddles

Acknowledgements

Prelude

The classic interview problem - maximum segment sum

A harder problem - maximum non-segment sum

Brute force with Regular expressions

Recommend

Automata to solve programming interview riddles

Parsing with derivatives - Elegant matching of regular languages in clojure

Custom defn macro with clojure.spec - part 1: conform/unform

Custom defn macro - part 2: playing with parse trees

bootrapped (self-host) namespace loading strategy in KLIPSE - insights into the...

KLIPSE talk at ClojureX 2016

Core.match interactive tutorial

In javascript, setTimeout is forced to be >= 4 msec

str vs. pr-str - humans vs. machines

JSX (react) in depth - an interactive tutorial

About Joyk