Maximum call stack size exceeded with recursion - javascript

As the title suggests, I was trying to recursively solve a JavaScript problem. An exercise for my internet programming class was to invert any string that was entered in the function, and I saw this as a good opportunity to solve this with recursion. My code:
function reverseStr(str){
str = Array.from(str);
let fliparray = new Array(str.length).fill(0);
let char = str.slice(-1);
fliparray.push(char);
str.pop();
str.join("");
return reverseStr(str);
}
writeln(reverseStr("hello"))

The biggest problem is that your function doesn't have an end (base) case. It needs to have some way to recognize when it's supposed to stop or it will recurse forever.
The second problem is that you don't really seem to be thinking recursively. You're making some modification to the string, but then you just call reverseStr() all over again on the modified string, which is just going to start the process all over again.
The following doesn't really resemble your attempt (I don't know how to salvage your attempt), but it is a simple way to implement the reverse string algorithm recursively.
function reverseStr(str) {
  // string is 0 or 1 characters. nothing to reverse
  if (str.length <= 1) {
    return str;
  }
  // return the first character appended to the end of the reverse of
  // the portion after the first character
  return reverseStr(str.substring(1)) + str.charAt(0);
}
console.log(reverseStr("Hello Everybody!"));
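For comparison, here is a minimal sketch that stays closer to the array-based idea in the question. The helper names reverseChars and reverseStrViaArray are made up for illustration and are not from the original post; the point is only that the recursion needs a base case and must combine the result of the recursive call with the piece it removed.

```javascript
// Hypothetical helper: reverses an array of characters recursively.
function reverseChars(chars) {
  // Base case: an empty or single-element array is already reversed.
  if (chars.length <= 1) {
    return chars;
  }
  // Take the last element, then recurse on everything before it.
  const last = chars[chars.length - 1];
  return [last].concat(reverseChars(chars.slice(0, -1)));
}

// Hypothetical wrapper: converts the string to an array and back.
function reverseStrViaArray(str) {
  return reverseChars(Array.from(str)).join("");
}

console.log(reverseStrViaArray("hello")); // "olleh"
```

Each call works on an array one element shorter than the one it received, so the base case is always reached.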

Related

Maximum call stack size exceeded for large iterators

I'm trying to convert an iterator to an array. The iterator is the result of calling matchAll on a very long string. The iterator (I assume) has many matches within the string. First I tried it with the spread operator:
const array = [...myLongString.matchAll(/myregex/g)];
This gave me the error: RangeError: Maximum call stack size exceeded
So I tried iterating via next():
const safeIteratorToArray = (iterator) => {
const result = [];
let item = iterator.next();
while (!item.done) {
result.push(item.value);
item = iterator.next();
}
return result;
};
But this gives me the same error, on the item = iterator.next() line. So I tried making it async in an effort to reset the call stack:
const safeIteratorToArray = async (iterator) => {
const result = [];
let item = iterator.next();
while (!item.done) {
result.push(item.value);
item = await new Promise(resolve => setTimeout(() => resolve(iterator.next())));
}
return result;
};
But I still get the same error.
If you are curious about the actual use case:
The regex I'm actually using is:
/\[(.+?)\] \[DEBUG\] \[Item (.+?)\] Success with response: ((.|\n)+?)\n\[/g
And the contents of the text file (it's a log file) generally looks like:
[TIMESTAMP] [LOG_LEVEL] [Item ITEM_ID] Success with response: {
...put a giant json object here
}
Repeat that ad-nauseam with newlines in between each log.
(V8 developer here.)
It's not about the iterator, it's about the RegExp.
[Update]
Looks like I was misled by a typo in my test, so my earlier explanation/suggestion doesn't fix the problem. With the test corrected, it turns out that only the end of the expression (which I called "fishy" before) needs to be fixed.
The massive consumption of stack memory is caused by the fact that (.|\n) is a capture group, and is matched very frequently. One idea would be to write it as [.\n] instead, but the . metacharacter loses its special meaning inside [...] character sets, where it only matches a literal dot.
Hat tip to @cachius for suggesting an elegant solution: use the s flag to make . match \n characters.
As an unrelated fix, prefer checking for the closing } instead of the next opening [ at the beginning of a line, so that there's no overlap between matched ranges (which would make you miss some matches).
So, in summary, replace ((.|\n)+?)\n\[/g with (.+?)\n}/gs.
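As a small illustration of the points above, the following sketch runs the adjusted expression against a tiny made-up log. The timestamps, item IDs and JSON bodies are invented stand-ins, not data from the question.

```javascript
// Tiny stand-in for the log file; the real input is far larger.
const log = [
  "[2020-01-01] [DEBUG] [Item 42] Success with response: {",
  '  "ok": true',
  "}",
  "[2020-01-02] [DEBUG] [Item 43] Success with response: {",
  '  "ok": false',
  "}",
].join("\n");

// Without the s flag, . stops at line breaks; with it, . matches them.
console.log(/a.b/.test("a\nb"));  // false
console.log(/a.b/s.test("a\nb")); // true
// Inside a character class, . is only a literal dot.
console.log(/[.\n]/.test("x"));   // false

const re = /\[(.+?)\] \[DEBUG\] \[Item (.+?)\] Success with response: (.+?)\n}/gs;
const matches = [...log.matchAll(re)];
console.log(matches.map((m) => m[2])); // [ '42', '43' ]
```

Because each match now ends at the closing } rather than at the next opening [, consecutive log entries no longer overlap and both are found.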
[/Update]
Here is a reproducible example. The following exhibits the stack overflow:
let lines = ["[TIMESTAMP] [DEBUG] [Item ITEM_ID] {"];
for (let i = 0; i < 1000000; i++) {
lines.push(" [0]"); // Fake JSON, close enough for RegExp purposes.
}
lines.push("}");
lines.push("[TIMESTAMP]");
let myLongString = lines.join("\n");
const re = /\[(.+?)\] \[DEBUG\] \[Item (.+?)\] ((.|\n)+?)\n\[/g;
myLongString.match(re);
If you replace the const re = ... line with:
const re = /\[(.+?)\] \[DEBUG\] \[Item (.+?)\] (.+?)\n}/gs;
then the stack overflow disappears.
(It would be possible to reduce the simplified example even further, but then the connection with your original case wouldn't be as obvious.)
[Original post below -- the mechanism I explained here is factually correct, and applying the suggested replacement indeed improves performance by 25% because it makes the RegExp simpler to match, it just isn't enough to fix the stack overflow.]
The problematic pattern is:
\[(.+?)\]
which, after all, means "a [, then any number of arbitrary characters, then a ]". While I understand that regular expressions might seem like magic, they're actually real algorithmic work under the hood, kind of like miniature programs in their own right. In particular, any time a ] is encountered in the string, the algorithm has to decide whether to count this as one of the "arbitrary characters", or as the one ] that ends this sequence. Since it can't magically know that, it has to keep both possibilities "in mind" (=on the stack), pick one, and backtrack if that turns out to be incorrect. Since this backtracking information is kept on the stack (where else?), if you put sufficiently many ] into your string, the stack will run out of space.
Luckily, the solution is simple: since what you actually mean is "a [, then any number of characters that aren't ], then a ]", you can just tell the RegExp engine that, replacing . with [^\]]:
\[([^\]]+?)\]
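To make the backtracking difference visible, here is a small sketch on a made-up input. The string and the trailing " x" in the patterns are assumptions chosen so that the first attempt at closing the bracket fails.

```javascript
const tricky = "[a] [b] x";

// Lazy dot: when "] x" fails after "a", the engine backtracks and lets
// the group swallow "] [" as well.
const lazyDot = /^\[(.+?)\] x/.exec(tricky);
console.log(lazyDot[1]); // "a] [b"

// Negated class: the group can never cross a ], so there is nothing to
// retry and the match simply fails.
const negated = /^\[([^\]]+?)\] x/.exec(tricky);
console.log(negated); // null
```

The first pattern has to remember every ] it skipped as a possible alternative; the second never creates those alternatives in the first place.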
Note: ((.|\n)+?)\n\[ seems fishy for the same reason, but according to this test doesn't appear to be the problem, even if I further increase the input size. I'm not sure why; it might be due to how I created the test. If you see further problems with the real input, it may be worth reformulating this part as well.
[/Original post]

Why does this dynamic programming optimization actually make code slower?

This is from Leetcode problem: Concatenated Words.
Below is a working solution. I added what I thought to be an optimization (see code comment), but it actually slows down the code. If I remove the wrapping if statement, it runs faster.
To me, the optimization helps avoid having to:
call an expensive O(n) substring()
check inside wordsSet
make an unnecessary function call to checkConcatenation
Surely if (!badStartIndices.has(end + 1)) isn't more expensive than all the above, right? Maybe it has something to do with Javascript JIT compilation? V8? Thoughts?
Use the following test input:
// Notice how the second string ends with a 'b'!
const words = [
'a',
'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab',
];
// Function in question.
var findAllConcatenatedWordsInADict = function (words) {
console.time();
// 1) put all words in a set
const wordsSet = new Set(words);
let badStartIndices;
// 2) iterate words, recursively check if it's a valid word
const concatenatedWords = [];
function checkConcatenation(word, startIdx = 0, matches = 0) {
if (badStartIndices.has(startIdx)) {
return false;
}
if (startIdx === word.length && matches >= 2) {
concatenatedWords.push(word);
return true;
}
for (let end = startIdx; end < word.length; end++) {
// I ADDED THE IF STATEMENT AS AN OPTIMIZATION. BUT CODE RUNS FASTER WITHOUT IT.
// NOTE: Code is correct with & without if statement.
if (!badStartIndices.has(end + 1)) {
const curWord = word.substring(startIdx, end + 1);
if (wordsSet.has(curWord)) {
if (checkConcatenation(word, end + 1, matches + 1)) {
return true;
}
}
}
}
badStartIndices.add(startIdx);
return false;
}
for (const word of words) {
// reset memo at beginning of each word
badStartIndices = new Set();
checkConcatenation(word);
}
console.timeEnd();
return concatenatedWords;
};
Turns out this depends entirely on the input data, not on JavaScript or V8. (And as of writing this, I don't know what data you used for benchmarking.)
With the example input from the Leetcode page you've linked, badStartIndices never does anything useful (both of the .has checks always return false); so it's fairly obvious that doing this fruitless check twice is a little slower than doing it just once. In that case, the "dynamic programming" mechanism of the solution never kicks in, so the effective behavior degenerates to brute force, which is good enough because the input data is well-behaved. (In fact, deleting badStartIndices entirely would be even faster for such a test case.)
If I construct "evil" input data that actually leads to exponential combinatorial blow-up, i.e. where the badStartIndices.has(...) checks actually have something to do, then adding the early check does have a (small) performance benefit. (And without either of the checks, the computation would take "forever" for such inputs.)
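A rough way to see this effect is to count recursive calls instead of measuring time. The sketch below is a simplified stand-in for the code in the question, not the same function: countCalls is a hypothetical helper, it ignores the "at least two words" rule, and the dictionary and inputs are made up.

```javascript
// Hypothetical helper: counts how many recursive calls are needed to decide
// whether `word` can be split into words from `dict`.
function countCalls(word, dict, useMemo) {
  const bad = new Set(); // start indices known to lead nowhere
  let calls = 0;
  function check(start) {
    calls++;
    if (start === word.length) return true;
    if (useMemo && bad.has(start)) return false;
    for (let end = start + 1; end <= word.length; end++) {
      if (dict.has(word.substring(start, end)) && check(end)) return true;
    }
    if (useMemo) bad.add(start);
    return false;
  }
  check(0);
  return calls;
}

const dict = new Set(["a", "aa", "aaa"]);
const friendly = "a".repeat(18);      // splits on the first attempt
const evil = "a".repeat(18) + "b";    // can never be split: no word ends in b

console.log(countCalls(friendly, dict, true), countCalls(friendly, dict, false));
console.log(countCalls(evil, dict, true), countCalls(evil, dict, false));
```

For the friendly input the memo changes nothing, so maintaining it is pure overhead. For the evil input the memoized version needs a few dozen calls while the unmemoized one needs many thousands.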
So, taking a step back, this is one more example to illustrate that benchmarking is difficult; in particular, in order to get useful results, care must be taken to select relevant/realistic input data.
If the tests are too simple, developers are likely to not build optimizations that would help (a little or a lot) in high-load situations.
If the tests are too demanding, developers are likely to waste time on overly complicated code that ends up being slower than it could be for its target use case.
And if the code must handle any input with maximum performance, then as the developer you have the extra challenge of avoiding overhead for simple inputs while still scaling well to tough inputs...

Using # to mark end of array

I am currently studying another user’s code for a coding question from LeetCode. My question is about certain aspects of his code. Here’s a link to the question.
Question:
Why does this user use a # to mark the end of the array?
Under the second if case, the user writes:
ans.push(nums[t] + '->' + (nums[i-1]))
Now, I understand what this statement does. My question is: Why does this produce an output of ["0->2",...] instead of [0"->"2,...]?
var summaryRanges = function(nums) {
var t = 0
var ans = []
nums.push('#')
for(var i=1;i<nums.length;i++)
if(nums[i]-nums[t] !== i-t){
if(i-t>1)
ans.push(nums[t]+'->'+(nums[i-1]))
else
ans.push(nums[t].toString())
t = i
}
return ans
}
The algorithm closes a range whenever the difference between nums[i] and nums[t] is no longer the same as the difference between i and t, and only then adds the finished range to the output. The last range in the array has no following element to trigger that condition, so without extra help it would never be added.
Hence the hash character is padding that extends the array by one element, so that the condition nums[i]-nums[t] !== i-t also triggers for the final range, even if it is a single number. It could be any string really, as long as it does not convert to a number that would continue the run.
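Both behaviours can be checked in isolation. The snippet below is only an illustration with made-up variable names; it shows why the sentinel always triggers the condition, and why the pushed value is a single string such as "0->2".

```javascript
// Subtracting a number from a non-numeric string yields NaN.
const sentinelDiff = '#' - 3;
console.log(sentinelDiff);        // NaN

// NaN is never equal to anything, so the !== test is always true.
console.log(sentinelDiff !== 1);  // true

// With a string operand, + concatenates: the numbers become text.
const label = 0 + '->' + 2;
console.log(label);               // "0->2"
console.log(typeof label);        // "string"
```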

How to reverse a string recursively

I was in an interview the other day and I was asked to write a method that reverses a string recursively.
I started writing a method that calls itself and got stuck.
Here is what I was asked, reverse a string "Obama" recursively in JavaScript.
Here is how far I got.
function reverseString(strToReverse)
{
reverseString(strToReverse);
};
And got stuck, they said NO for i loops.
Anyone got any ideas?
Look at it this way: the reversed string will start with the last letter of the original, followed by all but the last letter, reversed.
So:
function reverseString(strToReverse)
{
    if (strToReverse.length <= 1)
        return strToReverse;
    // last char +
    // 0 .. second-last-char, reversed
    return strToReverse[strToReverse.length - 1] +
        reverseString( strToReverse.substring(0, strToReverse.length - 1) );
}
See the solution by @MaxZoom below for a more concise version.
Note that the tail-recursive style in my own answer provides no advantage over the plain-recursive version since JS interpreters are not required to perform tail call elimination.
[Original]
Here's a tail recursive version that works by removing a character from the front of the input string and prepending it to the front of the "accumulator" string:
function reverse(s, acc) {
  // default the accumulator so that reverse('') returns '' rather than undefined
  acc = acc || '';
  if (s.length <= 0) { return acc; }
  return reverse(s.slice(1), s.charAt(0) + acc);
}
reverse('foobar'); // => "raboof"
The simplest one:
function reverse(input) {
if (input == null || input.length < 2) return input;
return reverse(input.substring(1)) + input.charAt(0);
}
console.log(reverse('Barack Obama'));
Single line solution. And if they asked, I'd tell them it's recursive in the native code part and not to ask any more stupid questions.
var result = "Obama".split('').reverse().join('');
Output: amabO
The real problem here is not "how to reverse a string". The real problem is, "do you understand recursion". That is what the interview question is about!
So, in order to solve the problem, you need to show you know what recursion is about, not that you can reverse the string "Obama". If all you needed to do was reverse the string "Obama", you could write return "amabO"; see?
In other words, this specific programming task is not what it's all about! The real solution is not to copy and paste the code from the answers here, but to know about recursion.
In brief,
Recursion involves calling the same function again, yes, but that's not all
In order to prevent stack overflow, you MUST ensure that the function doesn't call itself indefinitely
So there's always a condition under which the function can exit without calling itself (again)
And when it does call itself again, it should do so with parameters that make the above condition more likely.
In the case of string operations, one way to make that all happen is to make sure that it calls itself only with strings that are shorter than the one it was called with. Since strings are not of an infinite length, the function can't call itself an infinite number of times that way. So the condition can be that the string has a length of zero, in which case it's impossible to call itself with a shorter string.
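To see the shrinking argument in action, here is a minimal sketch that records every string the function is called with. The name reverseTraced and the trace parameter are hypothetical additions for illustration only.

```javascript
// Hypothetical traced variant: pushes each argument onto `trace`.
function reverseTraced(str, trace = []) {
  trace.push(str);
  // Exit condition: an empty string cannot be shortened any further.
  if (str.length === 0) return "";
  // Recursive step: always call with a strictly shorter string.
  return reverseTraced(str.slice(1), trace) + str.charAt(0);
}

const trace = [];
const reversed = reverseTraced("abc", trace);
console.log(trace);    // [ 'abc', 'bc', 'c', '' ]
console.log(reversed); // "cba"
```

Every entry in the trace is one character shorter than the previous one, which is what guarantees that the exit condition is eventually met.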
If you can prove that you know all this, and can use it in a real world program, then you're on your way to passing the interview. Not if you copy and paste some source you found on the internet.
Hope this helps!
We can easily reverse a string recursively using the ternary operator:
function reverseString(str) {
  return str.length > 1 ? reverseString(str.slice(1)) + str.charAt(0) : str;
}
reverseString("America");
Not the smartest way to reverse a string, but it is recursive:
function reverse(input, output) {
output = output || '';
if (input.length > 0) {
output = output.concat(input[input.length -1]);
return reverse(input.substr(0, input.length - 1), output);
}
return output;
}
console.log(reverse('Obama'));
Here's a jsfiddle
Maybe something like this?
var base = 'Obama',
index = base.length,
result = '';
function recursive(){
if (index == 0) return;
index -= 1;
result += base[index];
recursive();
}
recursive();
alert(result);
jsfiddle:
https://jsfiddle.net/hy1d84jL/
EDIT: You can think of recursion as a loop that would run forever unless you bound it. Here we use it in a "controlled" way and define the bounds: 0 as the minimum and the length of the word "Obama" as the maximum. Each call decrements the index variable by one and appends the character at that index to the result, so the reversed string is built up starting from the last character. Hope it helps. Nice question.
If the function can only have a single input, I would split the string into smaller and smaller pieces and add them all together in reverse order:
function reverseString(strToReverse){
if (strToReverse.length <= 1) { return strToReverse; }
return reverseString(strToReverse.substr(1, strToReverse.length - 1)) + strToReverse[0];
}

Apparent trouble with behavior of substring in JavaScript

I am writing a recursive algorithm to build a finite state automaton by parsing a regular expression. The automaton iterates through the expression, pushing characters to a stack and operators to an "operator stack." When I encounter "(" (indicating a grouping operation), I push a "sub automaton" to the stack and pass the rest of the pattern to the sub automaton to parse. When that automaton encounters ")", it passes the rest of the string up to the parent automaton to finish parsing. Here is the code:
var NFA = function(par) {
this.stack = [];
this.op_stack = [];
this.parent = par;
};
NFA.prototype.parse = function(pattern) {
var done = false;
for(var i in pattern) {
if (done === true) {
break;
}
switch(pattern.charAt(i)) {
case "(":
var sub_nfa = new NFA(this);
this.stack.push(sub_nfa);
sub_nfa.parse(pattern.substring(i+1, pattern.length));
done = true;
break;
case ")":
if (this.parent !== null) {
var len = pattern.length;
/*TROUBLE SPOT*/
this.parent.parse(pattern.substring(i, pattern.length));
done = true;
break;
}
case "*":
this.op_stack.push(operator.KLEENE);
break;
case "|":
this.op_stack.push(operator.UNION);
break;
default:
if(this.stack.length > 0) {
//only push concat after we see at least one symbol
this.op_stack.push(operator.CONCAT);
}
this.stack.push(pattern.charAt(i));
}
}
};
Note the area marked "TROUBLE SPOT". Given the regular expression "(a|b)a", this.parent.parse is called exactly once: when the sub-automaton encounters ")". At this point, pattern.substring(i, pattern.length) = ")a". This "works", but it isn't correct because I need to consume the ")" input before I pass the string to the parent automaton. However, if I change the call to this.parent.parse(pattern.substring(i+1, pattern.length)), parse gets handed the empty string! I have tried stepping through the code and I cannot explain this behavior. What am I missing?
At Juan's suggestion, I made a quick jsfiddle to show the problem when trying to parse "(a|b)a" with this algorithm. In the ")" case, it populates an empty div with the substring at the i index and the substring at the i+1 index. It shows that while there are 2 characters in the substring at i, the substring at i+1 is the empty string! Here's the link: http://jsfiddle.net/XC6QM/1/
EDIT: I edited this question to reflect the fact that using charAt(i) doesn't change the behavior of the algorithm.
I think the previous answer was on the right track. But there also looks to me to be an off-by-one error. Shouldn't you be increasing the index for your substring? You don't want to include the ")" in the parent parse, right?
this.parent.parse(pattern.substring(i + 1, pattern.length));
But this will still fail because of the error Juan mentioned. A quick temporary fix to test this would be to convert the i to a number:
this.parent.parse(pattern.substring(+i + 1, pattern.length));
This might do it for you. But you should probably go back and switch away from the for-in loop on the string. I think that's causing your issue. Turn it into an array with str.split('') and then use an integer to loop. That will prevent further such issues.
The real problem is the fact that you were using a for in to iterate through the characters of the string. With the for in loop, your i is going to be a string, therefore, when you try to do i+1, it does string concatenation.
If you change your iteration to be
for(var i=0; i < pattern.length; i++) {
Then it all works fine http://jsfiddle.net/XC6QM/2/
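The effect is easy to reproduce outside the parser. This sketch uses the string the sub-automaton receives for "(a|b)a"; the variable names are made up for illustration.

```javascript
// What the sub-automaton sees after the opening "(" has been consumed.
const pattern = "a|b)a";

// for...in yields property names, which are always strings.
const keys = [];
for (const i in pattern) {
  keys.push(i);
}
console.log(typeof keys[3]); // "string"

// ")" is at index 3, so i + 1 becomes the string "31", not the number 4.
const concatenated = keys[3] + 1;
console.log(concatenated); // "31"

// substring converts "31" to 31, which is past the end of the string.
const broken = pattern.substring(keys[3] + 1, pattern.length);
console.log(JSON.stringify(broken)); // ""

// Converting to a number first gives the intended result.
const fixed = pattern.substring(+keys[3] + 1, pattern.length);
console.log(fixed); // "a"
```

A plain numeric for loop avoids the conversion entirely, because i is a number from the start.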
Scott's answer correctly identified the problem, but I think his solution (converting the indexes to numbers) is not ideal. You're better off looping with a numeric index to begin with.
Also, you should not use brackets to access characters within a string; that does not work in IE 7.
switch(pattern[i]) {
should be
switch(pattern.charAt(i)) {
