I am writing a recursive algorithm to build a finite state automaton by parsing a regular expression. The automaton iterates through the expression, pushing characters to a stack and operators to an "operator stack." When I encounter "(" (indicating a grouping operation), I push a "sub automaton" to the stack and pass the rest of the pattern to the sub automaton to parse. When that automaton encounters ")", it passes the rest of the string up to the parent automaton to finish parsing. Here is the code:
var NFA = function(par) {
this.stack = [];
this.op_stack = [];
this.parent = par;
};
NFA.prototype.parse = function(pattern) {
var done = false;
for(var i in pattern) {
if (done === true) {
break;
}
switch(pattern.charAt(i)) {
case "(":
var sub_nfa = new NFA(this);
this.stack.push(sub_nfa);
sub_nfa.parse(pattern.substring(i+1, pattern.length));
done = true;
break;
case ")":
if (this.parent !== null) {
var len = pattern.length;
/*TROUBLE SPOT*/
this.parent.parse(pattern.substring(i, pattern.length));
done = true;
break;
}
case "*":
this.op_stack.push(operator.KLEENE);
break;
case "|":
this.op_stack.push(operator.UNION);
break;
default:
if(this.stack.length > 0) {
//only push concat after we see at least one symbol
this.op_stack.push(operator.CONCAT);
}
this.stack.push(pattern.charAt(i));
}
}
};
Note the area marked "TROUBLE SPOT". Given the regular expression "(a|b)a", this.parent.parse is called exactly once: when the sub-automaton encounters ")". At this point, pattern.substring(i, pattern.length) = ")a". This "works", but it isn't correct because I need to consume the ")" input before I pass the string to the parent automaton. However, if I change the call to this.parent.parse(pattern.substring(i+1, pattern.length)), parse gets handed the empty string! I have tried stepping through the code and I cannot explain this behavior. What am I missing?
At Juan's suggestion, I made a quick jsfiddle to show the problem when trying to parse "(a|b)a" with this algorithm. In the ")" case, it populates an empty div with the substring at the i index and the substring at the i+1 index. It shows that while there are 2 characters in the substring at i, the substring at i+1 is the empty string! Here's the link: http://jsfiddle.net/XC6QM/1/
EDIT: I edited this question to reflect the fact that using charAt(i) doesn't change the behavior of the algorithm.
I think the previous answer was on the right track. But there also looks to me to be an off-by-one error. Shouldn't you be increasing the index for your substring? You don't want to include the ")" in the parent parse, right?
this.parent.parse(pattern.substring(i + 1, pattern.length));
But this will still fail because of the error Juan mentioned. A quick temporary fix to test this would be to convert the i to a number:
this.parent.parse(pattern.substring(+i + 1, pattern.length));
This might do it for you. But you should probably go back and switch away from the for-in loop on the string. I think that's causing your issue. Turn it into an array with str.split('') and then use an integer to loop. That will prevent further such issues.
The real problem is the fact that you were using a for in to iterate through the characters of the string. With the for in loop, your i is going to be a string, therefore, when you try to do i+1, it does string concatenation.
If you change your iteration to be
for(var i=0; i < pattern.length; i++) {
Then it all works fine http://jsfiddle.net/XC6QM/2/
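To see the string concatenation in isolation, here is a minimal sketch. It assumes the sub-automaton received "a|b)a" (the text after the opening parenthesis), so the ")" sits at index 3. The variable names are illustrative and are not taken from the fiddle.

```javascript
// Pattern as seen by the sub-automaton for "(a|b)a" (assumed for illustration).
var subPattern = "a|b)a";

// for-in enumerates property names, and property names are always strings.
var keyTypes = [];
for (var key in subPattern) {
  keyTypes.push(typeof key);
}

var i = "3"; // the index of ")" exactly as for-in delivers it

// "3" + 1 is string concatenation and yields "31". substring() converts that
// to the number 31, which is past the end of the string, so nothing is left.
var concatenated = i + 1;
var wrongRest = subPattern.substring(i + 1, subPattern.length);

// Unary plus converts the key to a number first, so +i + 1 is 4.
var rightRest = subPattern.substring(+i + 1, subPattern.length);

console.log(keyTypes[0], JSON.stringify(concatenated));
console.log(JSON.stringify(wrongRest), JSON.stringify(rightRest));
```

With a numeric loop counter the conversion is unnecessary, which is why switching the loop is the cleaner fix.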
Scott's answer correctly identified the problem but I think his solution (converting the indexes to numbers) is not ideal. You're better off looping with a numeric index to begin with
Also, you should not use brackets to access characters within a string; that does not work in IE 7.
switch(pattern[i]) {
should be
switch(pattern.charAt(i)) {
Related
I was in an interview the other day and I was asked to write a method that reverses a string recursively.
I started writing a method that calls itself and got stuck.
Here is what I was asked, reverse a string "Obama" recursively in JavaScript.
Here is how far I got.
function reverseString(strToReverse)
{
reverseString(strToReverse);
};
And got stuck, they said NO for i loops.
Anyone got any ideas?
Look at it this way: the reversed string will start with the last letter of the original, followed by all but the last letter, reversed.
So:
function reverseString(strToReverse)
{
if (strToReverse.length <= 1)
return strToReverse;
// last char +
// 0 .. second-last-char, reversed
return strToReverse[strToReverse.length - 1] +
reverseString( strToReverse.substring(0, strToReverse.length - 1) );
}
See the solution by @MaxZoom below for a more concise version.
Note that the tail-recursive style in my own answer provides no advantage over the plain-recursive version since JS interpreters are not required to perform tail call elimination.
[Original]
Here's a tail recursive version that works by removing a character from the front of the input string and prepending it to the front of the "accumulator" string:
function reverse(s, acc) {
if (s.length <= 0) { return acc; }
return reverse(s.slice(1), s.charAt(0) + (acc||''));
}
reverse('foobar'); // => "raboof"
The simplest one:
function reverse(input) {
if (input == null || input.length < 2) return input;
return reverse(input.substring(1)) + input.charAt(0);
}
console.log(reverse('Barack Obama'));
Single line solution. And if they asked I tell them it's recursive in the native code part and not to ask any more stupid questions.
var result = "Obama".split('').reverse().join('');
Output: amabO
The real problem here is not "how to reverse a string". The real problem is, "do you understand recursion". That is what the interview question is about!
So, in order to solve the problem, you need to show you know what recursion is about, not that you can reverse the string "Obama". If all you needed to do was reverse the string "Obama", you could write return "amabO"; see?
In other words, this specific programming task is not what it's all about! The real solution is not to copy and paste the code from the answers here, but to know about recursion.
In brief,
Recursion involves calling the same function again, yes, but that's not all
In order to prevent stack overflow, you MUST ensure that the function doesn't call itself indefinitely
So there's always a condition under which the function can exit without calling itself (again)
And when it does call itself again, it should do so with parameters that make the above condition more likely.
In the case of string operations, one way to make that all happen is to make sure that it calls itself only with strings that are shorter than the one it was called with. Since strings are not of an infinite length, the function can't call itself an infinite number of times that way. So the condition can be that the string has a length of zero, in which case it's impossible to call itself with a shorter string.
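As a sketch of how those points map onto code (the function name reverseAnnotated is made up for this illustration), each comment below marks one of the conditions described above:

```javascript
// Hypothetical helper, annotated to match the points listed above.
function reverseAnnotated(s) {
  // Exit condition: an empty string cannot be shortened any further,
  // so the function returns without calling itself again.
  if (s.length === 0) {
    return s;
  }
  // Recursive call: s.slice(1) is exactly one character shorter than s,
  // so every call moves closer to the exit condition.
  return reverseAnnotated(s.slice(1)) + s.charAt(0);
}

console.log(reverseAnnotated("Obama")); // prints "amabO"
```

Because the argument loses one character per call, a string of length n produces n + 1 calls in total, which is what guarantees termination.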
If you can prove that you know all this, and can use it in a real world program, then you're on your way to passing the interview. Not if you copy and paste some source you found on the internet.
Hope this helps!
We can easily reverse a string recursively using the ternary operator:
function reverseString(strToReverse) {
return strToReverse.length > 1 ? reverseString(strToReverse.slice(1)) + strToReverse.charAt(0) : strToReverse;
}
reverseString("America");
Not the smartest way to reverse a string, but it is recursive:
function reverse(input, output) {
output = output || '';
if (input.length > 0) {
output = output.concat(input[input.length -1]);
return reverse(input.substr(0, input.length - 1), output);
}
return output;
}
console.log(reverse('Obama'));
Here's a jsfiddle
Maybe something like this?
var base = 'Obama',
index = base.length,
result = '';
function recursive(){
if (index == 0) return;
index -= 1;
result += base[index];
recursive();
}
recursive();
alert(result);
jsfiddle:
https://jsfiddle.net/hy1d84jL/
EDIT: You can think of recursion as a loop that would run forever unless you bound it. Here we use it in a "controlled" way and define the bounds: 0 as the minimum and the length of the word "Obama" as the maximum. Each call decrements the index variable by one and appends the character at that position, working from the end of the string, until the index reaches 0. Hope it helps. Nice question.
If the function can only have a single input, I would split the string into smaller and smaller pieces and add them back together in reverse order:
function reverseString(strToReverse){
if (strToReverse.length <= 1) { return strToReverse; }
return reverseString(strToReverse.substr(1, strToReverse.length - 1)) + strToReverse[0];
}
I make a serialized list (with JQuery) and then want to delete a Parameter/Value pair from the list. What's the best way to do this? My code seems kinda clunky to take care of edge conditions that the Parameter/Value pair might be first, last, or in the middle of the list.
function serializeDeleteItem(strSerialize, strParamName)
{
// Delete Parameter/Value pair from Serialized list
var strRegEx;
var rExp;
strRegEx = "((^[?&]?" + strParamName + "\=[^\&]*[&]?))|([&]" + strParamName + "\=[^\&]*)|(" + strParamName + "\=[^\&]*[&])";
rExp = new RegExp(strRegEx, "i");
strSerialize = strSerialize.replace(rExp, "");
return strSerialize;
}
Examples / Test rig at http://jsfiddle.net/7Awzw/
EDIT: Modified the test rig to preserve any leading "?" or "&" so that function could be used with URL Query String or fragment of serialized string
See: http://jsfiddle.net/7Awzw/5/
This version is longer than yours, but imho it's more maintainable. It will find and remove the serialized parameter regardless of where it is in the list.
Notes:
To avoid problems with removing items in the middle of an array, we iterate in reverse.
For exact matching of parameter names, we expect them to start at the beginning of the split string, and to terminate with =.
Assuming there is just one instance of the given param, we break once it's found. If there may be more, just remove that line.
Code
function serializeDeleteItem(strSerialize, strParamName)
{
var arrSerialize = strSerialize.split("&");
var i = arrSerialize.length;
while (i--) {
if (arrSerialize[i].indexOf(strParamName+"=") == 0) {
arrSerialize.splice(i,1);
break; // Found the one and only, we're outta here.
}
}
return arrSerialize.join("&");
}
This fails a few of your tests - the ones with serialized strings starting with '?' or '&'. If you feel those are valid, then you could do this at the start of the function, and all tests will pass:
if (strSerialize.length && (strSerialize[0] == '?' || strSerialize[0] == '&'))
strSerialize = strSerialize.slice(1);
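For the case mentioned in the notes, where the parameter may appear more than once, one possible variant filters the split array instead of splicing it. This is only a sketch and not part of the original answer: the name serializeDeleteAll is made up here, and it assumes an environment that provides Array.prototype.filter (ES5 or later).

```javascript
// Hypothetical variant: removes every occurrence of the parameter.
function serializeDeleteAll(strSerialize, strParamName) {
  // Matching on "name=" at position 0 keeps "ab=1" when deleting "a".
  var prefix = strParamName + "=";
  return strSerialize
    .split("&")
    .filter(function (pair) {
      return pair.indexOf(prefix) !== 0;
    })
    .join("&");
}

console.log(serializeDeleteAll("a=1&b=2&a=3&c=4", "a"));
```

Like the splice version, this does not treat a leading "?" or "&" specially, so the same prefix check would be needed first if those inputs count as valid.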
Performance Comparison
I've put together a test in jsperf to compare the regex approach with this string method. It's reporting that the regex solution is 49% slower than strings, in IE10 on 32-bit Win7.
I am looking for the best possible way to find how many $ symbols are on a page. Is there a better method than reading document.body.innerHTML and counting how many $ characters it contains?
Your question can be split into two parts:
How can we get the webpage text content without HTML tags?
We can generalize the second question a bit.
How can we find the number of string occurrences in another string?
And the 'best possible way to do this':
Amaan got the idea right of finding the text, but let's take it further.
var text = document.body.innerText || document.body.textContent;
Adding textContent to the code helps us cover more browsers, since innerText is not supported by all of them.
The second part is a bit trickier. It all depends on the number of '$' symbol occurrences on the page.
For example, if we know for sure, that there is at least one occurrence of the symbol on the page we would use this code:
text.match(/\$/g).length;
Which performs a global regular expression match on the given string and counts the length of the returned array. It's pretty fast and concise.
On the other hand, if we're not sure if the symbol appears on the page at least once, we should modify the code to look like this:
var match = text.match(/\$/g);
var count = match ? match.length : 0;
This checks the value returned by match, which is null when nothing is found, and falls back to a count of 0 in that case.
I would recommend using the third option only when there is a large occurrence of the symbols in the page or you're going to perform the search many many times. This is a custom function (taken from here) to count the occurrence of the specified string in another string. It performs better than the other two, but is longer and harder to understand.
var occurrences = function(string, subString, allowOverlapping) {
string += "";
subString += "";
if (subString.length <= 0) return string.length + 1;
var n = 0,
pos = 0;
var step = (allowOverlapping) ? (1) : (subString.length);
while (true) {
pos = string.indexOf(subString, pos);
if (pos >= 0) {
n++;
pos += step;
} else break;
}
return (n);
};
occurrences(text, '$');
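The allowOverlapping flag makes no difference for a single character such as '$'; it only matters when the search string can overlap itself. The sketch below restates the same indexOf loop under a hypothetical name, countOccurrences, so that it runs on its own and shows the two modes side by side.

```javascript
// Hypothetical restatement of the indexOf loop above, for illustration only.
function countOccurrences(haystack, needle, allowOverlapping) {
  if (needle.length === 0) {
    return haystack.length + 1; // same convention as the function above
  }
  var count = 0;
  // Overlapping search advances one character; otherwise skip the whole match.
  var step = allowOverlapping ? 1 : needle.length;
  var pos = haystack.indexOf(needle);
  while (pos !== -1) {
    count++;
    pos = haystack.indexOf(needle, pos + step);
  }
  return count;
}

console.log(countOccurrences("$5 and $10", "$"));   // 2
console.log(countOccurrences("aaaa", "aa", false)); // 2
console.log(countOccurrences("aaaa", "aa", true));  // 3
```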
I'm also including a little jsfiddle 'benchmark' so you can compare these three different approaches yourself.
Also: No, there isn't a better way of doing this than just getting the body text and counting how many '$' symbols there are.
You should probably use document.body.innerText or document.body.textContent to avoid getting your HTML give you false positives.
Something like this should work:
document.body.innerText.match(/\$/g).length;
An alternate way I can think of, would be to use window.find like this:
var len = 0;
while(window.find('$') === true){
len++;
}
(This may be unreliable because it depends on where the user clicked last. It will work fine if you do it onload, before any user interaction.)
I want to find the number of tabs at the beginning of a string (and of course I want it to be fast running code ;) ). This is my idea, but not sure if this is the best/fastest choice:
//The regular expression
var findBegTabs = /(^\t+)/g;
//This string has 3 tabs and 2 spaces: "<tab><tab><space>something<space><tab>"
var str = "\t\t something \t";
//Look for the tabs at the beginning
var match = findBegTabs.exec( str );
//We found...
var numOfTabs = ( match ) ? match[ 0 ].length : 0;
Another possibility is to use a loop and charAt:
//This string has 3 tabs and 2 spaces: "<tab><tab><space>something<space><tab>"
var str = "\t\t something \t";
var numOfTabs = 0;
var start = 0;
//Loop and count number of tabs at beg
while ( str.charAt( start++ ) == "\t" ) numOfTabs++;
In general if you can calculate the data by simply iterating through the string and doing a character check at every index, this will be faster than a regex/regular expression which must build up a more complex searching engine. I encourage you to profile this but I think you'll find the straight search is faster.
Note: Your search should use === instead of == here as you don't need to introduce conversions in the equality check.
function numberOfTabs(text) {
var count = 0;
var index = 0;
while (text.charAt(index++) === "\t") {
count++;
}
return count;
}
Try using a profiler (such as jsPerf or one of the many available backend profilers) to create and run benchmarks on your target systems (the browsers and/or interpreters you plan to support for your software).
It's useful to reason about which solution will perform best based on your expected data and target system(s); however, you may sometimes be surprised by which solution actually performs fastest, especially with regard to big-oh analysis and typical data sets.
In your specific case, iterating over characters in the string will likely be faster than regular expression operations.
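If you want a rough local comparison before setting up a proper benchmark, something like the following could serve as a starting point. It is a sketch under stated assumptions: the helper names (countTabsRegex, countTabsLoop, timeIt) are invented here, Date.now() only has millisecond resolution, and the timings will vary between engines and runs, so treat the numbers as indicative at best.

```javascript
// Regex approach: match the run of tabs anchored at the start of the string.
function countTabsRegex(text) {
  var match = /^\t+/.exec(text);
  return match ? match[0].length : 0;
}

// Loop approach: walk forward until the first non-tab character.
function countTabsLoop(text) {
  var count = 0;
  while (text.charAt(count) === "\t") {
    count++;
  }
  return count;
}

// Hypothetical timing helper: runs fn repeatedly and reports elapsed time.
function timeIt(fn, input, iterations) {
  var start = Date.now();
  var result;
  for (var n = 0; n < iterations; n++) {
    result = fn(input);
  }
  return { result: result, ms: Date.now() - start };
}

var sample = "\t\t something \t";
var regexRun = timeIt(countTabsRegex, sample, 100000);
var loopRun = timeIt(countTabsLoop, sample, 100000);

console.log("regex:", regexRun.result, "tabs in", regexRun.ms, "ms");
console.log("loop: ", loopRun.result, "tabs in", loopRun.ms, "ms");
```

Checking that both functions return the same count before comparing times guards against benchmarking two things that do not actually do the same work.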
One-liner (if you find smallest is best):
"\t\tsomething".split(/[^\t]/)[0].length;
i.e. splitting by all non-tab characters, then fetching the first element and obtaining its length.
I am sorry for the very newbie question, but this is driving me mad.
I have a word. For each letter of the word, I find the character's position in one array and then return the character at the same position in a parallel array (a basic cipher). This is what I already have:
*array 1 is the array to search through*
*array 2 is the array to match the index positions*
var character
var position
var newWord
for(var position=0; position < array1.length; position = position +1)
{
character = array1.charAt(count); *finds each characters positions*
position= array1.indexOf(character); *index position of each character from the 1st array*
newWord = array2[position]; *returns matching characters from 2nd array*
}
document.write(othertext + newWord); *returns new string*
The problem I have is that at the moment the function only writes out the last letter of the new word. I do want to add more text to the document.write, but if I place it within the for loop it will write out the new word with the other text in between each word. What I actually want to do is return othertext + newWord rather than use document.write, so that I can use it later on. (I'm just using document.write to test my code.) :-)
I know it's something really simple, but I can't see where I am going wrong. Any advice?
Thanks
Issy
The solution is to build newWord within the loop using += instead of =. Just set it to an empty string before the loop.
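A minimal sketch of that idea follows, assuming the intent is the substitution described in the question: look each letter up in the first array and take the character at the same index in the second. The function name encodeWord, the sample alphabets, and the choice to copy unknown characters through unchanged are all assumptions made for illustration.

```javascript
// Hypothetical helper: array1 is searched, array2 supplies the replacements.
function encodeWord(word, array1, array2) {
  var newWord = ""; // start empty so += has something to append to
  for (var count = 0; count < word.length; count++) {
    var character = word.charAt(count);
    var position = array1.indexOf(character);
    // Assumption: characters that are not in array1 are kept as they are.
    newWord += (position === -1) ? character : array2[position];
  }
  return newWord; // returned rather than written, so the caller can reuse it
}

var array1 = "abcdefgh".split("");
var array2 = "defghabc".split("");
console.log(encodeWord("ace", array1, array2)); // prints "dfh"
```

Returning the string also covers the second part of the question: the caller can concatenate it with any other text once, outside the loop.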
There are other problems with this code. The variable count is never initialized. But let's assume that the loop should be using count instead of position as its principal counter. In that case, if I am not mistaken, this loop will just generate array2 as newWord. The first two lines of the loop's body cancel each other out, in a manner of speaking, and position will always be equal to count, so letters from array2 will be used sequentially from beginning to end.
Could you provide one example of input and desired output, so that we understand what you actually want to accomplish?
A good way of structuring your code and your question is that you define a function that you need to implement. In your case this could look like:
function transcode(sourceAlphabet, destinationAlphabet, s) {
var newWord = "";
// TODO: write some code
return newWord;
}
That way, you clearly state what you want and which parameters are involved. It is also easy to write automatic tests later, for example:
function testTranscode(sourceAlphabet, destinationAlphabet, s, expected) {
var actual = transcode(sourceAlphabet, destinationAlphabet, s);
if (actual !== expected) {
document.writeln('<p class="error">FAIL: expected "' + expected + '", got "' + actual + '".</p>');
} else {
document.writeln('<p class="ok">OK: "' + actual + '".</p>');
}
}
function test() {
testTranscode('abcdefgh', 'defghabc', 'ace', 'dfh');
}
test();