Extracting nested function names from a JavaScript function - javascript

Given a function, I'm trying to find out the names of the nested functions in it (only one level deep).
A simple regex against toString() worked until I started using functions with comments in them. It turns out that some browsers store parts of the raw source while others reconstruct the source from what's compiled, so the output of toString() may contain the original code comments in some browsers. As an aside, here are my findings:
Test subject
function/*post-keyword*/fn/*post-name*/()/*post-parens*/{
/*inside*/
}
document.write(fn.toString());
Results
Browser   post-keyword   post-name   post-parens   inside
-------   ------------   ---------   -----------   ------
Firefox   No             No          No            No
Safari    No             No          No            No
Chrome    No             No          Yes           Yes
IE        Yes            Yes         Yes           Yes
Opera     Yes            Yes         Yes           Yes
I'm looking for a cross-browser way of extracting the nested function names from a given function. The solution should be able to extract "fn1" and "fn2" out of the following function:
function someFn() {
/**
* Some comment
*/
function fn1() {
alert("/*This is not a comment, it's a string literal*/");
}
function // keyword
fn2 // name
(x, y) // arguments
{
/*
body
*/
}
var f = function () { // anonymous, ignore
};
}
The solution doesn't have to be pure regex.
Update: You can assume that we're always dealing with valid, properly nested code with all string literals, comments and blocks terminated properly. This is because I'm parsing a function that has already been compiled as a valid function.
Update2: If you're wondering about the motivation behind this: I'm working on a new JavaScript unit testing framework that's called jsUnity. There are several different formats in which you can write tests & test suites. One of them is a function:
function myTests() {
function setUp() {
}
function tearDown() {
}
function testSomething() {
}
function testSomethingElse() {
}
}
Since the functions are hidden inside a closure, there's no way for me to invoke them from outside the function. I therefore convert the outer function to a string, extract the function names, append a "now run the given inner function" statement at the bottom and recompile it as a function with new Function(). If the test functions have comments in them, it gets tricky to extract the function names and to avoid false positives. Hence I'm soliciting the help of the SO community...
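The recompile step described above can be sketched roughly as follows. This is only an illustration and not jsUnity's actual code: runInner and sampleTests are hypothetical names, and the sketch assumes that the first "{" and the last "}" in the toString() output delimit the function body (which breaks if a comment or default parameter before the body contains a brace).

```javascript
// Hypothetical helper (not part of jsUnity): runs one inner function by name.
function runInner(outerFn, innerName) {
    var src = outerFn.toString();
    // Assumption: the first "{" opens the body and the last "}" closes it.
    var body = src.slice(src.indexOf('{') + 1, src.lastIndexOf('}'));
    // Recompile the body with a trailing statement that calls the inner function.
    var recompiled = new Function(body + '\nreturn ' + innerName + '();');
    return recompiled();
}

function sampleTests() {
    function testSomething() {
        return 'testSomething ran';
    }
    function testSomethingElse() {
        return 'testSomethingElse ran';
    }
}

var outcome = runInner(sampleTests, 'testSomething');
```

Note that the recompiled function no longer closes over the original scope, so inner functions that rely on variables outside the outer function would not see them.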
Update3: I've come up with a new solution that doesn't require a lot of semantic fiddling with code. I use the original source itself to probe for first-level functions.

Cosmetic changes and bugfix
The regular expression must read \bfunction\b to avoid false positives!
Functions defined in blocks (e.g. in the bodies of loops) will be ignored if nested does not evaluate to true.
function tokenize(code) {
var code = code.split(/\\./).join(''),
regex = /\bfunction\b|\(|\)|\{|\}|\/\*|\*\/|\/\/|"|'|\n|\s+/mg,
tokens = [],
pos = 0;
for(var matches; matches = regex.exec(code); pos = regex.lastIndex) {
var match = matches[0],
matchStart = regex.lastIndex - match.length;
if(pos < matchStart)
tokens.push(code.substring(pos, matchStart));
tokens.push(match);
}
if(pos < code.length)
tokens.push(code.substring(pos));
return tokens;
}
var separators = {
'/*' : '*/',
'//' : '\n',
'"' : '"',
'\'' : '\''
};
function extractInnerFunctionNames(func, nested) {
var names = [],
tokens = tokenize(func.toString()),
level = 0;
for(var i = 0; i < tokens.length; ++i) {
var token = tokens[i];
switch(token) {
case '{':
++level;
break;
case '}':
--level;
break;
case '/*':
case '//':
case '"':
case '\'':
var sep = separators[token];
while(++i < tokens.length && tokens[i] !== sep);
break;
case 'function':
if(level === 1 || (nested && level)) {
while(++i < tokens.length) {
token = tokens[i];
if(token === '(')
break;
if(/^\s+$/.test(token))
continue;
if(token === '/*' || token === '//') {
var sep = separators[token];
while(++i < tokens.length && tokens[i] !== sep);
continue;
}
names.push(token);
break;
}
}
break;
}
}
return names;
}

The academically correct way to handle this would be creating a lexer and parser for a subset of Javascript (the function definition), generated by a formal grammar (see this link on the subject, for example).
Take a look at JS/CC, for a Javascript parser generator.
Other solutions are just regex hacks, that lead to unmaintainable/unreadable code and probably to hidden parsing errors in particular cases.
As a side note, I'm not sure I understand why you aren't specifying the list of unit test functions in your product in a different way (an array of functions?).

Would it matter if you defined your tests like:
var tests = {
test1: function (){
console.log( "test 1 ran" );
},
test2: function (){
console.log( "test 2 ran" );
},
test3: function (){
console.log( "test 3 ran" );
}
};
Then you could run them as easily as this:
for( var test in tests ){
tests[test]();
}
Which looks much easier.
You can even carry the tests around in JSON that way.

I like what you're doing with jsUnity. And when I see something I like (and have enough free time ;)), I try to reimplement it in a way which better suits my needs (also known as 'not-invented-here' syndrome).
The result of my efforts is described in this article, the code can be found here.
Feel free to rip-out any parts you like - you can assume the code to be in the public domain.

The trick is to basically generate a probe function that will check if a given name is the name of a nested (first-level) function. The probe function uses the function body of the original function, prefixed with code to check the given name within the scope of the probe function. OK, this can be better explained with the actual code:
function splitFunction(fn) {
var tokens =
/^[\s\r\n]*function[\s\r\n]*([^\(\s\r\n]*?)[\s\r\n]*\([^\)\s\r\n]*\)[\s\r\n]*\{((?:[^}]*\}?)+)\}\s*$/
.exec(fn);
if (!tokens) {
throw "Invalid function.";
}
return {
name: tokens[1],
body: tokens[2]
};
}
var probeOutside = function () {
return eval(
"typeof $fn$ === \"function\""
.split("$fn$")
.join(arguments[0]));
};
function extractFunctions(fn) {
var fnParts = splitFunction(fn);
var probeInside = new Function(
splitFunction(probeOutside).body + fnParts.body);
var tokens;
var fns = [];
var tokenRe = /(\w+)/g;
while ((tokens = tokenRe.exec(fnParts.body))) {
var token = tokens[1];
try {
if (probeInside(token) && !probeOutside(token)) {
fns.push(token);
}
} catch (e) {
// ignore token
}
}
return fns;
}
Runs fine against the following on Firefox, IE, Safari, Opera and Chrome:
function testGlobalFn() {}
function testSuite() {
function testA() {
function testNested() {
}
}
// function testComment() {}
// function testGlobalFn() {}
function // comments
testB /* don't matter */
() // neither does whitespace
{
var s = "function testString() {}";
}
}
document.write(extractFunctions(testSuite));
// writes "testA,testB"
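The core of the probing idea can be reduced to a few lines. The following is a minimal sketch, not the code above: makeProbe is a hypothetical helper that takes a function body as a string, and it assumes the names passed in are plain identifiers (they are spliced into eval unchecked). Function declarations are hoisted, so they are visible to the typeof check even though they appear after the return statement.

```javascript
// Hypothetical helper: builds a probe that reports whether `name` is
// declared as a function at the top level of `body`.
function makeProbe(body) {
    // The check runs first; the declarations in `body` are hoisted into
    // the same scope, so `typeof` can see them.
    return new Function(
        'name',
        'return eval("typeof " + name) === "function";\n' + body
    );
}

var probe = makeProbe(
    'function testA() { function testNested() {} }\n' +
    'var notAFunction = 1;\n' +
    'function testB() {}'
);

var probeResults = {
    testA: probe('testA'),
    testB: probe('testB'),
    testNested: probe('testNested'),
    notAFunction: probe('notAFunction')
};
```

Here testNested is reported as absent because it only exists inside testA's scope, which is exactly the "first level only" behaviour the question asks for.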
Edit by Christoph, with inline answers by Ates:
Some comments, questions and suggestions:
Is there a reason for checking
typeof $fn$ !== "undefined" && $fn$ instanceof Function
instead of using
typeof $fn$ === "function"
instanceof is less safe than using typeof because it will fail when passing objects between frame boundaries. I know that IE returns wrong typeof information for some built-in functions, but afaik instanceof will fail in these cases as well, so why the more complicated but less safe test?
[AG] There was absolutely no legitimate reason for it. I've changed it to the simpler "typeof === function" as you suggested.
How are you going to prevent the wrongful exclusion of functions for which a function with the same name exists in the outer scope, e.g.
function foo() {}
function TestSuite() {
function foo() {}
}
[AG] I have no idea. Can you think of anything? Which one do you think is better: (a) wrongful exclusion of a function inside, or (b) wrongful inclusion of a function outside?
I started to think that the ideal solution will be a combination of your solution and this probing approach; figure out the real function names that are inside the closure and then use probing to collect references to the actual functions (so that they can be directly called from outside).
It might be possible to modify your implementation so that the function's body only has to be eval()'ed once and not once per token, which is rather inefficient. I might try to see what I can come up with when I have some more free time today...
[AG] Note that the entire function body is not eval'd. It's only the bit that's inserted to the top of the body.
[CG] You're right - the function's body only gets parsed once during the creation of probeInside - you did some nice hacking there ;). I have some free time today, so let's see what I can come up with...
A solution that uses your parsing method to extract the real function names could just use one eval to return an array of references to the actual functions:
return eval("[" + fnList + "]");
[CG] Here is what I came up with. An added bonus is that the outer function stays intact and thus may still act as a closure around the inner functions. Just copy the code into a blank page and see if it works - no guarantees that it is bug-free ;)
<pre><script>
var extractFunctions = (function() {
var level, names;
function tokenize(code) {
var code = code.split(/\\./).join(''),
regex = /\bfunction\b|\(|\)|\{|\}|\/\*|\*\/|\/\/|"|'|\n|\s+|\\/mg,
tokens = [],
pos = 0;
for(var matches; matches = regex.exec(code); pos = regex.lastIndex) {
var match = matches[0],
matchStart = regex.lastIndex - match.length;
if(pos < matchStart)
tokens.push(code.substring(pos, matchStart));
tokens.push(match);
}
if(pos < code.length)
tokens.push(code.substring(pos));
return tokens;
}
function parse(tokens, callback) {
for(var i = 0; i < tokens.length; ++i) {
var j = callback(tokens[i], tokens, i);
if(j === false) break;
else if(typeof j === 'number') i = j;
}
}
function skip(tokens, idx, limiter, escapes) {
while(++idx < tokens.length && tokens[idx] !== limiter)
if(escapes && tokens[idx] === '\\') ++idx;
return idx;
}
function removeDeclaration(token, tokens, idx) {
switch(token) {
case '/*':
return skip(tokens, idx, '*/');
case '//':
return skip(tokens, idx, '\n');
case ')':
tokens.splice(0, idx + 1);
return false;
}
}
function extractTopLevelFunctionNames(token, tokens, idx) {
switch(token) {
case '{':
++level;
return;
case '}':
--level;
return;
case '/*':
return skip(tokens, idx, '*/');
case '//':
return skip(tokens, idx, '\n');
case '"':
case '\'':
return skip(tokens, idx, token, true);
case 'function':
if(level === 1) {
while(++idx < tokens.length) {
token = tokens[idx];
if(token === '(')
return idx;
if(/^\s+$/.test(token))
continue;
if(token === '/*') {
idx = skip(tokens, idx, '*/');
continue;
}
if(token === '//') {
idx = skip(tokens, idx, '\n');
continue;
}
names.push(token);
return idx;
}
}
return;
}
}
function getTopLevelFunctionRefs(func) {
var tokens = tokenize(func.toString());
parse(tokens, removeDeclaration);
names = [], level = 0;
parse(tokens, extractTopLevelFunctionNames);
var code = tokens.join('') + '\nthis._refs = [' +
names.join(',') + '];';
return (new (new Function(code)))._refs;
}
return getTopLevelFunctionRefs;
})();
function testSuite() {
function testA() {
function testNested() {
}
}
// function testComment() {}
// function testGlobalFn() {}
function // comments
testB /* don't matter */
() // neither does whitespace
{
var s = "function testString() {}";
}
}
document.writeln(extractFunctions(testSuite).join('\n---\n'));
</script></pre>
Not as elegant as LISP macros, but it's still nice to see what JavaScript is capable of ;)

<pre>
<script type="text/javascript">
function someFn() {
/**
* Some comment
*/
function fn1() {
alert("/*This is not a comment, it's a string literal*/");
}
function // keyword
fn2 // name
(x, y) // arguments
{
/*
body
*/
}
function fn3() {
alert("this is the word function in a string literal");
}
var f = function () { // anonymous, ignore
};
}
var s = someFn.toString();
// remove inline comments
s = s.replace(/\/\/.*/g, "");
// compact all whitespace to a single space
s = s.replace(/\s{2,}/g, " ");
// remove all block comments, including those in string literals
s = s.replace(/\/\*.*?\*\//g, "");
document.writeln(s);
// remove string literals to avoid false matches with the keyword 'function'
s = s.replace(/'.*?'/g, "");
s = s.replace(/".*?"/g, "");
document.writeln(s);
// find all the function definitions
var matches = s.match(/function(.*?)\(/g);
for (var ii = 1; ii < matches.length; ++ii) {
// extract the function name
var funcName = matches[ii].replace(/function(.+)\(/, "$1");
// remove any remaining leading or trailing whitespace
funcName = funcName.replace(/\s+$|^\s+/g, "");
if (funcName === '') {
// anonymous function, discard
continue;
}
// output the results
document.writeln('[' + funcName + ']');
}
</script>
</pre>
I'm sure I missed something, but from your requirements in the original question, I think I've met the goal, including getting rid of the possibility of finding the function keyword in string literals.
One last point, I don't see any problem with mangling the string literals in the function blocks. Your requirement was to find the function names, so I didn't bother trying to preserve the function content.

Related

Get line number with abstract syntax tree in node js

I'm making a program that takes some code as a parameter and transforms it by adding some console.log calls. This is the program:
const escodegen = require('escodegen');
const espree = require('espree');
const estraverse = require('estraverse');
function addLogging(code) {
const ast = espree.parse(code);
estraverse.traverse(ast, {
enter: function(node, parent) {
if (node.type === 'FunctionDeclaration' ||
node.type === 'FunctionExpression') {
addBeforeCode(node);
}
}
});
return escodegen.generate(ast);
}
function addBeforeCode(node) {
const name = node.id ? node.id.name : '<anonymous function>';
const beforeCode = "console.log('Entering " + name + "()');";
const beforeNodes = espree.parse(beforeCode).body;
node.body.body = beforeNodes.concat(node.body.body);
}
So if we pass this code to the function:
console.log(addLogging(`
function foo(a, b) {
var x = 'blah';
var y = (function () {
return 3;
})();
}
foo(1, 'wut', 3);
`));
This is the output of this program:
function foo(a, b) {
console.log('Entering foo()');
var x = 'blah';
var y = function () {
console.log('Entering <anonymous function>()');
return 3;
}();
}
foo(1, 'wut', 3);
And this is the AST (Abstract Syntax Tree) for the code passed to addLogging:
https://astexplorer.net/#/gist/b5826862c47dfb7dbb54cec15079b430/latest
So I wanted to add more information to the console logs, for example the line number we are on. As far as I know, each node in the AST has 'start' and 'end' values, which indicate the character offsets at which that node starts and ends. How can I use these to get the line number we are on? It seems pretty confusing to me, to be honest. I was thinking about splitting the file by "\n" so that I have the total number of lines, but then how can I know which one I'm on?
Thank you in advance.
Your idea is fine. First find the offsets in the original code where each line starts. Then compare the start index of the node with those collected indexes to determine the line number.
I will assume here that you want the reported line number to refer to the original code, not the code as it is returned by your function.
So from bottom up, make the following changes. First expect the line number as argument to addBeforeCode:
function addBeforeCode(node, lineNum) {
const name = node.id ? node.id.name : '<anonymous function>';
const beforeCode = `console.log("${lineNum}: Entering ${name}()");`;
const beforeNodes = espree.parse(beforeCode).body;
node.body.body = beforeNodes.concat(node.body.body);
}
Define a function to collect the offsets in the original code that correspond to the starts of the lines:
function getLineOffsets(str) {
const regex = /\r?\n/g;
const offsets = [0];
while (regex.exec(str)) offsets.push(regex.lastIndex);
offsets.push(str.length);
return offsets;
}
NB: If you have support for matchAll, then the above can be written a bit more concisely.
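For reference, a matchAll-based variant might look like the sketch below. The function name getLineOffsetsWithMatchAll is chosen here only to distinguish it from the version above, and the sketch assumes an environment that supports String.prototype.matchAll (ES2020).

```javascript
// Sketch: same result as getLineOffsets, built on String.prototype.matchAll.
function getLineOffsetsWithMatchAll(str) {
    const offsets = [0];
    // Each match is a line break; the next line starts right after it.
    for (const m of str.matchAll(/\r?\n/g)) {
        offsets.push(m.index + m[0].length);
    }
    // Sentinel entry marking the end of the input.
    offsets.push(str.length);
    return offsets;
}

const sampleOffsets = getLineOffsetsWithMatchAll("a\nbc\r\nd");
// sampleOffsets is [0, 2, 6, 7]
```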
Then use the above in your main function:
function addLogging(code) {
const lineStarts = getLineOffsets(code); // <---
let lineNum = 0; // <---
const ast = espree.parse(code);
estraverse.traverse(ast, {
enter: function(node, parent) {
if (node.type === 'FunctionDeclaration' ||
node.type === 'FunctionExpression') {
// Look for the corresponding line number in the source code
// (fall back to the block itself when the function body is empty):
while (lineStarts[lineNum] < (node.body.body[0] || node.body).start) lineNum++;
// Actually we now went one line too far, so pass one less:
addBeforeCode(node, lineNum-1);
}
}
});
return escodegen.generate(ast);
}
Unrelated to your question, but be aware that functions can be arrow functions, which may have an expression body. Those would not have a block, and you would not be able to inject a console.log in the same way. You might want to make your code capable of dealing with that or, alternatively, skip over those.
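One way to skip the problematic case is to test the shape of the node before injecting anything. The sketch below uses a hypothetical helper, hasBlockBody, and hand-written objects that only mimic the ESTree node shape (a real parser would produce these nodes); in ESTree, an arrow function with an expression body has a body whose type is not BlockStatement.

```javascript
// Hypothetical helper: true when statements can be prepended to the body.
function hasBlockBody(node) {
    return Boolean(node.body) && node.body.type === 'BlockStatement';
}

// Hand-written stand-ins for parser output, for illustration only.
var blockArrow = {                       // () => { return 1; }
    type: 'ArrowFunctionExpression',
    expression: false,
    body: { type: 'BlockStatement', body: [] }
};
var expressionArrow = {                  // () => 1
    type: 'ArrowFunctionExpression',
    expression: true,
    body: { type: 'Literal', value: 1 }
};

var injectable = [blockArrow, expressionArrow].filter(hasBlockBody);
```

With a guard like this in the enter callback, ArrowFunctionExpression could be added to the list of node types without risking a crash on expression-bodied arrows.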

How to add keyword to acorn or esprima parser

I am working on a language that transpiles to JavaScript and has a similar syntax. However, I want to include some new types of block statements. For syntax purposes they are the same as an IfStatement. How can I get esprima or acorn to parse this program MyStatement {a=1;} without throwing an error? It's fine if it calls it an IfStatement. I would prefer not to fork esprima.
It turns out that the plugin capabilities of acorn are not really documented. It seems like forking acorn would be the easiest route. In this case, it is as simple as searching for occurrences of _if and following a similar pattern for _MyStatement.
However it is possible to write a plugin to accomplish what I was trying to do. It seems a bit of a hack, but here is the code. The basic steps are:
Extend Parser and add to the list of keywords that will be recognized by the first pass.
Create a TokenType for the new keyword and add it to Parser.acorn.keywordTypes, then extend parseStatement so that it processes the new TokenType.
Create a handler for the new TokenType which adds information to the Abstract Syntax Tree as required by the keyword functionality and also consumes tokens, using calls like this.expect(tt.parenR) to eat a ')' or this.parseExpression() to process an entire expression.
Here is the code:
var program =
`
MyStatement {
MyStatement(true) {
MyStatement() {
var a = 1;
}
}
if (1) {
var c = 0;
}
}
`;
const acorn = require("acorn");
const Parser = acorn.Parser;
const tt = acorn.tokTypes; //used to access standard token types like "("
const TokenType = acorn.TokenType; //used to create new types of Tokens.
//add a new keyword to Acorn.
Parser.acorn.keywordTypes["MyStatement"] = new TokenType("MyStatement",{keyword: "MyStatement"});
//const isIdentifierStart = acorn.isIdentifierStart;
function wordsRegexp(words) {
return new RegExp("^(?:" + words.replace(/ /g, "|") + ")$")
}
var bruceware = function(Parser) {
return class extends Parser {
parse(program) {
console.log("hooking parse.");
//it appears it is necessary to add keywords here also.
var newKeywords = "break case catch continue debugger default do else finally for function if return switch throw try var while with null true false instanceof typeof void delete new in this const class extends export import super";
newKeywords += " MyStatement";
this.keywords = wordsRegexp(newKeywords);
return(super.parse(program));
}
parseStatement(context, topLevel, exports) {
var starttype = this.type;
console.log("!!!hooking parseStatement", starttype);
if (starttype == Parser.acorn.keywordTypes["MyStatement"]) {
console.log("Parse MyStatement");
var node = this.startNode();
return this.parseMyStatement(node);
}
else {
return(super.parseStatement(context, topLevel, exports));
}
}
parseMyStatement(node) {
console.log("parse MyStatement");
this.next();
//In my language, MyStatement doesn't have to have a parameter. It could be called as `MyStatement { ... }`
if (this.type == tt.parenL) {
node.test = this.parseOptionalParenExpression();
}
else {
node.test = 0; //If there is no test, just make it 0 for now (note that this may break code generation later).
}
node.isMyStatement = true; //set a flag so we know that this is a "MyStatement" instead of an if statement.
//process the body of the block just like a normal if statement for now.
// allow function declarations in branches, but only in non-strict mode
node.consequent = this.parseStatement("if");
//node.alternate = this.eat(acornTypes["else"]) ? this.parseStatement("if") : null;
return this.finishNode(node, "IfStatement")
};
//In my language, MyStatement optionally has a parameter. It can also be called as MyStatement() { ... }
parseOptionalParenExpression() {
this.expect(tt.parenL);
//see what type it is
console.log("Type: ", this.type);
//allow it to be blank.
var val = 0; //for now just make the condition 0. Note that this may break code generation later.
if (this.type == tt.parenR) {
this.expect(tt.parenR);
}
else {
val = this.parseExpression();
this.expect(tt.parenR);
}
return val
};
}
}
process.stdout.write('\033c'); //cls
var result2 = Parser.extend(bruceware).parse(program); //attempt to parse
console.log(JSON.stringify(result2,null,' ')); //show the results.

Conventions for "function expression" declaration

I'm new to JS and it's sometimes hard for me to get used to its code conventions. So I have a question: how should I declare a function expression? Look at my code. Is the way I did it right, or are there better practices?
function onAddButtonClick() {
var engWord = document.getElementById('engWord'),
japWord = document.getElementById('japWord'),
engVal = engWord.value,
japVal = japWord.value,
engExpr = (engVal !== ""),
japExpr = (japVal !== ""),
duplicateNum,
checkImg,
numOfWords;
duplicateNum = (function () {
var i,
pair;
for (i = 0; i < dictionary.length; i++) {
pair = dictionary[i];
if (pair.eng === engVal && pair.jap === japVal) {
return 3;
} else if (pair.jap === japVal) {
return 2;
} else if (pair.eng === engVal) {
return 1;
}
}
return 0;
}());
//remove focus from inputs
engWord.blur();
japWord.blur();
...
}
Thanks in advance.
You did fine. The opening ( is not syntactically required in this context, but it serves as a useful warning to the human reader of the code about what is going on. The convention helps.
At the end, the invoking parens () can go inside or outside the closing ). Douglas Crockford recommends inside, and many linters check for this. Despite his "dog balls" remark, it really doesn't matter.
By the way, a function expression that is defined and then immediately run is called an IIFE (Immediately Invoked Function Expression).
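As a small illustration of the two placements (the variable names are made up for this example), both forms below are valid and behave identically:

```javascript
// Invoking parens inside the wrapping parens (the style Crockford recommends).
var insideStyle = (function () {
    return 'computed once';
}());

// Invoking parens outside the wrapping parens.
var outsideStyle = (function () {
    return 'computed once';
})();

// On the right-hand side of an assignment the wrapping parens are optional,
// because the function keyword is already in expression position.
var noWrapper = function () {
    return 'computed once';
}();
```

The last form is the one the wrapping parentheses guard against: without them, a reader has to scan to the end of the function to notice that the variable holds the result rather than the function itself.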

JavaScript deferred execution

For a personal challenge, I'm implementing LINQ in JavaScript (well, a set of functions with LINQ-like functionality). However, as of right now, the functions are processing the data immediately; that's correct behavior for some functions (such as Sum or Aggregate), but incorrect for others (such as Select or Where).
I'm curious if there's a construct in JavaScript that could get me the same behavior as in .Net, where no real processing happens until the collection is enumerated or a function with immediate execution is used.
Note: I believe this task (implementing LINQ in JS) has already been done. That's not the point. This is a challenge to myself from myself, which is likely to help me increase my understanding of LINQ (and, coincidentally, JS). Beyond personal edification, I'm going to be using LINQ for my job soon, may use JS for my job depending on the needs of individual projects, and I use JS for some things outside of work.
Edit: It seems I've attracted people unfamiliar with LINQ, so I suppose I should give some explanation on that front. LINQ is Language-INtegrated Query, something from .Net. LINQ allows for SQL-like queries on many data sources (including actual SQL relational databases), such as LINQ to Objects, which is what I'm trying to achieve.
One of the features of LINQ is deferred execution on many of the methods. If I have a collection customers and call var query = customers.Where(c => c.Age > 40); (or what it would end up being in JS, var query = customers.Where(function (c) { return c.Age > 40; });), the return value is an interface type, and the actual processing of the collection (returning the subset of the collection containing only customers older than 40) hasn't happened yet. When I use one of the methods without deferred execution (eg, query.First() or query.ToArray()), then all of the deferred processing happens. This could be a chain, such as customers.Where(...).Skip(5).Select(...).OrderBy(...) (each "..." being a function).
The upshot is that code like this:
var collection = [1, 2, 3, 4, 5];
var query = collection.Where(function (n) { return n % 2 == 0; });
collection.push(6);
alert(query.Max());
Would result in "6".
As an addendum, I'm currently implementing this project by prototyping my methods onto both Object and Array, iterating over the elements of this, and skipping any elements which are functions. Something like making an Enumerable class may be superior (and in fact may be required for my deferred execution plan, if something like returning a function or an anonymous object is required), but that's what I've currently got. My functions generally appear as something along these lines:
Object.prototype.Distinct = Array.prototype.Distinct = function (comparer) {
comparer = comparer || function (a, b) { return a == b; };
var result = [];
for (var idx in this) {
var item = this[idx];
if (typeof item == "function") continue;
if (!result.Contains(item, comparer)) result.push(item);
}
return result;
};
Fundamentally what you need to do is return objects from your functions rather than performing operations. The objects you return will contain the code necessary to perform the operations in the future. Consider an example use case:
var myCollection = [];
for(var i = 0; i < 100; i++) { myCollection.push(i); }
var query = Iter(myCollection).Where(function(v) { return v % 2 === 0; })
.Skip(5).Select(function(v) { return v*2; });
var v;
while((v = query.Next()) !== null) {
console.log(v);
}
We expect as output:
20
24
28
...
188
192
196
In order to do that we define the methods .Where(), .Skip(), and .Select() to return instances of classes with overridden versions of the .Next() method. Working code that supports this functionality (set trace to true to observe that the execution order is lazy):
var trace = false;
function extend(target, src) {
for(var k in src) {
target[k] = src[k];
}
return target;
}
function Iter(wrapThis) {
if(wrapThis.Next) {
return wrapThis;
} else {
return new ArrayIter(wrapThis);
}
}
Iter.prototype = {
constructor: Iter,
Where: function(fn) { return new WhereIter(this, fn); },
Skip: function(count) { return new SkipIter(this, count); },
Select: function(fn) { return new SelectIter(this, fn); }
};
function ArrayIter(arr) {
this.arr = arr.slice();
this.idx = 0;
}
ArrayIter.prototype = extend(Object.create(Iter.prototype),
{
constructor: ArrayIter,
Next: function() {
if(this.idx >= this.arr.length) {
return null;
} else {
return this.arr[this.idx++];
}
}
});
function WhereIter(src, filter) {
this.src = src; this.filter = filter;
}
WhereIter.prototype = extend(Object.create(Iter.prototype), {
constructor: WhereIter,
Next: function() {
var v;
while(true) {
v = this.src.Next();
trace && console.log('Where processing: ' + v);
if(v === null || this.filter.call(this, v)) { break; }
}
return v;
}
});
function SkipIter(src, count) {
this.src = src; this.count = count;
this.skipped = 0;
}
SkipIter.prototype = extend(Object.create(Iter.prototype), {
constructor: SkipIter,
Next: function() {
var v;
while(this.count > this.skipped++) {
v = this.src.Next();
trace && console.log('Skip processing: ' + v);
if(v === null) { return v; }
}
return this.src.Next();
}
});
function SelectIter(src, fn) {
this.src = src; this.fn = fn;
}
SelectIter.prototype = extend(Object.create(Iter.prototype), {
constructor: SelectIter,
Next: function() {
var v = this.src.Next();
trace && console.log('Select processing: ' + v);
if(v === null) { return null; }
return this.fn.call(this, v);
}
});
var myCollection = [];
for(var i = 0; i < 100; i++) {
myCollection.push(i);
}
var query = Iter(myCollection).Where(function(v) { return v % 2 === 0; })
.Skip(5).Select(function(v) { return v*2; });
var v;
while((v = query.Next()) !== null) {
console.log(v);
}
You also may want to look into "string lambdas" to make your queries much more readable. That would allow you to say "v*2" instead of function(v) { return v*2; }
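If ES2015 generators are available, the same deferred behaviour can be sketched with much less machinery. This is a different technique from the iterator classes above, shown only for comparison; the function names where, select and max are hypothetical. Because a generator body does not start running until it is first iterated, the example from the question (pushing 6 after building the query) sees the new element.

```javascript
// Lazy filter: nothing runs until the result is iterated.
function* where(source, predicate) {
    for (const item of source) {
        if (predicate(item)) yield item;
    }
}

// Lazy projection.
function* select(source, fn) {
    for (const item of source) {
        yield fn(item);
    }
}

// Eager consumer: iterating here is what triggers the deferred work.
function max(source) {
    let best;
    for (const item of source) {
        if (best === undefined || item > best) best = item;
    }
    return best;
}

const collection = [1, 2, 3, 4, 5];
const evens = where(collection, n => n % 2 === 0);
collection.push(6);               // added after the query was built
const largestEven = max(evens);   // 6, because the array is read only now
```

One limitation of this sketch is that a generator object can be iterated only once; a reusable query would need to wrap the generator function in an object with its own Symbol.iterator method.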
I am not entirely clear on what exactly you wish to do, but I think what you should look into is the defineProperty method. What you would probably wish to do is then to redefine the .length property and execute the code only once it's read. Or if you want to do it only once the property itself is read do it at that point. Not sure how LINQ works or even what it is, so that's why I am a bit vague. Either way, with defineProperty you can do something like
Object.defineProperty(o, "a", { get : function(){ return 1; } });
Allowing you to do actions only once the property is accessed (and you can do a lot more than that as well).
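As a rough sketch of that idea, the getter below does its work only on first access and then replaces itself with a plain value. The object report, the property name total and the counter computeCount are made up for illustration.

```javascript
var computeCount = 0;
var report = { values: [1, 2, 3] };

Object.defineProperty(report, 'total', {
    configurable: true, // required so the getter can be replaced below
    get: function () {
        computeCount++;
        var sum = this.values.reduce(function (a, b) { return a + b; }, 0);
        // Cache the result by turning the accessor into a data property.
        Object.defineProperty(this, 'total', { value: sum });
        return sum;
    }
});

var firstRead = report.total;   // triggers the computation
var secondRead = report.total;  // served from the cached value
```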

Find and execute Javascript fragments in a bunch of HTML

I need to detect and eval the Javascript code contained in a string.
The following code works, but it only evaluates the first <script>...</script> it finds.
function executeJs(html) {
var scriptFragment = "<script(.+?)>(.+?)<\/script>";
match = new RegExp(scriptFragment, "im");
var matches = html.match(match);
if (matches.length >= 2) {
eval(matches[2]);
}
}
I wonder if there is a method that allows me to iterate and execute all Javascript fragments.
The reason it only takes the first one is because you're missing the g flag. Try this:
function executeJs(html) {
var scriptFragment = '<script(.*?)>(.+?)<\/script>';
var re = new RegExp(scriptFragment, 'gim'), match;
while ((match = re.exec(html)) != null) {
eval(match[2]);
}
}
executeJs('<script>alert("hello")</script>abc<script>alert("world")</script>');
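Note that . does not match line breaks, so the pattern above skips scripts whose body spans several lines. A variant that only collects the script bodies, and leaves the decision to eval them to the caller, might look like this sketch. The name extractScripts is hypothetical, and like any regex approach it assumes reasonably well-formed markup.

```javascript
// Sketch: returns the text content of every <script> element in the string.
function extractScripts(html) {
    // [\s\S] matches any character including newlines; [^>]* skips attributes.
    var re = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
    var bodies = [];
    var match;
    while ((match = re.exec(html)) !== null) {
        bodies.push(match[1]);
    }
    return bodies;
}

var found = extractScripts(
    '<script>var a = 1;\nvar b = 2;</script>abc' +
    '<script type="text/javascript">c()</script>'
);
```

The caller can then loop over the returned array and pass each entry to eval, as in the function above.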
Here is some code that does the same thing in a slightly different way. You can pass the string to the function and it will eval all the script tags and return the cleaned source(without script). There is also a slight difference in the way IE handles it, that is handled in the code as well, you may adapt it to your requirements. Also, the evaluated code has the global context. Hope it helps.
function parseScript(_source)
{
var source = _source;
var scripts = new Array();
// Strip out tags
while(source.indexOf("<script") > -1 || source.indexOf("</script") > -1)
{
var s = source.indexOf("<script");
var s_e = source.indexOf(">", s);
var e = source.indexOf("</script", s);
var e_e = source.indexOf(">", e);
// Add to scripts array
scripts.push(source.substring(s_e+1, e));
// Strip from source
source = source.substring(0, s) + source.substring(e_e+1);
}
// Loop through every script collected and eval it
for(var i=0; i<scripts.length; i++)
{
try
{
//eval(scripts[i]);
if(window.execScript)
{
window.execScript(scripts[i]); // IE
}
else
{
window.setTimeout(scripts[i],0); // Changed this from eval() to setTimeout() to get it in Global scope
}
}
catch(ex)
{
// do what you want here when a script fails
alert("Javascript Handler failed interpretation. Even I am wondering why(?)");
}
}
// Return the cleaned source
return source;
}
Blixt should be right...
You may also take a look at Prototype's String.evalScripts function.
http://api.prototypejs.org/language/string.html#evalscripts-instance_method
