Hooking Function Constructor (JavaScript)

Does anyone know of a way to detect when a new function is created?
I know you can hook the Function constructor, but that only detects new Function() calls, not function func(){} declarations.
I assume the JS engine creates a default function object when it sees a function declaration, rather than looking up the global Function constructor.
Does anyone know if this is the case, or whether you can intercept function declarations?
Example function to hook:
function add(x,y){return x+y}
A hook should get the function when it is defined and hopefully redefine it to be able to capture the arguments, this, and return value.
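The limitation described above can be seen in a small sketch. It temporarily swaps the global Function for a Proxy and records constructor calls; the names created, OriginalFunction and defineAdd are invented for this illustration. Only the new Function() call shows up in the record, even though the declaration is evaluated while the hook is installed.

```javascript
// Illustration only: `created`, `OriginalFunction` and `defineAdd` are made-up names.
const created = [];
const OriginalFunction = Function;

// Replace the global Function with a Proxy that records constructor calls.
globalThis.Function = new Proxy(OriginalFunction, {
  construct(target, args, newTarget) {
    created.push(args);
    return Reflect.construct(target, args, newTarget);
  },
});

// Observed: this call goes through the hooked global.
const viaConstructor = new Function("x", "y", "return x + y");

// Not observed: the inner declaration creates a function object when
// defineAdd runs, but the engine never consults the global Function binding.
function defineAdd() {
  function add(x, y) { return x + y; }
  return add;
}
const add = defineAdd();

// Restore the original so later code is unaffected.
globalThis.Function = OriginalFunction;
```

After this runs, created holds a single entry (the argument list of the new Function() call), and nothing for add.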

(V8 developer here.)
No, there is no way to intercept all function declarations.
FWIW, this isn't up to individual engines to decide; the JavaScript language provides no way to accomplish this, and engines aim to implement the language without any custom deviations (i.e. neither adding nor removing features).
I can offer two alternative ideas that may or may not serve your purposes:
(1) V8 has a --trace flag which logs all function entries and returns to stdout. Be warned: this tends to produce a lot of output.
(2) You could feed the source of the app you are interested in to a JavaScript parser, and then run whatever analysis (and/or transformation) you want over the AST it produces. This won't catch dynamically generated and eval-ed functions, but it will find all function func() {...} declarations.
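Neither alternative rewrites functions for you, but once you hold a reference to a specific function, wrapping it by hand is straightforward. A minimal sketch, assuming a hypothetical hook helper and a log callback of your own (neither is part of V8 or the language):

```javascript
// Hypothetical helper: wraps one known function so each call reports
// its `this`, its arguments, and its return value to `log`.
function hook(fn, log) {
  return new Proxy(fn, {
    apply(target, thisArg, args) {
      const result = Reflect.apply(target, thisArg, args);
      log({ name: target.name, thisArg, args, result });
      return result;
    },
  });
}

function add(x, y) { return x + y; }

const calls = [];
const hookedAdd = hook(add, (entry) => calls.push(entry));
const sum = hookedAdd(2, 3);
```

This only covers functions you can reach and reassign yourself; it does not discover declarations automatically, which is the part the language does not provide.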
The need is game hacking.
In the interest of helping you ask better questions (so that you may get more helpful answers), I'd like to point out that that's not the level of detail that @Bergi was asking for: the general area in which you are active does not provide any hints for possible alternative approaches. Wanting to hook function declarations so that you can capture their arguments and return values is a search for a specific solution. @Bergi's question is: what is the specific problem that you are trying to solve by intercepting all functions?
That said, "game hacking", while being very vague, does sound sufficiently shady that I'd like to humbly suggest that you not only think about whether you could, but also give sufficient thought to whether you should.

Related

Is there a way to access/modify the "Execution Context" of a Node module/function

I understand the scoping rules of Node/JavaScript, and from, for example, Understanding Execution Context and Execution Stack in Javascript, I think I understand the principle of how Execution Contexts work: the question is can you actually get access to them?
From an answer to the 2015 question How can I get the execution context of a javascript function inside V8 engine (involving (ab)using (new Error()).stack), and the answer to the 2018 question How can we get the execution context of a function?, I suspect the answer is "no" but just in case things have changed: is it possible to access/modify the Execution Context of a Node module/function?
(I'm very aware this screams either XY Problem or a desire to abuse Node/JavaScript: hopefully the Background section below will provide more context, and – if there's a different way of achieving what I want – that will work as well.)
In a nutshell, I want to achieve something like:
sharedVar = 1 ;
anotherModule.someFunction() ;
console.log( sharedVar ) ; // Where 'sharedVar' has changed value
Having a function in a different module being able to change variables in its caller's scope at will seems the definition of "A Dangerous Thing™", so – if it's possible at all – I expect it would need to be more like:
sharedVar = 1 ;
anotherModule.hereIsMyExecutionContext( SOMETHING ) ;
anotherModule.someFunction() ;
console.log( sharedVar ) ; // Where 'sharedVar' has changed value
and anotherModule would be something like:
let otherExecutionContext ;
function hereIsMyExecutionContext( anExecutionContext ) {
    otherExecutionContext = anExecutionContext ;
}
function someFunction() {
    //do something else
    otherExecutionContext.sharedVar = 42 ;
}
and the question becomes what (if anything) can I replace SOMETHING with?
Notes / Things That Don't Work (for me)
You shouldn't be trying to do this! I realize what I'm trying to achieve isn't "clean" code. The use-case is very specific, where brevity (particularly in the code whose value I want changing) is paramount. It is not "production" code where the danger of forgotten, unexpected side-effects matters.
Returning a new value from the function. In my real use-case, there would be several variables that I would like someFunction() to be able to alter. Having to do { var1, var2, ... varN } = anotherModule.someFunction() would be both inconvenient and distracting (while the function might change some of these variables' values, it is not the main purpose of the function).
Have the variables members of anotherModule. While using anotherModule.sharedVar would be the clean, encapsulated way of doing things, I'd really prefer not to have to use the module name every time: not only is it more typing, but it distracts from what the code that would be using these variables is really doing.
Use the global scope. If I wasn't using "use strict";, it would be possible to have sharedVar on the global object and freely accessible by both bits of code. As I am using strict-mode (and don't want to change that), I'd need to use global.sharedVar which has the same "cumbersomeness" as attaching it to anotherModule.
Use import. It looks like using import { sharedVar } from anotherModule allows "live" sharing of the variables I want between the two modules. However, by using import, the module using the shared variable has to be an ES Module (not a CommonJS Module), and cannot be loaded dynamically using require() (as I also need it to be). Loading it dynamically the ESM way (using import() as a function returning a promise) appears to work, but repeated loadings come from a cache. According to this answer to How to reimport module with ES6 import, there isn't a "clean" way of clearing that cache (cf. delete require.cache[] that works with CommonJS modules). If there is a clean way of invalidating a cached ESM loaded through import(), please feel free to add an answer to that question (and a comment here, please), although the current Node JS docs on ESMs don't mention one, so I'm not hopeful :-(
Background
In early December, a random question on SO alerted me to the Advent of Code website, where a different programming problem is revealed every day, and I decided to have a go using Node. The first couple of problems I tackled using standalone JS files, at which point I realized that there was a lot of common code being copy-pasted between each file. In parallel with solving the puzzles, I decided to create a "framework" program to coordinate them, and to provide as much of the common code as possible. One goal in creating the framework was that the individual "solution" files should be as "lean" as possible: they should contain the absolute minimum code over that needed to solve the problem.
One of the features of the framework relevant to this question is that it reloads (currently using require()) the selected solution file each time, so that I can work on the solution without re-running the framework... this is why switching to import and ES Modules has drawbacks, as I cannot (cleanly) invalidate the cached solution module.
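For reference, the require()-based reloading mentioned here can be sketched roughly as follows. The reload helper and the throwaway solution file are assumptions made for the demonstration, not the framework's actual code.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: drop the cached copy of a CommonJS module, then load it again.
function reload(file) {
  const resolved = require.resolve(file);
  delete require.cache[resolved];
  return require(resolved);
}

// Demonstration with a throwaway "solution" file in the temp directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "aoc-"));
const solutionFile = path.join(dir, "solution.js");

fs.writeFileSync(solutionFile, "module.exports = { answer: 1 };");
const first = reload(solutionFile).answer;

// Edit the file on disk and reload it without restarting the process.
fs.writeFileSync(solutionFile, "module.exports = { answer: 2 };");
const second = reload(solutionFile).answer;
```

Without the delete, the second require() would return the cached exports from the first load.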
The other relevant feature is that the framework provides aoc.print(...) and aoc.trace(...) functions. These format and print their arguments: the first all the time; the second conditionally, depending on whether the solution is being run normally or in "trace" mode. The latter is little more than:
function trace( ...args ) {
    if( traceMode ) {
        print( ...args )
    }
}
Each problem has two sets of inputs: an "example" input, for which expected answers are provided, and the "real" input (which tends to be larger and more involved). The framework would typically run the example input with tracing enabled (so I could check the "inner workings") and run the "real" input with tracing disabled (because it would produce too much output). For most problems, this was fine: the time "wasted" preparing the parameters for the call, and then making the call to aoc.trace() only to find there was nothing to do, was negligible. However, for one solution in particular (involving 10 million iterations), the difference was significant: nearly 30s when making the ignored trace calls; under a second if the calls were commented out, or if I "short-circuited" the trace-mode decision by using the following construct:
TRACE && aoc.print( ... )
where TRACE is set to true/false as appropriate. My "problem" is that TRACE doesn't track the trace mode in the framework: I have to set it manually. I could, of course, use aoc.traceMode && aoc.print( ... ), but as discussed above, this is more to type than I'd like and makes the solution's code more "cluttered" than I'd ideally want (I freely admit these are somewhat trivial reasons...)
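The difference between the two forms comes from when the arguments are evaluated. A small sketch of the effect, where expensiveFormat is a made-up stand-in for whatever work goes into preparing the trace arguments:

```javascript
// Counts how often the (hypothetical) argument preparation actually runs.
let formatCalls = 0;
function expensiveFormat(i) {
  formatCalls += 1;
  return `iteration ${i}`;
}

const traceMode = false;
function trace(...args) {
  if (traceMode) console.log(...args);
}

// Form 1: arguments are evaluated before trace() can decide to ignore them.
for (let i = 0; i < 1000; i++) {
  trace(expensiveFormat(i));
}
const callsWithFunction = formatCalls;

// Form 2: && short-circuits, so the right-hand side is never evaluated.
formatCalls = 0;
const TRACE = false;
for (let i = 0; i < 1000; i++) {
  TRACE && console.log(expensiveFormat(i));
}
const callsWithShortCircuit = formatCalls;
```

Here callsWithFunction ends up at 1000 and callsWithShortCircuit at 0, which is the kind of gap that adds up over 10 million iterations.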

One of the features of the framework relevant to this question is that it reloads the selected solution file each time
This is key. You should not leave the loading, parsing and execution to require, where you cannot control it. Instead, load the file as text, do nefarious things to the code, then evaluate it.
The evaluation can be done with eval, new Function, the vm module, or by messing with the module system.
The nefarious things I was referring to would most easily be prefixing the code by some "implicit imports", whether you do that by require, import or just const TRACE = true;. But you can also do any other kind of preprocessing, such as macro replacements, where you might simply remove lines that contain trace(…);.
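A minimal sketch of the "prefix, then evaluate" idea using the vm module. The runSolution helper, the aoc stub and the inline source string are assumptions for illustration; in the real framework the source would come from reading the solution file as text.

```javascript
const vm = require("vm");

// Hypothetical helper: evaluates solution source text with TRACE injected
// as an implicit constant, and collects whatever the solution prints.
function runSolution(source, traceMode) {
  const prefix = `"use strict"; const TRACE = ${traceMode ? "true" : "false"};\n`;
  const output = [];
  const sandbox = {
    aoc: { print: (...args) => output.push(args.join(" ")) },
  };
  vm.runInNewContext(prefix + source, sandbox);
  return output;
}

// Stand-in for the text of a solution file.
const source = `
TRACE && aoc.print("step", 1);
aoc.print("answer", 42);
`;

const traced = runSolution(source, true);
const quiet = runSolution(source, false);
```

The solution code never declares TRACE or imports aoc itself; both are supplied by the loader, which keeps the solution file lean.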

Looks like you're heading the wrong way. The Execution Context isn't meant to be accessible by the user code.
An option is to make a class and pass its instances around different modules for your purpose. JS objects exist in the heap and can be accessed anywhere as long as you have its reference, so you can control it at will.
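Something along these lines, as a rough sketch. SharedState and useState are names I made up, and anotherModule is written inline here as a stand-in for a separate file.

```javascript
// A plain class holding the values both sides need to see.
class SharedState {
  constructor() {
    this.sharedVar = 1;
  }
}

// Stand-in for anotherModule; in real code this would live in its own file.
const anotherModule = (() => {
  let shared;
  return {
    useState(state) { shared = state; },
    someFunction() { shared.sharedVar = 42; },
  };
})();

const state = new SharedState();
anotherModule.useState(state);   // hand over the reference once
anotherModule.someFunction();    // mutates the same object
const observed = state.sharedVar;
```

Both sides hold a reference to the same heap object, so the change made inside someFunction is visible to the caller. The caller still has to write state.sharedVar rather than a bare sharedVar.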

How do you know which parameters to set for a javascript function?

Coming from Java, Javascript can be really frustrating.
I'm hoping someone can put this into simple terms for me.
I'm struggling to understand how Javascript programmers know which parameters to pass for a method they're calling - especially when that method is being called as a callback (which in my eyes seems like an added level of complexity).
For example, take the function addEventListener. In this function, a typical use looks like
myDOMItem.addEventListener("click", function(e){...}, false);
In the documentation for this function (hyperlinked to name above) I don't see any mention of this option. Whereas in Java you can easily know if your parameters match the type especially with a good IDE, in Javascript it seems like a huge guessing game or requires serious in-depth knowledge of each function.
How do Javascript programmers do it?

The documentation you linked to does show the form in your example:
target.addEventListener(type, listener[, useCapture]);
The type parameter is the string "click", listener is the function object, and useCapture is false.

We either remember things, or get a good IDE that actually has decent support. Everyone has their favorite, so I won't presume to say that there is a "best" IDE. In my humble opinion, WebStorm is one of the best.

The Mozilla documentation clearly states that the function takes four arguments:
string (type)
function/implementer of EventListener (listener)
boolean (useCapture).
boolean (wantsUntrusted). Available only for Mozilla/Gecko browsers.
As you can see from the signature, the last two parameters are optional and are therefore wrapped in brackets:
target.addEventListener(type, listener[, useCapture, wantsUntrusted ]);
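To make the mapping between the signature and a call concrete, here is a small sketch. It uses a bare EventTarget as a stand-in for a DOM element so that it also runs outside a browser; that substitution is my assumption, and the Gecko-only wantsUntrusted argument is left out.

```javascript
// EventTarget stands in for a DOM element here.
const target = new EventTarget();
const seen = [];

// type = "click", listener = the function, useCapture = false
target.addEventListener("click", function (e) {
  seen.push(e.type);
}, false);

// Fire the event by hand; in a page the browser does this on a real click.
target.dispatchEvent(new Event("click"));
```

The listener receives the event object as its only argument, which is the e in the question's example.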
To code effectively in JavaScript, you need to organize your development environment just as you would for Java. The Java world provides an almost endless number of tools for different tasks (UI development, server-side development, etc.). The situation is the same with JavaScript. I would like to avoid advertising IDEs/editors, so I advise you to search for the right tools for your JavaScript-related dev stack.
P.S. Personal opinion: I also came to JavaScript from Java. In my experience, the main problem for Java developers is not the lack of tools but the lack of strict code structure. Since Java gives you a strong OOP background, it will probably be easier for you to start working with JavaScript in OOP terms. A good development stack for OOP-like JavaScript is provided by the open source project Google Closure Tools.

Should I use eval to patch/extend a widget?

This question may be too opinion based but since avoiding eval is almost unanimously agreed upon, I'd imagine there should be objective answers.
I'd like to "patch" an existing open-source widget to avoid creating my own custom build of it. But the parts I want to patch are internal functions which are completely hidden, i.e.:
function dostuff(){
    function iCantBeEdittedSinceImNotAPrototypeMethod_unlessDoStuffIsCompletelyOverwrittenButItsVeryLong(){
        //code that needs patched
    };
};
The only way I can see of doing this is to turn the function dostuff into text, search and replace iCantBeEditted and then eval it.
How bad is this use of eval?
The specific context I'd like to use this in is fullcalendar's compareSegs function, which decides the sort order of events.
Concerns:
safety of eval
modifying a minified file. I can find the desired function with a somewhat fail safe pattern but even if I make it variable name agnostic, this could cause problems when fullcalendar is updated.
performance. This file is 72kb after being minified.
So, should I do this? Is there an easier alternative?
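For concreteness, the approach I have in mind (function to text, search and replace, then eval) would look roughly like this. The dostuff and compare functions below are toy stand-ins, not fullcalendar's code.

```javascript
// Toy stand-in for a library function with a hidden inner function.
function dostuff(a, b) {
  function compare(x, y) { return x - y; }
  return compare(a, b);
}

// Turn the function into text and patch the hidden part.
const patchedSource = dostuff
  .toString()
  .replace("return x - y;", "return y - x;");

// Evaluate the patched text back into a function (indirect eval, global scope).
const patched = (0, eval)("(" + patchedSource + ")");

const before = dostuff(5, 3);
const after = patched(5, 3);
```

One thing this sketch makes visible: the re-evaluated function is created in the global scope, so it loses access to any closure variables the original had, which is a problem if the real function lives inside a module wrapper.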

How can I prove that my JavaScript files are in the scope of a specific JS or ECMA version?

Let's say you would get a bunch of .js files and now it is your job to sort them into groups like:
requires at least JavaScript 1.85
requires at least E4X (ECMAScript for XML)
requires at least ECMAScript 5
or something like this.
I am interested in any solution, but especially in those which work using JavaScript or PHP. This is used for creation of automated specifications, but it shouldn't matter - this is a nice task which should be easy to solve - however, I have no idea how and it is not easy for me. So, if this is easy to you, please share any hints.
I would expect something like this - http://kangax.github.com/es5-compat-table/# - just not for browsers, rather for a given file to be checked against different implementations of JavaScript.
My guess is, that each version must have some specifics, which can be tested for. However, all I can find is stuff about "what version does this browser support".
PS: Don't take "now it is your job" literally, I used it to demonstrate the task, not to imply that I expect work done for me; while in the progress of solving this, it would be just nice to have some help or direction.
EDIT: I took the easy way out, by requiring ECMAScript 5 to be supported at least as well as by the current Firefox for my project to work as intended and expected.
However, I am still interested in any solution attempts, or at least a definite answer of "is possible (with XY)" or "is not possible, because ..."; XY can be just some keyword, like FrameworkXY or DesignPatternXY or whatever, or a more detailed solution of course.

Essentially you are looking to find the minimum requirements for some javascript file. I'd say that isn't possible until run time. JavaScript is a dynamic language. As such you don't have compile time errors. As a result, you can't tell until you are within some closure that something doesn't work, and even then it would be misleading. Your dependencies could in fact fix many compatibility issues.
Example:
JS File A uses some ES5 feature
JS File B provides a shim for ES5 deficient browsers or at least mimics it in some way.
JS File A and B are always loaded together, but independently A looks like it won't work.
Example2:
Object.create is what you want to test
Some guy named Crockford adds create to Object
Object.create now works in less compatible browsers, and nothing is broken.
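A sketch of what such a shim looks like, written as a standalone function so it can be run in an environment that already has Object.create. The names createFallback and create are mine.

```javascript
// Crockford-style fallback, kept separate from Object for illustration.
function createFallback(proto) {
  function F() {}
  F.prototype = proto;
  return new F();
}

// Feature-detect: use the native Object.create when present, otherwise the fallback.
const create = typeof Object.create === "function" ? Object.create : createFallback;

const parent = { greet() { return "hello"; } };
const childNative = create(parent);
const childFallback = createFallback(parent);
```

A static scan of a file that calls Object.create cannot tell whether a shim like this will be loaded alongside it, which is the point of the example above.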
Solution 1:
Build or find a dependency map. You definitely already have a dependency map, either explicitly or you could generate it by iterating over your HTML files.
Run all relevant code paths in environments with decreasing functionality (eg: ES5, then E4X, then JS 1.x, and so forth).
Once a bundle of JS files fails for some code path, you know its minimum requirement.
Perhaps you could iterate over the public functions in your objects and use dependency injection to fill in constructors and methods. This sounds really hard though.
Solution 2:
Use webdriver to visit your pages in various environments.
Map window.onerror to a function that tells you if your current page broke while performing some actions.
On error you will know that there is a problem with the bundle on the current page so save that data.
Both these solutions assume that you always write perfect JS that never has errors, which is something you should strive for but isn't realistic. This might, however, provide you with some basic "smoke testing".

This is not possible in an exact way, and it also is not a great way of looking at things for this type of issue.
Why it's not possible
Javascript doesn't have static typing. But properties are determined by the prototype chain. This means that for any piece of code you would have to infer the type of an object and check along the prototype chain before determining what function would be called for a function call.
You would, for instance, have to be able to tell that $(x).bind() or $(x).map() are not making calls to the ECMAScript 5 map or bind functions, but the jQuery ones. This means that you would really have to parse out the whole code and make inferences on type. If you didn't have the whole code base this would be impossible. If you had a function that took an object and you called bind, you would have no idea if that was supposed to be Function.prototype.bind or jQuery.bind because that's not decided till runtime. In fact it's possible (though not good coding practice) that it could be both, and that what is run depends on the input to a function, or even depends on user input. So you might be able to make a guess about this, but you couldn't do it exactly.
Making all of this even more impossible, the eval function combined with the ability to get user input or ajax data means that you don't even know what types some objects are or could be, even leaving aside the issue that eval could attempt to run code that meets any specification.
Here's an example of a piece of code that you couldn't parse
var userInput = $("#input").val();
var objectThatCouldBeAnything = eval(userInput);
objectThatCouldBeAnything.map(function(x){
    return !!x;
});
There's no way to tell if this code is parsing a jQuery object in the eval and running jQuery.map or producing an array and running Array.prototype.map. And that's the strength and weakness of a dynamically typed language like JavaScript. It provides tremendous flexibility, but limits what you can tell about the code before run time.
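The same ambiguity can be shown without jQuery or eval. In this sketch, wrapperLike is a made-up object that merely happens to have its own map method:

```javascript
// A made-up object with its own map method, standing in for a library wrapper.
const wrapperLike = {
  map(fn) {
    return "custom map";
  },
};

// One call site; which map runs is decided only when the call happens.
function callMap(obj) {
  return obj.map(function (x) {
    return !!x;
  });
}

const fromArray = callMap([0, 1]);        // Array.prototype.map
const fromWrapper = callMap(wrapperLike); // the object's own map
```

Looking at callMap alone, a static tool cannot say whether it depends on Array.prototype.map.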
Why it's not a good strategy
ECMAScript specifications are a standard, but in practice they are never implemented perfectly or consistently. Different environments implement different parts of the standard. Having a "ECMAScript5" piece of code does not guarantee that any particular browser will implement all of its properties perfectly. You really have to determine that on a property by property basis.
What you're much better off doing is finding a list of functions or properties that are used by the code. You can then compare that against the supported properties for a particular environment.
This is still a difficult-to-impossible problem for the reasons mentioned above, but it's at least a useful one. And you could gain value doing this even using a loose approximation (assuming that bind actually is ECMAScript 5 unless it's on a $() wrap. That's not going to be perfect, but still might be useful).
Trying to figure out which standard the code is written against just isn't practical in terms of helping you decide whether to use it in a particular environment. It's much better to know what functions or properties it's using so that you can compare that to the environment and add polyfills if necessary.
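As a rough sketch of the comparison step, assuming you already have a list of the features a file is believed to use (the usedFeatures list and the isSupported helper are hypothetical):

```javascript
// Hypothetical list of features a file is believed to use,
// gathered by hand or by a loose scanner.
const usedFeatures = [
  "Array.prototype.map",
  "Function.prototype.bind",
  "Object.create",
];

// Walks a dotted path from the global object and reports whether every step exists.
function isSupported(featurePath, root = globalThis) {
  let current = root;
  for (const key of featurePath.split(".")) {
    if (current == null || !(key in Object(current))) return false;
    current = current[key];
  }
  return true;
}

// Anything listed here would need a polyfill in the current environment.
const missing = usedFeatures.filter((feature) => !isSupported(feature));
```

This only checks that a property exists, not that it behaves according to the spec, so it is an approximation in the same spirit as the one described above.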

Efficiency of Plain Functions vs. Immediate Functions?

Someone mentioned that immediate or self-executing functions have to store the whole stack. Is this true? If so, what are the pros and cons of using something like the module pattern (which is based on an immediate function) vs. a plain function?
A function is inherently private, but you can return items that you want to be public, so it can handle privacy.
The main difference I see is that you don't have global imports or the ability to make sure that the developer (wait, that's me) uses new with the function (or it is complicated).
In general when trying to provide privacy and state when should one use the module pattern and when should one just use a plain function?
A second side question is does a function provide state when used with new?

Any function closure that persists because there are lasting references to variables or functions inside it occupies some amount of memory. In today's computers (even phones), this amount of memory is generally insignificant unless you're somehow repeating it thousands of times. So, using the language features to solve your design problems is generally more important than worrying about this amount of memory.
When you say "the whole stack", the calling stack for a top-level self-executing function is very small. There's really nothing else on the stack except for the one function that's being called.
A function is also an object. So, when it's used with new, it creates a new object that can have state (its properties) if you assign values to those properties. That's one of the main ways that objects are created in JavaScript. You can either call a function and examine its return value or you can use it with new and the function serves as the constructor for a new object. A given function is usually designed to be used in one way or the other, not both.
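A short sketch of a function used as a constructor, with Counter as a made-up example:

```javascript
// Used with `new`, the function initializes state on the new object.
function Counter(start) {
  this.count = start;
}

// Shared behavior lives on the prototype.
Counter.prototype.increment = function () {
  this.count += 1;
  return this.count;
};

// Each instance carries its own state.
const a = new Counter(0);
const b = new Counter(10);
a.increment();
b.increment();
b.increment();
```

After this runs, a.count is 1 and b.count is 12: the state belongs to each object, not to the function.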
The module pattern is generally used to control which variables are public and, when making them public, to put them into a structured namespace that uses very few top-level global variables. It isn't really something you choose instead of self-executing functions because they don't really solve the same problem. You can read more about the module pattern here: http://www.yuiblog.com/blog/2007/06/12/module-pattern/
You can read about a number of the options here: http://www.adequatelygood.com/2010/3/JavaScript-Module-Pattern-In-Depth and http://www.klauskomenda.com/code/javascript-programming-patterns/.
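For readers who haven't seen it, a minimal sketch of the module pattern (counterModule is just an example name):

```javascript
// The immediately invoked function runs once; its local variables stay private.
const counterModule = (function () {
  let count = 0; // not reachable from outside

  function increment() {
    count += 1;
    return count;
  }

  function current() {
    return count;
  }

  // Only what is returned here becomes public.
  return { increment, current };
})();
```

There is a single top-level name, counterModule, and the only memory kept alive is the closure holding count and the two functions.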
It is easier to discuss the pros/cons of a given technique in light of a specific problem that one is trying to solve or a specific design issue rather than a generic discussion of which is better when the things you've asked about are not really solving equivalent issues.
The best reference I know of for protected and private members (which can be hacked into JavaScript, but are not a core language feature) is this one: http://javascript.crockford.com/private.html. You are making tradeoffs when you use this method instead of the default prototype feature of the language, but you can achieve privacy if you really need it. But you should know that JavaScript was not built with private or protected methods in mind, so to get that level of privacy you're relying on conventions about how you write your code.
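A brief sketch of the closure-based privacy that article describes, using a made-up Account constructor:

```javascript
function Account(initialBalance) {
  // Private: a local variable of the constructor, visible only to the closures below.
  let balance = initialBalance;

  // Privileged methods: created per instance, with access to the private variable.
  this.deposit = function (amount) {
    balance += amount;
    return balance;
  };
  this.getBalance = function () {
    return balance;
  };
}

const account = new Account(100);
account.deposit(50);
const hasPublicBalance = "balance" in account;
```

The tradeoff mentioned above shows up here: deposit and getBalance are recreated for every instance instead of being shared on the prototype.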
