Serializing the complete JavaScript state of a website, including closures/hidden scopes?

I would like to save a "snapshot" of a webpage that remains in an "interactive" state, meaning all JavaScript state has to be saved and restored.
Example showing the issue I'm trying to solve: Given a webpage which executes the following script in the global scope:
function f(x) { return function() { return x; } }
var g = f(2);
I'd like to save both the function f (more or less trivial) and the variable g (which closes over x from the f invocation) to a file and restore the state of the website later.
As far as I could figure out, this seems to be impossible using only "web" technologies (i.e. with the permissions the webpage itself has). I'm therefore guessing I'll have to implement a browser addon to achieve this.
Does something like this already exist? What would be a good starting point? I noticed that Firefox Session Restore does something similar, do you know if I could reuse this mechanism? If not would it be feasible to implement something like this as a "debugger" style addon? Are there simpler solutions?

Javascript objects hold onto DOM/other native objects. Native objects have hidden state and can be entangled with global browser state or addons.
So the only real way I can think of is to run a browser in a VM and snapshot/clone that VM.
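A minimal sketch of why the page itself cannot do this, using the f and g from the question. It assumes nothing beyond standard JavaScript; the names asJson, source, revived and revivedResult are only illustrative. JSON drops the function entirely, and the function's source text does not carry the captured x with it:

```javascript
// The closure from the question.
function f(x) { return function () { return x; }; }
const g = f(2);

// JSON.stringify silently drops function-valued properties.
const asJson = JSON.stringify({ g: g });

// Function.prototype.toString yields the source text only, not the captured x.
const source = g.toString();

// Rebuilding the function from its source creates it in the global scope,
// where no x exists.
const revived = new Function('return (' + source + ')')();

let revivedResult;
try {
  revivedResult = revived();
} catch (err) {
  revivedResult = err.name; // the lookup of x fails in the new scope
}
```

Here g() still returns 2, but the "restored" copy has lost its environment, which is the part a snapshot would need to capture.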

Related

JavaScript security: access objects within (function(){})

Even though I am not a JS newbie, I have never used (function() {}) before, as there was no need. But now I am concerned with client-side security for my JS game, to avoid cheating. So I placed the following code in my js file:
(function() {
'use strict';
let a = 1;
})();
I tried to access the a variable from console and I couldn't. So I wanted to know - will users be able to access those variables and change them if I use this kind of structure?

There is nothing you can do to completely secure that variable. Users will still be able to access it using the debugger, local overrides, or probably other means as well.
A value that must stay immutable even against a skilled browser user needs to be stored on a secured backend (server, API, or something similar).
For example see:
Is there a way to change variable values while debugging JavaScript?

Is just calling deleteLater() enough to avoid memory leaks in Javascript with Embind?

I'm a longtime Java/C++ programmer and novice Javascript programmer. I'm trying to make a web app with a class I have previously coded in C++.
In my Javascript web app, I'm using Embind to create and use the class originally coded in C++. On the Embind documentation page it says,
JavaScript code must explicitly delete any C++ object handles it has received, or the Emscripten heap will grow indefinitely.
and the examples on the page show the created object being deleted immediately after use:
var x = new Module.MyClass;
x.method();
x.delete();
In my web app, I want my object from C++ to persist for the lifetime of the webpage. I want to be able to press a button on the page and update the state of my object. If I .delete() the object at the end of the script, it won't persist when I try pushing the button later.
In the Embind example, embind.test.js, it is possible to call .deleteLater() on a newly created object:
var v = (new cm.ValHolder({})).deleteLater();
My question is, if I simply call .deleteLater() upon object creation, is this enough for the object to be deleted when the app is done running or when the page is closed? I'm trying to avoid growing the heap indefinitely or causing any memory leaks.
Again, I'm new to Javascript, so please point out if I'm missing anything obvious or if I'm ignorant of a best practice concerning memory leaks and pointers in Javascript.
Let me know if I need to clarify anything. Thanks!
reference: https://kripken.github.io/emscripten-site/docs/porting/connecting_cpp_and_javascript/embind.html#memory-management

I'm on the same path and really don't have a concrete answer for you, but as a JS developer I can contribute this:
Calling delete() on an already deleted object throws an ugly exception (uncatchable from JS). There is an undocumented method to check this: obj.isDeleted(). Also, obj.SS.count will be 0 when the object is "deleted".
In the browser, not deleting objects could certainly break your application, but from the point of view of a C++ developer this only happens in the context of the document, so just reloading the page gets all the memory back (it is not necessary to kill the browser).
When exceptions are thrown from C++ code (SIGINT, bad memory allocation, etc.), it seems impossible to catch them, even using DISABLE_EXCEPTION_CATCHING or the rest of the debug flags. Module.onAbort or similar won't handle them either. So if anything is to clean up objects registered with deleteLater() when the program throws, it must happen on the C++ side.
I see there's no documentation about delete(), isDeleted() and deleteLater(). I think those would be great candidates for a PR.
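A minimal sketch of guarding against the double-delete problem described above. It assumes the handle exposes delete() and isDeleted() as this answer reports; FakeHandle is a hypothetical stand-in so the snippet runs without an Emscripten build, and safeDelete is an illustrative helper, not part of Embind:

```javascript
// Hypothetical stand-in for an Embind handle. A real one would come from
// something like `new Module.MyClass()`.
class FakeHandle {
  constructor() { this.deleted = false; }
  isDeleted() { return this.deleted; }
  delete() {
    if (this.deleted) throw new Error('handle already deleted');
    this.deleted = true;
  }
}

// Delete a handle at most once; returns true only if this call deleted it.
function safeDelete(handle) {
  if (handle.isDeleted()) return false;
  handle.delete();
  return true;
}

// Keep the handle in a long-lived variable for as long as the page needs it.
const handle = new FakeHandle();
const firstCall = safeDelete(handle);
const secondCall = safeDelete(handle); // no exception on the second attempt
```

With a real handle, the object can simply be kept in a long-lived variable and released through a helper like this when it is no longer needed; as noted above, reloading the page reclaims the memory in any case.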

How to suspend node REPL and resume at a later stage with all the environment preserved?

I wish to suspend a REPL session so that I could shut down the system and then at a later time continue to work on the REPL session as if I'd never closed it, i.e. without having to lose all the environment.
I think that the possible solutions to this could be
Snapshot memory, save to file and load env from file later: I think this would be the neatest solution, like happens when you use the 'hibernate' feature of Windows. I've found this heapdump utility which is intended to take a memory snapshot for analysis of memory leaks, but I don't know if you could resurrect the whole environment from that snapshot and I have found no tools that do so.
Save commands and replay them: A major shortcoming of this method is while it works for simple things like var x = "Hello World";, it wouldn't work for things like var receiptId = bankAccount.makePayment(1000); as it will repeat actions on each replay rather than saving the details of the original function call.
Serialize / Deserialize the whole environment: This would involve making a list of all objects that exist in the environment, and then make a mechanism to write each of them to a file i.e. serialize them, and then make a mechanism that deserializes these and loads them when required. I have yet to see a clean way to serialize and deserialize js variables without limitations. I think that the major limitation of this method is its inability to retain references, so the objects lose their class, things would have to be duplicated upon serialization and lose their equality on deserialization - e.g.
var f = function (x) {...};
var a = {};
a.f = f;
a.f === f? //is true, not true if your serialization mechanism saves a function defn for f and a.f separately and deserializes them separately
and cyclic references would probably not work (x = {}; x.cyclic = x;...). So this method, if it ever works would require a lot of dirty work.
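Both limitations are easy to reproduce with plain JSON, which is only one possible serialization mechanism and is used here as an illustration; the variable names are hypothetical:

```javascript
// Two properties pointing at the same object.
const shared = { n: 1 };
const env = { a: shared, b: shared };

// Round-trip through JSON.
const copy = JSON.parse(JSON.stringify(env));

const identityBefore = env.a === env.b;   // same object before serialization
const identityAfter = copy.a === copy.b;  // two separate copies afterwards

// A cyclic structure cannot be written out at all.
const x = {};
x.cyclic = x;

let cycleError = null;
try {
  JSON.stringify(x);
} catch (err) {
  cycleError = err.name;
}
```

A serializer that records object identities can avoid both problems for plain data, but that still leaves functions and their closures unsolved.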
So the question really is, how difficult is it to achieve what I wish to achieve? What could be some other solutions to do this? Is there a major obstruction to achieving this which I'm overlooking?
Also, are there any alternatives to the node repl program (like a console in a browser) that can be suspended like this?
Related :
Swift REPL: how to save/load the REPL state? (a.k.a. suspend/resume, snapshot, clone)

Suspending a REPL session and picking up where you left off after a shutdown doesn't seem to be directly available in Node.js's REPL. The closest thing is the Persistent History feature of the REPL, which was added (I think) in Node 4.2.1. It lets you view the history of your REPL commands in plain text, but that's the closest thing available out of the box with Node.
Persistent History
By default, the REPL will persist history between node REPL sessions by saving to a .node_repl_history file in the user's home directory. This can be disabled by setting the environment variable NODE_REPL_HISTORY="".
Previously in Node.js/io.js v2.x, REPL history was controlled by using a NODE_REPL_HISTORY_FILE environment variable, and the history was saved in JSON format. This variable has now been deprecated, and your REPL history will automatically be converted to using plain text. The new file will be saved to either your home directory, or a directory defined by the NODE_REPL_HISTORY variable, as documented below.
Full docs for the REPL module are available here.
However, there is a REPL "wrapper" node module that will do what you're asking: you save your REPL history out to a file, then load that file in the next session to regain access to what you saved. The module is Nesh. It has a lot of additional features, including configuring your shell and evaluating different versions of JS such as ES6/ES7 (using Babel) and CoffeeScript.
Install nesh:
npm install -g nesh
Launch nesh in the terminal by simply typing nesh. Work as you normally would within any other REPL session and when you want to save you can type the following in nesh to save your REPL history to the given file:
.save <filepath>
In your next REPL session, even after a shutdown, you can relaunch your nesh session and reload your history by typing:
.load <filepath>
This will re-evaluate the entire history file and make any variables or functions available in the current REPL/nesh session.
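What "re-evaluating the history" means in practice can be sketched with Node's built-in vm module. This is not how nesh is implemented, only an illustration; the bank object, the history array and the replay helper are hypothetical. It also shows the drawback raised in the question: side effects are repeated on every replay.

```javascript
const vm = require('vm');

// Hypothetical stand-in for an external system with side effects.
const bank = {
  payments: 0,
  makePayment(amount) {
    this.payments += 1;
    return 'receipt-' + this.payments;
  },
};

// Lines as they might appear in a saved history file.
const history = [
  'var x = "Hello World";',
  'var receiptId = bankAccount.makePayment(1000);',
];

// Re-run every saved line in a fresh context and return that context.
function replay(lines) {
  const context = vm.createContext({ bankAccount: bank });
  for (const line of lines) {
    vm.runInContext(line, context);
  }
  return context;
}

const first = replay(history);   // the "original" session
const second = replay(history);  // the "restored" session pays again
```

After the second replay, x is restored as expected, but bank.payments is 2 and the restored receiptId differs from the original one.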
Hope this is helpful and I think it meets your needs.

What I think you are looking for is how to suspend and resume a process. See this answer on how to suspend and resume a process

Multiple Rhino (java) threads manipulate the same file

I am writing a piece of javascript (ecmascript) within a 3rd-party application which uses embedded Rhino. The application may start multiple Java threads to handle data concurrently. It seems that every Java thread starts its own embedded Rhino context which in turn runs my script.
The purpose of my script is, to receive data from the application and use it to maintain the contents of a particular file. I need a fail-safe solution to handle the concurrency from my script.
So far, what I have come up with is to call out to java and use java.nio.channels.FileLock. However, the documentation here states:
File locks are held on behalf of the entire Java virtual machine. They are not suitable for controlling access to a file by multiple threads within the same virtual machine.
Sure enough, the blocking call FileChannel.lock() does not block but throws an exception, leading to the following ugly code:
var count = 0;
while ( count < 100 )
{
    try
    {
        var rFile = new java.io.RandomAccessFile(this.mapFile, "rw");
        var lock = rFile.getChannel().lock();
        try
        {
            // Here I do whatever the script needs to do with the file
        }
        finally
        {
            lock.release();
        }
        rFile.close();
        break;
    } catch (ex) {
        // This is reached whenever another instance has a lock
        count++;
        java.lang.Thread.sleep( 10 );
    }
}
Q: How can I solve this in a safe and reliable manner?
I have seen posts regarding Rhino sync() being similar to Java synchronized but that does not seem to work between multiple instances of Rhino.
UPDATE
I have tried the suggestion of using Synchronizer with org.mozilla.javascript.tools.shell.Global as a template:
function synchronize( fn, obj )
{
return new Packages.org.mozilla.javascript.Synchronizer(fn).call(obj);
}
Next, I use this function as follows:
var mapFile = new java.io.File(mapFilePath);
// MapWriter is a js object
var writer = new MapWriter( mapFile, tempMap );
var on = Packages.java.lang.Class.forName("java.lang.Object");
// Call the writer's update function synchronized
synchronize( function() { writer.update() } , on );
However I see that two threads enter the update() function simultaneously. What is wrong with my code?

Depending how Rhino is embedded, there are two possibilities:
If the code is executed in the Rhino shell, use the sync(f,lock) function to turn a function into a function that synchronizes on the second argument, or on the this object of its invocation if the second argument is absent. (Earlier versions only had the one-argument method, so unless your third-party application uses a recent version, you may need to use that or roll your own; see below.)
If the application is not using the Rhino shell, but using a custom embedding that does not include concurrency tools, you'll need to roll your own version. The source code for sync is a good starting point (see the source code for Global and Synchronizer; you should be able to use Synchronizer pretty much out-of-the-box the same way Global uses it).
It is possible that the problem is that the object on which you are trying to synchronize is not shared across contexts, but is created multiple times by the embedding or something. If so, you may need to use some sort of hack, especially if you have no control over the embedding. If you have no control over the embedding, you could use some kind of VM-global object on which to synchronize, like Runtime.getRuntime() or something (I can't think of any that I immediately know are single objects, but I suspect several of those with singleton APIs like Runtime are.)
Another candidate for something on which to synchronize would be something like Packages.java.lang.Class.forName("java.lang.Object"), which should refer to the same object (the Object class) in all contexts unless the embedding's class loader setup is extremely unusual.

Is it possible to manipulate every Javascript variables, objects while or after running?

It seems there's no way to completely hide source/encrypt something to prevent users from inspecting the logic behind a script.
Aside from viewing the source, then, is it possible to manipulate every variable and object while a script is running?
It seems it is possible to some degree: by using Chrome's developer tools or Firebug, you can easily edit variables or even invoke functions on the global scope.
Then what about variables and functions inside instantiated objects or self-invoked anonymous functions? Here is an example:
var varInGlobal = 'On the global scope: easily editable';
function CustomConstructor()
{
    this.exposedProperty = 'Once instantiated, can be easily manipulated too.';
    this.func1 = function(){return func1InConstructor();}
    var var1InConstructor = 'Can be retrieved by invoking func1 from an instantiated object';
    // Can it be assigned a new value after this is instantiated?
    function func1InConstructor()
    {
        return var1InConstructor;
    }
}
var customObject = new CustomConstructor();
After this is run in a browser:
// CONSOLE WINDOW
varInGlobal = 'A piece of cake!';
customObject.exposedProperty = 'Has new value now!';
customObject.var1InConstructor; // undefined: the variable can't be accessed this way
customObject.func1(); // This is the correct way
At this stage, is it possible for a user to edit the variable "var1InConstructor" in customObject?
Here's another example:
There is a RPG game built on Javascript. The hero in the game has two stats: strength and agility. the character's final damage is calculated by combining these two stats. It is clear that players can find out this logic by inspecting the source.
Let's assume the entire script is self-invoked and the stats/calculation functions are inside objects' constructors, so they can't be reached normally after instantiation. My question is, can players edit the character's str and agi while the game is running (by using Firebug or whatever) so they can steamroll everything and ruin the game?

The variable var1InConstructor cannot be re-bound under normal ECMAScript rules as it is visible only within the lexical scope. However, as alex (and others) rightly say, the client should not be trusted.
Here are some ways the user can exploit the assumption that the variable is read-only:
Use a JavaScript debugger (e.g. Firebug) and re-assign the variable while stopped at a breakpoint within the applicable scope.
Copy and paste the original source code, but add a setter with access to the variable. The user could even copy the entire program invalidating almost every assumption about execution.
Modify or inject a value at a usage site: an exploitation might be possible without ever actually updating the original variable (e.g. player.power = function () { return "godlike" }).
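The last point can be sketched with a trimmed-down version of the constructor from the question. The variable names direct and afterOverride are illustrative only. The closed-over variable is never touched, yet every caller of func1 now sees the attacker's value:

```javascript
function CustomConstructor() {
  var var1InConstructor = 'original';
  this.func1 = function () { return var1InConstructor; };
}

const customObject = new CustomConstructor();

// The closed-over variable is not a property, so this lookup finds nothing.
const direct = customObject.var1InConstructor;

// Replace the accessor at the usage site instead of the variable itself.
customObject.func1 = function () { return 'godlike'; };

const afterOverride = customObject.func1();
```

Anything typed into the browser console can perform the same override, so hiding state in a closure only protects against accidental access.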
In the end, with a client-side program, there is no way to absolutely prevent a user from cheating without a centralized authority (read: server) auditing every action - and even then it still might be possible to cheat by reading additional game state, such as enemy positions.
JavaScript, being easy to read, edit, and execute dynamically is even easier to hack/fiddle with than a compiled application. Obfuscation is possible but, if someone wants to cheat, they will.

I don't think this constitutes an answer, it could be seen as anecdotal, but it's a bit long for a comment.
Everything you do when it comes to the integrity of your coding on this issue has to revolve around needing to verify that the data hasn't changed outside of the logic of your game.
My experience with game development (via flash, primarily...but could be compared to javascript) is that you need to think about everything being a handshake where possible. When you are expecting data to come to the server from the client you want to make sure that you have some form of passage of communication that lessens the chance of someone simply sending false data. Store data on the server side as much as possible and use the client side code to call for it when it's needed, and refresh this data store often.
You'll find that HTML games tend to do a lot of abstraction of the logic to the server side, even for menial tasks. Attacking an enemy, picking up an item, these are calls to functions within server-side code, and is why the game animation could carry on in some of these games while the connection times out in the background, causing error messages to pop up and refresh the interface to the server's last known valid state.
Flash was easier in this regard, as you couldn't alter or corrupt any data unless it left the Flash environment.

Yes, anything run on the client should be untrusted if you're using the data from it to update a server-side state.

As you suggested, you can't hide the logic/client-side code. You can make it "harder" for people to read the source by obfuscating it, but it's very trivial to undo.
Assuming you're making a game from your example, the first rule of networked games is "never trust the client". You need to either run all the game logic on a server, or you need to validate all the input on a server. Never update the game state based on input from a client without validating it first.
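A minimal sketch of the "validate on the server" rule, written as plain JavaScript that could run under Node. Every name here (heroes, enemies, computeDamage, handleAttack) and the damage formula are assumptions made for illustration; no networking is shown. The point is that the server derives damage from its own copy of the stats and ignores whatever the client claims:

```javascript
// Server-side state: the only copy that counts.
const heroes = new Map([['hero-1', { strength: 10, agility: 5 }]]);
const enemies = new Map([['orc-7', { hp: 40 }]]);

// Damage is always derived from server-held stats (formula is an assumption).
function computeDamage(hero) {
  return hero.strength * 2 + hero.agility;
}

// Handle an "attack" request. The client may only say who attacks whom.
function handleAttack(message) {
  const hero = heroes.get(message.heroId);
  const enemy = enemies.get(message.enemyId);
  if (!hero || !enemy) {
    return { ok: false, reason: 'unknown hero or enemy' };
  }
  const damage = computeDamage(hero); // message.damage is deliberately ignored
  enemy.hp = Math.max(0, enemy.hp - damage);
  return { ok: true, damage: damage, enemyHp: enemy.hp };
}

// A tampered request claiming absurd damage has no extra effect.
const result = handleAttack({ heroId: 'hero-1', enemyId: 'orc-7', damage: 999999 });

// A request naming an unknown hero is rejected outright.
const bogus = handleAttack({ heroId: 'nobody', enemyId: 'orc-7' });
```

Editing str and agi in the browser then changes only what the cheater sees locally, not the state the server keeps.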

You can't hide any variable.
Also, a user who is good at JavaScript can simply edit your script, without needing to change the variables' values through the console.

JS code that is injected into an HTML page using Ajax is pretty darn difficult to get your hands on, but it also has its limitations. Most notably, you can't use JS includes in injected HTML... only inline JS.
I've been working with some of that recently, actually, and it's a real pain to debug. You can't see it, step into it, or add breakpoints to it in any way that I can figure out... in Firebug or Chrome's built-in tool.
But, as others have said... I still wouldn't consider it trusted.
