I have a legacy code to maintain and while trying to understand the logic behind the code, I have run into lots of annoying issues.
The application is written mainly in Java Script, with extensive usage of jQuery + different plugins, especially Accordion. It creates a wizard-like flow, where client code for the next step is downloaded in the background by injecting a result of a remote AJAX request. It also uses callbacks a lot and pretty complicated "by convention" programming style (lots of events handlers are created on the fly based on certain object names - e.g. current page name, current step name).
Adding to that, the code is very messy and there is no obvious inner structure - the functions are scattered in the code, file names do not reflect the business role of the code, lots of functions and code snippets are most likely not used at all etc.
PROBLEM: How to approach this code base, so that the inner flow of the code can be sort-of "reverse engineered" using a suite of smart debugging tools.
Ideally, I would like to be able to attach to the running application and step through the code, breaking on each new function call.
Also, it would be nice to be able to create a "diagram of calls" in the application (i.e. in order to run a particular page logic, this particular flow of function calls was executed in a particular order).
Not to mention to be able to run a coverage analysis, identifying potentially orphaned code fragments.
I would like to stress out once more, that it is impossible to understand the inner logic of the application just by looking at the code itself, unless you have LOTS of spare time and beer crates, which I unfortunately do not have :/ (shame...)
An IDE of some sort that would aid in extending that code would be also great, but I am currently looking into possibility to use Visual Studio 2010 to do the job, as the site itself is a mix of Classic ASP and ASP.NET (I'd say - 70% Java Script with jQuery, 30% ASP).
I have obviously tried FireBug, but I was unable to find a way to define a breakpoint or step into the code, which is "injected" into the client JS using AJAX calls (i.e. the application retrieves the code by invoking an URL and injects it to the client local code). Venkman debugger had similar issues.
Any hints would be welcome. Feel free to ask additional questions.
You can try to use "dynaTrace Ajax" to create a diagram of calls, but it is not debugger. If you have access to that application you can use keyword "debugger" to define a breakpoint explicitly in javascript files. Visual Studio theoretically shows all of evaluated and loaded javascript code in Solution Explorer when you've attached to IE.
I'd start with Chrome's developer tools and profiling single activities to find functions to set breakpoints on. Functions can be expanded to get their call stack.
That may or may not be helpful as it's entirely possible to get code that's too -- uh, unique, shall we say -- to make much sense of.
Good luck?
Related
I was wondering if it would be possible to create a JavaScript function to check a condition, and if that is True then I deny access to the code? Right now I am checking the user agent, and if it doesn't meet given criteria then I delete the HTML tag. However, if they go to the network tab then they can still see the GET requests and responses for my code.
This is a website running on localhost because it's an Electron app by the way.
I thought maybe I could issue a 403 error but I'm not sure if that's possible via JS.
Thanks
In an Electron app, your code is still JavaScript source code. You can obfuscate it and/or put it in a ASAR archive to make it harder to read and fine, but the code is still there and accessible to anyone who wants to go to the effort. (For instance, if you use VS Code, you can see the source in resources/app/out directory and its subdirectories.)
You can make it even harder to find and understand the source if you're willing to put in more work. V8, the JavaScript engine Node.js uses, has a feature called startup snapshots: You run V8 and have it load your script (your obfuscated code), take a heap snapshot (after GC), and write it to a file. Then you specify the heap snapshot when loading V8, and it just hauls in the snapshot instead of reading and parsing code. The Atom team have done this on top of Electron. In their case the motivation was startup performance, not hiding the source code, but it has the effect of making your code even harder to find.
Note I said harder, not impossible. At the end of the day, if you want the code to execute on the end-user's computer, that code is going to be accessible to the end user. This is true even of compiled applications, although of course in that case a lot of organizational information that helps a human understand the code is lost in compilation, which happens less with obfuscated code. (But a good obfuscator makes code extremely opaque to human beings.)
I am very new to GNOME extension development, and I am having a hard time working with it, due to a profound lack of documentation (or maybe my Internet is clandestinely censored) of the API. I started by modifying an existing extension, so that it is easier to make my way around it.
The issue is, I can obtain the active window using global.display.focus_window, and also a list of monitors connected to the computer using Main.layoutManager.monitors. Now, what I would like to do, is find out which monitor the obtained window is sitting on (so I can move the top panel to that monitor, as it probably means I am working on that monitor at the moment). I tried various things, like .screen, .monitor etc., but with no success. I have no IntelliSense completion on this, and I am trying to guess what the members could be, as I cannot seem to find any docs on it.
I appreciate the fact that GNOME is way more customizable than what I used before (Unity, which provided no customization at all), but I don't know how to work with it and resources are scarce. I tried looking into the source code, but I am not familiar with how it is organized and I could not find the relevant portion of code where the members I need, if they exist, are declared.
I am coding the .js files, so I need some JavaScript code, I guess.
Thank you very much.
While most of the user-visible parts of Gnome Shell are written in JavaScript, these are often just bindings for the underlying C libraries. If you're working with Windows, Monitors and Screens then you're going to want to reference the Mutter documentation and probably the Shell documentation as well:
Mutter: https://gjs-docs.gnome.org/#q=meta
Shell: https://gjs-docs.gnome.org/shell01/
St (Shell Toolkit): https://gjs-docs.gnome.org/st10/
There is a property on the global object called screen (so global.screen) which is no doubt a MetaScreen which has a function get_n_monitors(), as well as get_primary_monitor(), get_current_monitor() and others. MetaWindow, on the other hand, contains a function called get_monitor() which returns an integer. I gather that monitors are referred to by integer number, which is passed to various functions of MetaScreen and MetaWindow, since there doesn't seem to be an object for that in the Mutter documentation.
Most of the related JavaScript for what you want to do seems to be in layout.js, which probably has better examples of how Mutter is used in Gnome Shell than I can give you. It also includes a Monitor class, which seems to just be a JS wrapper around the monitor index. This class is used in the LayoutManager class (which is the definition of the instance Main.layoutManager).
A note about documentation
Originally, the rationale for not having "proper" gnome-shell documentation was that the (internal JavaScript) API was pretty unstable. The deal was, you don't get a stable API but you get to read the source in the same language you're going to write it in. In some ways this makes sense, given that you can modify the prototype of live objects and monkey-patch at whim.
The API has settled down a lot, but no one has really stepped up to write a script to auto-document it, yet. My best advice would be to bookmark the Mutter, Shell and St documentation and use Github or GitLab's search to make things easier.
There is however documentation for the Gnome API as well some of the built-in modules that are worth a skim:
Gnome API: https://gjs-docs.gnome.org/
Modules: https://gitlab.gnome.org/GNOME/gjs/blob/master/doc/Modules.md
General GObject usage in GJS: https://gitlab.gnome.org/GNOME/gjs/blob/master/doc/Mapping.md
It looks like I'm asking about a tricky problem that's been explored a lot over the past decades without a clear solution. I've seen Is It Possible to Sandbox JavaScript Running In the Browser? along with a few smaller questions, but all of them seem to be mislabeled - they all focus on sandboxing cookies and DOM access, and not JavaScript itself, which is what I'm trying to do; iframes or web workers don't sound exactly like what I'm looking for.
Architecturally, I'm exploring the pathological extreme: not only do I want full control of what functions get executed, so I can disallow access to arbitrary functions, DOM elements, the network, and so forth, I also really want to have control over execution scheduling so I can prevent evil or poorly-written scripts from consuming 100% CPU.
Here are two approaches I've come up with as I've thought about this. I realize I'm only going to perfectly nail two out of fast, introspected and safe, but I want to get as close to all three as I can.
Idea 1: Put everything inside a VM
While it wouldn't present a JS "front", perhaps the simplest and most architecturally elegant solution to my problem could be a tiny, lightweight virtual machine. Actual performance wouldn't be great, but I'd have full introspection into what's being executed, and I'd be able to run eval inside the VM and not at the JS level, preventing potentially malicious code from ever encountering the browser.
Idea 2: Transpilation
First of all, I've had a look at Google Caja, but I'm looking for a solution itself written in JS so that the compilation/processing stage can happen in the browser without me needing to download/run anything else.
I'm very curious about the various transpilers (TypeScript, CoffeeScript, this gigantic list, etc) - if these languages perform full tokenization->AST->code generation that would make them excellent "code firewalls" that could be used to filter function/DOM/etc accesses at compile time, meaning I get my performance back!
My main concern with transpilation is whether there are any attacks that could be used to generate the kind code I'm trying to block. These languages' parsers weren't written with security in mind, after all. This was my motivation behind using a VM.
This technique would also mean I lose execution introspection. Perhaps I could run the code inside one or more web workers and "ping" the workers every second or so, killing off workers that [have presumably gotten stuck in an infinite loop and] don't respond. That could work.
Here is the situation: A complex web app is not working, and it is possible to produce undesired behavior consistently. The cause of the problem is not known.
Proposal: Trace the execution paths of all javascript code. Essentially, produce two monstrous logs which can then be fed into a diff algorithm to determine where the behavior related to the bug begins to diverge (as the cause is not apparent from application behavior, and both comprehending and obtaining a copy of the actual JS code being run is difficult, due to the many pages that must be switched to and copied out from the web inspector. Making it difficult is the fact that all pages are dynamically spliced together with Perl code, where significant portions of JS code exist only as (dynamic...) Perl strings).
The Web Inspector in Chrome does not have an option that I know about for logging an execution trace. Basically what I would like is a log of every line of JS that is executed, in the order that they are executed. I don't see this as being a difficult thing to obtain given that the JS VM is single-threaded. The problem is simply that the existing user-facing tools are not designed for quite this much hardcore debugging. If we look at the Profiler in the Dev Tools, it's clearly capable of the kind of instrumentation that I need, but it is fundamentally designed to do profiling instead of tracing.
How can I get started with this? Is there some way I can build Chrome from source, where I can
switch off JIT in V8?
log every single javascript expression evaluated by V8 to a file
I have zero experience with the development side of Chrome. So e.g. links to dev-builds/branches/versions/distros of Chrome/Chromium/Canary (what's the difference?) are welcome.
At this point it appears that instrumenting the browser with powerful js tracing is still likely to be easier than redesigning the buggy app. The architecture of the page is a disaster, but the functionality is complex, and it almost fully works. I just have to find the one missing piece.
Alternatively, if tools of this sort already exist, what are some other keywords I can search for them with? "Code Tracing" is pretty much the only thing I can come up with.
I tested dynaTrace, which was a happy coincidence as our app supports IE (indeed Chrome support just came out of beta), but this does not produce a text dump, it basically produces a massive Win32 UI expando-tree, which is impossible to diff. This makes me really sad because I know how much more difficult it was to make the representation of the trace show up that way, and yet it turns out being almost utterly useless. Who's going to scroll up and down that tree view and see anything really useful in it, in anything other than a toy example of a web app?
If you are developing a big web app, it is always good to follow a test driven strategy for the coding part of it. Using just a few tips allows you to make a simple unit testing script (using QUnit) to test pretty much all aspects of your app. Here are some potential errors and some ways of solving them.
Make yourself handlers to register long living Objects and a handler to close them the safe way. If the safe way does not succeed then it is the management of the Object itself failing. One example would be Backbone zombie views. Either the view has bad code in the close section, the parent close is not hooked or an infinite loop happened. Testing all view events is also good, although tedious.
By putting all the code for data fetching inside a certain module (I often use a bunch of Backbone.Model objects for each table/document in my DB) and handlers for each using a reqres pattern, you can test them 1 by 1 to see if they all fetch and save correctly.
If complex computation is needed, abstract it in a function or module so that it can be easily tested with known data.
If your app uses data binding, a good policy is to have a JSON schema for all data to be tested against your views containing your bindings. Check against the schema all the data required. This is applied to your Backbone.Model also.
Using a good IDE also helps. PyCharm (if you use Python for backend) or WebStorm are extremely good for testing and developing JavaScript/CoffeeScript. You can breakpoint and study your code at specific locations, inside your browser! Also it runs your code for auto-completion and you can see some of the errors that way.
I just cannot encourage enough the use of modules in your code. Although there is no JavaScript official way of doing it (next ECMAScript draft has it), you can still use good libraries for it. Good ones are: RequireJS, CommonJS or Marionette.Module (if you use Marionette as your framework). I think Ember/AngularJS also offers this kind of functionality but I did not work with them personally so I am not sure.
This might not give you an immediate solution to your problem and I don't think (IMO) there is an easy one either. My focus was to show you ways to develop so that errors can be easily spotted and countered, and all of it (depending on your Unit Testing) during development phase. Errors will always happen, as much as our programmer ego wants us to believe the contrary. Hope I helped :)
I would suggest a divide and conquer strategy, first via logging, and second via code. Wrap suspect methods of the code with console logging in and out events and when the bug occurs hopefully it is occurring between or after some event. If event logging will not shed light, bring parts of the code into a new page/app. Divide and conquer will find when the error starts occurring.
So it seems you're in the realm of weird already, so I'm thinking of a weird solution. I know nothing about the backend of chrome myself so we're in the same boat, but if you're feeling bold here's an idea. Maybe you could find/replace every newline in your javascript files with a piece of code that logs either to a global string or to the console a) what file you're in, b) the contents of "this" or something useful to you, and maybe even c) the time. This would at least get you started. Make sure it's wrapped in something distinct so you can just as easily remove it.
I understand the term sandbox. But my limited skills in JS is unable to help me understand what is sandboxing in JS. So, what actually is sandboxing? Apart from security, why do we need to sandbox JS?
the javascript sandbox does exactly what you've said. It limits the scope of what a script can do. There are also benefits in terms of virtualising the resources the script can call on. This allows the sandbox host to marshal those resources for better performance and say, stop an endlessly looping script bringing the whole browser crashing down.
Sandboxing is the act of creating a scope in which no other part of the application can operate (unless given an opportunity to). More specifically, this is usually a function scope that exposes a limited subset of what's actually going on within it.
One library that's founded on the idea of sandboxes is YUI3. The basic unit of the application is a YUI instance sandbox:
var Y = YUI(); // creates a configurable YUI instance
// Creates a sandbox for one part of your application,
// including the 'node' module.
Y.use('node', function(Z) {
// Z is a YUI instance that's specific to this sandbox.
// Operations inside it are protected from outside code
// unless exposed explicitly. Any modules you request in
// use statement will be separately instanced just for
// this sandbox (in this case, the 'node' module)
//
// That way, if another part of your application decides
// to delete Z.Node (or worse, replace it with a
// malicious proxy of Z.Node) the code you've written
// here won't be affected.
});
The advantages of sandboxes are primarily to reduce application complexity: since sandboxes are immutable, they're much easier to reason about and verify. They also improve runtime security, since a well-designed sandbox should be able to operate as a black box to other scripts running on the page. It does not prevent against all possible attacks, but it protects against many of the simple ones.
Sandboxing creates a limited scope for the script to use. Assuming you're coding for a website, t's worth sandboxing to avoid making edits to a live site when you are uncertain about whether they will work exactly as you expect - and it's impossible to be truly certain without testing. Even if it works properly, if there's a chance of you making a series of alterations to the JS until you've got it tweaked the way you like, you could easily disrupt anyone attempting to use the site while you're updating it.
It's also much easier to tell what's broken when you break things, because of the limited nature of the sandbox.