techniques of debugging large piece of javascript code

techniques of debugging large piece of javascript code - javascript

Recently, I have a chance to work on a project contains some large javascript files. I would say 4000-5000 lines per file. For example, there are 3 big files(custom plugin) build on top of each other. I got a debug task that needs to resolve(logically and it is not a js error). when I tried to debug and understand the logic under chrome dev tool such as step through or tracing where the variable came from and goes, I always get lost at some point because the files are so big. I thought maybe I need to sit down for 1 or 2 day to read through all the files and draw the logic on the paper, I guess that may be not a good solution. I wonder if there is any technique that I missed for debugging and tracing the variables or the logic. please share your experience with me. thanks

Sometimes when I look at something like that, I start by creating a test. Try to test only the defect. Take a working copy and try to reduce it down until you have isolated the problem.
Good luck!
Specifically, for advanced step-debugging, there is:
using the call stack to inspect the local variables from the calling scope without having to step out of the function.
using conditional breakpoints.
https://developers.google.com/chrome-developer-tools/docs/javascript-debugging
If your JavaScript is making a lot of HTTP requests, it might also be useful to use the network tab to check the requests and responses are as expected.

Related

How to prevent access to my website via JavaScript

I was wondering if it would be possible to create a JavaScript function to check a condition, and if that is True then I deny access to the code? Right now I am checking the user agent, and if it doesn't meet given criteria then I delete the HTML tag. However, if they go to the network tab then they can still see the GET requests and responses for my code.
This is a website running on localhost because it's an Electron app by the way.
I thought maybe I could issue a 403 error but I'm not sure if that's possible via JS.
Thanks

In an Electron app, your code is still JavaScript source code. You can obfuscate it and/or put it in a ASAR archive to make it harder to read and fine, but the code is still there and accessible to anyone who wants to go to the effort. (For instance, if you use VS Code, you can see the source in resources/app/out directory and its subdirectories.)
You can make it even harder to find and understand the source if you're willing to put in more work. V8, the JavaScript engine Node.js uses, has a feature called startup snapshots: You run V8 and have it load your script (your obfuscated code), take a heap snapshot (after GC), and write it to a file. Then you specify the heap snapshot when loading V8, and it just hauls in the snapshot instead of reading and parsing code. The Atom team have done this on top of Electron. In their case the motivation was startup performance, not hiding the source code, but it has the effect of making your code even harder to find.
Note I said harder, not impossible. At the end of the day, if you want the code to execute on the end-user's computer, that code is going to be accessible to the end user. This is true even of compiled applications, although of course in that case a lot of organizational information that helps a human understand the code is lost in compilation, which happens less with obfuscated code. (But a good obfuscator makes code extremely opaque to human beings.)

Trace the execution of ALL Javascript in a web app

Here is the situation: A complex web app is not working, and it is possible to produce undesired behavior consistently. The cause of the problem is not known.
Proposal: Trace the execution paths of all javascript code. Essentially, produce two monstrous logs which can then be fed into a diff algorithm to determine where the behavior related to the bug begins to diverge (as the cause is not apparent from application behavior, and both comprehending and obtaining a copy of the actual JS code being run is difficult, due to the many pages that must be switched to and copied out from the web inspector. Making it difficult is the fact that all pages are dynamically spliced together with Perl code, where significant portions of JS code exist only as (dynamic...) Perl strings).
The Web Inspector in Chrome does not have an option that I know about for logging an execution trace. Basically what I would like is a log of every line of JS that is executed, in the order that they are executed. I don't see this as being a difficult thing to obtain given that the JS VM is single-threaded. The problem is simply that the existing user-facing tools are not designed for quite this much hardcore debugging. If we look at the Profiler in the Dev Tools, it's clearly capable of the kind of instrumentation that I need, but it is fundamentally designed to do profiling instead of tracing.
How can I get started with this? Is there some way I can build Chrome from source, where I can
switch off JIT in V8?
log every single javascript expression evaluated by V8 to a file
I have zero experience with the development side of Chrome. So e.g. links to dev-builds/branches/versions/distros of Chrome/Chromium/Canary (what's the difference?) are welcome.
At this point it appears that instrumenting the browser with powerful js tracing is still likely to be easier than redesigning the buggy app. The architecture of the page is a disaster, but the functionality is complex, and it almost fully works. I just have to find the one missing piece.
Alternatively, if tools of this sort already exist, what are some other keywords I can search for them with? "Code Tracing" is pretty much the only thing I can come up with.
I tested dynaTrace, which was a happy coincidence as our app supports IE (indeed Chrome support just came out of beta), but this does not produce a text dump, it basically produces a massive Win32 UI expando-tree, which is impossible to diff. This makes me really sad because I know how much more difficult it was to make the representation of the trace show up that way, and yet it turns out being almost utterly useless. Who's going to scroll up and down that tree view and see anything really useful in it, in anything other than a toy example of a web app?

If you are developing a big web app, it is always good to follow a test driven strategy for the coding part of it. Using just a few tips allows you to make a simple unit testing script (using QUnit) to test pretty much all aspects of your app. Here are some potential errors and some ways of solving them.
Make yourself handlers to register long living Objects and a handler to close them the safe way. If the safe way does not succeed then it is the management of the Object itself failing. One example would be Backbone zombie views. Either the view has bad code in the close section, the parent close is not hooked or an infinite loop happened. Testing all view events is also good, although tedious.
By putting all the code for data fetching inside a certain module (I often use a bunch of Backbone.Model objects for each table/document in my DB) and handlers for each using a reqres pattern, you can test them 1 by 1 to see if they all fetch and save correctly.
If complex computation is needed, abstract it in a function or module so that it can be easily tested with known data.
If your app uses data binding, a good policy is to have a JSON schema for all data to be tested against your views containing your bindings. Check against the schema all the data required. This is applied to your Backbone.Model also.
Using a good IDE also helps. PyCharm (if you use Python for backend) or WebStorm are extremely good for testing and developing JavaScript/CoffeeScript. You can breakpoint and study your code at specific locations, inside your browser! Also it runs your code for auto-completion and you can see some of the errors that way.
I just cannot encourage enough the use of modules in your code. Although there is no JavaScript official way of doing it (next ECMAScript draft has it), you can still use good libraries for it. Good ones are: RequireJS, CommonJS or Marionette.Module (if you use Marionette as your framework). I think Ember/AngularJS also offers this kind of functionality but I did not work with them personally so I am not sure.
This might not give you an immediate solution to your problem and I don't think (IMO) there is an easy one either. My focus was to show you ways to develop so that errors can be easily spotted and countered, and all of it (depending on your Unit Testing) during development phase. Errors will always happen, as much as our programmer ego wants us to believe the contrary. Hope I helped :)

I would suggest a divide and conquer strategy, first via logging, and second via code. Wrap suspect methods of the code with console logging in and out events and when the bug occurs hopefully it is occurring between or after some event. If event logging will not shed light, bring parts of the code into a new page/app. Divide and conquer will find when the error starts occurring.

So it seems you're in the realm of weird already, so I'm thinking of a weird solution. I know nothing about the backend of chrome myself so we're in the same boat, but if you're feeling bold here's an idea. Maybe you could find/replace every newline in your javascript files with a piece of code that logs either to a global string or to the console a) what file you're in, b) the contents of "this" or something useful to you, and maybe even c) the time. This would at least get you started. Make sure it's wrapped in something distinct so you can just as easily remove it.

JavaScript log parser project: Bad idea?

There are numerous log files that I have to review daily for my job. Several good parsers already exist for these log files but I have yet to find exactly what I want. Well, who could make something more tailored to you than you, right?
The reason I am using JavaScript (other than the fact that I already know it) is because it's portable (no need to install anything) but at the same time cross-platform accessible. Before I invest too much time in this, is this a terrible method of accomplishing my goal?
The input will be entered into a text file, delimited by [x] and the values will be put into an array to make accessing these values faster than pulling the static content.
Any special formatting (numbers, dates, etc) will be dealt with before putting the value in the array to prevent a function from repeating this step every time it is used.
These logs may contain 100k+ lines which will be a lot for the browser to handle. However, each line doesn't contain a ton of information.
I have written some of it already, but with even 10,000 lines it's starting to run slow and I don't know if it's because I wasn't efficient enough or if this just cannot be effectively done. I'm thinking this is because all the data is in one giant table. I'd probably be better off paginating it, but that is less than desirable.
Question 1: Is there anything I failed to mention that I should consider?
Question 2: Would you recommend a better alternative?
Question 3: (A bit off topic, so feel free to ignore). Instead of copy/pasting the input, I would like to 'open' the log file but as far as I know JavaScript cannot do this (for security reasons). Can this be accomplished with a input="file" without actually having a server to upload to? I don't know how SSJS works, but it appears that I underestimated the limitations of JavaScript.
I understand this is a bit vague, but I'm trying to keep you all from having to read a book to answer my question. Let me know if I should include additional details. Thanks!

I think JavaScript is an "ok" choice for this. Using a scripting language to parse log files for personal use is a perfectly sane decision.
However, I would NOT use a browser for this. Web browsers place limitations on how long a bit of javascript can run, or on how many instructions it is allowed to run, or both. If you exceed these limits, you'll get something like this:
Since you'll be working with a large amount of data, I suspect you're going to hit this sooner or later. This can be avoided by clever use of setTimeout, or potentially with web workers, but that will add complexity to your project. This is probably not what you want.
Be aware that JavaScript can run outside of browsers as well. For instance, Windows comes with the Windows Script Host. This will let you run JavaScript from the command prompt, without needing a browser. You won't get the "Script too long" error. As an added bonus, you will have full access to the file system, and the ability to pass command-line arguments to your code.
Good luck and happy coding!

To answer your top question in bold: No, it is not a terrible idea.
If JS is the only language you know, you want to avoid setting up any dependencies, and you want to stay platform-independent... JavaScript seems like a good fit for your particular case.
As a more general rule, I would never use JS as a language to write a desktop app. Especially not for doing a task like log parsing. There are many other languages which are much better suited to this type of problem, like Python, Scala, VB, etc. I mention Python and Scala because of their script-like behaviour and minimal setup requirements. Python also has very similar syntax to JS so it might be easier to pick up then other languages. VB (or any .NET language) would work too if you have a Visual Studio license because of it's easy to use GUI builder if that suits your needs better.
My suggested approach: use an existing framework. There are hundreds, if not thousands of log parsers out there which handle all sorts of use-cases and different formats of logs that you should be able to find something close to what you need. It may just take a little more effort than Google'ing "Log Parsers" to find one that works. If you can't find one that suits your exact needs and you are willing to spend time making your own, you should use that time instead to contribute to one of the existing ones which are open source. Extending an existing code base should always be considered before trying to re-invent the wheel for the 10th gillion time.

Given your invariants "javascript, cross-platform, browser ui, as fast as possible" I would consider this approach:
Use command line scripts (windows: JScript; linux: ?) to parse log files and store 'clean'/relevant data in a SQLite Database (fall back: any decent scripting language can do this, the ready made/specialized tools may be used too)
Use the SQLite Manager addon to do your data mining with SQL
If (2) gets clumsy - use the SQLite Manager code base to 'make something more tailored'
Considering your comment:
For Windows-only work you can use the VS Express edition to write an app in C#, VB.NET, C++/CLI, F#, or even (kind of) Javascript (Silverlight). If you want to stick to 'classic' Javascript and a browser, write a .HTA application (full access to the local machine) and use ADO data(base) access and try to get the (old) DataGrid/Flexgrid controls (they may be installed already; search the registry).

Debugging site written mainly in JScript with AJAX code injection

I have a legacy code to maintain and while trying to understand the logic behind the code, I have run into lots of annoying issues.
The application is written mainly in Java Script, with extensive usage of jQuery + different plugins, especially Accordion. It creates a wizard-like flow, where client code for the next step is downloaded in the background by injecting a result of a remote AJAX request. It also uses callbacks a lot and pretty complicated "by convention" programming style (lots of events handlers are created on the fly based on certain object names - e.g. current page name, current step name).
Adding to that, the code is very messy and there is no obvious inner structure - the functions are scattered in the code, file names do not reflect the business role of the code, lots of functions and code snippets are most likely not used at all etc.
PROBLEM: How to approach this code base, so that the inner flow of the code can be sort-of "reverse engineered" using a suite of smart debugging tools.
Ideally, I would like to be able to attach to the running application and step through the code, breaking on each new function call.
Also, it would be nice to be able to create a "diagram of calls" in the application (i.e. in order to run a particular page logic, this particular flow of function calls was executed in a particular order).
Not to mention to be able to run a coverage analysis, identifying potentially orphaned code fragments.
I would like to stress out once more, that it is impossible to understand the inner logic of the application just by looking at the code itself, unless you have LOTS of spare time and beer crates, which I unfortunately do not have :/ (shame...)
An IDE of some sort that would aid in extending that code would be also great, but I am currently looking into possibility to use Visual Studio 2010 to do the job, as the site itself is a mix of Classic ASP and ASP.NET (I'd say - 70% Java Script with jQuery, 30% ASP).
I have obviously tried FireBug, but I was unable to find a way to define a breakpoint or step into the code, which is "injected" into the client JS using AJAX calls (i.e. the application retrieves the code by invoking an URL and injects it to the client local code). Venkman debugger had similar issues.
Any hints would be welcome. Feel free to ask additional questions.

You can try to use "dynaTrace Ajax" to create a diagram of calls, but it is not debugger. If you have access to that application you can use keyword "debugger" to define a breakpoint explicitly in javascript files. Visual Studio theoretically shows all of evaluated and loaded javascript code in Solution Explorer when you've attached to IE.

I'd start with Chrome's developer tools and profiling single activities to find functions to set breakpoints on. Functions can be expanded to get their call stack.
That may or may not be helpful as it's entirely possible to get code that's too -- uh, unique, shall we say -- to make much sense of.
Good luck?

Method to 'compile' javascript to hide the source during page execution?

I wanted to hide some business logic and make the variables inaccessible. Maybe I am missing something but if somebody can read the javascript they can also add their own and read my variables. Is there a way to hide this stuff?

Any code which executes on a client machine is available to the client. Some forms of code are harder to access, but if someone really wants to know what's going on, there's no way you have to stop them.
If you don't want someone to find out what code is being run, do it on a server. Period.

That's one of the downsides of using a scripting language - if you don't distribute the source, nobody can run your scripts!
You can run your JS through an obfuscator first, but if anyone really wants to figure out exactly what your code is doing, it won't be that much work to reverse-engineer, especially since the effects of the code are directly observable in the first place.

Javascript cannot be compiled, that is, it is still Javascript.
But, there's this: http://dean.edwards.name/packer/
Generally, this is used to reduce the code footprint of the Javascript, if say your script is being downloaded thousands of times per minute. There are other methods to accomplish this, but as for hiding the code this sort of works.
Granted, the code can be unpacked. This will keep out a novice but anyone who is determined to read your source code will find a way.
It is even this way with compiled languages, even when they have been obfuscated. It's impossible to hide your code 100% of the time -- if it executes on your machine, it can be read by a determined hacker.

You could encrypt it so no one can read it.
For example
http://daven.se/usefulstuff/javascript-obfuscator.html

You must always validate the data you send back. I've had a rather entertaining time playing pranks on a forum I'm a mod of by manipulating the pages with the Web Developer Toolbar. Whether or not you obfuscate it, always assume that data coming to the server has been intentionally manipulated. Only after you prove it hasn't (or verify the user has permission to act) do you handle the request.

We Keep Coding

JavaScript is the programming language of the Web.