Here is the situation: A complex web app is not working, and it is possible to produce undesired behavior consistently. The cause of the problem is not known.
Proposal: Trace the execution paths of all javascript code. Essentially, produce two monstrous logs which can then be fed into a diff algorithm to determine where the behavior related to the bug begins to diverge (as the cause is not apparent from application behavior, and both comprehending and obtaining a copy of the actual JS code being run is difficult, due to the many pages that must be switched to and copied out from the web inspector. Making it difficult is the fact that all pages are dynamically spliced together with Perl code, where significant portions of JS code exist only as (dynamic...) Perl strings).
The Web Inspector in Chrome does not have an option that I know about for logging an execution trace. Basically what I would like is a log of every line of JS that is executed, in the order that they are executed. I don't see this as being a difficult thing to obtain given that the JS VM is single-threaded. The problem is simply that the existing user-facing tools are not designed for quite this much hardcore debugging. If we look at the Profiler in the Dev Tools, it's clearly capable of the kind of instrumentation that I need, but it is fundamentally designed to do profiling instead of tracing.
How can I get started with this? Is there some way I can build Chrome from source, where I can
switch off JIT in V8?
log every single javascript expression evaluated by V8 to a file
I have zero experience with the development side of Chrome. So e.g. links to dev-builds/branches/versions/distros of Chrome/Chromium/Canary (what's the difference?) are welcome.
At this point it appears that instrumenting the browser with powerful js tracing is still likely to be easier than redesigning the buggy app. The architecture of the page is a disaster, but the functionality is complex, and it almost fully works. I just have to find the one missing piece.
Alternatively, if tools of this sort already exist, what are some other keywords I can search for them with? "Code Tracing" is pretty much the only thing I can come up with.
I tested dynaTrace, which was a happy coincidence as our app supports IE (indeed Chrome support just came out of beta), but this does not produce a text dump, it basically produces a massive Win32 UI expando-tree, which is impossible to diff. This makes me really sad because I know how much more difficult it was to make the representation of the trace show up that way, and yet it turns out being almost utterly useless. Who's going to scroll up and down that tree view and see anything really useful in it, in anything other than a toy example of a web app?
If you are developing a big web app, it is always good to follow a test driven strategy for the coding part of it. Using just a few tips allows you to make a simple unit testing script (using QUnit) to test pretty much all aspects of your app. Here are some potential errors and some ways of solving them.
Make yourself handlers to register long living Objects and a handler to close them the safe way. If the safe way does not succeed then it is the management of the Object itself failing. One example would be Backbone zombie views. Either the view has bad code in the close section, the parent close is not hooked or an infinite loop happened. Testing all view events is also good, although tedious.
By putting all the code for data fetching inside a certain module (I often use a bunch of Backbone.Model objects for each table/document in my DB) and handlers for each using a reqres pattern, you can test them 1 by 1 to see if they all fetch and save correctly.
If complex computation is needed, abstract it in a function or module so that it can be easily tested with known data.
If your app uses data binding, a good policy is to have a JSON schema for all data to be tested against your views containing your bindings. Check against the schema all the data required. This is applied to your Backbone.Model also.
Using a good IDE also helps. PyCharm (if you use Python for backend) or WebStorm are extremely good for testing and developing JavaScript/CoffeeScript. You can breakpoint and study your code at specific locations, inside your browser! Also it runs your code for auto-completion and you can see some of the errors that way.
I just cannot encourage enough the use of modules in your code. Although there is no JavaScript official way of doing it (next ECMAScript draft has it), you can still use good libraries for it. Good ones are: RequireJS, CommonJS or Marionette.Module (if you use Marionette as your framework). I think Ember/AngularJS also offers this kind of functionality but I did not work with them personally so I am not sure.
This might not give you an immediate solution to your problem and I don't think (IMO) there is an easy one either. My focus was to show you ways to develop so that errors can be easily spotted and countered, and all of it (depending on your Unit Testing) during development phase. Errors will always happen, as much as our programmer ego wants us to believe the contrary. Hope I helped :)
I would suggest a divide and conquer strategy, first via logging, and second via code. Wrap suspect methods of the code with console logging in and out events and when the bug occurs hopefully it is occurring between or after some event. If event logging will not shed light, bring parts of the code into a new page/app. Divide and conquer will find when the error starts occurring.
So it seems you're in the realm of weird already, so I'm thinking of a weird solution. I know nothing about the backend of chrome myself so we're in the same boat, but if you're feeling bold here's an idea. Maybe you could find/replace every newline in your javascript files with a piece of code that logs either to a global string or to the console a) what file you're in, b) the contents of "this" or something useful to you, and maybe even c) the time. This would at least get you started. Make sure it's wrapped in something distinct so you can just as easily remove it.
Related
It looks like I'm asking about a tricky problem that's been explored a lot over the past decades without a clear solution. I've seen Is It Possible to Sandbox JavaScript Running In the Browser? along with a few smaller questions, but all of them seem to be mislabeled - they all focus on sandboxing cookies and DOM access, and not JavaScript itself, which is what I'm trying to do; iframes or web workers don't sound exactly like what I'm looking for.
Architecturally, I'm exploring the pathological extreme: not only do I want full control of what functions get executed, so I can disallow access to arbitrary functions, DOM elements, the network, and so forth, I also really want to have control over execution scheduling so I can prevent evil or poorly-written scripts from consuming 100% CPU.
Here are two approaches I've come up with as I've thought about this. I realize I'm only going to perfectly nail two out of fast, introspected and safe, but I want to get as close to all three as I can.
Idea 1: Put everything inside a VM
While it wouldn't present a JS "front", perhaps the simplest and most architecturally elegant solution to my problem could be a tiny, lightweight virtual machine. Actual performance wouldn't be great, but I'd have full introspection into what's being executed, and I'd be able to run eval inside the VM and not at the JS level, preventing potentially malicious code from ever encountering the browser.
Idea 2: Transpilation
First of all, I've had a look at Google Caja, but I'm looking for a solution itself written in JS so that the compilation/processing stage can happen in the browser without me needing to download/run anything else.
I'm very curious about the various transpilers (TypeScript, CoffeeScript, this gigantic list, etc) - if these languages perform full tokenization->AST->code generation that would make them excellent "code firewalls" that could be used to filter function/DOM/etc accesses at compile time, meaning I get my performance back!
My main concern with transpilation is whether there are any attacks that could be used to generate the kind code I'm trying to block. These languages' parsers weren't written with security in mind, after all. This was my motivation behind using a VM.
This technique would also mean I lose execution introspection. Perhaps I could run the code inside one or more web workers and "ping" the workers every second or so, killing off workers that [have presumably gotten stuck in an infinite loop and] don't respond. That could work.
One of the great things about being a web developer in recent history is all the sharing that's going on, especially with JavaScript libraries. There's all these awesome tools to use: jQuery, jQuery-UI, Lightbox, bxslider, underscore.js, Backbone.js, the list goes on. Then there comes a time when one or more of these libraries need to be updated. But JavaScript runs on the client, it doesn't compile, and it's difficult or impossible to be notified when a problem occurs. What is the best technique right now to assure that after you update one or more JavaScript libraries, your web application will not start throwing JavaScript errors?
There's no way the best response to this is to just test. Especially with a complicated application it can be too difficult to go through every possibility and make sure no errors are thrown. What are other web application developers out there doing to make sure they don't have a deployment with an embarassing and crippling JavaScript bug caused by updating?
The best answer is to "just test". What you are asking, essentially, is "How do I test to make sure my software still works?". You can do all the homework you want to see what changed, but eventually you just need to test your application.
That being said, there are generic testing tools like JSLint and Selenium, but ultimately your application is going to be unique enough that you will need to have unit tests to cover the business logic and standard QA for non-standard processes.
One way to ensure that things still work functionally is to have a suite of automated browser tests (utilizing a tool like Selenium) that you run on your development environment.
I wouldn't really call this "just testing" but unit testing will help do the job. Write tests once for your app looking for the expected results (good practice either way), then when you update the plugins run these tests again.
http://qunitjs.com/
Nothing beats a good QA and browser testing though.
Plenty of other answers already include "test", so hopefully that's obvious at this point.
The other thing that you should ALWAYS do is read the release notes for every single version up to the version that you're upgrading to. I can't speak intelligently about bxslider or Lightbox, but the other major libraries that you referenced are extremely good at releasing detailed changelogs which will notify you of breaking changes. You can decide for yourself if any of these changes will have a negative effect on your app ( in addition to testing of course!).
I'm going to come clean and admit that out of sheer ignorance I've been flying blind with my websites until recently when I deployed a site with Elmah (an error logging facility for ASP.NET) and a controller on the server to send all uncaught JavaScript exceptions. This was an eyeopening experience to say the least.
One of my sites is getting about 150-200 visitors a day. About once a day I get an Elmah JavaScript stacktrace similar to this one:
CCS.Exceptions.JavaScriptException: Unspecified error.: at document path
'http://www.*******.com/*****'. at anonymous('Unspecified error.',
'http://ajax.googleapis.com/ajax/libs/jqueryui/1.9.0/jquery-ui.min.js', '5')
at anonymous()
I'm relatively new to JavaScript and web development in general so I'd appreciate some advice:
How should I go about getting to the root cause of problems like this? Any advice for debugging problems in production (specifically JavaScript)?
Where would you rank this on the scale of importance? I realize that question is almost impossible to answer without knowing everything on my plate. But what I'm hoping for is advice for prioritizing this kind of stuff. Is my hair on fire? Or, "Meh, this is really common and given all of the browsers and OSes these days this kind of thing is bound to happen."
The chances are pretty good that each one of these exceptions is a web page that is NOT working as designed. In order to understand that priority, you need to understand the following:
What percentage of page views are being affected?
What are the specific exceptions and what trouble do they cause?
How important is that trouble to your business.
How much time will it take to identify and fix these and how does the business importance of that compare with other projects on your plate?
In general, you won't know the answers to any of these questions without further research so you'll either have to decide that all exceptions that aren't fully understood are a bad thing and should at least be understood.
Or, you'll have to decide to launch an investigative project to get more information related to these questions and then decide whether to fix it or not.
Or, you'll have to decide that you just havne't heard enough reports of problems with your web site to warrant any further investigation (I personally don't like this option because you're guessing that it doesn't matter).
Without knowing much detail about your business, my recommendation would be to budget and plan an investigation project to learn where these errors are coming from and understand their impact on the usability of the site. Making decisions with some data in hand is way better than guessing.
As for debugging, you ultimately need to find out which line of code is triggering the exception. You also may want to record what browser configuration is generating the error (in case it's a browser version related issue). You can start by understand what each piece of information in your stacktrace report means and what it tells you and then find out if you can enable more detailed tracking of the exceptions in the Elmah system. If not and you can figure out approximately where the error's are coming from, you can implement your own exception logging that might be able to capture additional information.
You may also browse through any trouble ticket reports you have on the site because there may be some internal or external reports of problems with the site that might be correlated with these exceptions.
JavaScript errors can be difficult to get to the bottom of. The best thing you can do is ensure that your JavaScript is written cleanly and put everything into namespaces to keep your window object as clean as possible and avoid variable hoisting which is a common issue when JavaScript libraries get unruly. Below is the preferred namespace pattern that I use across the board -
/* Namespace pattern */
var myAppNamespace = myAppNamespace || {};
(function(ns) {
ns.doSomething = function() {
// enter code here
};
}(myAppNamespace));
/* Usage */
myAppNamespace.doSomething();
There are numerous log files that I have to review daily for my job. Several good parsers already exist for these log files but I have yet to find exactly what I want. Well, who could make something more tailored to you than you, right?
The reason I am using JavaScript (other than the fact that I already know it) is because it's portable (no need to install anything) but at the same time cross-platform accessible. Before I invest too much time in this, is this a terrible method of accomplishing my goal?
The input will be entered into a text file, delimited by [x] and the values will be put into an array to make accessing these values faster than pulling the static content.
Any special formatting (numbers, dates, etc) will be dealt with before putting the value in the array to prevent a function from repeating this step every time it is used.
These logs may contain 100k+ lines which will be a lot for the browser to handle. However, each line doesn't contain a ton of information.
I have written some of it already, but with even 10,000 lines it's starting to run slow and I don't know if it's because I wasn't efficient enough or if this just cannot be effectively done. I'm thinking this is because all the data is in one giant table. I'd probably be better off paginating it, but that is less than desirable.
Question 1: Is there anything I failed to mention that I should consider?
Question 2: Would you recommend a better alternative?
Question 3: (A bit off topic, so feel free to ignore). Instead of copy/pasting the input, I would like to 'open' the log file but as far as I know JavaScript cannot do this (for security reasons). Can this be accomplished with a input="file" without actually having a server to upload to? I don't know how SSJS works, but it appears that I underestimated the limitations of JavaScript.
I understand this is a bit vague, but I'm trying to keep you all from having to read a book to answer my question. Let me know if I should include additional details. Thanks!
I think JavaScript is an "ok" choice for this. Using a scripting language to parse log files for personal use is a perfectly sane decision.
However, I would NOT use a browser for this. Web browsers place limitations on how long a bit of javascript can run, or on how many instructions it is allowed to run, or both. If you exceed these limits, you'll get something like this:
Since you'll be working with a large amount of data, I suspect you're going to hit this sooner or later. This can be avoided by clever use of setTimeout, or potentially with web workers, but that will add complexity to your project. This is probably not what you want.
Be aware that JavaScript can run outside of browsers as well. For instance, Windows comes with the Windows Script Host. This will let you run JavaScript from the command prompt, without needing a browser. You won't get the "Script too long" error. As an added bonus, you will have full access to the file system, and the ability to pass command-line arguments to your code.
Good luck and happy coding!
To answer your top question in bold: No, it is not a terrible idea.
If JS is the only language you know, you want to avoid setting up any dependencies, and you want to stay platform-independent... JavaScript seems like a good fit for your particular case.
As a more general rule, I would never use JS as a language to write a desktop app. Especially not for doing a task like log parsing. There are many other languages which are much better suited to this type of problem, like Python, Scala, VB, etc. I mention Python and Scala because of their script-like behaviour and minimal setup requirements. Python also has very similar syntax to JS so it might be easier to pick up then other languages. VB (or any .NET language) would work too if you have a Visual Studio license because of it's easy to use GUI builder if that suits your needs better.
My suggested approach: use an existing framework. There are hundreds, if not thousands of log parsers out there which handle all sorts of use-cases and different formats of logs that you should be able to find something close to what you need. It may just take a little more effort than Google'ing "Log Parsers" to find one that works. If you can't find one that suits your exact needs and you are willing to spend time making your own, you should use that time instead to contribute to one of the existing ones which are open source. Extending an existing code base should always be considered before trying to re-invent the wheel for the 10th gillion time.
Given your invariants "javascript, cross-platform, browser ui, as fast as possible" I would consider this approach:
Use command line scripts (windows: JScript; linux: ?) to parse log files and store 'clean'/relevant data in a SQLite Database (fall back: any decent scripting language can do this, the ready made/specialized tools may be used too)
Use the SQLite Manager addon to do your data mining with SQL
If (2) gets clumsy - use the SQLite Manager code base to 'make something more tailored'
Considering your comment:
For Windows-only work you can use the VS Express edition to write an app in C#, VB.NET, C++/CLI, F#, or even (kind of) Javascript (Silverlight). If you want to stick to 'classic' Javascript and a browser, write a .HTA application (full access to the local machine) and use ADO data(base) access and try to get the (old) DataGrid/Flexgrid controls (they may be installed already; search the registry).
I have a legacy code to maintain and while trying to understand the logic behind the code, I have run into lots of annoying issues.
The application is written mainly in Java Script, with extensive usage of jQuery + different plugins, especially Accordion. It creates a wizard-like flow, where client code for the next step is downloaded in the background by injecting a result of a remote AJAX request. It also uses callbacks a lot and pretty complicated "by convention" programming style (lots of events handlers are created on the fly based on certain object names - e.g. current page name, current step name).
Adding to that, the code is very messy and there is no obvious inner structure - the functions are scattered in the code, file names do not reflect the business role of the code, lots of functions and code snippets are most likely not used at all etc.
PROBLEM: How to approach this code base, so that the inner flow of the code can be sort-of "reverse engineered" using a suite of smart debugging tools.
Ideally, I would like to be able to attach to the running application and step through the code, breaking on each new function call.
Also, it would be nice to be able to create a "diagram of calls" in the application (i.e. in order to run a particular page logic, this particular flow of function calls was executed in a particular order).
Not to mention to be able to run a coverage analysis, identifying potentially orphaned code fragments.
I would like to stress out once more, that it is impossible to understand the inner logic of the application just by looking at the code itself, unless you have LOTS of spare time and beer crates, which I unfortunately do not have :/ (shame...)
An IDE of some sort that would aid in extending that code would be also great, but I am currently looking into possibility to use Visual Studio 2010 to do the job, as the site itself is a mix of Classic ASP and ASP.NET (I'd say - 70% Java Script with jQuery, 30% ASP).
I have obviously tried FireBug, but I was unable to find a way to define a breakpoint or step into the code, which is "injected" into the client JS using AJAX calls (i.e. the application retrieves the code by invoking an URL and injects it to the client local code). Venkman debugger had similar issues.
Any hints would be welcome. Feel free to ask additional questions.
You can try to use "dynaTrace Ajax" to create a diagram of calls, but it is not debugger. If you have access to that application you can use keyword "debugger" to define a breakpoint explicitly in javascript files. Visual Studio theoretically shows all of evaluated and loaded javascript code in Solution Explorer when you've attached to IE.
I'd start with Chrome's developer tools and profiling single activities to find functions to set breakpoints on. Functions can be expanded to get their call stack.
That may or may not be helpful as it's entirely possible to get code that's too -- uh, unique, shall we say -- to make much sense of.
Good luck?