This is a generic question about paradigms, and I apologize if this is an inappropriate place to ask. Polite recommendations on the correct place to ask this will be appreciated :)
I'm working for a company that has a separate codebase for each of its websites. I've been asked to take a sizable piece of functionality out of one codebase and put it into an external library, so that multiple codebases can make use of it.
The problem is that the code is tightly coupled to the codebase it was built in, and I'm having a difficult time extracting it. I've approached this problem from multiple angles, and restarted from scratch each time. Each time, I start running into complexities, and it feels like I'm approaching the problem the wrong way. I was wondering if anyone else has had experience doing this, or if there is a recommended way to proceed?
Here's what I've tried:
I copied the relevant files into a new project, carefully replacing each reference to the old codebase with vanilla JavaScript. This has been a laborious process, and I keep running into issues I can't solve.
I placed a very basic HTML file in the old codebase, as well as a blank JavaScript file. I've been cutting and pasting functions one at a time into that JavaScript file, and calling them in the old codebase as well as the basic HTML file.
I created another new project, and copy and pasted functions one at a time into the new project.
Each approach has presented me with its own challenges, but I can't get around the fact that the original code is so tightly coupled to the original codebase that progress is very slow, and I'm beginning to question whether any of the code is salvageable.
The old code may not be salvageable, and it's more than reasonable to reach a point where you go back and say so.
The typical goal I have in cases such as these, cases where nearly all of the old code is unsalvageable but something new needs to not only take over for it, but quickly be used by old and new codebases alike, is to refactor the code into models, services, and components (less MVC and more 'data, how you get and change data, and how you view and interact with data').
In cases where you are building something to replicate the old, but get to write it from scratch, I treat it like it's brand new, and start from the interfaces first. By knowing what the outer edges should look like, keeping the internal code clean, and leaning on DI (the principle, not any wrapper in particular), I build the system I think I should be able to have, such that new projects/products can happily integrate with the right thing.
...for projects which need to have a product revamped inside of a busted old system, I take nearly the same tack: I design the interface that I want, I make sure that everything is DI friendly (this becomes more important here), and then I build a facade that looks exactly like how the old bustedness is called and used. Inside of that facade, I instantiate the sane system, transform whatever the old, awful data points were into our new models, do whatever it is my system needs to do, and on the way out of the system, transform our awesome new models into the terrifying results that the old system was responsible for producing.
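To make that concrete, here's a very rough sketch of what such a facade can look like; every name below is invented for illustration:

```javascript
// The outside world keeps calling the old, ugly entry point; the inside
// delegates to the cleanly designed system. All names are hypothetical.
function legacyGetCustomerOrders(rawCustomerBlob, callback) {  // old signature, untouched
  var customer = toCustomerModel(rawCustomerBlob);   // old mess -> new model
  orderService.ordersFor(customer)                   // the sane new system (wired up elsewhere)
    .then(function (orders) {
      callback(toLegacyOrderRows(orders));           // new models -> the output the old code expects
    });
}
```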
The latest such thing is a new platform which hosts new APIs.
The APIs, however, talk to awful, old, stateful, session-based web-services, which make horizontal-scaling absolutely impossible (not what you want to hear, when your goal is to distribute the new platform on Node, on AWS).
The solution was to build the APIs exactly as we expect them; get the interface to look as nice and be as useful as possible, while serving the actual needs of the API clients.
Then, we made sure that the modules which provided the APIs used DI for the service that acts as a connector to the old back-end.
That's so that we can simply swap out that service, when it comes time to connect to a better implementation of the system.
That service, however, needs transformers.
It needs one transformer to convert our awesome new request objects, into the scary old ball of mud that just kept growing.
Then it needs another transformer to turn the output from the ugly old data, into our new models, that our whole system uses.
Those transformers don't necessarily need to be injected into the service, because their implementation details are tied pretty tightly to the place that they're calling, and any update to the service, or any new service called will require transformer work for that specific service's implementation details.
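A sketch of how that can hang together: the legacy client is injected (so the back-end can be swapped later), while the transformers are plain, service-specific functions. Every name here is made up:

```javascript
function createAccountService(legacyClient) {        // legacyClient injected, so it can be swapped later
  return {
    getAccount: function (request) {
      var legacyRequest = toLegacyAccountRequest(request);   // new request -> old ball of mud
      return legacyClient.call("GetAcct", legacyRequest)     // hypothetical legacy web-service call
        .then(function (legacyResponse) {
          return toAccountModel(legacyResponse);             // old output -> new model
        });
    }
  };
}
```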
Then there are problems on the front-end side, where communication used to take too much for granted, when talking to a server.
We now have transformers on the client side, which are used at the last possible second (actually, we wrote client-side services) to convert from the old way of doing things to the new form.
Any magic global data that was grabbed at random in the middle of a process was factored into the service, the transform, or the API in general, if it serves a specific / reusable enough purpose.
Any of those magically grabbed pieces of information are now explicitly passed in. Some are client-only, and thus are either config data for instantiation, or are parameters for particular methods on services.
Session data is now explicitly passed back from the client, in the form of tokens/ids on each request that requires them (for now).
So the new platform stays 100% stateless (and thus scales wonderfully, from that aspect).
So long as all of that magical data gets pulled out of the internals, and passed through, the system can keep being refactored, without too much worry.
As soon as state and state-management exist on the inside of your system, it starts getting harder to work with, and harder to refactor (but you already know that).
Doing a refactor of a product which never leaves the page (i.e. involves no APIs/services, or at least none that are tightly coupled to your front-end) isn't really much different.
Remove global state, by explicitly forcing it to be passed into your system (build time, call-time, whenever fits the data's purpose).
If there are async race conditions with moving parts that touch too many things, see if you can resolve them with promises, to get you out of nested callback hell.
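For example, something like this (assuming a Promise implementation is available; the functions are placeholders):

```javascript
loadUser(userId)
  .then(function (user)  { return loadPreferences(user); })
  .then(function (prefs) { return applyTheme(prefs.theme); })
  .then(function ()      { render(); })
  .catch(function (err)  { handleFailure(err); });  // one place to handle errors, instead of one per nesting level
```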
My team is now largely using set-based programming (.map, .filter, .reduce, over arrays) and functional programming, in general, to simplify much of the code we look at and write, as each new function may only be 3-5 lines long (some being one-liners).
So our services will tend to be structured in an OOP sort of way, but as much as possible, will remain pure (no outer state modified by/for function calls), and the internals of those calls will typically look much more like chained or composed functional programming.
This has less to do with the overall refactor, and more to do with the micro refactors, as we build our systems.
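As a micro-level illustration of that style (the data shape is invented):

```javascript
var totalOwed = invoices
  .filter(function (inv) { return !inv.paid; })                 // keep only the unpaid ones
  .map(function (inv) { return inv.amount; })                   // project down to the numbers
  .reduce(function (sum, amount) { return sum + amount; }, 0);  // fold into a single total
```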
For the macro-level, it's really your interface, and the facade you wrap the old stuff in, and the removal of all global state (which functional helps with) which make the difference.
The other alternative, of course, is to copy and paste the whole file/page, and start erasing things that you know aren't going to break, until you get to the things that might break, and continue from there. It's not pretty, it's not reusable, but I've been forced to do it a few times in my life, and regretted it every time.
Today, crypto libraries for JavaScript exist (e.g. sjcl), and hence there may be situations where
a password/key/secret/other sensitive data is stored somewhere in a variable in JavaScript.
I do not want to risk this sensitive data being leaked/disclosed, and hence I would very much like to know if there is a way to reliably wipe a variable in JavaScript so that the memory used by the JavaScript engine will not have any remaining info about the data. I would, for instance, not want to rely on some GC to wipe the data lazily.
An answer might feature example code that kills/wipes a variable, and also an explanation of when it makes sense to trust that the data has been deleted (and whether there are differences between JavaScript implementations, i.e. browser types/Node.js).
Otherwise, if the task is impossible, I would appreciate an explanation of why this is so, and would accept that as an answer as well.
The goal is not to protect the webpage user from accessing the script variable (this cannot be done, I guess). The goal is more to guarantee that the memory of the JavaScript engine does not keep shadow/cached copies of the data beyond the point where it is necessary. I want the data to be gone so that no one (e.g. attacker software) can get the secret data by looking at the memory associated with the JavaScript variables.
JavaScript is garbage collected. In addition, there is no mechanism for deterministic resource management built in. You can make one, but the resource would have to be external.
Even if you build such a mechanism (with a C++ external module in Node for example), engines don't give you strong guarantees on when their copy of the memory is cleared. You would have to manually assign to the same variable parts of the resource data and replace it with junk yourself. That would likely work but there is still no guarantee at the engine level.
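A minimal sketch of that manual-overwrite idea, assuming you keep the secret in a mutable typed array rather than a string (strings are immutable, so they can never be overwritten in place). `useSecret` is a hypothetical consumer, and nothing here guarantees the engine holds no other copies:

```javascript
var secret = new Uint8Array(32);
window.crypto.getRandomValues(secret);  // browser API; in Node you'd fill it via the crypto module

useSecret(secret);                      // hypothetical function that consumes the key

secret.fill(0);                         // best-effort wipe: overwrite the only copy we control
```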
This is simply not a problem JavaScript implementations are built to handle well at this point. There is no SecureString. That said, smart people are working on variants of ECMAScript (the JS standard) that give you much stronger guarantees. That's a good first step towards addressing the problem (but there is no such guarantee yet).
I don't even want to get started on browsers, where browser extensions can easily get better hooks than you, write over Function.prototype.call, and hook every function call; JavaScript has quite powerful AOP capabilities built in, for the worse in this instance.
One possible solution would be to run the whole program within a VM that uses encrypted RAM, but I'm against rolling your own crypto like that. Generally, an attacker should not have access to your program's RAM in the first place, if they do, they can install a browser extension :)
I want to improve my Coffeescript coding style. When I program in Scala, I can write a module in an hour or two, run it and have only a few minor bugs that I can quickly identify and fix.
In Coffeescript, I spend about the same time up front but I end up having a staggering amount of small bugs that would have been caught by a static type checker and I end up having to compile, reload the browser, step through some code, add some break points, etc. It's an infuriating experience and takes significantly longer.
It's much harder to abstract and encapsulate functionality due to the lack of interfaces and many other OO-features.
Are there design patterns that replace the encapsulation/abstraction generally provided by OO? Or is there a primer/guide on how to think in a more Coffeescript-y way (or how to solve problems using a prototypical approach)?
What have you done to become more productive in Coffeescript (or Javascript - perhaps even any dynamically typed languages)?
If you're coming from a statically-typed, class-centric language like Java or Scala, learning JavaScript/CoffeeScript is going to be a challenge. The compiler doesn't help you nearly as much, which means that it takes you minutes to discover small mistakes instead of seconds.
If that's your major bottleneck, then I'd suggest embracing a more test-driven coding methodology. Use a library like QUnit to write small tests for each piece of functionality you develop. Used properly, this style gives you the same benefits as a static compiler without compromising the flexibility of a dynamic language.
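For example, a tiny QUnit test might look like this (assuming a reasonably recent QUnit is loaded on the test page; `slugify` is a hypothetical function under test):

```javascript
QUnit.test("slugify lowercases and joins words with dashes", function (assert) {
  assert.equal(slugify("Hello World"), "hello-world");
  assert.equal(slugify("  Trimmed  "), "trimmed");
});
```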
Don't go straight to CoffeeScript. Learn the core concepts of prototypes and JavaScript OO first. IMHO you can learn both at the same time, but you will benefit much more if you get vanilla JavaScript down first. Based on my personal experience, CoffeeScript's syntactic sugar for classes can be a trap if you don't understand prototypal inheritance (it's easy to get stuck on a bug).
CoffeeScript debugging is still not a completely solved matter in terms of tools; the only ways I know of are to write tests (a pain when you're just starting) or to look at the generated code (at least for the more obscure bugs).
It was odd for me too; in my case coming from a C/C++ background.
What clicked for me is that you can reduce your iteration time significantly with a few tweaks of your work environment. The idea is to reduce it enough that you can write code in small chunks and test your code very very frequently.
On the lack of compile time checks: You'll get used to it. Like significant white space, the lack of compile time type checking just melts away after a few weeks. It's hard to say how exactly, but at least I can tell you that it did happen for me.
On the lack of interfaces: that's a tricky one. It would be nice to get a little more help in larger systems to remind you to implement entire interfaces. If you're finding that you really are losing a lot of time to that, you could write your own run time checks, and insert them where appropriate. E.g. if you register your objects with a central manager, that would be a good time to ensure that the objects qualify for the role they're being submitted to.
In general, it's good to bear in mind that you have decent reflection abilities to hand.
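A rough sketch of such a run-time check, using nothing but that reflection; the names are illustrative:

```javascript
function assertImplements(obj, methodNames, role) {
  methodNames.forEach(function (name) {
    if (typeof obj[name] !== "function") {
      throw new Error(role + " is missing required method: " + name);
    }
  });
}

// e.g. at registration time with your central manager:
// assertImplements(widget, ["render", "destroy", "onResize"], "widget");
```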
On the lack of encapsulation: Given that CoffeeScript implements a very nice class wrapper over the prototype scheme, I'm assuming you mean the lack of private variables? There are actually a number of ways you can hide details from clients, if you feel the need to, and I do; usually to stop myself from shooting myself in the foot in the future. The key is usually to squirrel things away in closures.
Also, have a look at Object.__defineGetter__ / Object.defineProperty. Getters and setters can help a lot in these situations.
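A small sketch of both ideas together, closure-private state plus a read-only accessor via Object.defineProperty (names invented):

```javascript
function makeCounter() {
  var count = 0;                          // private: only reachable through the closure

  var counter = {
    increment: function () { count += 1; }
  };

  Object.defineProperty(counter, "value", {
    get: function () { return count; },   // getter only, so it's effectively read-only
    enumerable: true
  });

  return counter;
}

var c = makeCounter();
c.increment();
c.value;      // 1
c.value = 9;  // ignored (throws in strict mode); `count` stays hidden
```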
On reducing iteration time:
I was using the built-in file watcher in coffee to compile the scripts on change. Coupled with TextMate's ability to save all open files on losing focus, this meant that testing was a matter of switching from TextMate to Chrome/Firefox and hitting refresh. Quite fast.
On a Node.js project, though, I've set up my views to just compile and serve on the fly, so even the file watcher is superfluous. They're cached in release, but in debug mode they're always reloaded from disk and recompiled, and on encountering errors I just serve those up instead. So now every few minutes I switch to the browser, hit refresh, and either see my test running or the compiler errors.
I'm rendering a news feed.
I'm planning to use Backbone.js for my JavaScript stuff because I'm sick of doing manual DOM binds with jQuery.
So right now I'm looking at 2 options.
Option #1: When the user loads the page, the "news feed" container is blank. But the page triggers JavaScript which renders the items of the news feed onto the screen. This would tie into Backbone's models and collections, etc.
Option #2: When the user loads the page, the "news feed" is rendered by the server. Even if javascript was turned off, the items would still show because the server rendered it via a templating engine.
I want to use Backbone.js to keep my javascript clean. So, I should pick #1, right?? But #1 is much more complicated than #2.
By the way, the reason I'm asking this question is because I don't want to use the routing feature of Backbone.js. I would load each page individually, and use Backbone for the individual items of the page. In other words, I'm using Backbone.js halfway.
If I were to use the routing feature of Backbone.js, then the obvious answer would be #1, right? But I'm afraid it would take too much time to build the route system, and time should be balanced into my equation as well.
I'm sorry if this question is confusing: I just want to know the best practice of using Backbone.js and saving time as well.
There are advantages and disadvantages to both, so I would say this: choose the option that is best for you, according to your requirements.
I don't know Backbone.js, so I'm going to keep my answer to client- versus server-side rendering.
Client-side Rendering
This approach allows you to render your structure quickly on the server-side, then let the user's JavaScript pick up the actual content.
Pros:
Quicker perceived user experience: if there's enough static content on the initial render, then the user gets their page back (or at least the beginning of it) quicker and they won't be bothered about the dynamic content, because in all likelihood that will render reasonably quickly too.
Better control of caching: By requiring that the browser makes multiple requests, you can set up your server to use different caching headers for each URL, depending on your requirements. In this way, you could allow users to cache the initial page render, but require that a user fetch dynamic (changing) content every time.
Cons:
User must have JavaScript enabled: This is an obvious one and I shouldn't even need to mention it, but you are cutting out a (very small) portion of your user base if you don't provide a graceful alternative to your JS-heavy site.
Complexity: This one is a little subjective, but in some ways it's just simpler to have everything in your server-side language and not require so much back-and-forth. Of course, it can go both ways.
Slow post-processing: This depends on the browser, but the fact is that if a lot of DOM manipulation or other post-processing needs to occur after retrieving the dynamic content, it might be faster to let the server do it if the server is underutilized. Most browsers are good at basic DOM manipulation, but if you have to do JSON parsing, sorting, arithmetic, etc., some of that might be faster on the server.
Server-side Rendering
This approach allows the user to receive everything at once and also caters to browsers that don't have good JavaScript support, but it also means everything takes a bit longer before the browser gets the first <html> tag.
Pros:
Content appears all at once: If your server is fast, it will render everything all at once, and that's that. No messy XMLHttpRequests (does anyone still use those directly?).
Quick post-processing: Just like you wouldn't want your application layer to do sorting of a database queryset because the database is faster, you might also want to reserve a good amount of processing on the server-side. If you design for the client-side approach, it's easy to get carried away and put the processing in the wrong place.
Cons:
Slower perceived user experience: A user won't be able to see a single byte until the server's work is all done. Sure, the server is probably going to zip through it, but it's still a few extra seconds on the user's side and you would do them a favor by rendering what you can right away.
Does not scale as well because server spends more time on requests: It might be that you really want the server to finish a request quickly and move on to the next connection.
Which of these are most important to your requirements? That should inform your decision.
I don't know backbone, but here's a simple thought: if at all possible and secure, do everything on the client instead of the server. That way the server has less work to do and can therefore handle more connections and scale better.
But #1 is much more complicated than #2.
Not really. Once you get the hang of Backbone and jQuery and client-side templating (and maybe throw CoffeeScript into the mix, too), this is not really difficult. In fact, it greatly simplifies your server code, as all the display-related functions are now removed. You could even have different clients (a mobile version, for example) running against the same server.
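For a rough idea of what option #1 can look like, here's a minimal sketch where the server only returns JSON and Backbone renders it client-side; the endpoint URL, template id, and container are assumptions:

```javascript
var Feed = Backbone.Collection.extend({
  url: "/api/feed"                        // hypothetical JSON endpoint
});

var FeedView = Backbone.View.extend({
  el: "#news-feed",                       // the container that starts out blank
  template: _.template($("#feed-item-template").html()),

  initialize: function () {
    this.listenTo(this.collection, "sync", this.render);
  },

  render: function () {
    var html = this.collection.map(function (item) {
      return this.template(item.toJSON());  // one rendered chunk per feed item
    }, this).join("");
    this.$el.html(html);
    return this;
  }
});

var feed = new Feed();
new FeedView({ collection: feed });
feed.fetch();                             // fills the blank container once the JSON arrives
```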
Even if javascript was turned off, the items would still show because the server rendered it via a templating engine.
That is the important consideration here. If you want to support users without Javascript, then you need a non-JS version.
If you already have a non-JS version, you can think about whether you still need the "enhanced" version, and if you do, whether you want to re-use the server-side templating you already have coded and tested and need to maintain anyway, or duplicate the effort client-side. Duplicating it adds development cost, but as you say it may provide a superior experience and lower the load on the server (although I cannot imagine that fetching rendered data versus fetching XML data makes that much of a difference).
If you do not need to support users without Javascript, then by all means, render on the client.
I think Backbone's aim is to organize an in-page JavaScript client application. But first of all you should take a position on the next statement:
Even if javascript was turned off, the web-app still works in "post-back mode".
Is that one of your requirements? (This is not a simple requirement.) If not, then my advice is: "Do more JS". But if it is, then I believe your best friend is jQuery's load function.
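A sketch of that approach, where the server keeps rendering the feed markup and JS just swaps it into the page (the URL is an assumption):

```javascript
$("#news-feed").load("/feed/items", function (responseText, status) {
  if (status === "error") {
    // the server-rendered page still works without JS; here you might show a retry link
  }
});
```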
A note: I'm a Java programmer and there are a lot of server-side frameworks that bring the ability to write applications that work ajax-ly when JS is enabled and switch to post-backs when it isn't. I think Wicket and Echo2 are two of them, but mind you, they are server-side libraries...
I am very experienced in engineering large-scale systems, but I am still relatively new to ajax-based design. I know how to use the APIs, and I am fairly comfortable using jQuery and JavaScript as a whole, but I often find myself thinking way too hard about the overall architecture.
Right now, my current application just has JavaScript files sprinkled all over the place, all in a /js directory. Most of them use jQuery, but some use YUI or a combination of the two because the features weren't available in jQuery.
Some of my REST server methods accept normal GET methods with request parameters, but others needed much more complex POSTs to handle the incoming data (lists of lists of objects). The handling of all of my ajax stuff is a mix and mash of different methods as a result of the complexity of the data I'm dealing with.
What I'd really like is to read about how to design an ajax-based system that is very clean and elegant architecturally, and is consistent from the simplest to the most complex of cases. Does such a resource exist?
Also, any suggestions on naming conventions for JavaScript files, and conventions for ajax endpoint directory/method names?
Also, how do you deal with submitting form data? Should you use GET or POST for this?
And what about validation of form data when all the constraints are already on the server? How do you make this trivial so you're not re-doing it for each form?
What are the best ways to generate new page content when people click things, and how do you set this up so that it's easy to do over and over?
How do you deal with application-specific JavaScript files depending on each other, and manage this nicely?
I am also using Spring and Spring-MVC, but I don't expect this to make much difference. My questions are purely browser related.
There's a TL;DR summary at the end.
I can't really point you to a good resource for this as I haven't found one myself. However, all is not lost. You already have experience in developing large-scale applications and taking this knowledge into the browser-space doesn't require a lot of re-thinking.
First of all, unless your application is really trivial, I wouldn't start refactoring the entire codebase straight away because there are bound to be endless cases you haven't thought of yet.
Design the core architecture of the system you want first. In your case you probably want all your AJAX requests to go through one point. Select the XHR interface from either jQuery or YUI and write a wrapper around it that takes an option hash. All the XHR calls you write for new code go through there. This allows you to switch out the framework performing the XHR calls at any time with another framework or your own.
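As a sketch, such a choke-point might look like this; today it delegates to jQuery's $.ajax, and only this wrapper would change if you swapped the framework out (names are illustrative):

```javascript
var net = {
  request: function (opts) {
    return $.ajax({
      url: opts.url,
      type: opts.method || "POST",
      data: JSON.stringify(opts.data || {}),
      contentType: "application/json",
      dataType: "json"
    });
  }
};

// elsewhere in the app:
// net.request({ url: "/api/orders", data: { id: 42 } }).done(handleResponse);
```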
Next up, harmonize the wire protocol. I'd recommend using JSON and POST requests (POST requests have the additional benefit for form submissions of not being cached). Make a list of the different types of request/responses you need. For each of these responses, make a JS object to encapsulate them. (E.g. the form submission response is returned to the caller as a FormResponse object which has accessor functions for the validation errors, etc.) The JS overhead for this is totally trivial and makes it easy to change the JSON protocol itself without going through your widget code to change the access of the raw JSON.
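A sketch of such a wrapper; the field names are assumptions about your own protocol:

```javascript
function FormResponse(json) {
  this._json = json || {};
}

FormResponse.prototype.isValid = function () {
  return !this._json.errors || this._json.errors.length === 0;
};

FormResponse.prototype.errorsFor = function (fieldName) {
  return (this._json.errors || []).filter(function (e) {
    return e.field === fieldName;               // widget code never touches the raw JSON shape
  });
};
```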
If you're dealing with a lot of forms, make sure they all have the same structure so you can use a JS object to serialize them. Most frameworks seem to have various native functions to do this, but I'd recommend rolling your own so you don't have to deal with shortcomings.
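For instance, a bare-bones serializer like the one below; it deliberately handles only the simple cases, so the shortcomings live in one place where you can fix them once:

```javascript
function serializeForm(form) {
  var data = {};
  Array.prototype.forEach.call(form.elements, function (el) {
    if (!el.name || el.disabled) return;
    if ((el.type === "checkbox" || el.type === "radio") && !el.checked) return;
    data[el.name] = el.value;
  });
  return data;
}

// usage with the hypothetical wrapper from above:
// net.request({ url: form.action, data: serializeForm(form) });
```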
The added business value at this point is of course zero because all you have is the start of a sane way of doing things and even more JS code to load into your app.
If you have new code to write, write it on the APIs you've just implemented. That's a good way to see if you're not doing anything really stupid. Keep the other JS as it is for now but once you have to fix a bug or add a feature there, refactor that code to use your new APIs. Over time you'll find that the important code is all running on your APIs and a lot of the other stuff will slowly become obsolete.
Don't go overboard with re-inventing the wheel, though. Keep this new structure limited to data interaction and the HTTP wire and use your primary JS framework for handling anything related to the DOM, browser quirks, etc.
Also set up a global logger object and don't use console directly. Have your logger object use console or a custom DOM logger or whatever you need in different environments. That makes it easy to build in custom log levels, log filters, etc. Obviously you have to set up your build environment to scrub that code out for production builds (you do have a build process for this, right?)
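A sketch of what that logger facade might look like; the levels and output target are placeholders, and a production build would scrub or no-op this:

```javascript
var logger = {
  level: "debug",                                    // "debug" | "info" | "warn" | "error"
  _rank: { debug: 0, info: 1, warn: 2, error: 3 },

  log: function (level, msg) {
    if (this._rank[level] >= this._rank[this.level] && window.console) {
      console.log("[" + level + "] " + msg);         // could just as well write to a DOM panel
    }
  },

  debug: function (msg) { this.log("debug", msg); },
  info:  function (msg) { this.log("info", msg); },
  warn:  function (msg) { this.log("warn", msg); },
  error: function (msg) { this.log("error", msg); }
};
```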
My personal favorite for relatively sane JS source-code layout and namespacing is the Dojo framework. Object definitions relative to their namespace have obvious relations to their location on disk, there's a build system in place for custom builds, third-party modules, etc. The Dojo dependency/build system depends on dojo.require and dojo.provide statements in the code. When running on source a dojo.require statement will trigger the blocking load of the resource in question. For production the build system follows these statements and inserts the resource into the final bundle at that location. The documentation is a bit sparse, but it's definitely a good start for inspiration.
The TL;DR answer is,
Have all XHR calls go through a single interface
Don't pass raw response data back to higher levels
Do gradual refactoring of existing code
Harmonize the wire protocol
Use the dynamic power of JS for building light-weight proxy code
Sane code structure and call graphs look the same in JS as they do in other languages
Ajax Patterns is a pretty awesome book/site: http://ajaxpatterns.org/
Otherwise you may want to check out Advanced Ajax Architecture Best Practices
How you go about designing your site should be based on the features and size of your app. I would keep your focus there as opposed to looking for an architecture that works for all cases.
The previous programmer left the website in a pretty unusable state, and I am having difficulty modifying anything. I am new to web design, so I don't know whether my skills are a mismatch for this kind of job, or whether it is normal in the real industry to have websites like these.
The home page includes three frames.
Each of these frames has its own JavaScript functions (between the <head> tags), and also calls other common JavaScript functions (using <script src=...>).
Excessive use of document.all; in fact, elements are referenced or accessed via document.all only.
Excessive use of XSLT and web services. Though I know that using web services is generally considered a good design choice, is there any other way I can consume these services other than using XSLT? For example, the menu is created using the data returned by a web method.
Every <div>, <td> and every other element has an id, and these ids are manipulated by the JavaScript functions, and then the appropriate web service and XSLT files are loaded based on them.
From a security perspective, he used T-SQL's FOR XML AUTO for most of the data returned by the web service. Is it a good choice, security-wise, to expose table names and column names to the end user?
I am quite confused about the state of the application itself. Should I learn about the intricacies that he has developed and continue working on it, or should I start rewriting everything? What perplexes me most is the lack of alternatives, and whether this is the common way web projects are handled in the real world or an exception.
Any suggestions, any pointers are welcome. Thanks
No, it is not acceptable in this industry that people keep writing un-maintainable code.
My advice to you is to go up the chain and convince everyone that this needs to be rewritten. If they question you, find an external consultant with relevant web development skills to review the application (for 1 day).
Keeping this website as-is because it 'works' is like keeping a working Ford Model T on today's highways: very dangerous. Security and maintenance costs are likely the most persuasive topics to convince anyone against keeping this site as-is.
Next, get yourself trained; it will pay off if you can rewrite this application knowing the basics. Today's technology (ASP.NET MVC) allows you to deliver core business value faster than trying to maintain this unconventionally written app.
Tough spot for an inexperienced developer (or any developer) to be left in. I think you have a few hard weeks ahead of you where you really need to read up on the technologies involved to get a better understanding of them and what best practice looks like. You will also need to really dig down into the existing code to understand how it all hangs together.
When you've done all that, you really need to think about your options. Usually re-writing something from scratch (especially if it actually works) is a bad idea. This obviously depends on the size of the project; for smaller projects with only a couple of thousand lines of code it might be OK. When looking at someone else's code it is also easy to overlook that all that weird shit going on could actually be fixes for valid requirements. Things often start out looking neat, but then the real world comes visiting.
You will need to present the business with time estimates for re-writing to see if that is an option at all, but I'm guessing you will need to accept the way things are and do your best with what you have. Maybe you could gradually improve things.
I would recommend moving the project to MVC3 and rewriting the XSLT portions to work as views and/or partial views with MVC. The Razor model binding syntax is very clean and should let you quickly cleave out the dirty XSLT code and be left with just the model properties you need.
I would then have those web services invoked from MVC server-side, and have you deserialize the object results into real objects (or even just use straight XQuery or JSON traversal to directly pull stuff out for your model) and bind those to your views.
This could be a rather gargantuan leap for technology at your company though. Some places have aversion to change.
I'd guess this was written 6-7 years ago, and hacked on since then. Every project accumulates a certain amount of bubble gum and duct tape. Sounds like this one's got it bad. I suggest breaking this up into bite size chunks. I assume that the site is actually working right now? So you don't want to break anything, the "business" often thinks "it was working just fine when the last guy was here."
Get a feel for your biggest pain points in maintaining the project, and what you'll get the biggest wins from fixing. A rewrite is great, if you have the time and support. But if it's a complex site, there's a lot to be said for a mature application. Mature in the sense that it fulfills the business needs, not that it's good code.
Also, working on small parts will get you better acquainted with the project and the business needs, so when you start the rewrite you'll have a better perspective.