My program processes many different types of documents to extract information from them.
It has a very generic structure to fit the hundreds of different types/formats of docs we use.
Processor Code :
Processor.prototype.process = function(){
var self = this;
var fields = self.processor_config;
var p = {};
for(var key in fields){
if(fields.hasOwnProperty(key)){
p[key]=self._processKey(fields[key]);
if(typeof(p[key])=='undefined' || p[key]===''){
self.emit('warning', {
type: 'Problem parsing Key',
msg: 'Key : '+key,
doc: self.docName
});
}
}
}
self.emit('extracted',p);
};
The _processKey() function then sorts out what to do based on the "type" field :
Processor.prototype._processKey = function(cnf) {
var self = this;
var value;
if(cnf.type=="css"){
value = self._processCSSKey(cnf);
}else if(cnf.type=="regexp"){
value = self._processRegexpKey(cnf);
}else if(cnf.type=="custom"){
value = self._processCustomKey(cnf);
}
return value;
};
Information on what to extract from each type of document comes from a mongodb collection :
Fictional config with 2 types of fields :
doc_name: "FormK7",
processor_config: {
company:{
type:"css",
selector:"p",
ord:3,
attr:{type:"text",parser:""}
},
litigation:{
type:"custom",
func:"(function(){var a =['123'];return a})()"
},
}
The example above is pretty useless, but the real thing has more complicated functions (not much though).
My custom processor looks like :
Processor.prototype._processCustomKey = function(cnf) {
var value = eval(cnf.func);
return value;
};
My problem is that I haven't found a way to process custom keys without using eval. And yet the simple mention of 'eval' brings to my mind pictures of an angry Douglas Crockford banishing me to a dark oblivion for all eternity...
Additional info :
In real life, the function stored in mongo is minified.
The generic processor is needed as having one processor per document type would be extremely wasteful (formats change all the time, there's hundreds of them and some are used once only...). So these functions need to be recorded in some way. And they are too different to be hardcoded...
There is no user input and the app is not accessible from the web. A malicious user would have access to the server before being able to inject code in mongo, so the security concern is extremely low.
So the question is the following :
Is eval really evil in this case, or is it a valid use case? Is there a better way/best practice to handle this?
You could create a module that exports the code from mongo to a temporary file and then use require to import that file. That way you won't be calling eval and, like Bergi mentioned, it would avoid the scope problems that eval has.
Also, once you export it, and before requiring it, you could use something like Esprima to create an AST of the code and analyse it to check that it complies with whatever criteria you may have. (For example, you could forbid the use of this in the imported code as a safety measure; that's one of the rules ADsafe has.)
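A minimal sketch of the temporary-file idea, assuming the stored value is a single expression as in the fictional config above. The helper name loadCustomKey and the file names are made up for illustration. Note that the code is still executed; what you gain over eval is that it runs in its own module scope instead of inside the processor's scope.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical helper: wraps a stored expression in a module, writes it to a
// temporary file, and loads it with require() instead of eval().
function loadCustomKey(source) {
    var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'custom-key-'));
    var file = path.join(dir, 'key.js');
    fs.writeFileSync(file, 'module.exports = function () { return ' + source + '; };\n');
    try {
        // The loaded module only sees its own scope, not the caller's locals.
        return require(file);
    } finally {
        // The module stays in require's cache, so the file can be removed.
        fs.rmSync(dir, { recursive: true, force: true });
    }
}

// The "litigation" function from the fictional config above.
var extract = loadCustomKey("(function(){var a =['123'];return a})()");
var value = extract();
```

An AST check (with Esprima or similar) would slot in between writing the source and requiring it.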
Related
So, I'm a big fan of creating global namespaces in javascript. For example, if my app is named Xyz I normally have an object XYZ which I fill with properties and nested objects, for example:
XYZ.Resources.ErrorMessage // = "An error while making request, please try again"
XYZ.DAL.City // = { getAll: function() { ... }, getById: function(id) { .. } }
XYZ.ViewModels.City // = { .... }
XYZ.Models.City // = { .... }
I sort of picked this up while working on a project with Knockout, and I really like it because there are no wild references to some objects declared in god-knows-where. Everything is in one place.
Now. This is ok for front-end, however, I'm currently developing a basic skeleton for a project which will start in a month, and it uses Node.
What I wanted was, instead of all the requires in .js files, I'd have a single object ('XYZ') which would hold all requires in one place. For example:
Instead of:
// route.js file
var cityModel = require('./models/city');
var cityService = require('./services/city');
app.get('/city', function() { ...........});
I would make an object:
XYZ.Models.City = require('./models/city');
XYZ.DAL.City = require('./services/city');
And use it like:
// route.js file
var cityModel = XYZ.Models.City;
var cityService = XYZ.DAL.City;
app.get('/city', function() { ...........});
I don't really have in-depth knowledge but all of the requires get cached and are served, if cached, from memory so re-requiring in multiple files isn't a problem.
Is this an ok workflow, or should I just stick to the standard procedure of referencing dependencies?
edit: I forgot to say, would this sort-of-factory pattern block the main thread, or delay the starting of the server? I just need to know what are the downsides... I don't mind the requires in code, but I just renamed a single folder and had to go through five files to change the paths... Which is really inconvenient.
I think that's a bad idea, because you are going to load a ton of modules up front every single time, and you may not always need them. Your namespaced object will get quite monstrous. require checks the module cache first, so I'd use standard requires in each script that needs them on the server.
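To illustrate the caching point, here is a small sketch that writes a throwaway module to a temporary directory (the file name city.js and the counter global.cityLoadCount are invented for the demo) and requires it twice. The module body runs once and both calls return the same object.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Create a throwaway module that counts how many times its body is executed.
var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'require-cache-'));
var file = path.join(dir, 'city.js');
fs.writeFileSync(file,
    'global.cityLoadCount = (global.cityLoadCount || 0) + 1;\n' +
    'module.exports = { name: "city model" };\n');

var first = require(file);   // loads and executes the file
var second = require(file);  // served from require's module cache

var sameObject = (first === second);
var loadCount = global.cityLoadCount;

fs.rmSync(dir, { recursive: true, force: true });
```

So repeating the same require in several files costs little at runtime; the trade-off discussed above is about loading modules you may never use, not about repeated requires.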
Based on a parameter, a function should select one JSON file out of more than 100 and fire a query to another system.
There will be hundreds of such queries.
Obviously if/else and switch won't be manageable. I took a look at the strategy pattern in JavaScript.
var queryCode = req.param('queryCode');
if(queryCode == 'x'){
//do something
} else if( queryCode == 'y'){
//do something
} else if( queryCode == 'z') {
//do something
}
// do something might become large sometimes...
so I want to replace it with something like the strategy pattern. Which would be the best design?
Thanks in advance for any suggestion for this problem.
First of all, your concern is really good, if/else chains are evil.
When you have several different behaviors (maybe long, maybe unrelated) and you have to choose one at runtime based on the value of some variable, it makes no sense to build a long if/else chain. It would be hard to maintain, it is risky, and it most likely mixes responsibilities in the same class. Adding a new behavior is also messy: it means modifying an already tested, working class and giving it yet another responsibility, which can break existing behavior.
You were right to mention the Strategy pattern; it is the one that best fits your problem. You can also take a look at the Command pattern, but the general concept is the same: encapsulate the different behaviors in separate classes.
Then you can use a Factory to retrieve the correct strategy to use.
In a nutshell, you will have a bunch of strategy classes, all implementing the same method, let's say execute:
//strategyA.js
function StrategyA(){
}
StrategyA.prototype = {
execute: function() {
//custom behavior here
}
}
module.exports = StrategyA;
//strategyB.js
function StrategyB(){
}
StrategyB.prototype = {
execute: function() {
//custom behavior here
}
}
module.exports = StrategyB;
Then you create the factory, which creates the correct strategy according to a parameter. Ideally the value->class mapping would live in a config file and be registered with the factory, but for simplicity you can hardcode it in the same file. Something like this:
//factory.js
var StrategyA = require('./strategyA.js'),
    StrategyB = require('./strategyB.js');
var _ = require('underscore'); // assuming you have underscore

// maps the parameter value to the strategy class that handles it
var serviceDescriptions = [
    { name: 'a', service: StrategyA },
    { name: 'b', service: StrategyB }
];

module.exports = {
    // returns a new instance of the strategy registered under `name`
    getStrategy: function (name) {
        // assuming you have underscore; otherwise, just iterate the array to look for the proper service
        var description = _.find(serviceDescriptions, { name: name });
        if (!description) {
            throw new Error('No strategy registered for: ' + name);
        }
        return new description.service();
    }
};
With all this, getting started is more complex, but especially if you have a lot of different strategies, or will have to add more in the future, it is a good investment in the medium term. Your main code will then be as simple as:
var Factory = require("./factory.js");
...
var queryCode = req.param('queryCode');
var strategy = Factory.getStrategy(queryCode);
strategy.execute()
So, no matter how many different behaviors you have, or how long or complex or different they are, your main class will always look the same, simple and easy to follow.
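If separate classes feel heavy for your case, the same lookup can be sketched with a plain object keyed by query code. This is a simplified variant, not the factory above; the strategy bodies and return values here are placeholders.

```javascript
// Hypothetical strategies; each one only needs an execute() method.
var strategies = {
    x: { execute: function () { return 'ran query x'; } },
    y: { execute: function () { return 'ran query y'; } },
    z: { execute: function () { return 'ran query z'; } }
};

// Looks up the strategy for a query code and fails loudly for unknown codes.
function getStrategy(queryCode) {
    // hasOwnProperty guards against inherited keys such as 'toString'.
    if (!Object.prototype.hasOwnProperty.call(strategies, queryCode)) {
        throw new Error('Unknown query code: ' + queryCode);
    }
    return strategies[queryCode];
}

var result = getStrategy('y').execute();
```

Adding a new behavior then means adding one entry to the map, without touching the calling code.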
I am reading through the Meteor example app "todos" for learning purposes.
They define some all-caps variables and use them with Session.
It's defined at the first line:
var EDITING_KEY = 'EDITING_TODO_ID';
And used many times. For example:
Template.todosItem.helpers({
//...
editingClass: function() {
return Session.equals(EDITING_KEY, this._id) && 'editing';
}
});
Template.todosItem.events({
'blur input[type=text]': function(event) {
if (Session.equals(EDITING_KEY, this._id))
Session.set(EDITING_KEY, null);
},
//...
});
What is it and what makes it special?
EDITING_KEY is a file-scoped "constant" defined in todos-item.js used to reference the currently edited item minimongo _id in the global reactive persistent client-side dictionary Session.
It is used to avoid having to write the same string again and again everywhere, in this case 'EDITING_TODO_ID'. Writing it everywhere can lead to dumb bugs caused by typos, like your templates not updating because you wrote 'EDITNG' instead of 'EDITING'.
Since Session simply needs a string as first parameter, these two lines do the very same thing :
Session.get(EDITING_KEY)
Session.get('EDITING_TODO_ID')
The example project uses this multiple times to avoid bugs and make auto-completion nicer.
You can see some more examples in other files, such as at the top of app-body.js :
var MENU_KEY = 'menuOpen';
Session.setDefault(MENU_KEY, false);
var USER_MENU_KEY = 'userMenuOpen';
Session.setDefault(USER_MENU_KEY, false);
var SHOW_CONNECTION_ISSUE_KEY = 'showConnectionIssue';
Session.setDefault(SHOW_CONNECTION_ISSUE_KEY, false);
You could go further and define those in a global key registry that would make sure there is no duplicated key, for example with an underlying Set. That could be a fun exercise.
Since Meteor now supports ES2015 this should be rewritten to const EDITING_KEY = 'EDITING_TODO_ID' to avoid overwriting it by accident.
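A rough sketch of what such a key registry could look like, using a Set to reject duplicates. The names registeredKeys and registerKey are invented here and are not part of Meteor or the todos example.

```javascript
// Hypothetical registry: every Session key is registered once, so a duplicate
// is reported when the file loads rather than surfacing as a subtle bug later.
var registeredKeys = new Set();

function registerKey(name) {
    if (registeredKeys.has(name)) {
        throw new Error('Duplicate Session key: ' + name);
    }
    registeredKeys.add(name);
    return name; // the key itself is still just a string
}

var EDITING_KEY = registerKey('EDITING_TODO_ID');
var MENU_KEY = registerKey('menuOpen');
```

The returned values are ordinary strings, so they can be passed to Session.get and Session.set exactly as before.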
This appears just to be a variable that tracks what to-do is currently being edited. There's nothing special about it being in all-caps.
Are there any dangers/caveats one should be aware of when creating JavaScript namespaces?
Our project is fairly expansive and we are running a lot of JavaScript files (20+, expecting more). It is impossible to have any code maintainability without using namespaces, so we are implementing them like so:
var namespace1 = {
doSomething: function() {
...
},
doSomethingElse: function() {
...
}
}
And then to create hierarchies, we link them like so:
var globalNamespace = {
functions1: namespace1,
functions2: namespace2,
...
}
This works fine, but it is essentially a "trick" to make JS behave as if it did have namespaces. Although this method gets used a lot, most literature on this seems to focus on how to do it, and not whether there are any possible drawbacks. As we write more JS code, this is quickly becoming an integral part of the way our system works. So it's important that it works seamlessly.
Were there any situations in which this "induced" namespace system caused you errors, or otherwise needed special attention? Can we safely expect identical behaviour across all browsers?
The way you define namespaces in your example it appears to create globals out of each namespace so you end up with
window.namespace1
window.namespace2
window.globalNamespace
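window.globalNamespace.functions1
window.globalNamespace.functions2
So window.namespace1 and window.globalNamespace.functions1 refer to the same object: anything that modifies window.namespace1 in place also changes window.globalNamespace.functions1, and every namespace is still exposed as its own global that other scripts can clobber.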
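A small sketch of how the two references behave, using the names from the question and plain variables instead of window so it also runs outside a browser:

```javascript
var namespace1 = { doSomething: function () { return 'original'; } };
var globalNamespace = { functions1: namespace1 };

// Mutating the shared object is visible through both references.
namespace1.doSomething = function () { return 'patched'; };
var afterMutation = globalNamespace.functions1.doSomething();

// Reassigning the variable only rebinds that name; the hierarchy keeps the old object.
namespace1 = {};
var stillThere = typeof globalNamespace.functions1.doSomething;
```

Here afterMutation reflects the patched function, while stillThere shows that the hierarchy still holds the original object after the reassignment.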
edit:
Here's how we got around this problem:
var namespacing = {
    // creates each level of a dotted namespace on window, keeping levels that already exist
    init: function(namespace) {
        var curSpace = window;
        namespace.split('.').forEach(function(space) {
            if (typeof curSpace[space] === 'undefined') {
                curSpace[space] = {};
            }
            curSpace = curSpace[space];
        });
    }
};
Then you use it like this:
namespacing.init('globalNamespace.namespace1');
globalNamespace.namespace1.doSomething = function() { ... };
This way you don't have to introduce new global variables and you can confidently add to an existing namespace without clobbering other objects in it.
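The same idea can be sketched as a function that takes the root object as a parameter, which makes it easy to try outside a browser. The name ensureNamespace and the root argument are assumptions for this example; in a browser you would pass window.

```javascript
// Hypothetical helper: walks a dotted path from `root`, creating only the
// levels that are missing, and returns the innermost object.
function ensureNamespace(root, dottedPath) {
    return dottedPath.split('.').reduce(function (current, name) {
        if (typeof current[name] === 'undefined') {
            current[name] = {};
        }
        return current[name];
    }, root);
}

var root = {}; // stands in for `window` in a browser

ensureNamespace(root, 'globalNamespace.namespace1').doSomething = function () { return 1; };

// A second call for a sibling leaves the existing namespace untouched.
ensureNamespace(root, 'globalNamespace.namespace2');
var kept = typeof root.globalNamespace.namespace1.doSomething;
```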
Since you are basically adding functions to objects and those objects into other objects, I would expect each browser to handle this the same way.
But if you want modularity, why not use a (relatively) simple framework like require.js? That will allow you and your team to write code in a modular fashion and allows the team to 'import' these modules where needed:
require(["helper/util"], function() {
//This function is called when scripts/helper/util.js is loaded.
});
Require.js will take care of dependencies, and it will also prevent polluting the global namespace.
We use a similar system at work and it does the job just fine. I don't see any drawbacks there could be; it's just objects and properties. For that same reason, cross browser compatibility should be good. You can end up having to write some long names to resolve to a particular function, like Foo.Bar.Test.Namespace2.Function, but even then that can be solved by assigning it to a variable beforehand.
This is how I'd recommend doing it, so you stay out of the global scope entirely except for your "base" namespace. We do something similar where I work. Let's say you work for Acme co, and want ACME to be your base namespace.
At the top of every file, you'd include:
if (!window.ACME) { window.ACME = {} }
Then you just go and define whatever you want in terms of that.
ACME.Foo = {
bar: function () { console.log("baz"); }
}
If you want a deeper level of namespace, you just do the same thing for each level.
if (!window.ACME) { window.ACME = {} }
if (!ACME.Foo) { ACME.Foo = {} }
This way each file can be tested independently and they'll set up the namespace infrastructure automatically, but when you compile them together or if you test multiple files simultaneously, they won't keep overwriting things that are already defined.
I have some constants in JavaScript that I'd like to reuse in several files while saving typing, reducing bugs from mistyping, keeping runtime performance high, and being useful on either the node.js server scripts or on the client web browser scripts.
example:
const cAPPLE = 17;
const cPEAR = 23;
const cGRAPE = 38;
...(some later js file)...
for...if (deliciousness[i][cAPPLE] > 45) ...
Here are some things I could do:
1 copy/paste const list to top of each file where used. Oh, Yuck. I'd rather not. This is compatible with keeping the constant names short and simple. It violates DRY and invites all sorts of awful bugs if anything in the list changes.
2 constant list ---> const.js
on browser, this is FINE ... script gets fed in by the html file and works fine.
but on node.js, the require mechanism changes the constant names, interfering with code reuse and requiring more typing, because of how require works....
AFAIK This doesn't work, by design, in node.js, for any const.js without using globals:
require('./const.js');
for...if...deliciousness[i][cAPPLE] > 45 ...;
This is the node.js way:
(... const.js ....)
exports.APPLE = 17;
(... dependency.js ... )
var C = require('./const.js');
for...if...deliciousness[i][C.APPLE] > 45.....
so I would either have to have two files of constants, one for the node.js requires and one for the browser, or I have to go with something further down the list...
3 make the constants properties of an object to be imported ... still needs two files... since the node.js way of importing doesn't match the browser. Also makes the names longer and probably takes a little more time to do the lookups which as I've hinted may occur in loops.
4 External constant list, internal adapter.... read the external constants, however stored, into internal structure in each file instead of trying to use the external list directly
const.js
exports.cAPPLE = 17
browser.js
const cAPPLE = exports.cAPPLE;
...code requiring cAPPLE...
node.js
var CONST = require('./const.js');
const cAPPLE = CONST.cAPPLE;
...code requiring cAPPLE...
This requires a one-time-hit per file to write the code to extract the constants back out, and so would duplicate a bunch of code over and over in a slightly more evolved cut and paste.
It does allow the code requiring cAPPLE to continue to work using short constant names.
Are there any other solutions, perhaps a more experienced JavaScripter might know, that I might be overlooking?
module.exports = Object.create({},{
"foo": { value:"bar", writable:false, enumerable:true }
});
Properties are not writable. Works in strict mode, unlike "const" in pre-ES2015 engines.
I would just make them global keys:
...(module const.js)...
global.APPLE = 17;
global.PEAR = 23;
global.GRAPE = 38;
...(some later js file)...
require('./const.js'); // run once for its side effect of defining the globals
for (var i = 0; i < something.length; i++) {
if (deliciousness[i][global.APPLE] > 45) { blah(); }
}
They wouldn't be enforced constants, but if you stick to the ALL_CAPS naming convention for constants it should be apparent that they shouldn't be altered. And you should be able to reuse the same file for the browser if you include it and use it like so:
<script>var global = {};</script>
<script src="const.js"></script>
<script>
if (someVar > global.GRAPE) { doStuff(); }
</script>
You can make an object unwritable using Object.freeze .
var configs={
ENVIRONMENT:"development",
BUILDPATH:"./buildFiles/",
}
Object.freeze(configs);
module.exports=configs;
Then you can use it as a constant:
var config=require('config');
// config.BUILDPATH now acts as a constant and is not writable.
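Tying this back to the original question, here is a sketch of a single constants file that could serve both node.js and the browser. The global name CONSTS and the typeof module check are assumptions for this example, not something from the answers above.

```javascript
// const.js (hypothetical single file shared by Node and the browser)
var CONSTS = Object.freeze({ cAPPLE: 17, cPEAR: 23, cGRAPE: 38 });

// In Node, export the frozen object; in a browser <script>, the top-level
// `var CONSTS` is already visible to later scripts as a global.
if (typeof module !== 'undefined' && module.exports) {
    module.exports = CONSTS;
}

// Consumer side: alias once so hot loops can use a short local name.
var cAPPLE = CONSTS.cAPPLE;

// Writing to a frozen object throws a TypeError in strict mode
// (in sloppy mode the write is silently ignored).
function tryToOverwrite() {
    'use strict';
    CONSTS.cAPPLE = 99;
}
```

In a Node consumer you would write var CONSTS = require('./const.js'); before the alias line, and in the browser the alias line works as-is after the script tag.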