I'm building an app which has a feature for embedding expressions/rules in a config yaml file. So, for example, a user can reference a variable defined in the yaml file like ${variables.name == 'John'} or ${is_equal(variables.name, 'John')}. I can probably get by with simple expressions, but I want to support complex rules/expressions such as ${variables.name == 'John'} and (${variables.age > 18} OR ${variables.adult == true})
I'm looking for a parsing/dsl/rules-engine library that can support these types of expressions and normalize them. I'm open to using ruby, javascript, java, or python if anyone knows of a library for those languages.
One option I thought of was to just support javascript as conditions/rules and basically pass it through eval with the right context set up, with access to variables and other reference-able vars.
I don't know if you use Golang or not, but if you do, I recommend https://github.com/antonmedv/expr.
I have used it for parsing bot strategies (for a stock options bot). This is from my unit test:
func TestPattern(t *testing.T) {
    a := "pattern('asdas asd 12dasd') && lastdigit(23asd) < sma(50) && sma(14) > sma(12) && ( macd(5,20) > macd_signal(12,26,9) || macd(5,20) <= macd_histogram(12,26,9) )"

    r, _ := regexp.Compile(`(\w+)(\s+)?[(]['\d.,\s\w]+[)]`)
    indicator := r.FindAllString(a, -1)

    t.Logf("%v\n", indicator)
    t.Logf("%v\n", len(indicator))

    for _, i := range indicator {
        t.Logf("%v\n", i)
        if strings.HasPrefix(i, "pattern") {
            r, _ = regexp.Compile(`pattern(\s+)?\('(.+)'\)`)
            check1 := r.ReplaceAllString(i, "$2")
            t.Logf("%v\n", check1)
            r, _ = regexp.Compile(`[^du]`)
            check2 := r.FindAllString(check1, -1)
            t.Logf("%v\n", len(check2))
        } else if strings.HasPrefix(i, "lastdigit") {
            r, _ = regexp.Compile(`lastdigit(\s+)?\((.+)\)`)
            args := r.ReplaceAllString(i, "$2")
            r, _ = regexp.Compile(`[^\d]`)
            parameter := r.FindAllString(args, -1)
            t.Logf("%v\n", parameter)
        } else {
        }
    }
}
Combine it with regex and you have a good (if not great) string translator.
And for Java, I personally use https://github.com/ridencww/expression-evaluator, though not in production. It has similar features to the library above.
It supports many conditions, and you don't have to worry about parentheses and brackets.
Assignment    =
Operators     + - * / DIV MOD % ^
Logical       < <= == != >= > AND OR NOT
Ternary       ? :
Shift         << >>
Property      ${<id>}
DataSource    #<id>
Constants     NULL PI
Functions     CLEARGLOBAL, CLEARGLOBALS, DIM, GETGLOBAL, SETGLOBAL, NOW, PRECISION
Hope it helps.
You might be surprised to see how far you can get with a syntax parser and 50 lines of code!
Check out an AST explorer (for example astexplorer.net): the Abstract Syntax Tree (AST) on the right represents the code on the left in nice data structures. You can use these data structures to write your own simple interpreter.
I wrote a little example of one:
https://codesandbox.io/s/nostalgic-tree-rpxlb?file=/src/index.js
Open up the console (button in the bottom), and you'll see the result of the expression!
This example can only handle (||) and (>), but looking at the code (line 24), you can see how you could make it support any other JS operator. Just add a case to the branch, evaluate the sides, and do the calculation in JS.
Parentheses and operator precedence are all handled by the parser for you.
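If you want a self-contained version of the same idea, here is a minimal sketch assuming the acorn parser; the linked sandbox may use a different parser, and the handled operators are just examples:

// A minimal expression interpreter over an ESTree AST, assuming the acorn
// parser (any ESTree-compatible parser works the same way).
const acorn = require("acorn");

function evaluate(node, vars) {
  switch (node.type) {
    case "Program":
      return evaluate(node.body[0], vars);
    case "ExpressionStatement":
      return evaluate(node.expression, vars);
    case "LogicalExpression":
      // reuse JS's own short-circuiting for || and &&
      return node.operator === "||"
        ? evaluate(node.left, vars) || evaluate(node.right, vars)
        : evaluate(node.left, vars) && evaluate(node.right, vars);
    case "BinaryExpression": {
      const left = evaluate(node.left, vars);
      const right = evaluate(node.right, vars);
      if (node.operator === ">") return left > right;
      if (node.operator === "==") return left == right;
      throw new Error("Unsupported operator: " + node.operator);
    }
    case "Identifier":
      return vars[node.name]; // variable lookup from your own context
    case "Literal":
      return node.value;
    default:
      throw new Error("Unsupported node type: " + node.type);
  }
}

const ast = acorn.parse("age > 18 || adult", { ecmaVersion: 2020 });
console.log(evaluate(ast, { age: 16, adult: true })); // => true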
I'm not sure if this is the solution for you, but it will for sure be fun ;)
> One option I thought of was to just support javascript as conditions/rules and basically pass it through eval with the right context setup with access to variables and other reference-able vars.
I would personally lean towards something like this. If you are getting into complexities such as logical comparisons, a DSL can become a beast, since at that point you are basically writing a compiler and a language. You might want to not have a config at all, and instead have the configurable file just be JavaScript (or whatever language) that can be evaluated and then loaded. Then whoever your target audience is for this "config" file can just supply logical expressions as needed.
The only reason I would not do this is if this configuration file was being exposed to the public or something, but in that case security for a parser would also be quite difficult.
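As a rough sketch of that approach (the file names and the rule shape here are hypothetical, not any particular library's API):

// rules.js -- the "config" is just a JavaScript module the user edits
module.exports = {
  isEligible: (variables) =>
    variables.name === "John" &&
    (variables.age > 18 || variables.adult === true),
};

// app.js -- load and evaluate it like any other code
const rules = require("./rules");
console.log(rules.isEligible({ name: "John", age: 21, adult: false })); // => true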
I did something like that once, you can probably pick it up and adapt it to your needs.
TL;DR: thanks to Python's eval, doing this is a breeze.
The problem was to parse dates and durations in textual form. What I did was to create a yaml file mapping regex patterns to results. The mapping itself was a python expression that would be evaluated with the match object, and it had access to other functions and variables defined elsewhere in the file.
For example, the following self-contained snippet would recognize times like "l'11 agosto del 1993" (Italian for "August 11th, 1993").
__meta_vars__:
  month: (gennaio|febbraio|marzo|aprile|maggio|giugno|luglio|agosto|settembre|ottobre|novembre|dicembre)
  prep_art: (il\s|l\s?'\s?|nel\s|nell\s?'\s?|del\s|dell\s?'\s?)
  schema:
    date: http://www.w3.org/2001/XMLSchema#date

__meta_func__:
  - >
    def month_to_num(month):
        """ gennaio -> 1, febbraio -> 2, ..., dicembre -> 12 """
        try:
            return index_in_or(meta_vars['month'], month) + 1
        except ValueError:
            return month

Tempo:
  - \b{prep_art}(?P<day>\d{{1,2}}) (?P<month>{month}) {prep_art}?\s*(?P<year>\d{{4}}): >
      '"{}-{:02d}-{:02d}"^^<{schema}>'.format(match.group('year'),
                                              month_to_num(match.group('month')),
                                              int(match.group('day')),
                                              schema=schema['date'])
__meta_func__ and __meta_vars__ (not the best names, I know) define functions and variables that are accessible to the match transformation rules. To make the rules easier to write, the pattern is formatted using the meta-variables, so that {month} is replaced with the pattern matching all months. The transformation rule calls the meta-function month_to_num to convert the month to a number from 1 to 12, and reads from the schema meta-variable. In the example above, the match results in the string "1993-08-11"^^<http://www.w3.org/2001/XMLSchema#date>, but some other rules produce a dictionary.
Doing this is quite easy in Python, as you can use exec to evaluate strings as Python code (obligatory warning about security implications). The meta-functions and meta-variables are evaluated and stored in a dictionary, which is then passed to the match transformation rules.
The code is on github, feel free to ask any questions if you need clarifications. Relevant parts, slightly edited:
class DateNormalizer:

    def _meta_init(self, specs):
        """ Reads the meta variables and the meta functions from the specification

        :param dict specs: The specifications loaded from the file
        :return: None
        """
        self.meta_vars = specs.pop('__meta_vars__')

        # compile meta functions in a dictionary
        self.meta_funcs = {}
        for f in specs.pop('__meta_funcs__'):
            exec f in self.meta_funcs

        # make meta variables available to the meta functions just defined
        self.meta_funcs['__builtins__']['meta_vars'] = self.meta_vars

        self.globals = self.meta_funcs
        self.globals.update(self.meta_vars)

    def normalize(self, expression):
        """ Find the first matching part in the given expression

        :param str expression: The expression in which to search the match
        :return: Tuple with (start, end), category, result
        :rtype: tuple
        """
        expression = expression.lower()
        for category, regexes in self.regexes.iteritems():
            for regex, transform in regexes:
                match = regex.search(expression)
                if match:
                    result = eval(transform, self.globals, {'match': match})
                    start, end = match.span()
                    return (first_position + start, first_position + end), category, result
Here are some categorized Ruby options and resources:
Insecure
Pass expression to eval in the language of your choice.
It must be mentioned that eval is technically an option, but extraordinary trust must exist in its inputs and it is safer to avoid it altogether.
Heavyweight
Write a parser for your expressions and an interpreter to evaluate them
A cost-intensive solution would be implementing your own expression language: design a grammar for your expression language, implement a parser for it, and write an interpreter to execute the code that's parsed.
Some Parsing Options (ruby)
Parslet
TreeTop
Citrus
Roll-your-own with StringScanner
Medium Weight
Pick an existing language to write expressions in and parse / interpret those expressions.
This route assumes you can pick a known language to write your expressions in. The benefit is that a parser likely already exists for that language to turn it into an Abstract Syntax Tree (data structure that can be walked for interpretation).
A ruby example with the Parser gem
require 'parser/current'

class MyInterpreter
  # https://whitequark.github.io/ast/AST/Processor/Mixin.html
  include ::Parser::AST::Processor::Mixin

  def on_str(node)
    node.children.first
  end

  def on_int(node)
    node.children.first.to_i
  end

  def on_if(node)
    expression, truthy, falsey = *node.children
    if process(expression)
      process(truthy)
    else
      process(falsey)
    end
  end

  def on_true(_node)
    true
  end

  def on_false(_node)
    false
  end

  def on_lvar(node)
    # look up a variable by name=node.children.first
  end

  def on_send(node, &block)
    # allow things like ==, string methods? whatever
  end

  # ... etc
end

ast = Parser::CurrentRuby.parse(<<~RUBY)
  name == 'John' && adult
RUBY

MyInterpreter.new.process(ast)
# => true
The benefit here is that the parser and syntax are predetermined, and you can interpret only what you need to (and prevent malicious code from executing by controlling what on_send and on_const allow).
Templating
This is more markup-oriented and possibly doesn't apply, but you could find some use in a templating library, which parses expressions and evaluates them for you. Controlling and supplying variables to the expressions is possible depending on the library you use. The output of the expression could be checked for truthiness; a rough sketch follows the list below.
Liquid
Jinja
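For illustration, Liquid also has a JavaScript port (liquidjs), so a truthiness check through a template might look roughly like this. This assumes the liquidjs parseAndRender API; the Ruby gem's API differs:

// Hypothetical sketch: evaluate a rule via a Liquid template and check the output
const { Liquid } = require("liquidjs");
const engine = new Liquid();

const template = "{% if name == 'John' and age > 18 %}true{% else %}false{% endif %}";

engine
  .parseAndRender(template, { name: "John", age: 21 })
  .then((output) => {
    const passed = output.trim() === "true"; // check the rendered text for truthiness
    console.log(passed);                     // => true
  });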
Some thoughts and things you should consider.
1. Unified Expression Language (EL)
Another option is EL, specified as part of the JSP 2.1 standard (JSR-245); see the official documentation.
They have some nice examples that can give you a good overview of the syntax. For example:
El Expression: `${100.0 == 100}` Result= `true`
El Expression: `${4 > 3}` Result= `true`
You can use this to evaluate small script-like expressions, and there are several implementations: JUEL is one open source implementation of the EL language.
2. Audience and Security
All the answers recommend using different interpreters and parser generators, and all are valid ways to add functionality to process complex data. But I would like to add an important note here.
Every interpreter has a parser, and injection attacks target those parsers, tricking them into interpreting data as commands. You should have a clear understanding of how the interpreter's parser works, because that is the key to reducing the chances of a successful injection attack. Real-world parsers have many corner cases and flaws that may not match the specs, so be clear about the measures you will take to mitigate possible flaws.
And even if your application is not facing the public, you can have external or internal actors that can abuse this feature.
> I'm building an app which has a feature for embedding expressions/rules in a config yaml file.
>
> I'm looking for a parsing/dsl/rules-engine library that can support these types of expressions and normalize them. I'm open to using ruby, javascript, java, or python if anyone knows of a library for those languages.
One possibility might be to embed a rule interpreter such as ClipsRules inside your application. You could then code your application in C++ (perhaps inspired by my clips-rules-gcc project) and link to it some C++ YAML library such as yaml-cpp.
Another approach could be to embed some Python interpreter inside a rule interpreter (perhaps the same ClipsRules) and some YAML library.
A third approach could be to use Guile (or SBCL or Javascript v8) and extend it with some "expert system shell".
Before starting to code, be sure to read several books such as the Dragon Book, the Garbage Collection handbook, Lisp In Small Pieces, Programming Language Pragmatics. Be aware of various parser generators such as ANTLR or GNU bison, and of JIT compilation libraries like libgccjit or asmjit.
You might need to contact a lawyer about legal compatibility of various open source licenses.
I'm using Scala.js, and have written a trait that is implemented for both JVM and JS. I'm using third-party JVM and JS libraries to implement it in the two sides, which should provide functionally equivalent results in the JVM and browser. But, I need to write a test to verify that!
If I were just testing two vanilla Scala implementations, I'd know how to do it: I'd write generators for the trait's inputs, drive each function from those, and compare the results. (I can assume that the function results are either booleans, integers, longs, strings, collections of the same, or something that can be toString()'d.)
Is anyone out there doing this kind of testing?
How would I do this where one implementation is in Javascript? Phantom? (Can I pass a generated JS file to it, rather than simple JS-as-strings?) Something else?
You can use Scala's reflective toolbox in a macro to execute your test code at compilation time (on the JVM). You can then use the result and generate code that compares the value.
So we want to write a macro, that given the following code:
FuncTest.test { (1.0).toString }
Can generate something like this:
assert("1.0" == (1.0).toString)
This actually sounds harder than it is. Let's start with a macro skeleton for FuncTest:
import scala.language.experimental.macros
import scala.reflect.macros.blackbox.Context

object FuncTest {
  def test[T](x: => T): Unit = macro FuncTestImpl.impl[T]
}

class FuncTestImpl(val c: Context) {
  import c.universe._
  def impl[T : WeakTypeTag](x: Tree): Tree = ???
}
Inside impl, we want to run the code in x and then generate an assertion (or whatever suits the test framework you use):
import scala.reflect.runtime.{universe => ru}
import scala.tools.reflect.ToolBox

def impl[T : WeakTypeTag](x: Tree): Tree = {
  // Make a tool box (runtime compiler and evaluator)
  val mirror = ru.runtimeMirror(getClass.getClassLoader)
  val toolBox = mirror.mkToolBox()

  // Import trees from the compile-time universe into the runtime universe
  val importer = ru.mkImporter(c.universe)
  val tree = toolBox.untypecheck(importer.importTree(x))

  // Evaluate the expression and make a literal tree out of the result
  val result = toolBox.eval(tree)
  val resultTree = reifyLiteral(result)

  // Emit the assertion
  q"assert($x == $resultTree)"
}
The only problem we have is reifyLiteral. It is basically supposed to take an arbitrary value and create a literal out of it. This is hard / impossible in general. However, it is very easy for some basic values (primitives, strings, etc.):
/** Creates a literal tree out of a value (if possible) */
private def reifyLiteral(x: Any): Tree = x match {
  case x: Int    => q"$x"
  case x: String => q"$x"
  // Example for Seq
  case x: Seq[_] =>
    val elems = x.map(reifyLiteral)
    q"Seq(..$elems)"
  case _ =>
    c.abort(c.enclosingPosition, s"Cannot reify $x of type ${x.getClass}")
}
That's it. You can now write:
FuncTest.test { /* your code */ }
to automatically generate tests for computational libraries.
Caveat: the toolbox does not get the right classpath injected at the moment. So if you use an external library (which I assume you do), you will need to tweak that as well. Let me know if you need help there.
Recently I have seen people talk about using macros in JavaScript. I have no idea what that means, and after looking up documentation on MDN I came up without any answer. So that leads me to my questions …
What are JavaScript macros?
How/why are they used?
Is it a form of meta-programming?
Answers with examples and example code would be appreciated greatly.
As has been posted in the comments, it's a macro system for JavaScript.
It's a way of defining text replacements in a "pre-processing phase". So you define a macro and use it in your code, run both through sweet.js, and the output is code with the text replacements applied.
example:
macro swap {
  rule { ($a, $b) } => {
    var tmp = $a;
    $a = $b;
    $b = tmp;
  }
}
var a = 10;
var b = 20;
swap (a, b)
After running this through sweet.js we get the expanded code:
var a$1 = 10;
var b$2 = 20;
var tmp$3 = a$1;
a$1 = b$2;
b$2 = tmp$3;
I think the use case for this is more centered around frameworks and the like. It's a more flexible way of saving lines of code than a function.
In JS, a function can basically either
Mathematically calculate something based on its inputs
Perform some kind of side effect on another variable within the scope of the function
But macros are more flexible: they are like code that takes in inputs and expands into text that can then be compiled as regular code. It's a common feature in languages that prioritize code prettiness over code traceability. Some notable examples are Ruby, Rust, and Elixir.
Here's an example of a ruby macro and what the equivalent code would look like in js.
In Ruby you can do this to tell a class to have certain relationship methods in its ORM:
class Movie < ActiveRecord::Base
  has_many :reviews
end
In this case, saying has_many :reviews dumps a bunch of methods onto the Movie class. So because of that line you can call movie.reviews.
In TypeORM right now you could do something like
class Movie {
  @OneToMany(() => Review, review => review.movie)
  reviews: Review[];
}
If macros made it into js, you could clean this up to look more like
class Movie {
  oneToMany(Review, "reviews", "movie")
}
My guess is that this won't happen any time soon. IMO, one of the things you lose with macros is that it becomes less clear what your code actually does, and I also have to imagine it would be a pretty big change for linters and type checkers. There are also other ways of dumping a bunch of functionality into an object or function.
For example in react hook form you can accomplish something similar using closures and spreading.
<input {...register("firstName")} placeholder="Bill" />
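Roughly, that works because register returns an object of props that gets spread onto the element. A simplified sketch of the pattern (hypothetical, not react-hook-form's actual internals):

// Simplified sketch of the closure-plus-spread pattern
const formValues = {};

function register(name) {
  return {
    name,
    onChange: (event) => { formValues[name] = event.target.value; },
  };
}

// <input {...register("firstName")} placeholder="Bill" />
// is then equivalent to passing the returned props explicitly:
// <input name="firstName" onChange={...} placeholder="Bill" />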
I am new to the Rhino parser. Can I use the Rhino parser in JavaScript code to extract the Abstract Syntax Tree of the JavaScript code in any html file? If so, how should I start? This is for analyzing the AST of the code to compute the ratio between keywords and words used in the JavaScript, to identify common decryption schemes, and to count occurrences of certain classes of function calls such as fromCharCode(), eval(), and some string functions that are commonly used for the decryption and execution of drive-by-download exploits.
As far as I know, you can't access the AST from JavaScript in Rhino. I would look at the Esprima parser though. It's a complete JavaScript parser written in JavaScript and it has a simple API for doing code analysis.
Here's a simple example that calculates the keyword to identifier ratio:
var tokens = esprima.parse(script, { tokens: true }).tokens;

var identifierCount = 0;
var keywordCount = 0;

tokens.forEach(function (token) {
    if (token.type === 'Keyword') {
        keywordCount++;
    } else if (token.type === 'Identifier') {
        identifierCount++;
    }
});

var ratio = keywordCount / identifierCount;
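And for the other part of the question (counting occurrences of calls such as eval() and fromCharCode()), here is a rough sketch using the same token stream; the list of suspicious names is just an example:

var suspicious = ['eval', 'fromCharCode', 'unescape', 'atob'];
var callCounts = {};

var tokens = esprima.parse(script, { tokens: true }).tokens;
tokens.forEach(function (token, i) {
    var next = tokens[i + 1];
    // a call site shows up as an Identifier token immediately followed by '('
    if (token.type === 'Identifier' &&
        suspicious.indexOf(token.value) !== -1 &&
        next && next.type === 'Punctuator' && next.value === '(') {
        callCounts[token.value] = (callCounts[token.value] || 0) + 1;
    }
});

console.log(callCounts); // e.g. { eval: 2, fromCharCode: 5 }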
I know that JavaScript doesn't support macros (Lisp-style ones), but I was wondering if anyone had a solution to maybe simulate macros? I Googled it, and one of the suggested solutions was to use eval(), but as the author said, that would be quite costly.
They don't really have to be very fancy. I just want to do simple stuff with them. And it shouldn't make debugging significantly harder :)
You could use parenscript. That'll give you macros for Javascript.
A library by Mozilla (called SweetJS) is designed to simulate macros in JavaScript. For example, you can use SweetJS to replace the function keyword with def.
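For example, a rough sketch of such a def macro in sweet.js's rule syntax (the exact pattern syntax may differ between sweet.js versions):

macro def {
  rule { $name $params $body } => {
    function $name $params $body
  }
}

def add(a, b) { return a + b; }
// expands to: function add(a, b) { return a + b; }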
One can also now use ClojureScript to compile clojure to javascript and get macros that way. Note ClojureScript uses Google Closure.
I've written a gameboy emulator in javascript and I simulate macros for cpu emulation this way:
macro code (the function returns a string with the macro code):
function CPU_CP_A(R, C) {             // this function simulates the CP instruction,
    return '' +                       // sets the CPU flags and stores in ICC the
        'FZ=(RA==' + R + ');' +       // number of cpu cycles needed
        'FN=1;' +
        'FC=RA<' + R + ';' +
        'FH=(RA&0x0F)<(' + R + '&0x0F);' +
        'ICC=' + C + ';';
}
Using the "macro", so the code is generated "on the fly" and we don't need to make function calls to it or write lots of repeated code for each istruction...
OP[0xB8]=new Function(CPU_CP_A('RB',4)); // CP B
OP[0xB9]=new Function(CPU_CP_A('RC',4)); // CP C
OP[0xBA]=new Function(CPU_CP_A('RD',4)); // CP D
OP[0xBB]=new Function(CPU_CP_A('RE',4)); // CP E
OP[0xBC]=new Function('T1=HL>>8;'+CPU_CP_A('T1',4)); // CP H
OP[0xBD]=new Function('T1=HL&0xFF;'+CPU_CP_A('T1',4)); // CP L
OP[0xBE]=new Function('T1=MEM[HL];'+CPU_CP_A('T1',8)); // CP (HL)
OP[0xBF]=new Function(CPU_CP_A('RA',4)); // CP A
Now we can execute emulated code like this:
OP[MEM[PC]](); // MEM is an array of bytes and PC the program counter
Hope it helps...
// Note: the original used toSource(), a non-standard Firefox-only method that
// has since been removed; toString() plus wrapping parentheses achieves the
// same effect in modern engines.
function unless(condition, body) {
  return 'if (!(' + condition.toString() + ')()) { (' + body.toString() + ')(); }';
}

eval(unless(function () {
  return false;
}, function () {
  alert("OK");
}));
LispyScript is the latest language that compiles to Javascript, that supports macros. It has a Lisp like tree syntax, but also maintains the same Javascript semantics.
Disclaimer: I am the author of LispyScript.
Check out the Linux/Unix/GNU M4 processor. It is a generic and powerful macro processor for any language. It is especially oriented towards Algol-style languages of which JavaScript is a member.
Javascript is interpreted. Eval isn't any more costly than anything else in Javascript.