I am having trouble understanding how to properly use Visitors in ANTLR4, Javascript target.
I have prepared a very basic grammar, it accepts INT + INT or INT - INT operations.
grammar PlusMinus;
INT : [0-9]+;
WS : [ \t\r]+ -> skip;
PLUS : '+';
MINUS : '-';
input : plusOrMinus
;
plusOrMinus
: numberLeft PLUS numberRight # Plus
| numberLeft MINUS numberRight # Minus
;
numberLeft : INT;
numberRight : INT;
From this grammar ANTLR will generate a Visitor that has these three functions, visitInput, visitPlus and visitMinus. I start from visitInput where I will be able to fetch the operation ctx by doing this operation = ctx.plusOrMinus().
This is where I get stuck: how do I know whether operation is of type Plus or Minus? In other words, where do I pass ctx.plusOrMinus(): to visitPlus() or to visitMinus()?
I managed to create a visitor that does work, but it's very ugly, I am posting it here because perhaps it will help to better understand my question. Lines 20-29 is where the problem is.
First of all... PLUS and MINUS are lexer rules. You don't visit tokens (the result of lexer rules).
It looks like you're expecting this to work like a listener, where you set up a function that gets called when the tree walker reaches a node. You can be called on enter or on exit from the node, depending on whether you want the node before or after its children have been processed. Visitors expect you to handle your own tree navigation, which is sometimes useful, but listeners are cleaner where they suit the purpose. With nesting, you'll probably want to listen after the child nodes are processed, so you'll want to implement an exitPlusOrMinus() function on your listener. I'd suggest stopping your code in the debugger inside this function to take a look at the objects you have available to you in the ctx object.
You also need to rethink your numberLeft and numberRight parser rules. Something more like:
plusOrMinus: lexpr=INT (op=PLUS | op=MINUS) rexpr=INT;
would give you a pretty close equivalent to what you have so far. What you have will work with a recursive descent parser like ANTLR (so far as this example goes), but you're headed in the wrong direction by splitting plus and minus into separate alternatives. Specifically, by making them two alternatives, you're giving PLUS a higher precedence than MINUS, and PLUS and MINUS should have the same precedence in order of evaluation, so they need to be handled by the same alternative. When you place alternatives like this in a parser rule, you're also establishing precedence, so be careful about their order.
To get further than adding or subtracting integers, though, you'll need lexpr and rexpr to actually be expressions themselves (you should read up on expression parsing in the ANTLR book; it's covered very nicely).
With that rule, your exitPlusOrMinus can parse the int values of lexpr and rexpr and then evaluate the value of the op to determine whether to add or subtract.
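As a rough sketch of what such an exitPlusOrMinus could look like in the JavaScript target: the token-type constants and the mockCtx helper below are hypothetical stand-ins so the snippet runs without the ANTLR runtime. In a real project the listener would extend the generated listener class, and the constants would come from the generated parser.

```javascript
// Hypothetical token-type constants. A generated parser exposes its own
// (for example PlusMinusParser.PLUS); the numbers here are placeholders.
const PLUS = 3;
const MINUS = 4;

class EvalListener {
  constructor() {
    this.result = null;
  }

  // Called by the tree walker after the children of plusOrMinus were processed.
  exitPlusOrMinus(ctx) {
    // lexpr, op and rexpr are the token labels from the suggested rule.
    const left = parseInt(ctx.lexpr.text, 10);
    const right = parseInt(ctx.rexpr.text, 10);
    this.result = ctx.op.type === PLUS ? left + right : left - right;
  }
}

// Hypothetical stand-in for the context object the tree walker would pass in.
function mockCtx(left, opType, right) {
  return { lexpr: { text: left }, op: { type: opType }, rexpr: { text: right } };
}

const listener = new EvalListener();
listener.exitPlusOrMinus(mockCtx("7", MINUS, "2"));
console.log(listener.result); // 5
```

The point of the single rule is visible here: one handler reads both operands and branches on op, instead of needing one handler per alternative.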
I'm building an app which has a feature for embedding expressions/rules in a config YAML file. So, for example, a user can reference a variable defined in the YAML file like ${variables.name == 'John'} or ${is_equal(variables.name, 'John')}. I can probably get by with simple expressions, but I want to support complex rules/expressions such as ${variables.name == 'John'} and (${variables.age > 18} OR ${variables.adult == true})
I'm looking for a parsing/DSL/rules-engine library that can support these types of expressions and normalize them. I'm open to using Ruby, JavaScript, Java, or Python if anyone knows of a library for those languages.
One option I thought of was to just support JavaScript as conditions/rules and basically pass it through eval with the right context set up, with access to variables and other referenceable vars.
I don't know if you use Golang or not, but if you use it, I recommend this https://github.com/antonmedv/expr.
I have used it for parsing a bot strategy (for a stock options bot). This is from my unit test:
func TestPattern(t *testing.T) {
    a := "pattern('asdas asd 12dasd') && lastdigit(23asd) < sma(50) && sma(14) > sma(12) && ( macd(5,20) > macd_signal(12,26,9) || macd(5,20) <= macd_histogram(12,26,9) )"
    r, _ := regexp.Compile(`(\w+)(\s+)?[(]['\d.,\s\w]+[)]`)
    indicator := r.FindAllString(a, -1)
    t.Logf("%v\n", indicator)
    t.Logf("%v\n", len(indicator))
    for _, i := range indicator {
        t.Logf("%v\n", i)
        if strings.HasPrefix(i, "pattern") {
            r, _ = regexp.Compile(`pattern(\s+)?\('(.+)'\)`)
            check1 := r.ReplaceAllString(i, "$2")
            t.Logf("%v\n", check1)
            r, _ = regexp.Compile(`[^du]`)
            check2 := r.FindAllString(check1, -1)
            t.Logf("%v\n", len(check2))
        } else if strings.HasPrefix(i, "lastdigit") {
            r, _ = regexp.Compile(`lastdigit(\s+)?\((.+)\)`)
            args := r.ReplaceAllString(i, "$2")
            r, _ = regexp.Compile(`[^\d]`)
            parameter := r.FindAllString(args, -1)
            t.Logf("%v\n", parameter)
        } else {
        }
    }
}
Combine it with regex and you have a good (if not great) string translator.
And for Java, I personally use https://github.com/ridencww/expression-evaluator, but not in production. It has features similar to the link above.
It supports many conditions, and you don't have to worry about parentheses and brackets.
Assignment  =
Operators   + - * / DIV MOD % ^
Logical     < <= == != >= > AND OR NOT
Ternary     ? :
Shift       << >>
Property    ${<id>}
DataSource  #<id>
Constants   NULL PI
Functions   CLEARGLOBAL, CLEARGLOBALS, DIM, GETGLOBAL, SETGLOBAL
            NOW PRECISION
Hope it helps.
You might be surprised to see how far you can get with a syntax parser and 50 lines of code!
Check this out. The Abstract Syntax Tree (AST) on the right represents the code on the left in nice data structures. You can use these data structures to write your own simple interpreter.
I wrote a little example of one:
https://codesandbox.io/s/nostalgic-tree-rpxlb?file=/src/index.js
Open up the console (button in the bottom), and you'll see the result of the expression!
This example can only handle (||) and (>), but looking at the code (line 24), you can see how you could make it support any other JS operator. Just add a case to the branch, evaluate the sides, and do the calculation on JS.
Parentheses and operator precedence are all handled by the parser for you.
I'm not sure if this is the solution for you, but it will for sure be fun ;)
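To make the idea concrete without depending on the sandbox link, here is a minimal sketch of such an interpreter. It assumes ESTree-style nodes (the shape that common JavaScript parsers emit), and the AST is written out by hand so the snippet runs without a parser; the evaluate function and the variable names are illustrative, not taken from the linked example.

```javascript
// Evaluate a small subset of ESTree-style nodes against a table of variables.
function evaluate(node, vars) {
  switch (node.type) {
    case "Literal":
      return node.value;
    case "Identifier":
      if (!Object.prototype.hasOwnProperty.call(vars, node.name)) {
        throw new Error("Unknown variable: " + node.name);
      }
      return vars[node.name];
    case "LogicalExpression": {
      const left = evaluate(node.left, vars);
      if (node.operator === "||") return left || evaluate(node.right, vars);
      if (node.operator === "&&") return left && evaluate(node.right, vars);
      break;
    }
    case "BinaryExpression": {
      const l = evaluate(node.left, vars);
      const r = evaluate(node.right, vars);
      if (node.operator === ">") return l > r;
      if (node.operator === "===") return l === r;
      break;
    }
  }
  // Anything not handled above is rejected rather than silently ignored.
  throw new Error("Unsupported node: " + node.type + " " + (node.operator || ""));
}

// Hand-written AST for the expression: age > 18 || adult
const ast = {
  type: "LogicalExpression",
  operator: "||",
  left: {
    type: "BinaryExpression",
    operator: ">",
    left: { type: "Identifier", name: "age" },
    right: { type: "Literal", value: 18 },
  },
  right: { type: "Identifier", name: "adult" },
};

console.log(evaluate(ast, { age: 12, adult: true }));  // true
console.log(evaluate(ast, { age: 12, adult: false })); // false
```

Supporting another operator means adding one more branch; everything the interpreter does not recognize throws, which keeps the allowed surface explicit.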
One option I thought of was to just support javascript as
conditions/rules and basically pass it through eval with the right
context setup with access to variables and other reference-able vars.
I would personally lean towards something like this. If you are getting into complexities such as logic comparisons, a DSL can become a beast, since you are basically writing a compiler and a language at that point. You might want to just not have a config, and instead have the configurable file just be JavaScript (or whatever language) that can be evaluated and then loaded. Then whoever your target audience is for this "config" file can just supply logical expressions as needed.
The only reason I would not do this is if this configuration file was being exposed to the public or something, but in that case security for a parser would also be quite difficult.
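A minimal sketch of that approach, assuming the rule text is trusted: the evaluateRule helper is a hypothetical name, and new Function is used here as a stand-in for eval so that the expression only sees the variables object passed in plus globals. This is not a sandbox, which is exactly the security concern raised above.

```javascript
// Evaluate a rule written as a JavaScript expression.
// WARNING: this executes arbitrary code; only use it with trusted input.
function evaluateRule(expression, variables) {
  // The expression can reference the single parameter named "variables".
  const fn = new Function("variables", '"use strict"; return (' + expression + ");");
  return fn(variables);
}

const rule = "variables.name == 'John' && (variables.age > 18 || variables.adult == true)";

console.log(evaluateRule(rule, { name: "John", age: 30, adult: false })); // true
console.log(evaluateRule(rule, { name: "Jane", age: 30, adult: true }));  // false
```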
I did something like that once, you can probably pick it up and adapt it to your needs.
TL;DR: thanks to Python's eval, doing this is a breeze.
The problem was to parse dates and durations in textual form. What I did was create a YAML file mapping regex patterns to results. Each mapping was a Python expression that would be evaluated with the match object, and had access to other functions and variables defined elsewhere in the file.
For example, the following self-contained snippet would recognize times like "l'11 agosto del 1993" (Italian for "August 11th, 1993").
__meta_vars__:
  month: (gennaio|febbraio|marzo|aprile|maggio|giugno|luglio|agosto|settembre|ottobre|novembre|dicembre)
  prep_art: (il\s|l\s?'\s?|nel\s|nell\s?'\s?|del\s|dell\s?'\s?)
  schema:
    date: http://www.w3.org/2001/XMLSchema#date
__meta_func__:
  - >
    def month_to_num(month):
        """ gennaio -> 1, febbraio -> 2, ..., dicembre -> 12 """
        try:
            return index_in_or(meta_vars['month'], month) + 1
        except ValueError:
            return month
Tempo:
  - \b{prep_art}(?P<day>\d{{1,2}}) (?P<month>{month}) {prep_art}?\s*(?P<year>\d{{4}}): >
      '"{}-{:02d}-{:02d}"^^<{schema}>'.format(match.group('year'),
                                              month_to_num(match.group('month')),
                                              int(match.group('day')),
                                              schema=schema['date'])
__meta_func__ and __meta_vars__ (not the best names, I know) define functions and variables that are accessible to the match transformation rules. To make the rules easier to write, the pattern is formatted by using the meta-variables, so that {month} is replaced with the pattern matching all months. The transformation rule calls the meta-function month_to_num to convert the month to a number from 1 to 12, and reads from the schema meta-variable. In the example above, the match results in the string "1993-08-11"^^<http://www.w3.org/2001/XMLSchema#date>, but some other rules would produce a dictionary.
Doing this is quite easy in Python, as you can use exec to evaluate strings as Python code (obligatory warning about security implications). The meta-functions and meta-variables are evaluated and stored in a dictionary, which is then passed to the match transformation rules.
The code is on github, feel free to ask any questions if you need clarifications. Relevant parts, slightly edited:
class DateNormalizer:
    def _meta_init(self, specs):
        """ Reads the meta variables and the meta functions from the specification
        :param dict specs: The specifications loaded from the file
        :return: None
        """
        self.meta_vars = specs.pop('__meta_vars__')
        # compile meta functions in a dictionary
        self.meta_funcs = {}
        for f in specs.pop('__meta_funcs__'):
            # function-call form of exec, accepted by both Python 2 and Python 3
            exec(f, self.meta_funcs)
        # make meta variables available to the meta functions just defined
        self.meta_funcs['__builtins__']['meta_vars'] = self.meta_vars
        self.globals = self.meta_funcs
        self.globals.update(self.meta_vars)

    def normalize(self, expression):
        """ Find the first matching part in the given expression
        :param str expression: The expression in which to search the match
        :return: Tuple with (start, end), category, result
        :rtype: tuple
        """
        expression = expression.lower()
        for category, regexes in self.regexes.items():
            for regex, transform in regexes:
                match = regex.search(expression)
                if match:
                    # the transformation rule is evaluated with access to the match object
                    result = eval(transform, self.globals, {'match': match})
                    start, end = match.span()
                    # NOTE: first_position is not defined in this excerpt
                    return (first_position + start, first_position + end), category, result
Here are some categorized Ruby options and resources:
Insecure
Pass expression to eval in the language of your choice.
It must be mentioned that eval is technically an option, but extraordinary trust must exist in its inputs and it is safer to avoid it altogether.
Heavyweight
Write a parser for your expressions and an interpreter to evaluate them
A cost-intensive solution would be implementing your own expression language. That is, to design a lexicon for your expression language, implement a parser for it, and an interpreter to execute the code that's parsed.
Some Parsing Options (ruby)
Parslet
TreeTop
Citrus
Roll-your-own with StringScanner
Medium Weight
Pick an existing language to write expressions in and parse / interpret those expressions.
This route assumes you can pick a known language to write your expressions in. The benefit is that a parser likely already exists for that language to turn it into an Abstract Syntax Tree (data structure that can be walked for interpretation).
A ruby example with the Parser gem
require 'parser/current'
class MyInterpreter
  # https://whitequark.github.io/ast/AST/Processor/Mixin.html
  include ::AST::Processor::Mixin
  def on_str(node)
    node.children.first
  end
  def on_int(node)
    node.children.first.to_i
  end
  def on_if(node)
    expression, truthy, falsey = *node.children
    if process(expression)
      process(truthy)
    else
      process(falsey)
    end
  end
  def on_true(_node)
    true
  end
  def on_false(_node)
    false
  end
  def on_lvar(node)
    # lookup a variable by name=node.children.first
  end
  def on_send(node, &block)
    # allow things like ==, string methods? whatever
  end
  # ... etc
end
ast = Parser::CurrentRuby.parse(<<~RUBY)
  name == 'John' && adult
RUBY
MyInterpreter.new.process(ast)
# => true, once the remaining handlers (on_send, on_and, ...) are implemented
The benefit here is that the parser and syntax are predetermined, and you can interpret only what you need to (and prevent malicious code from executing by controlling what on_send and on_const allow).
Templating
This is more markup-oriented and possibly doesn't apply, but you could find some use in a templating library, which parses expressions and evaluates for you. Control and supplying variables to the expressions would be possible depending on the library you use for this. The output of the expression could be checked for truthiness.
Liquid
Jinja
Some thoughts and things you should consider.
1. Unified Expression Language (EL),
Another option is EL, specified as part of the JSP 2.1 standard (JSR-245). Official documentation.
They have some nice examples that can give you a good overview of the syntax. For example:
El Expression: `${100.0 == 100}` Result= `true`
El Expression: `${4 > 3}` Result= `true`
You can use this to evaluate small script-like expressions. And there are some implementations: Juel is one open source implementation of the EL language.
2. Audience and Security
All the answers recommend using different interpreters, parser generators. And all are valid ways to add functionality to process complex data. But I would like to add an important note here.
Every interpreter has a parser, and injection attacks target those parsers, tricking them into interpreting data as commands. You should have a clear understanding of how the interpreter's parser works, because that's the key to reducing the chances of a successful injection attack. Real-world parsers have many corner cases and flaws that may not match the specs, so be clear about the measures you will take to mitigate possible flaws.
Even if your application is not facing the public, external or internal actors may be able to abuse this feature.
I'm building an app which has a feature for embedding expressions/rules in a config yaml file.
I'm looking for a parsing/DSL/rules-engine library that can support these types of expressions and normalize them. I'm open to using Ruby, JavaScript, Java, or Python if anyone knows of a library for those languages.
One possibility might be to embed a rule interpreter such as ClipsRules inside your application. You could then code your application in C++ (perhaps inspired by my clips-rules-gcc project) and link to it some C++ YAML library such as yaml-cpp.
Another approach could be to embed some Python interpreter inside a rule interpreter (perhaps the same ClipsRules) and some YAML library.
A third approach could be to use Guile (or SBCL or Javascript v8) and extend it with some "expert system shell".
Before starting to code, be sure to read several books such as the Dragon Book, the Garbage Collection handbook, Lisp In Small Pieces, Programming Language Pragmatics. Be aware of various parser generators such as ANTLR or GNU bison, and of JIT compilation libraries like libgccjit or asmjit.
You might need to contact a lawyer about legal compatibility of various open source licenses.
I am trying to puzzle out a way to de-obfuscate javascript that looks like this:
https://jsfiddle.net/douglasg14b/4951br9f/2/
var testString = 'Test | String'
var wf6 = {
    fq4: 'su',
    k8d: 'bs',
    l8z: 'tri',
    cy1: 'ng',
    t5j: 'te',
    ol: 'stS',
    x3q: 'tri',
    l9x: 'ng',
    gh: 'xO'
};
//Obfuscated
let test1 = testString[wf6.fq4 + wf6.k8d + wf6.l8z + wf6.cy1](4,11);
//Normal
let test2 = testString.substring(4,11);
let test3;
//More complex obfuscation
(function moreComplex(){
    let h = "i",
        w = "nde",
        T0 = "f",
        hj = '|',
        a = eval(wf6.t5j + wf6.ol + wf6.x3q + wf6.l9x).length;
    //Obfuscated
    test3 = testString[wf6.fq4 + wf6.k8d + wf6.l8z + wf6.cy1](testString[h + w + wf6.gh + T0](hj), a);
    //Normal
    let test4 = testString.substring(testString.indexOf('|'), testString.length);
})();
$('.span1').text(test1);
$('.span2').text(test3);
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<span class="span1"></span><br>
<span class="span2"></span>
This is a small example; the file I'm working with is ~60k lines long and is full of this kind of obfuscation. Everywhere a string can be used as a property name, this kind of obfuscation is used.
The way I can think of doing this, is to evaluate all the string concatenations so they are turned into a readable equivalent. Though, I am not sure how to go about this and ignore all the other working code that exists between all the concatenations.
Thoughts?
Bonus question: Is there a commonly used name for this kind of obfuscation that might make searches a bit easier?
Edit: Added a more complex example.
You have the basic idea right: you have to partially-evaluate the program and precompute all the constant computations. In your case, the constant computations of main interest are the concatenation steps over values which don't change.
To do this, you need a program transformation system (PTS). This is a tool that will read/parse source code for a specified language and build an abstract syntax tree, allow you specify transformations and analyses over the AST, and run those, and then spit out the modified AST as source code again.
In your case, you obviously want a PTS that is wired to know JavaScript out of the box (rare) or is willing to accept a description of JavaScript and then read JavaScript (more typical) with the hope that you can build or get a JavaScript description easily. [I build a PTS that has JavaScript descriptions available, see my bio].
With that in hand, you need to:
1. Code an analyzer that inspects each variable found in an expression to see if that variable is constant (e.g., "wf6"). To demonstrate it is constant, you will have to find the variable definition, and check that all the values used in the variable definition are themselves constants. If there is more than one variable definition, you might have to check that all definitions produce the same value. You need to check for side effects on the variable (e.g., that there are no function calls "foo(...,wf6,...)" which would allow the variable's value to be modified). You need to worry about whether an eval command that accomplishes such a side effect exists [this is virtually impossible to do, so you often have to just ignore evals and assume they do not do such things]. Many PTSes will have a way to allow you to build such analyzers; some are easier than others.
2. For every constant-valued variable, substitute the value of that variable in the code.
3. For every constant-valued sub-expression after such substitutions, "fold" (calculate) the result of that expression, substitute that value for the sub-expression, and repeat until no more folding is possible. Obviously you want to do this for at least all "+" operators. [OP just modified his example; he'll want to do it for "eval" operators too when all its operands are constant].
4. You may have to iterate this process, as folding an expression may make it obvious that a variable now has a constant value.
The above process is called "constant propagation" in the compiler literature and is a feature of many compilers.
In your case, you could restrict the constant folding to just string concatenations. However, once you have adequate machinery to do constant value propagation, doing all or most operators on constants isn't that hard. You may need this to undo other obfuscations involving constants, since that seems to be the obfuscation style used on the code you are working on.
You'll need a special rule that transforms
var['string'](args)
into
var.string(args)
as a final step.
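To illustrate the substitute, fold, and rewrite steps on the question's first example, here is a deliberately naive sketch. It works on source text with regular expressions rather than on an AST, so it is not a program transformation system and it skips the constancy analysis entirely: the table of known constants is supplied by hand, and the deobfuscate helper is a hypothetical name.

```javascript
// Known-constant properties, assumed here to have been verified by hand.
const constants = { fq4: "su", k8d: "bs", l8z: "tri", cy1: "ng" };

function deobfuscate(source, objName, table) {
  // Step 1: substitute each known-constant property reference with its value.
  const propertyRef = new RegExp("\\b" + objName + "\\.(\\w+)", "g");
  let out = source.replace(propertyRef, (whole, key) =>
    Object.prototype.hasOwnProperty.call(table, key)
      ? JSON.stringify(table[key])
      : whole
  );

  // Step 2: fold "a" + "b" into "ab" until no adjacent literals remain.
  // Only literals without quotes or backslashes are handled.
  const concat = /"([^"\\]*)"\s*\+\s*"([^"\\]*)"/;
  while (concat.test(out)) {
    out = out.replace(concat, (whole, a, b) => JSON.stringify(a + b));
  }

  // Step 3: rewrite obj["name"] as obj.name when name is a plain identifier.
  return out.replace(/\[\s*"([A-Za-z_$][\w$]*)"\s*\]/g, ".$1");
}

const input = "let test1 = testString[wf6.fq4 + wf6.k8d + wf6.l8z + wf6.cy1](4,11);";
console.log(deobfuscate(input, "wf6", constants));
// let test1 = testString.substring(4,11);
```

A text-based pass like this breaks as soon as string literals contain quotes or a "+" mixes strings with non-string operands, which is the reason for doing the same steps over a parsed tree in real code.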
You have another complication: that is knowing that you have all the JavaScript relevant to producing constant-valued variables. A single web page may have many included chunks of JavaScript; you will need all of them to demonstrate there are no side effects on a variable. I assume in your case you are sure you have it all.
With respect to producing known-constant values, you may have to worry about a tricky case: an expression that produces constant values from non-constant operands. Imagine the obfuscated expression was:
x=random(); // produce a value between 0 and 1
one=x+(1-x); // not constant by constant propagation, but constant by algebraic relations
teststring['st'[one]+'vu'[one+1]+'bz'[one]+...](4,11)
You can see it always computes 'substring' as a property. You can add a transformation rule that understands the trick used to compute "one", e.g., a rule for each algebraic trick used to compute known constants. Unfortunately for you, there's an infinite number of algebra theorems one can use to manufacture constants; how many are really used in your example bit of code? [Welcome to the problem of reverse engineering with a smart adversary].
Nope, none of this is "easy". Presumably that's why this obfuscation method was chosen.
I am implementing jQuery chaining - using Mika Tuupola's Chained plugin - in my rails project (using nested form_for partials) and need to dynamically change the chaining attribute:
The code that works without substitution:
$(".employee_title_2").remoteChained({
    parents : ".employee_title_1",
    url : "titles/employee_title_2",
    loading : "Loading...",
    clear : true
});
The attributes being substituted are .employee_title_1 and .employee_title_2:
var t2 = new Date().getTime();
var A1 = ".employee_title_1A_" + t2;
var B2 = ".employee_title_2B_" + t2;
In ruby speak, I'm namespacing the variables by adding datetime.
Here's the code I'm using for on-the-fly substitution:
$(`"${B2}"`).remoteChained({
    parents : `"${A1}"`,
    url : "titles/employee_title_2",
    loading : "Loading...",
    clear : true
});
Which throws this error:
Uncaught Error: Syntax error, unrecognized expression:
".employee_title_2B_1462463848339"
The issue appears to be the leading '.'. How do I escape it, assuming that's the issue? Researching the error message "Syntax error, unrecognized expression" led to SO question #14347611, which suggests "a string is only considered to be HTML if it starts with a less-than ('<') character". Unfortunately, I don't understand how to implement the solution. My JavaScript skills are weak!
Incidentally, while new Date().getTime(); isn't in date format, it works for my purpose, i.e., it increments as new nested form fields are added to the page
Thanks in advance for your assistance.
$(`"${B2}"`).remoteChained({
// ^ ^
// These quotes should not be here
As it is evaluated to a string containing something like:
".my_class"
and to tie it together:
$('".my_class"')...
Same goes for the other place you use backtick notation. In your case you could simply use:
$(B2).remoteChained({
    parents : A1,
    url : "titles/employee_title_2",
    loading : "Loading...",
    clear : true
});
The back tick (``) syntax is new for Javascript, and provides a templating feature, similar to the way that Ruby provides interpolated strings. For instance, this Javascript code:
var who = "men";
var what = "country";
var famous_quote = `Now is the time for all good ${who} to come to the aid of their ${what}`;
is interpolated in exactly the same way as this Ruby code:
who = "men"
what = "country"
famous_quote = "Now is the time for all good #{who} to come to the aid of their #{what}"
In both cases, the quote ends up reading, "Now is the time for all good men to come to the aid of their country". Similar feature, slightly different syntax.
Moving on to jQuery selectors, you have some flexibility in how you specify them. For instance, this code:
$(".my_class").show();
is functionally equivalent to this code:
var my_class_name = ".my_class";
$(my_class_name).show();
This is a great thing, because that means that you can store the name of jQuery selectors in variables and use them instead of requiring string literals. You can also build them from components, as you will find in this example:
var mine_or_yours = (user_selection == "me") ? "my" : "your";
var my_class_name = "." + mine_or_yours + "_class";
$(my_class_name).show();
This is essentially the behavior that you're trying to get working. Using the two features together (interpolation and dynamic jQuery selectors), you have this:
$(`"${B2}"`).remoteChained(...);
which produces this code through string interpolation:
$("\".employee_title_2B_1462463848339\"").remoteChained(...);
which is not correct, and is actually the cause of the error message from jQuery: the value of the string contains embedded double quotes, and jQuery is specifically complaining about those extra double quotes surrounding the value that you're passing as the selector.
What you actually want is the equivalent of this:
$(".employee_title_2B_1462463848339").remoteChained(...);
which could either be written this way:
$(`${B2}`).remoteChained(...);
or, much more simply and portably, like so:
$(B2).remoteChained(...);
Try this little sample code to prove the equivalence to yourself:
if (`${B2}` == B2) {
    alert("The world continues to spin on its axis...");
} else if (`"${B2}"` == B2) {
    alert("Lucy, you've got some 'splain' to do!");
} else {
    alert("Well, back to the drawing board...");
}
So, we've established the equivalency of interpolation to the original strings. We've also established the equivalency of literal jQuery selectors to dynamic selectors. Now, it's time to put the techniques together in the original code context.
Try this instead of the interpolation version:
$(B2).remoteChained({
    parents : A1,
    url : "titles/employee_title_2",
    loading : "Loading...",
    clear : true
});
We already know that $(B2) is a perfectly acceptable dynamic jQuery selector, so that works. The value passed to the parents key in the remoteChained hash simply requires a string, and A1 already fits the bill, so there's no need to introduce interpolation in that case, either.
Realistically, nothing about this issue is related to Chained; it just happens to be included in the statement that's failing. So, that means that you can easily isolate the failing code (building and using the jQuery selectors), which makes it far easier to debug.
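One way to isolate it, sketched here without jQuery or Chained at all, is to look only at the strings being built. The timestamp is fixed to the value from the error message so the output is reproducible, and the names quoted and plain are just illustrative.

```javascript
// Fixed timestamp (taken from the error message) instead of new Date().getTime().
const t2 = 1462463848339;
const B2 = ".employee_title_2B_" + t2;

// Interpolation with extra quotes inside the backticks, as in the failing code.
const quoted = `"${B2}"`;
// Interpolation without the extra quotes.
const plain = `${B2}`;

console.log(quoted);                    // ".employee_title_2B_1462463848339" (quotes are part of the string)
console.log(plain === B2);              // true
console.log(quoted === B2);             // false
console.log(quoted.length - B2.length); // 2
```

The two extra characters are the literal double quotes, which is why the selector string no longer starts with the '.' that jQuery expects for a class selector.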
Note that the template literal syntax was codified just last year with ECMAScript version 6, so support for it is still a mixed bag. Check your browser support to make sure that you can use it reliably.
I've got this laboratory equipment that is connected to my PC. It uses a special OCX file to communicate with the device (reading, setting parameters and such). I got this code from the manual, and it seems to be working: I get a message box saying "Magnification =1272.814 Last error=API not initialized".
<HTML>
<HEAD>
<SCRIPT LANGUAGE="VBScript">
<!--
Sub window_onLoad()
Dim Value
Dim er
call Api1.Initialise("")
call Api1.Get("AP_MAG",Value)
call Api1.GetLastError(er)
call window.alert("Magnification = " + CStr(Value)+"Last error="+er)
call Api1.ClosingControl()
end sub
-->
</SCRIPT>
<TITLE>New Page</TITLE>
</HEAD>
<BODY>
<object classid="CLSID:71BD42C4-EBD3-11D0-AB3A-444553540000" id="Api1">
<PARAM NAME="_Version" VALUE="65536">
<PARAM NAME="_ExtentX" VALUE="2096">
<PARAM NAME="_ExtentY" VALUE="1058">
<PARAM NAME="_StockProps" VALUE="0">
</OBJECT>
</BODY>
</HTML>
Because I have 0% knowledge of VBS and about 10% of JScript, I'm trying to rewrite the same thing in JavaScript. I also have some necessary code already written in JS.
<script language="JScript">
var Api1=new ActiveXObject("ApiCtrl");
var value;
var er;
Api1.Initialise("");
Api1.Get("AP_MAG",value);
Api1.GetLastError(er);
window.alert("Magnification = " + value+"\n Last error="+er);
Api1.ClosingControl();
</script>
Unfortunately, I get a type mismatch error in either the .Get or the .GetLastError method, whether I use var value; var er; or var value=""; var er="";
Here is what API manual has to say
long GetLastError(VARIANT* Error)
[out] Error is the error string associated with the error code for the last error
Remarks: This call will return a VT_BSTR VARIANT associated with the last error.
Return Value: If the call succeeds, it returns 0. If the call fails, an error code is returned from the function.
long Get(LPCTSTR lpszParam, VARIANT* vValue)
[in] lpszParam is the name of the parameter e.g. “AP_MAG”
[in][out] vValue is the value of the parameter
Remarks: This call will get the value of the parameter specified and return it in vValue. In C++, before calling this function you have to specify the variant type (vValue.vt) as either VT_R4 or VT_BSTR. If no variant type is defined for vValue, it defaults to VT_R4 for analogue parameters (AP_XXXX) and VT_BSTR for digital parameters (DP_XXXX). If the variant type is VT_R4 for an analogue parameter, then the floating point representation is returned in the variant. If a VT_BSTR variant is passed, analogue values are returned as scaled strings with the units appended (e.g. AP_WD would return “= 10mm”). For digital parameters, VT_R4 variants result in a state number and VT_BSTR variants result in a state string (e.g. DP_RUNUPSTATE would return state 0 or “Shutdown” or the equivalent in the language being supported). In C++, if the variant type was specified as VT_BSTR then the API will internally allocate a BSTR which the caller has to de-allocate using the SDK call ::SysFreeString (vValue.bstrVal)
Welcome to StackOverflow!
Well, each language is made for a purpose. When it comes to dealing with ActiveX objects in a browser (or WSH) environment, VBScript is the best choice, while JavaScript is the worst.
JavaScript has no so-called out parameters. That means all function arguments are passed by value (as a copy). Let me show you this with examples.
' VBScript
Dim X, Y
X = 1
Y = 2
Foo X, Y
MsgBox "Outer X = " & X & ", Y = " & Y
'> Local args: 6, 8
'> Outer X = 1, Y = 8
Sub Foo(ByVal arg1, ByRef arg2)
    arg1 = 6
    arg2 = 8
    MsgBox "Local args: " & arg1 & ", " & arg2
End Sub
By default in VBS, arguments are passed by reference, so the ByRef prefix in the function's argument declaration is optional. I include it for clarity.
What the example illustrates is the meaning of a "by reference" or "out" parameter. It behaves like a return value because it modifies the referenced variable, while modifying a "by value" variable has no effect outside the function scope, because we modify a copy of that variable.
// JavaScript
function foo(arg1) {
    arg1 = 2;
    alert('Local var = ' + arg1);
}
var x = 0;
foo(x);
alert('Outer var = ' + x);
// Local var = 2
// Outer var = 0
Now take a look at this thread. It looks like there is a kind of partial solution using empty objects. I'm not sure in which cases that will work, but it's certainly a very limited hack.
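In plain JavaScript, the empty-object idea looks roughly like the sketch below. The function names are made up for illustration, and whether a given ActiveX control accepts such an object in place of a VARIANT* out parameter is a separate question that this snippet does not answer.

```javascript
// The caller and the function share the same object, so writing a property
// on it is visible to the caller afterwards.
function getValue(out) {
  out.value = 1272.814;
}

// Reassigning the parameter only changes the local copy of the reference.
function tryReassign(out) {
  out = { value: 42 };
}

const box = {};
getValue(box);
console.log(box.value); // 1272.814

const other = {};
tryReassign(other);
console.log("value" in other); // false
```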
If this doesn't help in your case, then it looks like it's time to go with VBScript. Starting with VBS is easy anyway. It's the most user-friendly language I have ever touched. I needed days, even weeks, with other languages just to get started, while after just a few hours with VBS I was able to use it freely.
[EDIT] Well, I put a lot more effort into this reply than it may look at first glance :) I started with the language limitation you ran into, then explained the nature of that limitation (what an "in/out" parameter is), and the best way to do that is via an example, which is what I did. Then I showed you the only workaround out there for this limitation in JS. Can we consider this a complete answer?
You don't mention whether you tested this "empty-object trick", but as you're still asking I presume you did and it doesn't work with your OCX, right? In that case you're simply forced to deal with your OCX via VBScript, which was my answer from the beginning. And as you prefer to stay with JS, you need to integrate a piece of VBScript code into your solution.
And as you noted too, this VBS/JS integration is a whole new question. A good question, of course, but it's a matter for a new topic.
OK, let's say that the question you appended below, "why it should work with passing objects as a function parameter", is still part of the main question. As you can see, even people who use JS daily (I'm not one of them) have no idea what happens "under the hood", so don't expect an answer on what the JS engine does in this case, or how this tricks the JS engine into doing something it isn't designed to do. Personally, as I use JS very rarely and not for such tasks, I'm not even sure this trick works at all. But as the JS folks assert that it works (in some cases), we should trust them. That's all there is to it: if this approach fails, then it's not an option.
What remains is a bit of homework: you should research the available methods for VBS/JS integration and test them to see which one is most applicable to your domain. If you run into difficulties, come back to the forum with a new topic and the concrete issue you're trying to resolve.
To be as helpful as possible, I'll give you several references to get started.
Here is the plan...
1. If it's possible to work without VBS/JS integration, use standalone .VBS files (in the WSH environment); otherwise ...
2. If you work in a browser environment (HTML or HTA), you can embed both (VBS and JS), and the integration would be simple.
3. Or you can integrate VBS/JS with Windows Script Files (.wsf).
4. Or use ScriptControl, which allows running VBScript from within JScript (or the other way around).
Links:
Using the ScriptControl
How To Call Functions Using the Script Control
An example of VBS/JS integration using ScriptControl via Batch-Embeded-Script
What is Batch-Embeded-Script:
VBS/Batch Hybrid
JS/Batch Hybrid
5. Some other method (if you find one that I'm not aware of).
Well, after all these improvements I don't see what more I can add, and I think my answer is now more than complete. If you agree with my answer, accept it by clicking on the big white arrow. Of course, if you expect to get a better reply from other users you may still wait, but keep in mind that unanswered questions stay active only for a while and are then closed.
I have a bbcode -> html converter that responds to the change event in a textarea. Currently, this is done using a series of regular expressions, and there are a number of pathological cases. I've always wanted to sharpen the pencil on this grammar, but didn't want to get into yak shaving. But... recently I became aware of pegjs, which seems a pretty complete implementation of PEG parser generation. I have most of the grammar specified, but am now left wondering whether this is an appropriate use of a full-blown parser.
My specific questions are:
As my application relies on translating what I can to HTML and leaving the rest as raw text, does implementing bbcode using a parser that can fail on a syntax error make sense? For example: [url=/foo/bar]click me![/url] would certainly be expected to succeed once the closing bracket on the close tag is entered. But what would the user see in the meantime? With regex, I can just ignore non-matching stuff and treat it as normal text for preview purposes. With a formal grammar, I don't know whether this is possible because I am relying on creating the HTML from a parse tree and what fails a parse is ... what?
I am unclear where the transformations should be done. In a formal lex/yacc-based parser, I would have header files and symbols that denoted the node type. In pegjs, I get nested arrays with the node text. I can emit the translated code as an action of the pegjs generated parser, but it seems like a code smell to combine a parser and an emitter. However, if I call PEG.parse.parse(), I get back something like this:
[
  [
    "[",
    "img",
    "",
    [
      "/",
      "f",
      "o",
      "o",
      "/",
      "b",
      "a",
      "r"
    ],
    "",
    "]"
  ],
  [
    "[/",
    "img",
    "]"
  ]
]
given a grammar like:
document
  = (open_tag / close_tag / new_line / text)*
open_tag
  = ("[" tag_name "="? tag_data? tag_attributes? "]")
close_tag
  = ("[/" tag_name "]")
text
  = non_tag+
non_tag
  = [^\n\[\]]
new_line
  = ("\r\n" / "\n")
I'm abbreviating the grammar, of course, but you get the idea. As you can see, there is no contextual information in the array of arrays that tells me what kind of node I have, so I'm left to do the string comparisons again even though the parser has already done this. I expect it's possible to define callbacks and use actions to run them during a parse, but there is scant information available on the Web about how one might do that.
Am I barking up the wrong tree? Should I fall back to regex scanning and forget about parsing?
Thanks
First question (grammar for incomplete texts):
You can add
incomplete_tag = ("[" tag_name "="? tag_data? tag_attributes?)
// the closing bracket is omitted ---^
after open_tag and change document to include an incomplete tag at the end. The trick is that you provide the parser with all needed productions to always parse, but the valid ones come first. You then can ignore incomplete_tag during the live preview.
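To make the "valid ones come first" idea concrete, here is a minimal sketch in plain JavaScript rather than pegjs itself. The matcher names (openTag, closeTag, text, incomplete), the regular expressions, and the node shapes are my own assumptions for illustration and are not taken from the grammar above. The only point is that the alternatives are tried in order and the last one always succeeds, so the parse never fails on half-typed input.

```javascript
// Plain-JavaScript illustration of ordered choice with a catch-all fallback.
// Each matcher returns { node, next } on success or null on failure.
function openTag(src, pos) {
  const m = /^\[([a-z]+)(?:=([^\]\[\n]*))?\]/.exec(src.slice(pos));
  return m && { node: { type: "open_tag", name: m[1], data: m[2] || "" }, next: pos + m[0].length };
}

function closeTag(src, pos) {
  const m = /^\[\/([a-z]+)\]/.exec(src.slice(pos));
  return m && { node: { type: "close_tag", name: m[1] }, next: pos + m[0].length };
}

function text(src, pos) {
  const m = /^[^\[\]]+/.exec(src.slice(pos));
  return m && { node: { type: "text", value: m[0] }, next: pos + m[0].length };
}

// Fallback: one stray bracket plus whatever plain text follows it.
// It is tried last, so it only fires when no valid production matched.
function incomplete(src, pos) {
  const m = /^[\[\]][^\[\]]*/.exec(src.slice(pos));
  return m && { node: { type: "incomplete", value: m[0] }, next: pos + m[0].length };
}

function parseDocument(src) {
  const alternatives = [openTag, closeTag, text, incomplete]; // valid ones first
  const nodes = [];
  let pos = 0;
  while (pos < src.length) {
    let hit = null;
    for (const alt of alternatives) {
      hit = alt(src, pos);
      if (hit) break;
    }
    nodes.push(hit.node); // some alternative always matches at any position
    pos = hit.next;
  }
  return nodes;
}

// The close tag is still being typed, so it comes back as "incomplete".
console.log(parseDocument("[url=/foo/bar]click me![/url").map(n => n.type));
// → [ 'open_tag', 'text', 'incomplete' ]
```

For the live preview you would render "incomplete" nodes as raw text and everything else as HTML.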
Second question (how to include actions):
You write so-called actions after expressions. An action is JavaScript code enclosed in braces and is allowed after any pegjs expression, i.e. also in the middle of a production!
In practice, actions like { return result.join("") } are almost always necessary because pegjs returns repeated matches as arrays of single characters. Complicated nested arrays can be returned as well. Therefore I usually write helper functions in the pegjs initializer at the head of the grammar to keep actions small. If you choose the function names carefully, the actions are self-documenting.
For an example, see PEG for Python style indentation. Disclaimer: this is an answer of mine.
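As a rough sketch of what such helpers might look like: the names joinChars and makeNode are hypothetical, and the node shape is just one possible choice. In a real grammar the two functions would sit inside the initializer block at the head of the grammar, and an action would call them. Here they are shown as ordinary JavaScript, fed with the nested array from the question, so the snippet can be run on its own.

```javascript
// Hypothetical helpers; in a pegjs grammar they would live in the initializer.

// Flatten arbitrarily nested arrays of strings into one string.
function joinChars(parts) {
  if (parts === null || parts === undefined) return "";
  return Array.isArray(parts) ? parts.map(joinChars).join("") : String(parts);
}

// Build a tagged node so later stages need not re-compare strings.
function makeNode(type, fields) {
  return Object.assign({ type: type }, fields);
}

// The raw match for the open tag as shown in the question's parser output.
const raw = ["[", "img", "", ["/", "f", "o", "o", "/", "b", "a", "r"], "", "]"];

// What an open_tag action could return instead of the raw array.
const node = makeNode("open_tag", { name: raw[1], data: joinChars(raw[3]) });
console.log(node.type, node.name, node.data);
```

With nodes tagged like this, the emitter can switch on node.type and stays separate from the parser.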
Regarding your first question, I have to say that a live preview is going to be difficult. You are right that the parser won't understand that the input is a work in progress. Peg.js tells you at which point the error is, so maybe you could take that information, go a few words back and parse again, or, if an end tag is missing, try adding it at the end.
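A minimal sketch of the "go a few words back and parse again" idea, under some assumptions: parse stands for whatever function wraps your generated parser and returns HTML or throws, and the error is assumed to carry its position either as err.offset or as err.location.start.offset (which of the two you get depends on the Peg.js version, so check yours). The function names are made up for this example.

```javascript
// Escape text so the unparsed tail can be shown safely in the preview.
function escapeHtml(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Read the error position if the thrown error carries one (assumed shapes).
function errorOffset(err) {
  const loc = err && err.location && err.location.start;
  const value = loc ? loc.offset : err && err.offset;
  return typeof value === "number" ? value : null;
}

// Parse the longest prefix that succeeds, backing off one word at a time,
// and append the rest of the input as escaped raw text.
function previewWithFallback(parse, input) {
  let cut = input.length;
  while (cut > 0) {
    try {
      return parse(input.slice(0, cut)) + escapeHtml(input.slice(cut));
    } catch (err) {
      const offset = errorOffset(err);
      // Start from the reported position if it is earlier, never move forward.
      const from = Math.min(cut, offset === null ? cut : offset) - 1;
      cut = from < 0 ? 0 : Math.max(input.lastIndexOf(" ", from), 0);
    }
  }
  return escapeHtml(input); // nothing parsed: show everything as plain text
}
```

This re-parses on every failure, so for long inputs you may want to cap the number of retries.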
The second part of your question is easier but your grammar won't look so nice afterwards. Basically what you do is put callbacks on every rule, so for example
text
  = text:non_tag+ {
      // we captured the text in an array and can manipulate it now
      return text.join("");
    }
At the moment you have to write these callbacks inline in your grammar. I'm doing a lot of this stuff at work right now, so I might make a pull request to peg.js to fix that. But I'm not sure when I'll find the time to do this.
Try something like this replacement rule. You're on the right track; you just have to tell it to assemble the results.
text
  = result:non_tag+ { return result.join(''); }