Standard boolean order of operation - javascript

I'm writing a shunting yard algorithm in Javascript for boolean logic, and I'm running into a hitch with order of operations. The operations that I allow are:
and, or, implies, equals(biconditional), not, xor, nor, nand
However, I don't know what the precedence is for these. As of now, I have:
not>equals>implies>xor>nor>nand>or>and
Is this correct? Is there any standard I can use, similar to the PEMDAS/BODMAS system for numbers?

The reason you are having such a hard time finding a definition of precedence of those operators for JavaScript is that:
Precedence only comes into play when using infix notation. Since you mention the shunting yard algorithm, I take it you intend to use infix notation.
Each language can define its own precedence, and since you are creating a DSL, you create the precedence, but it must be consistent.
Those names are really prefix function names, and infix is more common with symbols for operators than with names. You should be using operators and not function names:
and &
or |
implies →
equals (biconditional) ↔
not !
xor ⊕
nor ⊽
nand ⊼
When parsing, you convert infix to prefix or postfix, so the operator symbols should change to function names if you are building an intermediate form such as an AST.
You didn't mention associativity, which you will need for NOT.
There appears to be no standard as noted by the differences between these two respectable sources.
From "Foundations of Computer Science" by Alfred V. Aho and Jeffrey D. Ullman
Associativity and Precedence of Logical Operators
The order of precedence we shall use is
1. NOT (highest)
2. NAND
3. NOR
4. AND
5. OR
6. IMPLIES
7. BICONDITIONAL(lowest)
From Mathematica
NOT
AND
NAND
XOR
OR
NOR
EQUIVALENT
IMPLIES
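
Because there is no standard, one option is to keep the precedence in a table that the shunting yard consults, so the order can be changed without touching the algorithm. The sketch below is only an illustration: the numeric levels loosely follow the Mathematica ordering listed above, and the associativity choices and the names OPS and toPostfix are assumptions made for the example, not taken from either source.

```javascript
// Hypothetical precedence table: a higher number binds tighter.
// Swap the numbers to try a different convention.
const OPS = new Map([
  ["!", { prec: 8, assoc: "right", arity: 1 }], // not
  ["&", { prec: 7, assoc: "left", arity: 2 }],  // and
  ["⊼", { prec: 6, assoc: "left", arity: 2 }],  // nand
  ["⊕", { prec: 5, assoc: "left", arity: 2 }],  // xor
  ["|", { prec: 4, assoc: "left", arity: 2 }],  // or
  ["⊽", { prec: 3, assoc: "left", arity: 2 }],  // nor
  ["↔", { prec: 2, assoc: "left", arity: 2 }],  // equals (biconditional)
  ["→", { prec: 1, assoc: "right", arity: 2 }], // implies
]);

// Convert an array of infix tokens to postfix (reverse Polish) order.
function toPostfix(tokens) {
  const output = [];
  const stack = [];
  for (const tok of tokens) {
    const op = OPS.get(tok);
    if (tok === "(") {
      stack.push(tok);
    } else if (tok === ")") {
      while (stack.length > 0 && stack[stack.length - 1] !== "(") {
        output.push(stack.pop());
      }
      if (stack.length === 0) throw new SyntaxError("Unbalanced parentheses");
      stack.pop(); // discard the matching "("
    } else if (op === undefined) {
      output.push(tok); // anything that is not an operator is an operand
    } else {
      // A prefix unary operator has no left operand, so it never pops.
      if (op.arity === 2) {
        while (stack.length > 0) {
          const top = OPS.get(stack[stack.length - 1]);
          if (top === undefined) break; // reached "("
          const popTop =
            top.prec > op.prec ||
            (top.prec === op.prec && op.assoc === "left");
          if (!popTop) break;
          output.push(stack.pop());
        }
      }
      stack.push(tok);
    }
  }
  while (stack.length > 0) {
    const tok = stack.pop();
    if (tok === "(") throw new SyntaxError("Unbalanced parentheses");
    output.push(tok);
  }
  return output;
}

console.log(toPostfix(["a", "|", "b", "&", "!", "c"]).join(" "));
```

With this table, AND binds tighter than OR, so the input above is read as a | (b & (!c)). Changing a single prec value is enough to experiment with one of the other orderings.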

It seems there is no standard.
I have a book (Digital Design by Morris Mano) that says NOT>AND>OR. This is the accepted opinion.
For the rest, I've found only a few opinions.
One author thinks EQUIV is the lowest (Wikipedia agrees), but another thinks EQUIV is in the middle, XOR>EQUIV>OR (with few references).
Another disagreement is about where XOR goes. Here a third author agrees with the second :)
In short, two opinions:
1) NOT>AND>NAND>XOR>EQUIV>OR>NOR (ignoring NOR)
2) NOT>AND>NAND>NOR>OR>IMPLIES>XOR>EQUIV
Note: only the NOT>AND>OR part is academically established.


Comma-separated list of integers with max 5 elements [duplicate]

I've seen regex patterns that use explicitly numbered repetition instead of ?, * and +, i.e.:
Explicit            Shorthand
(something){0,1}    (something)?
(something){1}      (something)
(something){0,}     (something)*
(something){1,}     (something)+
The questions are:
Are these two forms identical? What if you add possessive/reluctant modifiers?
If they are identical, which one is more idiomatic? More readable? Simply "better"?
To my knowledge they are identical. I think there may be a few engines out there that don't support the numbered syntax, but I'm not sure which. I vaguely recall a question on SO a few days ago where explicit notation wouldn't work in Notepad++.
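
A quick way to check the equivalence in a given engine is to run both forms over the same inputs and compare the results. The sketch below does this for JavaScript only; the sample strings and the names pairs, samples and sameMatches are made up for the example. JavaScript has lazy (reluctant) quantifiers but no possessive ones, so only the lazy variant is covered.

```javascript
// Each pair holds an explicit form and the shorthand it should equal.
const pairs = [
  [/^(ab){0,1}/, /^(ab)?/],
  [/^(ab){0,}/, /^(ab)*/],
  [/^(ab){1,}/, /^(ab)+/],
  [/^(ab){1,}?/, /^(ab)+?/], // lazy versions
  [/^(ab){0,}?/, /^(ab)*?/],
];

const samples = ["", "ab", "abab", "ababab", "xab"];

// Two patterns agree on a string if exec() yields the same match
// text and the same captured groups (or null for both).
function sameMatches(explicit, shorthand) {
  return samples.every(
    (s) => JSON.stringify(explicit.exec(s)) === JSON.stringify(shorthand.exec(s))
  );
}

const allEquivalent = pairs.every(([explicit, shorthand]) =>
  sameMatches(explicit, shorthand)
);
console.log(allEquivalent);
```

This only samples a handful of inputs, so it is a sanity check for one engine rather than a proof.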
The only time I would use explicitly numbered repetition is when the repetition is greater than 1:
Exactly two: {2}
Two or more: {2,}
Two to four: {2,4}
I tend to prefer these especially when the repeated pattern is more than a few characters. If you have to match 3 numbers, some people like to write: \d\d\d but I would rather write \d{3} since it emphasizes the number of repetitions involved. Furthermore, down the road if that number ever needs to change, I only need to change {3} to {n} and not re-parse the regex in my head or worry about messing it up; it requires less mental effort.
If that criterion isn't met, I prefer the shorthand. Using the "explicit" notation quickly clutters up the pattern and makes it hard to read. I've worked on a project where some developers didn't know regex too well (it's not exactly everyone's favorite topic) and I saw a lot of {1} and {0,1} occurrences. A few people would ask me to code review their pattern, and that's when I would suggest changing those occurrences to shorthand notation to save space and, IMO, improve readability.
I can see how, if you have a regex that does a lot of bounded repetition, you might want to use the {n,m} form consistently for readability's sake. For example:
/^
abc{2,5}
xyz{0,1}
foo{3,12}
bar{1,}
$/x
But I can't recall ever seeing such a case in real life. When I see {0,1}, {0,} or {1,} being used in a question, it's virtually always being done out of ignorance. And in the process of answering such a question, we should also suggest that they use the ?, * or + instead.
And of course, {1} is pure clutter. Some people seem to have a vague notion that it means "one and only one"--after all, it must mean something, right? Why would such a pathologically terse language support a construct that takes up a whole three characters and does nothing at all? Its only legitimate use that I know of is to isolate a backreference that's followed by a literal digit (e.g. \1{1}0), but there are other ways to do that.
They're all identical unless you're using an exceptional regex engine. However, not all regex engines support numbered repetition, ? or +.
If all of them are available, I'd use characters rather than numbers, simply because it's more intuitive for me.
They're equivalent (and you'll find out if they're available by testing your context.)
The problem I'd anticipate is when you may not be the only person ever needing to work with your code.
Regexes are difficult enough for most people. Anytime someone uses an unusual syntax, the question arises: "Why didn't they do it the standard way? What were they thinking that I'm missing?"

Javascript: How to see source code implementation of native functions like toString

I am looking to see how some Javascript functions work under the hood. For example, I want to learn how Chrome's V8 Engine implements the Unary (-) operation or the String.prototype.toString() method.
How can I see the native C/C++ implementation? I have seen many answers here linking to the Chromium repository and the V8 repository, but these are giant and not very beginner friendly, and there aren't really any guides anywhere as far as I could find.
I'm looking for something like this:
// Pseudo code
function -(arg) {
return arg * -1
}
Obviously, I understand that I wouldn't find this written in Javascript. I'm just looking for a similar level of detail.
I'm yet to find an answer that concisely shows how to find the native implementation of Javascript functions anywhere. Could someone point me in the right direction?
The ECMA specs here give the following specs for the Unary - operation:
Unary - Operator
The unary - operator converts its operand to Number type and then
negates it. Note that negating +0 produces −0, and negating −0
produces +0.
The production UnaryExpression : - UnaryExpression is evaluated as
follows:
Let expr be the result of evaluating UnaryExpression. Let oldValue be
ToNumber(GetValue(expr)). If oldValue is NaN, return NaN. Return the
result of negating oldValue; that is, compute a Number with the same
magnitude but opposite sign.
This is quite useful, but what I'm trying to understand is how
compute a Number with the same magnitude but opposite sign
is calculated. Is it the number * -1 or something else? Or are there multiple ways?
There is no piece of code or single implementation of individual operators in V8. V8 is a just-in-time compiler that executes all JavaScript by compiling it to native code on the fly. V8 supports about 10 different CPU architectures and for each has 4 tiers of compilers. That already makes 40 different implementations of every operator. In many of those, compilation goes through a long pipeline of stages that transform the input to the actual machine code. And in each case the exact transformation depends on the type information that is available at compile time (collected on previous runs).
To understand what's going on you would need to understand a significant part of V8's complicated architecture, so it is pretty much impossible to answer your question in an SO reply. If you are just interested in the semantics, I'd rather suggest looking at the ECMAScript language definition.
(The snippet you cite is just a helper function for converting compiler-internal type information for unary operators in one of the many stages.)
Edit: the excerpt from the EcmaScript definition you cite in the updated question is the right place to look. Keep in mind that all JavaScript numbers are IEEE floating point numbers. The sentence is basically saying that - just inverts the sign bit of such a number. You'd have to refer to the IEEE 754 standard for more details. Multiplication with -1.0 is a much more complicated operation, but will have the same result in most cases (probably with the exception of NaN operands).
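
The sign-bit description can be made concrete from within JavaScript itself. The sketch below writes a number into an 8-byte buffer, flips the top bit, and reads it back; negateViaSignBit is a hypothetical helper written for this illustration and says nothing about how V8 actually compiles unary minus.

```javascript
// Negate a double by flipping the IEEE 754 sign bit by hand.
// Illustrative only; this is not V8 code.
function negateViaSignBit(x) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, x); // DataView writes big-endian by default
  // In big-endian layout the sign is the top bit of the first byte.
  view.setUint8(0, view.getUint8(0) ^ 0x80);
  return view.getFloat64(0);
}

// Compare the three approaches on a few values, using Object.is so
// that +0 and -0 are told apart.
const values = [0, -0, 1.5, -42, Infinity];
const allAgree = values.every(
  (v) => Object.is(-v, negateViaSignBit(v)) && Object.is(-v, v * -1)
);
console.log(allAgree);
```

For these ordinary values all three routes give the same result, including the +0 / -0 case that the specification calls out.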

figuring out javascript equality operator

While trying to fully understand the difference between equality operator and identity operator, I came across an article at MSDN that explains what they both do, in terms of their inner workings, but I still had a few doubts and decided to create a flowchart so I could have a better picture. Now my question is, is this flowchart correct? or am I missing something?
It's also my understanding that the identity operator (===) would work pretty much the same way, but without attempting to convert A and B to boolean, number or string, in the first step. Is that correct?
You can see the image here too:
Ok here is the real thing, it was a matter of principles ;)
is this flowchart correct?
No. You should use the ECMAScript specification for the Abstract Equality Comparison Algorithm to create the flowchart. ToBoolean is certainly not the first step (it's not used in any step).
or am I missing something?
Yes, a lot.
It's also my understanding that the identity operator (===) would work pretty much the same way, but without attempting to convert A and B to boolean, number or string, in the first step. Is that correct?
The Strict Equality Comparison Algorithm is almost identical to the Abstract Equality Comparison Algorithm. They differ only when the argument Types are different: strict equality then returns false, while abstract equality follows a precise order in which the Types are made equal before the comparison is made.
is this flowchart correct?
No. Apart from being laid out terribly, it is misleading and partially wrong.
Am I missing something?
Yes. The first step, "try to convert A and B to boolean, string or number" is wrong - that's not the first step in the equality comparison algorithm. Also, when to convert which of the variables to which type?
Then, the next step should be a type distinction, instead of repeatedly asking for identical values of a specific type.
The "last" step "Can they (the types) be coerced into any of the last 5 situations? -> Coerce types" lacks detail. All the detail. Which is the most relevant part of the sloppy equality comparison:
Which types can be coerced?
What types would be coerced to which?
How does the coercion of the values work?
And no, after the coercion the algorithm pretty much starts at the beginning, not with the question about strings.
It's also my understanding that the identity operator (===) would work pretty much the same way, but without attempting to convert A and B to boolean, number or string, in the first step.
That first step is not apparent in the actual algorithm, so no. In fact, === works the same except for the last step, which coerces values into other types; instead, false is returned.
Edit: Your second diagram is accurate (correct), although it still features some odd layout decisions.
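
To make the order of the steps concrete, here is a rough JavaScript sketch of the abstract equality algorithm as both answers describe it: a type distinction first, then the specific coercions, then a restart from the top. The names typeOf, toPrimitive and looseEquals are invented for this example, and the sketch is simplified: it ignores symbols and BigInt, and it treats Date objects like any other object.

```javascript
// Type names roughly as the specification uses them.
function typeOf(v) {
  if (v === null) return "null";
  const t = typeof v;
  return t === "function" ? "object" : t;
}

// Simplified object-to-primitive conversion: try valueOf, then toString.
function toPrimitive(obj) {
  for (const name of ["valueOf", "toString"]) {
    if (typeof obj[name] === "function") {
      const result = obj[name]();
      if (typeOf(result) !== "object") return result;
    }
  }
  throw new TypeError("Cannot convert object to primitive value");
}

function looseEquals(x, y) {
  const tx = typeOf(x);
  const ty = typeOf(y);
  // Same type: no coercion at all, behave like ===.
  if (tx === ty) return x === y;
  // null and undefined equal each other and nothing else.
  const xNullish = x === null || x === undefined;
  const yNullish = y === null || y === undefined;
  if (xNullish && yNullish) return true;
  // Number vs string: the string becomes a number.
  if (tx === "number" && ty === "string") return looseEquals(x, Number(y));
  if (tx === "string" && ty === "number") return looseEquals(Number(x), y);
  // A boolean on either side becomes a number, then the algorithm restarts.
  if (tx === "boolean") return looseEquals(Number(x), y);
  if (ty === "boolean") return looseEquals(x, Number(y));
  // Object vs number or string: the object becomes a primitive.
  if ((tx === "number" || tx === "string") && ty === "object") {
    return looseEquals(x, toPrimitive(y));
  }
  if (tx === "object" && (ty === "number" || ty === "string")) {
    return looseEquals(toPrimitive(x), y);
  }
  return false;
}

console.log(looseEquals(true, "1"), looseEquals(null, 0));
```

A strict version would stop right after the first check: if the types differ, return false.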

Is ECMAScript really a dialect of Lisp?

A friend of mine drew my attention to the welcome message of the 4th European Lisp Symposium:
... implementation and application of
any of the Lisp dialects, including
Common Lisp, Scheme, Emacs Lisp,
AutoLisp, ISLISP, Dylan, Clojure,
ACL2, ECMAScript, ...
and then asked if ECMAScript is really a dialect of Lisp. Can it really be considered so? Why?
Is there a well defined and clear-cut set of criteria to help us detect whether a language is a dialect of Lisp? Or is being a dialect taken in a very loose sense (and in that case can we add Python, Perl, Haskell, etc. to the list of Lisp dialects?)
Brendan Eich wanted to do a Scheme-like language for Netscape, but reality intervened and he ended up having to make do with something that looked vaguely like C and Java for "normal" people, but which worked like a functional language.
Personally I think it's an unnecessary stretch to call ECMAScript "Lisp", but to each his own. The key thing about a real Lisp seems like the characteristic that data structure notation and code notation are the same, and that's not true about ECMAScript (or Ruby or Python or any other dynamic functional language that's not Lisp).
Caveat: I have no Lisp credentials :-)
It's not. It's got a lot of functional roots, but so do plenty of other non-lisp languages nowadays, as you pointed out.
Lisps have one remaining characteristic that makes them Lisps, which is that Lisp code is written in terms of Lisp data structures (homoiconicity). This is what enables Lisp's powerful macro system, and why it looks so bizarre to non-Lispers. A function call is just a list, where the first element in the list is the name of the function.
Since lisp code is just lisp data, it's possible to do some extremely powerful stuff with metaprogramming, that just can't be done in other languages. Many lisps, even modern ones like clojure, are largely implemented in themselves as a set of macros.
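
The "a function call is just a list" idea can be imitated in JavaScript with nested arrays standing in for lists. This is a toy sketch, not a Lisp: the names env and evaluate, and the two operators defined, are assumptions for the example. JavaScript's own source code is still not written in this form, which is the point of the homoiconicity argument.

```javascript
// Operators available to the toy evaluator (hypothetical, minimal set).
const env = new Map([
  ["+", (...xs) => xs.reduce((a, b) => a + b, 0)],
  ["*", (...xs) => xs.reduce((a, b) => a * b, 1)],
]);

// Evaluate a nested array whose first element names the operation.
function evaluate(expr) {
  if (!Array.isArray(expr)) return expr; // numbers evaluate to themselves
  const [head, ...args] = expr;
  const fn = env.get(head);
  if (fn === undefined) throw new Error("Unknown operator: " + head);
  return fn(...args.map(evaluate));
}

// The "program" is ordinary data, so another program can rewrite it.
const program = ["+", 1, ["*", 2, 3]];
const rewritten = ["*", ...program.slice(1)]; // swap the outer operator

console.log(evaluate(program), evaluate(rewritten));
```

Here program evaluates to 7, and the rewritten copy, whose outer operator was changed by plain array manipulation, evaluates to 6.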
Even though I wouldn't call JavaScript a Lisp, it is, in my humble opinion, more akin to the Lisp way of doing things than most mainstream languages (even functional ones).
For one, just like Lisp, it's, in essence, a simple, imperative language based on the untyped lambda calculus that is fit to be driven by a REPL.
Second, it's easy to embed literal data (including code in the form of lambda expressions) in JavaScript, since a subset of it is equivalent to JSON. This is a common Lisp pattern.
Third, its model of values and types is very lispy. It's object-oriented in a broad sense of the word in that all values have a concept of identity, but it's not particularly object-oriented in most narrower senses of the word. Just as in Lisp, objects are typed and very dynamic. Code is usually split into units of functions, not classes.
In fact, there are a couple of (more or less) recent developments in the JavaScript world that make the language feel pretty lispy at times. Take jQuery, for example. Embedding CSS selectors as a sublanguage is a pretty Lisp-like approach, in my opinion. Or consider ECMAScript Harmony's metaobject protocol: It really looks like a direct port of Common Lisp's (much more so than either Python's or Ruby's metaobject systems!). The list goes on.
JavaScript does lack macros and a sensible implementation of a REPL with editor integration, which is unfortunate. Certainly, influences from other languages are very much visible as well (and not necessarily in a bad way). Still, there is a significant amount of cultural compatibility between the Lisp and JavaScript camps. Some of it may be coincidental (like the recent rise of JavaScript JIT compilation), some systematic, but it's definitely there.
If you call ECMAScript Lisp, you're basically asserting that any dynamic language is Lisp. Since we already have "dynamic language", you're reducing "Lisp" to a useless synonym for it instead of allowing it to have a more specific meaning.
Lisp should properly refer to a language with certain attributes.
A language is Lisp if:
Its source code is tree-structured data, which has a straightforward printed notation as nested lists. Every possible tree structure has a rendering in the corresponding notation and is susceptible to being given a meaning as a construct; the notation itself doesn't have to be extended to extend the language.
The tree-structured data is a principal data structure in the language itself, which makes programs susceptible to manipulation by programs.
The language has a symbol data type. Symbols have a printed representation which is interned: when two or more instances of the same printed notation for a symbol appear in the notation, they all denote the same object.
A symbol object's principal virtue is that it is different from all other symbols. Symbols are paired with various other entities in various ways in the semantics of Lisp programs, and thereby serve as names for those entities.
For instance, dialects of Lisp typically have variables, just like other languages. In Lisp, variables are denoted by symbols (the objects in memory) rather than textual names. When part of a Lisp program defines some variable a, the syntax for that a is a symbol object and not the character string "a", which is just that symbol's name for the purposes of printing. A reference to the variable, the expression written as a elsewhere in the program, is also an object. Because of the way symbols work, it is the same object; this object sameness then connects the reference to the definition. Object sameness might be implemented as pointer equality at the machine level. We know that two symbol values are the same because they are pointers to the same memory location in the heap (an object of symbol type).
Case in point: the NewLisp dialect which has a non-traditional memory management for most data types, including nested lists, makes an exception for symbols by making them behave in the above way. Without this, it wouldn't be Lisp. Quote: "Objects in newLISP (excluding symbols and contexts) are passed by value copy to other user-defined functions. As a result, each newLISP object only requires one reference." [emphasis mine]. Passing symbols too, as by value copy, would destroy their identity: a function receiving a symbol wouldn't be getting the original one, and therefore not correctly receiving its identity.
Compound expressions in a Lisp language—those which are not simple primaries like numbers or strings—consist of a simple list, whose first element is a symbol indicating the operation. The remaining elements, if any, are argument expressions. The Lisp dialect applies some sort of evaluation strategy to reduce the expression to a value, and evoke any side effects it may have.
I would tentatively argue that lists being made of binary cells that hold pairs of values, terminated by a special empty list object, probably should be considered part of the definition of Lisp: the whole business of being able to make a new list out of an existing one by "consing" a new item to the front, and the easy recursion on the "first" and "rest" of a list, and so on.
And then I would stop right there. Some people believe that Lisp systems have to be interactive: provide an environment with a listener, in which everything is mutable, and can be redefined at any time and so on. Some believe that Lisps have to have first-class functions: that there has to be a lambda operator and so on. Staunch traditionalists might even insist that there have to be car and cdr functions, the dotted pair notation supporting improper lists, and that lists have to be made up of cells, and terminated by specifically the symbol nil denoting the empty list, and also a Boolean false.
The more we shovel into the definition of "Lisp dialect", though, the more it becomes political; people get upset that their favorite dialect (perhaps which they created themselves) is being excluded on some technicality. Insisting on car and cdr allows Scheme to be a Lisp, but nil being the list terminator and false rules it out. What, Scheme not a Lisp?
So, based on the above, ECMAScript isn't a dialect of Lisp. However, an ECMAScript implementation contains functionality which can be exposed as a Lisp dialect, and numerous such dialects have been developed. Someone who wants ECMAScript to be considered a Lisp for some emotional reasons should perhaps be content with that: that the semantics to support Lisp is there, and just needs a suitable interface to that semantics, which can be developed in ECMAScript and which can interoperate with ECMAScript code.
No, it's not.
In order to be considered a Lisp, a language has to be homoiconic, which ECMAScript is not.
Not a 'dialect'. I learned LISP in the 70's and haven't used it since, but when I learned JavaScript recently I found myself thinking it was LISP-like. I think that's due to 2 factors: (1) JSON is a list-like associative structure and (2) it seems as though JS 'objects' are essentially JSON. So even though you don't write JS programs in JSON as you would write LISP in lists, you kind of almost do.
So my answer is that there are enough similarities that programmers familiar with LISP will be reminded of it when they use JavaScript. Statements like JS = LISP in a Java suit are only expressing that feeling. I believe that's all there is to it.
Yes, it is. Quoting Crockford:
"JavaScript has much in common with Scheme. It is a dynamic language. It has a flexible datatype (arrays) that can easily simulate s-expressions. And most importantly, functions are lambdas.
Because of this deep similarity, all of the functions in [recursive programming primer] 'The Little Schemer' can be written in JavaScript."
http://www.crockford.com/javascript/little.html
On the subject of homoiconicity, I would recommend searching that word along with JavaScript. Saying that it is "not homoiconic" is true but not the end of the story.
I think that ECMAScript is a dialect of LISP in the same sense that English is a dialect of French. There are commonalities, but you'll have trouble with assignments in one armed only with knowledge of the other :)
I find it interesting that only one of the three keynote presentations highlighted for the 4th European Lisp Symposium directly concerns Lisp (the other two being about x86/JVM/Python and Scala).
"dialect" is definitely stretching it too far. Still, as someone who has learned and used Python, Javascript, and Scheme, Javascript clearly has a far Lisp-ier feel to it (and Coffeescript probably even more so) than Python.
As for why the European Lisp Symposium would want to portray Javascript as a Lisp, obviously they want to piggyback on the popularity of Javascript, whose programmer population is many, many times larger than that of all the other Lisp dialects in their list combined.

need a JavaScript Regex that requires upper or lowercase letters

I have a regex that right now only allows lowercase letters, I need one that requires either lowercase or uppercase letters:
/(?=.*[a-z])/
You Can’t Get There from Here
I have a regex that right now only allows lowercase letters, I need one that requires either lowercase or uppercase letters: /(?=.*[a-z])/
Unfortunately, it is utterly impossible to do this correctly using Javascript! Read this flavor comparison’s ECMA column for all of what Javascript cannot do.
Theory vs Practice
The proper pattern for lowercase is the standard Unicode derived binary property \p{Lowercase}, and the proper pattern for uppercase is similarly \p{Uppercase}. These are normative properties that sometimes include non-letters in them under certain exotic circumstances.
Using just General Category properties, you can have \p{Ll} for Lowercase_Letter, \p{Lu} for Uppercase_Letter, and \p{Lt} for titlecase letter. (Remember there are three cases in Unicode, not two.) There is a standard alias \p{LC} which means [\p{Lu}\p{Lt}\p{Ll}].
If you want a letter that is not a lowercase letter, you could use (?=\P{Ll})\pL. Written in longhand that’s (?=\P{Lowercase_Letter})\p{Letter}. Again, these miss some of the Other_Lowercase code points that \p{Lowercase} recognizes. I must again stress that the Lowercase property is a superset of the Lowercase_Letter property.
Remember the previous paragraph, swapping in upper everywhere I have written lower, and you get the same thing for the capitals.
Possible Platforms
Because access to these essential properties is the minimal level of critical functionality necessary for Unicode regular expressions, some versions of Javascript implement them in just the way I have written them above. However, the standard for Javascript still does not require them, so you cannot in general count on them. This means that it is impossible to do this correctly under all implementations of Javascript.
Languages in which it is possible to do what you want done minimally include:
C♯ and Java (both only General Categories)
Ruby if and only if v1.9 or better (only binary properties, including General Categories)
PHP and PCRE (only General Category and Script properties plus a couple extras)
ICU’s C++ library and Perl, which both support all Unicode properties
Of those listed above, only those on the last line (ICU and Perl) strictly and completely meet all Level 1 compliance requirements (plus some Levels 2 and 3) for the proper handling of Unicode in regexes. However, all of those I’ve listed in the previous paragraph’s bullets can easily handle most, and quite probably all, of what you need.
Javascript is not amongst those, however. Your version might, though, if you are very lucky and never have to run on a standard-only Javascript platform.
Summary
So very sadly, you cannot really use Javascript regexes for Unicode work unless you have a non-standard extension. Some people do, but most do not. If you do not, you may have to use a different platform until the relevant ECMA standard catches up with the 21st century (Unicode 3.1 came out a decade ago!!).
If anyone knows of a Javascript library that implements the Level 1 requirements of UTS#18 on Unicode Regular Expressions including both RL1.2 “Properties” and RL1.2a “Annex C: Compatibility Properties”, please chime in.
Not sure if you mean mixed-case, or strictly lowercase plus strictly uppercase.
Here's the mixed-case version:
/^[a-zA-Z]+$/
And the strictly one-or-the-other version:
/^([a-z]+|[A-Z]+)$/
Try /(?=.*[a-z])/i
Note the i at the end, this makes the expression case insensitive.
Or add an uppercase range to your regex:
/(?=.*[a-zA-Z])/
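
For comparison, the variants from these answers can be put side by side. The two ASCII patterns come straight from the answers above. The third one is an assumption about your environment: it relies on Unicode property escapes (\p{...} with the u flag), which JavaScript engines implementing ES2018 or later support, and which were not available when the first answer was written. The variable names are made up for the example.

```javascript
// ASCII-only: at least one letter a-z or A-Z somewhere in the string.
const asciiLetter = /(?=.*[a-zA-Z])/;

// Same idea using the case-insensitive flag.
const asciiLetterIgnoreCase = /(?=.*[a-z])/i;

// Unicode-aware: at least one uppercase, lowercase or titlecase letter.
// Assumes an engine with ES2018 Unicode property escapes.
const unicodeCasedLetter = /(?=.*[\p{Lu}\p{Ll}\p{Lt}])/u;

const inputs = ["123", "12A", "12a", "é", "Ω"];
for (const s of inputs) {
  console.log(
    s,
    asciiLetter.test(s),
    asciiLetterIgnoreCase.test(s),
    unicodeCasedLetter.test(s)
  );
}
```

The ASCII patterns reject "é" and "Ω", while the property-escape version accepts them, which is the gap the first answer is concerned with.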
