First of all, I'd like to say that I'm not trying to start a discussion on what is the best coding style.
Rather, I was wondering what is actually the global standard when it comes to styling your code. I've seen different websites and mainly open source organisations which have their own guideline page, which for example says that you should put } else { on the same line.
Are there some (un)written rules concerning code style which apply to all JavaScript being written? Is there a common preference for specific coding styles? Or is this really on a per-organisation basis?
These are widely accepted*:
Variable names contain only characters a-zA-Z_ (and sometimes $0-9)
Indent by 4 spaces or a tab character (Never mix!)
Constructor functions begin with an uppercase letter
Terminate every statement with a semicolon
Egyptian bracing
always use blocks in after if, else, etc., even for a single statement
One space after a comma, no space before
Assignment/comparison operators are surrounded by spaces
Avoid lines containing multiple statements
Use ' as a string delimiter
From my experience, most conventions are subject to heated discussions.
So, no, there is no general rule. Some people even try to completely avoid semicolons
* or are they? ;)
There isn't one standard. Are there any guidelines out there that you can follow if you want to keep your code consistent? How about google's coding style? http://google-styleguide.googlecode.com/svn/trunk/javascriptguide.xml
We use that as basic guidelines at our company
Douglas Crockford's JavaScript: The Good Parts is widely used as a basis for coding guidelines.
His JSLint tool can be used to check whether code meets his recommendations.
Standard is the new standard.
I've been using it in all my projects.
Related
In the past I've always used underscores for defining class and id attributes in HTML. Over the last few years I changed over to dashes, mostly to align myself with the trend in the community, not necessarily because it made sense to me.
I've always thought dashes have more drawbacks, and I don't see the benefits:
Code completion & Editing
Most editors treat dashes as word separators, so I can't tab through to the symbol I want. Say the class is "featured-product", I have to auto-complete "featured", enter a hyphen, and complete "product".
With underscores "featured_product" is treated as one word, so it can be filled in one step.
The same applies to navigating through the document. Jumping by words or double-clicking on class names is broken by hyphens.
(More generally, I think of classes and ids as tokens, so it doesn't make sense to me that a token should be so easily splittable on hyphens.)
Ambiguity with arithmetic operator
Using dashes breaks object-property access to form elements in JavaScript. This is only possible with underscores:
form.first_name.value='Stormageddon';
(Admittedly I don't access form elements this way myself, but when deciding on dashes vs underscores as a universal rule, consider that someone might.)
Languages like Sass (especially throughout the Compass framework) have settled on dashes as a standard, even for variable names. They originally used underscores in the beginning too. The fact that this is parsed differently strikes me as odd:
$list-item-10
$list-item - 10
Inconsistency with variable naming across languages
Back in the day, I used to write underscored_names for variables in PHP, ruby, HTML/CSS, and JavaScript. This was convenient and consistent, but again in order to "fit in" I now use:
dash-case in HTML/CSS
camelCase in JavaScript
underscore_case in PHP and ruby
This doesn't really bother me too much, but I wonder why these became so misaligned, seemingly on purpose. At least with underscores it was possible to maintain consistency:
var featured_product = $('#featured_product'); // instead of
var featuredProduct = $('#featured-product');
The differences create situations where we have to translate strings unnecessarily, along with the potential for bugs.
So I ask: Why did the community almost universally settle on dashes, and are there any reasons that outweigh underscores?
There is a related question from back around the time this started, but I'm of the opinion that it's not (or shouldn't have been) just a matter of taste. I'd like to understand why we all settled on this convention if it really was just a matter of taste.
Code completion
Whether dash is interpreted as punctuation or as an opaque identifier depends on the editor of choice, I guess. However, as a personal preference, I favor being able to tab between each word in a CSS file and would find it annoying if they were separated with underscore and there were no stops.
Also, using hyphens allows you to take advantage of the |= attribute selector, which selects any element containing the text, optionally followed by a dash:
span[class|="em"] { font-style: italic; }
This would make the following HTML elements have italic font-style:
<span class="em">I'm italic</span>
<span class="em-strong">I'm italic too</span>
Ambiguity with arithmetic operator
I'd say that access to HTML elements via dot notation in JavaScript is a bug rather than a feature. It's a terrible construct from the early days of terrible JavaScript implementations and isn't really a great practice. For most of the stuff you do with JavaScript these days, you'd want to use CSS Selectors for fetching elements from the DOM anyway, which makes the whole dot notation rather useless. Which one would you prefer?
var firstName = $('#first-name');
var firstName = document.querySelector('#first-name');
var firstName = document.forms[0].first_name;
I find the two first options much more preferable, especially since '#first-name' can be replaced with a JavaScript variable and built dynamically. I also find them more pleasant on the eyes.
The fact that Sass enables arithmetic in its extensions to CSS doesn't really apply to CSS itself, but I do understand (and embrace) the fact that Sass follows the language style of CSS (except for the $ prefix of variables, which of course should have been #). If Sass documents are to look and feel like CSS documents, they need to follow the same style as CSS, which uses dash as a delimiter. In CSS3, arithmetic is limited to the calc function, which goes to show that in CSS itself, this isn't an issue.
Inconsistency with variable naming across languages
All languages, being markup languages, programming languages, styling languages or scripting languages, have their own style. You will find this within sub-languages of language groups like XML, where e.g. XSLT uses lower-case with hyphen delimiters and XML Schema uses camel-casing.
In general, you will find that adopting the style that feels and looks most "native" to the language you're writing in is better than trying to shoe-horn your own style into every different language. Since you can't avoid having to use native libraries and language constructs, your style will be "polluted" by the native style whether you like it or not, so it's pretty much futile to even try.
My advice is to not find a favorite style across languages, but instead make yourself at home within each language and learn to love all of its quirks. One of CSS' quirks is that keywords and identifiers are written in lowercase and separated by hyphens. Personally, I find this very visually appealing and think it fits in with the all-lowercase (although no-hyphen) HTML.
Perhaps a key reason why the HTML/CSS community aligned itself with dashes instead of underscores is due to historical deficiencies in specs and browser implementations.
From a Mozilla doc published March 2001 # https://developer.mozilla.org/en-US/docs/Underscores_in_class_and_ID_Names
The CSS1 specification, published in its final form in 1996, did not
allow for the use of underscores in class and ID names unless they
were "escaped." An escaped underscore would look something like this:
p.urgent\_note {color: maroon;}
This was not well supported by browsers at the time, however, and the
practice has never caught on. CSS2, published in 1998, also forbade
the use of underscores in class and ID names. However, errata to the
specification published in early 2001 made underscores legal for the
first time. This unfortunately complicated an already complex
landscape.
I generally like underscores but the backslash just makes it ugly beyond hope, not to mention the scarce support at the time. I can understand why developers avoided it like the plague. Of course, we don't need the backslash nowadays, but the dash-etiquette has already been firmly established.
I don't think anyone can answer this definitively, but here are my educated guesses:
Underscores require hitting the Shift key, and are therefore harder to type.
CSS selectors which are part of the official CSS specifications use dashes (such as pseudo-classes like :first-child and pseudo-elements :first-line), not underscores. Same thing for properties, e.g. text-decoration, background-color, etc. Programmers are creatures of habit. It makes sense that they would follow the standard's style if there's no good reason not to.
This one is further out on the ledge, but... Whether it's myth or fact, there is a longstanding idea that Google treats words separated by underscores as a single word, and words separated by dashes as separate words. (Matt Cutts on Underscores vs. Dashes.) For this reason, I know that my preference now for creating page URLs is to use-words-with-dashes, and for me at least, this has bled into my naming conventions for other things, like CSS selectors.
There are many reasons, but one of the most important thing is maintaining consistency.
I think this article explains it comprehensively.
CSS is a hyphen-delimited syntax. By this I mean we write things like font-size, line-height, border-bottom etc.
So:
You just shouldn’t mix syntaxes: it’s inconsistent.
There's been a clear uptick in hyphen-separated, whole-word segments of URLs over recent years. This is encouraged by SEO best practices. Google explicitly "recommend that you use hyphens (-) instead of underscores (_) in your URLs": http://www.google.com/support/webmasters/bin/answer.py?answer=76329.
As noted, different conventions have prevailed at different times in different contexts, but they typically are not a formal part of any protocol or framework.
My hypothesis, then, is that Google's position anchors this pattern within one key context (SEO), and the trend to use this pattern in class, id, and attribute names is simply the herd moving slowly in this general direction.
I think it's a programmer dependent thing. Someones like to use dashes, others use underscores.
I personally use underscores (_) because I use it in other places too. Such as:
- JavaScript variables (var my_name);
- My controller actions (public function view_detail)
Another reason that I use underscores, is this that in most IDEs two words separated by underscores are considered as 1 word. (and are possible to select with double_click).
point of refactoring only btn to bt
case: btn_pink
search btn in word
result btn
case: btn-pink
search btn in word
result btn | btn-pink
case: btn-pink
search btn in regexp
\bbtn\b(?!-) type to hard
result btn
I've seen regex patterns that use explicitly numbered repetition instead of ?, * and +, i.e.:
Explicit Shorthand
(something){0,1} (something)?
(something){1} (something)
(something){0,} (something)*
(something){1,} (something)+
The questions are:
Are these two forms identical? What if you add possessive/reluctant modifiers?
If they are identical, which one is more idiomatic? More readable? Simply "better"?
To my knowledge they are identical. I think there maybe a few engines out there that don't support the numbered syntax but I'm not sure which. I vaguely recall a question on SO a few days ago where explicit notation wouldn't work in Notepad++.
The only time I would use explicitly numbered repetition is when the repetition is greater than 1:
Exactly two: {2}
Two or more: {2,}
Two to four: {2,4}
I tend to prefer these especially when the repeated pattern is more than a few characters. If you have to match 3 numbers, some people like to write: \d\d\d but I would rather write \d{3} since it emphasizes the number of repetitions involved. Furthermore, down the road if that number ever needs to change, I only need to change {3} to {n} and not re-parse the regex in my head or worry about messing it up; it requires less mental effort.
If that criteria isn't met, I prefer the shorthand. Using the "explicit" notation quickly clutters up the pattern and makes it hard to read. I've worked on a project where some developers didn't know regex too well (it's not exactly everyone's favorite topic) and I saw a lot of {1} and {0,1} occurrences. A few people would ask me to code review their pattern and that's when I would suggest changing those occurrences to shorthand notation and save space and, IMO, improve readability.
I can see how, if you have a regex that does a lot of bounded repetition, you might want to use the {n,m} form consistently for readability's sake. For example:
/^
abc{2,5}
xyz{0,1}
foo{3,12}
bar{1,}
$/x
But I can't recall ever seeing such a case in real life. When I see {0,1}, {0,} or {1,} being used in a question, it's virtually always being done out of ignorance. And in the process of answering such a question, we should also suggest that they use the ?, * or + instead.
And of course, {1} is pure clutter. Some people seem to have a vague notion that it means "one and only one"--after all, it must mean something, right? Why would such a pathologically terse language support a construct that takes up a whole three characters and does nothing at all? Its only legitimate use that I know of is to isolate a backreference that's followed by a literal digit (e.g. \1{1}0), but there are other ways to do that.
They're all identical unless you're using an exceptional regex engine. However, not all regex engines support numbered repetition, ? or +.
If all of them are available, I'd use characters rather than numbers, simply because it's more intuitive for me.
They're equivalent (and you'll find out if they're available by testing your context.)
The problem I'd anticipate is when you may not be the only person ever needing to work with your code.
Regexes are difficult enough for most people. Anytime someone uses an unusual syntax, the question
arises: "Why didn't they do it the standard way? What were they thinking that I'm missing?"
So CodeMirror uses modes to tokenise its code.
It breaks up the document into lines and makes each line a stream, which is then put through into the pre-defined mode. It can span multiple lines by using its state parameter.
It seems ACE has a similar method.
Neither of these methods use RegExp inherently (but obviously whomever creates the mode can code in RegExp into their mode).
From what I've read of Atom's code and style, is that it calls different syntax highlighters grammars and they resemble closely the grammars from TextMate.
These grammars resemble JSON objects which contain classnames and RegExps (see how to write a TextMate grammar).
I can't figure out for the life of me how exactly Atom Text Editor actually performs the parsing of code, keeping its state and also extending through various scopes.
If someone could point me in the right direction that would be great.
You're probably better of asking your question in the Atom forums, since they are frequented by the Atom developers.
The question was answered here.
Atom uses its first-mate module, which relies on oniguruma for parsing Regular Expressions.
Python and JavaScript both allow developers to use or to omit semicolons. However, I've often seen it suggested (in books and blogs) that I should not use semicolons in Python, while I should always use them in JavaScript.
Is there a technical difference between how the languages use semicolons or is this just a cultural difference?
Semicolons in Python are totally optional (unless you want to have multiple statements in a single line, of course). I personally think Python code with semicolons at the end of every statement looks very ugly.
Now in Javascript, if you don't write a semicolon, one is automatically inserted1 at the end of line. And this can cause problems. Consider:
function add(a, b) {
return
a + b
}
You'd think this returns a + b, but Javascript just outsmarted you and sees this as:
function add() {
return;
a + b;
}
Returning undefined instead.
1 See page 27, item 7.9 - Automatic Semicolon Insertion on ECMAScript Language Specification for more details and caveats.
This had me confused for the longest time. I thought it was just a cultural difference, and that everyone complaining about semicolon insertion being the worst feature in the language was an idiot. The oft-repeated example from NullUserException's answer didn't sway me because, disregarding indentation, Python behaves the same as JavaScript in that case.
Then one day, I wrote something vaguely like this:
alert(2)
(x = $("#foo")).detach()
I expected it to be interpreted like this:
alert(2);
(x = $("#foo")).detach();
It was actually interpreted like this:
alert(2)(x = $("#foo")).detach();
I now use semicolons.
JavaScript will only1 treat a newline as a semicolon in these cases:
It's a syntax error not to.
The newline is between the throw or return keyword and an expression.
The newline is between the continue or break keyword and an identifier.
The newline is between a variable and a postfix ++ or -- operator.
This leaves cases like this where the behaviour is not what you'd expect. Some people2 have adopted conventions that only use semicolons where necessary. I prefer to follow the standard convention of always using them, now that I know it's not pointless.
1 I've omitted a few minor details, consult ECMA-262 5e Section 7.9 for the exact description.
2 Twitter Bootstrap is one high-profile example.
Aside from the syntactical issues, it is partly cultural. In Python culture any extraneous characters are an anathema, and those that are not white-space or alphanumeric, doubly so.
So things like leading $ signs, semi-colons, and curly braces, are not liked. What you do in your code though, is up to you, but to really understand a language it is not enough just to learn the syntax.
JavaScript is designed to "look like C", so semicolons are part of the culture. Python syntax is different enough to not make programmers feel uncomfortable if the semicolons are "missing".
The answer why you don't see them in Python code is: no one needs them, and the code looks cleaner without them.
Generally speaking, semicolons is just a tradition. Many new languages have just dropped them for good (take Python, Ruby, Scala, Go, Groovy, and Io for example). Programmers don't need them, and neither do compilers. If a language lets you not type an extra character you never needed, you will want to take advantage of that, won't you?
It's just that JavaScript's attempt to drop them wasn't very successful, and many prefer the convention to always use them, because that makes code less ambiguous.
It is mostly that Python looks nothing like Java, and JavaScript does, which leads people to treat it that way. It is very simple to not get into trouble using semicolons with JavaScript (Semicolons in JavaScript are optional), and anything else is FUD.
Both are dynamic typing to increase the readability.
Python Enhancement Proposal 8, or PEP 8, is a style guide for Python code. In 2001, Guido van Rossum, Barry Warsaw, and Nick Coghlan created PEP 8 to help Python programmers write consistent and readable code. Reference.
So in JavaScript we have the ECMAScript specification that describes how, if a statement is not explicitly terminated with a semicolon, sometimes a semicolon will be automatically inserted by the JavaScript engine (called “automatic semicolon insertion” (ASI)). Reference.
See this article from Google talking about JavaScript too.
Is it better for sake of compatibility/JIT-Compiling performance to use UPPER CASE or lower case, in JS/HTML? For eg:
<DIV> my content </DIV>
<div> my content </div>
ALERT(DOCUMENT.LOCATION);
alert(document.location);
This is not a newbie question, I know lowercase is the de-facto standard. But since I've seen some uppercase JS+HTML, I was wondering which would be better to write in. (like SQL is fully uppercase?)
I don't think it'd make a difference, speed-wise.
XHTML: Lowercase tags is what the W3C specified.
JavaScript: It probably wouldn't work, because I have never seen anyone's code use all caps in JS.
SQL is fully uppercase to differentiate the actions, functions, etc from the actual data. You can use lowercase, but it becomes less readable (to some, me included).
IMO, wading through a heap of uppercase tags is less readable than lowercase tags. I'd say user agents don't care what case the tags are. Here's a little bit of history: when I made a website in 1999, uppercase tags were the standard.
You can still find some dodgy non updated websites out there that still write
'Use <B></B> to make text bold'
It is incorrect (in xhtml, at least) to use <DIV>...</DIV>; it is a <div>...</div>.
Likewise, I'd use lower-case in the javascript (for alert(document.location);), as that is their names ;-p
I can't imagine it making any difference compatibility- or performance-wise. I think some people find UPPERCASE easier to recognize as markup or code rather than content.
You can do some benchmarks, if you must.
(XHTML specifies lowercase as the standard, so if your goal is satisfying validators, then go with that)
JavaScript is (using Fx3.0) case sensitive.
var myURL = document.URL; // sets myURL to the current URL
var myURL2 = DOCUMENT.URL; // ReferenceError: "DOCUMENT" is not defined
HTML allows mixed-case tags, XHTML demands lower-case only tags, attributes.
It certainly makes a difference with javascript since it is case sensitive.
The accepted community standard for html is lowercase, even though browser don't care.
So be nice to the ones who have to read your code later!
I'd definitely go with lowercase wherever possible. I tend to like camel casing multi-word variable names, but even that can be eschewed in favor of underscores.