The JSON spec says that JSON is an object or an array. In the case of an object,
An object structure is represented as a pair of curly brackets
surrounding zero or more name/value pairs (or members). A name is a
string. ...
And later, the spec says that a string is surrounded in quotes.
Why?
Thus,
{"Property1":"Value1","Property2":18}
and not
{Property1:"Value1",Property2:18}
Question 1: why not allow the name in the name/value pairs to be unquoted identifiers?
Question 2: Is there a semantic difference between the two representations above, when evaluated in Javascript?
I leave a quote from a presentation that Douglas Crockford (the creator of the JSON standard) gave to Yahoo.
He talks about how he discovered JSON, and amongst other things why he decided to use quoted keys:
....
That was when we discovered the
unquoted name problem. It turns out
ECMA Script 3 has a whack reserved
word policy. Reserved words must be
quoted in the key position, which is
really a nuisance. When I got around
to formulizing this into a standard, I
didn't want to have to put all of the
reserved words in the standard,
because it would look really stupid.
At the time, I was trying to convince
people: yeah, you can write
applications in JavaScript, it's
actually going to work and it's a good
language. I didn't want to say, then,
at the same time: and look at this
really stupid thing they did! So I
decided, instead, let's just quote the
keys.
That way, we don't have to tell
anybody about how whack it is.
That's why, to this day, keys are quoted in
JSON.
You can find the complete video and transcript here.
Question 1: why not allow the name in the name/value pairs to be unquoted identifiers?
The design philosophy of JSON is "Keep it simple"
"Quote names with "" is a lot simpler than "You may quote names with " or ' but you don't have to, unless they contain certain characters (or combinations of characters that would make it a keyword) and ' or " may need to be quoted depending on what delimiter you selected".
Question 2: Is there a semantic difference between the two representations above, when evaluated in Javascript?
No. In JavaScript they are identical.
Both : and whitespace are permitted in identifiers. Without the quotes, this would cause ambiguity when trying to determine what exactly constitutes the identifier.
In javascript objects can be used like a hash/hashtable with key pairs.
However if your key has characters that javascript could not tokenize as a name, it would fail when trying it access like a property on an object rather than a key.
var test = {};
test["key"] = 1;
test["#my-div"] = "<div> stuff </div>";
// test = { "key": 1, "#my-div": "<div> stuff </div>" };
console.log(test.key); // should be 1
console.log(test["key"]); // should be 1
console.log(test["#my-div"]); // should be "<div> stuff </div>";
console.log(test.#my-div); // would not work.
identifiers can sometimes have characters that can not be evaluated as a token/identifier in javascript, thus its best to put all identifiers in strings for consistency.
If json describes objects, then in practise you get the following
var foo = {};
var bar = 1;
foo["bar"] = "hello";
foo[bar] = "goodbye";
so then,
foo.bar == "hello";
foo[1] == "goodbye" // in setting it used the value of var bar
so even if your examples do produce the same result, their equivalents in "raw code" wouldn't. Maybe that's why?? dunno, just an idea.
I think the right answer to Cheeso's question is that the implementation surpassed the documentation. It no longer requires a string as the key, but rather something else, which can either be a string (ie quoted) or (probably) anything that can be used as a variable name, which I will guess means start with a letter, _, or $, and include only letters, numbers, and the $ and _.
I wanted to simplify the rest for the next person who visits this question with the same idea I did. Here's the meat:
Variable names are not interpolated in JSON when used as an object key (Thanks Friedo!)
Breton, using "identifier" instead of "key", wrote that "if an identifier happens to be a reserved word, it is interpreted as that word rather than as an identifier." This may be true, but I tried it without any trouble:
var a = {do:1,long:2,super:3,abstract:4,var:5,break:6,boolean:7};
a.break
=> 6
About using quotes, Quentin wrote "...but you don't have to, unless [the key] contains certain characters (or combinations of characters that would make it a keyword)"
I found the former part (certain characters) is true, using the # sign (in fact, I think $ and _ are the only characters that don't cause the error):
var a = {a#b:1};
=> Syntax error
var a = {"a#b":1};
a['a#b']
=> 1
but the parenthetical about keywords, as I showed above, isn't true.
What I wanted works because the text between the opening { and the colon, or between the comma and the colon for subsequent properties is used as an unquoted string to make an object key, or, as Friedo put it, a variable name there doesn't get interpolated:
var uid = getUID();
var token = getToken(); // Returns ABC123
var data = {uid:uid,token:token};
data.token
=> ABC123
It may reduce data size if quotes on name are only allowed when necessary
Related
Suppose we have a string with some (astral) Unicode characters:
const s = 'Hi π Unicode!'
The [] operator and .charAt() method don't work for getting the 4th character, which should be "π":
> s[3]
'οΏ½'
> s.charAt(3)
'οΏ½'
The .codePointAt() does get the correct value for the 4th character, but unfortunately it's a number and has to be converted back to a string using String.fromCodePoint():
> String.fromCodePoint(s.codePointAt(3))
'π'
Similarly, converting the string into an array using splats yields valid Unicode characters, so that's another way of getting the 4th one:
> [...s][3]
'π'
But i can't believe that going from string to number back to string, or having to split the string into an array are the only ways of doing this seemingly trivial thing. Isn't there a simple method for doing this?
> s.simpleMethod(3)
'π'
Note: i know that the definition of "character" is somewhat fuzzy, but for the purpose of this question a character is simply the symbol that corresponds to a Unicode codepoint (no combining characters, no grapheme clusters, etc).
Update: the String.fromCodePoint(str.codePointAt(n)) method is not really viable, since the nth position there doesn't take previous astral symbols into account: String.fromCodePoint('ππ'.codePointAt(1)) // => 'οΏ½'
(I feel kinda dumb asking this; like i'm probably missing something obvious. But previous answers to this questions don't work on strings with Unicode simbols on astral planes.)
The string iterator is the only thing that iterates through code points rather than UCS-2/UTF-16 code units. So:
const string = 'Hi π Unicode!';
for (const symbol of string) {
console.log(symbol);
}
So to get a specific code point based on its index from a string:
const string = 'Hi π Unicode!';
// Note: The spread operator uses the string iterator under the hood.
const symbols = [...string];
symbols[3]; // 'π'
Still, this would break with grapheme clusters, or emoji sequences such as π¨βπ©βπ§βπ¦ (π¨ + U+200D ZERO WIDTH JOINER + π© + U+200D ZERO WIDTH JOINER + π§ + U+200D ZERO WIDTH JOINER + π¦). Text segmentation helps with that.
Do you actually need to get the 4th code point in the string, though? Whatβs your use case?
You can use the new u flag to regexp if it's available to you.
const chars = 'Hi π Unicode!'.match(/./ug);
console.log(chars);
The accepted answer to this question is out of date.
There is now a member of the String object called .at()/1 which does exactly what you're hoping for. If you have shims, shams, a transcompiler like TypeScript or Babel, etc, just set whatever your local configuration is, and you should be good to go.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/at
Amusingly, the spec for this feature, as well as the most common implementation shim (the one that I use,) is written by the person who authored the now out-of date accepted answer here. So even when he's out of date, he's still up to date.
If shimming or transcompiling isn't appropriate for you, there's a library called jsesc that can handle it for you through simple escaping. I'll give you three guesses who wrote the library. First two don't count.
https://www.npmjs.com/package/jsesc
According to Crockford's json.org, a JSON object is made up of members, which is made up of pairs.
Every pair is made of a string and a value, with a string being defined as:
A string is a sequence of zero or more
Unicode characters, wrapped in double
quotes, using backslash escapes. A
character is represented as a single
character string. A string is very
much like a C or Java string.
But in practice most programmers don't even know that a JSON key should be surrounded by double quotes, because most browsers don't require the use of double quotes.
Does it make any sense to bother surrounding your JSON in double quotes?
Valid Example:
{
"keyName" : 34
}
As opposed to the invalid:
{
keyName : 34
}
The real reason about why JSON keys should be in quotes, relies in the semantics of Identifiers of ECMAScript 3.
Reserved words cannot be used as property names in Object Literals without quotes, for example:
({function: 0}) // SyntaxError
({if: 0}) // SyntaxError
({true: 0}) // SyntaxError
// etc...
While if you use quotes the property names are valid:
({"function": 0}) // Ok
({"if": 0}) // Ok
({"true": 0}) // Ok
The own Crockford explains it in this talk, they wanted to keep the JSON standard simple, and they wouldn't like to have all those semantic restrictions on it:
....
That was when we discovered the
unquoted name problem. It turns out
ECMA Script 3 has a whack reserved
word policy. Reserved words must be
quoted in the key position, which is
really a nuisance. When I got around
to formulizing this into a standard, I
didn't want to have to put all of the
reserved words in the standard,
because it would look really stupid.
At the time, I was trying to convince
people: yeah, you can write
applications in JavaScript, it's
actually going to work and it's a good
language. I didn't want to say, then,
at the same time: and look at this
really stupid thing they did! So I
decided, instead, let's just quote the
keys.
That way, we don't have to tell
anybody about how whack it is.
That's why, to this day, keys are quoted in
JSON.
...
The ECMAScript 5th Edition Standard fixes this, now in an ES5 implementation, even reserved words can be used without quotes, in both, Object literals and member access (obj.function Ok in ES5).
Just for the record, this standard is being implemented these days by software vendors, you can see what browsers include this feature on this compatibility table (see Reserved words as property names)
Yes, it's invalid JSON and will be rejected otherwise in many cases, for example jQuery 1.4+ has a check that makes unquoted JSON silently fail. Why not be compliant?
Let's take another example:
{ myKey: "value" }
{ my-Key: "value" }
{ my-Key[]: "value" }
...all of these would be valid with quotes, why not be consistent and use them in all cases, eliminating the possibility of a problem?
One more common example in the web developer world: There are thousands of examples of invalid HTML that renders in most browsers...does that make it any less painful to debug or maintain? Not at all, quite the opposite.
Also #Matthew makes the best point of all in comments below, this already fails, unquoted keys will throw a syntax error with JSON.parse() in all major browsers (and any others that implement it correctly), you can test it here.
If I understand the standard correctly, what JSON calls "objects" are actually much closer to maps ("dictionaries") than to actual objects in the usual sense. The current standard easily accommodates an extension allowing keys of any type, making
{
"1" : 31.0,
1 : 17,
1n : "valueForBigInt1"
}
a valid "object/map" of 3 different elements.
If not for this reason, I believe the designers would have made quotes around keys optional for all cases (maybe except keywords).
YAML, which is in fact a superset of JSON, supports what you want to do. ALthough its a superset, it lets you keep it as simple as you want.
YAML is a breath of fresh air and it may be worth your time to have a look at it. Best place to start is here: http://en.wikipedia.org/wiki/YAML
There are libs for every language under the sun, including JS, eg https://github.com/nodeca/js-yaml
var jsn=getAttr(ref,"json-data").toString();
console.log(jsn); //{test: true,stringtest:"hallo"}. it's OK.
JSON.parse(jsn); //Uncaught SyntaxError: Unexpected token s, line: line with JSON.parse;
I think JSON.parse does something not right with this data.. I tried to remove stringtest:"hallo" - no result... PS: also I think that I do something wrong then I have asked this question
At the first time I tried JSON.parse("{"+jsn+"}");.
Your JSON is not properly formatted, as your object keys must be surrounded by quotation marks. The following will work:
var jsn = '{"test": true, "stringtest": "hallo"}';
JSON.parse(jsn);
Edit: The RFC4627, which specifies JSON format, states:
2.2. Objects
An object structure is represented as a pair of curly brackets
surrounding zero or more name/value pairs (or members). A name is a
string. A single colon comes after each name, separating the name
from the value. A single comma separates a value from a following
name. The names within an object SHOULD be unique.
object = begin-object [ member *( value-separator member ) ]
end-object
member = string name-separator value
As you can see, JSON objects are composed of name/value pairs, where a name is a string. Again, the RFC says:
The representation of strings is similar to conventions used in the C
family of programming languages. A string begins and ends with
quotation marks. All Unicode characters may be placed within the
quotation marks except for the characters that must be escaped:
quotation mark, reverse solidus, and the control characters (U+0000
through U+001F).
string = quotation-mark *char quotation-mark
quotation-mark = %x22 ; "
So, according to the RFC, keys must be surrounded by double quotes, not single one. Still, I guess some parsers may be more tolerant and accept both of them, but I'd stick to the standard.
I would like to know if the following javascript is 'valid' or not.
var object = {
'name' : 'test',
'age' : 56,
'age' : 25
}
As you can see I deliberately repeated one of the attributes. (age)
For what it matters
A quick test on chrome seems to prove that the javascript is valid and object.age is equal to 25.
p.s.
I am asking this because we have written some javascript code generator and I want to know if that is valid js or not.
Technically it is valid, but not recommended.
According to RFC4627 (emphasis mine):
2.2. Objects
An object structure is represented as a pair of curly brackets
surrounding zero or more name/value pairs (or members). A name is a
string. A single colon comes after each name, separating the name
from the value. A single comma separates a value from a following
name. The names within an object SHOULD be unique.
Granted this says should, so yes you can do it...but it may result in unpredictable behavior depending on who's parsing it down the road (typically the last property wins because of how most forward-reading parsers behave).
Also note that this RFC applies to JSON not object literals (JSON is a stricter subset of that), but the conventions apply to both.
i have a json string im converting to object with a simple eval(string);
heres the sample of the json string:
var json = #'
"{ description" : { "#cdata-section" : "<some html here>" } }
';
var item = eval('('+json+')');
I am trying to access it like so
item.description.#cdata-section
my problem is that javascript does not like the # in the field name.. is there a way to access it?
item.description['#cdata-section']
Remember that all Javascript objects are just hash tables underneath, so you can always access elements with subscript notation.
Whenever an element name would cause a problem with the dot notation (such as using a variable element name, or one with weird characters, etc.) just use a string instead.
var cdata = item.description["#cdata-section"];
While the official spec for JSON specifies simply for chars to be provided as a field identifier, when you parse your JSON into a Javascript object, you now fall under the restrictions of a Javascript identifier.
In the Javascript spec, an identifier can start with either a letter, underscore or $. Subsequent chars may be any letter, digit, underscore or $.
So basically, the # is valid under the JSON spec but not under Javascript.