Why is JSON.parse so picky with quotes? - javascript

Basically, I am trying to create an object like this by providing a string to JSON.parse():
a = {x:1}
// -> Object {x: 1}
Intuitively I tried:
a = JSON.parse('{x:1}')
// -> Uncaught SyntaxError: Unexpected token x
After some fiddling I figured out:
a = JSON.parse('{"x":1}')
// -> Object {x: 1}
But then I accidentally changed the syntax and bonus confusion kicked in:
a = JSON.parse("{'x':1}")
//-> Uncaught SyntaxError: Unexpected token '
So now I am looking for an explanation why
one must to quote the property name
the implementation accepts single quotes, but fails on double quotes

The main reason for confusion seems to be the difference between JSON and JavaScript objects.
JSON (JavaScript Object Notation) is a data format meant to allow data exchange in a simple format. That is the reason why there is one valid syntax only. It makes parsing much easier. You can find more information on the JSON website.
Some notes about JSON:
Keys must be quoted with "
Values might be strings, numbers, objects, arrays, booleans or "null"
String values must be quoted with "
JavaScript objects on the other hand are related to JSON (obviously), but not identical. Valid JSON is also a valid JavaScript object. The other way around, however, is not.
For example:
keys and values can be quoted with " or '
keys do not always have to be quoted
values might be functions or JavaScript objects

As pointed out in the comments, because that's what the JSON spec specifies. The reason AFAIK is that JSON is meant to be a data interchange format (language agnostic). Many languages, even those with hash literals, do not allow unquoted strings as hash table keys.

Related

JSON.parse : Bad control character in string literal

I think this is a basic question, but I can't understand the reason. in JSON, it's valid to has a special character like "asdfadf\tadfadf", but when try parse it, it's shown error.
eg. code:
let s = '{"adf":"asdf\tasdfdaf"}';
JSON.parse(s);
Error:
Uncaught SyntaxError: Bad control character in string literal in JSON at position 12
at JSON.parse (<anonymous>)
at <anonymous>:1:6
I need to understand what is the issue, and solution.
You have to take into account that \t will be involved in two parsing operations: first, when your string constant is parsed by JavaScript, and second, when you parse the JSON string with JSON.parse(). Thus you must use
let s = '{"adf":"asdf\\tasdfdaf"}';
If you don't, you'll get the error you're seeing from the actual tab character embedded in the string. JSON does not allow "control" characters to be embedded in strings.
Also, if the actual JSON content is being created in server-side code somehow, it's probably easier to skip building a string that you're immediately going to parse as JSON. Instead, have your code build a JavaScript object initializer directly:
let s = { "adf": "asdf\tasdfdaf" };
In your server environment there is likely to be a JSON tool that will allow you to take a data structure from PHP or Java or whatever and transform that to JSON, which will also work as JavaScript syntax.
let t = '{"adf":"asdf\\tasdfdaf"}';
var obj = JSON.parse(t)
console.log(obj)

Why does JSON.parse("foo") fail but JSON.parse(' "foo" ') success?

When I try to JSON.parse("foo"), I get an Error:
Uncaught SyntaxError: Unexpected token o in JSON at position 1
at JSON.parse (<anonymous>)
at <anonymous>:1:6
But, when I use JSON.parse('"foo"'), I can get the result as expected:
const value = JSON.parse('"foo"');
console.log(value)
So why does this happen?
You can find what makes valid JSON at json.org. JSON can represent null, booleans, numbers, strings, arrays and objects. For strings, they must be enclosed in double quotes.
JSON itself is a string format, so when you specify a string in JSON format, you need:
quotes to tell JavaScript you are specifying a string literal. Those quotes are not part of the string value itself; they are just delimiters.
double-quotes inside that string literal to follow the JSON syntax for the string data type.
So here are some examples of correct arguments for JSON.parse:
JSON.parse("true")
JSON.parse("false")
JSON.parse("null")
JSON.parse("42")
JSON.parse("[13]")
JSON.parse('"hello"')
JSON.parse('{"name": "Mary"}')
But not:
JSON.parse("yes")
JSON.parse("no")
JSON.parse("none")
JSON.parse('hello')
JSON.parse({"name": "Mary"})
Because the argument will be cast to string when it is not, the following will also work, but it is confusing (and useless):
JSON.parse(true)
JSON.parse(false)
JSON.parse(null)
JSON.parse(42)
This happens because JSON.parse() gives a value back by formatting the string given using the syntaxis to declare values in Javascript. Therefore using foo naked would be like casting an undefined variable.
Notice also that the notation uses double quotes " instead of single ones '. That's why JSON.parse('"foo"') works but JSON.parse("'foo'") won't.

JSON Attribute Best Practice

Usually, what I see is {"attribute":241241}
but writting this does exactly the same thing : {attribute:241241}.
Is {attribute:241241} considered bad practice and should be avoided or not?
{attribute:241241} does not validate using JSONLint, so yes, it is bad practice and should be avoided.
Further, if you look at the JSON spec at json.org, you will find that for objects the first value is always a string (indicated by double quotes).
You are confusing JSON with Object literal syntax, in Javascript doing
var o = {
"attribute":241241
};
is not JSON, but object literal. And yes, the quotes are useless there.
The JSON specification requires that keys be quoted, so the second form isn't valid JSON.
As Snakes and Coffee noted, it is a valid Javascript object literal (on which JSON is based), and some JSON parsers will accept it anyway. So it may work in some cases, but sooner or later you'll run into a perfectly functional JSON parser that rejects your non-quoted keys because they aren't valid per the spec.
As per the specification here, the names in name/value pairs must be of type string.
An object is an unordered collection of zero or more name/value pairs,
where a name is a string and a value is a string, number, boolean,
null, object, or array.

Is '""' a valid JSON string?

I am confused. To quote json.org
JSON is built on two structures:
A collection of name/value pairs. In various languages, this is
realized as an object, record, struct, dictionary, hash table, keyed
list, or associative array.
An ordered list of values. In most
languages, this is realized as an array, vector, list, or sequence.
So, I don't think '""' should be a valid JSON string as its neither a list values(i.e. does not start with '[' and ends with ']') but JSON.parse doesn't give exception and returns empty string.
Is it a valid JSON string.
No, '' is not valid JSON. JSON.parse('') does throw an error – just look in your browser console.
Next time you have an "is this valid JSON?" question, just run it through a JSON validator. That's why they exist.
So, I don't think "" should be a valid JSON string
It is a valid JSON string (which is a data type that may appear in a JSON text).
as its neither a list values(i.e. does not start with '[' and ends with ']')
A JSON text (i.e. a complete JSON document) must (at the outermost level) be…
(Here I cut the original answer because the specification has been revised).
A JSON text is a serialized value.
(quoting the JSON specification
So "" is a valid JSON text. This wasn’t the case when the original version of this answer was written. Some JSON parsers may break when the outer most value is not an object or array.
The original answer (which is now incorrect resumes here):
…either an object or an array. A string is not a valid JSON text.
The formal specification says:
A JSON text is a serialized object or array.
But back to quoting the question here:
but JSON.parse doesn't give exception and returns empty string.
The JSON parser you are using is being overly-liberal. Don't assume that all JSON parsers will be.
For example, if I run perl -MJSON -E'say decode_json(q{""})' I get:
JSON text must be an object or array (but found number, string, true, false or null, use allow_nonref to allow this) at -e line 1.
Following the newest JSON RFC 7159, "" is in fact valid JSON. But in some earlier standards it wasn't.
Quote:
A JSON text is a sequence of tokens. The set of tokens includes six structural characters, strings, numbers, and three literal names.
A JSON text is a serialized value. Note that certain previous specifications of JSON constrained a JSON text to be an object or an array.
Implementations that generate only objects or arrays where a JSON text is called for will be interoperable in the sense that all implementations will accept these as conforming JSON texts.
2023 Update
The other answers on this question are outdated and contain some incorrect information. The most recent JSON specification is RFC 8259, which obsoletes the previous RFCs referenced here. It says:
Note that certain previous specifications of JSON constrained a JSON text to be an object or an array.
A string by itself is therefore valid as an entire JSON text. It doesn't need to be contained within an outer object construct. The examples section of the RFC shows this:
Here are three small JSON texts containing only values:
"Hello world!"
42
true
The content of the string of course is irrelevant, so an empty string by itself is acceptable.
Note: JSON requires that strings be quoted with " so single quotes are still not valid:
"" // Valid
'' // Invalid

in JSON, Why is each name quoted?

The JSON spec says that JSON is an object or an array. In the case of an object,
An object structure is represented as a pair of curly brackets
surrounding zero or more name/value pairs (or members). A name is a
string. ...
And later, the spec says that a string is surrounded in quotes.
Why?
Thus,
{"Property1":"Value1","Property2":18}
and not
{Property1:"Value1",Property2:18}
Question 1: why not allow the name in the name/value pairs to be unquoted identifiers?
Question 2: Is there a semantic difference between the two representations above, when evaluated in Javascript?
I leave a quote from a presentation that Douglas Crockford (the creator of the JSON standard) gave to Yahoo.
He talks about how he discovered JSON, and amongst other things why he decided to use quoted keys:
....
That was when we discovered the
unquoted name problem. It turns out
ECMA Script 3 has a whack reserved
word policy. Reserved words must be
quoted in the key position, which is
really a nuisance. When I got around
to formulizing this into a standard, I
didn't want to have to put all of the
reserved words in the standard,
because it would look really stupid.
At the time, I was trying to convince
people: yeah, you can write
applications in JavaScript, it's
actually going to work and it's a good
language. I didn't want to say, then,
at the same time: and look at this
really stupid thing they did! So I
decided, instead, let's just quote the
keys.
That way, we don't have to tell
anybody about how whack it is.
That's why, to this day, keys are quoted in
JSON.
You can find the complete video and transcript here.
Question 1: why not allow the name in the name/value pairs to be unquoted identifiers?
The design philosophy of JSON is "Keep it simple"
"Quote names with "" is a lot simpler than "You may quote names with " or ' but you don't have to, unless they contain certain characters (or combinations of characters that would make it a keyword) and ' or " may need to be quoted depending on what delimiter you selected".
Question 2: Is there a semantic difference between the two representations above, when evaluated in Javascript?
No. In JavaScript they are identical.
Both : and whitespace are permitted in identifiers. Without the quotes, this would cause ambiguity when trying to determine what exactly constitutes the identifier.
In javascript objects can be used like a hash/hashtable with key pairs.
However if your key has characters that javascript could not tokenize as a name, it would fail when trying it access like a property on an object rather than a key.
var test = {};
test["key"] = 1;
test["#my-div"] = "<div> stuff </div>";
// test = { "key": 1, "#my-div": "<div> stuff </div>" };
console.log(test.key); // should be 1
console.log(test["key"]); // should be 1
console.log(test["#my-div"]); // should be "<div> stuff </div>";
console.log(test.#my-div); // would not work.
identifiers can sometimes have characters that can not be evaluated as a token/identifier in javascript, thus its best to put all identifiers in strings for consistency.
If json describes objects, then in practise you get the following
var foo = {};
var bar = 1;
foo["bar"] = "hello";
foo[bar] = "goodbye";
so then,
foo.bar == "hello";
foo[1] == "goodbye" // in setting it used the value of var bar
so even if your examples do produce the same result, their equivalents in "raw code" wouldn't. Maybe that's why?? dunno, just an idea.
I think the right answer to Cheeso's question is that the implementation surpassed the documentation. It no longer requires a string as the key, but rather something else, which can either be a string (ie quoted) or (probably) anything that can be used as a variable name, which I will guess means start with a letter, _, or $, and include only letters, numbers, and the $ and _.
I wanted to simplify the rest for the next person who visits this question with the same idea I did. Here's the meat:
Variable names are not interpolated in JSON when used as an object key (Thanks Friedo!)
Breton, using "identifier" instead of "key", wrote that "if an identifier happens to be a reserved word, it is interpreted as that word rather than as an identifier." This may be true, but I tried it without any trouble:
var a = {do:1,long:2,super:3,abstract:4,var:5,break:6,boolean:7};
a.break
=> 6
About using quotes, Quentin wrote "...but you don't have to, unless [the key] contains certain characters (or combinations of characters that would make it a keyword)"
I found the former part (certain characters) is true, using the # sign (in fact, I think $ and _ are the only characters that don't cause the error):
var a = {a#b:1};
=> Syntax error
var a = {"a#b":1};
a['a#b']
=> 1
but the parenthetical about keywords, as I showed above, isn't true.
What I wanted works because the text between the opening { and the colon, or between the comma and the colon for subsequent properties is used as an unquoted string to make an object key, or, as Friedo put it, a variable name there doesn't get interpolated:
var uid = getUID();
var token = getToken(); // Returns ABC123
var data = {uid:uid,token:token};
data.token
=> ABC123
It may reduce data size if quotes on name are only allowed when necessary

Categories