Best way to store JSON in an HTML attribute? - javascript

I need to put a JSON object into an attribute on an HTML element.
The HTML does not have to validate.
Answered by Quentin: Store the JSON in a data-* attribute, which is valid HTML5.
The JSON object could be any size - i.e. huge
Answered by Maiku Mori: The limit for an HTML attribute is potentially 65536 characters.
What if the JSON contains special characters? e.g. {foo: '<"bar/>'}
Answered by Quentin: Encode the JSON string before putting it into the attribute, as per the usual conventions. For PHP, use the htmlentities() function.
EDIT - Example solution using PHP and jQuery
Writing the JSON into the HTML attribute:
<?php
$data = array(
'1' => 'test',
'foo' => '<"bar/>'
);
$json = json_encode($data);
?>
CLICK ME
Retrieving the JSON using jQuery:
$('a').click(function() {
// Read the contents of the attribute (returns a string)
var data = $(this).data('json');
// Parse the string back into a proper JSON object
var json = $.parseJSON($(this).data('json'));
// Object now available
console.log(json.foo);
});

The HTML does not have to validate.
Why not? Validation is really easy QA that catches lots of mistakes. Use an HTML 5 data-* attribute.
The JSON object could be any size (i.e. huge).
I've not seen any documentation on browser limits to attribute sizes.
If you do run into them, then store the data in a <script>. Define an object and map element ids to property names in that object.
What if the JSON contains special characters? (e.g. {test: '<"myString/>'})
Just follow the normal rules for including untrusted data in attribute values. Use & and " (if you’re wrapping the attribute value in double quotes) or ' (if you’re wrapping the attribute value in single quotes).
Note, however, that that is not JSON (which requires that property names be strings and strings be delimited only with double quotes).

Depending on where you put it,
In a <div> as you asked, you need to ensure that the JSON does not contain HTML specials that could start a tag, HTML comment, embedded doctype, etc. You need to escape at least <, and & in such a way that the original character does not appear in the escaped sequence.
In <script> elements you need to ensure that the JSON does not contain an end tag </script> or escaping text boundary: <!-- or -->.
In event handlers you need to ensure that the JSON preserves its meaning even if it has things that look like HTML entities and does not break attribute boundaries (" or ').
For the first two cases (and for old JSON parsers) you should encode U+2028 and U+2029 since those are newline characters in JavaScript even though they are allowed in strings unencoded in JSON.
For correctness, you need to escape \ and JSON quote characters and it's never a bad idea to always encode NUL.
If the HTML might be served without a content encoding, you should encode + to prevent UTF-7 attacks.
In any case, the following escaping table will work:
NUL -> \u0000
CR -> \n or \u000a
LF -> \r or \u000d
" -> \u0022
& -> \u0026
' -> \u0027
+ -> \u002b
/ -> \/ or \u002f
< -> \u003c
> -> \u003e
\ -> \\ or \u005c
U+2028 -> \u2028
U+2029 -> \u2029
So the JSON string value for the text Hello, <World>! with a newline at the end would be "Hello, \u003cWorld\u003e!\r\n".

Another way you can do it – is put json data inside <script> tag, but not with type="text/javascript", but with type="text/bootstrap" or type="text/json" type, to avoid javascript execution.
Then, in some place of your program, you can ask for it in this way:
function getData(key) {
try {
return JSON.parse($('script[type="text/json"]#' + key).text());
} catch (err) { // if we have not valid json or dont have it
return null;
}
}
On server side, you can do something like this (this example with php and twig):
<script id="my_model" type="text/json">
{{ my_model|json_encode()|raw }}
</script>

Another option is to base64 encode the JSON string and if you need to use it in your javascript decode it with the atob() function.
var data = JSON.parse(atob(base64EncodedJSON));

For simple JSON objects, the code below would work.
Encode:
var jsonObject = { numCells: 5, cellWidth: 1242 };
var attributeString = escape(JSON.stringify(jsonObject));
Decode:
var jsonString = unescape(attributeString);
var jsonObject = JSON.parse(jsonString);

You can use knockoutjs,
<p>First name: <strong data-bind="text: firstName" >todo</strong></p>
<p>Last name: <strong data-bind="text: lastName">todo</strong></p>
knockout.js
// This is a simple *viewmodel* - JavaScript that defines the data and behavior of your UI
function AppViewModel() {
this.firstName = "Jayson";
this.lastName = "Monterroso";
}
// Activates knockout.js
ko.applyBindings(new AppViewModel());
Output
First name: Jayson
Last name: Monterroso
Check this:
http://learn.knockoutjs.com/

Nothing fancy here. From PHP, give the JSON string a run through htmlspecialchars to make sure no special characters can be interpreted as HTML. From Javascript, no escaping necessary; just set the attribute and you're good to go.

Another thought that could be used is store the JSON data as a base64 string in the attribute and then using window.atob or window.btoa to restore it to usable JSON data.
<?php
$json = array("data"=>"Some json data thing");
echo "<div data-json=\"".base64_encode(json_encode($json))."\"></div>";
?>

What you can do is use cdata around your element/s like this
<![CDATA[ <div class='log' mydata='${aL.logData}'>${aL.logMessage}</div> ]]>
where mydata is a raw json string. Hope this helps you and others.

In our case replace ' by ' and inserting the json between simple quotes works perfectly for vue:
php:
$data = json_encode($data);
$data = preg_replace("/'/", ''', $data);
html:
<vue_tag :data='<?=$json?>' />

Related

javascript function with parameters doesnt work with IE

Im calling the function getStateCityZip from inside a php function from line
$result = '<select name="'.$name.'" id="'.$name.'_new_id"'.$css_class.' '.$att.' onChange="getStateCityZip(this.value,`'.$userType.'`);"> ';
and the javascript function is
function getStateCityZip(customer_id,userType){
var userType= String(userType);
alert(customer_id + userType);
$.post( "start_quote.php", { "action":"get_state_city_zip",
"customer_id" : customer_id})
.done(function( data ) {
var obj = jQuery.parseJSON ( data );
if (userType=='shipper'){
$('#s_country').val(obj.billAddress.bill_country);
$('#s_state').val(obj.billAddress.bill_state);
$('#s_city').val(obj.billAddress.bill_city);
$('#s_zip').val(obj.billAddress.bill_zip);
}
}
This works fine on chrome but on IE it doesnt even give alert. If I were to remove the parameters from function definition and from the point where I call the function and hardcode the values inside the function defintion then it works just fine. In all the other cases I have tried it doesnt work for IE. All help will be appreciated
IE doesn't support template literals. You've used template literals around your $userType (the backticks).
In general, it's a bad idea to stuff strings into onClick attributes like this. But if you want to, you need to:
Ensure that all characters with special meaning in JavaScript string literals are escaped, and
Ensure that characters with special meaning in HTML are encoded (because the text in an onClick attribute — as with all attributes — is HTML text).
To do that:
'...onChange="getStateCityZip(this.value, '.htmlspecialchars(json_encode($userType)).'...'
json_encode handles ensuring that a valid JavaScript string is output, and htmlspecialchars ensures that any HTML special characters (like &) are property encoded (as & in the case of &). The characters your JavaScript function sees will be faithfully replicated, because the HTML parser consumes the encoded entities and emits the original character, and the JavaScript parser handles the escaped characters.
Escape ' instead of using backticks:
$result = '<select
name="'.$name.'"
id="'.$name.'_new_id"'.$css_class.' '.$att.'
onChange="getStateCityZip(this.value, \''.$userType.'\');"> ';
// ^ here and here ^

How to insert arbitrary JSON in HTML's script tag

I would like to store a JSON's contents in a HTML document's source, inside a script tag.
The content of that JSON does depend on user submitted input, thus great care is needed to sanitise that string for XSS.
I've read two concept here on SO.
1. Replace all occurrences of the </script tag into <\/script, or replace all </ into <\/ server side.
Code wise it looks like the following (using Python and jinja2 for the example):
// view
data = {
'test': 'asdas</script><b>as\'da</b><b>as"da</b>',
}
context_dict = {
'data_json': json.dumps(data, ensure_ascii=False).replace('</script', r'<\/script'),
}
// template
<script>
var data_json = {{ data_json | safe }};
</script>
// js
access it simply as window.data_json object
2. Encode the data as a HTML entity encoded JSON string, and unescape + parse it in client side. Unescape is from this answer: https://stackoverflow.com/a/34064434/518169
// view
context_dict = {
'data_json': json.dumps(data, ensure_ascii=False),
}
// template
<script>
var data_json = '{{ data_json }}'; // encoded into HTML entities, like < > &
</script>
// js
function htmlDecode(input) {
var doc = new DOMParser().parseFromString(input, "text/html");
return doc.documentElement.textContent;
}
var decoded = htmlDecode(window.data_json);
var data_json = JSON.parse(decoded);
This method doesn't work because \" in a script source becames " in a JS variable. Also, it creates a much bigger HTML document and also is not really human readable, so I'd go with the first one if it doesn't mean a huge security risk.
Is there any security risk in using the first version? Is it enough to sanitise a JSON encoded string with .replace('</script', r'<\/script')?
Reference on SO:
Best way to store JSON in an HTML attribute?
Why split the <script> tag when writing it with document.write()?
Script tag in JavaScript string
Sanitize <script> element contents
Escape </ in script tag contents
Some great external resources about this issue:
Flask's tojson filter's implementation source
Rail's json_escape method's help and source
A 5 year long discussion in Django ticket and proposed code
Here's how I dealt with the relatively minor part of this issue, the encoding problem with storing JSON in a script element. The short answer is you have to escape either < or / as together they terminate the script element -- even inside a JSON string literal. You can't HTML-encode entities for a script element. You could JavaScript-backslash-escape the slash. I preferred to JavaScript-hex-escape the less-than angle-bracket as \u003C.
.replace('<', r'\u003C')
I ran into this problem trying to pass the json from oembed results. Some of them contain script close tags (without mentioning Twitter by name).
json_for_script = json.dumps(data).replace('<', r'\u003C');
This turns data = {'test': 'foo </script> bar'}; into
'{"test": "foo \\u003C/script> bar"}'
which is valid JSON that won't terminate a script element.
I got the idea from this little gem inside the Jinja template engine. It's what's run when you use the {{data|tojson}} filter.
def htmlsafe_json_dumps(obj, dumper=None, **kwargs):
"""Works exactly like :func:`dumps` but is safe for use in ``<script>``
tags. It accepts the same arguments and returns a JSON string. Note that
this is available in templates through the ``|tojson`` filter which will
also mark the result as safe. Due to how this function escapes certain
characters this is safe even if used outside of ``<script>`` tags.
The following characters are escaped in strings:
- ``<``
- ``>``
- ``&``
- ``'``
This makes it safe to embed such strings in any place in HTML with the
notable exception of double quoted attributes. In that case single
quote your attributes or HTML escape it in addition.
"""
if dumper is None:
dumper = json.dumps
rv = dumper(obj, **kwargs) \
.replace(u'<', u'\\u003c') \
.replace(u'>', u'\\u003e') \
.replace(u'&', u'\\u0026') \
.replace(u"'", u'\\u0027')
return Markup(rv)
(You could use \x3C instead of \u003C and that would work in a script element because it's valid JavaScript. But might as well stick to valid JSON.)
First of all, your paranoia is well founded.
an HTML-parser could be tricked by a closing script tag (better assume by any closing tag)
a JS-parser could be tricked by backslashes and quotes (with a really bad encoder)
Yes, it would be much "safer" to encode all characters that could confuse the different parsers involved. Keeping it human-readable might be contradicting your security paradigm.
Note: The result of JSON String encoding should be canoncical and OFC, not broken, as in parsable. JSON is a subset of JS and thus be JS parsable without any risk. So all you have to do is make sure the HTML-Parser instance that extracts the JS-code is not tricked by your user data.
So the real pitfall is the nesting of both parsers. Actually, I would urge you to put something like that into a separate request. That way you would avoid that scenario completely.
Assuming all possible styles and error-corrections that could happen in such a parser it might be that other tags (open or close) might achieve a similar feat.
As in: suggesting to the parser that the script tag has ended implicitly.
So it is advisable to encode slash and all tag braces (/,<,>), not just the closing of a script-tag, in whatever reversible method you choose, as long as long as it would not confuse the HTML-Parser:
Best choice would be base64 (but you want more readable)
HTMLentities will do, although confusing humans :)
Doing your own escaping will work as well, just escape the individual characters rather than the </script fragment
In conclusion, yes, it's probably best with a few changes, but please note that you will be one step away from "safe" already, by trying something like this in the first place, instead of loading the JSON via XHR or at least using a rigorous string encoding like base64.
P.S.: If you can learn from other people's code encoding the strings that's nice, but you should not resort to "libraries" or other people's functions if they don't do exactly what you need.
So rather write and thoroughly test your own (de/en)coder and know that this pitfall has been sealed.

Converting textarea value into a valid JSON string

I am trying to make sure input from user is converted into a valid JSON string before submitted to server.
What I mean by 'Converting' is escaping characters such as '\n' and '"'.
Btw, I am taking user input from HTML textarea.
Converting user input to a valid JSON string is very important for me as it will be posted to the server and sent back to client in JSON format. (Invalid JSON string will make whole response invalid)
If User entered
Hello New World,
My Name is "Wonderful".
in HTML <textarea>,
var content = $("textarea").val();
content will contain new-line character and double quotes character.
It's not a problem for server and database to handle and store data.
My problem occurs when the server sends back the data posted by clients to them in JSON format as they were posted.
Let me clarify it further by giving some example of my server's response.
It's a JSON response and looks like this
{ "code": 0, "id": 1, "content": "USER_POSTED_CONTENT" }
If USER_POSTED_CONTENT contains new-line character '\n', double quotes or any characters that are must be escaped but not escaped, then it is no longer a valid JSON string and client's JavaScript engine cannot parse data.
So I am trying to make sure client is submitting valid JSON string.
This is what I came up with after doing some researches.
String.prototype.escapeForJson = function() {
return this
.replace(/\b/g, "")
.replace(/\f/g, "")
.replace(/\\/g, "\\")
.replace(/\"/g, "\\\"")
.replace(/\t/g, "\\t")
.replace(/\r/g, "\\r")
.replace(/\n/g, "\\n")
.replace(/\u2028/g, "\\u2028")
.replace(/\u2029/g, "\\u2029");
};
I use this function to escape all the characters that need to be escaped in order to create a valid JSON string.
var content = txt.val().escapeForJson();
$.ajax(
...
data:{ "content": content }
...
);
But then... it seems like str = JSON.stringify(str); does the same job!
However, after reading what JSON.stringify is really for, I am just confused. It says JSON.stringify is to convert JSON Object into string.
I am not really converting JSON Object to string.
So my question is...
Is it totally ok to use JSON.stringify to convert user input to valid JSON string object??
UPDATES:
JSON.stringify(content) worked good but it added double quotes in the beginning and in the end. And I had to manually remove it for my needs.
Yep, it is totally ok.
You do not need to re-invent what does exist already, and your code will be more useable for another developer.
EDIT:
You might want to use object instead a simple string because you would like to send some other information.
For example, you might want to send the content of another input which will be developed later.
You should not use stringify is the target browser is IE7 or lesser without adding json2.js.
I don't think JSON.stringify does what you need. Check the out the behavior when handling some of your cases:
JSON.stringify('\n\rhello\n')
*desired : "\\n\\rhello\\n"
*actual : "\n\rhello\n"
JSON.stringify('\b\rhello\n')
*desired : "\\rhello\\n"
*actual : "\b\rhello\n"
JSON.stringify('\b\f\b\f\b\f')
*desired : ""
*actual : ""\b\f\b\f\b\f""
The stringify function returns a valid JSON string. A valid JSON string does not require these characters to be escaped.
The question is... Do you just need valid JSON strings? Or do you need valid JSON strings AND escaped characters? If the former: use stringify, if the latter: use stringify, and then use your function on top of it.
Highly relevant: How to escape a JSON string containing newline characters using javascript?
Complexity. I don't know what say.
Take the urlencode function from your function list and kick it around a bit.
<?php
$textdata = $_POST['textdata'];
///// Try without this one line and json encoding tanks
$textdata = urlencode($textdata);
/******* textarea data slides into JSON string because JSON is designed to hold urlencoded strings ******/
$json_string = json_encode($textdata);
//////////// decode just for kicks and used decoded for the form
$mydata = json_decode($json_string, "true");
/// url decode
$mydata = urldecode($mydata['textdata']);
?>
<html>
<form action="" method="post">
<textarea name="textdata"><?php echo $mydata; ?></textarea>
<input type="submit">
</html>
Same thing can be done in Javascript to store textarea data in local storage. Again textarea will fail unless all the unix formatting is deal with. The answer is take urldecode/urlencode and kick it around.
I believe that urlencode on the server side will be a C wrapped function that iterates the char array once verses running a snippet of interpreted code.
The text area returned will be exactly what was entered with zero chance of upsetting a wyswyg editor or basic HTML5 textarea which could use a combination of HTML/CSS, DOS, Apple and Unix depending on what text is cut/pasted.
The down votes are hilarious and show an obvious lack of knowledge. You only need to ask yourself, if this data were file contents or some other array of lines, how would you pass this data in a URL? JSON.stringify is okay but url encoding works best in a client/server ajax.

How to use a JSON literal string?

Since the JSON format specifies that single quotes should not be escaped, most libraries (or even the native JSON parser) will fail if you have an escaped single quote in it. Now this usually is not a problem since most of the time you do an XHR that fetches some data formatted as JSON and you use the responseText which contains your JSON string that you can then parse, etc.
In this particular situation, I have a JSON string stored in a database as text... so the database contains something like {"property":"value"} and I want to output this as part of an HTML page created by the server so that the JavaScript code in that page looks something like this:
var x = '{"property":"value"}';
Now if the JSON string in the database contains a single quote like this:
{"property":"val'ue"}
Then I need to escape it or else I will never be able to use it as a string:
console.clear();
var obj = {prop:"val'ue"};
var str = JSON.stringify(obj);
console.log("JSON string is %s",str);
console.dir(JSON.parse(str)); //No problem here
//This obviously can't work since the string is closed and it causes an invalid script
//console.dir(JSON.parse('{prop:"val'ue"}'));
//so I need to escape it to use a literal JSON string
console.dir(JSON.parse('{"prop":"val\'ue"}'));
The question then is why {"prop":"val\'ue"} not considered a valid JSON string ?
In JavaScript - the string '{"prop":"val\'ue"}' is a correct way to encode the JSON as a string literal.
As the JavaScript interpreter reads the single-quoted string, it will convert the \' to '. The value of the string is {"prop":"val'ue"} which is valid JSON.
In order to create the invalid JSON string, you would have to write '{"prop":"val\\\'ue"}'
If I understand the question right, you are trying to generate JavaScript code that will set some variable to the decoded version of a JSON string you have stored in the database. So now you are encoding the string again, as the way to get this string into JavaScript is to use a string literal, passing it through JSON.parse(). You can probably rely on using the server side JSON encoder to encode the JSON string as a JavaScript string literal. For instance:
<?php $jsonString = '{"prop":"val\'ue"}'; ?>
var myJson = JSON.parse(<?php echo json_encode($jsonString) ?>);
// Prints out:
// var myJson = JSON.parse("{\"prop\":\"val'ue\"}");
// And results: Object - { prop: "val'ue"}
However, If you are 100% sure the JSON is going to be valid, and don't need the weight of the extra parsing / error checking - you could skip all that extra encoding and just write:
var myJson = <?php echo $jsonString; ?>
Remember, JSON is valid JavaScript syntax for defining objects after all!
According to jsonlint it is valid without escaping the single quote, so this is fine:
{"prop": "val'ue"}
But this is invalid:
{"prop":"val\'ue"}
According to json.org json:
is completely language independent but
uses conventions that are familiar to
programmers of the C-family of
languages, including C, C++, C#, Java,
JavaScript, Perl, Python, and many
others
So it is the language conventions in c-type languages regarding the reverse solidus (\) that means that your example is not valid.
You might try the following, however, it's ugly.
JSON.parse("{\"obj\":\"val'ue\"}");
Or just store the string to a var first. This should not store the literal backslash value and therefore the JSON parser should work.
var str = '{"obj" : "val\'ue"}';
JSON.parse(str);

Accessing Json fields with weird characters

i have a json string im converting to object with a simple eval(string);
heres the sample of the json string:
var json = #'
"{ description" : { "#cdata-section" : "<some html here>" } }
';
var item = eval('('+json+')');
I am trying to access it like so
item.description.#cdata-section
my problem is that javascript does not like the # in the field name.. is there a way to access it?
item.description['#cdata-section']
Remember that all Javascript objects are just hash tables underneath, so you can always access elements with subscript notation.
Whenever an element name would cause a problem with the dot notation (such as using a variable element name, or one with weird characters, etc.) just use a string instead.
var cdata = item.description["#cdata-section"];
While the official spec for JSON specifies simply for chars to be provided as a field identifier, when you parse your JSON into a Javascript object, you now fall under the restrictions of a Javascript identifier.
In the Javascript spec, an identifier can start with either a letter, underscore or $. Subsequent chars may be any letter, digit, underscore or $.
So basically, the # is valid under the JSON spec but not under Javascript.

Categories