I'm currently trying out websockets, creating a client in JavaScript and a server in Python.
I'm stuck on a simple problem, though: when I send something from the client to the server it always contains a special ending character, but I don't know how to remove it.
I've tried data[:-1] thinking that would get rid of it, but it didn't.
With the character my JSON code won't validate.
This is what I send through JavaScript:
ws.send('{"test":"test"}');
This is what I get in Python:
{"test":"test"}�
I thought the ending character was \xff
The expression "data[:-1]" produces a copy of data without its last character. It doesn't modify the "data" variable itself. To do that, you have to assign the result back to "data", like so:
data = data[:-1]
My suspicion is the "special ending character" is a bug, somewhere, either in your code or how you're using the APIs. Network code does not generally introduce random characters into the data stream. Good luck!
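For what it's worth, here is a minimal sketch of stripping such a byte before parsing. It assumes the asker's guess of \xff is right and that data arrives as raw bytes from the socket (Python 3). Early WebSocket protocol drafts wrapped each text message in a leading 0x00 and a trailing 0xFF, which may be where the character comes from, but that is an assumption about this setup. The function names are hypothetical.

```python
import json


def strip_draft_frame(data: bytes) -> bytes:
    """Return data without a leading 0x00 or trailing 0xFF, if present."""
    # Assumption: these are framing bytes, not part of the JSON payload.
    if data.startswith(b"\x00"):
        data = data[1:]   # slicing returns a new object, so assign it back
    if data.endswith(b"\xff"):
        data = data[:-1]
    return data


def parse_message(data: bytes) -> dict:
    """Strip the assumed framing bytes, then decode and parse the JSON."""
    return json.loads(strip_draft_frame(data).decode("utf-8"))
```

Calling parse_message(b'{"test":"test"}\xff') should then give back the dictionary that was sent. If the server turns out to speak a newer protocol version, a proper WebSocket library is a better fit than trimming bytes by hand.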
Related
I'm working on a webapp that sanitizes models for the view. However, it is stripping too many wanted characters, like forward slashes, semi-colons, colons, dollar signs, quote marks and accented letters from foreign languages. e.g. 3/8"W becomes 38w.
Do I need to modify the function to be less aggressive, or should I simply not use the sanitize function at all? I guess the bigger question is, what is sanitization for?
Full disclosure - I didn't write the function and I'm not fantastic with regex.
value = value.replace(/[^a-z0-9áéíóúñü .,_-]/gim, "").trim();
Sanitization mainly means removing or neutralizing dangerous characters in data before it is saved to a database or used in any kind of query.
That said, you shouldn't rely on sanitizing data at the front end, because JavaScript can be disabled.
Anything on the client side can be bypassed.
The back end is where it matters.
Sanitization should be done on data before saving it to the database.
Escaping should be done on data after retrieving it from the database.
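A rough sketch of that split on the server side, in Python. Note that it swaps the character-stripping regex for two other techniques: a parameterized query when saving, and HTML escaping when rendering. The in-memory SQLite database, the items table and the function name are all made up for illustration.

```python
import html
import sqlite3


def save_and_render(value: str) -> str:
    """Store value unchanged, then return it escaped for HTML output."""
    conn = sqlite3.connect(":memory:")  # hypothetical throwaway database
    conn.execute("CREATE TABLE items (label TEXT)")
    # The ? placeholder keeps the value out of the SQL text itself,
    # so no characters have to be stripped before saving.
    conn.execute("INSERT INTO items (label) VALUES (?)", (value,))
    (stored,) = conn.execute("SELECT label FROM items").fetchone()
    conn.close()
    # Escape only at output time, for the context it is rendered in.
    return html.escape(stored)
```

With this approach a value like 3/8"W keeps its slash and quote mark in storage and comes out as 3/8&quot;W in the page source, which the browser displays as the original text.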
When implementing Server Sent Events on your application server, you can terminate a message and have it sent by ending it with two line breaks: \n\n, as demonstrated on this documentation page.
So, what if you're receiving user input and forwarding it to all interested parties (as is typical in a chat application)? Could a malicious user not insert two line breaks in their payload to terminate the message early? Even more, could they not then set special fields such as the id and retry fields, now that they have access to the first characters of a line?
It seems that the only alternative is to instead scan their entire payload, and then replace instances of \n with something like \ndata:, such that their entire message payload has to maintain its position in the data tag.
However, is this not very inefficient? Each message payload has to be scanned in full, and malicious input additionally forces replacements and a reallocation.
Or is there an alternative? I'm currently trying to decide between websockets and SSE, as they are quite similar, and this issue is making me lean more towards WebSockets, because it feels as if they would be more efficient if they are able to avoid this potential vulnerability.
Edit: To clarify, I'm mostly ignorant as to whether or not there is a way around having to scan each message in its entirety for \n\n. And if not, do WebSockets have the same issue where you need to scan each message in its entirety? Because if they do, then no matter. But if that's not the case, then it seems to be a point in favor of using websockets over SSE.
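The per-line prefixing described in the question could look roughly like this in Python. The function name is hypothetical. Carriage returns are normalized first because the event stream format also treats \r and \r\n as line endings.

```python
def format_sse_data(payload: str) -> str:
    """Emit payload as one SSE event with every line in its own data field."""
    # Treat \r\n and bare \r like \n so none of them can end a field early.
    normalized = payload.replace("\r\n", "\n").replace("\r", "\n")
    lines = normalized.split("\n")
    # Each line gets a "data: " prefix; the final blank line ends the event.
    return "".join(f"data: {line}\n" for line in lines) + "\n"
```

A payload such as "hi\n\nid: 7" then stays inside the data field: the would-be id line is sent as "data: id: 7", and the client reassembles the original text by joining the data lines with newlines.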
It shouldn't be necessary to scan the payload if you're encoding the user data correctly. With JSON it is safe to use the "data" field in server-sent events, because JSON encoders escape newlines and control characters by default, as the RFC says:
The representation of strings is similar to conventions used in the C
family of programming languages. A string begins and ends with
quotation marks. All Unicode characters may be placed within the
quotation marks, except for the characters that must be escaped:
quotation mark, reverse solidus, and the control characters (U+0000
through U+001F).
https://www.rfc-editor.org/rfc/rfc7159#page-8
The important thing is that nobody sneaks in a newline character, but this isn't new to server-sent events: HTTP headers are separated by a single newline and can be tampered with too (if not correctly encoded); see https://www.owasp.org/index.php/HTTP_Response_Splitting
Here's an example of a server-sent events application with JSON encoding:
https://repl.it/#BlackEspresso/PointedWelloffCircles
You shouldn't be able to tamper with the data field even if newline characters are allowed in the input.
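As a small illustration of the JSON approach (a sketch, not the code behind the link above), Python's json.dumps escapes line breaks inside strings, so the encoded payload always fits on a single data line. The function name and the "text" key are assumptions.

```python
import json


def sse_json_event(message: str) -> str:
    """Wrap a user message in JSON and emit it as a single SSE event."""
    # json.dumps escapes control characters such as \n and \r inside
    # strings, so the encoded body contains no raw line breaks.
    body = json.dumps({"text": message})
    return f"data: {body}\n\n"
```

The only raw newlines left in the output are the two that terminate the event; the client gets the original text back with JSON.parse(event.data).text.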
Encoding shouldn't stop you from using server-sent events, but there are major differences between WebSockets and SSE. For a comparison see this answer: https://stackoverflow.com/a/5326159/1749420
Unless I'm missing something obvious, sanitizing input is a common thing in web development.
Since the source that you shared explicitly mentioned a PHP example, I just did some research and lookie here:
https://www.php.net/manual/en/filter.filters.sanitize.php
FILTER_SANITIZE_SPECIAL_CHARS
HTML-escape '"<>& and characters with ASCII value less than 32,
optionally strip or encode other special characters.
and:
'\n' = 10 = 0x0A = line feed
So I'm not sure I understand why you would assume that converting certain input to character entities would necessarily be a bad thing.
Preventing users from abusing the system by submitting unwanted input is what sanitization is for.
I'm making a standard jQuery $.ajax() call, doing a POST. The call passes a string to PHP controller.
The problem is the following: when an en dash (–) character is used in the string, by the time it reaches PHP it's replaced with a (?) character. A normal hyphen (-) does not cause the problem.
The site's encoding is UTF-8. I'm not sure how to get around this problem. I probably could do some character replacing, but then do I need to do it for every single "problematic" punctuation mark?
And the problem aside, shouldn't this just work if the encoding is correct?
Confusing.
Update:
I used mb_detect_encoding() on the passed string. The result is "ASCII"... I'm working with legacy code. How do I fix something like that?
On the PHP side, the $_REQUEST global was used to retrieve the Ajax data. After I changed it to $_POST, the en dashes were kept.
I don't really get why $_REQUEST was failing, though.
Anyway, this worked in this case. I truly dislike the devs who wrote this code and created this project :)
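The thread never establishes why $_REQUEST lost the character. Purely as an illustration (in Python rather than PHP, and not a diagnosis of this code base), here is one common way a "?" ends up in place of an en dash: the text passes through a step that can only represent ASCII and substitutes anything else. It would also fit the "ASCII" result above, since the string contains only ASCII once the dash has been replaced. The function names are hypothetical.

```python
def utf8_bytes(text: str) -> bytes:
    """Encode text as UTF-8, the encoding the site is said to use."""
    return text.encode("utf-8")


def to_legacy_ascii(text: str) -> bytes:
    """Mimic a lossy step that replaces every non-ASCII character."""
    # errors="replace" substitutes "?" for characters ASCII cannot hold.
    return text.encode("ascii", errors="replace")
```

The en dash (U+2013) takes three bytes in UTF-8 while the plain hyphen takes one and is already ASCII, which is why only the en dash would be affected by such a step.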
I have a Sencha Touch app. One of the stores I have uses an ajax proxy and a json reader. Some of the strings in the JSON returned from my sinatra app occasionally contain this character:
http://www.fileformat.info/info/unicode/char/2028/index.htm
Although it's invisible, the character occurs twice in the second string here, between the period and the ending quote:
"description": "Each of the levels requires logic, skill, and brute force to crush the enemy.
"
Try copy and pasting "Each of the levels requires logic, skill, and brute force to crush the enemy.
" into your javascript console! It won't be parsed as a string, and fails with SyntaxError: Unexpected token ILLEGAL.
This causes the JSON response to fail. I've been stuck on this for a long time! Any suggestions?
The only reliable way to fix this is server-side. Make sure your JSON generator emits those characters escaped, e.g. as \u2028.
In my experience, it's easiest to simply encode your JSON in plain ASCII which will always work. The downside is that it's less efficient as non-ASCII characters will take up more space, so depending on the frequency of those, you may not want that trade-off...
The documentation for Perl's JSON::XS has a good explanation of the problem and advice for how to fix it in Perl:
http://search.cpan.org/perldoc?JSON::XS#JSON_and_ECMAscript
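The same two options can be sketched in Python as a stand-in for whatever JSON generator the server uses (the question mentions Sinatra, so this is an illustration, not a drop-in fix). Python's json.dumps escapes all non-ASCII characters by default; the second function keeps other characters as they are and escapes only U+2028 and U+2029. The function names are hypothetical.

```python
import json


def dumps_ascii(obj) -> str:
    """Encode as plain-ASCII JSON; every non-ASCII character becomes \\uXXXX."""
    return json.dumps(obj, ensure_ascii=True)


def dumps_js_safe(obj) -> str:
    """Keep non-ASCII text readable but escape the two JS line separators."""
    text = json.dumps(obj, ensure_ascii=False)
    # In valid JSON these characters can only occur inside strings,
    # so replacing them with their escapes does not change the data.
    return text.replace("\u2028", "\\u2028").replace("\u2029", "\\u2029")
```

Both outputs parse back to the original data; the ASCII version is simply longer when the text contains many non-ASCII characters, which is the trade-off mentioned above.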
Conceptually, the server should only send out strings that are valid JavaScript literals, which means escaping them appropriately.
If you want to fix this issue on the client you need an extra workaround step (only seems to work in Firefox):
var a = escape("Each of the levels requires logic, skill, and brute force to crush the enemy.");
alert(unescape(a));
But the discussion is moot, because you must escape on the server.
Avoid using eval to parse JSON.
Use JSON.parse or https://github.com/douglascrockford/JSON-js.
JSON.parse('{}'); // where {} is your JSON String
I'm using a .NET multiline control. Then I use jQuery to get the data from that control:
$('.detailsCommentContent').val()
At this point, when I alert that value, the new lines are visible.
Then I make a request to "www.example.com?commentContent=" + $('.detailsCommentContent').val()
In the HTTP request, the newline characters are gone.
What should I do to keep the newline characters?
Thanks for your help.
Use a POST request for data that changes something, not a GET. That way you can send whatever data you like in the request body without hand-building a query string (jQuery encodes the data for you when you pass it as an object), and best of all: bots that are just spidering your page can't do any damage to your site. See also http://thedailywtf.com/Articles/The_Spider_of_Doom.aspx
You're going to have to URL-encode the value:
var url = "http://www.yourpage.com?commentContent=" +
encodeURIComponent($('.detailsCommentContent').val());
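To show what the encoding does to line breaks, here is the same idea in Python, using urllib.parse.quote as a rough stand-in for encodeURIComponent (the two differ on a few punctuation characters). The function name and base URL are placeholders.

```python
from urllib.parse import quote, unquote


def build_comment_url(base: str, comment: str) -> str:
    """Append comment as a percent-encoded commentContent parameter."""
    # safe="" makes quote() encode "/" as well; \r becomes %0D, \n becomes %0A.
    return f"{base}?commentContent={quote(comment, safe='')}"
```

Decoding the parameter on the server (unquote in Python) restores the original line breaks, so nothing is lost in transit.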
<Textarea> elements store "new lines" based on the host operating system, and sometimes even just the browser.
Between windows, mac, and linux, this could be \r\n, \n, or \r.
\r is a "carriage return", and \n is a "line feed". (C++ people will be familiar with crlf)
These characters make up a new line.
Here's the problem:
When you get the value of a textarea, it doesn't always preserve the line marker when going to store to a variable. You can try to replace the line break, or encode it as per @Pointy's answer.
This question has existed in a few forms, but they're all hard to find because we have a lot of different names for the line break.
Basically, the browser hates you for some reason, you need to hide what you're doing because it's trying to be really really smart.
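If you go the "replace the line break" route, one option is to normalize every variant to a single form on the server once the value arrives. A minimal sketch in Python, with a hypothetical function name:

```python
def normalize_newlines(text: str) -> str:
    """Convert \\r\\n and bare \\r line endings to \\n."""
    # Replace \r\n first so it is not turned into two line breaks.
    return text.replace("\r\n", "\n").replace("\r", "\n")
```

After that, the rest of the code only has to deal with \n, whichever platform or browser the text came from.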
Use the escape() function on the value (encodeURIComponent() is the better choice today, since escape() is deprecated), or use a POST instead of a GET.