Security of adding php information into javascript - javascript

The script I have works perfectly fine but I have done some reading and now I am wondering if this code is secure and if not what would be the secure & correct way of implementing it:
<?php
function get_current_page_url()
{
$url = $_SERVER['REQUEST_URI'];
$main_url = explode('?', $url);
$main_url = explode('/', $main_url[0]);
return $main_url = end($main_url);
}
?>
<script type="text/javascript">
var pageUrl = "<?php echo get_current_page_url() ?>?" + form_data;
</script>

Can you write an answer to this question with the secure & correct way to implement it, preventing these security issues?
Injecting content from PHP into a JavaScript literal in a <script> block:
var current = <?php echo json_encode(get_current_page_url(), JSON_HEX_TAG); ?>;
var pageUrl = current + '?' + form_data;
JSON-encoding produces a string literal including the quotes. This is good enough for JavaScript because JSON literals are (nearly) a subset of JavaScript literals.
(The ‘nearly’ is because, due to an oversight in design, the raw characters U+2028 and U+2029 are allowed in JSON string literals but, before ES2019, not in JavaScript string literals. This doesn't matter here, as PHP encodes those and other non-ASCII characters to \u escapes by default anyway.)
The JavaScript is enclosed in a <script> block in an HTML document, so it's essential that if the string </script> is inside the PHP string, it doesn't come out directly in the source: if it did, it would end not just the string literal but the whole script block. JSON_HEX_TAG prevents this by replacing the < with the escape \u003C. It is not strictly necessary for this particular case, because json_encode() also escapes the / as \/ by default (unless JSON_UNESCAPED_SLASHES, available from PHP 5.4 onwards, is passed).
Using JSON_HEX_QUOT|JSON_HEX_TAG|JSON_HEX_AMP|JSON_HEX_APOS will encode more characters that are special in an HTML context. That isn't wholly necessary here, but it would allow you to inject into a string in an HTML <script> block as well as an XHTML <script> block or an HTML onxxx="..." event handler without having to perform the extra layer of htmlspecialchars() that could be necessary in these places.
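To see what this escaping buys you, the effect can be mimicked in plain JavaScript. This is only a rough sketch: toScriptSafeLiteral is a hypothetical helper name, not a PHP or browser API, and it approximates what JSON_HEX_TAG does for < and > rather than reproducing json_encode() exactly.

```javascript
// Hypothetical helper: JSON-encode a value, then escape the characters that
// are risky inside an inline <script> block. JSON.stringify leaves "<", ">",
// U+2028 and U+2029 raw, so they are replaced by \u escapes afterwards.
function toScriptSafeLiteral(value) {
  return JSON.stringify(value)
    .replace(/</g, '\\u003C')
    .replace(/>/g, '\\u003E')
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}

const hostile = '</script><script>alert(1)</script>';
const literal = toScriptSafeLiteral(hostile);

// The literal contains no raw "<", so it cannot close the script block,
// yet it still decodes back to the original string.
const roundTrip = JSON.parse(literal);
```

The point is that the escaped form is still the same string as far as JavaScript is concerned; only the HTML parser sees something different.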
Having said that, you are better off not injecting into JavaScript at all, because keeping track of multiple nested injection contexts is hard. It's better to keep JS out of your HTML documents entirely, ideally. You can inject the content into data- attributes with plain old HTML-escaping, like you should be using everywhere you put content into HTML, and then read the strings direct from the DOM:
<body data-current-url="<?php echo htmlspecialchars(get_current_page_url()); ?>">
...
var current = document.body.getAttribute('data-current-url');
var pageUrl = current + '?' + form_data;
Having said that, all of this is likely superfluous in this case because you can already read the document's URL directly from JavaScript using the location object. Unless you are doing something wacky involving mod_rewrite, it would be the same to say just:
var pageUrl = location.pathname.split('/').pop() + '?' + form_data;
and if all you want is to be able to make a relative link to the current page with a different query string, you don't even need to do that, you can simply use '?' + form_data.
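For illustration, here is the same last-segment logic as a small function that can be tried outside a browser. The names lastPathSegment and buildPageUrl are made up for this sketch; in a page you would pass location.href (or just use location.pathname directly as above).

```javascript
// Hypothetical helper mirroring location.pathname.split('/').pop():
// returns the last path segment, ignoring the query string.
function lastPathSegment(href) {
  return new URL(href).pathname.split('/').pop();
}

// Builds the same kind of relative URL as the question's pageUrl.
function buildPageUrl(href, formData) {
  return lastPathSegment(href) + '?' + formData;
}

const fromFile = buildPageUrl('http://example.com/shop/list.php?page=2', 'q=tea');
const fromDirectory = buildPageUrl('http://example.com/shop/', 'q=tea');
```

For a URL ending in a slash the last segment is empty, so the result degenerates to just '?' + formData, which is the relative-link form mentioned above.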

Related

Unterminated string literal with PHP in JavaScript

Please tell me why this code tells me
SyntaxError: unterminated string literal
My code:
<script>
console.log(" <?php $geladen = file_get_contents("./testtext"); echo $geladen; ?> ");
</script>
That's a JavaScript error message, which strongly implies one of two things:
the JavaScript that reaches the browser still includes the <?php etc., meaning the PHP didn't get parsed on the server (and thus the browser flipped out on "./testtext"), or
the file testtext (and therefore your variable $geladen) contains quotation marks. Either is possible from the very little information you have posted.
You can figure out which it is by looking at the HTML in your browser.
If it's the former (if you see <?php in the HTML), then you need to fix your server configuration.
If it's the latter (if testtext contains any " marks), then you need to encode it properly before echoing, using json_encode(). Note that json_encode() emits the surrounding quotes itself, so the JavaScript must not add its own:
<script>
console.log(<?php $geladen = file_get_contents("./testtext"); echo json_encode($geladen); ?>);
</script>
All that said, mixing PHP and HTML (not to mention PHP, HTML, and JavaScript) this way is not a great practice. You'd be much better off using a templating engine of some sort (Twig, Blade, etc.).
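If the contents of 'testtext' contain a quote mark, it will break the JavaScript. Try addslashes(), though be aware that it does not escape newlines, so a multi-line file will still produce an unterminated string literal.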
<script>
console.log(" <?php $geladen = addslashes(file_get_contents("./testtext")); echo $geladen; ?> ");
</script>
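The failure mode can be reproduced without PHP. The sketch below pastes some sample text (made up here, standing in for the contents of testtext) between quotes the way the original code does, and compares that with a JSON-encoded literal, using JSON.stringify as a stand-in for PHP's json_encode().

```javascript
// Sample text standing in for the file contents: it has quotes and a newline.
const fileContents = 'He said "hi"\nsecond line';

// Naive approach: paste the raw text between double quotes.
const naiveSource = 'return "' + fileContents + '";';
let naiveFailed = false;
try {
  new Function(naiveSource); // parsing alone is enough to fail
} catch (e) {
  naiveFailed = e instanceof SyntaxError;
}

// Encoded approach: JSON.stringify produces a complete, quoted literal.
const safeSource = 'return ' + JSON.stringify(fileContents) + ';';
const roundTripped = new Function(safeSource)();
```

The encoded version needs no surrounding quotes of its own, which is the same reason the PHP output should not be wrapped in quotes.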

Converting href perl variables to normal scalar variables

I have these two variables that I am trying to compare. They both have the same value, however, one is a href variable - meaning, it's being read from a file like this
<a href=http://google.com>Variable</a>
It's read like this, but displayed as an anchor tag in the browser, so when I go to compare a value using print "$collect_zids{$key} --> $temp";I see in the browser as
Variable --> Variable
How it appears in the browser. One text another link.
I'm assuming these two values are different hence why this code does not run
if($collect_zids{$key} eq $from_picture){
print "<h1>Hello</h1>";
}
Is there a way I can convert the href variable into a normal scalar variable so that I can compare them?
Thanks!
P.S. I think Javascript might be the only way, however, I don't have any experience with it.
There is no such thing as an "href variable". You have two scalar variables. One contains plain text and the other contains HTML. Your task is to extract the text inside the HTML <a> tag from the HTML variable and to compare that text with the text from the plain text variable.
One way to do that would be to remove the HTML from the HTML variable.
my $html = '<a href=http://google.com>Variable</a>';
my $text = 'Variable';
$html =~ s/<.+?>//g;
if ($html eq $text) {
say "Equal";
} else {
say "Not Equal [$html/$text]";
}
But it cannot be emphasised enough that parsing HTML using a regular expression is very fragile and is guaranteed not to work in many cases. Far better to use a real HTML parser. HTML::Strip is made for this very purpose.
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
use HTML::Strip;
my $html = '<a href=http://google.com>Variable</a>';
my $text = 'Variable';
my $parser = HTML::Strip->new;
$html = $parser->parse($html);
if ($html eq $text) {
say "Equal";
} else {
say "Not Equal [$html/$text]";
}
It's also worth pointing out that this is answered in the Perl FAQ
How do I remove HTML from a string?
Use HTML::Strip, or HTML::FormatText which not only removes HTML but
also attempts to do a little simple formatting of the resulting plain
text.
Update: In a comment, you say
I have no way of using these methods since I am not explicitly defining the variable.
Which is clearly not true. How a variable is initialised has no bearing whatsoever on how you can use it.
I assume your HTML text is in the variable $from_picture, so you would strip the HTML with code like this:
my $parser = HTML::Strip->new;
my $stripped = $parser->parse($from_picture);
if($collect_zids{$key} eq $stripped){
print "<h1>Hello</h1>";
}
I have no idea where you got the idea that you couldn't use my solution because I was directly initialising the variables, where you were reading the data from a file. An important skill in programming is the ability to see through complex situations and extract the relevant details. It appears you need to do some more work in this area :-)
I found the answer using the Perl module HTML::FormatText;
use HTML::FormatText;
my $formatter = HTML::FormatText->new();
my $string = HTML::FormatText->format_file("path_to_the_file"); #$string variable to hold the result and the path must be for a file.
After using the HTML::FormatText module, I was able to get the raw string that was being read, instead of it being interpreted as HTML. So, I was getting <a href=http://google.com>Variable</a> returned, instead of just Variable. After getting the raw string, I could use regex to extract the parts that I needed.
Credit to - https://metacpan.org/pod/HTML::FormatText

XSS prevention and .innerHTML

When I allow users to insert data as an argument to the JS innerHTML function like this:
element.innerHTML = "User provided variable";
I understood that in order to prevent XSS, I have to HTML encode, and then JS encode the user input because the user could insert something like this:
<img src=a onerror='alert();'>
Only HTML or only JS encoding would not help because the .innerHTML method as I understood decodes the input before inserting it into the page. With HTML+JS encoding, I noticed that the .innerHTML decodes only the JS, but the HTML encoding remains.
But I was able to achieve the same by double encoding into HTML.
My question is: Could somebody provide an example of why I should HTML encode and then JS encode, and not double encode in HTML when using the .innerHTML method?
Could somebody provide an example of why I should HTML encode and then
JS encode, and not double encode in HTML when using the .innerHTML
method?
Sure.
Assuming the "user provided data" is populated in your JavaScript by the server, then you will have to JS encode to get it there.
The following is pseudocode on the server side, but JavaScript on the front end:
var userProdividedData = "<%=serverVariableSetByUser %>";
element.innerHTML = userProdividedData;
As in ASP.NET, <%= %> outputs the server-side variable without encoding. If the user is "good" and supplies the value foo, then this results in the following JavaScript being rendered:
var userProdividedData = "foo";
element.innerHTML = userProdividedData;
So far no problems.
Now say a malicious user supplies the value "; alert("xss attack!");//. This would be rendered as:
var userProdividedData = ""; alert("xss attack!");//";
element.innerHTML = userProdividedData;
which would result in an XSS exploit where the code is actually executed in the first line of the above.
To prevent this, as you say you JS encode. The OWASP XSS prevention cheat sheet rule #3 says:
Except for alphanumeric characters, escape all characters less than
256 with the \xHH format to prevent switching out of the data value
into the script context or into another attribute.
So to secure against this your code would be
var userProdividedData = "<%=JsEncode(serverVariableSetByUser) %>";
element.innerHTML = userProdividedData;
where JsEncode encodes as per the OWASP recommendation.
This would prevent the above attack as it would now render as follows:
var userProdividedData = "\x22\x3b\x20alert\x28\x22xss\x20attack\x21\x22\x29\x3b\x2f\x2f";
element.innerHTML = userProdividedData;
Now you have secured your JavaScript variable assignment against XSS.
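As a rough illustration of what such a JsEncode might do, here is a minimal sketch in JavaScript. The function name jsEncode and the choice to emit \uHHHH for code units of 256 and above are assumptions of this sketch (the quoted rule only covers characters below 256); in production you would use a vetted encoder from your platform rather than this.

```javascript
// Hypothetical encoder following the quoted rule: keep ASCII alphanumerics,
// escape every other character below 256 as \xHH. Code units >= 256 are
// escaped as \uHHHH here, which is an assumption of this sketch.
function jsEncode(input) {
  let out = '';
  for (let i = 0; i < input.length; i++) {
    const ch = input[i];
    const code = input.charCodeAt(i);
    if (/[A-Za-z0-9]/.test(ch)) {
      out += ch;
    } else if (code < 256) {
      out += '\\x' + code.toString(16).padStart(2, '0');
    } else {
      out += '\\u' + code.toString(16).padStart(4, '0');
    }
  }
  return out;
}

const attack = '"; alert("xss attack!");//';
const encodedAttack = jsEncode(attack);

// Placed inside a string literal, the encoded text is just data again.
const decoded = new Function('return "' + encodedAttack + '";')();
```

Because nothing but letters, digits and backslash escapes is emitted, the value cannot terminate the string literal regardless of which quote character surrounds it.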
However, what if a malicious user supplied <img src="xx" onerror="alert('xss attack')" /> as the value? This would be fine for the variable assignment part as it would simply get converted into the hex entity equivalent like above.
However the line
element.innerHTML = userProdividedData;
would cause alert('xss attack') to be executed when the browser renders the inner HTML. This resembles a DOM-based XSS attack, as it uses rendered JavaScript rather than HTML; however, because it passes through the server, it is still classed as reflected or stored XSS, depending on where the value is initially set.
This is why you would need to HTML encode too. This can be done via a function such as:
function escapeHTML (unsafe_str) {
  return unsafe_str
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/\"/g, '&quot;')
    .replace(/\'/g, '&#x27;')
    .replace(/\//g, '&#x2F;');
}
making your code
element.innerHTML = escapeHTML(userProdividedData);
or it could be done via jQuery's text() function.
Update regarding question in comments
I just have one more question: You mentioned that we must JS encode
because an attacker could enter "; alert("xss attack!");//. But if we
would use HTML encoding instead of JS encoding, wouldn't that also
HTML encode the " sign and make this attack impossible because we
would have: var userProdividedData ="&quot;; alert(&quot;xss attack!&quot;);//";
I'm taking your question to mean the following: Rather than JS encoding followed by HTML encoding, why don't we just HTML encode in the first place, and leave it at that?
Well because they could encode an attack such as <img src="xx" onerror="alert('xss attack')" /> all encoded using the \xHH format to insert their payload - this would achieve the desired HTML sequence of the attack without using any of the characters that HTML encoding would affect.
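A small sketch of that bypass, runnable outside a browser. Here htmlEncode is a stand-in for whatever server-side HTML encoder is in use, and the payload is a shortened, made-up variant of the attack that avoids quotes and slashes.

```javascript
// Stand-in for a server-side HTML encoder (same character set as usual).
function htmlEncode(s) {
  return s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#x27;')
    .replace(/\//g, '&#x2F;');
}

// The attacker types the backslash escapes literally: \x3c ... \x3e
const payload = '\\x3cimg src=xx onerror=alert(1)\\x3e';

// HTML encoding finds nothing to change in it.
const encoded = htmlEncode(payload);

// Rendered into a script block, the JS parser decodes the \xHH escapes.
const rendered = 'var userProdividedData = "' + encoded + '";';
const value = new Function(rendered + ' return userProdividedData;')();
```

After the script runs, value holds real angle brackets, so assigning it to innerHTML would create the img element and fire its onerror handler.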
There are some other attacks too: If the attacker entered \ then they could force the browser to miss the closing quote (as \ is the escape character in JavaScript).
This would render as:
var userProdividedData = "\";
which would trigger a JavaScript error because it is not a properly terminated statement. This could cause a Denial of Service to the application if it is rendered in a prominent place.
Additionally say there were two pieces of user controlled data:
var userProdividedData = "<%=serverVariableSetByUser1 %>" + ' - ' + "<%=serverVariableSetByUser2 %>";
the user could then enter \ in the first and ;alert('xss');// in the second. This would change the string concatenation into one big assignment, followed by an XSS attack:
var userProdividedData = "\" + ' - ' + ";alert('xss');//";
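This one is easy to verify by building the script text and running it with a harmless stand-in for alert. In this sketch renderUnsafe is a hypothetical substitute for the server template, and the alert passed in merely records its argument.

```javascript
// Hypothetical stand-in for the server template: both values are pasted
// into the script text with no JS encoding at all.
function renderUnsafe(first, second) {
  return 'var userProdividedData = "' + first + '" + \' - \' + "' + second + '";';
}

// First field: a single backslash. Second field: the injected statement.
const script = renderUnsafe('\\', ";alert('xss');//");

// Run the rendered script with a recording function in place of alert.
const alertCalls = [];
new Function('alert', script)((message) => alertCalls.push(message));
```

The backslash swallows the first closing quote, so everything up to the second field's opening quote becomes one string, and the attacker's alert call then runs as ordinary code.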
Because of edge cases like these it is recommended to follow the OWASP guidelines as they are as close to bulletproof as you can get. You might think that adding \ to the list of HTML encoded values solves this, however there are other reasons to use JS followed by HTML when rendering content in this manner because this method also works for data in attribute values:
<a href="javascript:void(0)" onclick="myFunction('<%=JsEncode(serverVariableSetByUser) %>'); return false">
Regardless of whether it is single or double quoted:
<a href='javascript:void(0)' onclick='myFunction("<%=JsEncode(serverVariableSetByUser) %>"); return false'>
Or even unquoted:
<a href=javascript:void(0) onclick=myFunction("<%=JsEncode(serverVariableSetByUser) %>");return false;>
If you HTML encoded as mentioned in your comment, the attribute would contain an entity value:
onclick='var userProdividedData ="&quot;;"' (shortened version)
the code is actually run via the browser's HTML parser first, so userProdividedData would be
";;
instead of
&quot;;
so when you add it to the innerHTML call you would have XSS again. Note that <script> blocks are not processed via the browser's HTML parser, except for the closing </script> tag, but that's another story.
It is always wise to encode as late as possible, as shown above. Then if you need to output the value in anything other than an HTML context (e.g. an actual alert box, which does not render HTML), it will still display correctly.
That is, with the above I can call
alert(serverVariableSetByUser);
just as easily as setting HTML
element.innerHTML = escapeHTML(userProdividedData);
In both cases it will be displayed correctly, without certain characters disrupting output or causing undesirable code execution.
A simple way to make sure the contents of your element is properly encoded (and will not be parsed as HTML) is to use textContent instead of innerHTML:
element.textContent = "User provided variable with <img src=a>";
Another option is to use innerHTML only after you have encoded (preferably on the server if you get the chance) the values you intend to use.
I have faced this issue in my ASP.NET Webforms application. The fix to this is relatively simple.
Install HtmlSanitizationLibrary from the NuGet Package Manager and reference it in your application. In the code-behind, use the Sanitizer class in the following way.
For example, if the current code looks something like this,
YourHtmlElement.InnerHtml = "Your HTML content" ;
Then, replace this with the following:
string unsafeHtml = "Your HTML content";
YourHtmlElement.InnerHtml = Sanitizer.GetSafeHtml(unsafeHtml);
This fix will remove the Veracode vulnerability and make sure that the string gets rendered as HTML. Encoding the string at code behind will render it as 'un-encoded string' rather than RAW HTML as it is encoded before the render begins.

Storing HTML in a Javascript Variable

I am currently coding a website that will allow a user to input data into a MySQL database using a WYSIWYG editor. The data stores into the database without a problem and I can query it using PHP and display it on my webpage.
Up to this point everything is working ok until I try to move the HTML stored in the MySQL database into a javascript variable. I was able to get it working using CDATA[], but not for every browser. It works in Firefox, but not IE or Chrome. I am looking for a solution that will be able to work in all of the browsers. Any help would be greatly appreciated.
Since you're using PHP:
<script>
var foo = <?php echo json_encode($htmlFromDatabase); ?>
</script>
The json_encode function, while normally used for encoding JSON objects, is also useful for converting other PHP variables (like strings) to their JavaScript equivalents.
"Safefy" your code, like this
str_replace( array("\r", "\r\n", "\n", "\t"), '', str_replace('"','\"',$str));
The above function clears line breaks and tabs so that your code appears on one line. If it breaks into more than one line, it cannot be parsed as a string in JS and an error is thrown. We are also escaping " to \". There may be more string replacements that need to take place; it depends on your content.
and inline it in javascript,
//<![CDATA[
var myHtml = <?php echo '"'.$stuff.'"'; ?>;
//]]>
keep in mind the '"' part so that it appears like this var myHtml = "test";

Javascript can't find my mod_rewrite query string!

I use the following javascript class to pull variables out of a query string:
getUrlVars : function() {
var vars = {};
var parts = window.location.href.replace(/[?&]+([^=&]+)=([^&]*)/gi, function(m,key,value) {
vars[key] = value;
});
return vars;
}
So this works: http://example.com/signinup.html?opt=login
I need http://www.example.com/login/ to work the same way. Using mod_rewrite:
RewriteRule ^login/? signinup.html?opt=login [QSA]
allows the page to load, the javascript to load, the css to load, but my javascript functions can't find the opt key (i.e., it's undefined). How do I get opt to my javascript?
Javascript is client-side. Mod_rewrite is server-side.
Therefore Javascript will never see the rewritten URL. As far as your browser is concerned, the URL that you entered is the finished address.
The only real solution is to change your Javascript so it looks at the URL it's got rather than the old version (or possibly parse for both alternatives, since the old URL will still work and people may still have old bookmarks).
The other possible solution would be to go to your server-side code (PHP? whatever?) where you can see the rewritten URL, and insert some JavaScript code there which you can parse on the client side. Not an ideal solution though. You'd be better off just going with option 1 and changing your JavaScript to cope with the URLs it's actually going to be getting.
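A minimal sketch of option 1, parsing for both alternatives. The function name getOpt and the assumption that /login/ is the only rewritten path are mine; extend the pattern to whatever your RewriteRules actually cover.

```javascript
// Hypothetical helper: read "opt" from the query string if present,
// otherwise fall back to the pretty URL form (/login/ means opt=login).
function getOpt(href) {
  const url = new URL(href);
  const fromQuery = url.searchParams.get('opt');
  if (fromQuery !== null) {
    return fromQuery;
  }
  // Assumption: only /login or /login/ is rewritten on the server.
  const match = url.pathname.match(/^\/(login)\/?$/);
  return match ? match[1] : undefined;
}

const oldStyle = getOpt('http://example.com/signinup.html?opt=login');
const newStyle = getOpt('http://www.example.com/login/');
const unrelated = getOpt('http://www.example.com/about/');
```

In the page itself you would call getOpt(window.location.href), which keeps old bookmarks with ?opt=login working alongside the new /login/ address.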
Your issue is that JavaScript runs on the client side, so it will never see the ?opt=login part to which the URL gets converted internally on the server.
Apart from changing your regular expression to match the new URL format, the easiest workaround might be to write a JavaScript statement on server side that introduces the value of the opt variable into JavaScript.
If you're using PHP, you can have the PHP create a JavaScript variable for you. For example:
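$params = "?";
foreach($_GET as $key => $value) {
// URL-encode each part so user input cannot break the query string
$params = $params . urlencode($key) . "=" . urlencode($value) . "&";
}
// json_encode() emits a safely quoted JavaScript string literal
echo 'var urlParams = ' . json_encode($params, JSON_HEX_TAG) . ';';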
Now, your JavaScript will have access to a urlParams variable that looks like this
?opt=login&
Then, in your Javascript code, wherever you expected to use the URL parameters, use the urlParams instead.
If it's a special case, then put it as a special case in some way. If you rewrite generally, change your general regular expression. The way mod_rewrite works, the client never knows the rewritten URL. From the client, it's /login/ and /login/ only. Only the server ever knows that it's really signinup.html?opt=login. So there's no way your regular expression or location.href can know about it.
Unless you use the [R] flag in your RewriteRule, the browser (and thus JavaScript) will never know about the new URL. If you don't want to be redirecting people, you're going to have to add some code to your login page that outputs the GET parameters as JavaScript in the page.
