Advanced Javascript Obfuscation - javascript

I've been training heavily in JS obfuscation, starting to know my way around all advanced concepts, but I recently found an obfuscated code, I believe it is some form of "Native Javascript Code", I just can't find ANY documentation on this type of obfuscation :
Here is a small extract :
'\141\75\160\162\157\155\160\164\50\47\105\156\164\162\145\172\40'
It is called this way :
eval(eval('\141\75\160\162\157\155\160\164\50\47\105\156\164\162\145\172\40'))
Since the code is the work of another and I encoutered it in a JS challenge I'm not posting the full code, so the example I gave won't work, but the full code does work.
So here is my question:
What type of code is this? And where can I learn more about it?
Any suggestions appreciated :)

It's just a string with the characters escaped. You can read it in the JavaScript console in any browser:
console.log('\141\75\160\162\157\155\160\164\50\47\105\156\164\162\145\172\40')
will print:
"a=prompt('Entrez "

It's just escaped characters, one part outputting the string of a query and another actually running the returned string - try calling it in a console.
eval('\160\162\157\155\160\164\50\47\105\156\164\162\145\172\47\51')
Might help?

These numbers is the ascii codes (http://www.asciitable.com/index/asciifull.gif) of characters (in Octal representation).
You can convert it to characters. This is used when somebody wants to make an XSS attack, or wants to hide the js code.
So the string what you written represents:
a=prompt('Entrez
The js engines, browsers can translate the octal format to the 'real' string. With eval function it could run. (in case the 'translated' code has no syntax errors)

Related

Unicode characters cannot be decoded

I use browserless.js (headless Chrome) to fetch the html code of a website, and then use a regular expression to find certain image URLs.
One example is the following:
https://vignette.wikia.nocookie.net/moviepedia/images/8/88/Adrien_Brody.jpg/revision/latest/top-crop/width/360/height/450?cb\u003d20141113231800\u0026path-prefix\u003dde
There are unicode characters such as \u003d, which should be decoded (in this case to =). The reason is that I want to include these images in a site, and without decoding some of them cannot be displayed (like that one above, just paste the URL; it gives broken-image.webp).
I have tried lots of things, but nothing works.
JSON.parse(JSON.stringify(...))
String.prototype.normalize()
decodeURIComponent
Curiously, the regular expression for "\u003d" (i.e. "\\u003d" in js) does not match that string above, but "u003d" does.
This is all very weird, and my current guess is that browserless is responsible for some weird formatting behind the scenes. Namely, when I console log the URL and copy paste it somewhere else, every method mentioned above works for decoding.
I hope that someone can help me on this.
Just to mark this one as answered. Thomas replied:
JSON.parse(`"${url}"`)

Javascript String.fromCharCode() latin1 encoding issue

I don't know if I can write a fiddle for this so I'll just try to explain this as well as I possibly can.
We have an application where we've written an editor. We need to check some grammar rules on strings/tokens that are being entered into the editor.
However, when using String.fromCharCode(190), instead of getting a "." as in utf-8 we get a "¾" from latin1.
I've checked whether or not we set latin1 as the default encoding somewhere but I've been unable to find anything.
Can anyone point me into the right direction or possibly find a solution for this issue?
The HTML charset is UTF-8 as well as the javascript file (this only adds to my confusion haha).
As per the doc, String.fromCharCode() returns a unicode character. It's got nothing to do with encoding. "¾" is the unicode character for 190, that's it. http://unicode-table.com/

QML Javascript "console.log" with UTF-8

As we all know that in QML Javascript we can log to console by calling the 'console.log' function.
However, as per my experience, I can't log UTF-8 strings to console that the strings are still in proper characters, some are replaced by question marks.
For example:
console.log("XXXaàáảãạYYY");
will become (in console):
XXX??????YYY
How to solve this problem? I know console is just for debugging so I can use latin, but anyway, full UTF-8 support would be more interesting.
My console show them just fine.
I guess the problem is not about QML, but your console can't handle those UTF-8 characters.

encodeURIComponent encodes differently, depending on environment

I am passing an object via the url using:
encodeURIComponent(JSON.stringify(myObject))
"ä" is encoded as "%C3%A4" on my local server.
Unfortunately it is encoded as "a%CC%88" on the webserver.
Which breaks my app because it is part of the name of a database field which isn't found when wrong encoded. And I can't control that there are no ä's in field names because the app allows users to upload their own data.
How can I make sure that "ä" is always encoded correctly?
SORRY. To make this clear: The encoding happens both times client-side in the browser. But when the web-app is served from the webserver the "ä" is encoded as "%C3%A4" instead of "a%CC%88" (I've tested both in the same chrome browser)
Thanks for all your help. It got me to dig deeper:
I have code that runs on an event. It loops through checkboxes and creates an array of objects containing (also) the field names. The code gets the field names from an attribute named "feld" of the checkbox:
<div class="checkbox">
<label>
<input class="feld_waehlen" type="checkbox" dstyp="Taxonomie" datensammlung="SISF Index 2 (2005)" feld="Artname vollständig">Artname vollständig
</label>
</div>
running this code:
console.log("this.getAttribute('feld') = " + this.getAttribute('feld'));
gives as expected: $(this).attr('feld') = Artname vollständig
If while looping, I run:
console.log('encodeURIComponent("Artname vollständig") = ' + encodeURIComponent("Artname vollständig"));
the answer is correct: encodeURIComponent("Artname vollständig") = Artname%20vollst%C3%A4ndig
But if I run:
console.log("encodeURIComponent(this.getAttribute('feld')) = " + encodeURIComponent(this.getAttribute('feld')));
the answer is: encodeURIComponent(this.getAttribute('feld')) = Artname%20vollsta%CC%88ndig
This happens all in the browser. But the issue only appears, when the web-app is served from the webserver (a couchapp running on cloudant.com).
How can it be that the method "getAttribute" returns a different encoding?
The following code has been tested on Chrome 29 OS X, IE 8 Windows XP.
encodeURIComponent("ä") //%C3%A4"
decodeURIComponent("%C3%A4") //ä
so basically "%C3%A4" should be the expected output.
I think the issue here might be encodeURIComponent require a UTF-8 encoding while your server-side language returns something other than this.
encodeURICompoent - MDN
just a follow up in case somebody runs into this issue later.
It seems to be unique to cloudant.com where my couchapp was hosted.
This is the answer I got from their very helpful support:
OK - I think I've found the culprit. The issue is that, due to internal optimisations (which are not present in CouchDB), the form of unicode strings can get changed. In this case, ä is represented as:
U+0061 LATIN SMALL LETTER A character
U+0308 COMBINING DIAERESIS character (̈)
instead of
U+00E4 LATIN SMALL LETTER A WITH DIAERESIS character (ä)
Both are semantically equivalent, so the fix is to normalize your unicode strings before comparison. Unfortunately, JavaScript has no built-in unicode normalization, but you can use a library such ashttps://github.com/walling/unorm.
It's not an issue for me any more as I changed to a virtual server running on digitalocean.com with vanilla couchdb (and am very happy with it).
But I do think this could hit others developing couchapps in German or other languages needing utf8 and hosting them on cloudant.com
Thanks for your great help.
Alex

regex to find url in a text

I have to find the first url in the text with a regular expression:
for example:
I love this website:http://www.youtube.com/music it's fantastic
or
[ es. http://www.youtube.com/music] text
I looked into this issue last year and developed a solution that you may want to look at - See: URL Linkification (HTTP/FTP) This link is a test page for the Javascript solution with many examples of difficult-to-linkify URLs.
My regex solution, written for both PHP and Javascript - is not simple (but neither is the problem as it turns out.) For more information I would recommend also reading:
The Problem With URLs by Jeff Atwood, and
An Improved Liberal, Accurate Regex Pattern for Matching URLs by John Gruber
The comments following Jeff's blog post are a must read if you want to do this right...
Note that this question gets asked a lot. Maybe do a search next time :)
You can't do this perfectly with a regular expression. You may be interested in this blog post. There is a bit more information on Regex Guru, but even those look very fragile. You will need to have additional checks outside of your regular expression to catch the edge cases.
Identifying URLs is tricky because they are often surrounded by punctuation marks and because users frequently do not use the full form of the URL. Many JavaScript functions exist for replacing URLs with hyperlinks, but I was unable to find one that works as well as the urlize filter in the Python-based web framework Django. I therefore ported Django's urlize function to JavaScript: https://github.com/ljosa/urlize.js
It actually would not pick up the URL in your example because there is a colon right before the URL. But if we modify the example a little:
urlize("I love this website: http://www.youtube.com/music it's fantastic", true, true)
=> 'I love this website: http://www.youtube.com/music it's fantastic"'
Note the second argument which, if true, inserts rel="nofollow" and the third argument which, if true, quotes characters that have special meaning in HTML.
This might work->
\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))
Found it somewhere
Will find links ->
http://foo.com/blah_blah/
(Something like http://foo.com/blah_blah)
http://foo.com/blah_blah_(wikipedia)
Hope this works....
i am using this regex : :) ( its translated ABNF )
[a-zA-Z]([a-zA-Z]|[0-9]|\+|\-|\.)*:\/\/((([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:)*#)?(\[((([0-9A-Fa-f]{1,4}:){6}([0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]))|::([0-9A-Fa-f]{1,4}:){5}([0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]))|([0-9A-Fa-f]{1,4})?::([0-9A-Fa-f]{1,4}:){4}([0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]))|(([0-9A-Fa-f]{1,4}:){0,1}[0-9A-Fa-f]{1,4})?::([0-9A-Fa-f]{1,4}:){3}([0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]))|(([0-9A-Fa-f]{1,4}:){0,2}[0-9A-Fa-f]{1,4})?::([0-9A-Fa-f]{1,4}:){2}([0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]))|(([0-9A-Fa-f]{1,4}:){0,3}[0-9A-Fa-f]{1,4})?::[0-9A-Fa-f]{1,4}:([0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]))|(([0-9A-Fa-f]{1,4}:){0,4}[0-9A-Fa-f]{1,4})?::([0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]))|(([0-9A-Fa-f]{1,4}:){0,5}[0-9A-Fa-f]{1,4})?::[0-9A-Fa-f]{1,4}|(([0-9A-Fa-f]{1,4}:){0,6}[0-9A-Fa-f]{1,4})?::)|v[0-9A-Fa-f]\.(([a-zA-Z]|[0-9]|-|\.|_|~)|[!$&'\(\)\*\+,;=]|:))\]|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])|(([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=])*)(:[0-9]*)?(((\/(([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#)*)*|\/((([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#){1}(\/(([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#)*)*)?|(([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#){1}(\/(([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#)*)*|(([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|#){1}(\/(([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#)*)*))?\/?(\?((([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#)|\/|\?)*)?(\#((([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#)|\/|\?)*)?
You can use the following regex expression for extracting any type of url coming in message.
String regex = "(http(s)?:\/\/.)?(www\.)?[-a-zA-Z0-9#:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9#:%_\+.~#?&/=]*)";
Typescript/Angular
This works for me:
const regExpressionUrl = new RegExp(/(https?:\/\/[^\s]+)/g); //detect URL
Ref: https://www.regextester.com/96249%7CRegular

Categories