My app depends on a webservice to form its URIs, so sometimes it comes up with what I believe is a windows-1250 encoded string (/punk%92d), and Express fails as follows:
Connect
400 Error: Failed to decode param 'punk%92d'
at Layer.match
So I thought about converting each link to that segment into UTF-8 (for example /punk’d, so there would be no reference to the offending encoding), and back again to windows-1250 to work with the external webservice.
I tried this approach using both iconv and iconv-lite, but there's always something wrong with the results: /punk d, /punk�d, etc.
Here's a sample using iconv:
var str = 'punk’d';
var buf = new Buffer(str.toString('binary'), 'binary');
console.log(new Iconv('UTF-8', 'Windows-1250').convert(buf).toString('binary'));
…and iconv-lite:
console.log(iconv.decode(new Buffer(str), 'win1250'));
I know using binary is a bad approach, but I was hoping something, anything, would just do the job. I obviously tried multiple variations of this code, since my knowledge of Buffers is limited, and even simpler things wouldn't work, like:
console.log(new Buffer('punk’d').toString('utf-8'));
So I'm interested in either a way to handle those encoded strings in the URI within express, or an effective way to convert them within node.js.
Sorry if this seems like too simple a thing to try, but since Node and Express are both JavaScript, have you tried simply using decodeURIComponent('punk’d')? It looks to me like it's simply a standard encoded URI. I think you're getting that weird output from iconv because you're converting from the wrong encoding.
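For example (runnable in Node or a browser console; the second snippet shows why Express chokes on the original form):

// The UTF-8 percent-encoding of the right single quote decodes cleanly:
decodeURIComponent('punk%E2%80%99d'); // 'punk’d'

// But %92 is a lone windows-1250 byte, not valid UTF-8, which is why
// decodeURIComponent (and Express's router) rejects it:
try {
  decodeURIComponent('punk%92d');
} catch (e) {
  console.log(e.message); // 'URI malformed'
}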
Forgive the lack of code but I'm working in a production environment and it's a lot. I can provide more specific examples if needed but I will explain the basic situation here.
I'm using a library called PdfMake (which is excellent, btw; great for making PDFs in JS if you need that) to make a dynamic PDF on the fly.
Now I need to get the PDF into an S3 bucket, but our stack currently relies on ColdFusion. Luckily, PdfMake has a handy method to convert the pdf data to Base64 and ColdFusion has a handy function to convert Base64 to binary.
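For reference, the client side looks roughly like this (a sketch; docDefinition stands in for your own document definition, and getBase64 is the pdfMake helper mentioned above):

// Build the PDF in the browser and hand its base64 form to the server.
pdfMake.createPdf(docDefinition).getBase64(function (b64) {
  // POST b64 to the ColdFusion endpoint as the fileData argument
});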
So I send the base64 to my server, convert it to binary, make a new ColdFusion pdf, and read it like this (fileData is a base64 encoded string):
public function upload_pdf(string fileName, any fileData){
    var myPdf = new pdf();
    var binary = ToBinary(arguments.fileData);
    myPdf.read( source=binary, name="fileSource");
}
For some reason this action is failing. I usually get an error that says "The Document has no catalog of type dictionary", which is very cryptic and brings up no helpful results when I search it. Sometimes, without changing anything, I get an error that reads "the rebuild document still has no trailer". From googling it seems trailers are something specific to do with PDFs.
Intuitively I would think that this would work, since base64 and binary are versatile types of encoding. However I'm at a loss as to how to even begin to fix or diagnose this. I will probably begin looking for another solution altogether but I am curious to learn more about what is happening here so if anyone has any ideas I am down for some discussion.
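One way to start diagnosing this is to sanity-check the payload before it ever reaches ColdFusion. This is a hypothetical Node-side check, relying only on the fact that a valid PDF begins with the bytes %PDF-:

// Decode the base64 we are about to send and peek at the header.
var buf = Buffer.from(fileData, 'base64');
console.log(buf.slice(0, 5).toString('ascii')); // a valid PDF begins '%PDF-'

// If the string travels in a URL-encoded request body, '+' characters
// can arrive as spaces, silently corrupting the decoded bytes.
console.log(fileData.indexOf(' ') !== -1); // true would be a red flag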
FOR ANY CURIOUS READERS: I HAVE SOLVED THIS PROBLEM. HERE IS THE WORKING CODE (the key fix is the Replace call: '+' characters in the base64 string were arriving as spaces, so they have to be restored before decoding):
public function upload_pdf(string fileName, string fileData){
    var myPdf = new pdf();
    var binary = decodeBinary(Replace(arguments.fileData, " ", "+", "ALL"));
    myPdf.setSource(binary);
}
I am trying to read blns.json from this repo (minimaxir/big-list-of-naughty-strings).
I've tried JSON.parse; I've tried turning blns.json into blns.js and requiring the file through module.exports. I've even simply tried console.log() on the array, and nothing:
Invalid or unexpected token
What is the best way to read this file in node to be consumed by my tests?
The problem with the blns.json file is that it contains bytes that do not conform to JSON (https://github.com/minimaxir/big-list-of-naughty-strings/issues/20).
You can instead load blns.base64.json, which contains base64 representations of the naughty strings, and then decode them to Buffers (https://nodejs.org/dist/latest-v8.x/docs/api/buffer.html#buffer_class_method_buffer_from_string_encoding).
Keep in mind that if you try to convert the Buffers to Strings, specific bytes will be lost and some strings will cease to be naughty. But if you are going to use blns to test a web app, then it probably does not matter.
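A minimal sketch (assuming blns.base64.json, like blns.json, is a flat JSON array, and that the file sits next to your test):

// The base64 variant parses as ordinary JSON; decode each entry to raw bytes.
var encoded = require('./blns.base64.json');
var naughty = encoded.map(function (s) { return Buffer.from(s, 'base64'); });
// naughty is an array of Buffers, one per naughty string.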
I have a URL that links to a JavaScript file, for example http://something.com/../x.js. I need to extract a variable from x.js.
Is it possible to do this using Python?
At the moment I am using urllib2.urlopen(), but when I use .read() I get this lovely mess:
U�(��%y�d�<�!���P��&Y��iX���O�������<Xy�CH{]^7e� �K�\�͌h��,U(9\ni�A ��2dp}�9���t�<M�M,u�N��h�bʄ�uV�\��0�A1��Q�.)�A��XNc��$"SkD�y����5�)�B�t9�):�^6��`(���d��hH=9D5wwK'�E�j%�]U~��0U�~ʻ��)�pj��aA�?;n�px`�r�/8<?;�t��z�{��n��W
�s�������h8����i�߸#}���}&�M�K�y��h�z�6,�Xc��!:'D|�s��,�g$�Y��H�T^#`r����f����tB��7��X�%�.X\��M9V[Z�Yl�LZ[ZM�F���`D�=ޘ5�A�0�){Ce�L*�k���������5����"�A��Y�}���t��X�(�O�̓�[�{���T�V��?:�s�i���ڶ�8m��6b��d$��j}��u�D&RL�[0>~x�jچ7�
When I look in the dev tools to see the DOM, the only thing in the body is a string wrapped in tags. In the regular view that string is a JSON element.
.read() should give you the same thing you see in the "view source" window of your browser, so something's wrong. It looks like the HTTP response might be gzipped, but urllib2 doesn't support gzip. urllib2 also doesn't request gzipped data, so if this is the problem, the server is probably misconfigured, but I'm assuming that's out of your control.
I suggest using requests instead. requests automatically decompresses gzip-encoded responses, so it should solve this problem for you.
import requests

r = requests.get('https://something.com/x.js')
r.text    # the decoded response body; shouldn't be garbled
r.json()  # parses the body as JSON and returns a dictionary
In general, requests is much easier to use than urllib2 so I suggest using it everywhere, unless you absolutely must stick to the standard library.
import json
import urllib2

# Note: this only works if the response isn't gzip-compressed
# and the body is actually valid JSON.
js = urllib2.urlopen("http://something.com/../x.js").read()
data = json.loads(js)
I have to encode a string in C# and decode it with the JavaScript unescape function.
The JavaScript unescape is the only option, since I am sending the string with a GET request to some API that uses unescape to decode it.
I tried almost everything:
Server.UrlEncode
WebUtility.HtmlEncode
and a lot of other encodings! I even tried Uri.EscapeDataString using JScript.
Nothing encodes the way the escape function does.
Any idea how to make it work?
EDIT:
This is my code:
string apiGetRequest = String.Format("http://212.00.00.00/Klita?name={0}&city={1}&CREATEBY=test ", Uri.EscapeDataString(name), Uri.EscapeDataString(city));
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(apiGetRequest);
req.GetResponse();
Can you give an example of the string you want to encode and the encoded result?
URL encoding is the correct encoding type you need. Make sure you don't double-encode your string somewhere in your code.
You might need to use decodeURIComponent instead of unescape, since unescape is not UTF-8 aware and thus might result in a broken string after decoding.
See http://xkr.us/articles/javascript/encode-compare/ for more information.
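To see the mismatch concretely (runnable in any browser console or in Node):

// escape() predates UTF-8-aware URL encoding and emits Latin-1/%uXXXX escapes:
escape('ä');                  // '%E4'
encodeURIComponent('ä');      // '%C3%A4' (UTF-8 bytes)

// So decoding UTF-8 escapes with unescape() produces mojibake:
unescape('%C3%A4');           // 'Ã¤'
decodeURIComponent('%C3%A4'); // 'ä'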
EDIT:
I don't know much about ASP, but it looks like you're trying to access the URL not with a browser but with your ASP server-side application. Well, your server does not run any JS code. You will just retrieve the HTML markup and maybe some JS code as one big string. This code would be parsed and executed within a browser, but not within ASP.
I would like to retrieve the contents of a javascript script instead of executing it upon requesting it.
EDIT: I understand that Python is not executing the JavaScript code. The issue is that when I request this online JS script, it gets executed. I'm unable to retrieve the contents of the script. Maybe what I want is to decode the script, like so: http://jsunpack.jeek.org/dec/go
Here's what my code looks like to request the JS file:
def request(self, uri):
    data = None
    req = urllib2.Request(uri, data, self.header)
    response = urllib2.urlopen(req)
    html_text = response.read()
    return html_text.decode()
I know approximately what the insides of the script look like but all I get after the request is issued is a 'loaded' message. My guess is that the JS code gets executed. Is there any way to just request the code?
There is no HTML or JavaScript interpreter in urllib2. This module does nothing but fetch the resource and return it to you raw; it certainly will not attempt to execute any JavaScript code it receives. If you are not receiving the response you expect, check the URL with a tool like wget or monitor the network connection with Wireshark or Fiddler to see what the server is actually returning.
(decode() here only converts the bytes of the HTTP response body to Unicode characters, using the default character encoding, which probably isn't a good idea.)
ETA:
I guess what I want is to decode the Javascript like so jsunpack.jeek.org/dec/go
Ah, well that's a different game entirely. You can get the source for that here, though you'll also need to install SpiderMonkey, the JavaScript engine from Mozilla, to allow it to run the downloaded JavaScript.
There's no way to automatically ‘unpack’ obfuscated JavaScript without running it, since the packing code can do anything at all and JS is a Turing-complete language. All this tool does is run it with some wrapper code for functions like eval which packers/obfuscators typically use. Unfortunately, this sabotage is easily detectable, so if it's malware you're trying to unpack you'll find this fails as often as it succeeds.
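As a toy illustration of that wrapper idea (not jsunpack's actual code; the hook and the sample payload are invented for the example):

// A trivially "packed" script: the real payload only exists at runtime.
var packed = "eval(unescape('console.log%28%22payload%22%29'))";

// Replace the global eval so the packer hands us its payload.
var realEval = global.eval;
global.eval = function (code) {
  console.log('unpacked:', code); // dump what the packer tried to run
  return realEval(code);          // and still run it
};

realEval(packed); // the inner eval() call now goes through the hook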
I'm not sure I understand. If I do a simplified version of your code and run it on a URI that's sure to have some javascript:
>>> import urllib2
>>> res = urllib2.urlopen("http://stackoverflow.com/questions/6946867/how-to-unpack-javascript-in-python")
And print res.read() (or res.read().decode()), the javascript is intact.
Doing urlopen should retrieve whatever byte stream the source provides. It's up to you to do something with it (render it as HTML, interpret it as JavaScript, etc.).