Windows tool to decode HTML entities in a file - javascript

Is there a command line/batch script tool for Windows that can be used to decode HTML entities such as &weierp; (℘) and &permil; (‰) to readable UTF-8 text?
I found this web tool (https://mothereff.in/html-entities) that uses JavaScript to do exactly this, but I need it done from a Windows batch file. I know of the amazing JREPL.bat utility, which embeds JScript in the Windows command shell to make regex replacements in files. I just can't find a similar tool for HTML entity conversion.
Edit: To the bright coders out there, I hope you can write a batch tool that can perform HTML entities decoding/encoding to help me and the future readers looking for the same solution. Here are Github pages I think can be of use: https://github.com/mathiasbynens/he https://github.com/mathiasbynens/mothereff.in/tree/master/html-entities

You don't need extensive applications (like JREPL.bat or my own FindRepl.bat) or complicated programs in order to perform a replacement as simple as this one. The small Batch file below is an example that performs a replacement of 3 HTML entities:
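@set @a=0 // & cscript //nologo //E:JScript "%~F0" < input.txt & goto :EOF
// The line above runs as batch code; everything below is JScript.
// Map each HTML entity to the character it stands for.
var rep = new Array();
rep["&copy;"] = "\u00A9";
rep["&#xD306;"] = "\uD306";
rep["&#x2603;"] = "\u2603";
// Create output.txt (overwrite = true, Unicode = true).
var f = new ActiveXObject("Scripting.FileSystemObject").CreateTextFile("output.txt", true, true);
// Replace every listed entity found in the redirected input.
f.Write(WScript.Stdin.ReadAll().replace(/&copy;|&#xD306;|&#x2603;/g,function (A) {return rep[A]}));
f.Close();
input.txt:
Foo &copy; bar &#xD306; baz &#x2603; qux
output.txt:
Foo © bar 팆 baz ☃ qux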
You only need to add as many character equivalences as you want to convert...
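For numeric character references no lookup table is needed, because the code point is spelled out in the entity itself. Below is a minimal sketch, assuming only numeric entities (decimal like &#169; or hexadecimal like &#x2603;) have to be decoded. The function name decodeNumericEntities is made up for illustration, and named entities such as &copy; would still need a table like rep above. It is written in old-style JavaScript so that it should also be usable from a JScript hybrid, though it has not been tried there.

```javascript
// Hypothetical helper, not part of any library: decodes numeric character
// references such as &#169; or &#x2603; without a lookup table.
function decodeNumericEntities(text) {
  return text.replace(/&#([xX][0-9a-fA-F]+|[0-9]+);/g, function (match, body) {
    var isHex = /^[xX]/.test(body);
    var code = isHex ? parseInt(body.substring(1), 16) : parseInt(body, 10);
    if (code > 0x10FFFF) return match; // not a valid code point, leave as-is
    if (code <= 0xFFFF) return String.fromCharCode(code);
    // Code points above U+FFFF need a UTF-16 surrogate pair.
    code -= 0x10000;
    return String.fromCharCode(0xD800 + (code >> 10), 0xDC00 + (code & 0x3FF));
  });
}
```

Calling decodeNumericEntities on the numeric entities from input.txt should give the same characters as the table-based version.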

It is trivial to incorporate JScript into a batch file, so you could easily write your own custom hybrid JScript/batch script that incorporates the he.js found at https://github.com/mathiasbynens/he.
But it is even simpler to use the JREPL.BAT tool that you already mentioned. You can use the /JLIB option to load the he.js code, thus making all of the he (html-entities) functionality accessible to JREPL.
Here is a trivial example that decodes test.txt, overwriting the original file.
jrepl "^.*" "he.decode($0)" /j /jlib "he.js" /f test.txt /o -
This isn't the most efficient way to do it, but it is probably plenty fast enough, and it sure is convenient.
Here is another example that encodes every character in test.txt (including newlines), writing the result to out.txt
jrepl "^[\s\S]*" "he.encode($0,{encodeEverything:true})" /m /j /jlib he\he.js /f test.txt /o out.txt
You should study all the documentation for both he and JREPL to discover all the possibilities.
The regex portion in the examples might seem to be more of a hindrance than a help. But it is easy to envision how it might be useful to selectively encode only portions of your input text. Or you could use the JREPL /T option to apply different encoding options to different sections of text.

Related

tensorflowjs - Is there an equivalent method for tokenizer in javascript?

I'm building an NLP classifier in python and would like to build a hosting HTML page for a demo. I want to test on a sample text to see the prediction and this is implemented in python through tokenizing the text and then padding it before predicting. Like this:
tf.tokenizer.texts_to_sequences(text)
token_list = tf.tokenizer.texts_to_sequences([text])[0]
token_list_padded = pad_sequences([token_list], maxlen=max_length, padding=padding_type)
The problem is that I'm new to JavaScript, so are there tokenization and padding methods in JavaScript like in Python?
There is not yet a tf.tokenizer in js as there is in python.
A simple js.tokenizer has been described here. A more robust approach would be to use the tokenizer that comes with universal sentence encoder
There is no native mechanism for tokenization in Javascript.
You can use a JavaScript library such as natural, wink-tokenizer, or wink-nlp. The last one automatically extracts a number of token features that may be useful in training.
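If no library is wanted, the two steps from the Python snippet can be approximated by hand. This is only a sketch under stated assumptions: textToSequence and padSequence are hypothetical names, not TensorFlow.js functions; the word index is assumed to be exported from the Python tokenizer so that both sides agree on the ids; the punctuation handling is a crude stand-in for whatever filters the Python tokenizer applied; and over-long sequences are cut from the front.

```javascript
// Hypothetical helpers that mimic texts_to_sequences and pad_sequences
// from the Python snippet; they are not part of TensorFlow.js.

// wordIndex maps each known word to a positive integer id.
// Words that are not in wordIndex are skipped.
function textToSequence(text, wordIndex) {
  // Crude normalization (assumption): lowercase, strip non-alphanumerics.
  var words = text.toLowerCase().replace(/[^a-z0-9\s]/g, " ").split(/\s+/);
  var sequence = [];
  for (var i = 0; i < words.length; i++) {
    if (Object.prototype.hasOwnProperty.call(wordIndex, words[i])) {
      sequence.push(wordIndex[words[i]]);
    }
  }
  return sequence;
}

// Pads with zeros to maxLength; padding is "pre" (default) or "post".
function padSequence(sequence, maxLength, padding) {
  // Keep only the last maxLength ids when the sequence is too long.
  var result = sequence.slice(Math.max(0, sequence.length - maxLength));
  while (result.length < maxLength) {
    if (padding === "post") result.push(0);
    else result.unshift(0);
  }
  return result;
}

var exampleIndex = { the: 1, cat: 2, sat: 3 }; // made-up vocabulary
var exampleIds = textToSequence("The cat sat, the dog!", exampleIndex);
var examplePadded = padSequence(exampleIds, 6, "post");
```

The padded array can then be wrapped in a tensor and passed to the loaded model in the same way as token_list_padded in Python.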

Open and save xml file from jmeter command line

I am trying to manipulate a JMeter test plan in a web-based tool. The problem is that the tool implicitly converts many escaped characters. For example, &quot; is converted to ", and the character reference for a line break is converted to an actual newline.
I observed that if I open the modified file in the JMeter UI and save it without changing anything, all the characters are converted back to the original. For example, " is converted back to &quot;.
So is it possible to do this automatically using jQuery/JavaScript? I am using AngularJS with Node.js for my application. I would prefer to do this open-save-close operation in the background. Please suggest how I can achieve this. Is there any JMeter plugin available that I can run from JavaScript/jQuery?
Many thanks in advance
You need to escape the following characters in XML, otherwise the result will be invalid markup.
"
'
<
>
&
Given you use NodeJS you can use xml-escape function to do the trick for you.
JMeter provides __escapeXML() function out of the box just in case you're looking for Java-based implementation, see Apache JMeter Functions - An Introduction article to get familiarized with JMeter Functions concept.
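The five characters listed above can also be escaped with a few lines of plain JavaScript. This is a hand-rolled sketch, and escapeXml is an illustrative name; it does not show the API of the xml-escape package, which may behave differently.

```javascript
// Hand-rolled sketch (not the xml-escape package): replaces the five
// characters that must be escaped in XML with their predefined entities.
var XML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&apos;"
};

function escapeXml(text) {
  return text.replace(/[&<>"']/g, function (ch) {
    return XML_ESCAPES[ch];
  });
}
```

All five characters are handled in a single pass, so an ampersand produced by one replacement is never escaped a second time.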
What JMeter is doing with the XML is correct and is done by the XStream library.
JMeter manipulates Java objects and serializes them using XStream.
I am not sure what you are trying to do, but I don't think it is a good idea in terms of maintainability: JMeter doesn't provide any contract for its XML (no XSD or DTD). Test plans should be manipulated through Java.
As far as I know, you cannot manipulate it through JavaScript, but you can potentially use this DSL to manipulate/generate test plans:
https://github.com/flood-io/ruby-jmeter

PHP / Javascript minification

I am curious if there is a full minifier that can do javascript AND the php / html minification? As a for instance, if you look at this pen:
http://codepen.io/ajhalls/pen/qmEKVY
You can see in the HTML area you have the code
<div class="col-sm-1 svgPatternItem " data-id="svgPattern-42 " style="font-size:14px;text-shadow: 0px 0px 4px rgb(0, 0, 0);color:#fff;position:relative;width:100px;height:100px; ">
Then in the javascript, you have:
$(".svgPatternItem").each(function( index ) {
while the class svgPatternItem makes it descriptive and readable, it is unnecessarily long. In sublime I could do a global search and replace to make that class aa, and the next one class ab and so on, but that is exactly the type of work you would expect a macro to excel at, yet I haven't found one that will modify both.
To further complicate matters, I have mixed in some new ECMAScript 2017 that breaks most minifiers, but which made development so much more pleasant like using the backtick when defining multiline variables. I could revert things to previous JS if needed, but it makes it harder to develop on unless it is required.
I was looking at http://esprima.org and it seems that if you look at the online parser with the js code from the codepen earlier you would just need to look for the callee => arguments => value and if it was an alphanumeric value, do a global search and replace using a short variable name through the entire code base of php and javascript which "should" work.
All that being said, as evidenced from my last two questions, I haven't found a way to do that parsing myself using javascript or PHP. Maybe python could do it, but I wonder if I need a full windows application such as Sublime to do the work for me so am wondering if anyone else has solved this particular issue.
As far as JavaScript minification is concerned the best is Google Closure Compiler.
It has an online version as well as a downloadable runnable java jar file.
Know more about Google Closure compiler for JS here:
https://closure-compiler.appspot.com/home
https://developers.google.com/closure/compiler/
HTML Minification:
There is a repository on GitHub that lets you minify CSS, JS, and HTML into a single line of code. You can find it at https://github.com/searchturbine/phpwee-php-minifier
It runs on the PHP engine.
If you want to obfuscate your JS code, there is a concept called uglification. It minifies your code and also obfuscates it against reverse engineering up to a certain level. Find more about it at: https://github.com/mishoo/UglifyJS2
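The renaming idea from the question (map each long class name to aa, ab, and so on, then apply the same map to every file) can be sketched in a few lines. This is a naive text-level sketch and not what any of the tools above do: shortName and renameClasses are made-up names, the class names are assumed to contain only letters, digits, hyphens, and underscores, and the replacement is blind, so a JavaScript variable that happens to share a class name would be renamed as well.

```javascript
// Hypothetical helper: 0 -> "aa", 1 -> "ab", ..., 675 -> "zz".
function shortName(index) {
  var letters = "abcdefghijklmnopqrstuvwxyz";
  if (index >= 676) throw new Error("this sketch only covers aa..zz");
  return letters.charAt(Math.floor(index / 26)) + letters.charAt(index % 26);
}

// Applies the same class-name mapping to every source string
// (HTML, PHP templates, JavaScript) so that they stay in sync.
function renameClasses(sources, classNames) {
  return sources.map(function (source) {
    classNames.forEach(function (name, index) {
      // Match the name only as a whole token, so svgPatternItem does not
      // also rewrite svgPatternItemLarge or my-svgPatternItem.
      var pattern = new RegExp("(^|[^\\w-])" + name + "(?![\\w-])", "g");
      source = source.replace(pattern, "$1" + shortName(index));
    });
    return source;
  });
}

var renamed = renameClasses(
  [
    '<div class="col-sm-1 svgPatternItem " data-id="svgPattern-42 ">',
    '$(".svgPatternItem").each(function( index ) {'
  ],
  ["svgPatternItem"]
);
```

A parser-based approach, such as walking the Esprima syntax tree mentioned in the question, would be needed to tell selector strings apart from unrelated identifiers.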

Google Spreadsheet: Encrypt cell content with Google Apps Script

I have a Google Spreadsheet and would like to encrypt the content of a few cells (I do not care which encryption method is being used as long as there is an equivalent decryption method for iOS).
Unfortunately there are no built-in encryption functions in Google Apps Script.
For this reason I would like to use a open source Javascript library like Crypto-JS and sjcl.
How can I use one of these libraries with Google Apps Script?
In the Google Apps Script documentation, I have not found any clue on how to use external JavaScript libraries with my Google Apps Script.
Well I'll say this, because this is the method that I used with Date JS. You can do the following:
Download the source .js file(s).
Open the .js file(s) in a text editor
Copy/paste all code into a new Script Project
here you can "recreate" the original .js files (copy/paste source individually) with the same names
Include the project key of that Script Project as a library of the project in which you want to use those functions.
Even if the projects are open-source you will want to make sure you comply with the licenses of those projects if you are going to use them.
This is basically a small "hack" around not being able to upload .js files into GAS Projects. Assuming that the JS is standard, this method will work with Google's system.
The other option is to simply find a lightweight one- or two-function crypto package, or a single crypto algorithm like AES-128 (that you are given permission to use, of course). It really depends on how much encryption you want, whether you need to reverse the cipher text to get the plain values, etc.
If this is for some kind of password system, I would recommend using a simple hash. For example:
function stringHash(someString) {
  var hash = 0;
  if (someString.length == 0) return hash;
  for (var i = 0; i < someString.length; i++) {
    var ch = someString.charCodeAt(i);
    hash = ((hash << 5) - hash) + ch; // same as hash * 31 + ch
    hash = hash & hash; // force the value back to a 32-bit integer
  }
  return Math.abs(hash); // Personally I don't like negative values, so I abs'd it
}
in which you would ask for a user's password, and if the password hash matched the hash stored in the spreadsheet or wherever, then you would validate. You can use this to simulate logging into a UiApp GUI, for example: store usernames/password hashes in a database and validate a user before loading the "real" app.
However, as Serge mentioned, Spreadsheets will contain revision history of the original value before it was hashed, as well as the value after it was hashed. If you want to avoid this, use ScriptDB.
PS - in addition to this work-around, I'll say that it's not currently possible to "import" a non-GAS code library into your Script Project, unless you manually copy the source file-by-file into your Script Project. There may be a feature request on the Issue Tracker already, if not you can create one and I'll star it.
EDIT: As per request, I've included an open source AES encryption "package" (contains base64 as well, which is nice) in the answer, to act as a reference for others who want to encrypt in GAS. Make sure you follow the author's request, which is to retain his original copyright and link back to the source.
Other than the AES I linked and the simple hash (equivalent to Java's String.hashCode()), whose resource can be found here, there is Crypto-JS as you mentioned in your question and, if you took the time to fully copy/paste all the code (assuming that agrees with the terms of the license - I haven't read it), you could use that by the steps I described in the top half of my answer.
MD5 in Javascript is also an algorithm that you could use. If you use the code in md5.js which is located at the top of the page, you'll have what you need. Again, make sure you're following licensing rules if you use it.
Personally I would probably just use the hash and the base-64 patterns, as most of what you would use this encryption for is probably not incredibly important. AES might take a bit longer to compute - you can probably benchmark it yourself to see if it will cause major problems with triggers running for an extended period of time, but I doubt it would be a problem anyway.
Note: base-64 and AES are both two-way. MD5 is a type of hash, and the simple hash function I provided is also (of course) a hash. Hash functions are one-way. So if you need two-way functionality (encode/decode or encrypt/decrypt), use base-64 or AES. Keep in mind that base-64 uses no key, so anyone can reverse it; it only hides text from a casual glance, whereas AES is real encryption. Likewise, the simple hash function is the kid version of MD5. Keep this in mind :)
Edit again: I'm not familiar with iOS development or its internals, but it seems to me that iOS can at least do some cryptographic operations. You may want to read more into those methods though, because I'm not really sure how you're putting GAS and iOS together; I can't give you any more help in that area unfortunately.
The functions above don't work for me. Here is something that you can copy and paste into the Google Sheets script editor:
function enc(str) {
  var encoded = "";
  for (var i = 0; i < str.length; i++) {
    var a = str.charCodeAt(i);
    var b = a ^ 123; // bitwise XOR with any number, e.g. 123
    encoded = encoded + String.fromCharCode(b);
  }
  return encoded;
}
You can then use it as =ENC in your spreadsheet.
Based on this post here
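Because XOR is its own inverse, running the same function over the encoded text with the same number gives back the original, so no separate decode function is needed. A small sketch of that round trip follows; xorCipher is an illustrative variant of enc that takes the key as a parameter, which is an assumption and not part of the answer above.

```javascript
// Same XOR idea as enc above, with the key passed in (hypothetical signature).
function xorCipher(str, key) {
  var out = "";
  for (var i = 0; i < str.length; i++) {
    out += String.fromCharCode(str.charCodeAt(i) ^ key);
  }
  return out;
}

var secret = xorCipher("hello", 123);   // unreadable, may contain control characters
var restored = xorCipher(secret, 123);  // applying it again undoes the first pass
```

Any other platform, iOS included, can reverse the text the same way as long as it XORs each UTF-16 code unit with the same number.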
External libraries can be used by using JavaScript's built-in eval() function
(ex. eval(UrlFetchApp.fetch('path/to/library').getContentText())).
Of course, the library must have no dependencies for this to work.

Can you combine Javascript into single file without minifying it with yui-compressor?

Is there a way to combine the input JavaScript into a single file but not minify it using Yui-Compressor?
string compressedJavascript = JavaScriptCompressor.Compress(uncompressedJavascript)
This is what I have now. I see that Compress() is overloaded to allow code to be left obfuscated, etc. But I would like the code to be left unminified for debugging.
It seems YUI's command line documentation says to use the --nomunge option, which means:
Minify only. Do not obfuscate local symbols.
Look for that in the documentation for the C# port you are using.
