How to use node.io for HTML parsing from node.js?


I am trying to use node.io on node.js to parse an HTML page which I have as a string in a variable.
I am having trouble passing the HTML string to my node.io job as an argument.
This is an excerpt of the code in my node file nodeiotest.js:
var nodeIOJob = require('./nodeiojobfile.js');
var nodeio = require('node.io');
var htmlString = 'HTML String Here';
nodeio.start(nodeIOJob.job, function(err, output) {
console.log(output);
}, true);
The next is an excerpt of my file nodeiojobfile.js:
var nodeio = require('node.io');
var methods = {
input: ['xxxxxxxxxxxxxxxx'], // htmlString is supposed to come here
run: function (num) {
console.log(num);
this.emit('Hello World!');
}
}
exports.job = new nodeio.Job(methods);
How do I send my htmlString as argument to my job in the other file?
Also, after receiving the string I need to parse it as HTML, perform some basic element selection (e.g. getElementById()), and calculate the offsetHeight of certain HTML elements. The documentation says I can use the get() and getHTML() methods to parse the HTML at a URL, but what about HTML in a string? How do I parse that?
For testing purposes I am using the following HTML:
<div>
<p id="p1">
Testing document
</p>
</div>
I am trying to select the <p> and then find out its height.
Can anyone help me?
Thanks in advance!

I'm not familiar with node.io, but I think you want something like this:
// nodeiotest.js
...
var htmlString = 'HTML String Here';
nodeio.start(nodeIOJob.job(htmlString), function(err, output) {
console.log(output);
}, true);
// nodeiojobfile.js
var nodeio = require('node.io');
module.exports.job = function(htmlString) {
var methods = {
input: [ htmlString ],
run : function (num) {
console.log(num);
this.emit('Hello World!');
}
};
return new nodeio.Job(methods);
};
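The pattern in this answer is a factory function: instead of exporting a ready-made job, the module exports a function that builds a job around whatever string it is given. A minimal sketch of that idea follows. FakeJob and makeJob are hypothetical names, and FakeJob only stands in for nodeio.Job so the example runs without node.io installed; it says nothing about how node.io itself behaves.

```javascript
// Hypothetical stand-in for nodeio.Job; it only stores the methods object.
function FakeJob(methods) {
  this.methods = methods;
}

// Factory: every call builds a new job whose input is the given HTML string.
function makeJob(htmlString) {
  return new FakeJob({
    input: [htmlString],
    run: function (html) {
      // A real job would parse the HTML here; this sketch just reports its length.
      return html.length;
    }
  });
}

var job = makeJob('<p id="p1">Testing document</p>');
console.log(job.methods.input[0]);
```

Because the string is captured when makeJob is called, the caller decides the input and the job file needs no hard-coded value.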

Related

Take selected text, send it over to Scryfall API, then take the link and put it in the selected text

I've been able to sort out the middle bit (the API seems to be called to just fine) along with the submenu displaying. Originally I thought that just the end part wasn't working but I'm now thinking that the selection part isn't either.
What am I doing wrong with the getSelection() and what do I need to do to insert a link into said selection? (to clarify, not to replace the text with a link, but to insert a link into the text)
//Open trigger to get menu
function onOpen(e) {
DocumentApp.getUi().createAddonMenu()
.addItem('Scry', 'serumVisions')
.addToUi();
}
//Installation trigger
function onInstall(e) {
onOpen(e);
}
//I'm not sure if I need to do this but in case; declare var elements first
var elements
// Get selected text (not working)
function getSelectedText() {
const selection = DocumentApp.getActiveDocument().getSelection();
if (selection) {
var elements = selection.getRangeElements();
Logger.log(elements);
} else {
var elements = "Lack of selection"
Logger.log("Lack of selection");
}
}
//Test run
// insert here
// Search Function
function searchFunction(nameTag) {
// API call + inserted Value
let URL = "https://api.scryfall.com/cards/named?exact=" + nameTag;
// Grabbing response
let response = UrlFetchApp.fetch(URL, {muteHttpExceptions: true});
let json = response.getContentText();
// Translation
let data = JSON.parse(json);
// Jackpot
let link = data.scryfall_uri;
// Output
Logger.log(link);
}
// Test run
searchFunction("Lightning Bolt");
//Let's hope this works how I think it works
function serumVisions() {
const hostText = getSelectedText();
const linkage = searchFunction(hostText);
// Unsure what class I'm supposed to use, this doesn't
const insertLink = DocumentApp.getActiveDocument().getSelection().newRichTextValue()
.setLinkUrl(linkage);
Logger.log(linkage);
}
For the first part, I tried the getSelection() and getCursor() examples from the Google documentation, but they don't seem to work; they all just keep returning null.
For inserting the link, I read up on those classes in the Spreadsheet section of the documentation. I was unaware of that at the time, and now that I know, I haven't been able to find an equivalent for Google Docs. Maybe it works and I'm just writing it wrong, I don't know.

Modification points:
In your script, getSelectedText() and searchFunction(nameTag) return no values. I think this might be the reason for your current issue, where they all just keep returning null.
The elements value from var elements = selection.getRangeElements(); is an array of RangeElement objects, not text data.
DocumentApp.getActiveDocument().getSelection() has no method of newRichTextValue().
searchFunction("Lightning Bolt"); is called at the top level, so it runs every time any function in the script is run. Please be careful about this.
When these points are reflected in your script, how about the following modification?
Modified script:
Please remove searchFunction("Lightning Bolt");. And, in this case, var elements is not used. Please be careful about this.
From your script, I guessed that in your situation, you might have wanted to run serumVisions(). And also, I thought that you might have wanted to run the individual function. So, I modified your script as follows.
function getSelectedText() {
const selection = DocumentApp.getActiveDocument().getSelection();
var text = "";
if (selection) {
text = selection.getRangeElements()[0].getElement().asText().getText().trim();
Logger.log(text);
} else {
text = "Lack of selection"
Logger.log("Lack of selection");
}
return text;
}
function searchFunction(nameTag) {
let URL = "https://api.scryfall.com/cards/named?exact=" + encodeURIComponent(nameTag);
let response = UrlFetchApp.fetch(URL, { muteHttpExceptions: true });
let json = response.getContentText();
let data = JSON.parse(json);
let link = data.scryfall_uri;
Logger.log(link);
return link;
}
// Please run this function.
function serumVisions() {
const hostText = getSelectedText();
const linkage = searchFunction(hostText);
if (linkage) {
Logger.log(linkage);
DocumentApp.getActiveDocument().getSelection().getRangeElements()[0].getElement().asText().editAsText().setLinkUrl(linkage);
}
}
When you select the text "Lightning Bolt" in the Google Document and run the function serumVisions(), the text Lightning Bolt is retrieved, and a URL like https://scryfall.com/card/2x2/117/lightning-bolt?utm_source=api is returned. This link is then set on the selected text "Lightning Bolt".
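The URL building and response handling in searchFunction can be separated from the Apps Script calls so they are easy to check on their own. The sketch below assumes only what the script above already uses: the cards/named?exact= endpoint and the scryfall_uri field. buildScryfallUrl and extractLink are hypothetical helper names, not part of any API.

```javascript
// Build the request URL for an exact card name, encoding spaces and punctuation.
function buildScryfallUrl(cardName) {
  return 'https://api.scryfall.com/cards/named?exact=' +
    encodeURIComponent(cardName.trim());
}

// Return the card page link from a response body, or null when the body
// is not valid JSON or carries no scryfall_uri (for example, no card matched).
function extractLink(responseText) {
  var data;
  try {
    data = JSON.parse(responseText);
  } catch (e) {
    return null;
  }
  return data && typeof data.scryfall_uri === 'string' ? data.scryfall_uri : null;
}

console.log(buildScryfallUrl('Lightning Bolt'));
```

Returning null for an unusable response fits the if (linkage) check in serumVisions(), so no link is set when nothing was found.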
Reference:
getSelection()

Need to JSON stringify an object in ExtendScript

I am working on processing the metadata of my InDesign document links, using ExtendScript.
I want to convert the object to a string using JSON.stringify, but when I use it, I get an error saying:
can't execute script in target engine.
If I remove linkObjStr = JSON.stringify(linksInfObj); from below code, then everything works fine.
What is the equivalent of JSON.stringify in ExtendScript, or is there any other way to display linksInfObj with its proper contents instead of [object Object]?
for (var i = 0, len = doc.links.length; i < len; i++) {
var linkFilepath = File(doc.links[i].filePath).fsName;
var linkFileName = doc.links[i].name;
var xmpFile = new XMPFile(linkFilepath, XMPConst.FILE_INDESIGN, XMPConst.OPEN_FOR_READ);
var allXMP = xmpFile.getXMP();
// Retrieve values from external links XMP.
var documentID = allXMP.getProperty(XMPConst.NS_XMP_MM, 'DocumentID', XMPConst.STRING);
var instanceID = allXMP.getProperty(XMPConst.NS_XMP_MM, 'InstanceID', XMPConst.STRING);
linksInfObj[linkFileName] = {'docId': documentID, 'insId': instanceID};
linkObjStr = JSON.stringify(linksInfObj);
alert('Object' + linksInfObj, true); // I am getting [Object Object] here
alert('String' + linkObjStr, true);
}

ExtendScript does not include a JSON object with the associated methods for parsing, namely JSON.parse() and JSON.stringify(). Nor does it provide any other builtin feature for parsing JSON.
Solution:
Consider utilizing a polyfill to provide JSON functionality such as JSON-js created by Douglas Crockford.
What you'll need to do:
Download the JavaScript file named json2.js from the GitHub repo and save it in the same location/folder as your .jsx file.
Note: You can just copy and paste the raw version of json2.js from the same GitHub repo to create the json2.js file manually if you prefer.
Then at the top of your current .jsx file you'll need to #include the json2.js file by adding the following line of code:
#include "json2.js";
This is analogous to how you might utilize the import statement to include a module in modern day JavaScript (ES6).
A pathname to the location of the json2.js can be provided if you decide to save the file in a different location/folder than your .jsx file.
By including json2.js in your .jsx file you'll now have working JSON methods; JSON.parse() and JSON.stringify().
Example:
The following ExtendScript (.jsx) is a working example that generates JSON to indicate all the links associated with the current InDesign document (.indd).
example.jsx
#include "json2.js";
$.level=0;
var doc = app.activeDocument;
/**
* Loads the AdobeXMPScript library.
* @returns {Boolean} True if the library loaded successfully, otherwise false.
*/
function loadXMPLibrary() {
if (!ExternalObject.AdobeXMPScript) {
try {
ExternalObject.AdobeXMPScript = new ExternalObject('lib:AdobeXMPScript');
} catch (e) {
alert('Failed loading AdobeXMPScript library\n' + e.message, 'Error', true);
return false;
}
}
return true;
}
/**
* Obtains the values of the XMP properties `DocumentID` and `InstanceID` in
* each linked file associated with an InDesign document (.indd), and returns
* the information formatted as JSON.
* @param {Object} doc - A reference to the .indd to check.
* @returns {String} - The information formatted as JSON.
*/
function getLinksInfoAsJson(doc) {
var linksInfObj = {};
linksInfObj['indd-name'] = doc.name;
linksInfObj.location = doc.filePath.fsName;
linksInfObj.links = [];
for (var i = 0, len = doc.links.length; i < len; i++) {
var linkFilepath = File(doc.links[i].filePath).fsName;
var linkFileName = doc.links[i].name;
var xmpFile = new XMPFile(linkFilepath, XMPConst.FILE_INDESIGN, XMPConst.OPEN_FOR_READ);
var allXMP = xmpFile.getXMP();
// Retrieve values from external links XMP.
var documentID = allXMP.getProperty(XMPConst.NS_XMP_MM, 'DocumentID', XMPConst.STRING);
var instanceID = allXMP.getProperty(XMPConst.NS_XMP_MM, 'InstanceID', XMPConst.STRING);
// Ensure we produce valid JSON...
// - When `instanceID` or `documentID` values equal `undefined` change to `null`.
// - When `instanceID` or `documentID` exist ensure it's a String.
instanceID = instanceID ? String(instanceID) : null;
documentID = documentID ? String(documentID) : null;
linksInfObj.links.push({
'name': linkFileName,
'path': linkFilepath,
'docId': documentID,
'insId': instanceID
});
}
return JSON.stringify(linksInfObj, null, 2);
}
if (loadXMPLibrary()) {
var linksJson = getLinksInfoAsJson(doc);
$.writeln(linksJson);
}
Output:
Running the script above will log JSON formatted something like the following example to your console:
{
"indd-name": "foobar.indd",
"location": "/path/to/the/document",
"links":[
{
"name": "one.psd",
"path": "/path/to/the/document/links/one.psd",
"docId": "5E3AE91C0E2AD0A57A0318E078A125D6",
"insId": "xmp.iid:0480117407206811AFFD9EEDCD311C32"
},
{
"name": "two.jpg",
"path": "/path/to/the/document/links/two.jpg",
"docId": "EDC4CCF902ED087F654B6AB54C57A833",
"insId": "xmp.iid:FE7F117407206811A61394AAF02B0DD6"
},
{
"name": "three.png",
"path": "/path/to/the/document/links/three.png",
"docId": null,
"insId": null
}
]
}
Sidenote: Modelling your JSON:
You'll have noticed that the JSON output (above) is structured differently to how you were attempting to structure it in your given example. The main difference is that you were using link filenames as property/key names, such as the following example:
Example of a problematic JSON structure
{
"one.psd": {
"docId": "5E3AE91C0E2AD0A57A0318E078A125D6",
"insId": "xmp.iid:0480117407206811AFFD9EEDCD311C32"
},
"two.jpg": {
"docId": "EDC4CCF902ED087F654B6AB54C57A833",
"insId": "xmp.iid:FE7F117407206811A61394AAF02B0DD6"
}
...
}
Producing JSON like this example isn't ideal because if you were to have two links, both with the same name, you would only ever report one of them. You cannot have two properties/keys that have the same name within an Object.
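The overwrite problem can be seen in a few lines. This is a minimal sketch in plain JavaScript with made-up link data (the docId values 'A' and 'B' are placeholders); in ExtendScript it would additionally need json2.js included as described above.

```javascript
// Two links that share the same file name (hypothetical sample data).
var links = [
  { name: 'one.psd', docId: 'A' },
  { name: 'one.psd', docId: 'B' }
];

var keyed = {};   // file name used as the property name
var listed = [];  // one array entry per link

for (var i = 0; i < links.length; i++) {
  // The second assignment replaces the first because the key is identical.
  keyed[links[i].name] = { docId: links[i].docId };
  // Pushing to an array keeps both entries.
  listed.push({ name: links[i].name, docId: links[i].docId });
}

console.log(JSON.stringify(keyed));
console.log(JSON.stringify(listed));
```

The keyed object ends up with a single entry for one.psd, while the array keeps both links.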
Edit:
As a response to the OP's comment:
Hi RobC, other than using #include 'json2.js', is there any other way to include external js file in the JSX file?
There are a couple of alternative ways as follows:
You could utilize $.evalFile(). For instance replace #include "json2.js"; with the following two lines:
var json2 = File($.fileName).path + "/" + "json2.js";
$.evalFile(json2);
Note: This example assumes json2.js resides in the same folder as your .jsx
Alternatively, if you want to avoid the additional json2.js file completely, you could add an IIFE (Immediately Invoked Function Expression) at the top of your .jsx file, then copy and paste the content of the json2.js file into it. For instance:
(function () {
// <-- Paste the content of `json2.js` here.
})();
Note: If code size is a concern then consider minifying the content of json2.js before pasting it into the IIFE.

I applied a JavaScript minifier to JSON-js and then pasted the result into my script.

In Karate DSL, calling a javascript file returns java.lang.RuntimeException

I have a JavaScript file I want to call. Its contents are below. When I try calling the file, I keep getting a "no variable found with name: response" error even though there is clearly a variable defined. The file executes fine from the command line using node, so the JavaScript function is valid. Any thoughts? I attached the error message in a screenshot.
Javascript content in snippet below.
Karate script:
Scenario: Call JavaScript:
* def sample = read('classpath:reusable/gen-data.js')
* print someValue
function createTestData(sampleJson, fieldsToChange, numRecords) {
var testData = [];
for (var i = 0; i < numRecords; i++) {
var copy = JSON.parse(JSON.stringify(sampleJson));
fieldsToChange.forEach(function(fieldToChange) {
copy[fieldToChange] = copy[fieldToChange] + i;
});
testData.push(copy);
}
return {content: testData};
}
var testData = {
"country": "US",
"taskStatusCode" : "Closed",
"facilityCode" : "US_203532",
};
function getTestData() {
String testData = JSON.stringify(createTestData(testData, ["taskStatusCode", "facilityCode"], 1), null, 1);
console.log("all done getTestData()");
console.log("test data: \n" + testData);
return testData;
};
console.log("calling getTestData()");
getTestData();

I think this error is thrown when the JavaScript is not correct. For example in my case this JS file:
/* Set the custom authentication header */
function fn() {
var authToken = karate.get('authToken');
var out = {};
out['Auth-Token'] = authToken
return out;
}
This file will produce the "no variable found with name: response".
The reason is that "the right-hand-side (or contents of the *.js file if applicable) should begin with the function keyword", according to the Karate docs.
Now by moving the comment and making the function keyword the first bit of text it works as expected:
function fn() {
/* Set the custom authentication header */
var authToken = karate.get('authToken');
var out = {};
out['Auth-Token'] = authToken
return out;
}
In the OP, the function keyword is the first thing in the file, but there is javascript outside the original function -- which I don't think is legal for karate syntax. In other words, everything has to be in the outer function.
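Following that reasoning, the OP's file could be restructured so that everything lives inside one outer function that returns the data. This is only a sketch of that restructuring: the name fn follows the example above, the console.log calls are dropped, and whether Karate accepts a nested helper function in exactly this form is an assumption that has not been checked against Karate itself.

```javascript
function fn() {
  // Helper nested inside the outer function so nothing sits at file level.
  function createTestData(sampleJson, fieldsToChange, numRecords) {
    var testData = [];
    for (var i = 0; i < numRecords; i++) {
      // Deep-copy the sample so each record can be changed independently.
      var copy = JSON.parse(JSON.stringify(sampleJson));
      fieldsToChange.forEach(function (fieldToChange) {
        copy[fieldToChange] = copy[fieldToChange] + i;
      });
      testData.push(copy);
    }
    return { content: testData };
  }

  var sample = {
    country: 'US',
    taskStatusCode: 'Closed',
    facilityCode: 'US_203532'
  };

  // Return the object itself rather than logging it.
  return createTestData(sample, ['taskStatusCode', 'facilityCode'], 1);
}
```

The function returns the generated object, so a caller can assign the result to a variable instead of relying on console output.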

My workaround was to use java instead of JavaScript.

Get Filename from URL and Strip File Extension

I need to get just the filename without the extension from a url and can't quite get there.
Here's my url in question:
https://www.mealenders.com/shop/index.php/shop/solo-pack.html
Here's what I've tried:
function () {
var value={{Page Path}}.split("/");
return value.reverse()[0];
}
That almost gets me there as it returns "solo-pack.html". What else do I need to do to get rid of the ".html" for this?
Thanks in advance.

You can do the following using javascript. Pop returns the last element which is a string, and then you can use the replace function to get just the filename without .html on the end.
function getFilename () {
return {{ Page Path }}.split('/').pop().replace('.html', '');
}
I see that {{ Page Path }} is probably some templating language but you could modify the above script, to get the current URL and then get the filename as so.
function getFilename () {
return window.location.href.split('/').pop().replace('.html', '');
}
Furthermore, you could make it more dynamic to handle any file extension with the following. You get the index of the last period using lastIndexOf and then take the substring from the start of the filename up to that position.
function getFilename () {
var filename = window.location.href.split('/').pop();
return filename.substr(0, filename.lastIndexOf('.'));
}

function getFileName(url) {
return url.split("/").pop().split(".")[0];
}
var url = "https://www.mealenders.com/shop/index.php/shop/solo-pack.html";
console.log(getFileName(url));

function () {
var value={{Page Path}}.split("/");
var fileName= value.reverse()[0].split('.')[0];
return fileName;
}

If you need to get rid of any extension, you can use .replace() with regular expression:
var url = "https://www.mealenders.com/shop/index.php/shop/solo-pack.html";
function getFilename (path) {
return path.toString().split('/').pop().replace(/\.\w+$/, '');
}
console.log(getFilename(url));
For example, this changes test/index.html into index, index.php.default into index.php, and test.name.with.dots.txt into test.name.with.dots.

Short and sweet:
"https://url/to/file/solo-pack.html".split(/[\\/]/).pop().replace(/\.[^/.]+$/, "")
Returns:
solo-pack
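The string-splitting answers above work on the raw URL, so a query string or fragment after the file name would end up in the result. A minimal sketch that avoids this by parsing the URL first is shown below. It assumes an environment where the standard URL constructor is available (modern browsers and Node.js), and getBaseName is a hypothetical name.

```javascript
// Return the last path segment of a URL without its final extension.
function getBaseName(url) {
  // pathname excludes the query string and the fragment.
  var pathname = new URL(url).pathname;
  var last = pathname.split('/').pop();
  var dot = last.lastIndexOf('.');
  // Keep names without a dot, or with only a leading dot, unchanged.
  return dot > 0 ? last.slice(0, dot) : last;
}

console.log(getBaseName('https://www.mealenders.com/shop/index.php/shop/solo-pack.html'));
```

Only the final extension is removed, which matches the behaviour of the regular-expression answer for names such as index.php.default.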

Highlight HTML using Remarkable and Highlightjs

I'm having trouble getting the highlight function to execute when using Remarkable to highlight HTML code. I'm taking from the example here:
var md = new Remarkable({
html:true,
langPrefix:'lang-',
highlight: function (str, lang) {
alert('highlighting'); // never executes!
if (lang && hljs.getLanguage(lang)) {
try {
return hljs.highlight(lang, str).value;
} catch (err) {}
}
try {
return hljs.highlightAuto(str).value;
} catch (err) {}
return ''; // use external default escaping
}
});
var test = md.render('<code class="lang-js">var x = 1;</code>');
See fiddle

Remarkable works when you give it text written in markdown, not HTML. It generates the HTML for you. If you wanted to write out the HTML yourself, you don't need Remarkable ;)
So, your test line should look like this:
var test = md.render('``` js\nvar x = 1;\n```\n');
(normally, text is pulled from a text area, so you don't need the "\n" in there, you would just hit enter)
Here is the working fiddle:
https://jsfiddle.net/fhz9oma1/7/
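If the code to highlight starts out as a plain string, it can be wrapped in a Markdown fence before it is handed to md.render. The helper below is a small sketch of that step; toFencedBlock is a hypothetical name, and it produces the same string as the test line in the answer above.

```javascript
// Wrap source code in a Markdown fenced code block tagged with a language.
function toFencedBlock(code, lang) {
  var fence = '`'.repeat(3); // three backticks open and close the block
  return fence + ' ' + lang + '\n' + code + '\n' + fence + '\n';
}

var markdown = toFencedBlock('var x = 1;', 'js');
console.log(markdown);
```

The resulting string is Markdown rather than HTML, so passing it to md.render should reach the highlight callback as described in the answer.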
