[XSS] Express.js: prevent running a script from a POST request - javascript

I'm new to Express.js. I have a question about posted JavaScript:
app.get('/nothing/:code',function(req, res) {
var code = req.params.code;
res.send(code)
});
If I POST a JavaScript tag, it will run. Is there a way to prevent that?

There are many possible HTML sanitizers out there (a simple search on NPM will give you a listing of ones you can use in your Node.js code).
The simplest option is to use the built-in "escape" function, but that won't stop many XSS attacks.
app.get('/nothing/:code',function(req, res) {
var code = escape(req.params.code);
res.send(code)
});
A better solution would be to use a library designed for this purpose. For example, if you used the sanitizer library (Google Caja's HTML sanitizer packaged for Node):
var sanitizer = require('sanitizer');
...
app.get('/nothing/:code',function(req, res) {
var code = sanitizer.sanitize(req.params.code);
res.send(code)
});

Related

Input Processing in JavaScript

I'm new to Web Development (including JavaScript and HTML) and have a few issues within my personal project that seem to have no clear fixes.
Overview
My project is taking input from a user on the website, and feeding it to my back-end to output a list of word completion suggestions.
For example, input => "bass", then the program would suggest "bassist", "bassa", "bassalia", "bassalian", "bassalan", etc. as possible completions for the pattern "bass" (these are words extracted from an English dictionary text file).
The backend - running on Node JS libraries
trie.js file:
/* code for the trie not fully shown */
var Deque = require("collections/deque"); // to be used somewhere
function add_word_to_trie(word) { ... }
function get_words_matching_pattern(pattern, number_to_get = DEFAULT_FETCH) { ... }
// read in words from English dictionary
var file = require('fs');
const DICTIONARY = 'somefile.txt';
function preprocess() {
    file.readFileSync(DICTIONARY, 'utf-8')
        .split('\n')
        .forEach((item) => {
            add_word_to_trie(item.replace(/\r?\n|\r/g, ""));
        });
}
preprocess();
module.exports = get_words_matching_pattern;
The frontend
An HTML file that renders the visuals for the website, gets input from the user, and passes it on to the backend script to fetch possible suggestions. It looks something like this:
index.html script:
<!DOCTYPE HTML>
<html>
<!-- code for formatting website and headers not shown -->
<body>
<script src = "./trie.js">
function get_predicted_text() {
const autofill_options = get_words_matching_pattern(input.value);
/* add the first suggestion we get from the autofill options to the user's input
arbitrary, because I couldn't get this to actually work. Actual version of
autofill would be more sophisticated. */
document.querySelector("input").value += autofill_options[0];
}
</script>
<input placeholder="Enter text..." oninput="get_predicted_text()">
<!-- I get a runtime error here saying that get_predicted_text is not defined -->
</body>
</html>
Errors I get
Firstly, I get the obvious error of 'require()' being undefined on the client-side. This, I fix using browserify.
Secondly, there is the issue of 'fs' not existing on the client-side, since it is a node.js module. I have tried running the trie.js file with node and wrapping it in some server-side code:
function respond_to_user_input() {
    fs.readFile('./index.html', null, (err, html) => {
        if (err) throw err;
        http.createServer((request, response) => {
            response.write(html);
            response.end();
        }).listen(PORT);
    });
}
respond_to_user_input();
With this, I'm not exactly sure how to edit document elements, such as changing input.value in index.html, or triggering the oninput event listener on the input field. Also, my CSS file is not loaded if I serve the HTML file through the node trie.js command in the terminal.
This leaves me with the question: is it even possible to run index.html directly (through Google Chrome) and have it use Node.js modules when it calls the trie.js script? And if I go with the server-side code I described above with the HTTP module, how can I fix the issues of loading my external CSS file (which my HTML file references with an href) and accessing document.querySelector("input") to edit my input field?
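One way to structure this (a minimal, untested sketch assuming Express is acceptable; the /suggest route, port, and file names are made up for illustration): keep trie.js purely server-side, let express.static serve index.html and the CSS file, and have the browser fetch suggestions over HTTP instead of require()-ing trie.js.
// server.js - hypothetical sketch, not the original project's code
const express = require('express');
const get_words_matching_pattern = require('./trie.js'); // assumes trie.js exports this function

const app = express();

// serves index.html, the CSS file, and any client-side JS from this folder
app.use(express.static(__dirname));

// the browser asks for suggestions over HTTP instead of loading Node modules itself
app.get('/suggest/:pattern', (req, res) => {
    res.json(get_words_matching_pattern(req.params.pattern));
});

app.listen(8080);
In the browser, get_predicted_text() would then call fetch('/suggest/' + encodeURIComponent(input.value)) and use the returned JSON array, so no Node modules are needed client-side.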

Extracting a table value from a URL with Node JS

I am quite new to Node JS and express but I am trying to build a website which serves static files. After some research I've found out that NodeJS with Express can be quite useful for this.
So far I managed to serve some static html files which are located on my server, but now I want to do something else:
I have a URL to an HTML page, and in that HTML page there is a table with some information.
I want to extract a couple of specific values from it, and then 1) save them as JSON in a file, and 2) write those values to an HTML page. I've tried to play with jQuery, but so far I've been unsuccessful.
This is what I have so far:
1. A node app running on port 8081, which I will access from anywhere through an NGINX reverse proxy (I already have nginx set up and it works).
2. I can fetch the URL and serve it as HTML when I use the proper URI.
3. I see that the table doesn't have an ID, only the "details" class associated with it. Also, I am only interested in getting these rows:
<div class='group'>
    <table class='details'>
        <tr>
            <th>Status:</th>
            <td>
                With editors
            </td>
        </tr>
From what I've seen so far, jQuery would work fine if the table has an ID.
This is my code in app.js
var express = require('express');
var app = express();
var request = require('request');
const path = require('path');
var content;
app.use('/', function(req, res, next) {
    var status = 'It works';
    console.log('This is very %s', status);
    //console.log(content);
    next();
});
request(
    {
        uri:
            'https://authors.aps.org/Submissions/status?utf8=%E2%9C%93&accode=CH10674&author=Poenaru&commit=Submit'
    },
    function(error, response, body) {
        content = body;
    }
);
app.get('/', function(req, res) {
    console.log('Got a GET request for the homepage');
    res.sendFile(path.join(__dirname, '/', 'index.html'));
});
app.get('/url', function(req, res) {
    console.log('You requested table data!!!');
    // TODO: show only the values of that table instead of the whole HTML page
    res.send(content);
});
var server = app.listen(8081, function() {
    var host = server.address().address;
    var port = server.address().port;
    console.log('Node-App listening at http://%s:%s', host, port);
});
Basically, the HTML content of that URL is saved into the content variable, and now I would like to keep only the table from it and output just that part to a new HTML page.
Any ideas?
Thank you in advance :)
Ok, so I've come across this package called cheerio, which basically allows one to use jQuery on the server. Having the HTML code from that specific URL, I could search that table for the elements I need. Cheerio is quite straightforward, and with this code I got the results I needed:
var cheerio = require('cheerio');
request(
    'https://authors.aps.org/Submissions/status?utf8=%E2%9C%93&accode=CH10674&author=Poenaru&commit=Submit',
    (error, res, html) => {
        if (!error && res.statusCode === 200) {
            const $ = cheerio.load(html);
            const details = $('.details');
            const articleInfo = details.find('th').eq(0);
            const articleStatus = details
                .find('th')
                .next()
                .eq(0);
            //console.log(details.html());
            console.log(articleInfo.html());
            console.log(articleStatus.html());
        }
    }
);
Thank you @O.Jones and @AvcS for guiding me to jsdom and node-html-parser. I will definitely play with those in the near future :)
Cheers!
Your task is called "scraping." You want to scrape a particular chunk of data from some web page you did not create and then return it as part of your own web page.
You have noticed a problem with scraping: often the page you're scraping does not cleanly identify the data you want with a distinctive id. So you must use some guesswork to find it. @AvcS pointed out a server-side npm library called jsdom that you can use for this purpose.
Notice this: Even though browsers and nodejs both use Javascript, they are still very different environments. Browser Javascript has lots of built-in APIs to access web pages' Document Object Models (DOMs). But nodejs doesn't have those APIs. If you try to load jQuery into node.js, it won't work, because it depends on browser DOM APIs. The jsdom package gives you some of those DOM APIs.
Once you have fetched that web page to scrape, code like this may help you get what you need.
const jsdom = require("jsdom");
const { JSDOM } = jsdom;
...
const page = new JSDOM(page_in_text_string).window;
Then you can use a subset of the DOM APIs to find the elements you want in your page. In your example, you are looking for the table matched by the selector div.group table.details, and then for the enclosing div.group element.
You can do this sort of thing to find what you need:
const desiredTbl = page.document.querySelector("div.group table.details");
const desiredDiv = desiredTbl ? desiredTbl.parentNode : null;
const result = desiredDiv ? desiredDiv.textContent : null;
Finally do this:
page.close();
Your question says you want certain rows from your document. HTML documents don't have rows, they have elements. If you want to extract just parts of elements (part of the table rather than the whole thing) you'll need to use some text-string code. Just sayin'.
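For example (a hypothetical sketch building on the result variable above), extracting just the text that follows "Status:" could be done with a plain regular expression on the textContent:
// hypothetical follow-up to the snippet above; `result` holds the table's textContent
const text = result || "";
const match = text.match(/Status:\s*(\S[^\n]*)/);
const status = match ? match[1].trim() : null; // e.g. "With editors"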
Also, I have not debugged any of this. That is left to you.
There's a smaller and faster library to do similar things called node-html-parser. If performance is important you may want that one instead.
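If you go that route, a rough (untested) sketch with node-html-parser, assuming the same table.details markup as in the question, might look like:
const { parse } = require('node-html-parser');

// `html` is the page text already fetched with request()
const root = parse(html);
const details = root.querySelector('table.details');
const statusCell = details ? details.querySelector('td') : null;
console.log(statusCell ? statusCell.text.trim() : 'table not found');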

Use external Javascript Libraries IN BigQuery functions?

New to user-defined functions (UDFs), so excuse me if this is a dumb question.
Can I use a standard http library to make a request FROM a BigQuery function?
Basically I want to be able to make a function that's available from SQL, that will trigger an external service over http.
I've tried both import and require for the http library in my custom function, but both fail when running the Javascript in BigQuery.
'use strict';
function https() {
    let res = '';
    const http = require('http');
    http.get('https://google.com', (resp) => {
        let data = "";
        resp.on('end', () => {
            res = "pinged";
        });
    });
    return res;
}
Thanks in advance!
As Elliott Brossard said, this isn't possible.
The solution I wound up with is having a library of UDF Javascript functions, and then having another Javascript library which consumes this same code. The outer library handles all the web traffic stuff.
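A hypothetical sketch of that split (names are illustrative, not from the original setup): the UDF file stays free of require() and network calls so BigQuery can load it, and a separate Node wrapper imports the same functions and does the HTTP work.
// udf_lib.js - plain functions only, usable as a BigQuery UDF library
function classify(value) {
    return value > 100 ? 'high' : 'normal';
}
if (typeof module !== 'undefined') {
    module.exports = { classify };
}

// notifier.js - ordinary Node code that reuses the same logic and owns the HTTP call
const https = require('https');
const { classify } = require('./udf_lib');

function notify(value) {
    const payload = JSON.stringify({ value: value, label: classify(value) });
    const req = https.request(
        { hostname: 'example.com', path: '/hook', method: 'POST',
          headers: { 'Content-Type': 'application/json' } },
        (res) => res.resume()
    );
    req.on('error', console.error);
    req.end(payload);
}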

Use "readabilitySAX" on distant pages, in Node.js

I want to get the length of articles published on newspaper and magazine websites and on blogs.
In a server made with Node.js, I want to use the "readabilitySAX" module (https://github.com/fb55/readabilitySAX), but I must be making a mistake in how I use it, because this code is not working:
var Readability = require("readabilitySAX/readabilitySAX.js"),
Parser = require("htmlparser2/lib/Parser.js");
var readable = new Readability({
pageURL: "http://www.nytimes.com/2014/04/18/business/treatment-cost-could-influence-doctors-advice.html?src=me&ref=general"
});
parser = new Parser(readable, {});
console.log(readable.getArticle().textLength);
The pageURL attribute is used when Readability resolves relative links, not to download a page.
To download a page, you can use the get method:
require("readabilitySAX").get("http://url", {type:"html"}, function(article) {
console.log(article.textLength);
})
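Alternatively, if you want to fetch the page yourself, a rough sketch (not taken from the readabilitySAX docs; it assumes the request module and feeds Readability through an htmlparser2 Parser as in your snippet) could look like this:
var request = require("request"),
    Readability = require("readabilitySAX/readabilitySAX.js"),
    Parser = require("htmlparser2/lib/Parser.js");

var url = "http://www.nytimes.com/2014/04/18/business/treatment-cost-could-influence-doctors-advice.html?src=me&ref=general";

request(url, function(error, response, html) {
    if (error) return console.error(error);
    // pageURL only helps resolve relative links; the HTML is fed in manually
    var readable = new Readability({ pageURL: url });
    var parser = new Parser(readable, {});
    parser.write(html);
    parser.end();
    console.log(readable.getArticle().textLength);
});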

JSONP callback in Dart

I have been trying to get basic JSONP working in Dart and I am getting stuck. Reading this blog post as well as this blog shows that I should use window.on.message.add(dataReceived); to get a MessageEvent and retrieve data from the event.
Dart complains that "There is no such getter 'message' in events". In addition, I looked up different ways of getting a MessageEvent but it seems to be something completely unrelated (WebSockets?) and is not what I actually need.
If anybody can explain what is going on and how to really use JSONP in Dart, that would be awesome!
You don't need to use what is described in the articles you point to anymore. You can use dart:js:
import 'dart:html';
import 'dart:js';
void main() {
  // Create a JS function to handle the response.
  context['processData'] = (JsObject jsonData) {
    // called with the JSON data
  };
  // make the call
  ScriptElement script = new ScriptElement();
  script.src = "https://${url}?callback=processData";
  document.body.children.add(script);
}
I recently wrote a blog post on this myself as I was running into similar problems.
I first cover a few prerequisite things like Verifying CORS Compliance and Verifying JSONP Support
I too ended up registering with the updated method:
window.onMessage.listen(dataReceived);
I then had a fairly simple method to dynamically create the script tag in Dart as well (my requirement was that I had to use Dart exclusively and couldn't touch the website source files):
void _createScriptTag()
{
String requestString = """function callbackForJsonpApi(s) {
s.target="dartJsonHandler";
window.postMessage(JSON.stringify(s), '*');
}""";
ScriptElement script = new ScriptElement();
script.innerHtml = requestString;
document.body.children.add(script);
}
I then invoked it from Dart with some simple logic that I wrapped in a method for convenience.
void getStockQuote(String tickerId)
{
String requestString = "http://finance.yahoo.com/webservice/v1/symbols/" + tickerId + "/quote?format=json&callback=callbackForJsonpApi";
ScriptElement script = new ScriptElement();
script.src = requestString;
document.body.children.add(script);
}
If you are using dart:js I find Alexandre's Answer useful and, after upvoting Alexandre, I have updated my post to include the simplified version as well:
context['callbackForJsonpApi'] = (JsObject jsonData)
{
//Process JSON data here...
};
This obviously eliminates the need for the onMessage and _createScriptTag above, and can be invoked the same as before.
I decided to keep both approaches, however, as I have noticed over time the Dart APIs changing and it seems to be a good idea to have a fallback if needed.
The syntax has changed
window.onMessage.listen(dataReceived);
