Get title from HTML - javascript

I have a server-side function which returns the content of an HTML page:
if (Meteor.isServer) {
  Meteor.startup(function () {
    // code to run on server at startup
    Meteor.methods({
      sayHello: function() {
        var response = Meteor.http.call("GET", "http://google.com");
        return response;
      }
    });
  });
}
And I have client code where I am trying to get the title from this HTML page:
'click .add_tag' : function(e,t) {
  //Session.set('editing_tag_id', e.target.id);
  Meteor.call("sayHello", function(err, response) {
    var title = $(response.content).find("title").text();
    var title2 = $(response).find("title").text();
    var title3 = response.content.match(/<title[^>]*>([^<]+)<\/title>/)[1];
    alert(title3);
  });
}
I would like to use one of the jQuery versions ('title' or 'title2'), but they don't work: both return an empty string.
The 'title3' version works fine, but I don't like regexps. :)
Is there any way to make the jQuery versions work?

As requested, I will reiterate my comment as an answer...
I would stick with the regex, even though you don't like it. There is a huge overhead of constructing a DOM element that is essentially an entire page, purely for the purpose of parsing a small amount of text. The regex is more lightweight and will perform adequately in slower browsers or on slower machines.
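One caveat: match() returns null when nothing matches, so the [1] access in the question will throw on a page without a title. A slightly defensive variant of the same regex (with a case-insensitive flag added):
var m = response.content.match(/<title[^>]*>([^<]+)<\/title>/i);
alert(m ? m[1] : "");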

Wrap response.content in a <div> and then do a selection off of that. This way you have a proper structure to start from rather than an array that you might actually be getting.
var $wrap = $("<div></div>").html(response.content);
$wrap.find("title").text();
An example of what is probably going on: http://jsfiddle.net/UFtJV/

Don't forget one thing: you should never return HTML to the client. You should return JSON (or even XML) that your client will transform into HTML using a Template.
Like a lot of developers doing Bad Ajax, you're shipping markup instead of data.
Remember: "only data on the wire, not display".
So there should be no problem: on response, you just take the data from the JSON-formatted response and inject it into your Template.
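Applied to this question, that would mean extracting the title on the server and returning only the data the client needs. A minimal sketch, assuming the same Meteor setup as the question (the getTitle method name is made up; the server-side regex mirrors the asker's 'title3' approach):
Meteor.methods({
  getTitle: function() {
    var response = Meteor.http.call("GET", "http://google.com");
    // parse on the server; send only the data over the wire
    var m = response.content.match(/<title[^>]*>([^<]+)<\/title>/i);
    return { title: m ? m[1] : "" };
  }
});

// client side: only data arrives, no HTML parsing needed
Meteor.call("getTitle", function(err, data) {
  alert(data.title);
});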

How to run javascript based rules against a page?

I would like to run some rules against pages. These rules are essentially functions that check the page for information. They can be as simple as 'check if the page has a title tag' or more complex, like 'check if all links on the page are whitelisted based on example.com/allowed_links.json'.
The rules would be run on the page on-demand only and come from a trusted source.
My first approach has been to create a rule service that sends back a JavaScript array of rules. All the client then has to do is go over the array and run each function in it. Each rule returns a standard object: {rule: [name], pass: [true|false], message: [some message about success/failure]}
Since this is on-demand only, we fetch the rules from the service and run 'eval' on the result.
EDIT: The response from 'mysite/rules' looks like this:
RULESYSTEM.rules.push(function function1() { /* ... */ });
RULESYSTEM.rules.push(function function2() { /* ... */ });
...
const RULESYSTEM = {
  rules: [],
};

let rules = await fetch('mysite/rules');
let rulesscript = await rules.text();
eval(rulesscript);
...
// eval will populate the previously declared rules array.
let pass = true;
for (let i = 0; i < RULESYSTEM.rules.length; i++) {
  let rule = RULESYSTEM.rules[i];
  // each rule returns {rule, pass, message}
  let result = rule();
  pass = pass && result.pass;
}
...
This works perfectly fine. However, it is receiving a lot of pushback because 'eval' is considered evil and to be avoided at all costs. Security is not an issue here, since the source is within the organization itself and thus trusted.
Are there any other ways to implement a system like this?
It would appear that all you're attempting to do is retrieve JSON data and transform it into a JavaScript object.
fetch('mysite/rules')
.then(res=>res.json())
.then(data=>{
//TODO: handle data which is your object/array
console.log(data)
})
Thus no need for eval. Also, remember that fetch returns a promise, and that rules.text() and rules.json() also return promises. The way you've currently written it won't work anyway.
According to MDN
The json() method of the Body mixin takes a Response stream and reads it to completion. It returns a promise that resolves with the result of parsing the body text as JSON.
To answer your question:
Is it possible to return javascript code as JSON
That's clearly a no, however there are alternatives! You can simply fetch the JavaScript as text (as you've done), programmatically create a script tag, load your JavaScript text into it, and append it to your HTML file.
Or even better, you can simply create a script tag dynamically with the URL of the server endpoint serving the JavaScript, and append it to your HTML file.
Something like:
const script = document.createElement("script");
script.onload = function() {
  console.log("script loaded");
};
script.src = '/some/url/here';
document.body.appendChild(script);
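Tying that back to the rule system in the question: if 'mysite/rules' serves plain JavaScript that pushes functions into RULESYSTEM.rules (as in the question's EDIT), the onload handler is the natural place to run them. A minimal sketch under that assumption:
const RULESYSTEM = { rules: [] };

const script = document.createElement("script");
script.src = 'mysite/rules'; // serves JS that calls RULESYSTEM.rules.push(...)
script.onload = function() {
  // by now the fetched script has run and populated RULESYSTEM.rules
  let pass = RULESYSTEM.rules.every(rule => rule().pass);
  console.log('all rules passed:', pass);
};
document.body.appendChild(script);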
I am going to add this as an answer. I will use some dummy data you can query from an endpoint:
Route("get-functions")
Response getJSFunctions(List<string> js_to_load){
var options = getData(); //returns a list
var guid = new Guid()
var fp = File.open(guid.toString() + ".js", "w+")
var out = "var fns = [" + options.join("\n") + "];";
fp.write(out);
fp.write(" var runner = options => fns.forEach(fn => fn(options) );");
fp.close()
return new Response({url: guid.toString() + ".js"})
}
JS:
$.getJSON("get-functions", data => {
    let script = document.createElement("script");
    script.src = data.url;
    // wait for the generated file to load before calling the injected runner()
    script.onload = () => runner(options);
    document.head.appendChild(script);
});
So what is happening is that you build a temp JS file containing all the JS functions you want to run, then add that file dynamically to the runtime. A function called runner will then always be available, and you can call it as soon as the script loads.
Runner will call each function in turn with a global options object shared across the functions, which you can define on the client side.
Instead of C#, you can use any server-side language, or even JavaScript if you are using Node as your backend. You need DB access and file-creation rights on the host machine. You will also want to delete the GUID files occasionally; they are one-use files, so you can delete them all every 5 minutes or so.
I don't have the means right now to create a running sample. I can create something a little later with Python as the backend, if you like.

node.js does not recognise the url in the unfluff module

Any help will be appreciated.
I need to extract data from websites and found that node-unfluff does the job (see https://github.com/ageitgey/node-unfluff). There are two ways to call this module.
First, from the command line, which works!
Second, from Node.js, which doesn't work:
extractor = require('unfluff');
data = extractor('test.html');
console.log(data);
Output : {"title":"","lang":null,"tags":[],"image":null,"videos":[],"text":""}
The call returns an empty JSON object. It appears that it cannot read test.html.
It seems like it doesn't recognise test.html. The example says 'my html data'; is there a way to get that HTML data? Thanks.
From the docs of unfluff:
extractor(html, language)
html: The html you want to parse
language (optional): The document's two-letter language code. This will be auto-detected as best as possible, but there might be cases where you want to override it.
You are passing a filename, and it expects the actual HTML of the file to be passed in.
If you are doing this in a scripting context, I'd recommend doing
var fs = require('fs');
data = extractor(fs.readFileSync('test.html', 'utf8'));
however, if you are doing this in the context of a server, or anywhere blocking would be an issue, you should do:
fs.readFile('test.html', 'utf8', function(err, html) {
  var data = extractor(html);
  console.log(data);
});

Odd occurrence in using app.use(parseExpressRawBody())

Maybe this is simple, maybe this is a bug on Parse - would like to know if anyone has had the same problem and a possible solution.
What I'm trying to do:
I'm sending a JSON request from an app called FormEntry to my Parse app
The body comes in like this: json={"someLabel" : "someValue"}
I would like to take the entire body and create a Parse.Cloud.httpRequest over to Zapier to perform some functions.
Now, the problem seems to be this:
On random occasions (i.e. I have no idea why), the body is sent (as shown by the logs) with a trailing comma after the last pair in the JSON object, e.g. like this: json={"lastLabel" : "lastValue",}
The number of elements in the 'normal' and 'incorrect' objects seems to be the same, so it's simply one extra comma. And I have no idea why.
My setup:
Using app.use(parseExpressRawBody()); only and not the standard app.use(express.bodyParser()); which doesn't provide access to the raw body.
Because parseExpressRawBody converts the body to a buffer I need to turn it back into a string to send it in the HTTP request in a meaningful way. Therefore I use: var body = req.body.toString();
When logging this var to the Parse console, it appears to be formatted back from the buffer fine.
And that's about it. Nothing complex going on here, but a really annoying bug that I just haven't found a sensible way of understanding. Would SUPER appreciate anyone who has seen this before, or who could point me in a direction to focus on.
Just an update on this. Not a solution that answers why there is malformed JSON but a hack to get the right result.
The purpose of the HTTP request was to point over to Zapier so I wrote a Zapier script that would deal with the malformed JSON. Added here for anyone else who needs it.
"use strict";
var Zap = { newSubmission_catch_hook: function(bundle) {
var body = bundle.request.content;
var cleanTop = body.substring(5,body.length);
var cleanChar = cleanTop.length;
var condition = cleanTop.substring(cleanChar-2,cleanChar);
function testCase(condition,cleanTop) {
if (condition != ",}"){
console.log("Everything is fine, returning JSON");
return cleanTop;
}
else {
console.log("Nope! We have an error, cleaning end");
var cleanEnd = cleanTop.substr(0,cleanChar-2) + '}';
console.log("The object now ends with: " + cleanEnd.substr(-10));
return cleanEnd;
}
}
var newBody = JSON.parse(testCase(condition,cleanTop));
return newBody;
}
};
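For what it's worth, the same cleanup can be done with a single regex replace; a compact alternative sketch, under the same assumption that the body starts with a 'json=' prefix:
// strip the prefix, then drop any trailing comma before the closing brace
var newBody = JSON.parse(body.substring(5).replace(/,\s*}$/, '}'));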

Caching AJAX query results with prototype

I'm looking at putting together a good way of caching the results of AJAX queries so that the same user doesn't have to repeat the same query on the same page twice. I've put something together using a Hash which works fine, but I'm not sure if there's a better method I could be using. This is a rough snippet of what I've come up with, which should give you the general idea:
var ajaxresults;
document.observe("dom:loaded", function() {
  ajaxresults = new Hash();
  doAjaxQuery();
});

function doAjaxQuery() {
  var qs = '?mode=getSomething&id=' + $('something').value;
  if (ajaxresults.get(qs)) {
    var vals = (ajaxresults.get(qs)).evalJSON();
    doSomething(vals);
  } else {
    new Ajax.Request('/ajaxfile.php' + qs, {
      evalJSON: true,
      onSuccess: function(transport) {
        var vals = transport.responseText.evalJSON();
        ajaxresults.set(qs, transport.responseText);
        // call doSomething here (not in onComplete) so vals is in scope
        doSomething(vals);
      }
    });
  }
}
Did you try caching the AJAX responses by setting cache-control headers? That's another way, and your browser will take care of the caching; you don't have to maintain a cache hash inside your library.
High Performance Web Sites discusses this at length. I don't know much about PHP, but in the .NET world there is a way to set cache headers before writing the response to the stream. I'm sure there's a similar way in PHP too.
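To make the idea concrete in JavaScript terms, here is a hypothetical Node/Express handler standing in for the PHP/.NET endpoint (the route and max-age are made up for illustration):
const express = require('express'); // hypothetical stand-in for the PHP endpoint
const app = express();

// let the browser cache the AJAX response itself
app.get('/ajaxfile', function(req, res) {
  res.set('Cache-Control', 'public, max-age=300'); // cache for 5 minutes
  res.json({ mode: req.query.mode, id: req.query.id });
});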
If you start building a results tree with JSON, you can check whether a particular branch (or entry) exists in the tree. If it doesn't, you can go fetch it from the server.
You can then serialize your JSON data and store it in window.name. Then you can persist the data from page to page as well.
Edit:
Here's a simple way to use JSON for this type of task:
var clientData = {};
clientData.dataset1 = [
  {name:'Dave', age:'41', userid:2345},
  {name:'Vera', age:'32', userid:9856}
];
if (clientData.dataset2) {
  alert("dataset 2 loaded");
} else {
  alert("dataset 2 must be loaded from server");
}
if (clientData.dataset1) {
  alert(clientData.dataset1[0].name);
} else {
  alert("dataset 1 must be loaded from server");
}
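And a minimal sketch of the window.name persistence mentioned above, using the same clientData object (assuming nothing else on the page relies on window.name):
// serialize the cache so it survives navigation within this window/tab
window.name = JSON.stringify(clientData);

// on the next page, restore it (window.name may hold unrelated content)
var restored;
try { restored = JSON.parse(window.name) || {}; } catch (e) { restored = {}; }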
Well, I guess you could abstract it some more (e.g. extend Ajax by a cachedRequest() method that hashes a combination of all parameters to make it universally usable in any Ajax request) but the general approach looks fine to me, and I can't think of a better/faster solution.
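A sketch of what that abstraction might look like with Prototype (cachedRequest and the onCached callback are made-up names for illustration):
var cachedRequest = (function() {
  var cache = new Hash();
  return function(url, options) {
    // key on the URL plus the serialized parameters
    var key = url + '|' + Object.toJSON(options.parameters || {});
    if (cache.get(key)) {
      options.onCached(cache.get(key).evalJSON());
      return;
    }
    new Ajax.Request(url, {
      parameters: options.parameters,
      onSuccess: function(transport) {
        cache.set(key, transport.responseText);
        options.onCached(transport.responseText.evalJSON());
      }
    });
  };
})();

// usage: cachedRequest('/ajaxfile.php', { parameters: {mode: 'getSomething'}, onCached: doSomething });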

jsonp request not working in firefox

I am trying a straightforward remote JSON call with jQuery. I am trying to use the reddit API (http://api.reddit.com), which returns a valid JSON object.
If I call a local file (which is what is returned from the website saved to my local disk) things work fine.
$(document).ready(function() {
  $.getJSON("js/reddit.json", function (json) {
    $.each(json.data.children, function () {
      var title = this.data.title;
      var url = this.data.url;
      $("#redditbox").append("<div>" + title + "</div>");
    });
  });
});
If I then try to convert it to a remote call:
$(document).ready(function() {
  $.getJSON("http://api.reddit.com", function (json) {
    $.each(json.data.children, function () {
      var title = this.data.title;
      var url = this.data.url;
      $("#redditbox").append("<div>" + title + "</div>");
    });
  });
});
it will work fine in Safari, but not Firefox. This is expected, as Firefox doesn't allow cross-domain requests for security reasons. Fine.
In the jQuery docs they say to do it like this (JSONP):
$(document).ready(function() {
  $.getJSON("http://api.reddit.com?jsoncallback=?", function (json) {
    $.each(json.data.children, function () {
      var title = this.data.title;
      var url = this.data.url;
      $("#redditbox").append("<div>" + title + "</div>");
    });
  });
});
however, it now stops working in both Safari and Firefox. The request is made, but whatever is returned from the server appears to be ignored.
Is this a problem with the code I am writing or with something the server is returning? How can I diagnose this problem?
EDIT Changed the address to the real one.
JSONP is something that needs to be supported on the server. I can't find the documentation, but it appears that, if Reddit supports JSONP, it's not with the jsoncallback query variable.
What JSONP does is wrap the JSON text in a JavaScript function call; this allows the JSON text to be processed by any function you've already defined in your code. This function does need to be available from the global scope, however. It appears that the jQuery getJSON method generates a function name for you and assigns it to the jsoncallback query string variable.
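To illustrate the difference, here is a plain JSON response body next to the same data wrapped as JSONP (foo stands in for whatever callback name the client supplied):
// plain JSON response body:
{"data": {"children": []}}

// the same response wrapped as JSONP, for ?callback=foo:
foo({"data": {"children": []}});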
The URL you are pointing to (www.redit.com...) is not returning JSON! Not sure where the JSON syndication from reddit comes from, but you might want to start with the example from the docs:
$(document).ready(function() {
  $.getJSON("http://api.flickr.com/services/feeds/photos_public.gne?tags=cat&tagmode=any&format=json&jsoncallback=?", function (data) {
    $.each(data.items, function(i, item) {
      $("<img/>").attr("src", item.media.m).appendTo("#redditbox");
      if (i == 4) return false;
    });
  });
});
(apologies for formatting)
EDIT: Now I re-read your post, I see you intended to go to api.reddit.com; unfortunately you haven't got the right parameter name for the JSON callback parameter. You might need to consult the reddit documentation further to see if they support JSONP and what the name of the callback param should be.
I'm not sure about reddit.com, but for sites that don't support the JSONP idiom you can still create a proxy technique (on the backend) that returns the reddit JSON, and then you would just make an ajax request to that instead.
So if you called http://mydomain.com/proxy.php?url=http://api.reddit.com:
<?php
$url = $_GET["url"];
print_r(file_get_contents($url));
?>
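The client then calls your own domain, so no JSONP is needed; a sketch of the corresponding call (encodeURIComponent keeps the nested URL's query string intact):
$.getJSON("/proxy.php?url=" + encodeURIComponent("http://api.reddit.com"), function (json) {
  $.each(json.data.children, function () {
    $("#redditbox").append("<div>" + this.data.title + "</div>");
  });
});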
http://api.reddit.com/ returns JSON, but doesn't appear to be JSONP-friendly. You can verify this, if you have GET, via
% GET http://api.reddit.com/?callback=foo
which dumps a stream of JSON without the JSONP wrapper.
http://code.reddit.com/browser/r2/r2/controllers/api.py (line 84) shows the code looking for 'callback' (not 'jsoncallback'). That may be a good starting point for digging through Reddit's code to see what the trick is.
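If that 'callback' parameter is indeed honored, the jQuery side would look like the following sketch; jQuery treats the =? placeholder as a JSONP marker and substitutes its generated callback name (though the GET check above suggests the wrapper may not actually be applied):
$.getJSON("http://api.reddit.com/?callback=?", function (json) {
  // process json.data.children as in the question
});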
