Using jQuery to manipulate a composed HTML page? - javascript

I am working on a data scraping application. The pages I want to scrape have already been collected by server-to-server requests, and are currently stored in my database. I use an axios request to retrieve them from my database, and I now want to traverse/manipulate the pages using my VueJS application.
Here's what I have so far...
import $ from 'jquery'
...
...
axios.get('/mydomain/fetch_page/1')
    .then(function (response) {
        console.log(response.data.body); // log point #1
        let $page = $(response.data.body);
        // let $heading = $page('h2');
        console.log($page); // log point #2
    })
At log point #1 the page looks fine. It includes the entire page: DOCTYPE, html, head, body, etc.
At log point #2, it is an object where each property represents a DOM node, including text nodes.
If I un-comment the line that queries the $page for a $heading, I get an error telling me that $page is not a function. What am I doing wrong?

$page is a jQuery object, not a function. Just as when you're writing web applications with jQuery, you call methods on it; you don't call it as a function.
To get the h2 elements, use the .find() method:
let $heading = $page.find("h2");
One caveat worth checking: when jQuery parses a full page like this, it discards the <html>, <head>, and <body> wrappers, so elements that were direct children of <body> end up at the top level of the collection. Since .find() only searches descendants, an h2 that was a direct child of <body> would need $page.filter("h2") instead.

Related

How to examine the contents of data returned from an ajax call

I have an ajax call to a PHP module which returns some HTML. I want to examine this HTML and extract the data from some custom attributes before deciding whether to load the HTML into the DOM.
I can see the data in the network activity as well as via console.log. I want to extract the values of the data-pk attribute and test them before deciding whether to load the HTML or just bypass it.
$.ajax({
    url: "./modules/get_recent.php",
    method: "POST",
    data: {chat_id: chat_id, chat_name: chat_name, host_id: host_id, host_name: host_name}, // received as a $_POST array
    success: function(data)
    {
        console.log(data);
    },
})
and some of the console log data are:
class="the_pks" data-pk="11"
class="the_pks" data-pk="10"
etc.
In the above data I want to extract and 'have a look at' the numbers 11 and 10.
I just do not know how to extract these data-pk values from the data returned by the ajax call. Doing a .each() on class 'the_pks' does not work because, at the time I am looking at the data, it has not been loaded into the DOM.
I have searched SO for this but have not come up with an answer.
Any advice will be most appreciated.
I hope I understand your question.
If you get HTML as a response, you can always create a detached element and insert that HTML into it, without adding it to the DOM. After that you can query it just as you would a DOM element.
const node = document.createElement("div");
// then parse the HTML string into it (note: appendChild expects a Node,
// not a string, so for a string response use innerHTML):
node.innerHTML = data;
And after that you can search inside the node:
node.querySelectorAll("[data-pk]")
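If you only need the data-pk values themselves, not the nodes, a plain string scan also works without touching the DOM at all. A minimal sketch, assuming the attribute is always written as data-pk="...":

```javascript
// Extract every data-pk value from an HTML string (no DOM required).
// Assumes the attribute always appears in the form data-pk="<digits>".
function extractPks(html) {
  return [...html.matchAll(/data-pk="(\d+)"/g)].map(m => Number(m[1]));
}

const data = '<span class="the_pks" data-pk="11"></span><span class="the_pks" data-pk="10"></span>';
console.log(extractPks(data)); // [11, 10]
```

This is fine for a quick "have a look at" check; for anything structural, the querySelectorAll approach above is more robust.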
I will re-engineer this - it was probably a clumsy way to try and achieve what I wanted anyway. Thanks to those who tried to help.

How to run javascript based rules against a page?

I would like to run some rules against pages. These rules are essentially functions that check the page for information. They can be simple as in 'check if the page has a title tag' or more complex like 'check if all links on the pages are whitelisted based on example.com/allowed_links.json'.
The rules would be run on the page on-demand only and come from a trusted source.
My first approach has been to create a rule service that sends back a JavaScript array of rules. All the client then has to do is go over the array and run each function in it. Each function returns a standard object {rule: [name], pass: [true|false], message: [some message about success/failure]}
Since this is on demand only, we fetch the rules from the service and run 'eval' on it.
EDIT: The response from 'mysite/rules' looks like this
RULESYSTEM.rules.push(function function1() {...});
RULESYSTEM.rules.push(function function2() {...});
...
const RULESYSTEM = {
rules: [],
};
let rules = await fetch('mysite/rules')
let rulesscript = await rules.text();
eval(rulesscript)
...
//eval will populate the previously declared rules array.
let pass = true;
for(let i = 0; i < RULESYSTEM.rules.length; i++) {
    let rule = RULESYSTEM.rules[i];
    // Each rule returns {rule, pass, message}
    let result = rule();
    pass = pass && result.pass;
}
...
This works perfectly fine. However, it is receiving a lot of pushback, as 'eval' is considered evil and to be avoided at all costs. Security is not an issue here, since the source is within the organization itself and thus trusted.
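For reference, here is a stripped-down, self-contained version of the flow described above, with the rule source inlined instead of fetched and a single dummy rule standing in for the real ones:

```javascript
const RULESYSTEM = { rules: [] };

// Stand-in for the text that would come back from 'mysite/rules'
const rulesscript = `
  RULESYSTEM.rules.push(function hasTitle() {
    return { rule: 'has-title', pass: true, message: 'title tag present' };
  });
`;

eval(rulesscript); // populates RULESYSTEM.rules

let pass = true;
for (const rule of RULESYSTEM.rules) {
  pass = pass && rule().pass;
}
console.log(pass); // true
```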
Are there any other ways to implement a system like this.
It would appear that all you're attempting to do is retrieve JSON data and transform it into a JavaScript object.
fetch('mysite/rules')
.then(res=>res.json())
.then(data=>{
//TODO: handle data which is your object/array
console.log(data)
})
Thus no need for eval. Also remember that fetch returns a promise, and that rules.text() and rules.json() also return promises. The way you've currently written it won't work anyway.
According to MDN
The json() method of the Body mixin takes a Response stream and reads it to completion. It returns a promise that resolves with the result of parsing the body text as JSON.
To answer your question:
Is it possible to return javascript code as JSON
That's clearly a no, however there are alternatives! You can simply fetch the JavaScript as text (as you've done), programmatically create a script tag, load your JavaScript text into it, and append it to your HTML file.
Or even better, you can simply dynamically create a script tag with the URL of your server endpoint sending javascript and append it to your HTML file.
Something like:
const script = document.createElement("script");
script.onload = function(){
    console.log("script loaded");
};
script.src = '/some/url/here';
document.body.appendChild(script);
I am going to add this as an answer, using some dummy data you can query from an endpoint.
Route("get-functions")
Response getJSFunctions(List<string> js_to_load) {
    var options = getData(); // returns a list
    var guid = new Guid();
    var fp = File.open(guid.toString() + ".js", "w+");
    var out = "var fns = [" + options.join(",\n") + "];";
    fp.write(out);
    fp.write(" var runner = options => fns.forEach(fn => fn(options));");
    fp.close();
    return new Response({url: guid.toString() + ".js"});
}
Js:
$.getJSON("get-functions", data => {
    let script = document.createElement("script");
    script.src = data.url;
    script.onload = () => runner(options); // wait for the script to load before calling runner
    document.head.appendChild(script);
});
So what is happening is that you build a temporary JS file containing all the JS functions you want to run, then add that file dynamically to the runtime. You then have a function called runner that will always be available, which you can call immediately.
runner will call each function in turn with a shared options object, which you can define on the client side.
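Assuming the names from the sketch above (fns, runner), the generated GUID file plus the client call boil down to something like this, with dummy rules that record their calls instead of checking a real page:

```javascript
// Dummy stand-ins for the generated rule functions
const calls = [];
var fns = [
  options => calls.push('rule1:' + options.page),
  options => calls.push('rule2:' + options.page),
];
// The generated file also defines the runner:
var runner = options => fns.forEach(fn => fn(options));

// The client then calls runner with its own options object:
runner({ page: 'home' });
console.log(calls); // ['rule1:home', 'rule2:home']
```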
Instead of using C#, you can use any server-side language, or even JavaScript if you are using Node as your backend. You need DB access and file-creation access on the host machine. You will also want to delete the GUID files occasionally, as they are one-and-done files, so you can delete them all every 5 minutes or something.
I don't have the means right now to create a running sample. I can create something a little later with Python as the backend if you like.

How to replace string in JSON format in Javascript with data from JSON file?

For a site that I'm making, I wanted to have an additional feature which uses this plugin on GitHub called cratedigger, which uses WebGL and Three.js. It is a 3D virtual crate that contains vinyl records and simulates "crate digging" of albums. The plugin gets the data for the vinyl titles, artists, and covers from a string in JSON format that is parsed with JSON.parse and stored in a const variable. The original code (index.js):
const data = JSON.parse('[{"title":"So What","artist":"Miles Davis","cover":"http://cdn-images.deezer.com/images/cover/63bf5fe5f15f69bfeb097139fc34f3d7/400x400-000000-80-00.jpg","year":"2001","id":"SOBYBNV14607703ACA","hasSleeve":false},{"title":"Stolen Moments","artist":"Oliver Nelson","cover":"http://cdn-images.deezer.com/images/cover/99235a5fbe557590ccd62a2a152e4dbe/400x400-000000-80-00.jpg","year":"1961","id":"SOCNMPH12B0B8064AA","hasSleeve":false},{"title":"Theme for Maxine","artist":"Woody Shaw","cover":"http://cdn-images.deezer.com/images/cover/bb937f1e1d57f7542a64c74b13c47900/400x400-000000-80-00.jpg","year":"1998","id":"SOMLSGW131343841A7","hasSleeve":false}]');
You can view the source code here. The above code is in line 3.
For my site though, I want the titles, artists, covers, etc. to come from my MySQL database. So what I did is: when I click a button on my main site, it runs the SQL query against my database and then converts the result into a .json file:
//getdatabase.php
<?php
include 'includes/session.php';
include 'includes/header.php';
$conn = $pdo->open();
$data = $conn->query("
SELECT name as title,
artist_name as artist,
concat('http://website.com/images/', photo) as cover,
year(date_created) as year,
id,
hasSleeve
from products
where category_id = '5';
")->fetchAll(PDO::FETCH_ASSOC);
foreach($data as &$row){
    $row['hasSleeve'] = filter_var($row['hasSleeve'], FILTER_VALIDATE_BOOLEAN);
}
$json_string = json_encode($data);
$file = 'cratedigger.js/src/htdocs/data.json';
file_put_contents($file, $json_string);
$pdo->close();
header('location: cratedigger.js/lib/index.html');
?>
Afterwards, it redirects to the index of the cratedigger plugin. To retrieve the data in the .json file, I used the Fetch API in the index.js file under the src folder of the plugin. So I replaced the original code in the plugin with this:
// replaced the long line of const data = JSON.parse('[{"title":...]'); with this:
let data = [];
fetch('data.json').then(function(resp) {
    return resp.json();
})
.then(function(data) {
    console.log(data); // outputs data from my database
    return data;
});
console.log(data); // output is 0 or none
I use Node.js to build and test this plugin, and when I test it with the code I used, the crate appears but with no records in it (a blank wooden crate). In the console, the log inside the fetch callback does output the data from the JSON file, but the log outside the fetch outputs zero or nothing. I figured that fetch is asynchronous, so the second console.log didn't output any data because it didn't wait for fetch to finish.
And that's my problem. I want to replace the original code that uses a string in JSON format, and replace it with data from my database. Some of the solutions that I came up with that didn't work are:
Use await and async - This is still asynchronous, so it couldn't store my data in the variable.
XMLHttpRequest - This is mostly asynchronous too, and its synchronous mode is deprecated.
Place fetch inside a function - My problem with this is that the variable "data" used in the original code is called in other parts of the source files, for example:
function fillInfoPanel(record) {
if (record.data.title) {
titleContainer.innerHTML = record.data.title;
}
}
//or this//
cratedigger.loadRecords(data, true, () => {
bindEvents();
});
So calling a function like myData() to get the data from fetch wouldn't work, as the other code needs it to be in the data variable. I'm not sure how I'm supposed to replace the data variable used in other parts of the code with a function.
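The timing problem described above can be reproduced without any network at all; in this sketch, fakeFetch is a stand-in for the real fetch call, not the plugin's code:

```javascript
// fakeFetch resolves immediately, but .then() callbacks still run
// only after all synchronous code has finished.
const order = [];
let data = [];
const fakeFetch = () => Promise.resolve([{ title: 'So What' }]);

fakeFetch().then((json) => {
  data = json;
  order.push('then'); // runs second
});
order.push('sync');   // runs first: data is still []
console.log(order);   // ['sync'] at this point, and data.length is 0
```

This is why the second console.log in the snippet above sees an empty array: it runs before the .then() callback ever assigns to data.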
Any suggestions that might work? I don't necessarily need to use fetch to retrieve data. I'm more into HTML, PHP & CSS, and not that familiar with JavaScript, so I'm stuck here. Node.js was something I learned a week ago, so the code initially confused me. I've read articles and watched YouTube tutorials about JavaScript, JSON, etc., but they confused me more than they helped.

Using Jquery methods on Data Returned from Ajax, with out printing out the data

So I have a rather unique situation. I am using jQuery to gather some data based on two date ranges; what is returned as the Ajax response, which I store in the $data variable, is an HTML table.
Now I don't want the user to ever see this table. I want to use this jQuery plugin to download a CSV file of that table. The question is: if the table sits inside $data and can be seen via the Network tab in Chrome Dev Tools, under Response, can it be manipulated with jQuery?
In our inhouse framework, we do the following to get Ajax Data:
// The following belongs to a JS class method.
data = {
startDate : $('.startDate').val(),
endDate : $('.endDate').val()
}
CT.postSynch('report/payRollReport/downloadPayRoleReport', {data : data}, function(data){
console.log(data);
});
We pass a data object to our Ajax wrapper and call a controller with an action (in this case downloadPayRoleReport translates to ajaxDownloadPayRoleReport()), which in turn returns an HTML table that I can view via console.log(data).
I want to use the above linked plugin on data to turn this HTML table into a CSV and trigger an instant download.
Question is, can this be done?
You can create a jQuery object from the table. Then you can do anything to the jQuery object, just as if it were actually in the DOM. You could also put the table in the DOM off screen, but I think any chance you have to not touch the DOM, you should take it.
var myTable = $(data);
myTable.mySpecialTableMethodToExportToCSV();
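If you'd rather not depend on a plugin at all, the core conversion can also be done on the response string directly. A rough sketch using regexes; this is fragile for complex markup and assumes simple <tr>/<th>/<td> rows like the ones returned here:

```javascript
// Convert a simple HTML table string to CSV text (no DOM, no jQuery).
// Deliberately minimal: only handles plain <tr>/<th>/<td> markup.
function tableToCsv(html) {
  const rows = [...html.matchAll(/<tr[^>]*>(.*?)<\/tr>/gs)];
  return rows
    .map(([, row]) =>
      [...row.matchAll(/<t[hd][^>]*>(.*?)<\/t[hd]>/gs)]
        .map(([, cell]) => cell.trim())
        .join(',')
    )
    .join('\n');
}

const data = '<table><tr><th>name</th><th>hours</th></tr><tr><td>Ada</td><td>38</td></tr></table>';
console.log(tableToCsv(data)); // name,hours\nAda,38
```

For production use, parsing the string into a detached jQuery object (as above) and walking its cells is the safer choice.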

jquery- get different page elements

I want to get an element's attribute value from another HTML page.
For example, if I am in file a.html, I want to get data, such as an element's attribute value, from b.html into a.html.
I am trying to do all of this in jQuery.
Please suggest!
I read posts, but I want something like the below:
[Code of a.html]
var result = get(b.html).getTag(img).getAttribute(src) // not exact, just the idea
$("#id").append(result)
Any idea how I can achieve this?
First you will have to fetch b.html, and then you can find the attribute value, e.g.:
$("#someDivID").load("b.html", function(data){
    var value = $(data).find("#elementID").attr("attributeName");
});
// if you don't want to display the data, make the div hidden
With jQuery you can load only parts of remote pages. Basic syntax:
$('#result').load('ajax/test.html #container');
The second part of the string is a basic jQuery selector. See the jQuery documentation.
By default, selectors perform their searches within the DOM starting at the document root.
If you want to use an alternate context, you can pass it as the optional second parameter to the $() function. For example:
$('#name', window.parent.frames[0].document).attr();
Keep in mind that you can usually only make direct connections (like with $(..).load()) to pages on the same domain you're currently on, or to domains that do not have CORS restrictions (the vast majority of sites have them). If you want to load content from a cross-domain page that has CORS restrictions, you'll have to make the request through your server, have your server make the request to the other site, and then respond to your front-end script with the response.
As for this question, if you want to achieve this result without jQuery, you can use DOMParser on the response text instead to transform it into a document; then you can use DOM methods on that document to retrieve the element, parse it as desired, and insert it (or data retrieved from it) into the current page. For example:
fetch('b.html') // replace with the URL of the external page
.then(res => res.text())
.then((responseText) => {
const doc = new DOMParser().parseFromString(responseText, 'text/html');
const targetElementOnOtherPage = doc.querySelector('img');
const src = targetElementOnOtherPage.src;
document.querySelector('#id').insertAdjacentHTML('beforeend', `<img src="${src}">`);
})
.catch((err) => {
// There was an error, handle it here
});
