Google email extraction script - Exceeded maximum execution time (No matter what) - javascript

I am using the script below to extract email addresses from Gmail based on some search criteria and output them to a Google spreadsheet. Functionally, the script works and does what I want it to do.
However, I am constantly getting "Exceeded maximum execution time" when I run the script as the maximum execution time for gmail scripts appears to be five minutes. I have tested this with a smaller label in gmail with a handful of emails and the script runs successfully and outputs emails as expected. However when I attempt to extract anything in larger batches with more emails the script cannot finish.
This script is cobbled together from other code I found on the web. I have attempted to fix the timeout by wrapping the for loop in a try block and sending the script to sleep in the catch block, so that it could pause execution and not exceed the time limit; however, this did not work. I have also tried other ways of sending the script to sleep to prevent the timeout from occurring, but these were unsuccessful.
Could someone help me in preventing the time out from occurring or else use some more efficient way of searching through email threads to grab the emails out?
Edit: I have amended the code with the suggestions added, however it still cannot complete without reaching execution time limit. Any ideas why the script is not pausing? I have also attempted to search just one message using GmailApp.search(search, 0, 1) however the script will not complete when I search my Inbox.
function extractEmailAddresses() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var userInputSheet = ss.getSheets()[0];
var labelName = userInputSheet.getRange("B2").getValue();
var keyword = userInputSheet.getRange("C2").getValue();
var sheetName = "Label: " + labelName;
var sheet = ss.getSheetByName (sheetName) || ss.insertSheet (sheetName, ss.getSheets().length);
sheet.clear();
var messageData = [];
var search = "label:" + labelName + " is:unread " + keyword;
// Process 50 Gmail threads in a batch to prevent script execution errors
var threads = GmailApp.search(search, 1, 1);
var messages, from, email, subject, mailDate;
try {
for (var x=0; x<threads.length; x++) {
var message = threads[x].getMessages()[0]; //Get message for thread
from = message.getFrom();
mailDate = message.getDate();
from = from.match(/\S+@\S+\.\S+/g);
if ( from && from.length ) {
email = from[0];
email = email.replace(">", "");
email = email.replace("<", "");
//push emails to array
messageData.push ([email, mailDate]);
}
}
}
catch (e) {
//Pause script to prevent exceeded timeout error
Logger.log(e.toString());
Utilities.sleep(5000);
}
//Adding our emails to the spreadsheet
sheet.getRange (1, 1, messageData.length, 2).setValues (messageData);
}

Two quick ideas, I've not looked at it in detail and don't know the API well:
Don't call getMessages twice:
Replace:
from = threads[x].getMessages()[0].getFrom();
mailDate = threads[x].getMessages()[0].getDate();
with:
var message = threads[x].getMessages()[0];
from = message.getFrom();
mailDate = message.getDate();
Avoid using setValue on each iteration: Single cell updates to the spreadsheet will be slow, as each time will involve communicating with Sheets infrastructure and committing a value. Instead, build up a bigger array of cell values to change and set it all at once using setValues(Object[][])
Those are just thoughts from a quick glance, sorry.
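The second suggestion could be sketched roughly like this. It is only a minimal sketch, assuming the sender strings look like "Name <user@example.com>"; extractEmail and buildRows are hypothetical helper names, not part of the Gmail or Sheets services. The idea is to collect every row in memory first and touch the spreadsheet once at the end.

```javascript
// Hypothetical helper: pull the first email-looking token out of a "From" header.
function extractEmail(from) {
  var match = String(from).match(/[^\s<>]+@[^\s<>]+\.[^\s<>]+/);
  return match ? match[0] : null;
}

// Hypothetical helper: turn [{from, date}, ...] into a 2D array of [email, date] rows.
function buildRows(messages) {
  var rows = [];
  for (var i = 0; i < messages.length; i++) {
    var email = extractEmail(messages[i].from);
    if (email) {
      rows.push([email, messages[i].date]);
    }
  }
  return rows;
}

// In Apps Script the rows would then be written in one call, for example:
//   if (rows.length) sheet.getRange(1, 1, rows.length, 2).setValues(rows);
```

Skipping the write when there are no rows also avoids asking for a range with zero rows.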

Related

Remove Importxml formula immediately after scrape data in every active cell in Google Sheets

I use this script to scrape data from any website in every 15 minute. I want to make this script auto remove Importxml formula and keep value only, but yet still can't achieve it.
function fetchData (){
var wrkBk= SpreadsheetApp.getActiveSpreadsheet();
var wrkSht= wrkBk.getSheetByName("Sheet1");
var url= "https://coinmarketcap.com/currencies";
for (var i= 2;i <=6;i++)
{
var coin= wrkSht.getRange('A' + i).getValue();
var formula = "=IMPORTXML(" + String.fromCharCode(34) + url + "/" + coin + String.fromCharCode(34) + "," + String.fromCharCode(34)+"//span[@class='cmc-details-panel-price__price']"+ String.fromCharCode(34)+")";
wrkSht.getRange('C' + i).activate();
wrkSht.getActiveRangeList().clear({contentsOnly: true, skipFilteredRows: true});
wrkSht.getRange('C'+i).setFormula(formula);
Utilities.sleep(1000);
}}
I tried putting the following code before Utilities.sleep(1000);, but still without success.
First try
var range = wrkSht.getRange('C'+i);
range.copyTo(range, {contentsOnly: true});
Second try
var range = wrkSht.getCurrentCell();
range.copyTo(range, {contentsOnly: true});
This is my Google Spreadsheet
https://docs.google.com/spreadsheets/d/1vykBSNJQ9xO23jA1ZT8fQAjfmtUQOQTqzQXFfCqz8oQ/edit?usp=sharing
I hope someone can help me. Thank you.
By default, Google Apps Script may hold back the changes made by the code until the execution ends. Use SpreadsheetApp.flush() to force the changes to be applied before doing the copy/paste as values only operation.
Instead of
var range = wrkSht.getCurrentCell();
Use
SpreadsheetApp.flush(); // Forces the previous changes (adding the formula) to be applied
Utilities.sleep(30000); // Waits for the spreadsheet to recalculate (IMPORTXML imports the data)
var range = wrkSht.getDataRange(); // Use this if you want to paste the whole sheet as values
Instead of sleep you could use a loop to poll the spreadsheet until the spreadsheet is recalculated.
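A polling loop of that kind might look like the sketch below. pollUntil is a hypothetical helper, and the reader, readiness check and sleep function are passed in so the loop itself does not depend on any Apps Script service. The 'Loading...' placeholder in the usage comment is an assumption about what the cell shows while IMPORTXML is still fetching.

```javascript
// Hypothetical helper: call readValue() until isReady(value) is true or attempts run out.
// sleep is injected so the same loop can use Utilities.sleep in Apps Script.
function pollUntil(readValue, isReady, sleep, maxAttempts, delayMs) {
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    var value = readValue();
    if (isReady(value)) {
      return { ready: true, value: value, attempts: attempt };
    }
    if (attempt < maxAttempts) {
      sleep(delayMs); // wait before the next read, but not after the last one
    }
  }
  return { ready: false, value: null, attempts: maxAttempts };
}

// Assumed Apps Script usage (not runnable outside Apps Script):
//   var cell = wrkSht.getRange('C2');
//   var result = pollUntil(
//     function () { SpreadsheetApp.flush(); return cell.getDisplayValue(); },
//     function (v) { return v !== '' && v !== 'Loading...'; },
//     function (ms) { Utilities.sleep(ms); },
//     10, 3000);
```

Capping the number of attempts keeps the total wait bounded, which matters given the execution time limit.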
NOTE: Whenever possible, we should avoid calling Google Apps Script classes and methods inside loops, because they are (extremely?) slow and the execution time limit is small for free accounts (6 mins) and not so big for G Suite accounts (30 mins). The official docs explain this, and there are several questions about it here.
Resources
Best Practices | Google Apps Script

Google Scripts Trigger not firing

I'm struggling to get my script to auto-run at 6AM (ish). I have the trigger set up to run this script, "Time-Driven", on a "day timer" between "6-7 am". I'm getting no failure notifications (set up to email to me immediately), but the script isn't running. It works exactly as I want it to when I manually run it, but the whole point was to automate it, so I'm not sure what I am doing wrong here. I looked up other instances, and they seem to have been fixed by deleting and re-adding the triggers, but that doesn't solve the issue for me. Is it something in my script preventing an auto-run?
function getMessagesWithLabel() {
var destArray = new Array();
var label= GmailApp.getUserLabelByName('Personal/Testing');
var threads = label.getThreads(0,2);
for(var n in threads){
var msg = threads[n].getMessages();
var body = msg[0].getPlainBody();
var destArrayRow = new Array();
destArrayRow.push('thread has '+threads[n].getMessageCount()+' messages');
for(var m in msg){
destArrayRow.push(msg[m].getPlainBody());
}
destArray.push(destArrayRow);
}
Logger.log(destArray);
var ss = SpreadsheetApp.getActiveSpreadsheet();
var sh = ss.getActiveSheet();
if(ss.getLastRow()==0){sh.getRange(2,1).setValue('getMessagesWithLabel() RESULTS')};
sh.getRange(2,1,destArray.length,destArray[0].length).setValues(destArray);
}
I'm not 100% sure, but the reason for this could be that during a trigger there's no "ActiveSpreadsheet" if the script isn't directly linked to a spreadsheet (as the spreadsheet is closed). So you should try using:
var ss = SpreadsheetApp.openById(id); // id is the id of the spreadsheet
// https://docs.google.com/spreadsheets/d/id_is_here/
var sh = ss.getSheetByName(name); // name of the actual sheet ("Sheet 1" for example)
Otherwise I see nothing wrong with your code (other than your use of label.getThreads(0,2), which limits the number of threads fetched to 2, but I assume that's intentional).
Also, you're setting 2,1 instead of what I assume needs to be 1,1 in
if(ss.getLastRow()==0){sh.getRange(2,1).setValue('getMessagesWithLabel() RESULTS')};
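If only the spreadsheet URL is at hand, the ID could be pulled out of it with something like the sketch below. spreadsheetIdFromUrl is a hypothetical helper, and the pattern assumes the URL has the https://docs.google.com/spreadsheets/d/id_is_here/ shape shown above.

```javascript
// Hypothetical helper: extract the ID from a URL of the form
// https://docs.google.com/spreadsheets/d/<id>/...
function spreadsheetIdFromUrl(url) {
  var match = String(url).match(/\/spreadsheets\/d\/([a-zA-Z0-9_-]+)/);
  return match ? match[1] : null;
}

// Assumed Apps Script usage (not runnable outside Apps Script):
//   var ss = SpreadsheetApp.openById(spreadsheetIdFromUrl(url));
```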
The problem is due to the use of getActiveSheet as it retrieves the sheet displayed on the UI but when your time-driven trigger runs there isn't a sheet displayed on the UI.
Replace getActiveSheet with getSheetByName or, better, get the sheet by its ID (for details see Get Google Sheet by ID?)
Reference:
https://developers.google.com/apps-script/reference/spreadsheet/spreadsheet-app#getactivesheet

TypeError: Cannot find function includes in object

I have very little experience using Google Script but I was attempting to use it to search through one column of a spreadsheet and find all instances of the string "Film Dub" (knowing that there can be only one per cell).
Below is my code:
function filmDub() {
var sheet = SpreadsheetApp.getActiveSheet();
var data = sheet.getDataRange().getValues();
for (var i = 1; i < 100; i++) {
var s = data[i][2].toString();
if (s.includes('Film Dub')) {
data[5][13]++;
}
}
}
However I keep receiving the error
TypeError: Cannot find function includes in object Let's Make A Date, Film Dub, Three Headed Broadway Star, Film TV Theater Styles, Greatest Hits, World's Worst. (line 6, file "Code")
"Let's Make A Date, Film Dub, Three Headed Broadway Star, Film TV Theater Styles, Greatest Hits, World's Worst" is the correct content of data[i][2] so it is getting correct information from the spreadsheet. I have used the debugger in Google Script Editor to verify that s is a string (this was one of the solutions to similar questions here on Stack Overflow) but that didn't fix my problem. What else could be wrong with it?
You should use indexOf with a string to check for the existence of a text block.
function filmDub() {
var sheet = SpreadsheetApp.getActiveSheet();
var data = sheet.getDataRange().getValues();
for (var i = 1; i < 100; i++) {
var s = data[i][2].toString();
if (s.indexOf('Film Dub') !== -1) {
data[5][13]++;
}
}
}
You just need to enable the V8 engine for Apps Script.
Click on Run in the menu bar.
Click on Enable new Apps Script runtime powered by V8.
Then run the script.
I got the same error before; after enabling the V8 engine it works very well.
Upon further checking, double check how your sheet is formed. This is how I formed the sheet to make your code working.
[A] [B] [C]
[1]Let's Make A Date Film Dub Three Headed Broadway Star
[0] [1] [2]
Here is your code:
function filmDub() {
var sheet = SpreadsheetApp.getActiveSheet();
var data = sheet.getDataRange().getValues();
for (var i = 1; i < 10; i++) {
var s = data[i][2].toString();
//Logger.log(s);
if (s.indexOf("Film Dub")> -1) {
Logger.log("Horray");
}
}
}
Here is the result:
Hope this helps!
Make sure your Sheet is running the current version of Apps Script. When you open the script editor you should see a yellow notice at the top that says "This project is running on our new Apps Script runtime powered by Chrome V8." If not, there should be an option to enable it. The includes function is new for Sheets scripting. You'll need to enable it individually for every Sheet in which you intend to use it.
I am not an expert on Google Apps Script, but as far as I know it's JavaScript, so why not use the following JS method:
if(s.indexOf('Film Dub') > -1)
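Putting the indexOf suggestion together, a counting helper could look like the sketch below. countMatches is a hypothetical name, and treating row 0 as a header is an assumption carried over from the loop starting at 1 in the question. Looping up to data.length instead of a fixed 100 avoids a TypeError when the sheet has fewer rows.

```javascript
// Hypothetical helper: count rows whose cell in columnIndex contains needle.
// Works in the older runtime too, since it relies on indexOf rather than includes.
function countMatches(data, columnIndex, needle) {
  var count = 0;
  for (var i = 1; i < data.length; i++) { // row 0 is treated as a header
    var cell = data[i][columnIndex];
    if (cell !== undefined && cell !== null && String(cell).indexOf(needle) !== -1) {
      count++;
    }
  }
  return count;
}

// Assumed Apps Script usage (not runnable outside Apps Script):
//   var data = SpreadsheetApp.getActiveSheet().getDataRange().getValues();
//   Logger.log(countMatches(data, 2, 'Film Dub'));
```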

How to get google search output in the google application script environment?

If I use the following function to get Google search output:
function myFunction() {
var post_url, result;
post_url = "http://www.google.com/search?q=stack+overflow";
result = UrlFetchApp.fetch(post_url);
Logger.log(result);
}
doesn't work.
P.S.
Sorry, I have to explore some dependencies.
I take an example
function scrapeGoogle() {
var response = UrlFetchApp.fetch("http://www.google.com/search?q=labnol");
var myRegexp = /<h3 class=\"r\">([\s\S]*?)<\/h3>/gi;
var elems = response.getContentText().match(myRegexp);
for(var i in elems) {
var title = elems[i].replace(/(^\s+)|(\s+$)/g, "")
.replace(/<\/?[^>]+>/gi, "");
Logger.log(title);
}
}
and it works. Then I began to make some modifications and noticed that when I had an error in the code it gave me this error:
Request failed for http://www.google.com/search?q=labnol returned code 503.
So I did some experiments without errors and the solution worked. But when I began to move it into a function in a library, it began to throw a 503 error every time!
I'm very surprised by such behavior...
Here is a short video showing this: https://youtu.be/Lem9eiIVY0I
P.P.S.
Oh! I violated some rules, so Google put me on a block list,
so I ran this:
function scrapeGoogle() {
var options =
{
'muteHttpExceptions': true
}
var response = UrlFetchApp.fetch("http://www.google.com/search?q=labnol", options);
Logger.log(response);
}
and got:
About this page
Our systems have detected unusual traffic from your computer network. This page checks to see if it's really you sending the requests, and not a robot. Why did this happen?
As I see it, do I have to use some special Google service to get the search output without being blocked?
You can use simple regex to extract Google search results.
var regex = /<h3 class=\"r\">([\s\S]*?)<\/h3>/gi;
var items = response.getContentText().match(regex);
Alternatively, you can use the ImportXML function in sheets.
=IMPORTXML(GOOGLE_URL, "//h3[@class='r']")
See: Scrape Google Search with Sheets
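As a rough sketch of the regex approach, the helper below applies the same pattern to an HTML string and strips the inner tags. extractTitles is a hypothetical name, and the h3 class "r" markup is taken from the snippets above, so it is an assumption that may not match what Google serves today.

```javascript
// Hypothetical helper: collect the text of every <h3 class="r"> element in an HTML string.
function extractTitles(html) {
  var regex = /<h3 class="r">([\s\S]*?)<\/h3>/gi;
  var titles = [];
  var match;
  while ((match = regex.exec(html)) !== null) {
    // Drop nested tags, then trim surrounding whitespace.
    var text = match[1].replace(/<\/?[^>]+>/g, "").replace(/^\s+|\s+$/g, "");
    titles.push(text);
  }
  return titles;
}

// Assumed Apps Script usage (not runnable outside Apps Script):
//   var titles = extractTitles(response.getContentText());
```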

How to ping IP addresses using JavaScript

I want to run a JavaScript code to ping 4 different IP addresses and then retrieve the packet loss and latency of these ping requests and display them on the page.
How do I do this?
You can't do this from JS. What you could do is this:
client --AJAX-- yourserver --ICMP ping-- targetservers
Make an AJAX request to your server, which will then ping the target servers for you, and return the result in the AJAX result.
Possible caveats:
this tells you whether the target servers are pingable from your server, not from the user's client
so the client won't be able to test hosts on its LAN
but you shouldn't let the host check hosts on the server's internal network, if any exist
some hosts may block traffic from certain hosts and not others
you need to limit the ping count per machine:
to avoid the AJAX request from timing out
some site operators can get very upset when you keep pinging their sites all the time
resources
long-running HTTP requests could run into maximum connection limit of your server, check how high it is
many users trying to ping at once might generate suspicious-looking traffic (all ICMP and nothing else)
concurrency - you may wish to pool/cache the up/down status for a few seconds at least, so that multiple clients wishing to ping the same target won't launch a flood of pings
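The caching idea in the last caveat could be sketched on the server side roughly as follows. createPingCache is a hypothetical helper, checkHost stands for whatever function actually performs the ping and returns a promise, and the clock is injected only to make the sketch easy to exercise.

```javascript
// Hypothetical helper: wrap a promise-returning checkHost(host) so that repeated
// requests for the same host within ttlMs share one result instead of re-pinging.
function createPingCache(checkHost, ttlMs, now) {
  var entries = Object.create(null); // host -> { promise, expires }
  return function (host) {
    var t = now();
    var entry = entries[host];
    if (entry && entry.expires > t) {
      return entry.promise; // still fresh: reuse the pending or finished check
    }
    var promise = checkHost(host);
    entries[host] = { promise: promise, expires: t + ttlMs };
    return promise;
  };
}

// Assumed usage, with pingHost being your own (hypothetical) ping function:
//   var cachedPing = createPingCache(pingHost, 5000, Date.now);
//   cachedPing("192.0.2.1").then(function (status) { /* send to client */ });
```

Caching the promise rather than the result means that clients arriving while a ping is still in flight also share it.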
The only method I can think of is loading e.g. an image file from the external server. When that load fails, you "know" the server isn't responding (you actually don't know, because the server could just be blocking you).
Take a look at this example code to see what I mean:
/*note that this is not an ICMP ping - but a simple HTTP request
giving you an idea what you could do . In this simple implementation it has flaws
as Piskvor correctly points out below */
function ping(extServer){
var ImageObject = new Image();
// The image loads asynchronously, so report the result from the load/error handlers
ImageObject.onload = function(){
alert("Ping worked!");
};
ImageObject.onerror = function(){
alert("Ping failed :(");
};
ImageObject.src = "http://"+extServer+"/URL/to-a-known-image.jpg"; //e.g. logo -- mind the caching, maybe use a dynamic querystring
}
I was inspired by the latest comment, so I wrote this quick piece of code.
This is a kind of "HTTP ping", which I think can be quite useful alongside XMLHttpRequest calls, for instance to figure out which server is the fastest to use in some case, or to collect rough statistics about the speed of the user's internet connection.
This small function connects to an HTTP server on a non-existent URL (one that is expected to return a 404), measures the time until the server answers the HTTP request, and averages the cumulative time over the number of iterations.
The requested URL is modified randomly on each call, since I've noticed that (probably) some transparent proxies or caching mechanisms were faking results in some cases, giving extra-fast answers (faster than ICMP, actually, which is somewhat weird).
Make sure to use FQDNs that point to a real HTTP server!
Results will display to a body element with id "result", for instance:
<div id="result"></div>
Function code:
function http_ping(fqdn) {
var NB_ITERATIONS = 4; // number of loop iterations
var MAX_ITERATIONS = 5; // beware: the number of simultaneous XMLHttpRequest is limited by the browser!
var TIME_PERIOD = 1000; // 1000 ms between each ping
var i = 0;
var over_flag = 0;
var time_cumul = 0;
var REQUEST_TIMEOUT = 9000;
var TIMEOUT_ERROR = 0;
document.getElementById('result').innerHTML = "HTTP ping for " + fqdn + "</br>";
var ping_loop = setInterval(function() {
// let's change non-existent URL each time to avoid possible side effect with web proxy-cache software on the line
var url = "http://" + fqdn + "/a30Fkezt_77" + Math.random().toString(36).substring(7);
if (i < MAX_ITERATIONS) {
var ping = new XMLHttpRequest();
i++;
ping.seq = i;
over_flag++;
ping.date1 = Date.now();
ping.timeout = REQUEST_TIMEOUT; // it could happen that the request takes a very long time
ping.onreadystatechange = function() { // the request has returned something, let's log it (starting after the first one)
if (ping.readyState == 4 && TIMEOUT_ERROR == 0) {
over_flag--;
if (ping.seq > 1) {
var delta_time = Date.now() - ping.date1;
time_cumul += delta_time;
document.getElementById('result').innerHTML += "</br>http_seq=" + (ping.seq-1) + " time=" + delta_time + " ms</br>";
}
}
}
ping.ontimeout = function() {
TIMEOUT_ERROR = 1;
}
ping.open("GET", url, true);
ping.send();
}
if ((i > NB_ITERATIONS) && (over_flag < 1)) { // all requests are passed and have returned
clearInterval(ping_loop);
var avg_time = Math.round(time_cumul / (i - 1));
document.getElementById('result').innerHTML += "</br> Average ping latency on " + (i-1) + " iterations: " + avg_time + "ms </br>";
}
if (TIMEOUT_ERROR == 1) { // timeout: data cannot be accurate
clearInterval(ping_loop);
document.getElementById('result').innerHTML += "<br/> THERE WAS A TIMEOUT ERROR <br/>";
return;
}
}, TIME_PERIOD);
}
For instance, launch with:
fp = new http_ping("www.linux.com.au");
Note that I couldn't find a simple correlation between the figures from this script and ICMP ping results for the same servers, though HTTP response time seems to grow roughly exponentially with ICMP response time. This may be explained by the amount of data transferred in the HTTP request, which can vary depending on the web server flavour and configuration, by the speed of the server itself, and probably by other factors.
This is not very good code but I thought it could help and possibly inspire others.
The closest you're going to get to a ping in JS is using AJAX, and retrieving the readystates, status, and headers. Something like this:
url = "<whatever you want to ping>"
ping = new XMLHttpRequest();
ping.onreadystatechange = function(){
document.body.innerHTML += "</br>" + ping.readyState;
if(ping.readyState == 4){
if(ping.status == 200){
result = ping.getAllResponseHeaders();
document.body.innerHTML += "</br>" + result + "</br>";
}
}
}
ping.open("GET", url, true);
ping.send();
Of course you can also put conditions in for different http statuses, and make the output display however you want with descriptions etc, to make it look nicer. More of an http url status checker than a ping, but same idea really. You can always loop it a few times to make it feel more like a ping for you too :)
I came up with something because I was tired of searching for hours for a thing everyone calls "impossible"; the only thing I had found used jQuery.
So here is a new, simple way using vanilla JS (nothing other than plain JavaScript).
Here's my JSFiddle: https://jsfiddle.net/TheNolle/5qjpmrxg/74/
Basically, I create a variable called "start" and give it the current timestamp, then I try to set an invisible image's source to my website (which isn't an image; it can be changed to any website). Because it's not an image, it raises an error, which I use to run the second part of the code. At that point I create a new variable called "end" and give it the current timestamp (which is later than "start"). Afterwards, I simply do a subtraction ("end" minus "start"), which gives me the latency of the ping to this website.
After that, you can store the result in a variable, show it on your webpage, print it to the console, etc.
let pingSpan = document.getElementById('pingSpan');
// Remove all the way to ...
let run;
function start() {
run = true;
pingTest();
}
function stop() {
run = false;
setTimeout(() => {
pingSpan.innerHTML = "Stopped !";
}, 500);
}
// ... here
function pingTest() {
if (run == true) { //Remove line
let pinger = document.getElementById('pingTester');
let start = new Date().getTime();
pinger.setAttribute('src', 'https://www.google.com/');
pinger.onerror = () => {
let end = new Date().getTime();
// Change to whatever you want it to be, I've made it so it displays on the page directly, do whatever you want but keep the "end - start + 'ms'"
pingSpan.innerHTML = end - start + "ms";
}
setTimeout(() => {
pingTest();
}, 1000);
} // Remove this line too
}
body {
background: #1A1A1A;
color: white
}
img {
display: none
}
Ping:
<el id="pingSpan">Waiting</el>
<img id="pingTester">
<br> <br>
<button onclick="start()">
Start Ping Test
</button>
<button onclick="stop()">
Stop
</button>
function ping(url){
new Image().src=url
}
Above pings the given Url.
Generally used for counters / analytics.
The client-side JavaScript is not told whether the request failed.
I suggest using "head" to request the header only.
xhr.open('head', 'asstes/imgPlain/pixel.txt' + cacheBuster(), true);
and then check for readyState 2 (HEADERS_RECEIVED): send() has been called, and headers and status are available.
xhr.onreadystatechange = function() {
if (xhr.readyState === 2) { ...
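The cacheBuster() helper used above is not shown in the answer. One hypothetical way to write it, assuming the URL it is appended to has no query string yet, is to add a parameter that changes on every call so that proxies and the browser cache cannot answer for the server:

```javascript
// Hypothetical implementation of the cacheBuster() helper referenced above:
// returns a query string that is very unlikely to repeat between calls.
function cacheBuster() {
  return "?cb=" + Date.now() + "-" + Math.random().toString(36).slice(2);
}

// Example: 'pixel.txt' + cacheBuster() gives something like 'pixel.txt?cb=<timestamp>-<random>'
```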
Is it possible to ping a server from Javascript?
Should check out the above solution. Pretty slick.
Not mine, obviously, but wanted to make that clear.
You can't PING with JavaScript. I created a Java servlet that returns a 10x10 pixel green image if the host is alive and a red image if it is dead. https://github.com/pla1/Misc/blob/master/README.md
