How to get Google search output in the Google Apps Script environment? - javascript

If I use the following function to get Google search output:
function myFunction() {
var post_url, result;
post_url = "http://www.google.com/search?q=stack+overflow";
result = UrlFetchApp.fetch(post_url);
Logger.log(result);
}
it doesn't work.
P.S.
Sorry, I had to explore some dependencies.
I took an example:
function scrapeGoogle() {
var response = UrlFetchApp.fetch("http://www.google.com/search?q=labnol");
var myRegexp = /<h3 class=\"r\">([\s\S]*?)<\/h3>/gi;
var elems = response.getContentText().match(myRegexp);
for(var i in elems) {
var title = elems[i].replace(/(^\s+)|(\s+$)/g, "")
.replace(/<\/?[^>]+>/gi, "");
Logger.log(title);
}
}
and it works. Then I began making some modifications and noticed that when there is an error in my code, I get this error:
Request failed for http://www.google.com/search?q=labnol returned code
503.
So I did some experiments without errors, and the solution works. But when I began turning it into a function in a library, it began throwing a 503 error every time!
I'm very surprised by this behavior...
Here is a short video, just for the record: https://youtu.be/Lem9eiIVY0I
P.P.S.
Oh! I've broken some rules, so Google put me on a block list.
So I ran this:
function scrapeGoogle() {
var options =
{
'muteHttpExceptions': true
}
var response = UrlFetchApp.fetch("http://www.google.com/search?q=labnol", options);
Logger.log(response);
}
and got
About this page. Our systems have detected unusual traffic from your computer network. This page checks to see if it's really you sending the requests, and not a robot. Why did this happen?
As I see it, do I have to use some special Google service to get the search output without being blocked?

You can use simple regex to extract Google search results.
var regex = /<h3 class=\"r\">([\s\S]*?)<\/h3>/gi;
var items = response.getContentText().match(regex);
Alternatively, you can use the ImportXML function in sheets.
=IMPORTXML(GOOGLE_URL, "//h3[@class='r']")
See: Scrape Google Search with Sheets
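As a rough sketch, the regex approach can be wrapped in a standalone function that takes the page HTML as a string, so it can be tried without calling UrlFetchApp at all. The function name extractTitles and the sample HTML are made up for illustration, and the h3 class="r" markup is only what the snippets above assume; the real page markup may differ.

```javascript
// Extract result titles from raw search-page HTML.
// Assumes titles are wrapped in <h3 class="r">...</h3>, as in the regex above.
function extractTitles(html) {
  var regex = /<h3 class="r">([\s\S]*?)<\/h3>/gi;
  // match() returns null when nothing is found, so fall back to an empty array
  var items = html.match(regex) || [];
  return items.map(function (item) {
    // Strip all tags first, then trim leading and trailing whitespace
    return item.replace(/<\/?[^>]+>/gi, "").replace(/(^\s+)|(\s+$)/g, "");
  });
}

// Made-up sample markup, only for trying the function out
var sampleHtml =
  '<div><h3 class="r"><a href="#"> First <b>result</b></a></h3>' +
  '<h3 class="r"><a href="#">Second result </a></h3></div>';

var titles = extractTitles(sampleHtml);
console.log(titles);
```

In Apps Script the input would come from response.getContentText(), and the function returns an empty array when the page is a block notice rather than search results.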

Related

Simple code to extract substring into a variable of GTM (Google Tag Manager)

I do not know how to code JavaScript and am not an expert in GTM. However, the code I need is very simple. It worked in an online editor, but I have been trying to get it to work in GTM and it does not validate the code.
I need to extract the email address from a long string (the variable {{Click URL}} in GTM) that contains a complete "mailto:" URL with many parameters, and keep only the short email address (without the additional parameters after the ".com?").
Just an example of this kind of url:
'mailto:information@example.com?subject=Demande%20de%20renseignements&body=Votre%20nom:%20%0A%0ANom%20du%20produit:%20%0A%0AVotre%20tel.%20si%20vous%20souhaitez%20recevoir%20un%20appel%20de%20notre%20part:%20%0A%0AVotre%20demande%20de%20renseignements:%20%0A'
Here is the code,
let shortmailto2 = {{Click URL}},
let fin = shortmailto2.indexOf('?'),
let debut = shortmailto2.indexOf(':'),
let shortmailto = shortmailto2.slice(debut+1,fin);
It pulls out the right email address when I test it in an online editor, but when I insert it into GTM (and use a pre-existing variable, the "Click URL") I get an error (see the Monosnap link below for the screenshot): https://monosnap.com/file/eBFYfEwLv9LrPwGrGl6rzaHCbmoeYj
Thanks!
GTM Custom JavaScript Variables:
This field should be a JavaScript function that returns a value using the 'return' statement. If the function does not explicitly return a value, it will return undefined and your container may not behave as expected. Below is an example of this field:
function() {
var now = new Date();
return now.getTime();
}
The following worked for me when I tested it, returning just the email address.
function() {
var shortmailto2 = {{Click URL}};
var fin = shortmailto2.indexOf('?');
var debut = shortmailto2.indexOf(':');
return shortmailto2.slice(debut+1,fin);
}
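One edge case worth noting: if the URL has no "?" at all, indexOf returns -1 and slice(debut+1, -1) drops the last character of the address. A minimal sketch of a more defensive version follows, written as a plain named function so it can be tested outside GTM. The name extractEmail is hypothetical; inside GTM the body would go in an anonymous function that reads {{Click URL}}.

```javascript
// Return the address part of a mailto: URL, without any ?subject=... parameters.
// extractEmail is a hypothetical helper name, not part of GTM.
function extractEmail(clickUrl) {
  // Position just after the first ':' (0 if there is no ':')
  var start = clickUrl.indexOf(':') + 1;
  var end = clickUrl.indexOf('?');
  // No query string: keep everything up to the end of the URL
  if (end === -1) {
    end = clickUrl.length;
  }
  return clickUrl.slice(start, end);
}

var withParams = extractEmail('mailto:information@example.com?subject=Demande');
var withoutParams = extractEmail('mailto:information@example.com');
console.log(withParams, withoutParams);
```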

Google Apps Script JSON.Parse causing 'could not connect to server' on debug

I'm by no means what I would call a "developer" but dabble quite a bit. I'm working on some Apps Script code to query an API and push the results into SQL. I have most of the bits working, but I've noticed that while I'm debugging in the Apps Script editor, when I step into the following line of code, the editor throws the "could not connect to server" message at the top.
var response = UrlFetchApp.fetch(clientApiURL,options);
var resultSet = JSON.parse(response.getContentText()); // <-- this is the line that is crashing the IDE
Does anyone know how to better debug this? When I'm not debugging it, the code seems to behave and function properly. But with this API, not all objects are formatted the same way, so I like to use the debugger to inspect them. I can't do that when the editor crashes.
Any help/insight would be super appreciated. I've also pasted below the value of response.getContentText()
{"result":{"lead":[{"id":"332","accountID":null,"ownerID":null,"companyName":"","title":null,"firstName":"RYAN","lastName":"CAVANAUGH","street":null,"city":null,"country":null,"state":null,"zipcode":null,"emailAddress":"email#here.com","website":null,"phoneNumber":null,"officePhoneNumber":null,"phoneNumberExtension":null,"mobilePhoneNumber":null,"faxNumber":null,"description":null,"campaignID":"789934082","trackingID":"202003_5e6fa18a87853a69eb306910","industry":null,"active":"1","isQualified":"1","isContact":"1","isCustomer":"1","status":"4","updateTimestamp":"2020-05-08 20:24:48","createTimestamp":"2020-05-03 20:23:29","leadScoreWeighted":"23","leadScore":"26","isUnsubscribed":"0","leadStatus":"customer","persona":"","product_5e554b933fb5b":""}]},"error":null,"id":"5222020","callCount":"215","queryLimit":"50000"}
This will reproduce the error:
function test(){
var obj={"result":{"lead":[{"id":"332","accountID":null,"ownerID":null,"companyName":"","title":null,"firstName":"RYAN","lastName":"CAVANAUGH","street":null,"city":null,"country":null,"state":null,"zipcode":null,"emailAddress":"email#here.com","website":null,"phoneNumber":null,"officePhoneNumber":null,"phoneNumberExtension":null,"mobilePhoneNumber":null,"faxNumber":null,"description":null,"campaignID":"789934082","trackingID":"202003_5e6fa18a87853a69eb306910","industry":null,"active":"1","isQualified":"1","isContact":"1","isCustomer":"1","status":"4","updateTimestamp":"2020-05-08 20:24:48","createTimestamp":"2020-05-03 20:23:29","leadScoreWeighted":"23","leadScore":"26","isUnsubscribed":"0","leadStatus":"customer","persona":"","product_5e554b933fb5b":""}]},"error":null,"id":"5222020","callCount":"215","queryLimit":"50000"}
var resultSet = JSON.parse(obj);
}
Taking a look at the problem:
function test(){
var obj={"result":{"lead":[{"id":"332","accountID":null,"ownerID":null,"companyName":"","title":null,"firstName":"RYAN","lastName":"CAVANAUGH","street":null,"city":null,"country":null,"state":null,"zipcode":null,"emailAddress":"email#here.com","website":null,"phoneNumber":null,"officePhoneNumber":null,"phoneNumberExtension":null,"mobilePhoneNumber":null,"faxNumber":null,"description":null,"campaignID":"789934082","trackingID":"202003_5e6fa18a87853a69eb306910","industry":null,"active":"1","isQualified":"1","isContact":"1","isCustomer":"1","status":"4","updateTimestamp":"2020-05-08 20:24:48","createTimestamp":"2020-05-03 20:23:29","leadScoreWeighted":"23","leadScore":"26","isUnsubscribed":"0","leadStatus":"customer","persona":"","product_5e554b933fb5b":""}]},"error":null,"id":"5222020","callCount":"215","queryLimit":"50000"}
var resultSet = JSON.parse(JSON.stringify(obj));
}
But the point of parse is to return an object from a string, not from an object.
I do see the problem, though: 'Cannot connect to the server'.
I found that this does seem to work now:
function test(){
var obj='{"result":{"lead":[{"id":"332","accountID":"","ownerID":"","companyName":"","title":"","firstName":"RYAN","lastName":"CAVANAUGH","street":"","city":"","country":"","state":"","zipcode":"","emailAddress":"email#here.com","website":"","phoneNumber":"","officePhoneNumber":"","phoneNumberExtension":"","mobilePhoneNumber":"","faxNumber":"","description":"","campaignID":"789934082","trackingID":"202003_5e6fa18a87853a69eb306910","industry":"","active":"1","isQualified":"1","isContact":"1","isCustomer":"1","status":"4","updateTimestamp":"2020-05-08 20:24:48","createTimestamp":"2020-05-03 20:23:29","leadScoreWeighted":"23","leadScore":"26","isUnsubscribed":"0","leadStatus":"customer","persona":"","product_5e554b933fb5b":""}]},"error":"","id":"5222020","callCount":"215","queryLimit":"50000"}';
var resultSet = JSON.parse(obj);
var end="is near";//I just put this here to have a place to stop with debugger running
}
I replaced all of the nulls with "".
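If editing the response text by hand is not practical, the same substitution can probably be done at parse time with the optional reviver argument of JSON.parse. This is only a sketch of that idea: the helper name parseWithoutNulls and the shortened sample string are made up here, and whether it avoids the debugger crash in the Apps Script editor has not been verified.

```javascript
// Parse JSON text and replace every null value with an empty string.
// parseWithoutNulls is a hypothetical helper name.
function parseWithoutNulls(text) {
  return JSON.parse(text, function (key, value) {
    // The reviver is called for every key/value pair, innermost first
    return value === null ? "" : value;
  });
}

// Shortened, made-up sample in the same shape as the response above
var sample = '{"result":{"lead":[{"id":"332","accountID":null,"firstName":"RYAN"}]},"error":null}';
var resultSet = parseWithoutNulls(sample);
console.log(resultSet.result.lead[0]);
```

In the original code the input would be response.getContentText() instead of the sample string.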

Wikipedia API working intermittently with getJSON calls

I am having some trouble using the Wikipedia API. I use .getJSON to return the data, but it only works intermittently for some reason. I am not sure if there's a limit on the searches we can perform or if it's something else. The page just crashes after a series of hits and misses.
I have cleared all of my cache stored in Chrome, but the issue still persists. Please help, thank you.
function wiki_api(val) {
$.getJSON("https://en.wikipedia.org/w/api.php?action=opensearch&search="+val+"&limit=5&format=json&callback=?", function(data) {
//console.log(search+"2");
console.log("runs");
console.log(data);
});
}
function ClickEvent() {
var temp;
console.log("before");
var dummy = document.getElementById("test").value;
var search = dummy.split(" ").join("+");
console.log(search);
wiki_api(search);
console.log("ran");
}
link to codepen: https://codepen.io/xguardx/pen/BwJMXx?editors=0011

Search Increasing URL For Text

For stekhn, here's the proper link: var location = "http://www.roblox.com/Trade/inventoryhandler.ashx?filter=0&userid=" + i + "&page=1&itemsPerPage=14";
I'm trying to create a JavaScript script that can search through a user's inventory, detect whether they have what I'm looking for, and output the user ID if they do.
If I type in bluesteel, I need a script that will search through http://snackyrite.com/site.ashx?userid=1 and detect whether the text 'bluesteel' is on it. If it is, I need it to display the user ID, which is 1.
You may be thinking that's easy and I can easily find a script for that. Well, there's a catch: my objective isn't only to search userid=1, I need to search from userid=1 up to userid=45356.
If the word 'bluesteel' is found on userid=5, userid=3054 and userid=12 (these are just examples), I need it to display 5, 3054 and 12 (the IDs) on the same page the script was run from.
This is the script I've tried, but the userid won't increase (I'm not sure how to do that).
var location = http://snackyrite.com/site.ashx?userid=1;
if(location.indexOf("bluesteel") > -1) {
output.userid
}
I do apologize, Javascript isn't my best.
Use a loop:
for (var i = 1; i <=45356; i++) {
var loc = "http://snackyrite.com/site.ashx?userid="+i;
// get contents of location
if (contents.indexOf("bluesteel") > -1) {
console.log(i);
}
}
Since getting the contents will presumably use AJAX, the if will probably be in the callback function. See Javascript infamous Loop issue? for how to write the loop so that i will be preserved in the callback function.
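A minimal sketch of that closure pattern follows. To keep it runnable without a network, getContents is a hypothetical stand-in for the AJAX call: it queues each callback and the queue is flushed after the loop, which mimics responses arriving later. The page contents in fakePages are made up, and the loop is cut to six users.

```javascript
// Hypothetical stand-in for an AJAX request: callbacks are queued, not run immediately
var queue = [];
var fakePages = { 3: "sword bluesteel shield", 5: "bluesteel" }; // made-up contents
function getContents(userId, callback) {
  queue.push(function () {
    callback(fakePages[userId] || "");
  });
}

var found = [];

// Each call gets its own userId, so the callback sees the right value
// even though it runs after the loop has finished
function checkUser(userId) {
  getContents(userId, function (contents) {
    if (contents.indexOf("bluesteel") > -1) {
      found.push(userId);
    }
  });
}

for (var i = 1; i <= 6; i++) {
  checkUser(i);
}

// Simulate the responses arriving once the loop is done
queue.forEach(function (run) {
  run();
});
console.log(found);
```

If the callback read i directly instead of taking userId as a parameter, every callback would see the value i has after the loop ends, because var is function-scoped.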
This kind of web scraping can't be done in the browser (client-side JavaScript).
I would suggest building a scraper with Node.js.
Install Node.js
Install request: npm i request
Install cheerio: npm i cheerio
Create a file scraper.js
Run: node scraper.js
Code for scraper.js:
// Import the scraping libraries
var request = require("request");
var cheerio = require("cheerio");
// Array for the user IDs which match the query
var matches = [];
var lastUser = 45356;
// Number of requests that have not finished yet
var pending = lastUser;
// Check one user; the function call keeps userId correct inside the callback
function checkUser(userId) {
  var location = "http://snackyrite.com/site.ashx?userid=" + userId;
  request(location, function (error, response, body) {
    if (!error) {
      // Load the website content
      var $ = cheerio.load(body);
      var bodyText = $("body").text();
      // Search the website content for bluesteel
      if (bodyText.indexOf("bluesteel") > -1) {
        console.log("Found bluesteel in inventory of user ", userId);
        // Save the user ID, if bluesteel was found
        matches.push(userId);
      }
    // Something went wrong
    } else {
      console.log(error.message);
    }
    // Print the summary only once every request has finished
    pending--;
    if (pending === 0) {
      console.log("All users with bluesteel in inventory: ", matches);
    }
  });
}
// Do this for all possible users
for (var i = 1; i <= lastUser; i++) {
  checkUser(i);
}
The above code seems kind of complicated, but I think this is the way it should be done. Of course, you can use any other scraping tool or library.

JavaScript to save MAFF in Firefox

I am experimenting with iMacros to automate a task in Firefox. I simply want to save the current page with the MAFF extension. The JavaScript that the iMacros forum has led me to is this:
// I stuck these variables in just to try something.
var doc = "http://www.traderjoes.com";
var file = "C:\\Export\\Test.maff";
var format = "MAFF";
// I stuck these variables in just to try something.
var MafObjects = {};
Components.utils.import("resource://maf/modules/mafObjects.jsm",
MafObjects);
var jobListener = {
onJobComplete: function(aJob, aResult) {
if (!Components.isSuccessCode(aResult)) {
// An error occurred
} else {
// The save operation completed successfully
}
},
onJobProgressChange: function(aJob, aWebProgress, aRequest,
aCurSelfProgress,
aMaxSelfProgress,
aCurTotalProgress,
aMaxTotalProgress) { },
onStatusChange: function(aWebProgress, aRequest, aStatus,
aMessage) { }
};
var saveJob = new MafObjects.SaveJob(jobListener);
saveJob.addJobFromDocument(doc, file, format);
saveJob.start();
I was only getting an error on line 26 because this was sample code. With the little JavaScript I know, I tried adding some variables on the lines before the code starts. The thing is, when I search for syntax examples for the method .addJobFromDocument, I don't find much, just two results or so. Is this a JavaScript method? Usually, with things from the DOM, you find a great deal of information.
Does anybody know a way to automate saving the currently open tab in Firefox as MAFF and then closing the browser? iMacros is something I came across, and I'm glad to see its features, but really I just want to automate, from a command line, saving a URL as a MAFF archive. The doc (which I got from the iMacros forum) also had these code snippets, but I don't have much idea how to use them. Thanks
var fileUri = Components.
classes["@mozilla.org/network/io-service;1"].
getService(Components.interfaces.nsIIOService).
newFileURI(file);
var persistObject = new MafObjects.MafArchivePersist(null, format);
persistObject.saveDocument(doc, fileUri);
Also:
var doc = gBrowser.contentDocument;
var file = Components.classes["@mozilla.org/file/local;1"].
createInstance(Components.interfaces.nsILocalFile);
file.initWithPath("C:\\My Documents\\Test.maff");
var format = "TypeMAFF";
