First, I want to parse an HTML page and fetch some lines from it
with Google Apps Script, but it shows this error:
"The element type "link" must be terminated by the matching end-tag "</link>"."
Here is the code:
var response = UrlFetchApp.fetch(url)
var downloadContent = response.getContentText();
var doc = XmlService.parse(downloadContent);
I think this is because the page uses HTML5, which GAS can't parse as XML,
so I tried another way to parse the string
(read it line by line and keep the lines I need):
var xml = UrlFetchApp.fetch(url).getContentText();
but GAS doesn't have a Scanner, so how can I do this?
In fact, I want to go to this URL, "https://www.ptt.cc/bbs/gossiping/index.html",
and fetch the information in
<div class="r-ent">
...
</div>
Google Apps Script is JavaScript, so you can use the split() method to split the text content into lines at the newline character.
var text = UrlFetchApp.fetch(url).getContentText();
var lines = text.split(/\r?\n/);
Logger.log(lines);
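To keep only the lines you need, the resulting array can then be filtered. This is a minimal sketch, assuming the lines of interest can be recognised by a regular expression; the helper name keepMatchingLines and the sample markup are made up for illustration and are not taken from the PTT page.

```javascript
// Hypothetical helper: split text into lines and keep those matching `pattern`.
// Pass a regex without the "g" flag, because test() on a global regex keeps state.
function keepMatchingLines(text, pattern) {
  return text.split(/\r?\n/).filter(function (line) {
    return pattern.test(line);
  });
}

// Made-up sample input standing in for the fetched page content.
var sample = '<div class="r-ent">\r\n<div class="title">first</div>\n</div>\n<p>other</p>';
var kept = keepMatchingLines(sample, /class="title"/);
```

In Apps Script, the sample string would be replaced by UrlFetchApp.fetch(url).getContentText(), and the result can be inspected with Logger.log(kept).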
Related
I'm writing a program to scrape HTML text contained within a string variable and pick up all instances of text such as <h2>Example</h2>, for both h2 and h3 headers. I figured the best way to do this would be using RegExp, but I'm not exactly sure what the syntax for this should be. I'm implementing this within Google Apps Script and have the following code thus far for this function (I've omitted the url).
function scraper(){
var mainSheet =
SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Sheet1");
var url = "";
var xml = UrlFetchApp.fetch(url).getContentText();
var re = new RegExp();
}
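One possible shape for the regular expression, as a sketch only: it assumes the headers are well-formed and not nested inside each other, and the helper name extractHeaders and the sample markup are hypothetical. Regex-based HTML scraping is fragile, so treat it as a starting point.

```javascript
// Hypothetical helper: collect the text of every <h2> and <h3> element in a string.
function extractHeaders(html) {
  // ([23]) captures the level, \1 makes the closing tag match it,
  // and [\s\S]*? is a lazy match that also spans line breaks.
  var re = /<h([23])\b[^>]*>([\s\S]*?)<\/h\1>/gi;
  var results = [];
  var match;
  while ((match = re.exec(html)) !== null) {
    // Strip any nested tags and trim surrounding whitespace.
    results.push(match[2].replace(/<[^>]+>/g, '').trim());
  }
  return results;
}

// Made-up sample markup.
var headers = extractHeaders('<h2 class="a">First</h2><p>x</p><h3>Second <b>bold</b></h3><h4>no</h4>');
```

Inside scraper(), the same function could be called on the xml variable and the returned array written to the sheet.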
When sending input from a multiline textbox with an xhttp request through JavaScript, Chrome blocks the request because of the new lines, as part of some new exploit prevention. I have tried using encodeURI, which did nothing; sending the request still causes this error. I am allowing users to submit HTML through the textbox.
Edit:
JavaScript code:
var taskid = 'task=' + notes;
var cont = '&content=' + po.value;
var head = '&head=' + pp.value;
var comb = taskid+cont+head;
var nlink = 'create.note.php?'+comb;
var encoded = encodeURI(nlink);
xhttp.open('GET', encoded + comb, true);
Chrome's response:
[Deprecation] Resource requests whose URLs contained both removed whitespace
(`\n`, `\r`, `\t`) characters and less-than characters (`<`) are blocked.
Please remove newlines and encode less-than characters from places like
element attribute values in order to load these resources. See
https://www.chromestatus.com/feature/5735596811091968 for more details.
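Note that the code above passes encoded + comb to xhttp.open, so the raw, unencoded comb string is appended again after the encoded URL, and its newlines and "<" characters still reach the request. A minimal sketch of one way around this, assuming the same create.note.php endpoint and parameter names as in the question: encode each value with encodeURIComponent (which, unlike encodeURI, also escapes "&" and "=") and pass only the encoded URL. The helper name buildNoteUrl and the sample values are hypothetical.

```javascript
// Hypothetical helper: build the request URL with every value encoded separately.
function buildNoteUrl(task, content, head) {
  return 'create.note.php?task=' + encodeURIComponent(task) +
         '&content=' + encodeURIComponent(content) +
         '&head=' + encodeURIComponent(head);
}

// Made-up sample values containing a newline, "<" and "&".
var noteUrl = buildNoteUrl('7', '<p>line one</p>\nline two', 'A & B');

// In the page, pass only the encoded URL, without appending the raw string:
// xhttp.open('GET', noteUrl, true);
```

For long HTML content, sending the data in a POST body may be a better fit than a GET query string.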
I am using ColdFusion to connect to and execute methods from a Web Service. I store the contents of the returned XML string in a ColdFusion array, then I convert the ColdFusion array into a JavaScript array so that I can populate the content of my HTML document.
My problem arises when trying to add a photo to an unordered list called "agent_photo_list", specifically when I call the .setAttribute method. It seems to involve the 'src' parameter. The JavaScript code works as I expect when it is not inside the cfscript tag and WriteOutput method. I have researched the problem, but I haven't been able to find a reference that is sufficiently similar, and I am still having trouble understanding what my problem is. I have included my code below:
<cfscript>
WriteOutput('
<script language = "JavaScript">
var #ToScript(array, "jsArray")#
var agent = jsArray[0];
document.getElementById("output").innerHTML = agent.firstname + " " + agent.lastname;
var imgurl = "_images/agentphoto.jpg";
var node = document.createElement("LI");
var imgnode = (document.createElement("IMG"));
imgnode.setAttribute('src', "imgurl");
node.appendChild(imgnode);
document.getElementById("agent_photo_list").appendChild(node);
</script>
')
</cfscript>
I am using a jpg file located in my _images folder for testing purposes; I will later change it to agent.photourl.
The error I get is provided below:
Invalid CFML construct found on line 117 at column 35.ColdFusion was
looking at the following text:<p>src</p><p>The CFML
compiler was processing:<ul><li>An expression beginning
with WriteOutput, on line 111, column 17.This message is usually
caused by a problem in the expressions structure.<li>A script
statement beginning with WriteOutput on line 111, column
17.<li>A cfscript tag beginning on line 102, column 10.</ul> The specific sequence of files included or processed is: C:\inetpub\wwwroot\webservice.cfm, line: 117
I am curious as to why my JavaScript is functional inside the cfscript tag until calling the setAttribute method, and why it is functional outside the cfscript tag.
I will appreciate your insight. Thank you.
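You need to wrap src in double quotes (""). The whole script block is a single-quoted string passed to WriteOutput, so the single quote in 'src' ends that string early and the CFML compiler then fails on src. Also, add the ";" after the closing parenthesis of WriteOutput. The code below should work for you.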
<cfscript>
WriteOutput('
<script language = "JavaScript">
var #ToScript(array, "jsArray")#
var agent = jsArray[0];
document.getElementById("output").innerHTML = agent.firstname + " " + agent.lastname;
var imgurl = "_images/agentphoto.jpg";
var node = document.createElement("LI");
var imgnode = (document.createElement("IMG"));
imgnode.setAttribute("src", imgurl); // pass the variable, not the literal text "imgurl"
node.appendChild(imgnode);
document.getElementById("agent_photo_list").appendChild(node);
</script>
');
</cfscript>
I want to find and replace text in an HTML document between tags, say inside the <title> tags. For example,
var str = "<html><head><title>Just a title</title></head><body>Do nothing</body></html>";
var newTitle = "Updated title information";
I tried using parseXML() in jQuery (example below), but it is not working:
var doc= $($.parseXML(str));
doc.find('title').text(newTitle);
str=doc.text();
Is there a different way to find and replace text inside HTML tags? Regex, or maybe using replaceWith() or something similar?
I did something similar in a question earlier today using regexes:
str = str.replace(/<title>[\s\S]*?<\/title>/, '<title>' + newTitle + '<\/title>');
That should find and replace it. [\s\S]*? means [any character including space and line breaks]any number of times, and the ? makes the asterisk "not greedy," so it will stop (more quickly) when it finds </title>.
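One caveat with a string replacement: sequences such as $& or $1 in the replacement text are expanded by replace(). A small sketch that avoids this by passing a function as the replacement, whose return value is inserted literally; the helper name replaceTitle and the sample title are made up for illustration.

```javascript
// Hypothetical helper: replace the contents of the first <title> element.
function replaceTitle(html, newTitle) {
  // The return value of a replacement function is inserted as-is,
  // so "$&" or "$1" inside newTitle is not treated as a special pattern.
  return html.replace(/<title>[\s\S]*?<\/title>/, function () {
    return '<title>' + newTitle + '</title>';
  });
}

var updated = replaceTitle(
  '<html><head><title>Just a title</title></head><body>Do nothing</body></html>',
  'Save $& more'
);
```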
You can also do something like this:
var doc = $($.parseXML(str));
doc.find('title').text(newTitle);
// get your new data back to a string
str = (new XMLSerializer()).serializeToString(doc[0]);
Here is a fiddle: http://jsfiddle.net/Z89dL/1/
This can also be done with plain string operations, in the style of PHP's stristr(haystack, needle, before_needle). JavaScript has no built-in stristr, but indexOf() and slice() do the same job. First, you need to get the head of the document using $('head'), then get the contents using .html().
For the sake of the answer, let's store $('head').html() in a var called head. First, let's get everything before the title with head.slice(0, head.indexOf('<title>')), and everything from the closing tag onward with head.slice(head.indexOf('</title>')), and store them in vars called before and after, respectively. Now, the final line is simple:
$('head').html(before + "<title>" + newTitle + after);
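The before/after split can be written with the built-in indexOf() and slice() string methods, since stristr itself is a PHP function rather than a JavaScript one. A minimal sketch, assuming the markup contains a lowercase <title> element (indexOf is case-sensitive, unlike stristr); the helper name swapTitle is hypothetical.

```javascript
// Hypothetical helper: swap the title using only indexOf() and slice().
function swapTitle(head, newTitle) {
  var start = head.indexOf('<title>');
  var end = head.indexOf('</title>');
  if (start === -1 || end === -1) {
    return head; // no title element found; leave the markup unchanged
  }
  var before = head.slice(0, start); // everything before <title>
  var after = head.slice(end);       // from </title> to the end of the string
  return before + '<title>' + newTitle + after;
}

// Made-up sample markup.
var swapped = swapTitle('<meta charset="utf-8"><title>Old</title><link rel="x">', 'New');
```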
Inside of Hyperion Reporting Studio I have a document level script where I wish to call a batch file and pass arguments to the batch file.
Here is the code I have:
var Path = "W:\\directory\\Reference_Files\\scripts\\vbs\\SendEmail.bat"
var Email = "my.email#xxx.com"
var Subject = "My Subject"
var Body = "My Body"
var Attach = "W:\Maughan.xls"
Application.Shell(Path + " " + Email + " " + Subject + " " + Body + " " + Attach)
This code does not open the file, but gives the error message The filename, directory name, or volume label syntax is incorrect.
If I pass Path by itself, my bat file runs (giving me a warning because no parameters are passed), and when I run the same command from the command prompt, it works flawlessly.
Can anyone provide any insight into the correct syntax to pass into the Application.Shell method so that it reads my parameters and passes them to the batch file? I have been searching high and low online to no avail.
Because var Attach = "W:\Maughan.xls" should be var Attach = "W:\\Maughan.xls".
Within a string, the escape character \ just escapes the next character, so Attach will contain just W:Maughan.xls. To get a literal backslash you need to write it twice: \\.
Update:
It may make no difference in this particular case, because W:Maughan.xls means to look for Maughan.xls in the current directory on drive W, which is most likely \.
But what is definitely important is the quotes around the parameters Subject and Body. In your code the constructed command is
W:\directory\Reference_Files\scripts\vbs\SendEmail.bat my.email#xxx.com My Subject My Body W:Maughan.xls
I'm sure that the bat file cannot distinguish between the subject and the body (unless it expects exactly two words in each of them), so the right command is most likely
W:\directory\Reference_Files\scripts\vbs\SendEmail.bat my.email#xxx.com "My Subject" "My Body" W:\Maughan.xls
and you can check it by running the command above in cmd.
To construct it the parameters should be modified as follows:
var Path = "W:\\directory\\Reference_Files\\scripts\\vbs\\SendEmail.bat"
var Email = "my.email#xxx.com"
var Subject = "\"My Subject\""
var Body = "\"My Body\""
var Attach = "W:\\Maughan.xls"
(this correction was inspired by impinball's answer)
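Instead of embedding the quotes in each variable, the command line can also be assembled by a small helper. This is only a sketch: the names quoteArg and buildCommand are hypothetical, it assumes arguments contain no double quotes of their own, and it uses plain loops because it is not known which array methods the Hyperion script engine supports.

```javascript
// Hypothetical helper: wrap an argument in double quotes when it contains whitespace.
function quoteArg(arg) {
  return /\s/.test(arg) ? '"' + arg + '"' : arg;
}

// Hypothetical helper: join the program path and its arguments into one command line.
function buildCommand(path, args) {
  var command = quoteArg(path);
  for (var i = 0; i < args.length; i++) {
    command += ' ' + quoteArg(args[i]);
  }
  return command;
}

var command = buildCommand(
  'W:\\directory\\Reference_Files\\scripts\\vbs\\SendEmail.bat',
  ['my.email#xxx.com', 'My Subject', 'My Body', 'W:\\Maughan.xls']
);
// Application.Shell(command);
```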
Try putting an escaped quote on either side of the variable values. Depending on where the directory is, that may make a difference. The outside quotes in strings aren't included in the string values in JavaScript. Here's an example of what I'm talking about:
var Path = "\"W:\\directory\\Reference_Files\\scripts\\vbs\\SendEmail.bat\""
instead of
var Path = "W:\\directory\\Reference_Files\\scripts\\vbs\\SendEmail.bat"