I am developing web development tools and I'd like to copy part of the current web page's HTML code to the clipboard using JavaScript.
This would probably involve:
Getting the piece of HTML in question using DOM innerHTML
Copying this text to the clipboard using JavaScript
Is anyone aware of any gotchas here? E.g. related to clipboard handling: when one is not using documentEditable mode, do I need to create a hidden element in which to put the HTML payload for copying?
Also if possible I'd like to make the interaction with WYSIWYG components, like TinyMCE, work so that when one pastes the HTML in the visual edit mode it comes through as formatted HTML instead of plain text.
It is enough if the solution works in Chrome and Firefox. Internet Explorer does not need to be supported.
JavaScript has no way of adding things to the clipboard, or at least none that works cross-browser.
There is, however, a Flash solution which works well: http://code.google.com/p/zeroclipboard/
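For what it's worth, in browsers that expose the `copy` event with `clipboardData` (an assumption about the target browser versions, not something the Flash route depends on), a handler can offer both an HTML and a plain-text flavor, which is what lets a WYSIWYG editor paste formatted content. A minimal sketch; `fillClipboard`, `htmlToPlainText` and the `copy-source` element id are hypothetical names:

```javascript
// Crude tag stripper, used only to build the plain-text fallback flavor.
function htmlToPlainText(html) {
  return html.replace(/<[^>]*>/g, "");
}

// Puts an HTML flavor and a plain-text flavor on the clipboard.
// Must be called from inside a "copy" event handler.
function fillClipboard(event, html) {
  event.clipboardData.setData("text/html", html);
  event.clipboardData.setData("text/plain", htmlToPlainText(html));
  // Stop the browser from replacing our data with the current selection.
  event.preventDefault();
}

// Browser-only wiring; skipped when no DOM is present.
if (typeof document !== "undefined") {
  document.addEventListener("copy", function (event) {
    var source = document.getElementById("copy-source"); // assumed element id
    if (source) {
      fillClipboard(event, source.innerHTML);
    }
  });
}
```

Note that this handler takes over every copy on the page while the assumed element exists, so a real tool would probably attach it only around its own copy action.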
We developed a small Firefox add-on to remove special characters (hyphens) when copying and pasting content from the editor. This was necessary because there is no JavaScript way to put anything into the clipboard. I guess it should be possible to write an extension for Chrome too (Google is your friend here). From my point of view, this seems to be the only way to get what you want.
Example:
Here is the necessary code snippet for a Firefox add-on to remove special characters on copy:
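// constructors for the XPCOM components used below
// (assumed definitions; the original snippet does not show them)
var Transferable = Components.Constructor("@mozilla.org/widget/transferable;1", "nsITransferable");
var StringComponent = Components.Constructor("@mozilla.org/supports-string;1", "nsISupportsString");
// get Clipboard Object
var clip = Components.classes["@mozilla.org/widget/clipboard;1"].createInstance(Components.interfaces.nsIClipboard);
// get Transferable Objects
var tr_unicode = new Transferable();
var tr_html = new Transferable();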
// request the flavors to read (the clipboard holds several "flavors": HTML, plain text, ...)
tr_unicode.addDataFlavor("text/unicode");
tr_html.addDataFlavor("text/html");
clip.getData(tr_unicode, clip.kGlobalClipboard); // system clipboard
clip.getData(tr_html, clip.kGlobalClipboard); // system clipboard
// generate objects to write the contents into (used for the clipboard)
var unicode = { }, ulen = { }, html = { }, hlen = { };
tr_html.getTransferData("text/html", html, hlen);
tr_unicode.getTransferData("text/unicode", unicode, ulen);
var unicode_obj = unicode.value.QueryInterface(Components.interfaces.nsISupportsString);
var html_obj = html.value.QueryInterface(Components.interfaces.nsISupportsString);
// remove soft hyphens (U+00AD) and zero-width spaces (U+200B)
var re = new RegExp('[\u200b' + String.fromCharCode(173)+ ']','g');
if (unicode_obj && html_obj)
{
var unicode_str = unicode_obj.data.replace(re, '');
var html_str = html_obj.data.replace(re, '');
// create new string components for the unicode and HTML content
var unicode_in = new StringComponent();
unicode_in.data = unicode_str;
var html_in = new StringComponent();
html_in.data = html_str;
// generate new transferable to write the data back to the clipboard
// fill html + unicode flavors
var tr_in = new Transferable();
tr_in.setTransferData("text/html", html_in, html_in.data.length * 2);
tr_in.setTransferData("text/unicode", unicode_in, unicode_in.data.length * 2);
// copy content from transferable back to clipboard
clip.setData(tr_in, null, clip.kGlobalClipboard);
}
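The character clean-up itself is independent of the add-on plumbing. A minimal standalone sketch of just that step, with `stripInvisible` as a hypothetical helper name:

```javascript
// Characters removed: U+00AD (soft hyphen) and U+200B (zero-width space).
var INVISIBLE_CHARS = /[\u00AD\u200B]/g;

// Returns `text` with soft hyphens and zero-width spaces removed.
function stripInvisible(text) {
  return text.replace(INVISIBLE_CHARS, "");
}
```

The same function can be applied to both the plain-text and the HTML flavor before they are written back.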
I need to make an update to a script that is using pdfform.js in order to take data from html inputs and pass them into a fillable pdf file.
Basically all I need to do is to update the PDF file for the year 2022.
The problem is that my new PDF doesn't have fields where I can write like the old one:
PDF that has fillable inputs:
old pdf
My new PDF:
new pdf
I tried to add fields to my new pdf using Adobe Acrobat but the script is not able to write to them. I don't know exactly how to add the fields in order to have the same reference as the old ones.
<script type="text/javascript" src="pdfform.pdf_js.dist.js"></script>
<script>
$(document).ready(function() {
  $("#descarca-pdf").on("click", function(event) {
    event.preventDefault();
    var xhr = new XMLHttpRequest();
    xhr.open("GET", "Formular-230_Habitat-for-Humanity-Romania-2.pdf", true);
    xhr.responseType = "arraybuffer";
    xhr.onload = function() {
      if (this.status == 200) {
        var pdfBuffer = this.response;
        // keys are the field names in the PDF, values are arrays of field values
        var fields = {
          cnp: [$("#form-cnp").val()],
          initiala: [$("#form-initiala").val()],
          prenume: [$("#form-prenume").val()],
          numar: [$("#form-numar").val()],
          nume: [$("#form-nume").val()],
          scara: [$("#form-scara").val()],
          etaj: [$("#form-etaj").val()],
          apartament: [$("#form-apartament").val()],
          bloc: [$("#form-bloc").val()],
          judet: [$("#form-judet").val()],
          localitate: [$("#form-localitate").val()],
          codpostal: [$("#form-codpostal").val()],
          email: [$("#form-email").val()],
          telefon: [$("#form-telefon").val()],
          strada: [$("#form-strada").val()],
          fax: [$("#form-fax").val()]
        };
        var filled = pdfform().transform(pdfBuffer, fields);
        var blob = new Blob([filled], { type: "application/pdf" });
        // trigger the download through a temporary link
        var link = document.createElement("a");
        link.href = window.URL.createObjectURL(blob);
        link.download = "Formular_230_Habitat_for_Humanity_Romania.pdf";
        link.click();
      } else {
        on_error("failed to load URL (code: " + this.status + ")");
      }
    };
    xhr.send();
  });
});
</script>
Does anyone know any tool that I can use for this?
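Judging from the script, the object passed to transform() is keyed by PDF field name, so whatever fields get added to the new PDF would presumably need exactly these names. A small sketch that builds that object from a list of names, which also makes the expected names easy to compare against the PDF; `collectFields` and `FIELD_NAMES` are hypothetical helpers, not part of pdfform.js:

```javascript
// Field names taken from the script in this question.
var FIELD_NAMES = [
  "cnp", "initiala", "prenume", "numar", "nume", "scara", "etaj",
  "apartament", "bloc", "judet", "localitate", "codpostal",
  "email", "telefon", "strada", "fax"
];

// Builds { name: [value], ... } using `readValue(name)` to obtain each value.
function collectFields(names, readValue) {
  var fields = {};
  names.forEach(function (name) {
    fields[name] = [readValue(name)];
  });
  return fields;
}

// In the page this could be called as:
//   collectFields(FIELD_NAMES, function (name) { return $("#form-" + name).val(); });
```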
As far as I can tell, you are trying to update the government Form 230, which used to look like this:
You say you need to update those fields for this year. However, the current form uses a totally different structure, like this:
Here we can see the difference between the two sets of field tags.
The published XFDF data field structures can be found at
https://static.anaf.ro/static/10/Anaf/Declaratii_R/AplicatiiDec/structura_B230_D230_2020_27032020.pdf
and https://static.anaf.ro/static/10/Anaf/Declaratii_R/AplicatiiDec/structura_B230_D230_2022_12012022.pdf
The form structure may be used for customised versions with a Java library, and each year's form has its own dedicated .jar for its structure, so it is best to use the official ANAF J Soft available online.
The sample you provide is a modified custom version that has been dumbed down to be partly flattened and partly not. In my view it would therefore not be compatible with your declaration, which must match the government XFA format. It was initially generated or modified using Adobe Forms Library code, so it has working fields in one area, but they are not in the XFA structure of the official ANAF version. They are declared as "You cannot save data...Please print...", meaning the form is simply to be printed and scanned as if it were a basic paper copy without intelligent fields, which defeats the whole purpose of a fillable form. You may just as well use a Word document or any other editor.
Having looked through many different modified D230s, including yours, they all seem to come from the same source, "Codruţ Popa", who I assume is an ANAF designer, so it is best to get a newer copy from them.
function Copy() // this function will be latched to a button later on.
{
var text = writePreview(); // this pours in the formatted string by the writePreview() function to the variable 'text'
text = br2nl(text); //variable 'text' is purified from <br/> and is replaced by a carriage return
//I need some code here to pour in the contents of the variable 'text' to the clipboard. That way the user could paste the processed data to a 3rd party application
}
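The br2nl() helper is not shown here. A minimal sketch of what it could look like, assuming it only has to turn `<br>`, `<br/>` and `<br />` (in any letter case) into newline characters:

```javascript
// Hypothetical implementation of the br2nl() helper used in Copy().
// Replaces every <br>, <br/> or <br /> tag with a newline.
function br2nl(text) {
  return text.replace(/<br\s*\/?>/gi, "\n");
}
```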
I'm building an offline client-side web application. Its main purpose is to have users enter input into fields, format the text so that it fits certain criteria, then click copy so they can paste it into a 3rd party CRM.
The only available browser for this is Google Chrome. I've scoured the internet hoping to find a simple solution for this.
I'm not concerned about security, as this application is not going to be published and is meant just for offline use.
I want to keep it as simple as possible, and adding an invisible textarea ruins the layout. Flash is not allowed in my current environment.
Look at clipboard.js
A modern approach to copy text to clipboard
No Flash. No dependencies. Just 2kb gzipped
https://clipboardjs.com/
This was solved by updating my browser (Google Chrome v49); I was using an older version (v34).
I found that later versions (v42+) of Google Chrome support document.execCommand('copy').
I hope it helps people.
Here are the functions I used:
function SelectAll(id)
{
document.getElementById(id).focus();
document.getElementById(id).select();
}
function copy()
{
SelectAll('textAreaID');
document.execCommand("Copy", false, null);
}
According to this article "In javascript, copying a value from variable to clipboard is not straightforward as there is no direct command.".
Hence, as suggested there, I did the following.
I defined the following in the HTML file, adding it at the bottom (I never noticed the element being added and removed):
<div id="container"></div>
Then in JavaScript I added:
function copyQ() {
var container = document.getElementById("container");
var inp = document.createElement("input");
inp.type = "text";
container.appendChild(inp);
inp.value = "TEST_XYZ";
inp.select();
document.execCommand("Copy");
container.removeChild(container.lastChild);
alert("Copied the text: " + inp.value);
}
Maybe there is a better way, but it works for me.
UPDATE:
Further, I found that if your text is multi-line and you use an input of type text, all the text is converted to a single line.
To keep paragraphs / separate lines I tried a textarea instead, and the text is copied as is, multi-line:
var inp = document.createElement("textarea");
//inp.type = "text";
Hope it helps someone.
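As a possible alternative that avoids the temporary element altogether, newer browsers offer the asynchronous Clipboard API. A minimal sketch, assuming `navigator.clipboard.writeText` is available (it generally requires a secure context and a user gesture); `copyText` is a hypothetical helper name:

```javascript
// Copies `text` to the clipboard and returns a Promise.
// `clipboard` is optional and defaults to navigator.clipboard; passing it
// explicitly makes the helper easy to exercise outside a browser.
function copyText(text, clipboard) {
  var target = clipboard ||
    (typeof navigator !== "undefined" ? navigator.clipboard : undefined);
  if (!target || typeof target.writeText !== "function") {
    return Promise.reject(new Error("Clipboard API not available"));
  }
  return target.writeText(text);
}

// Usage in a click handler (browser only):
//   copyText("line 1\nline 2").catch(function (err) { console.error(err); });
```

Since no input element is involved, line breaks in the string are passed through unchanged.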
I'm working with the Firefox Addon SDK writing a Spanish dictionary addon. What I want to do is have a dictionary pop up with the translation when the mouse hovers over text. No highlighting the text or right-clicking should be required (although some dictionaries use this). There are several programs that do this with the old addon XUL format (Rikaichan among others), but I was wondering if there was a way to do this with the new SDK.
My current workaround is to use JavaScript to inject tags around every word in the text nodes, along with onmouseover="lookThisUp()". This works, but it runs into complications when I want to check words that change meaning when in pairs ("get up" rather than "get"), so a method that does not cut up all the text with injected tags would be preferable.
This is an example of how to do it with the most recent navigator:browser window:
var {Cu} = require('chrome');
Cu.import('resource://gre/modules/Services.jsm');
var aDOMWindow = Services.wm.getMostRecentWindow('navigator:browser');
aDOMWindow.gBrowser.addEventListener('mouseover', isTextNode, true);
function isTextNode(event) {
var node = event.explicitOriginalTarget;
if (node.nodeName == '#text') {
Services.appShell.hiddenDOMWindow.console.log('moused over a text node = ',node,'the event:',event);
}
}
As you mouse over things in the most recent browser window, a message is logged to the Browser Console whenever the pointer is over a text node.
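Once the text node is known, the word under the pointer still has to be picked out of it. A minimal sketch, assuming a character offset into the node's text is available (for instance from `document.caretPositionFromPoint`, which is an assumption here and not part of the code above); `wordsAt` is a hypothetical helper that also returns the word pair, for cases like "get up":

```javascript
// Returns the word containing `offset` and that word joined with the next one,
// or null when `offset` does not fall inside a word.
function wordsAt(text, offset) {
  // A "word" is any run of characters that is not whitespace or punctuation.
  var wordPattern = /[^\s.,;:!?¿¡()"]+/g;
  var match;
  while ((match = wordPattern.exec(text)) !== null) {
    var start = match.index;
    var end = start + match[0].length;
    if (offset >= start && offset < end) {
      // exec() continues from the end of the current word.
      var next = wordPattern.exec(text);
      return {
        word: match[0],
        pair: next ? match[0] + " " + next[0] : null
      };
    }
  }
  return null;
}
```

The dictionary lookup could then try `pair` first and fall back to `word`.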
var page = UrlFetchApp.fetch(contestURL);
var doc = XmlService.parse(page);
The above code gives a parse error when used, however if I replace the XmlService class with the deprecated Xml class, with the lenient flag set, it parses the html properly.
var page = UrlFetchApp.fetch(contestURL);
var doc = Xml.parse(page, true);
The problem is mostly caused by the lack of CDATA around the JavaScript part of the HTML; the parser complains with the following error:
The entity name must immediately follow the '&' in the entity reference.
Even if I remove all the <script>(.*?)</script> blocks using a regex, it still complains because the <br> tags aren't closed.
Is there a clean way of parsing HTML into a DOM tree?
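For reference, this is roughly the pre-cleaning I have in mind, as a sketch only. Note that `.` does not match newlines in JavaScript, so the pattern uses `[\s\S]` to cover multi-line scripts. `stripScripts` and `closeVoidTags` are hypothetical helper names, and this will not turn arbitrary HTML into well-formed XML:

```javascript
// Removes <script>...</script> blocks, including multi-line ones.
function stripScripts(html) {
  return html.replace(/<script\b[^>]*>[\s\S]*?<\/script\s*>/gi, "");
}

// Rewrites <br> and <hr> (with or without attributes) as self-closing tags.
function closeVoidTags(html) {
  return html.replace(/<(br|hr)\b([^>]*?)\s*\/?>/gi, "<$1$2/>");
}
```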
I ran into this exact same problem. I was able to circumvent it by first using the deprecated Xml.parse, since it still works, then selecting the body XmlElement, then passing in its Xml String into the new XmlService.parse method:
var page = UrlFetchApp.fetch(contestURL);
var doc = Xml.parse(page, true);
var bodyHtml = doc.html.body.toXmlString();
doc = XmlService.parse(bodyHtml);
var root = doc.getRootElement();
Note: This solution may not work if the old Xml.parse is completely removed from Google Scripts.
In 2021, the best way to parse HTML on the .gs side that I know of is...
Click + next to Library
Enter 1ReeQ6WO8kKNxoaA_O0XEQ589cIrRvEBA9qcWpNqdOP17i47u6N9M5Xh0
Click "Look up"
Click Add
Sample usage:
const contentText = UrlFetchApp.fetch('https://www.somesite.com/').getContentText();
const $ = Cheerio.load(contentText);
$('.some-class').first().text();
That's it -- this is probably the closest we'll get to doing jQuery-like DOM selection in GAS. The .first() is important or else you may extract more content than you expected (think of it as using querySelector() instead of querySelectorAll()).
Credit where credit is due: https://github.com/tani/cheeriogs
As of May 2020, you can now use the Cheerio library for Google Apps Script to do this.
Returns the content of Wikipedia's Main Page
const content = getContent_('https://en.wikipedia.org');
const $ = Cheerio.load(content);
Logger.log($('#mp-right').text());
Returns the content of the first paragraph <p> of Wikipedia's Main Page
const content = getContent_('https://en.wikipedia.org');
const $ = Cheerio.load(content);
Logger.log($('p').first().text());
To add to your project:
Select Resources - Libraries... in the Google Apps Script editor. Enter the project key 1ReeQ6WO8kKNxoaA_O0XEQ589cIrRvEBA9qcWpNqdOP17i47u6N9M5Xh0 in the Add a library field, and click "Add". Select the highest version number, and click "Save".
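The getContent_() helper used in both examples is not defined above. A minimal sketch of what it presumably does, assuming it simply wraps UrlFetchApp; the optional second parameter is an addition for illustration so the helper can also be exercised outside Apps Script:

```javascript
// Fetches `url` and returns the response body as text.
// `fetchApp` is optional and defaults to the Apps Script UrlFetchApp service.
function getContent_(url, fetchApp) {
  var app = fetchApp || UrlFetchApp;
  return app.fetch(url).getContentText();
}
```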
I found that the best way to parse HTML in Google Apps Script is to avoid using XmlService.parse or Xml.parse. XmlService.parse doesn't work well with bad HTML code from certain websites.
Here is a basic example of how you can parse any website easily without using XmlService.parse or Xml.parse. In this example, I am retrieving a list of presidents from "wikipedia.org/wiki/President_of_the_United_States"
with regular JavaScript document.getElementsByTagName(), and pasting the values into my Google spreadsheet.
1- Create a new Google Sheet;
2- Click the menu Tools > Script editor... to open a new tab with the code editor window and copy the following code into your Code.gs:
function onOpen() {
var ui = SpreadsheetApp.getUi();
ui.createMenu("Parse Menu")
.addItem("Parse", "parserMenuItem")
.addToUi();
}
function parserMenuItem() {
var sideBar = HtmlService.createHtmlOutputFromFile("test");
SpreadsheetApp.getUi().showSidebar(sideBar);
}
function getUrlData(url) {
var doc = UrlFetchApp.fetch(url).getContentText()
return doc
}
function writeToSpreadSheet(data) {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var sheet = ss.getSheets()[0];
var row = 1;
for (var i = 0; i < data.length; i++) {
var x = data[i];
var range = sheet.getRange(row, 1);
range.setValue(x);
row = row + 1;
}
}
3- Add an HTML file to your Apps Script project. Open the Script Editor, choose File > New > Html File, and name it 'test'. Then copy the following code into your test.html:
<!DOCTYPE html>
<html>
<head>
</head>
<body>
<input id= "mButon" type="button" value="Click here to get list"
onclick="parse()">
<div hidden id="mOutput"></div>
</body>
<script>
window.onload = onOpen;
function onOpen() {
var url = "https://en.wikipedia.org/wiki/President_of_the_United_States"
google.script.run.withSuccessHandler(writeHtmlOutput).getUrlData(url)
document.getElementById("mButon").style.visibility = "visible";
}
function writeHtmlOutput(x) {
document.getElementById('mOutput').innerHTML = x;
}
function parse() {
var list = document.getElementsByTagName("area");
var data = [];
for (var i = 0; i < list.length; i++) {
var x = list[i];
data.push(x.getAttribute("title"))
}
google.script.run.writeToSpreadSheet(data);
}
</script>
</html>
4- Save your gs and html files and go back to your spreadsheet. Reload the spreadsheet. Click on "Parse Menu" - "Parse". Then click on "Click here to get list" in the sidebar.
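writeToSpreadSheet() writes one cell per call, which can get slow for long lists. A possible variation, sketched here under the assumption that all values go into column A starting at row 1, writes them in a single setValues() call; `toColumn` and `writeColumn` are hypothetical helper names:

```javascript
// Turns ["a", "b"] into [["a"], ["b"]], the 2D shape setValues() expects.
function toColumn(values) {
  return values.map(function (value) {
    return [value];
  });
}

// Writes all values into column A of `sheet` with one call.
function writeColumn(sheet, values) {
  if (values.length === 0) {
    return; // nothing to write; a zero-row range would be an error
  }
  sheet.getRange(1, 1, values.length, 1).setValues(toColumn(values));
}
```

In writeToSpreadSheet() this would replace the loop with `writeColumn(sheet, data)`.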
Xml.parse() has an option to turn on lenient parsing, which helps when parsing HTML. Note that the Xml service is deprecated however, and the newer XmlService doesn't have this functionality.
For simple tasks such as grabbing one value from a webpage, you could use a regular expression. Regex is notoriously bad for parsing HTML, as there are all sorts of weird cases it can get tripped up on, but if you're confident about the HTML you're accessing, this can sometimes be the simplest way.
Here's an example that fetches the contents of the page's <title> tag:
var page = UrlFetchApp.fetch(contestURL);
var regExp = new RegExp("<title>(.*?)</title>", "gi");
var result = regExp.exec(page.getContentText());
// [1] is the match group when using parenthesis in the pattern
var value = result ? result[1] : 'No title found';
I know it is not exactly what OP asked, but I found this question when I was looking for some html parsing options - so it might be useful for others as well.
There is an easy-to-use library for text parsing. It's useful if you want to get only one piece of information from the HTML (XML) code.
EDIT 2021: The script library id is:
1Mc8BthYthXx6CoIz90-JiSzSafVnT6U3t0z_W3hLTAX5ek4w0G_EIrNw
It works like in the picture above
function getData() {
var url = "https://chrome.google.com/webstore/detail/signaturesatori-central-s/fejomcfhljndadjlojamaklegghjnjfn?hl=en";
var fromText = '<span class="e-f-ih" title="';
var toText = '">';
var content = UrlFetchApp.fetch(url).getContentText();
var scraped = Parser
.data(content)
.from(fromText)
.to(toText)
.build();
Logger.log(scraped);
return scraped;
}
If you are using the Cheerio library for Google Apps Script:
Source code
Library page (⭐ star it!)
Installation by library ID:
1ReeQ6WO8kKNxoaA_O0XEQ589cIrRvEBA9qcWpNqdOP17i47u6N9M5Xh0
A function to get current emojis from unicode.org:
function getEmojis() {
var t = new Date();
var url = 'https://unicode.org/emoji/charts/full-emoji-list.html';
var fetch = UrlFetchApp.fetch(url);
var contentText = fetch.getContentText();
//console.log(new Date() - t);
// Cheerio
var $ = Cheerio.load(contentText);
var data = [];
$("table > tbody > tr").each((index, element) => {
var row = [];
$(element).find("td").each((index, child) => {
row.push($(child).text());
});
if (row.length > 0) {
data.push(row);
}
});
//console.log(data);
//console.log(new Date() - t);
// Result
return data;
}
↑ Sample code shows how to parse table and put it into [[array]]
May be used as a custom function:
Bonus
Parsing the site may be a time-consuming operation, and you may reach the limit.
Here's a test file with a full version of the script:
https://docs.google.com/spreadsheets/d/1iO7YjYWyfseQu_YCfRbGDPg7NskOgMu_iO1iGjr7KxY/edit#gid=93365395
↑ it uses CacheService to reduce the number of calls.
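The caching idea can be sketched roughly like this. It is not the code from the linked file; `cachedFetch` is a hypothetical helper, and `cache` is assumed to behave like the object returned by CacheService.getScriptCache() (get() returning null on a miss, put() taking an expiration in seconds):

```javascript
// Returns the cached text for `url`, fetching and caching it on a miss.
// `fetchText` is any function that takes a URL and returns a string.
function cachedFetch(url, cache, fetchText, ttlSeconds) {
  var hit = cache.get(url);
  if (hit != null) {
    return hit;
  }
  var text = fetchText(url);
  cache.put(url, text, ttlSeconds || 600); // 600 s is an arbitrary default
  return text;
}

// In Apps Script this could be called as:
//   cachedFetch(url, CacheService.getScriptCache(), function (u) {
//     return UrlFetchApp.fetch(u).getContentText();
//   });
```

Cached values are size-limited, so for a large page it is probably better to cache the parsed result in a reduced form than the raw HTML.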
Natively there's no way, unless you do what you already tried, which won't work if the HTML doesn't conform to the XML format.
There are two options:
a) One is to use JavaScript's string functions. First locate your tag using string.indexOf() and then extract the data you want using string.substring().
b) The other option is to make use of the Xml Service.
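Option (a) could be sketched as follows; `between` is a hypothetical helper name, and it returns only the first match:

```javascript
// Returns the text between the first occurrence of `fromText` and the next
// occurrence of `toText`, or null if either marker is missing.
function between(content, fromText, toText) {
  var start = content.indexOf(fromText);
  if (start === -1) {
    return null;
  }
  start += fromText.length;
  var end = content.indexOf(toText, start);
  if (end === -1) {
    return null;
  }
  return content.substring(start, end);
}
```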
It's not possible to create an HTML DOM server-side in Apps Script. Using regular expressions is likely your best option, at least for simple parsing.
I'm trying to get a window's XUL text as a String in Javascript. I need it to be done at runtime because the window adds/removes UI elements dynamically.
I have tried the following:
document.toXML()
document.xml
document.documentElement.toXML()
Among other things. Nothing works! Can anyone help?
You use XMLSerializer:
new XMLSerializer().serializeToString(document);
I don't think there is a function or field to get the XUL text, but you can work around this by reading the content from the XUL URL:
function getContentFromURL(url) {
var Cc = Components.classes;
var Ci = Components.interfaces;
var ioService = Cc['@mozilla.org/network/io-service;1'].getService(Ci.nsIIOService);
var scriptableStream = Cc['@mozilla.org/scriptableinputstream;1'].getService(Ci.nsIScriptableInputStream);
var channel = ioService.newChannel(url, null, null);
var input = channel.open();
scriptableStream.init(input);
return scriptableStream.read(input.available());
}
So you can call getContentFromURL(document.location.href) to get the XUL content.