How do some WYSIWYG editors keep formatting of pasted text?

How do some WYSIWYG editors keep the formatting of pasted text? As an example, I copied italic red text from a text editor into a WYSIWYG editor and it kept the text's color and style. How is this happening? For the longest time I thought JavaScript only had access to the clipboard's plain text. Is this not the case? If not, how does it work?

There's a content type negotiation between the source and target during the copy/paste operation. It happens sort of like this:
You copy something into the copy and paste buffer. The copied data is tagged with, more or less, a MIME type and who put it there.
When you paste, the paste target tells the copy-and-paste system that it understands a specific list of MIME types.
The copy-and-paste system matches the available formats to the desired formats and finds text/html in both lists.
Someone (probably the original source of the data) then converts the paste buffer to text/html and drops it in the editor.
That's pretty much how things worked back when I was doing X11/Motif development (hey! get off my lawn you rotten kids!) so I'd guess that everyone does it pretty much the same way.
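The negotiation described above can be sketched in a few lines. This is only a toy model, not how any particular windowing system implements it: the `offered` object, the `negotiatePaste` name and the converter functions are all assumptions made up for illustration. The source advertises the formats it can produce, the target lists the formats it understands in order of preference, and the conversion only runs for the format that matches.

```javascript
// Toy model (hypothetical names): the source advertises formats lazily,
// and the data is only converted once a target asks for a specific type.
function negotiatePaste(offered, accepted) {
  // `accepted` is the target's list of MIME types, most preferred first.
  for (const type of accepted) {
    if (Object.prototype.hasOwnProperty.call(offered, type)) {
      // The source performs the conversion for the matched type.
      return { type, data: offered[type]() };
    }
  }
  return null; // no format in common, nothing can be pasted
}

// What a text editor might advertise after copying italic red text.
const offered = {
  "text/plain": () => "red italic text",
  "text/html": () => '<i style="color: red">red italic text</i>',
};

// A rich editor prefers HTML; a plain text field only understands text.
const richResult = negotiatePaste(offered, ["text/html", "text/plain"]);
const plainResult = negotiatePaste(offered, ["text/plain"]);
```

With this model, the WYSIWYG editor ends up with the `text/html` variant, while a plain text field receives only the characters.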

JavaScript has no direct access to the clipboard in general. However, all major browsers released over the past few years have a built-in WYSIWYG editing facility, via the contenteditable attribute/property of any element (which makes just that element editable) and the designMode property of document objects (which makes the whole document editable).
While the user edits content in the page, if they trigger a paste (via keyboard shortcuts such as Ctrl + V or Shift + Insert or via the Edit or context menus), the browser automatically handles the whole pasting process without any intervention from JavaScript. Part of this process includes preserving formatting wherever possible.
However, the HTML this produces can be gruesome and varies heavily between browsers. Many WYSIWYG editors such as TinyMCE and CKEditor employ tricks to intercept the pasted content and clean it before it reaches the editor's editable area.
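One way to intercept pasted content in current browsers is the paste event, whose `clipboardData` exposes the data in several formats while the paste is in progress. The following is a minimal sketch of that idea, not the code TinyMCE or CKEditor actually use; the `readPastedContent` helper is a name invented here.

```javascript
// Picks the richest format available on a paste event.
// getData() returns an empty string when a format is not present.
function readPastedContent(event) {
  const data = event.clipboardData;
  const html = data.getData("text/html");
  if (html) {
    return { type: "text/html", content: html };
  }
  return { type: "text/plain", content: data.getData("text/plain") };
}

// Browser-only wiring; skipped when no document exists (for example in Node).
if (typeof document !== "undefined") {
  document.addEventListener("paste", (event) => {
    const pasted = readPastedContent(event);
    // An editor would clean `pasted.content` here before inserting it.
    console.log(pasted.type, pasted.content);
  });
}
```

An editor that wants full control would also call `event.preventDefault()` and insert its own cleaned version instead of letting the browser insert the raw markup.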

What you're seeing is a rich text editor. There's some information in this Wikipedia article: http://en.wikipedia.org/wiki/Online_rich-text_editor

I think it copied the selected DOM instead

Related

How to copy to clipboard a selected text with styling and images by using javascript?

When a user selects a part of the page with styled texts and images, it is possible to copy all that content (with images and styles) and paste it to MS Word or to an e-mail client by clicking "Copy" in the context menu.
How is it possible to achieve the same result with javascript?
So far, I have found solutions to copy plain text only or by using the deprecated document.execCommand("copy") command.
Is there a solution that works for all modern browsers, including Firefox?
If such a function cannot be implemented for security reasons or whatever, could someone please explain why exactly? Users copy content all the time, so why can't it be done with JavaScript?
Edit: I'm trying to show a custom popup with a Copy button when user selects some content on the page. That button should be able to copy all the styling of the selected content, not just plain text. Just like the Copy button in the browser context menu or Ctrl+C
As far as I understand, you wanna magically transform HTML, CSS and JavaScript to a text format. This is technically not possible.
Yes, it is possible. I suggest looking at Navigator.clipboard API: https://developer.mozilla.org/en-US/docs/Web/API/Navigator/clipboard
You can get the selected HTML elements, do all kinds of transformations on the data, and then write the data to the clipboard. Multiple formats are also supported, such as images, HTML and simple text. Note that if you copy HTML and/or images, pasting will not work in simple text editors, only in editors like Word that support advanced pasting formats.
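A rough sketch of that approach follows, assuming a browser that supports `navigator.clipboard.write` and `ClipboardItem` in a secure context. The function names `buildClipboardPayload` and `copySelectionAsRichText` are made up for this example, and the `BlobCtor` parameter exists only so the helper can be exercised outside a browser.

```javascript
// Builds the format map for a ClipboardItem: one Blob per MIME type.
// BlobCtor is injectable (an assumption of this sketch) for testing.
function buildClipboardPayload(html, text, BlobCtor = Blob) {
  return {
    "text/html": new BlobCtor([html], { type: "text/html" }),
    "text/plain": new BlobCtor([text], { type: "text/plain" }),
  };
}

// Browser-only: call this from a user gesture such as a button click.
async function copySelectionAsRichText() {
  const selection = window.getSelection();
  if (!selection || selection.rangeCount === 0) {
    return false; // nothing selected
  }
  // Clone the selected DOM fragment so its markup can be serialized.
  const container = document.createElement("div");
  container.appendChild(selection.getRangeAt(0).cloneContents());
  const payload = buildClipboardPayload(container.innerHTML, selection.toString());
  await navigator.clipboard.write([new ClipboardItem(payload)]);
  return true;
}
```

Note that `cloneContents()` copies the markup only. Styles that come from stylesheets are not inlined, so the pasted result may look plainer than what the browser's own Copy command produces unless the computed styles are written into the cloned elements first.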

dart function to copy text to clipboard with MIME type

I'm trying to copy text to the OS clipboard in my dart web app with the push of a button, and I'm not finding a clean way of doing so.
My current solution is to create a textArea element, add the text I'm intending to copy to the element, calling document.execCommand("copy") on said element, then deleting the textArea element. This works in the browsers I am intending to support; however, I also need to set the MIME type for this copied text, which does not appear to be possible with my current implementation.
So, my question for you all is: using my current solution, can I also set the MIME type for the text being copied? OR is there a better approach I could take using a different dart api?
Javascript (even dart converted to javascript) cannot get to the OS clipboard. Only Flash seems to have that ability.
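For the execCommand("copy") approach from the question, one technique that is commonly used in browsers that fire a copy event is to register a temporary copy handler and set the data there with an explicit MIME type. The sketch below is written in JavaScript rather than Dart, the names `makeCopyHandler` and `copyWithMimeType` are invented for illustration, and which MIME types a browser accepts is an assumption that should be checked per browser.

```javascript
// Returns a copy-event handler that stores `content` under `mimeType`
// and, for non-plain types, a plain text fallback as well.
function makeCopyHandler(mimeType, content, plainText) {
  return function onCopy(event) {
    event.clipboardData.setData(mimeType, content);
    if (mimeType !== "text/plain") {
      event.clipboardData.setData("text/plain", plainText);
    }
    // Stop the browser from replacing our data with the current selection.
    event.preventDefault();
  };
}

// Browser-only: triggers a copy and removes the handler afterwards.
function copyWithMimeType(mimeType, content, plainText) {
  const handler = makeCopyHandler(mimeType, content, plainText);
  document.addEventListener("copy", handler);
  try {
    return document.execCommand("copy");
  } finally {
    document.removeEventListener("copy", handler);
  }
}
```

A call such as `copyWithMimeType("text/html", "<b>hello</b>", "hello")` from a click handler would then offer both an HTML and a plain text variant to whatever application receives the paste.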

Copy webpage as it is rendered, regardless of underlying source

So I have this webpage that I want to copy to a word document. It's an installation guide and we want to use that but add comments relating to how we installed the program in our environment.
Simple problem. Just copy and paste, right? Wrong.
Problem is, this specific webpage is built up of <div ..> tags, where a couple of checkboxes enable/disable them relating to your choices. So I check the box that marks an installation in Linux, and all div tags relating to that installation option are shown.
Example from the source:
<div class="forWindows forAIX forLinux forZLinux forPLinux forSolaris">...</div>
<div class="forJTS"> ... </div>
<div class="forCCM"> ... </div>
This means, that whenever I copy and paste a part of the webpage I get all the content, regardless of what I actually see on the screen. What I want is to just copy the webpage as I see it on the screen.
I've tried to copy from Internet Explorer and Firefox both to MS Word and to a basic text editor with the same results.
I want the result to be text so I can edit it, so screenshots or exporting to PDF won't work.
I could save the source HTML, remove the tags that don't apply and open the local HTML file, except that it's quite a lot of work. Also the page seems to rely heavily on server-side scripts, so I guess that may cause some issues.
Ideally I'd like to preserve the formatting as it is shown as well.
To reproduce the issue:
Go to IBM's interactive guide for installing Rational Team Concert.
Select any choices, but to verify step 5-6 below, choose Linux as OS.
Click "Get your interactions"
Copy/paste a part of the webpage and compare the pasted version with what is seen in the browser.
Go to Step 3, "set up the database" in the guide. Copy all the content between "What to do next" in the previous step to the end of the heading in step 3. All in all, about 6 lines.
Paste into a text editor; you should now see text that only relates to the zOS and IBMi operating systems.
It seems that the behaviour of copying and pasting is undefined. Some browsers will copy ignoring the styling that hides stuff and others will copy taking styles into account (i.e. some will include hidden text and others will not).
A rough summary of browsers seems to be:
IE - copies hidden text on IE8 and presumably older, no idea about newer.
FF - newer versions will not copy hidden text, older versions it seems will. Unknown where the cutoff is but it seems to be somewhere between version 3 and version 14. :)
Chrome - my current version (19.0.1084.52) will copy just the visible text. Untested on any other version.
I would simply screenshot the page and use a simple graphics editing program to crop the image and add annotations.
To screenshot a page, press Print Screen (probably shortened to PrtScn on your keyboard). That copies the screenshot into memory.
Now, in your graphics editing program or even word processor, click paste (or press ctrl-v). The screenshot will appear. Crop and add annotations as per your desire.
Write a bookmarklet that concatenates #text nodes together, but only when the parent element has a computed style where display != none and visibility != hidden.
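The core of such a bookmarklet could look roughly like this. It is a sketch under assumptions: `visibleText` is a name invented here, and the style lookup is passed in as a function so the same logic can be tried on plain objects outside a browser. In a page, the lookup would be `getComputedStyle`.

```javascript
// Concatenates the text nodes under `node`, skipping anything inside an
// element with display: none and any text whose parent is visibility: hidden.
// `getStyle` maps an element to an object with `display` and `visibility`.
function visibleText(node, getStyle) {
  const ELEMENT_NODE = 1;
  const TEXT_NODE = 3;
  if (node.nodeType !== ELEMENT_NODE) {
    return "";
  }
  const style = getStyle(node);
  if (style.display === "none") {
    return ""; // the whole subtree is not rendered
  }
  let out = "";
  for (const child of node.childNodes) {
    if (child.nodeType === TEXT_NODE) {
      // A child element may override visibility, so check per parent.
      if (style.visibility !== "hidden") {
        out += child.nodeValue;
      }
    } else {
      out += visibleText(child, getStyle);
    }
  }
  return out;
}

// Browser usage (not run here):
//   visibleText(document.body, (el) => getComputedStyle(el));
```

The result is plain text only. Preserving the formatting as well would require copying the visible elements themselves rather than just their text nodes.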

How to paste text from Word as plain text while preserving defined styles?

I want to let the user paste text into an editor (currently CKEditor). When the text is pasted, all styles and elements which are not white-listed must be removed, including images, tables etc. So 90% should be converted to plain text or be removed, while some simple styles like bold, italic or underlined should be preserved.
I didn't think it would be so complicated. But all I can find in the documentation and the samples of CKEditor is about pasting completely plain text or pasting cleaned-up content from Word without the ability to configure a white-list (and even if I remove all table-related plugins it is still possible to paste a table from MS Word).
I really, really appreciate any hint.
Thanks.
You can't without writing your own parser. Another issue is that MS Word uses Windows-1252 character encoding and most of the web uses UTF-8 encoding, so if you paste from Word and transmit this data via AJAX, it will be garbled.
While Dreamweaver has a pretty good "paste from Word" feature, it's unlikely you'll find an online equivalent. This is a huge and complex problem that would be an application in itself. Even Word's "save as HTML" can't do a decent job of it.
Sadly, what most have to do is strip it all down to ASCII (paste into Notepad), put it in the editor and mark it back up.
You can add a listener for the 'paste' event in the editor instance: http://docs.cksource.com/ckeditor_api/symbols/CKEDITOR.editor.html#event:paste
That way you get the HTML that is about to be pasted and you can perform whatever cleanup you need (for example by inserting that HTML into a div and then working with the DOM, or by using regexps on the string).
Found a solution:
Listening to the paste event as AlfonsoML wrote.
Sending the pasted content of Word to the server.
Parsing it with the HTML Agility Pack.
Sending it back to the client.
Inserting it within the editor.
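As an illustration of the cleanup step itself, here is a small white-list filter of the "regexps on the string" kind mentioned above. It is only a sketch: `stripToWhitelist` is a hypothetical helper, not part of CKEditor or the HTML Agility Pack, and regular expressions are not a robust HTML parser, so a real implementation would be better off using a proper parser as in the accepted solution.

```javascript
// Removes every tag that is not white-listed, drops attributes from the
// tags that are kept, and discards comments and style/script blocks.
function stripToWhitelist(html, allowed = ["b", "i", "u", "strong", "em"]) {
  const keep = new Set(allowed);
  return html
    // Word wraps metadata in (conditional) comments; remove them entirely.
    .replace(/<!--[\s\S]*?-->/g, "")
    // Remove style and script elements together with their contents.
    .replace(/<(style|script)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, "")
    // For every remaining tag: keep a bare version if allowed, else drop it.
    .replace(/<(\/?)([a-zA-Z][a-zA-Z0-9:]*)\b[^>]*>/g, (match, slash, name) => {
      const tag = name.toLowerCase();
      return keep.has(tag) ? `<${slash}${tag}>` : "";
    });
}

const pasted =
  '<p class="MsoNormal">Hello <b style="color:red">bold</b> ' +
  '<table><tr><td>cell</td></tr></table><img src="x.png"></p>';
const cleaned = stripToWhitelist(pasted);
```

Here the paragraph, table and image tags are dropped, the text inside the table cell survives as plain text, and the bold element is kept without its inline style.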

Prevent web page copying method

I was completely fascinated when I discovered BRW's (example) method to prevent web page copying. I had a quick look through the source view and couldn't see how they did it. Aside from inserting (c) symbols throughout the text, they also scramble the text, yet it is completely readable in a browser. Amazing!
Any ideas how they did it?
If you view the source, you will notice that it's a boatload of <i> and <span> elements littered through the markup (some of which are hidden by indenting them -10000 to the left). However, a simple scraper with a tiny bit of logic could easily undo that travesty.
Sure, it will prevent casual copy and paste, but is downright dumb, plus makes you pretty much ungoogleable.
They overlay/insert invisible text spans (through CSS: text-indent: -100000px) that browsers usually still copy and paste, resulting in too much copied text. You need to parse the CSS to determine what is readable (try lynx -- bad text)
