InDesign CS6 Import multiple XML files in document via scripting

I've created a script where I select the folder that holds the xml files I want to import, create the document and insert these XML files, but my script ends with the following message, which is not very helpful: "Execution finished. Result: undefined".
Any help will be appreciated.
var myDocument = app.documents.add();
var MyFolderWithFiles = Folder.selectDialog ("Choose a folder");
var myFiles = MyFolderWithFiles.getFiles("*.xml");
for(var i = 0; i < myFiles.length; i++) {
myDocument.importXML(myFiles[i]);
}

This question is a bit old but doesn't have an accepted answer so I'll see if I can help.
Your code for creating the new document looks to me like you're just creating a new blank document. How does the XML you're importing know where to go? One thing that might help is after you've completed the import check the "Structure" of the XML within the document.
var myDocument = app.documents.add();
When I import XML I don't create a new blank document, I create a new document from a template that has a predefined structure so InDesign knows where to place each XML node within your template. Here is a decent reference to help get started setting up your INDT file.
var myDoc = app.open( '//path/to/myTemplate/myTemplate.indt', OpenOption.OPEN_COPY );
// NOTE: I'm running against an InDesign server, if you're running against your ID GUI then you'll need an extra param on the app.open() call like in the following line
var myDoc = app.open( '//path/to/myTemplate/myTemplate.indt', true, OpenOption.OPEN_COPY ); // This is probably what you'll need to use
// The extra param for showingWindow should be true if running against the ID GUI, this feature is NOT available when executing against the ID server
I also set my XML import preferences (yours may differ from mine) so InDesign knows what to do when, for example, an incoming XML node does not match the structure already in the document. Here is one of the xmlImportPreferences sets I use from time to time.
with ( myDoc.xmlImportPreferences )
{
allowTransform = false;
createLinkToXML = false;
ignoreUnmatchedIncoming = false;
ignoreWhitespace = true;
importCALSTables = false;
importStyle = XMLImportStyles.mergeImport;
importTextIntoTables = false;
importToSelected = false;
removeUnmatchedExisting = true;
repeatTextElements = true;
}
Something else to look into with the help of creating your XML structure for your INDT is the use of a DTD (Document Type Definition) file. Here is another good reference for help with InDesign and XML, it also goes into some detail about DTD files. An example of a simple DTD file might be something like this.
<!ELEMENT Root (Customer*)>
<!ELEMENT Customer (name, Address*)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT Address (street, city, state, zip)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT zip (#PCDATA)>
In XML that would represent something like this:
<Root>
<Customer>
<name>Billy Bob</name>
<Address>
<street>123 Test Ave</street>
<city>Testville</city>
<state>IA</state>
<zip>12345</zip>
</Address>
</Customer>
</Root>
I hope this helps point someone in the right direction struggling with ID and XML. It can be tricky and temperamental at times.
I did ramble a little bit so if someone finds this helpful but still cannot quite get it to work I can elaborate on a specific issue all you have to do is ask! ;) HAPPY CODING!

"Execution finished. Result: undefined"
means that the files were imported successfully. InDesign simply doesn't return a value for this statement.

I second Sulaiman_J. The message indicates everything went fine. The fact that nothing appears to be imported can be related to the XML import options. Run a manual XML import with the import options dialog displayed, and check whether "Only import elements that match existing structure" is enabled. If it is, then because you are working on a new document there is no existing structure, and the import will not inject any nodes.
That's why you may not have injected contents even if InDesign says it did.
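Putting the two points together, a minimal sketch might relax that option from the script before looping over the files. It assumes that ignoreUnmatchedIncoming (the property shown in the preferences listing earlier in this thread) is the scripting counterpart of the "Only import elements that match existing structure" checkbox. The function name importAllXml is made up for illustration.

```javascript
// Hypothetical helper: import each XML file into `doc` and report how many were processed.
// `doc` is expected to be an InDesign Document, `files` the array returned by
// Folder.getFiles("*.xml").
function importAllXml(doc, files) {
    // Assumption: false means unmatched incoming elements are NOT discarded.
    doc.xmlImportPreferences.ignoreUnmatchedIncoming = false;
    var imported = 0;
    for (var i = 0; i < files.length; i++) {
        doc.importXML(files[i]);
        imported++;
    }
    return imported; // returned so the caller can log or inspect it
}

// Usage inside InDesign (not run here):
// var folder = Folder.selectDialog("Choose a folder");
// if (folder) { importAllXml(app.documents.add(), folder.getFiles("*.xml")); }
```

After running it, the Structure pane should show whether any elements actually arrived, which is a more direct check than the console result.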

Related

export "saved for later" from (evil) feedly

i'm trying to migrate from feedly as it is unacceptable (at least to me) that a search query is (fully) enabled only by a pro version.
Anyhow, to export my lengthy list of "saved for later" i found some lovely scripts:
Simple script that exports a users "Saved For Later" list out of Feedly as a JSON string and feedly-to-pocket. where i am instructed to:
You must switch off SSL (http rather than https) or jQuery won't load!
so i thought i did, by adding (ubuntu 14.04/chrome 40 x64)
--ssl-version-min=tls1
to my /usr/share/applications/google-chrome.desktop file (all lines starting with Exec=). However when i try to run it in the browser console i get
This request has been blocked; the content must be served over HTTPS.
So, any suggestions? (also, excuse me for noobness)

Go to your Feedly "saved" list and scroll down until all articles have loaded.
Open console and paste the following Javascript into it:
function loadJQuery() {
var script = document.createElement('script');
script.setAttribute('src', '//code.jquery.com/jquery-2.1.3.js');
script.setAttribute('type', 'text/javascript');
script.onload = loadSaveAs;
document.getElementsByTagName('head')[0].appendChild(script);
}
function loadSaveAs() {
var saveAsScript = document.createElement('script');
saveAsScript.setAttribute('src', 'https://cdn.rawgit.com/eligrey/FileSaver.js/5733e40e5af936eb3f48554cf6a8a7075d71d18a/FileSaver.js');
saveAsScript.setAttribute('type', 'text/javascript');
saveAsScript.onload = saveToFile;
document.getElementsByTagName('head')[0].appendChild(saveAsScript);
}
function saveToFile() {
// Loop through the DOM, grabbing the information from each bookmark
var map = jQuery(".entry.quicklisted").map(function(i, el) {
var $el = jQuery(el);
var regex = /Published:(.*)(.*)/i;
return {
title: $el.attr("data-title"),
url: $el.attr("data-alternate-link"),
summary: $el.find(".summary")[0].innerHTML,
time: regex.exec($el.find("span.ago").attr("title"))[1]
};
}).get(); // Convert jQuery object into an array
// Convert to a nicely indented JSON string
var json = JSON.stringify(map, undefined, 2);
var blob = new Blob([json], {type: "text/plain;charset=utf-8"});
saveAs(blob, "FeedlySavedForLater" + Date.now().toString() + ".txt");
}
loadJQuery()
Source: Feedly-Export-Save4Later

Not javascript but here is how I saved a html page with all the links and excerpts...
Open the saved pages in feedly in chrome
scroll down so they are all there
inspect any element (the top article is a good choice) so it opens the generated html
find the div id="section0_column0" node
right-click & copy it
paste into Notepad++
this html is untidy so carry on...
Do a Regex find & replace
find: (?s)<div id=.+?_main.+?>.+?(<a href=")(.+?)(").+?sans-serif">(.+?)</span>.+?</div>.+?</div>.+?</div>
replace: <div>$1$2$3>$2</a></div> <div> $4<br /> <br /></div>
save the html page.
open it in Chrome

Posted the question in the jquery forum and the solution was rather simple (remove http from attribute string)
line 34 should be
script.setAttribute('src', '//code.jquery.com/jquery-latest.min.js');
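The fix works because a URL that starts with // is protocol-relative: the browser reuses the scheme of the page that loads it, so on an HTTPS page the script is requested over HTTPS and is not blocked as mixed content. A small sketch with the standard URL constructor shows the resolution. The page addresses used as the base are only illustrative.

```javascript
// Resolve a protocol-relative script URL against the page that includes it.
function resolveScriptUrl(src, pageUrl) {
    return new URL(src, pageUrl).href;
}

var jquerySrc = '//code.jquery.com/jquery-latest.min.js';
var onHttps = resolveScriptUrl(jquerySrc, 'https://feedly.com/');
var onHttp = resolveScriptUrl(jquerySrc, 'http://example.com/');
console.log(onHttps); // https://code.jquery.com/jquery-latest.min.js
console.log(onHttp);  // http://code.jquery.com/jquery-latest.min.js
```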
So to close the loop - for a full searchable/archived list of links not only by title/url but context also(!) you can:
Follow the instructions in https://github.com/ShockwaveNN/feedly-to-pocket (with the correction suggested by kind stranger jakecigar and you also have to register a pocket app (obtain consumer key) for the ruby script to work)
Export html list from your pocket account
Import pocket list to a Kifi library
and at last feedly-free with my personal search engine

I know I'm a bit late to the party, but I've been hunting around for a few days to find a reasonably simple solution, and none has been laid out clearly or concisely on Stack Overflow or elsewhere on the web. I have in fact found a much easier way to do this.
Use the JavaScript from this Gist just as it instructs: https://gist.github.com/ShockwaveNN/a0baf2ca26d1711f10e2 (note that this is referenced above and was found through the link @gep shared in step one)
Once the JS has finished running it will download a text file. (It still runs successfully, even on large numbers; I just exported almost 2500 articles.)
Create a blank test.json in SublimeText.
Copy all entries from your exported text file into this json file
Weirdly, it does seem you need to copy and paste: when I tried just renaming the text file, I received errors on the next step
Make sure you are signed into pocket
Go here: https://getpocket.com/import/springpad
Select your newly created test.json
Upload
Note: On large uploads the import page fails to refresh (this did not seem to be an issue as all my articles did make it into my account)
This allows you to upload JSON directly into your Pocket account, so there is no more messing around with other supposed fixes. I hope this makes it a lot easier for everyone in the future.

Firefox addon SDK and DOM manipulation problems

I'm trying to get to grips with the Firefox addon SDK (previously known as Jetpack from what I understand), but I'm having problems working with the DOM.
I need to iterate over all of the text nodes in the DOM when a web page loads and make changes to some of the strings that they contain. I've posted a simplified version of what I'm doing below (new to Javascript, so forgive me any oddities).
// test.js
function parseElement(Element)
{
if (Element == null)
return;
var i = 0;
var Result = false;
if (Element.hasChildNodes())
{
var children = Element.childNodes;
while (i <= children.length - 1)
{
var child = children.item(i);
parseElement(child);
i++;
}
}
if (Element.nodeType == 3)
{
// For testing - see what the text node contains
alert(Element.nodeValue);
Result = true;
}
return Result;
}
window.addEventListener("load", function load(event)
{
window.removeEventListener("load", load, false);
parseElement(document.body);
}, false);
When I create a basic HTML document:
<!-- test.html -->
<html>
<head>
<script type="text/javascript" src="test.js"></script>
</head>
<body>
<b>hello world</b>
<p>foo</p>
<i>test</i>
</body>
</html>
...include this Javascript file in the HEAD section then open it in Firefox, the "alert" displays 6 dialog boxes containing:
1) "hello world"
2) blank -> no visible characters, just a newline
3) "foo"
4) blank -> no visible characters, just a newline
5) "test"
6) blank -> no visible characters, just a newline
Exactly what I would expect to see.
The problem arises when I create an addon and use test.js as a page-mod Content Script from my main.js file (modified to remove the "addEventListener" part). When I use "cfx run" to start Firefox with my addon installed, then open the same HTML document (with the "script" part for the test.js file commented), the alerts do not display at all.
So that's the first puzzle. But having also navigated to other web pages - for example, a YouTube video page - the alert DOES display several dialogs, but they include very strange strings, mostly the content of script tags:
EDIT I don't have enough reputation to embed an image, so here's a link showing the sort of thing I mean instead: http://img46.imageshack.us/img46/5994/mtpd.jpg
And again, the text I would expect to see is absent.
Apologies for some of the redundancy below, but just to be clear: this is my main.js:
main.js
var data = require("sdk/self").data;
var pageMod = require("sdk/page-mod");
exports.main = function()
{
pageMod.PageMod({
include: "*",
contentScriptFile: [data.url("test.js")]
});
}
And the modified version of the Javascript file is identical to the "test.js" listing above, but for the end part:
test.js
<snip>
...
return Result;
}
parseElement(document.body);
I've included my project files (if I can call them that) in a zip if it makes things easier to visualise: http://www.mediafire.com/?774iprbngtlgkcp
I've tried changing
parseElement(document.body);
to
parseElement(unsafeWindow.document.body);
in case it makes any difference, but the outcome is identical.
So I'm very puzzled about what's happening. I can't understand why the test.js file isn't picking out the text nodes (and only the text nodes) from the DOM when I use it as part of an addon, but does exactly what I would anticipate when included as a script in a HTML document. Can anyone shed any light on this?
Thank you in advance.

Errors in your lib code and contentScripts are usually logged to the Error Console. Check what is printed there. Also see the SDK console module.
Your page-mod won't run because by default page-mods will run only after the load event.
See the contentScriptWhen documentation.
script tags actually often have a text-node child containing the inline script source. So it is absolutely normal that those are enumerated as well.
For some discussion about walking tree nodes, see: getElementsByTagName() equivalent for textNodes
However, if you're after the text of specific ids/classes, consider using document.querySelector/.querySelectorAll, or if you're after nodes that have a specific XPath, use document.evaluate. This very likely will be a lot faster.
Other than that, I cannot really tell what your remaining issues are or what you're trying to achieve in the first place, so I cannot advise on that.
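As a sketch of the point about script tags, a walker can skip script and style elements and whitespace-only nodes while collecting text. The function name collectTextNodes is hypothetical, and it relies only on the standard nodeType, nodeName, nodeValue and childNodes properties.

```javascript
// Collect the non-blank text node values under `node`, ignoring <script> and <style>.
function collectTextNodes(node, found) {
    found = found || [];
    if (!node) {
        return found;
    }
    if (node.nodeType === 3) { // 3 = text node
        if (/\S/.test(node.nodeValue)) {
            found.push(node.nodeValue);
        }
        return found;
    }
    if (node.nodeType === 1 && /^(script|style)$/i.test(node.nodeName)) {
        return found; // inline script/style source is text too, but not page text
    }
    var children = node.childNodes || [];
    for (var i = 0; i < children.length; i++) {
        collectTextNodes(children[i], found);
    }
    return found;
}

// In a content script: collectTextNodes(document.body)
```

For the test.html from the question this should leave only the three visible strings, since the newline-only nodes and any script source are filtered out.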

You noted that:
I've discovered that my add-on is NOT executed when a document is
accessed via File->Open File.
That is by design. The match-pattern documentation says:
A single asterisk matches any URL with an http, https, or ftp scheme.
For other schemes like file, resource, or data, use a scheme followed
by an asterisk, as below.
You can use the regular expression /.*/ to match all sites and all schemes.
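A sketch of a page-mod configuration that also covers locally opened files, assuming that file://* is the "scheme followed by an asterisk" form the documentation refers to. The helper buildPageModOptions is hypothetical and only builds the options object; the commented lines show where it would be passed to PageMod in main.js.

```javascript
// Build page-mod options that match http/https/ftp pages and local files.
function buildPageModOptions(scriptUrl) {
    return {
        include: ["*", "file://*"], // or the regular expression /.*/ for every scheme
        contentScriptFile: [scriptUrl]
    };
}

// In main.js (add-on SDK only, not run here):
// var pageMod = require("sdk/page-mod");
// var data = require("sdk/self").data;
// pageMod.PageMod(buildPageModOptions(data.url("test.js")));
```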

Using javascript to rename multiple HTML files using the <TITLE></TITLE> in each file

I have used HTTRACK to download Federal regulations from a government website, and the resulting HTML files are not intuitively named. Each file has a <TITLE></TITLE> tag set, that would serve nicely to name each file in a fashion that will lend itself to ebook creation. I want to turn these regulations into an ebook for my Kindle, so that I can have the regulations readily available for reference, rather than having to carry volumes of books with me everywhere.
My preferred text/hex editor, UltraEdit Professional 15.20.0.1026, has scripting commands enabled through an embedded JavaScript engine. In researching possible solutions to my problem, I found xmlTitleSave on the IDM UltraEdit website.
// ----------------------------------------------------------------------------
// Script Name: xmlTitleSave.js
// Creation Date: 2008-06-09
// Last Modified:
// Copyright: none
// Purpose: find the <title> value in an XML document, then saves the file as the
// title.xml in a user-specified directory
// ----------------------------------------------------------------------------
//Some variables we need
var regex = "<title>(.*)</title>" //Perl regular expression to find title string
var file_path = UltraEdit.getString("Path to save file at? !! MUST PRE EXIST !!",1);
// Start at the beginning of the file
UltraEdit.activeDocument.top();
UltraEdit.activeDocument.unicodeToASCII();
// Turn on regular expressions
UltraEdit.activeDocument.findReplace.regExp = true;
// Find it
UltraEdit.activeDocument.findReplace.find(regex);
// Load it into a selection
var titl = UltraEdit.activeDocument.selection;
// Javascript function 'match' will match the regex within the javascript engine
// so we can extract the actual title via array
t = titl.match(regex);
// 't' is an array of the match from 'titl' based on the var 'regex'
// the 2nd value of the array gives us what we need... then append '.xml'
saveTitle = t[1]+".xml";
UltraEdit.saveAs(file_path + saveTitle);
// Uncomment for debugging
// UltraEdit.outputWindow.write("titl = " + titl);
// UltraEdit.outputWindow.write("t = " + t);
My question is two-fold:
Can this JavaScript be modified to extract the <TITLE></TITLE> contents from an HTML file and rename the files?
If the JavaScript cannot be modified easily, is there a script/program/black magic/animal sacrifice that can accomplish the same thing?
EDIT:
I have been able to get the script to work as desired by removing the UltraEdit.activeDocument.unicodeToASCII(); line and changing the file extension to .html. My only issue now is that while this script works on single open files, it does not batch process the directory.

You can use just about any "scriptable" language to do something like this pretty quickly. Ruby is my favorite:
require 'fileutils'
dir = "/your/directory"
files = Dir["#{dir}/*.html"]
files.each do |file|
html = IO.read file
next unless html =~ /<title>([^<]+)<\/title>/i # skip files that have no title
title = $1.strip
FileUtils.mv file, "#{dir}/#{title}.html"
puts "Renamed #{file} to #{title}.html."
end
Obviously if your UltraEdit script worked for you this might be obtuse, but for anybody running a different env, hopefully this is useful.

Does this not work out of the box?
I don't know anything about UltraEdit, but as far as a regex engine is concerned, if it can parse <title>(.*)</title> out of an XML document, it can do the exact same for HTML.
Just modify the final file title to .html instead of .xml
saveTitle = t[1]+".html";
Assuming you can get that script to work as it's intended (point being I don't know UltraEdit), I'm pretty confident that same process will work for HTML.

XML and HTML are both plain text, and that script is simply running a regular expression on the text to extract the title tags, which are the same in both; the only thing you need to do is change this line:
saveTitle = t[1]+".xml";
to this:
saveTitle = t[1]+".html";
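For anyone without UltraEdit, the same regex idea can be sketched in plain JavaScript, which also covers the batch case. This assumes Node.js and its built-in fs and path modules; the function names and the character replacement rule are illustrative choices, not part of the original script.

```javascript
// Return the text between <title> and </title>, or null when there is none.
function extractTitle(html) {
    var match = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
    return match ? match[1].replace(/\s+/g, ' ').trim() : null;
}

// Replace characters that Windows does not allow in file names.
function toFileName(title) {
    return title.replace(/[\\\/:*?"<>|]/g, '_') + '.html';
}

// Rename every .html file in `dir` after its title; files without a title are skipped.
function renameByTitle(dir) {
    var fs = require('fs');
    var path = require('path');
    fs.readdirSync(dir).forEach(function (name) {
        if (!/\.html?$/i.test(name)) {
            return;
        }
        var source = path.join(dir, name);
        var title = extractTitle(fs.readFileSync(source, 'utf8'));
        if (title) {
            fs.renameSync(source, path.join(dir, toFileName(title)));
        }
    });
}

// renameByTitle('/your/directory');
```

Note that two files with the same title would end up with the same name, so the later rename would overwrite the earlier one; try it on a copy of the directory first.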

After much searching and trial and error on the scripting side, I ran across a fantastic program for Windows that will do the renaming via TITLE tags: Flexible Renamer 8.3. The author's website is http://hp.vector.co.jp/authors/VA014830/english/FlexRena/, and it manages to handle every bit of what I needed. Many thanks to @coreyward and @Yuji for their fantastic advice on the scripting end of things.

How to get the uri of the .js file itself

Is there a method in JavaScript by which I can find out the path/URI of the executing script?
For example:
index.html includes a JavaScript file stuff.js and since stuff.js file depends on ./commons.js, it wants to include it too in the page. Problem is that stuff.js only knows the relative path of ./commons.js from itself and has no clue of full url/path.
index.html includes stuff.js file as <script src="http://example.net/js/stuff.js?key=value" /> and stuff.js file wants to read the value of key. How to?
UPDATE: Is there any standard method to do this? Even in draft status? (Which I can figure out by answers, that answer is "no". Thanks to all for answering).

This should give you the full path to the current script (might not work if loaded on request etc.)
var scripts = document.getElementsByTagName("script");
var thisScript = scripts[scripts.length-1];
var thisScriptsSrc = thisScript.src;
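Once the script's own src is known, both parts of the question reduce to URL handling. The sketch below uses the standard URL constructor; the helper names are made up, and document.currentScript, mentioned in the comment, is only an option where the target browsers support it.

```javascript
// Resolve a path relative to the script's own URL.
function resolveFromScript(scriptSrc, relativePath) {
    return new URL(relativePath, scriptSrc).href;
}

// Read one query parameter from the script's own URL (null if absent).
function scriptParam(scriptSrc, name) {
    return new URL(scriptSrc).searchParams.get(name);
}

// In the browser, scriptSrc could come from thisScriptsSrc above,
// or from document.currentScript.src where that is supported.
var src = 'http://example.net/js/stuff.js?key=value';
console.log(resolveFromScript(src, './commons.js')); // http://example.net/js/commons.js
console.log(scriptParam(src, 'key'));                // value
```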

If your script knows that it's called "stuff.js", then it can look at all the script tags in the DOM.
var scripts = document.getElementsByTagName('script');
and then it can look at the "src" attributes for its name. Kind-of a hack, however, and to me it seems like something you should really work out server-side.

script.aculo.us (source) solves a similar problem. here is the relevant code
var js = /scriptaculous\.js(\?.*)?$/;
$$('script[src]').findAll(function(s) {
return s.src.match(js);
}).each(function(s) {
var path = s.src.replace(js, ''),
includes = s.src.match(/\?.*load=([a-z,]*)/);
(includes ? includes[1] : 'builder,effects,dragdrop,controls,slider,sound').split(',').each(
function(include) { Scriptaculous.require(path+include+'.js') });
});
(some parts of this like .each require prototype)

Path to included Javascript page

How do I get the absolute or site-relative path for an included JavaScript file?
I know this can be done in PHP (__FILE__, I think). Even for an included page, one can check the path (to the included file). Is there any way to have this self-awareness in JavaScript?
I know I can get the page URL, but I need to get the JS URL.
Eg. Javascript needs to modify the src of an image on the page. I know where the image is relative to the JavaScript file. I don't know where the Javascript is relative to the page.
<body>
<img id="img0" src="">
<script src="js/imgMaker/myscript.js"></script>
</body>
function fixPath(){
$$("#img0")[0].set('src','js/imgMaker/images/main.jpg');
}
Please do not tell me to restructure my function - the example is simplified to explain the need.
In the actual case, a Mootools class is being distributed and people can put it into whatever folder they want.
I would just read the src of the script element, but the class can be part of any number of javascript files, so I can't know what the element looks like.

JavaScript (not JScript) has no concept of file names. It was developed for Netscape back in the day. Therefore there is no __file__ feature or anything similar.
The closest you can come are these two possibilities:
What you already mentioned: Harvest the src attributes of all script elements and try to figure out which one is the right one.
Make it a necessary option, that the path to the images must be set in the embedding HTML file. If not set, use a reasonable and well-documented default:
<script type="text/javascript">
var options = {
'path_to_images': '/static/images/' // defaults to '/js/img/'
};
</script>
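Inside the library, reading that option with its fallback could look like the following sketch. The function name getImagePath is hypothetical; the option key and the /js/img/ default are taken from the snippet above.

```javascript
// Return the configured image path, or the documented default when none is set.
function getImagePath(options) {
    var path = options && options.path_to_images;
    return typeof path === 'string' && path !== '' ? path : '/js/img/';
}

console.log(getImagePath({ path_to_images: '/static/images/' })); // /static/images/
console.log(getImagePath(undefined));                              // /js/img/
```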

Based on http://ejohn.org/blog/file-in-javascript/
(function(){
this.__defineGetter__("__FILE__", function() {
return (new Error).stack.split("\n")[2].split("@")[1].split(":").slice(0,-1).join(":");
});
})();
(function(){
this.__defineGetter__("__DIR__", function() {
return __FILE__.substring(0, __FILE__.lastIndexOf('/'));
});
})();
Then later
img.setAttribute('src', __DIR__ + '/' + file);

if you have folders:
/webroot
/images
/scripts
Then images would be an absolute path of /images/whatever.jpg and scripts would be an absolute path of /scripts/js.js

I'm using the following method to get the base URL and using it for loading the other prototypes; maybe this is what you need. Let's say the current script name is 'clone.js'.
/*
* get the base URL using current script
*/
var baseURL = '';
var myName = 'clone.js';
var myPattern = /(^|[\/\\])clone\.js(\?|$)/;
var scripts = document.getElementsByTagName("script");
for (var i = 0; i < scripts.length; i++) {
var src;
if (src = scripts[i].getAttribute("src")) {
if (src.match(myPattern)) {
baseURL = src.replace(myName, '');
break;
}
}
}
Var baseURL should contain what you need.

The path to the JS is irrelevant; links in the HTML file are always relative to the HTML file, even if you modify them from external JS.
[EDIT] If you need to build a path relative to the current web page, you can find its path in document.location.pathname. This path is relative to the web root but you should be able to find a known subpath and then work from there.
For example, for this page, its pathname would be /posts/1858724. You can look for posts and then build a relative path from there (for example posts/../images/smiley.png)

I know this question was asked awhile back but I have a similar situation to Sam's.
In my case, I have two reasons for the situation:
The user can access different sub-domains, each with its own index page.
The user can enter a password that causes index.php to adjust the paths.
Most of the references point to the same src locations for the scripts, but some do not. For instance, those at a different level of the tree would require a different path.
I addressed it by assigning an id to the index page's script tag. For example, the head might include...
<script id='scriptLocation' type='text/javascript' language='javascript' src='../scripts.test/script.js'></script>
My JavaScript is then able to read the path...
var myPath = document.getElementById("scriptLocation").src;

Found another approach; perhaps someone with more JS ninja skills can flesh this out.
CSS stylesheets are able to find the node that included them using document.styleSheets[i].ownerNode.
I could not find a similar call for JavaScript files.
But in some cases one can include a CSS file together with the JavaScript and give its first rule some unique identifier.
One can then loop through all stylesheets until finding the one with the identifier [if(document.styleSheets[i].cssRules[0] == thisIs:myCSS)], then use ownerNode to get the path of that file, and assume the same path for the JS.
Convoluted and not very useful, but it's another approach - it might trigger a better idea from someone.
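A rough sketch of that loop, assuming the marker is the selector of the stylesheet's first rule (the #thisIsMyCSS selector and the function name are made up for illustration). Reading cssRules can throw for stylesheets served from another origin, hence the try/catch.

```javascript
// Find the stylesheet whose first rule uses `markerSelector` and return its URL.
function findMarkedStylesheet(sheets, markerSelector) {
    for (var i = 0; i < sheets.length; i++) {
        var rules;
        try {
            rules = sheets[i].cssRules; // may throw for cross-origin stylesheets
        } catch (e) {
            continue;
        }
        if (rules && rules.length > 0 && rules[0].selectorText === markerSelector) {
            return sheets[i].ownerNode ? sheets[i].ownerNode.href : null;
        }
    }
    return null;
}

// In the browser: findMarkedStylesheet(document.styleSheets, '#thisIsMyCSS')
```

The directory part of the returned URL can then be used as the base for the script's images, provided the CSS and JS files are shipped in the same folder.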
