Alfresco bulk upload of images with metadata in XML


I am very new to Alfresco and have been tasked with uploading a lot of images together with their metadata. The metadata is in XML format. For example, the first batch has 10,000 files. Each image is named imagename.tiff (employee1.tiff, employee2.tiff, etc.), and the XML contains the employee's metadata, e.g. employee number, name, department, date employed, etc. I have created custom aspects that correspond to the metadata, but my problem is how to index the metadata into Alfresco.
I have already created a listening folder: when I drop my files there, they are picked up and placed in the right space. The remaining issue is to index the metadata for each uploaded image.
Kindly assist.

An easy way to ingest in bulk, with metadata, is to use Alfresco's Bulk Import Tool.
If you are on Alfresco Enterprise, then I recommend that you use the In-place Bulk Import. However, if you do use the In-place Bulk Import, you'll want to distribute the files efficiently on the filesystem and also keep the number of children per folder limited to a few thousand, as recommended by Alfresco.
Regarding the metadata in XML: you'll need to transform the XML into the format the tool is looking for. Details are available here: http://docs.alfresco.com/4.2/topic/com.alfresco.enterprise.doc/concepts/bulk-import-prepare-filesystem.html
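
As a rough illustration of that transformation, the Python sketch below writes a shadow metadata file next to each image, in the Java XML properties format that the Bulk Import Tool reads (a file named like employee1.tiff.metadata.properties.xml). The source XML layout (an employee root element with employeeNo, name, department and dateEmployed children), the emp: prefix and the emp:employeeInfo aspect are assumptions made up for this example; substitute the names from your own files and content model.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical mapping: source XML element name -> property in your custom model.
FIELD_MAP = {
    "employeeNo": "emp:employeeNo",
    "name": "emp:name",
    "department": "emp:department",
    "dateEmployed": "emp:dateEmployed",  # date values may need to be ISO 8601
}
ASPECTS = "emp:employeeInfo"  # hypothetical custom aspect name

HEADER = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">\n'
)


def build_shadow_xml(source_xml: str) -> str:
    """Turn one employee XML document into bulk-import metadata XML."""
    src = ET.fromstring(source_xml)
    props = ET.Element("properties")
    # Node type and aspects come first, then one entry per mapped property.
    for key, value in (("type", "cm:content"), ("aspects", ASPECTS)):
        ET.SubElement(props, "entry", key=key).text = value
    for tag, prop in FIELD_MAP.items():
        text = src.findtext(tag)
        if text is not None:  # skip fields missing from the source XML
            ET.SubElement(props, "entry", key=prop).text = text.strip()
    return HEADER + ET.tostring(props, encoding="unicode")


def write_shadow_file(image_path: Path, source_xml_path: Path) -> Path:
    """Write <image name>.metadata.properties.xml beside the image."""
    target = image_path.with_name(image_path.name + ".metadata.properties.xml")
    xml_text = build_shadow_xml(source_xml_path.read_text(encoding="utf-8"))
    target.write_text(xml_text, encoding="utf-8")
    return target
```

Calling write_shadow_file(Path("employee1.tiff"), Path("employee1.xml")) for each image/XML pair before running the import would leave the directory in the shape the tool expects, assuming the XML files share the image's base name.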

Related

Shell/Python/JS/whatever script to remove metadata from Excel files in Linux

I've tried mat2 and found Metadata Cleaner, but underneath it relies on mat2 and it doesn't work (mat2 removes the data in the sheets, Metadata Cleaner removes the file entirely).
I was trying to find a universal solution to remove metadata from any kind of file (remove the file creator name, etc.), but it seems that removing metadata just from Excel files is already tricky.
I feel I'm missing something, because anonymizing personal data in publicly available information should be a common task.

File supported by BigchainDB

BigchainDB supports the JSON file format. Is there any way to store digital assets such as images or documents?

You can convert any file (seen as a sequence of bits) into JSON, for example, by converting it to base64, which can then be included as the value in a JSON document.
While that is possible, it's not a good idea. BigchainDB is more for storing metadata (which you might want to search or query), not for storing big files.
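
To make the base64 idea concrete, here is a minimal Python sketch that wraps a file's bytes in a JSON document and unwraps them again. The field names (filename, encoding, data) are arbitrary choices for this example, not a BigchainDB schema, and nothing here talks to BigchainDB itself.

```python
import base64
import json


def file_bytes_to_json(filename: str, data: bytes) -> str:
    """Wrap raw bytes in a JSON document using base64 text."""
    return json.dumps({
        "filename": filename,  # illustrative field names, not a fixed schema
        "encoding": "base64",
        "data": base64.b64encode(data).decode("ascii"),
    })


def json_to_file_bytes(document: str) -> bytes:
    """Recover the original bytes from a document made above."""
    return base64.b64decode(json.loads(document)["data"])
```

Base64 turns every 3 bytes into 4 characters, so the encoded payload is roughly a third larger than the original file, which is one more reason to keep large files out of the database.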

BigchainDB can interoperate with IPFS. When you add a file to IPFS, it returns a unique hash; store that hash in BigchainDB as part of the asset data or as metadata of the asset. You can then query BigchainDB for the hash and use it to retrieve the file from IPFS.

Is there a best practice for storing and retrieving a static list in Node

I am building an application for the first time in Node.
My website will include a static list of countries, music genres and so on...
Should I store the data in my database, or should I use a static JSON file with a list (countries, genres)?
My folder structure looks something like src\lib..scsss..server and so on.
My question ultimately is: is there a best practice for storing static lists in Node? If a JSON file is preferred, where should it live in my folder structure?

If your data is static and is not going to change, then you should use the file system, which gives a higher read/write rate than a DB server because it avoids the communication overhead.
Moreover, you can use filecache to cache all your static files, which will load them even faster.

The answer is really that "it depends" upon some things you have not specified.
First off, if it is a list of data that does not change while your app is running (or changes very infrequently), then you don't want to load it from some remote source every time you need it. You will want to load it once and then keep that list in memory for subsequent use. This will be a lot more efficient for your server.
As to where to store the list in the first place, you have several choices that depend upon who is going to maintain that list and what level of programming skill they might have.
If the list of countries will not change often and will be maintained by a JavaScript developer, then you can either put the list right into a JavaScript literal in your code or in a JSON file in your file system. If choosing the latter option as a JSON file, it can be in the same directory as your JavaScript source files and just loaded directly with require() upon startup.
If the list of countries will be maintained by someone who is not a JavaScript developer, but can be trusted to follow JSON syntax rules, then you can put the list in a JSON file. Whether you put this file in the same directory as your JS files or in a separate data directory really depends more upon how your application is deployed, who has permission to do what, etc.
If the list of countries will be maintained by someone who has no idea about programming or syntax rules and should be modifiable completely independently from your code, then you may want to either put it in the database and build some sort of admin interface for modifying it or put it in a plain text file (one country per line) and then parse that file upon app startup.

Read and Write DOCX file

I have 2 docx files that I am working with. One docx file contains text information of a product (start serial number, length, width, and height). The other docx file contains a sticker label with an image and all of the text information from the first file.
This is what I do currently:
I open the first docx file and copy all of the text information (serial, length, width, and height)
Then I paste each info into the second docx file that contains the formatted label.
If I need to make more than one label, I copy the label and increment the serial number by 1.
This takes a lot of time when making several labels for different products. My goal is to come up with an easier way to take data from one docx and inject it into the other, and to generate more labels when needed.
My first thought was to extract the docx file to get its XML contents, then read the data using JavaScript, C++, or any other language, ask the user how many labels to generate, manipulate the XML, and repack it as a docx file.
Then I thought about trying to use the Microsoft Office "mail merge" feature, but I have never done this before.
I would like to know if anyone has any suggestions for an easy way of importing data from one docx file and generating labels in another.
I am open to any suggestion.
Also, I am not a professional programmer. I am an undergraduate computer engineering student with some experience in c, c++, java, javascript, python, MIPS assembly, and php.

The only open-source (and probably easier to come by) solution I know of is:
http://poi.apache.org/
http://poi.apache.org/document/quick-guide-xwpf.html
This is a good bet when it comes to speed and it is free software.
But if you open a file, alter it and save it again - the result can be flaky: The formatting can be slightly off. At least in my tests with the pptx counterpart.
I reckon that if you need user interaction (a web page?) in order to create the document, you can build a small HTTP API around the library.
There is also: http://www.docx4java.org/trac/docx4j - which I have not tested yet.
You can also go the C#/Redmond way: How do I create the .docx document with Microsoft.Office.Interop.Word?
The Interop (2nd Example in the first answer of the question above) way gives the best result when it comes to the accuracy of the formatting. Basically when you open a file with Interop - it will look the same when you alter and save it. But you cannot use this when interacting with a user - because it starts a separate MS Office process - and I would not count on this from my own user experience. But if you want to generate these files as a batch in a single user session - it will deliver a good result.
I cannot comment on the "OpenXML SDK" library described in the above SO question.
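
For the unzip, edit and repack route mentioned in the question, a minimal sketch using only Python's standard library might look like the following. It assumes the label template contains placeholders written as {{serial}}, {{length}} and so on; that placeholder syntax is invented for this example. It also assumes each placeholder sits inside a single text run, which Word does not guarantee, so retype the placeholders in one go if a replacement is missed.

```python
import zipfile
from xml.sax.saxutils import escape


def fill_docx_template(template, output, values):
    """Copy a .docx, replacing {{key}} placeholders in word/document.xml.

    template and output may be file paths or binary file-like objects.
    """
    with zipfile.ZipFile(template) as src, \
            zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "word/document.xml":
                # The main document part is assumed to be UTF-8 encoded.
                xml = data.decode("utf-8")
                for key, value in values.items():
                    # Escape &, < and > so the result stays valid XML.
                    xml = xml.replace("{{" + key + "}}", escape(str(value)))
                data = xml.encode("utf-8")
            # Every other part (styles, images, ...) is copied unchanged.
            dst.writestr(item, data)


def make_labels(template_path, info, start_serial, count):
    """Write one label file per serial number, incrementing by 1."""
    for i in range(count):
        serial = start_serial + i
        fill_docx_template(template_path, f"label_{serial}.docx",
                           dict(info, serial=serial))
```

For example, make_labels("label_template.docx", {"length": 10, "width": 5, "height": 2}, 1001, 3) would write label_1001.docx through label_1003.docx; the file names are placeholders as well.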

What about Open XML (https://www.youtube.com/watch?v=rMnEl6JZ7I8) and the developer website http://openxmldeveloper.org/?
On the site you will find SDKs and resources for:
Open XML SDK for JavaScript: http://openxmldeveloper.org/wiki/w/wiki/open-xml-sdk-for-javascript.aspx. Demo: http://openxmldeveloper.org/blog/b/openxmldeveloper/p/openxmlsdkjs_demo.aspx
Open XML and Java http://openxmldeveloper.org/blog/b/openxmldeveloper/archive/2006/11/21/openxmlandjava.aspx
.Net Resources http://openxmldeveloper.org/resources/dotnet/m/cc/default.aspx

Importing XML in CQ5.5

I want to import my data into CQ5.5, but the content loader option is not available in CQ5.5. How do I import XML in CQ5.5? Do we have to create bundles, or is there another way to do so?

I haven't verified if that module is in CQ5.5 or not, but I think the Sling JCR ContentLoader should work in CQ if you add it yourself, if it's that module that you mean.
Apart from that, one useful pattern is to drop XML files in a folder, observe that folder via JCR or Sling events and use any suitable XML parser or digester to process it. This gives you full flexibility and using the right parser should allow you to process XML files of arbitrary sizes. The scenario is similar to how Sling's espblog sample detects and processes image files to create thumbnails.
You could also use CQ's workflow engine to detect XML files in specific folders and trigger workflow steps to process/import them; that might give you a better view on things via CQ's workflow console.
