My goal is to use the data in a JSON file ("data.json") for both my Java application and my node.js application.
The Java application reads the JSON file and rewrites it, only with additional or edited JSON objects.
I would like also the node.js application to read the JSON file and rewrite the file with additional or edited JSON objects.
What is the best way to do this? What are my options for collaborating, syncing and queuing changes from each source?
Thanks
Obviously to protect the file from corruption on concurrent writes and reads you need to implement some lock system. Between the threads in a single process it's a trivial task, however between processes it's not that easy.
There's one trick that could work on UNIX systems and it's based on the fact that file rename procedure is atomic in unix. Based on the file renaming you could implement a simple mutex, or even a more complicated read-write lock semantics. For a simple mutex use the following strategy:
1. When an app wants to use the file, check the data* file name. If it's data_locked_<some_pid>, wait (or subscribe to FS notifications so that you don't need to poll the FS and spend CPU time) and repeat step 1.
2. If the file name is data, attempt to rename the file to data_locked_<pid>. If the attempt succeeds, use the file. If it fails, another process got there first, so go back to step 1.
3. After you finish working with the file, rename it back to data.
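The steps above can be sketched in Node.js roughly as follows. This is a minimal sketch, not a hardened implementation: tryLock and unlock are hypothetical helper names, the file is assumed to be called data inside a given directory, and a crashed process would leave the file locked until someone cleans it up.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: try to take the lock by renaming "data" to
// "data_locked_<pid>". Returns the locked path, or null if the file
// is currently held by another process.
function tryLock(dir, pid = process.pid) {
  const free = path.join(dir, 'data');
  const locked = path.join(dir, `data_locked_${pid}`);
  try {
    // rename is atomic on POSIX file systems, so only one process can win.
    fs.renameSync(free, locked);
    return locked;
  } catch (err) {
    if (err.code === 'ENOENT') return null; // "data" is gone: someone else holds it
    throw err;
  }
}

// Hypothetical helper: release the lock by renaming the file back to "data".
function unlock(dir, pid = process.pid) {
  fs.renameSync(path.join(dir, `data_locked_${pid}`), path.join(dir, 'data'));
}
```

A caller would loop on tryLock (waiting between attempts or reacting to a file system notification), work on the returned path, and call unlock in a finally block.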
This simple mutex will keep the file locked for use by a single process. However, read operations can safely be performed concurrently, so you could optimise the solution further by making a ReadWriteLock:
For reads:
1. When the app wants to read the file, check if the file name is data or data_readlocked_<some_pid_list>. If it is, attempt to rename the file to data_readlocked_<original_pid_list + pid>. If the attempt failed, repeat step 1.
2. Read the file.
3. Rename the file so that its name no longer contains this process's pid. If this process's pid is the only one in the list, rename to data. If the rename attempt failed, repeat step 3 until it succeeds.
For write:
1. Check if the file name is data. If it is, rename to data_writelocked_<pid>; if that fails, repeat step 1. If the file name is anything other than data, wait and repeat step 1.
2. Write to the file.
3. Rename it back to data.
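The name transitions for the read lock are the fiddly part, so here is a small sketch of just that piece. The function names are hypothetical, and joining the pids with "_" is my assumption (the scheme above does not say how the pid list is encoded). The caller would pass the result to fs.renameSync(currentName, nextName) and start over if the rename fails because another process renamed the file first.

```javascript
const READ_PREFIX = 'data_readlocked_';

// Parse a file name into the list of reader pids it encodes.
// Returns [] for the unlocked name "data" and null for any other
// name (e.g. a write-locked file), meaning no read lock can be taken.
function readers(name) {
  if (name === 'data') return [];
  if (!name.startsWith(READ_PREFIX)) return null;
  return name.slice(READ_PREFIX.length).split('_').map(Number);
}

// Name the file should be renamed to when `pid` takes a read lock.
function nameAfterReadLock(name, pid) {
  const pids = readers(name);
  if (pids === null) return null;
  return READ_PREFIX + [...pids, pid].join('_');
}

// Name the file should be renamed to when `pid` releases its read lock.
function nameAfterReadUnlock(name, pid) {
  const pids = readers(name);
  if (pids === null) return null;
  const rest = pids.filter(p => p !== pid);
  return rest.length === 0 ? 'data' : READ_PREFIX + rest.join('_');
}
```

For example, two readers with pids 10 and 20 would move the name from data to data_readlocked_10, then data_readlocked_10_20, and back to data once both have released.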
I am uploading a single zip file at a time to an FTP server. If I increase the thread number to 5 it runs five times, but I can see only one file on the FTP server.
If you're uploading the same file to the same folder - the previous version will be replaced by the most recent version so your 5 threads are writing to the same file and you're seeing the result of the latest upload only.
The solution would be to use different file names for different threads using the __threadNum() function, like:
somefile-${__threadNum}.zip
this way the files will appear as:
somefile-1.zip
somefile-2.zip
etc.
so the current thread number will be added to the filename
However, on subsequent executions the files will be replaced again, so you can use unique names via the __UUID() function, like:
somefile-${__UUID}.zip
this way you will get filenames looking like:
somefile-0f573e42-c93a-4952-8378-2bdd34b44019.zip
Check out Apache JMeter Functions - An Introduction article for more information on JMeter Functions concept.
I have a file containing a very small amount of data which is being updated every 10 ms by my java program.
Would it be safe to read that file simultaneously in my javascript program?
It depends on your operating system and on the software that reads and writes the file. If the file is locked because you try to access it in the very small time window while it is being written, your read could fail. In that case you should simply build a loop that tries to open the file again until it succeeds.
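A retry loop like that could look roughly like this in Node.js. This is only a sketch: readWithRetry is a hypothetical helper, the attempt count and delay are arbitrary, and the set of error codes treated as "file is busy" is an assumption that varies by operating system.

```javascript
const fs = require('fs');

// Hypothetical helper: read a file, retrying a bounded number of times
// when the failure looks like a temporary lock held by the writer.
async function readWithRetry(file, attempts = 5, delayMs = 20) {
  for (let i = 1; ; i++) {
    try {
      return await fs.promises.readFile(file, 'utf8');
    } catch (err) {
      // Assumption: these codes indicate a transient lock; adjust for your OS.
      const retryable = ['EBUSY', 'EACCES', 'EPERM'].includes(err.code);
      if (!retryable || i >= attempts) throw err;
      await new Promise(resolve => setTimeout(resolve, delayMs));
    }
  }
}
```

Bounding the number of attempts keeps the reader from spinning forever if the file is permanently inaccessible. Note that a successful read can still return half-written content if the writer does not lock the file at all.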
More about file locking: https://en.wikipedia.org/wiki/File_locking
Instead you could also use a socket or a database.
I am building an application for the first time in node.
My website will include a static list of countries, music genres and so on...
Should I store the data in my database, or should I use a static json file with a list(countries, genre)?
My folder structure looks something like src\lib..scsss..server and so on.
My question ultimately is: is there a best practice for storing static lists in node? If a JSON file is preferred, where should it live in my folder structure?
If your data is static and is not going to change, then you should use the file system, which gives you faster reads than a DB server because there is no communication overhead.
Moreover, you can use filecache to cache all your static files, which will load the files even faster.
The answer is really that "it depends" upon some things you have not specified.
First off, if it is a list of data that does not change while your app is running (or changes very infrequently), then you don't want to load it from some remote source every time you need it. You will want to load it once and then keep that list in memory for subsequent use. This will be a lot more efficient for your server.
As to where to store the list in the first place, you have several choices that depend upon who is going to maintain that list and what level of programming skill they might have.
If the list of countries will not change often and will be maintained by a Javascript developer, then you can either put the list right into a Javascript literal in your code or in a JSON file in your file system. If choosing the latter option as a JSON file, it can be in the same directory as your Javascript source files and just loaded directly with require() upon startup.
If the list of countries will be maintained by someone who is not a Javascript developer, but can be trusted to follow JSON syntax rules, then you can put the list in a JSON file. Whether you put this file in the same directory as your JS files or in a separate data directory really depends more upon how your application is deployed, who has permission to do what, etc...
If the list of countries will be maintained by someone who has no idea about programming or syntax rules and should be modifiable completely independently from your code, then you may want to either put it in the database and build some sort of admin interface for modifying it or put it in a plain text file (one country per line) and then parse that file upon app startup.
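As a rough sketch of the "load once, keep in memory" idea for the two file-based options: loadJsonList and loadTextList are hypothetical helper names, and the file paths are whatever your deployment uses. For a JSON file next to your source, plain require() gives you much the same effect, since Node parses the file and caches the result after the first call.

```javascript
const fs = require('fs');

// In-memory cache: file path -> parsed list. Each file is read only once.
const listCache = new Map();

// Hypothetical helper: load a JSON file containing an array (e.g. countries).
function loadJsonList(file) {
  if (!listCache.has(file)) {
    listCache.set(file, JSON.parse(fs.readFileSync(file, 'utf8')));
  }
  return listCache.get(file);
}

// Hypothetical helper: load a plain text file with one entry per line,
// ignoring blank lines and surrounding whitespace.
function loadTextList(file) {
  if (!listCache.has(file)) {
    const lines = fs.readFileSync(file, 'utf8')
      .split(/\r?\n/)
      .map(line => line.trim())
      .filter(line => line.length > 0);
    listCache.set(file, lines);
  }
  return listCache.get(file);
}
```

The synchronous reads are fine here because they happen once at startup. If the list can change while the app is running, you would need some way to clear the cache.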
I have a directory where some other programs write XML files that I have to process when they're complete.
Until now I avoided handling incompletely written files by asking the writing programs to first use a temporary name and only rename the files to ".xml" at the end. My code looks like this:
var fs = require("fs");
// args.in is the watched input directory (args is parsed elsewhere)
var handleFiles = function(){
    fs.readdirSync(args.in).forEach(function(filename) {
        // skip files that have not been renamed to .xml yet
        if (filename.slice(-4)!=='.xml') return;
        // handle XML file here
    });
}
fs.watch(args.in, handleFiles);
But some new programs I have to support are unable to write with a temporary name.
How can I ensure I handle the files only when they're completely written, in a way that is efficient, reliable, cross-platform (Windows & Linux) and not timeout-based (i.e. not testing a rename every 10 ms until it works)?
Writing operations are one-shot, so I guess what I want (for linux and more importantly for windows) is to be notified when there are new files not being write-locked.
On linux you could use the inotify facilities. See inotify(7).
Maybe using incrontab could be worthwhile.
One technique for dealing with this is to use a zero-byte "signal file" which is written after the main file is done. For example
mybigfile.txt
mybigfile.txt.done
The ".done" file is 0 bytes long and is created by a "touch" or similar process when it has finished writing the data file. You scan for the .done file and don't do anything if it is missing. Unfortunately you imply that you can't control the writing processes, so this might not be a solution you can use.
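If the writers can be made to cooperate, the scanning side of the signal-file approach might look like this in Node.js. It is a sketch only: completedFiles is a hypothetical helper, and the ".done" suffix follows the example above.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: list the data files in `dir` that already have a
// matching "<name>.done" signal file, i.e. those that are safe to process.
function completedFiles(dir) {
  const names = new Set(fs.readdirSync(dir));
  return [...names]
    .filter(name => !name.endsWith('.done') && names.has(name + '.done'))
    .sort()
    .map(name => path.join(dir, name));
}
```

You could call this from the fs.watch callback instead of a timer, and delete or rename both files once a data file has been processed so it is not picked up twice.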
I am developing an html page referencing a large Javascript file (1MB+) which is seldom modified. From here, I get that the js file will not be resent if not modified.
My question is: how does Apache check whether an FTP-uploaded JavaScript file has been modified? Is it from the file's timestamp? If not, where does it get this information? I want to understand the process so I can control performance issues.
For static files, a call to stat() is typically used to check if the file size or modification time has changed.
The Caching Guide goes into detail and also contains the above reference in the section A Brief Guide to Conditional Requests.
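To make the idea concrete, here is a small Node.js sketch of a stat()-based check against an If-Modified-Since date. This is not Apache's code, just an illustration of the principle: isModifiedSince is a hypothetical helper, and it looks only at the modification time, not the file size.

```javascript
const fs = require('fs');

// Hypothetical helper: decide whether `file` changed after the date the
// client sent in If-Modified-Since. Returns true when the file should be
// sent again, false when a "304 Not Modified" would be enough.
function isModifiedSince(file, ifModifiedSince) {
  const mtime = fs.statSync(file).mtime;
  const since = new Date(ifModifiedSince);
  if (Number.isNaN(since.getTime())) return true; // unparseable date: send the file
  // HTTP dates have one-second resolution, so compare whole seconds.
  return Math.floor(mtime.getTime() / 1000) > Math.floor(since.getTime() / 1000);
}
```

Since an FTP upload rewrites the file, its modification time changes and a check like this would report it as modified.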