I need a NodeJS script I wrote to run every 10 minutes and grab data from an API. I used to be a Unix admin and something like this would be accomplished with a cron job. I know I'm going to have to set up some kind of scheduled execution on the server where my script resides. What's the best way to approach this?
I think node-cron would solve your problem.
"I know I'm going to have to set up some kind of scheduled execution on the server where my script resides"
If you have just one script for now and don't expect to add many more, you can simply keep it in your main repository with its own configuration.
Alternatively, you can set up a separate repository just for your scripts, which gives you a lot more control over how you run them, what language you write them in, who can access your code, and so on.
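For the 10-minute polling case, one option that needs no extra package is Node's built-in setInterval. The sketch below is only an illustration: startPolling is a made-up helper name, the URL in the usage comment is a placeholder, and the global fetch assumes Node.js 18 or later. It runs the task once at startup, repeats it on the interval, and skips a tick if the previous run is still in flight.

```javascript
// Minimal polling sketch using only Node's built-in timers.
// `startPolling` is a hypothetical helper name, not a library API.
function startPolling(task, intervalMs) {
  let running = false;

  async function tick() {
    if (running) return; // previous run still in flight: skip this tick
    running = true;
    try {
      await task();
    } catch (err) {
      console.error('Polling task failed:', err);
    } finally {
      running = false;
    }
  }

  tick(); // run once right away instead of waiting a full interval
  const timer = setInterval(tick, intervalMs);
  return () => clearInterval(timer); // call this to stop polling
}

// Example usage (placeholder URL; global fetch assumes Node.js 18+):
// const stop = startPolling(async () => {
//   const res = await fetch('https://api.example.com/data');
//   console.log(await res.json());
// }, 10 * 60 * 1000);
```

The process has to stay alive for this to work, so it is usually paired with some kind of process supervisor. A system cron entry that starts a short-lived script every 10 minutes is the alternative if you would rather not keep a process running.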
I have a Node.js server that runs some scripts and parses files for my client to access. The data sits at certain URLs and I fetch it on the client side. The problem is that the data goes out of date, so the scripts need to be run again. Is there a way to run these scripts on my server every, say, 15 minutes and post the results to the URL?
What you need is job scheduling. There are a few ways to achieve it:
You can handle it inside your Node.js application:
Use setInterval to call a function repeatedly. Keep in mind that Node.js timers do not guarantee exact timing: the event loop can delay a callback, so a run may start later than scheduled.
Use a job scheduling library that lets you schedule recurring tasks. A quick search turns up two: node-schedule and node-cron.
Or you can create a system cron job that triggers your Node.js app to handle the recurring task.
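To illustrate the point about timer delays, here is a rough sketch of one way to keep in-process runs from drifting: instead of a fixed setInterval, each run is re-armed with setTimeout using the time left until the next slot boundary. The helper names msUntilNextSlot and scheduleAligned are invented for this example, and slots are counted from the Unix epoch, so a 15-minute slot lines up with :00, :15, :30 and :45 in UTC.

```javascript
// Sketch: align runs to clock boundaries so that event-loop delays
// do not accumulate from one run to the next.
// `msUntilNextSlot` and `scheduleAligned` are hypothetical helper names.
function msUntilNextSlot(nowMs, slotMs) {
  return slotMs - (nowMs % slotMs);
}

function scheduleAligned(task, slotMs) {
  let timer = null;
  let stopped = false;

  function arm() {
    if (stopped) return;
    timer = setTimeout(async () => {
      try {
        await task();
      } catch (err) {
        console.error('Scheduled task failed:', err);
      }
      arm(); // re-arm from the current clock, not from the previous deadline
    }, msUntilNextSlot(Date.now(), slotMs));
  }

  arm();
  return () => {
    stopped = true;
    clearTimeout(timer);
  };
}

// Example usage: scheduleAligned(refreshData, 15 * 60 * 1000);
// (`refreshData` is a placeholder for your own function.)
```

A late run still starts late, but the next one is scheduled against the clock again, so the error does not pile up over a day of runs.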
I have some .js files that scrape information from web pages with Cheerio. I want to give them something like a setTimeout with a one-day period so that they rerun themselves, whether or not there is new data. I guess I shouldn't do this with setTimeout, because I'll have 15-20 bot files fetching data. Should I use threads instead, and if so, how would I run them like a service?
I would recommend using cron for node. It's an implementation of cron and is really simple to use. It lets you schedule tasks to run when you want them to, without overloading your server with setTimeout calls, although from what you say you won't have many, so it won't make much of a difference.
Actually, 15-20 sounds fine to me for setTimeout.
I thought you might want to check some cron tools like: https://www.npmjs.com/package/node-schedule and then schedule your crawlers to rescan targets as you need, as this would be more efficient.
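As a rough illustration of why 15-20 timers are not a problem, the sketch below registers one daily timer per scraper inside a single long-running process. scheduleDaily and the job names in the usage comment are made up for the example; it uses plain setInterval rather than any of the cron packages mentioned above.

```javascript
// Sketch: one timer per scraper, all inside a single long-running process.
// `scheduleDaily` and the job names in the usage comment are hypothetical.
const DAY_MS = 24 * 60 * 60 * 1000;

function scheduleDaily(jobs, periodMs = DAY_MS) {
  const timers = new Map();
  for (const [name, job] of Object.entries(jobs)) {
    const timer = setInterval(() => {
      // The promise wrapper catches both thrown errors and rejected promises,
      // so one failing scraper does not take the others down.
      Promise.resolve()
        .then(() => job())
        .catch((err) => console.error(`${name} failed:`, err));
    }, periodMs);
    timers.set(name, timer);
  }
  return {
    names: [...timers.keys()],
    stopAll() {
      for (const timer of timers.values()) clearInterval(timer);
      timers.clear();
    },
  };
}

// Example usage:
// const scheduler = scheduleDaily({ siteA: scrapeSiteA, siteB: scrapeSiteB });
```

Each pending timer is just an entry in Node's timer list, so a couple of dozen of them cost next to nothing. The scrapers themselves are what you need to watch if they all start at the same moment.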
I'm developing an app that should receive a .CSV file, save it, scan it, insert the data of every record into the DB, and finally delete the file.
With a file of about 10000 records there are no problems, but with a larger file the PHP script runs correctly and all the data is saved into the DB, yet the browser shows ERROR 504 "The server didn't respond in time."
I'm scanning the .CSV file with the PHP function fgetcsv().
I've already edited the settings in the php.ini file (max execution time (120), etc.) but nothing changes: after 1 minute the error is shown.
I've also tried using a JavaScript function to show an alert every 10 seconds, but the error is shown in this case too.
Is there a solution to avoid this problem? Is it possible to pass some data from the server to the client every few seconds to avoid the error?
Thanks
It's typically when scaling issues pop up that you need to start evolving your system architecture and make your application work asynchronously. The problem you are having is very common (some of my team are dealing with one as I write), and everyone needs to deal with it eventually.
Solution 1: Cron Job
The most common solution is to create a cron job that periodically scans a queue for new work to do. I won't explain the nature of the queue since everyone has their own; some are alright and others are really bad. Typically it involves a DB table with relevant information and a job status (<-- one of the bad solutions) or a solution involving Memcached; MongoDB is also quite popular.
The "problem" with this solution is ultimately again "scaling". Cron jobs run periodically at fixed intervals, so if a task takes a particularly long time jobs are likely to overlap. This means you need to work in some kind of locking or utilize a scheduler that supports running the job sequentially.
In the end, you won't run into the timeout problem, and you can typically dedicate an entire machine to running these tasks so memory isn't as much of an issue either.
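The locking mentioned above can be as simple as a lock file. The sketch below is written in JavaScript (Node.js) rather than PHP purely for illustration; acquireLock, releaseLock and runExclusive are hypothetical names. It relies on the 'wx' open flag, which fails if the file already exists, so only one run at a time gets past the check.

```javascript
const fs = require('fs');

// Sketch of a lock file that keeps overlapping cron runs from doing the same work.
// `acquireLock`, `releaseLock` and `runExclusive` are hypothetical helper names.
function acquireLock(lockPath) {
  try {
    // 'wx' creates the file and fails with EEXIST if it already exists.
    const fd = fs.openSync(lockPath, 'wx');
    fs.writeSync(fd, String(process.pid));
    fs.closeSync(fd);
    return true;
  } catch (err) {
    if (err.code === 'EEXIST') return false; // another run holds the lock
    throw err;
  }
}

function releaseLock(lockPath) {
  fs.rmSync(lockPath, { force: true });
}

// Runs `work` only if no other run holds the lock; returns whether it ran.
function runExclusive(lockPath, work) {
  if (!acquireLock(lockPath)) return false;
  try {
    work();
    return true;
  } finally {
    releaseLock(lockPath);
  }
}
```

One caveat with this simple form: if the process is killed before it releases the lock, the file stays behind and blocks later runs until someone removes it, so a real setup would need some way to detect stale locks.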
Solution 2: Worker Delegation
I'll use Gearman as an example for this solution, but other tools implement standards such as AMQP, for example RabbitMQ. I prefer Gearman because it's simpler to set up and it's designed more for work processing than for messaging.
This kind of delegation has the advantage of running immediately after you call it. The server is basically waiting for stuff to do (not unlike an Apache server). When it gets a request, it shifts the workload from the client onto one of your "workers": scripts you've written that run indefinitely, listening to the server for work.
You can have as many of these workers as you like, each running the same or different types of tasks. This means scaling is determined by the number of workers you have, and this scales horizontally very cleanly.
Conclusion:
In my opinion cron jobs are fine for automated maintenance, but they run into problems when work needs to run concurrently, which makes running workers the ideal choice.
Either way, you are going to need to change the way users receive feedback on their requests. They will need to be informed that their request is processing and to check back later for the result. Alternatively, you can periodically track the status of the running task to provide real-time feedback to the user via ajax. That's a little tricky with cron jobs, since you will need to persist the state of the task during its execution, but Gearman has a nice built-in solution for doing just that.
http://php.net/manual/en/book.gearman.php
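To make the status-tracking idea concrete, here is a small sketch of the bookkeeping involved, written in JavaScript for illustration rather than with Gearman's own API. All names are hypothetical, and the in-memory Map only stands in for whatever persistent store the background job and the web request would share.

```javascript
// Sketch of job-status tracking so a client can poll for progress.
// The in-memory Map stands in for a shared persistent store
// (a DB table, Memcached, ...); all names here are hypothetical.
const jobStore = new Map();
let nextJobId = 1;

function createJob(totalRows) {
  const id = String(nextJobId++);
  jobStore.set(id, { status: 'queued', processed: 0, total: totalRows });
  return id;
}

// Called by the background worker as it works through the file.
function reportProgress(id, processedRows) {
  const job = jobStore.get(id);
  if (!job) throw new Error(`Unknown job ${id}`);
  job.processed = processedRows;
  job.status = processedRows >= job.total ? 'done' : 'running';
}

// What an ajax status endpoint could return for a given job id.
function getStatus(id) {
  const job = jobStore.get(id);
  if (!job) return { status: 'unknown' };
  const percent = job.total === 0 ? 100 : Math.floor((job.processed / job.total) * 100);
  return { status: job.status, percent };
}
```

The upload request would call createJob and return the id straight away; the page then polls the status endpoint with that id until it reports done.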
I'm developing a website where I need to execute some code at a particular time.
Which is the faster and better choice: writing a cron job, using a JavaScript timing event, or something similar?
You are asking a question about two completely different things.
A cron job runs on the server; JavaScript (unless you are using Node.js) runs on the client. Depending on whether this is a task that:
1. must be performed and cannot be left to the client (e.g. the data is sensitive), or
2. can depend on client execution (which means the browser window must remain open and JavaScript must be enabled),
choose cron (1) or JavaScript (2) respectively.
It is really like a comparison between apples and oranges. Unless you tell us whether you want orange juice or apple pie, we won't be able to help you more. Just remember that cron is for more reliable server-side task execution, while a JavaScript timeout is per-user (or rather per-client) and less reliable.
It entirely depends on the nature of the code you need to execute at a particular time.
If it's something that has to happen every day at 2pm or whatever, regardless of whether or not anyone's looking at the website, then you should use a cron job for that.
On the other hand, if it's something that needs to execute at a certain time per user (e.g., to automatically log a user out of a page after some amount of idle time), then the appropriate choice is JavaScript timing functions.
Javascript timing functions will only work if someone's actually looking at the page, and then it'll be called multiple times for multiple users, which may or may not be desirable depending on your situation.
Of course, you may be running Node.js on the server, in which case you can use Javascript timing functions as if they were a cron job.
In short, use cron
In your comment to another answer, you said:
I want to execute php function every week one time
In this case, you have one main option (assuming you are using *nix) and that is cron (I don't know what the Windows alternative is). Cron is specifically designed for this function, and whether or not you choose to use it, it is most likely running on your server anyway (for other system functions) so speed is not an issue.
Don't use Node.js
Node.js is an alternative serverside technology to PHP. You would use it server side instead of php. If you're already using PHP, then forget it. The only reason Node.js has been mentioned is because you've asked about JavaScript.
Also, for a weekly timing event, a JavaScript timer wouldn't be a good idea. The setTimeout() function takes its delay in milliseconds and is good for working in seconds and minutes (possibly hours), but not weeks: the timer exists only in memory and is lost whenever the process restarts.
If you were to use serverside JavaScript (like Node.js), you would probably need to do something similar to the PHP Alternative below.
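For reference, here is the arithmetic behind timer delays in Node.js, as a small sketch. Node stores the delay as a 32-bit signed integer of milliseconds, and its documentation says a delay larger than 2147483647 is set to 1. The constant and helper names are my own.

```javascript
// Node.js stores timer delays as a 32-bit signed integer of milliseconds.
const MAX_TIMER_DELAY_MS = 2 ** 31 - 1; // 2147483647 ms, roughly 24.8 days

const DAY_MS = 24 * 60 * 60 * 1000;
const WEEK_MS = 7 * DAY_MS;

// Hypothetical helper: can this delay be passed to setTimeout as-is?
function fitsInTimer(delayMs) {
  return delayMs >= 1 && delayMs <= MAX_TIMER_DELAY_MS;
}

console.log(WEEK_MS);                  // 604800000
console.log(fitsInTimer(WEEK_MS));     // true
console.log(fitsInTimer(30 * DAY_MS)); // false
```

So a one-week delay does fit numerically, but anything beyond about 24.8 days does not, and in either case the timer only exists while the process is running.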
PHP Alternative
Of course, depending on your hosting environment (especially cheaper ones), cron may not be available. In this case you would have to come up with a different strategy, and you would probably be best to use PHP. Something that I've seen done before goes along these lines:
1. Have some register of jobs that need to be performed. (In a database, or a file, or whatever)
2. Every time you run your main PHP script (usually index.php), check the register to see if there are any outstanding jobs.
2.a. Run the job.
2.b. Update the register, so you remember the last time the job was performed.
Pros:
It works if you don't have access to cron.
Cons:
If your script is not run very often (because this method relies on people visiting your page), your jobs may not be run as often as you like.
If your script is run very often, you will suffer unnecessary overhead in your script.
If your jobs take a long time to run, it will affect the page load times.
You're basically replicating cron, but using PHP which is far less efficient than using cron.
It's unlikely (unless you invest a lot of time) that you'll develop a solution that is as good as cron.
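The register-based approach described above could look roughly like the sketch below. It is written in JavaScript only to keep the examples in one language; the idea itself is not tied to any language. The register is a plain array standing in for a database table or file, and runDueJobs and its field names are invented for the example.

```javascript
// Sketch of the "check the register on every request" approach.
// `register` is a plain array standing in for a DB table or file;
// `runDueJobs` and the field names are hypothetical.
function runDueJobs(register, nowMs) {
  const ran = [];
  for (const job of register) {
    if (nowMs - job.lastRunMs >= job.intervalMs) {
      job.run();             // step 2.a: run the job
      job.lastRunMs = nowMs; // step 2.b: remember when it last ran
      ran.push(job.name);
    }
  }
  return ran;
}

// Example usage at the top of the main request handler:
// runDueJobs(register, Date.now());
```

This also makes the cons visible: the check runs on every request, and a slow job.run() holds up whichever request happens to trigger it.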
For a javascript timing event to run you would need to open the webpage. That means you have to expose that page publicly. You don't want to do that. Cron jobs are easy and effective. I like them. You should do that.
My server guy wants to use apache, and also he wants to program a daemon to control automated file deletion, along with other automated file tasks. Can node.js automate the deletion of files if they reach a certain length of time? And in addition, can node.js time-stamp, because my server guy swears it cannot, and that "daemons are superior for automated file tasks!"
Thank you.
Sorry if my question isn't very comprehensible, I'm in a hurry to get this answered.
Yes, Node.js can do both.
For such tasks, I think a cron job that executes some script (it can be Node.js, PHP, Perl, sh, whatever) can work just fine.
In the end, it depends on your problem. A daemon sounds like overkill, but it might be the only approach.
One way to use node might be to use inotify/dnotify to be notified when files are created, then set callbacks with setTimeout to clear them, or keep a list and periodically iterate it and delete files.
The distinction between a daemon and Node is not correct: a Node application can be a daemon or a one-off script. Node keeps running as long as there are pending timers, open handles, or other events to wait for; once none remain, the script terminates.
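As a minimal sketch of the "periodically iterate and delete" option, the function below removes files whose last-modified timestamp is older than a given age. It uses only Node's built-in fs and path modules; deleteOlderThan is a hypothetical name, and the directory in the usage comment is a placeholder.

```javascript
const fs = require('fs');
const path = require('path');

// Sketch: delete regular files in `dir` whose last-modified time is older
// than `maxAgeMs`. `deleteOlderThan` is a hypothetical helper name.
// Returns the names of the files it removed.
function deleteOlderThan(dir, maxAgeMs, nowMs = Date.now()) {
  const removed = [];
  for (const name of fs.readdirSync(dir)) {
    const filePath = path.join(dir, name);
    const stats = fs.statSync(filePath); // stats.mtimeMs is the file's timestamp
    if (stats.isFile() && nowMs - stats.mtimeMs > maxAgeMs) {
      fs.unlinkSync(filePath);
      removed.push(name);
    }
  }
  return removed;
}

// Example usage (placeholder path): remove uploads older than one day.
// deleteOlderThan('/path/to/uploads', 24 * 60 * 60 * 1000);
```

The same function works whether it is called from a timer inside a long-running Node process or from a one-off script started by cron, which is the point made above about daemons and scripts.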