cron job vs Javascript timing event


I'm developing a website where I need to execute some code at a particular time.
Which is the faster and better choice: writing a cron job, using a JavaScript timing event,
or something similar in JavaScript?

You are asking a question about two completely different things.
A cron job runs on the server; JavaScript (unless you are using Node.js) runs on the client. It depends on whether this is a task that:
1. must be performed and cannot be left to the client (e.g. the data is sensitive), or
2. can depend on client-side execution (which means the browser window must remain open and JavaScript must be enabled).
Choose cron for (1) or JavaScript for (2).
It is really a comparison between apples and oranges. Unless you tell us whether you want orange juice or apple pie, we can't help you any further. Just remember that cron is for more reliable server-side task execution, while a JavaScript timeout is per-user (or rather per-client) and less reliable.

It entirely depends on the nature of the code you need to execute at a particular time.
If it's something that has to happen every day at 2pm or whatever, regardless of whether or not anyone's looking at the website, then you should use a cron job for that.
On the other hand, if it's something that needs to execute at a certain time per user (e.g., to automatically log a user out of a page after some amount of idle time), then the appropriate choice is the JavaScript timing functions.
JavaScript timing functions will only work if someone's actually looking at the page, and then they'll be called once per open page, i.e. multiple times for multiple users, which may or may not be desirable depending on your situation.
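As a rough illustration of the per-user case, here is a minimal sketch of an idle timer built on setTimeout and clearTimeout. The helper name createIdleTimer, the 15-minute timeout and the /logout URL are all made up for the example; treat it as one possible shape, not a finished logout mechanism.

```javascript
// Hypothetical helper: calls onIdle once timeoutMs passes without a reset().
function createIdleTimer(timeoutMs, onIdle) {
  let handle = null;

  // Restart the countdown; call this on every user interaction.
  function reset() {
    if (handle !== null) clearTimeout(handle);
    handle = setTimeout(onIdle, timeoutMs);
  }

  // Cancel the countdown entirely (e.g. when the user logs out manually).
  function stop() {
    if (handle !== null) clearTimeout(handle);
    handle = null;
  }

  reset();
  return { reset, stop };
}

// In a browser you might wire it up like this (the URL is an assumption):
// const timer = createIdleTimer(15 * 60 * 1000, () => { window.location.href = "/logout"; });
// ["mousemove", "keydown", "click"].forEach((evt) => document.addEventListener(evt, timer.reset));
```

Note that this only runs while the page is open, and a user can disable or tamper with it, so the server should still expire the session on its own.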
Of course, you may be running Node.js on the server, in which case you can use Javascript timing functions as if they were a cron job.

In short, use cron
In your comment to another answer, you said:
I want to execute php function every week one time
In this case, you have one main option (assuming you are using *nix) and that is cron (I don't know what the Windows alternative is). Cron is specifically designed for this function, and whether or not you choose to use it, it is most likely running on your server anyway (for other system functions) so speed is not an issue.
Don't use Node.js
Node.js is an alternative server-side technology to PHP. You would use it on the server instead of PHP. If you're already using PHP, then forget it. The only reason Node.js has been mentioned is because you've asked about JavaScript.
Also, for a weekly timing event, a JavaScript timer wouldn't be a good idea. The setTimeout() function takes its delay in milliseconds and is good for delays of seconds and minutes (possibly hours), but not weeks: the timer is lost as soon as the page or process that set it goes away.
If you were to use server-side JavaScript (like Node.js), you would probably need to do something similar to the PHP Alternative below.
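To make the "not weeks" point concrete, here is a small sketch that computes the delay until the next weekly run. The function name msUntilNextWeekly and the choice of UTC are assumptions for the example. In browsers and Node.js the setTimeout delay is held as a signed 32-bit integer, so anything above about 24.8 days is not honoured. A one-week delay fits, but it only survives as long as the process does.

```javascript
// Largest delay setTimeout honours (signed 32-bit milliseconds, about 24.8 days).
const MAX_TIMEOUT_MS = 2 ** 31 - 1;

// Milliseconds from `now` until the next occurrence of the given UTC weekday
// (0 = Sunday ... 6 = Saturday) at the given UTC hour.
function msUntilNextWeekly(now, dayOfWeek, hour) {
  const next = new Date(now.getTime());
  next.setUTCHours(hour, 0, 0, 0);
  const daysAhead = (dayOfWeek - next.getUTCDay() + 7) % 7;
  next.setUTCDate(next.getUTCDate() + daysAhead);
  // If that moment is not in the future, move on to the following week.
  if (next.getTime() <= now.getTime()) {
    next.setUTCDate(next.getUTCDate() + 7);
  }
  return next.getTime() - now.getTime();
}

// Usage sketch: setTimeout(runWeeklyJob, msUntilNextWeekly(new Date(), 0, 2));
// (runWeeklyJob is hypothetical; a restart of the process silently drops the timer.)
```

This is why cron is the safer choice here: the schedule lives outside your application, so restarts and crashes do not lose it.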
PHP Alternative
Of course, depending on your hosting environment (especially cheaper ones), cron may not be available. In this case you would have to come up with a different strategy, and you would probably be best to use PHP. Something that I've seen done before goes along these lines:
1. Have some register of jobs that need to be performed. (In a database, or a file, or whatever)
2. Every time you run your main PHP script (usually index.php), check the register to see if there are any outstanding jobs.
   a. Run the job.
   b. Update the register, so you remember the last time the job was performed.
Pros:
It works if you don't have access to cron.
Cons:
If your script is not run very often (because this method relies on people visiting your page), your jobs may not be run as often as you like.
If your script is run very often, you will suffer unnecessary overhead in your script.
If your jobs take a long time to run, it will affect the page load times.
You're basically replicating cron, but using PHP which is far less efficient than using cron.
It's unlikely (unless you invest a lot of time) that you'll develop a solution that is as good as cron.
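The register approach above can be sketched in a few lines. This version is in JavaScript rather than PHP, and the register is just an in-memory array; in a real site it would live in a database or file. The field names (name, intervalMs, lastRun, run) are invented for the illustration.

```javascript
// Run every job in the register that is due, then record when it ran.
// `now` is a timestamp in milliseconds (e.g. Date.now()).
function runDueJobs(register, now) {
  const ran = [];
  for (const job of register) {
    // A job is due if it has never run, or its interval has elapsed.
    const due = job.lastRun === null || now - job.lastRun >= job.intervalMs;
    if (!due) continue;
    job.run();          // step 2.a: run the job
    job.lastRun = now;  // step 2.b: remember the last time it was performed
    ran.push(job.name);
  }
  return ran;
}

// Usage sketch, called at the top of every request handler:
// runDueJobs(register, Date.now());
```

The cons listed above apply directly: the check costs something on every request, and a slow job.run() delays whichever visitor happens to trigger it.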

For a javascript timing event to run you would need to open the webpage. That means you have to expose that page publicly. You don't want to do that. Cron jobs are easy and effective. I like them. You should do that.

Related

Run jobs on FCFS basis in Nodejs from a database

I am developing a Node.js application in which a user can schedule a CPU-intensive job to be run. I am keeping the event loop free and want to run the job in a separate process. When the user submits the job, I make an entry in the database (PostgreSQL) with the timestamp and some other information. The jobs should be run in FCFS order. After some research on Stack Overflow, I found people suggesting Bulljs (with Redis), Kue, RabbitMQ, etc. as a solution. My question is why I need to use those when I can just poll the database and get the oldest job. I don't intend to poll the DB at a regular interval, but only when the current job has finished executing.
My application does not receive many simultaneous requests, and users do not wait for the job to complete. Instead, they log out and are notified by email when the job is done. What are the potential drawbacks of using the child_process module (spawn/exec) as a solution?

My question is why I need to use those when I can just poll the database and get the oldest job.
How are you planning on handling failures? What if Node.js crashes with a job mid-progress; would that affect your users? Would you then retry a failed job? How do you support back-off? How many attempts should be made before it stops completely?
These questions are answered in the Bull implementation, RabbitMQ and almost every solution you'll find for your current challenge.
From what I've seen, child_process is a lower-level building block in Node.js, meaning that a lot of the functionality you'll typically require (failover/backoff) isn't included. You'll have to implement this yourself.
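To give a feel for what "implement this" means, here is a minimal sketch of just the retry policy: exponential back-off with a cap on attempts. The function name nextRetryDelay and the default values are assumptions for the example, not taken from Bull or any other library.

```javascript
// Decide how long to wait before retrying a failed job.
// attemptsMade: how many times the job has already been tried (1 after the first failure).
// Returns the delay in milliseconds, or null when the job should be given up on.
function nextRetryDelay(attemptsMade, { baseDelayMs = 1000, maxAttempts = 5 } = {}) {
  if (attemptsMade >= maxAttempts) return null;   // stop completely
  return baseDelayMs * 2 ** (attemptsMade - 1);   // 1s, 2s, 4s, 8s, ...
}

// Usage sketch: after a child process exits with a non-zero code,
// const delay = nextRetryDelay(job.attempts);
// if (delay === null) markFailed(job); else setTimeout(() => retry(job), delay);
// (job, markFailed and retry are hypothetical.)
```

Even with this in place you still need to persist the attempt count and detect jobs that were running when the process crashed, which is the part that purpose-built queues handle for you.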
That's where it usually becomes more trouble than it's worth, although admittedly managing, monitoring and deploying a Redis server may not be the optimal solution either.
Have you considered a different approach? For example, how would a periodic cron job work?
The challenge with such a system is usually how you plan to handle failure and what impact failure has on your application and end-users.
I will say, in defense of Bull, that for a CPU-intensive task I prefer to have a separate instance of the worker process, which I can then re-deploy as many times as I need. This keeps my back-end code separated and generally easier to manage, whilst also giving me the ability to easily scale up/down when required.
EDIT: I mentioned "more trouble than it's worth". If you're looking to really learn how technology like this is developed, go with child_process and build your own abstractions on top. If it's something you need today, use Bull, RabbitMQ or any purpose-built alternative.

Node.js bot with Cheerio

I have some .js files that scrape information from web pages with Cheerio. What I want is to give them something like a setTimeout with a one-day period so that they rerun themselves, whether or not new data has arrived. I guess I shouldn't do this with setTimeout because I'll have 15-20 bot files fetching data. I should probably use threads, but how would I run them like a service?

I would recommend using cron for node; it's an implementation of cron and is really simple to use. It lets you schedule tasks to run when you want them to, without overloading your server with setTimeout calls, although from what you say you won't have many, so it won't make much of an impact.

Actually, 15-20 sounds fine to me for setTimeout.
You might still want to check out cron-style tools like https://www.npmjs.com/package/node-schedule and schedule your crawlers to rescan targets as needed, as this would be more efficient.
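For a handful of crawlers, the plain-timer approach could look like the sketch below. It uses only the built-in setInterval and clearInterval; the function name scheduleCrawlers and the idea of running each crawler once at startup are assumptions for the example, and each crawler is assumed to be an ordinary function.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Run every crawler once now, then again every intervalMs.
// Returns a function that cancels all the timers.
function scheduleCrawlers(crawlers, intervalMs = DAY_MS) {
  const handles = crawlers.map((crawl) => {
    crawl();                                // first run at startup
    return setInterval(crawl, intervalMs);  // then once per interval
  });
  return function stopAll() {
    handles.forEach((handle) => clearInterval(handle));
  };
}

// Usage sketch (the crawler modules are hypothetical):
// const stop = scheduleCrawlers([require("./siteA"), require("./siteB")]);
```

The process has to stay alive for this to work, so you would still want a process manager or service wrapper around it.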

Error 504, avoid it with some data passing from server to client?

I'm developing an app that should receive a .CSV file, save it, scan it, insert the data from every record into the DB, and delete the file at the end.
With a file of about 10000 records there are no problems, but with a larger file the PHP script runs correctly and all the data is saved into the DB, yet the browser shows ERROR 504 The server didn't respond in time.
I'm scanning the .CSV file with the PHP function fgetcsv().
I've already edited the settings in the php.ini file (max execution time (120), etc.) but nothing changes; after 1 minute the error is shown.
I've also tried using a JavaScript function to show an alert every 10 seconds, but the error is shown in this case as well.
Is there a solution to avoid this problem? Is it possible to pass some data from the server to the client every few seconds to avoid the error?
Thanks

It's typically when scaling issues pop up that you need to start evolving your system architecture, and your application will need to work asynchronously. The problem you are having is very common (some of my team are dealing with one as I write), and everyone needs to deal with it eventually.
Solution 1: Cron Job
The most common solution is to create a cron job that periodically scans a queue for new work to do. I won't explain the nature of the queue since everyone has their own, some are alright and others are really bad, but typically it involves a DB table with relevant information and a job status (<-- one of the bad solutions), or a solution involving Memcached, also MongoDB is quite popular.
The "problem" with this solution is ultimately again "scaling". Cron jobs run periodically at fixed intervals, so if a task takes a particularly long time jobs are likely to overlap. This means you need to work in some kind of locking or utilize a scheduler that supports running the job sequentially.
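One simple form of locking is a lock file that is created exclusively at the start of a run. The sketch below is in JavaScript (Node.js) rather than PHP; withLock and the lock path are invented names. It relies on fs.openSync with the "wx" flag, which fails with EEXIST if the file already exists.

```javascript
const fs = require("fs");

// Run `task` only if no other run currently holds the lock file.
// Returns true if the task ran, false if another run was already in progress.
function withLock(lockPath, task) {
  let fd;
  try {
    fd = fs.openSync(lockPath, "wx"); // create exclusively; throws if it exists
  } catch (err) {
    if (err.code === "EEXIST") return false; // previous run still going
    throw err;
  }
  try {
    task();
    return true;
  } finally {
    // Always release the lock, even if the task throws.
    fs.closeSync(fd);
    fs.unlinkSync(lockPath);
  }
}

// Usage sketch from a script started by cron (path and function are hypothetical):
// withLock("/tmp/import-csv.lock", processQueuedImports);
```

A crash can leave a stale lock file behind, so a production version would also need some way to detect and clear abandoned locks.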
In the end, you won't run into the timeout problem, and you can typically dedicate an entire machine to running these tasks so memory isn't as much of an issue either.
Solution 2: Worker Delegation
I'll use Gearman as an example for this solution, but other tools implement standards like AMQP, such as RabbitMQ. I prefer Gearman because it's simpler to set up, and it's designed more for work processing than for messaging.
This kind of delegation has the advantage of running immediately after you call it. The server is basically waiting for stuff to do (not unlike an Apache server); when it gets a request, it shifts the workload from the client onto one of your "workers". These are scripts you've written which run indefinitely, listening to the server for work.
You can have as many of these workers as you like, each running the same or different types of tasks. This means scaling is determined by the number of workers you have, and this scales horizontally very cleanly.
Conclusion:
Cron jobs are fine, in my opinion, for automated maintenance, but they run into problems when they need to work concurrently, which makes running workers the ideal choice.
Either way, you are going to need to change the way users receive feedback on their requests. They will need to be informed that their request is processing and to check later for the result. Alternatively, you can periodically track the status of the running task to provide real-time feedback to the user via Ajax. That's a little tricky with cron jobs, since you will need to persist the state of the task during its execution, but Gearman has a nice built-in solution for doing just that.
http://php.net/manual/en/book.gearman.php
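To show what "persisting the state of the task" might involve if you do it yourself, here is a small JavaScript sketch of a status store that a background task updates and a status endpoint reads. The in-memory Map stands in for a database table or cache, and all names (createStatusStore, start, progress, finish, get) are made up for the example. This is not how Gearman implements it.

```javascript
// Hypothetical status store shared by the worker and the status endpoint.
function createStatusStore() {
  const statuses = new Map();
  return {
    // Called when the import is queued.
    start(id, total) {
      statuses.set(id, { state: "processing", done: 0, total });
    },
    // Called by the worker every so often while it processes records.
    progress(id, done) {
      const status = statuses.get(id);
      if (status) status.done = done;
    },
    // Called by the worker when the last record has been inserted.
    finish(id) {
      const status = statuses.get(id);
      if (status) {
        status.done = status.total;
        status.state = "done";
      }
    },
    // Called by the endpoint that the page polls via Ajax.
    get(id) {
      return statuses.get(id) || { state: "unknown" };
    },
  };
}
```

The upload request can then return immediately with the task id, and the page polls the status endpoint until the state becomes "done", which avoids the 504 entirely.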

Background processes in Node.js

What is a good approach to handling background processes in a Node.js application?
Scenario: After a user posts something to an app I want to crunch the data, request additional data from external resources, etc. All of this is quite time consuming, so I want it out of the req/res loop. Ideal would be to just have a queue of jobs where you can quickly dump a job on and a daemon or task runner will always take the oldest one and process it.
In RoR I would have done it with something like Delayed Job. What is the Node equivalent of this API?

If you want something lightweight, that runs in the same process as the server, I highly recommend Bull. It has a simple API that allows for a fine grained control over your queues.
If you're familiar with Ruby's Resque, there is a Node implementation called Node-resque.
Bull and Node-resque are both backed by Redis, which is ubiquitous among Node.js worker queues. Either would be able to do what RoR's DelayedJob does; it's a matter of the specific features you want and your API preferences.

Background jobs are not directly related to your web service work, so they should not be in the same process. As you scale up, the memory usage of the background jobs will impact the web service performance. But you can put them in the same code repository if you want, whatever makes more sense.
One good choice for messaging between the two processes would be Redis, if dropping a message every now and then is OK. If you want "no message left behind", you'll need a more heavyweight broker like RabbitMQ. Your web service process can publish and your background job process can subscribe.
It is not necessary for the two processes to be co-hosted, they can be on separate VMs, Docker containers, whatever you use. This allows you to scale out without much trouble.

If you're using MongoDB, I recommend Agenda. That way, separate Redis instances aren't running and features such as scheduling, queuing, and Web UI are all present. Agenda UI is optional and can be run separately of course.
I would also recommend setting up a loosely coupled abstraction between your application logic and the queuing / scheduling system so the entire background processing system can be swapped out if needed. In other words, keep as much application / processing logic as possible away from your Agenda job definitions in order to keep them lightweight.
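A minimal sketch of that kind of abstraction is shown below. The handlers hold the application logic and know nothing about the queue, while the backend only needs enqueue and dequeue. Everything here (createJobQueue, createInMemoryBackend, the crunchPost handler) is hypothetical; the in-memory FIFO backend is a stand-in that you would replace with an adapter for Agenda, Bull or whatever you choose.

```javascript
// Application logic: plain functions with no knowledge of the queue system.
const handlers = {
  crunchPost(data) {
    return `crunched post ${data.postId}`;
  },
};

// Stand-in backend: a first-in, first-out list held in memory.
function createInMemoryBackend() {
  const pending = [];
  return {
    enqueue(name, data) { pending.push({ name, data }); },
    dequeue() { return pending.shift(); }, // undefined when empty
  };
}

// Thin layer the rest of the app talks to; swapping the backend
// does not touch the handlers or the calling code.
function createJobQueue(backend, jobHandlers) {
  return {
    add(name, data) {
      if (!Object.prototype.hasOwnProperty.call(jobHandlers, name)) {
        throw new Error(`Unknown job: ${name}`);
      }
      backend.enqueue(name, data);
    },
    // Take the oldest job and run its handler; returns null if nothing is queued.
    processNext() {
      const job = backend.dequeue();
      if (!job) return null;
      return jobHandlers[job.name](job.data);
    },
  };
}
```

In the request handler you would call something like queue.add("crunchPost", { postId }) and return straight away, leaving processNext to a separate worker loop or process.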

I'd like to suggest using Redis for scheduling jobs. It has plenty of different data structures, so you can always pick the one that best suits your use case.
You mentioned RoR and DJ, so I assume you're familiar with sidekiq. You can use node-sidekiq for job scheduling if you want to, but it's suboptimal IMO, since its main purpose is to integrate Node.js with RoR.
For daemonising workers I'd recommend using PM2. It's widely used and actively maintained. It solves a lot of problems (e.g. deployment, monitoring, clustering), so make sure it won't be overkill for you.

I tried bee-queue & bull and chose bull in the end.
I first chose bee-queue because it is quite simple and its examples are easy to understand, while bull's examples are a bit complicated. bee's wiki page "Bee Queue's Origin" also resonates with me. But the problems with bee are (1) their issue resolution time is quite slow, and their latest update was 10 months ago; (2) I can't find an easy way to pause or cancel a job.
Bull, on the other hand, frequently updates its code and responds to issues. The "Node.js job queue evaluation" said bull's weakness is "slow issues resolution time", but my experience is the opposite!
But anyway, their APIs are similar, so it is quite easy to switch from one to the other.

I suggest using a proper Node.js framework to build your app.
I think the most powerful and easy to use is Sails.js.
It's an MVC framework, so if you are used to developing in RoR, you will find it very easy!
If you use it, a powerful (in JavaScript terms) job manager is already available:
new sails.cronJobs('0 01 01 * * 0', function () {
    sails.log.warn("START ListJob");
}, null, true, "Europe/Dublin");
If you need more info, do not hesitate to contact me!

Method to 'compile' javascript to hide the source during page execution?

I wanted to hide some business logic and make the variables inaccessible. Maybe I am missing something but if somebody can read the javascript they can also add their own and read my variables. Is there a way to hide this stuff?

Any code which executes on a client machine is available to the client. Some forms of code are harder to access, but if someone really wants to know what's going on, you have no way to stop them.
If you don't want someone to find out what code is being run, do it on a server. Period.

That's one of the downsides of using a scripting language - if you don't distribute the source, nobody can run your scripts!
You can run your JS through an obfuscator first, but if anyone really wants to figure out exactly what your code is doing, it won't be that much work to reverse-engineer, especially since the effects of the code are directly observable in the first place.

JavaScript cannot be compiled into something that hides the source; whatever you ship to the browser is still JavaScript.
But, there's this: http://dean.edwards.name/packer/
Generally, this is used to reduce the code footprint of the Javascript, if say your script is being downloaded thousands of times per minute. There are other methods to accomplish this, but as for hiding the code this sort of works.
Granted, the code can be unpacked. This will keep out a novice but anyone who is determined to read your source code will find a way.
It is even this way with compiled languages, even when they have been obfuscated. It's impossible to hide your code 100% of the time -- if it executes on your machine, it can be read by a determined hacker.

You could obfuscate it so that it is much harder to read.
For example
http://daven.se/usefulstuff/javascript-obfuscator.html

You must always validate the data you send back. I've had a rather entertaining time playing pranks on a forum I'm a mod of by manipulating the pages with the Web Developer Toolbar. Whether or not you obfuscate it, always assume that data coming to the server has been intentionally manipulated. Only after you prove it hasn't (or verify the user has permission to act) do you handle the request.
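As a concrete illustration of "prove it first, then handle the request", here is a hedged sketch of a server-side handler. The function handleDeletePost, the shape of user and payload, and the Map of posts are all invented for the example; the point is only the order of the checks.

```javascript
// Hypothetical server-side handler: never trust what the client sent.
// `posts` is a Map from numeric post id to { authorId }.
function handleDeletePost(user, payload, posts) {
  // 1. Validate the shape of the data, whatever the page's script claims to enforce.
  const postId = Number(payload ? payload.postId : undefined);
  if (!Number.isInteger(postId) || postId <= 0) {
    return { status: 400, error: "Invalid post id" };
  }

  // 2. Check that the thing being acted on actually exists.
  const post = posts.get(postId);
  if (!post) {
    return { status: 404, error: "No such post" };
  }

  // 3. Verify the user has permission to act, using server-side knowledge only.
  if (post.authorId !== user.id && !user.isModerator) {
    return { status: 403, error: "Not allowed" };
  }

  // 4. Only now handle the request.
  posts.delete(postId);
  return { status: 200 };
}
```

Hidden form fields, disabled buttons and obfuscated scripts change none of this, because anyone can send the request by hand.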
