I'm developing an email reminder system to ping users 24, 8, 3, and 1 hour before a task is due.
I have a server that runs on Node.js. My first idea was to set four separate setTimeout() each time a user is assigned a task. However, I assume that having hundreds of setTimeout() idling on a server wouldn't be best performance-wise.
As such, would I be better off polling for incomplete tasks every five minutes and sending reminders to users who have tasks with approaching deadlines? The downside here is that I would be reading an entire MongoDB collection every five minutes.
Node.js is very efficient with large numbers of timers. You can easily have tens of thousands of timers with no meaningful ramifications. It keeps them in a sorted linked list, so inserting a new timer takes only a tiny amount of time, and a timer costs essentially nothing once inserted.
Only the next timer to fire at the start of the list is regularly compared in the event loop. When it fires, it's removed from the front of the linked list and the next timer in the list is now at the head. Because it's a linked list, the time to fire a timer and remove it from the start of the linked list is independent of how long the list is (e.g. it's not an array that has to be copied down).
So, for your specific application, it is far, far more important to be efficient with your database (as few requests as possible) than it is to minimize the number of timers. So, whichever timer design/implementation optimizes your database load is what I would recommend.
FYI, if you want to remind a user 4 times about an approaching due task, you can still only have one timer live at a time per task. Set the first timer to fire, then when it fires to notify, you do a little time calculation on the due date/time that you previously saved and see when to set the next timer for. That would leave you with just one timer per task rather than four.
But, the main point here is still that you should first optimize the design for efficient use of the database.
And, timers are not persistent so if your server restarts, you need a mechanism for recreating the appropriate timers upon server startup (probably a database query that provides you any tasks pending within a certain time).
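The one-timer-per-task idea can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a complete scheduler: `REMINDER_HOURS`, `nextReminderTime`, `scheduleNextReminder` and the `sendReminder` callback are hypothetical names, and each task is assumed to carry a `dueAt` timestamp in milliseconds.

```javascript
// Reminder offsets from the question: 24, 8, 3 and 1 hour(s) before the due time.
const REMINDER_HOURS = [24, 8, 3, 1];
const HOUR_MS = 60 * 60 * 1000;

// Returns the timestamp (ms) of the first reminder strictly later than `after`,
// or null when every reminder time has already passed.
function nextReminderTime(dueAt, after) {
  for (const hours of REMINDER_HOURS) {
    const fireAt = dueAt - hours * HOUR_MS;
    if (fireAt > after) {
      return fireAt;
    }
  }
  return null;
}

// Keeps at most one live timer per task. When the timer fires it sends the
// reminder and schedules the following one. `sendReminder` is supplied by the
// caller (hypothetical). Note: Node.js treats delays above 2^31 - 1 ms
// (about 24.8 days) as 1 ms, so far-off tasks need separate handling (not shown).
function scheduleNextReminder(task, sendReminder, after = Date.now()) {
  const fireAt = nextReminderTime(task.dueAt, after);
  if (fireAt === null) {
    return null; // nothing left to schedule
  }
  const delay = Math.max(0, fireAt - Date.now());
  return setTimeout(() => {
    sendReminder(task);
    // Passing `fireAt` ensures the same reminder is never scheduled twice.
    scheduleNextReminder(task, sendReminder, fireAt);
  }, delay);
}
```

After a restart, the same `scheduleNextReminder` could be called once for each pending task returned by the startup query, which recreates the timers without storing any timer state.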
I have a use case where I have to create a record in the database for users repeatedly on a scheduled basis, say every Monday, weekly or biweekly. There are two ways I can achieve this.
Using database triggers to create a record at the scheduled time. But I don't know how to repeat it: I would have to create a trigger for the next schedule each time this trigger runs, which I don't think is the right approach.
Using queues to handle the scheduling and execute the repeated jobs. But adding a job for each user is not a good idea, I guess. I might be wrong, but I see no other way to achieve my goal.
I am confused about which of the two to choose. Let's say I have to do this for 1 million users every Monday at 9:00 a.m.
Which approach will scale?
I am using Node.js as my backend, Bull for the queue, and PostgreSQL as my database.
Using Database Triggers to Create a record on the time. But I don't know how to repeat it. I have to create a trigger for the next schedule when this trigger runs, which i don't think is right approach.
- This is not the right approach, for reasons that include memory use, number of requests, and code quality.
So I went with the second approach:
Using Queues to handle the scheduling and executing the repeated jobs. But adding a job for each user is not a good idea I guess. I might be wrong but there is no other way to achieve my goal.
Let's say we have a simple example like the one below.
<input id="filter" type="text" />
<script>
  function reload() {
    // get data via ajax
  }
  $('#filter').change($.debounce(250, reload));
</script>
What we're doing is introducing a small delay so that we reduce the number of calls to reload whilst the user is typing text into the input.
Now, I realise that this will depend on a case by case basis but is there an accepted wisdom of how long the debounce delay should be, given an average (or maybe that should be lowest common denominator) typing/interaction speed. I generally just play around with the value until it "feels" right, but I may not represent a typical user. Has anyone done any studies on this?
As you hinted at, the answer depends on a number of factors - not all of them subjective.
In general the reason for making use of a debounce operation can be summed up as having one of two purposes:
Reducing the cost of providing dynamic interactive elements (where cost can be computational, IO, network or latency and may be dictated by the client or server).
Reducing visual "noise" to avoid distracting the user with page updates while they are busy.
Reaction Times
One important number to keep in mind is 250ms - this represents the (roughly) median reaction time of a human and is generally a good upper bound within which you should complete any user interface updates to keep your site feeling responsive. You can view some more information on human reaction times here.
In the first case (reducing cost), the exact debounce interval is going to depend on what the cost of an operation is to both parties (the client and server). If your AJAX call has an end-to-end response time of 100ms then it may make sense to set your debounce to 150ms, so that debounce plus response time stays within that 250ms responsiveness threshold.
On the other hand, if your call generally takes 4000ms to run, you may be better off setting a longer debounce on the actual call and instead using a first-layer debounce to show a loading indicator (assuming that your loading indicator doesn't obscure your text input).
$('#filter').change($.debounce(250, show_loading));
$('#filter').change($.debounce(2000, reload));
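The jQuery calls above rely on a debounce plugin. For readers without it, the same two-layer idea can be sketched in plain JavaScript. This is an illustrative sketch, not the plugin's implementation: `debounce` and `makeFilterHandler` are hypothetical names, and `showLoading` and `reload` stand in for the two callbacks used above.

```javascript
// Returns a wrapper that calls `fn` only after `waitMs` have passed
// without another call to the wrapper (trailing-edge debounce).
function debounce(fn, waitMs) {
  let timerId = null;
  return function debounced(...args) {
    if (timerId !== null) {
      clearTimeout(timerId); // a newer call supersedes the pending one
    }
    timerId = setTimeout(() => {
      timerId = null;
      fn.apply(this, args);
    }, waitMs);
  };
}

// Builds one handler that drives both layers: a short debounce for cheap
// feedback and a long debounce for the expensive request.
function makeFilterHandler(showLoading, reload, feedbackMs = 250, requestMs = 2000) {
  const debouncedLoading = debounce(showLoading, feedbackMs);
  const debouncedReload = debounce(reload, requestMs);
  return function onFilterInput(event) {
    debouncedLoading(event);
    debouncedReload(event);
  };
}
```

In a browser, the returned handler could be attached with `addEventListener('input', ...)` on the text field, so that both timers restart on every keystroke.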
Backend Capacity
It is also important to keep in mind the performance cost of these requests on your backend. In this case, a combination of average typing speed (about 44 words per minute, or roughly 200 characters per minute) and knowledge of your user base size and backend capacity can enable you to select a debounce value which keeps backend load manageable.
For example: if you have a single backend capable of handling 10 requests per second and peak active user base of 30 (using this service), you should select your debounce period such that you avoid exceeding 10 requests per second (ideally with a margin of error). In this case, we have 33.3% of the capacity required to handle one input per user per second, so we ideally would serve at most one request per user every 3 seconds, giving us our 3000ms debounce period.
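The arithmetic behind these figures can be written down as two small helpers. This is only a sketch of the reasoning above: the function names and the optional `headroom` factor are assumptions, and the model assumes each active user triggers at most one request per debounce period.

```javascript
// Smallest debounce period (ms) that keeps peak load within backend capacity,
// assuming every active user sends at most one request per debounce period.
// `headroom` (0 to 1) reserves a margin, e.g. 0.8 uses only 80% of capacity.
function debounceForCapacity(peakUsers, capacityPerSecond, headroom = 1) {
  const usableCapacity = capacityPerSecond * headroom;
  return Math.ceil((peakUsers / usableCapacity) * 1000);
}

// Debounce (ms) that keeps debounce + response time within a responsiveness
// budget such as 250 ms; never negative.
function debounceForLatency(responseMs, budgetMs = 250) {
  return Math.max(0, budgetMs - responseMs);
}

console.log(debounceForCapacity(30, 10)); // 3000
console.log(debounceForLatency(100));     // 150
```

With a 20% margin, `debounceForCapacity(30, 10, 0.8)` gives 3750, which illustrates how quickly the debounce period grows once headroom is reserved.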
Frontend Performance
The final aspect to keep in mind is the cost of processing on the client side. Depending on the amount of data you're moving and the complexity of your UI updates, this may be negligible or significant. One thing you want to try and ensure is that your user interface remains responsive to user input. That doesn't necessarily mean that it always needs to be able to react, however while a user is interacting with it, it should react rapidly to them (60FPS is generally the objective here).
In this case, your objective should be to debounce at a rate which prevents the user interface from becoming sluggish or unresponsive while the user is interacting with it. Again, statistics are a good way to derive this figure, but keep in mind that different types of input require different amounts of time to complete.
For example, transcribing a sentence of short words is generally a lot faster than entering a single long and complex word. Similarly, if a user has to think about what they are entering they will tend to type slower. The same applies for the use of special characters or punctuation.
Subjective Answer
In practice, I've used debounce periods which range from 100ms for data that is exceptionally quick to retrieve and presents very little impact on performance through to 5000ms for things that were more costly.
In the latter case, pairing a short, low-cost debounce period to present the user with feedback and the longer period for actual computational work tends to strike a good balance between user experience and the performance cost of wasted operations.
One notable thing I try to keep in mind when selecting these values is that, as someone who works with a keyboard every day, I probably type faster than most of my user base. This can mean that things which feel smooth and natural to me are jarring for someone who types slower, so it's a good idea to do some user testing or (better yet) gather metrics and use those to tune your interface.
I'd like to offer a succinct answer regarding search text input.
I usually do 300ms, which just feels right considering both saving hardware resources and providing a good enough user experience.
--
An interesting thought...
Let's take an example from one of the masters: Google.
If you notice (you have to be a quick typist), Google actually has very little or no debounce time for the first couple characters (2 I think), but after that it increases the debounce time. Their main goal, obviously, is to give an instantaneous feel and balance their UI with use cases. I don't know what data or studies they've done, but they've done them and are a good example when a search input is the primary function of a site.
With that, I'd say this is an excellent user experience, albeit at the cost of extra complexity and programming time. Google needs it. A less frequently used search bar could perhaps go without the extra complexity.
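A length-dependent delay of this kind could look roughly like the sketch below. It is not Google's implementation: the two-character threshold is the guess made above, the 300ms value is borrowed from the start of this answer, and `adaptiveDelay`, `adaptiveDebounce` and `search` are hypothetical names.

```javascript
// Picks a delay from the current input length: none for the first few
// characters, then a fixed delay. Both defaults are guesses.
function adaptiveDelay(textLength, freeChars = 2, laterDelayMs = 300) {
  return textLength <= freeChars ? 0 : laterDelayMs;
}

// Wraps `search` so short inputs trigger it immediately and longer inputs
// are debounced by whatever `delayFor` returns.
function adaptiveDebounce(search, delayFor = adaptiveDelay) {
  let timerId = null;
  return function onInput(text) {
    clearTimeout(timerId); // harmless when no timer is pending
    const delay = delayFor(text.length);
    if (delay === 0) {
      search(text);
      return;
    }
    timerId = setTimeout(() => search(text), delay);
  };
}
```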
I want to measure the time it takes for a user to complete a task (answer a quiz). I want to measure it accurately, without the network lag. Meaning, if I measure on the server side the time between 2 requests, it won't be the real time it took the user, because the network time is factored in.
But on the other hand, if I measure in javascript and post the timestamps to the server, the user will be able to see the code, and cheat by sending false timestamps, no?
How can I get the timestamps in javascript and make sure the user doesn't fake it?
Generally in client side code, any question that starts off with "How to securely..." is answered with "Not possible". Nothing helps, not even putting variables in a closure, because I, the evil cheating user, could just change the code on my end and send it back to you.
This is the kind of validation that should be performed server side, even with the disadvantage of network latency.
The trick here would be to measure the time using JavaScript, but also keep track of it using server-side code. That way, you can rely on the timestamps received from the client as long as you enforce a maximum difference between the two calculated times. I'd say a few seconds should be good enough. However, by doing so, you are creating an additional vector for failure.
Edit: A user could potentially tweak his or her time in their favor by up to the maximum enforced difference if they are able to take advantage of the (lack of) network lag.
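The server-side check described here could be sketched as below. The function name and the 3000ms default are assumptions ("a few seconds" in the answer), and both durations are assumed to be measured locally on each side, so clock differences between client and server do not matter.

```javascript
// Returns the duration (ms) to record for an attempt.
// clientMs:  elapsed time reported by the browser (untrusted).
// serverMs:  elapsed time between the two requests as measured by the server.
// maxDiffMs: largest gap that network latency could plausibly explain.
function acceptedDuration(clientMs, serverMs, maxDiffMs = 3000) {
  const plausible =
    Number.isFinite(clientMs) &&
    clientMs >= 0 &&
    clientMs <= serverMs && // the server's figure includes network time
    serverMs - clientMs <= maxDiffMs;
  // Fall back to the server's own measurement when the claim is implausible.
  return plausible ? clientMs : serverMs;
}
```

As the edit above notes, a user with a fast connection can still shave up to `maxDiffMs` off their time, so the tolerance should be as small as the expected latency allows.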
I faced the same problem while designing an online examination portal for my project.
I went for a hybrid approach.
Get the time from the server when the user loads the page, and start a JavaScript-based timer. Record the start time in your database.
Let the timer run on the client side for some time, say 30 seconds.
Then refresh the timer by making an AJAX call to the server, which resets it according to the time that has already passed.
NOTE: try to use an external JavaScript file and obfuscate the timer code to make it harder to guess.
This way you may not completely prevent the user from modifying the timer, but you can limit the maximum possible error to 30 seconds.
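The server half of this approach comes down to one calculation, sketched here with hypothetical names. `startedAtMs` is assumed to be the start time stored in the database and `limitMs` the total time allowed for the exam.

```javascript
// Server side: time left (ms) for an attempt, computed only from values the
// server controls. The client never supplies any of these inputs.
function remainingMs(startedAtMs, limitMs, nowMs = Date.now()) {
  return Math.max(0, startedAtMs + limitMs - nowMs);
}
```

Each 30-second AJAX refresh would return this value, and the client would restart its local timer from it, so any tampering is overwritten at the next refresh.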
I have just stepped into the world of Web Development, and I am developing a small browser game that simply allows connected users to take control of an object (a triangle currently!), and simply move around the screen area.
Currently, I store the client's coordinate position in a MySQL database, and update that position using AJAX, roughly 30 times per second.
Other clients' positions are also polled roughly 30 times per second.
My problem, however, is that this seems to be causing an hour-long IP lockout for the client, which I assume is automatically occurring on my host's end. Would this perhaps be a normal default precautionary action? I was under the impression that 30 AJAX polls in a second was not a particularly stressful amount; however, as I mentioned, this is a new field for me. I'm fearful I've created some minuscule DoS attack!
If so, I would be grateful if someone with experience in this matter could point me to a more efficient method of handling the kind of interactivity I have described. This is all leading up to a six-month project I will be working on alone for my final year University project, so I'm more than happy to put the extra hours in to learn a better solution.
What you should do is known as "hybrid-polling". Basically you have a long running method server side which is running an "infinite" loop which runs once every 33ms (30 times per second). This loop will shoot data out to a part of your front end if the data has changed. When the data gets to be too large in the buffer for the method to handle, the method exits. The whole time your client is polling to see if new data was written. If the method exits, the client must restart the method. This is a hybrid approach, where the client polling is only checking client side data, except when the method exits, in which case the client must poll again to restart the server method, which then runs once every 33ms and pushes data out to the client.
Look up Comet (compatible with older browsers but not as efficient as possible), BOSH, or Web Sockets (ideal but not compatible with older browsers) for other approaches.
If I make a live countdown clock like eBay's, how do I do this with Django and SQL? I'm assuming running a function in Django or in SQL over and over every second to check the time would be horribly inefficient.
Is this even a plausible strategy?
Or is this the way they do it:
When a page loads, it takes the end datetime from the server and runs a javascript countdown clock against it on the user machine?
If so, how do you do the countdown clock with javascript? And how would I be able to delete/move data once the time limit is over without a user page load? Or is it absolutely necessary for the user to load the page to check the time limit to create an efficient countdown clock?
I don't think this question has anything to do with SQL, really--except that you might retrieve an expiration time from SQL. What you really care about is just how to display the timeout real-time in the browser, right?
Obviously the easiest way is just to send a "seconds remaining" counter to the page, either on the initial load, or as part of an AJAX request, then use Javascript to display the timer, and update it every second with the current value. I would opt for using a "seconds remaining" counter rather than an "end datetime", because you can't trust a browser's clock to be set correctly--but you probably can trust it to count down seconds correctly.
If you don't trust Javascript, or the client's clock, to be accurate, you could periodically re-send the current "seconds remaining" value to the browser via AJAX. I wouldn't do this every second, maybe every 15 or 60 seconds at most.
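A countdown driven by "seconds remaining" might be sketched like this. `createCountdown` and its injectable `now` parameter are hypothetical, and the display code is left out. Only elapsed local time is used, so a wrongly set clock does not affect the result; in a browser, `performance.now()` could be passed as `now` to also survive clock changes during the countdown.

```javascript
// Tracks a countdown from a server-provided "seconds remaining" value.
// `now` must return milliseconds; it is injectable so it can be faked in tests.
function createCountdown(secondsRemaining, now = Date.now) {
  let baseSeconds = secondsRemaining;
  let syncedAt = now();
  return {
    // Whole seconds left right now, never below zero.
    remaining() {
      const elapsedSeconds = (now() - syncedAt) / 1000;
      return Math.max(0, Math.ceil(baseSeconds - elapsedSeconds));
    },
    // Call with a fresh value from the server, e.g. every 15 to 60 seconds.
    resync(freshSecondsRemaining) {
      baseSeconds = freshSecondsRemaining;
      syncedAt = now();
    },
  };
}
```

A page would typically call `remaining()` from a one-second `setInterval` to redraw the clock, and call `resync()` whenever a periodic AJAX response arrives.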
As for deleting/moving data when the clock expires, you'll need to do all of that in Javascript.
I'm not 100% sure I answered all of your questions, but your questions seem a bit scattered anyway. If you need more clarification on the theory of operation, please ask.
I have also encountered the same problem a while ago.
First of all, your problem is related to neither Django nor SQL. It is a general concept, and it is not very easy to implement because of the overhead on the server.
One solution that comes to mind is keeping the start time of the process in the database.
When someone requests the remaining time, read the start time from the database, work out the remaining time from it and the current time, and serve that value. In the browser, initialize your JavaScript function with that value and count down for, say, 15 seconds. After that, repeat the same operation with AJAX, without waiting for a user request.
However, there could be other implementations depending on your application. If you explain your application in detail, there could be other solutions.
For example, if you implement a questionnaire with a time limit, then with every submitted answer you should also pass the value calculated in JavaScript at that moment.