Our team has a production-level Meteor app. In the app we have a particular Meteor method which sends an email. Today it sent 127 emails from one click of the Submit button (over the course of about 20 minutes).
I cannot post the exact code but the basic flow is pretty straightforward:
We catch the submit event and send everything to the server via Meteor.call
The Meteor method sends a request to a service to render a PDF
The Meteor method sends a request to SendGrid with the attachments and other email data
The Meteor method returns and triggers the callback
We do not have much basis for determining the exact problem and are still researching, but we suspect it is partly due to the end user's connection timing out and Meteor re-sending requests for which it did not get a response.
There are two threads we found related to the problem: https://groups.google.com/forum/#!topic/meteor-talk/vu5kk3t0Lr4 and https://github.com/meteor/meteor/issues/1285
Both threads conclude that methods should be idempotent. Obviously, sending an email directly from a method is not idempotent, so we proposed that the Meteor method add these emails to a queue and have a different service process the queue on a schedule. However, we do not want to start implementing solutions that only might help solve the problem.
So, this leaves me with two questions:
What exactly would cause a Meteor method to be called 127 times? How do we prevent this from happening? Is this a bug in Meteor or a bug in our app?
If we update the method so it uses EmailQueue.insert(...) (and let something else process the queue) does that just mean we, in this case, would put 127 records into the queue instead? Is the only solution here having some sort of lock to ensure duplicate records are not processed/inserted?
Thank you for any insight.
I would recommend you post the code. The process may be simple, but there are usually small subtleties that can help determine what's going on, e.g. is it async, or is there anything that may be blocking?
What may cause it
If the Meteor server doesn't respond to your browser, the client assumes the call has not yet gone through and re-sends it. At least, this is what I gather may be happening from the information you've provided.
To get past this, ensure that:
When sending the email, you either use this.unblock (http://docs.meteor.com/#method_unblock) or use asynchronous JS for each of the processes you're running.
Nothing is blocking the main thread of your app. (It's hard to tell what it could be without any code.)
Errors: if Meteor restarts and reconnects, it will prompt the browser to re-send the Meteor.call.
The big clue here is that it took 20 minutes, so it's very likely due to one of these.
How to check. Check the Network tab in your browser to see what's being sent to the server. Is the method being called multiple times? Are there re-connection attempts?
To answer the second question, on whether using an email queue would help: it may not. It depends more on what's causing the initial problem.
Additionally: I think SendGrid automatically queues emails anyway, so you could send them all at once and SendGrid would send them out according to your account limits. (I'm not sure about this, since I last used them quite a while back.)
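To make the queue question concrete: one common way to keep a retried call from producing duplicate work is to have the client generate a unique key once per click and have the server ignore any key it has already seen. Below is a minimal sketch in plain JavaScript, assuming a hypothetical in-memory `EmailQueue` class with an `insertOnce` method and a hypothetical `requestId` value. None of these are Meteor APIs, and in a real app the "already seen" check would need to be enforced by the database (for example, a unique index) rather than by a Map.

```javascript
// Hypothetical in-memory queue that drops repeated submissions.
// In production the "seen" check would be enforced by the database.
class EmailQueue {
  constructor() {
    this.byRequestId = new Map(); // requestId -> queued email record
  }

  // Returns true if the email was queued, false if this requestId was seen before.
  insertOnce(requestId, email) {
    if (this.byRequestId.has(requestId)) {
      return false; // a retry of a call we already accepted
    }
    this.byRequestId.set(requestId, { ...email, status: "pending" });
    return true;
  }

  // Number of records still waiting to be sent.
  pendingCount() {
    let n = 0;
    for (const record of this.byRequestId.values()) {
      if (record.status === "pending") n += 1;
    }
    return n;
  }
}

// The client generates requestId once per click and reuses it on every retry.
const queue = new EmailQueue();
const requestId = "click-1"; // hypothetical; a random UUID in practice
for (let attempt = 0; attempt < 127; attempt += 1) {
  queue.insertOnce(requestId, { to: "user@example.com", subject: "Report" });
}
console.log(queue.pendingCount()); // 1
```

With this shape, 127 retries of the same click leave one record in the queue; a second, genuinely new click would carry a new key and be queued normally.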
Background
I'm using aws-amplify to interact with Cognito. So when a user registers with my app, I call Auth.signUp(). I'm passing only username (email) and password to this function.
My user pool is configured to allow sign in by email only:
The Bug?
In my front end code, I accidentally registered an event listener twice, so Auth.signUp() was being called twice (concurrently, or at least in rapid succession) with the same parameters.
This resulted in two users being created in my User Pool, with the same email. My understanding of my user pool configuration suggests that this shouldn't be possible.
Race Condition?
My first thought was that since I'm sending two requests so close together, this may be some sort of unavoidable race condition. If I introduce an artificial pause between the calls (a breakpoint, or a setTimeout, say), everything works as expected.
However, even with the requests very tightly spaced, the second request does return the error response I'd expect:
{ code: 'InvalidParameterException',
name: 'InvalidParameterException',
message: 'Alias entry already exists for a different username'
}
Sadly, this response is misleading, because I do get a second (duplicate) user created in my pool with this request.
MCVE
This is easy to reproduce by exercising Auth.signUp twice concurrently, either in a node script or a browser. This repository contains examples of both.
The Question(s)
Is this a legitimate bug with Cognito?
Is a preSignUp Lambda trigger my only way to defend against this? If so, what would the broad strokes of that implementation look like?
I sent this to AWS support. They're aware of the issue but have no ETA.
Thanks for contacting AWS Premium Support. I understand that you
would like to know whether Cognito team is aware of the issue posted
here[1].
I checked with Cognito team on our end and YES, they are aware of this
issue/bug. Good news is, we already have trouble ticket open with
Cognito Team to fix the issue. However, I won't be able to provide an
ETA on when this fix will go live as I don't have any visibility into
their development/release plans. But, I would like to thank you for
your valued contribution in bringing this issue to our attention, I do
appreciate it.
I talked to AWS, still no fix and no time estimation.
Cognito limits each username to one user. However, yes, multiple users can share an email.
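On the client side, the accidental double call described in the question can at least be collapsed into one request. Below is a minimal sketch, assuming a hypothetical `singleFlight` helper and a stand-in `fakeSignUp` in place of `Auth.signUp` (neither is part of aws-amplify). It does not fix the server-side race in Cognito; it only guards against one browser firing the same sign-up twice while the first call is still pending.

```javascript
// Wraps an async function so that concurrent calls with the same key
// share one in-flight promise instead of each hitting the backend.
function singleFlight(fn, keyOf) {
  const inFlight = new Map(); // key -> pending promise
  return function (...args) {
    const key = keyOf(...args);
    if (inFlight.has(key)) {
      return inFlight.get(key);
    }
    const promise = Promise.resolve()
      .then(() => fn(...args))
      .finally(() => inFlight.delete(key)); // allow a fresh call once settled
    inFlight.set(key, promise);
    return promise;
  };
}

// Hypothetical stand-in for Auth.signUp; counts how often the backend is hit.
let backendCalls = 0;
async function fakeSignUp({ username }) {
  backendCalls += 1;
  return { username };
}

const guardedSignUp = singleFlight(fakeSignUp, ({ username }) => username);

// Two listeners firing for the same click now share one backend call.
const first = guardedSignUp({ username: "a@example.com", password: "x" });
const second = guardedSignUp({ username: "a@example.com", password: "x" });
console.log(first === second); // true
```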
I have an application that 99% of the time functions correctly. It's a relatively simple checkout system - the user submits a form, it runs through validation (all fields contain something), then fires against a payment processor. The payment processor uses an API to process the order and returns error or success responses, error returning a message, success passing the user to a 'Thank You' page with the order information.
The problem we're having is we're hearing about customers who say that when it starts processing a message appears (it's supposed to) in an overlay, then just hangs there. I've coded in a timeout which is supposed to wait 25 seconds, then send the user to the success page (minus any success information) which then tells them there was an error. However, in a small number of instances this is not happening.
I've tested this on the gauntlet of browsers and cannot replicate it, so I'm wondering...
If it's possible that a toolbar or plugin on the browser could be preventing the scripts from running correctly.
If there's some way I can programmatically check for errors like this and push the user on regardless.
Here's the code for reference: http://jsfiddle.net/XaP7z/
I know this is a long-winded and somewhat vague question but I'm grasping at straws and the client is not happy (regardless of this being a <1% issue).
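For the second point, one way to make sure the page always moves on is to race the payment request against a timer and to turn errors into ordinary results, so that nothing depends on a success callback firing. Below is a minimal sketch, assuming the payment call can be expressed as a promise; `settleWithin` and `hungPayment` are hypothetical names for illustration and are not taken from the linked fiddle.

```javascript
// Resolves with one of:
//   { status: "ok", value }    the request succeeded
//   { status: "error", error } the request failed
//   { status: "timeout" }      nothing happened within `ms` milliseconds
// It never rejects, so the caller always gets a chance to move the user on.
function settleWithin(promise, ms) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve({ status: "timeout" }), ms);
  });
  const outcome = Promise.resolve(promise).then(
    (value) => ({ status: "ok", value }),
    (error) => ({ status: "error", error })
  );
  return Promise.race([outcome, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical payment call that never settles, simulating the hang.
const hungPayment = new Promise(() => {});

settleWithin(hungPayment, 50).then((result) => {
  console.log(result.status); // "timeout"
});
```

In the checkout flow, the handler would redirect to the error or thank-you page based on `result.status`, with 25000 in place of the 50 ms used here.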
I'm building an application that sometimes needs to pull a feed, and the request always times out on Heroku because the XML parser takes a long time. So I changed it to load the feed asynchronously via Ajax every time the page is loaded, but I still get an H12 error from the Ajax call. Now I'm thinking of using Resque to run the job in the background. I can do that without a problem, but how would I know that the job is finished so I can pull the processed feed onto the HTML page via AJAX?
I'm not sure my question is clear: how would the web layer know that the job is done, so that it can signal (e.g. an onComplete in JavaScript) to populate the content on the page?
There are a number of ways to do this:
The JavaScript can use AJAX to poll the server asking for the results and the server can respond with 'not yet' or the results. You keep asking until you get the results.
You could take a look at Juggernaut (http://juggernaut.rubyforge.org/) which lets your server push to the client
Web Sockets are the HTML5 way to deal with the problem. There are a few gems around to get you started; see "Best Ruby on Rails WebSocket tool".
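The first option, polling until the server has the results, can be sketched as follows. This is a minimal illustration in which `pollUntilDone` and `fakeCheckStatus` are hypothetical names, and the `{ done, result }` response shape is an assumption; in the real page `checkStatus` would wrap an AJAX call to an endpoint that reports whether the Resque job has finished.

```javascript
// Polls `checkStatus` until it reports the job is done or attempts run out.
// `checkStatus` is a hypothetical function that would wrap an AJAX call.
async function pollUntilDone(checkStatus, intervalMs, maxAttempts) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    const response = await checkStatus();
    if (response.done) {
      return response.result;
    }
    // "Not yet": wait before asking again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("job did not finish in time");
}

// Stand-in for the server: answers "not yet" twice, then returns the feed.
let calls = 0;
async function fakeCheckStatus() {
  calls += 1;
  return calls < 3 ? { done: false } : { done: true, result: ["item 1", "item 2"] };
}

pollUntilDone(fakeCheckStatus, 10, 5).then((feed) => {
  console.log(feed.length); // 2
});
```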
You have an architecture problem here. The H12 error exists so that the user is not left waiting for more than 30 seconds.
By moving the long-running task into a Resque queue, you are disconnecting it from the front-end web process; the two cannot communicate directly because of process isolation.
Therefore you need to look at what you are doing and how. For instance, if you are pulling a feed, are you able to do this at some point before the user needs to see the output and cache the results in some way - or are you able to take the request for the feed from the user and then email them when you have the data for them to look at etc etc.
The problem you have here is that your users are asking for something which takes longer than a reasonable amount of time to complete, so therefore you need to have a good look at what you are doing and how.
I have been working on an instant messaging system for a website (kind of like Facebook and Gmail). I have JavaScript poll the server for new messages.
If the user has multiple instances of the site open is there any way to prevent each one from making requests?
You can assign each "new" load of the page a UUID and drop requests from all UUIDs that are not the most recent one for that user. You need to send the UUID back in each request. If you want to get advanced, you can have the JavaScript on the page check the response to see whether the server says it's an old UUID, in which case it should stop making requests.
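The server side of that idea can be sketched as follows. This is a minimal illustration with an in-memory Map; `LatestPageGate`, `register`, `handlePoll`, and the `{ stop }` response field are hypothetical names, and a real server would keep this state wherever it keeps sessions.

```javascript
// Server-side sketch: remembers the most recent page UUID per user and
// tells older pages to stop polling.
class LatestPageGate {
  constructor() {
    this.latestByUser = new Map(); // username -> UUID of newest page load
  }

  // Called once when a page loads; the newest page wins.
  register(username, pageId) {
    this.latestByUser.set(username, pageId);
  }

  // Called on every poll; stale pages get { stop: true } and should stop.
  handlePoll(username, pageId) {
    if (this.latestByUser.get(username) !== pageId) {
      return { stop: true };
    }
    return { stop: false, messages: [] }; // real code would load messages here
  }
}

const gate = new LatestPageGate();
gate.register("alice", "tab-1");
gate.register("alice", "tab-2"); // second tab opened later
console.log(gate.handlePoll("alice", "tab-1").stop); // true
console.log(gate.handlePoll("alice", "tab-2").stop); // false
```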
Register each connection with a GUID generated on the fly in the browser. Check the GUID and the username pair to see which page was owner last. On page load, declare yourself a new window and that you're taking ownership. Sort of PageJustLoadedMakeMeOwner(myGuid, username)
Then have the frame with that GUID regularly update the server to confirm its ownership of the page.
If it stops updating the server, have rules on the server that allow the next page that makes contact to take ownership for that username.
Have pages that have lost ownership self-demote to only accessing once a minute or so.
The response to check if a given page is owner of that username is really fast. Takes almost no time to do, as far as the client is aware. So the AJAX there doesn't really restrict you.
Sort of an AmIOwner(username, myGuid) check (probably do this every five seconds or so). If true, then do the thing that you want to happen. If false, then poll to see if the owner of the page is vacant. If true, then take ownership. If false, then poll again in xx seconds to see if the owner is vacant.
Does that make any sort of sense?
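A rough sketch of the server-side bookkeeping this implies is shown below. The class and method names are hypothetical, the 15-second lease is an arbitrary choice, and for brevity the "is the owner vacant?" check is folded into the same call as the AmIOwner check rather than being a separate request.

```javascript
// Server-side sketch of page ownership with a heartbeat timeout.
// Time is passed in explicitly so the logic is easy to test.
class OwnershipRegistry {
  constructor(leaseMs) {
    this.leaseMs = leaseMs;
    this.owners = new Map(); // username -> { guid, lastSeen }
  }

  // PageJustLoadedMakeMeOwner: a freshly loaded page always takes over.
  takeOwnership(username, guid, now) {
    this.owners.set(username, { guid, lastSeen: now });
  }

  // AmIOwner: true for the current owner (and refreshes its heartbeat);
  // a non-owner takes over only if the owner's lease has expired.
  amIOwner(username, guid, now) {
    const owner = this.owners.get(username);
    const vacant = !owner || now - owner.lastSeen > this.leaseMs;
    if (vacant || owner.guid === guid) {
      this.owners.set(username, { guid, lastSeen: now });
      return true;
    }
    return false;
  }
}

const registry = new OwnershipRegistry(15000); // 15 s lease, an arbitrary choice
registry.takeOwnership("alice", "page-A", 0);
registry.takeOwnership("alice", "page-B", 1000);
console.log(registry.amIOwner("alice", "page-A", 5000));  // false: B owns it
console.log(registry.amIOwner("alice", "page-B", 6000));  // true: heartbeat
console.log(registry.amIOwner("alice", "page-A", 30000)); // true: B's lease expired
```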
You could do something for multiple instances in the same browser, but there's nothing you can do if the user has multiple browsers open. (Granted, that is not a common scenario.)
If you still want to give it a try, probably the easiest way would be to keep a timestamp of the last request in a cookie and make a new request only once a certain threshold has passed. You still might run a small race until the multiple instances settle down, but if you use a fuzzy time period for the polls, the instances should settle fairly quickly into a stable state where one of the instances makes the call and the others reuse the result from the last call.
The main advantage of that approach is that the requests can be made by any of the instances, so you don't have to worry about negotiating a "primary" instance that makes the calls and figuring a fallback algorithm if the user closes the "primary" one. The main drawback is that since it's a fuzzy timing based algorithm, it does not fully eliminate the race conditions and occasionally you'll have two instances make the requests. You'll have to fine tune the timing a bit, to minimize that case, but you can't fully prevent it.
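The timing check at the heart of that approach might look like the sketch below. The function name, its parameters, and the 5-second base with 2 seconds of jitter are all assumptions for illustration; reading and writing the cookie itself is left out.

```javascript
// Decides whether this tab should poll now, based on a shared timestamp
// (e.g. stored in a cookie) of the last poll made by any tab.
// `random` is injectable so the jitter can be tested deterministically.
function shouldPoll(lastPollMs, nowMs, baseIntervalMs, jitterMs, random = Math.random) {
  // Each check uses a slightly different threshold so tabs drift apart.
  const threshold = baseIntervalMs + random() * jitterMs;
  return nowMs - lastPollMs >= threshold;
}

// Example: base interval 5 s plus up to 2 s of jitter.
console.log(shouldPoll(0, 4000, 5000, 2000)); // false: always below the base interval
console.log(shouldPoll(0, 7000, 5000, 2000)); // true: at or past the maximum threshold
```

A tab that gets `true` would write the current time back to the shared timestamp before sending its request, which is where the small remaining race described above comes from.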
I'm currently fooling around with AJAX. So far I have created a Markdown previewer that updates whenever a textarea changes. (I guess you know that from somewhere... ;-) ).
Now I'm trying to figure out how to update a page when an event is fired by another client: an asynchronous message board, so to speak. A user writes something, an event is called, the post is written.
But on the other clients' pages, the new post is of course not yet available until they reload and get the updated list of posts from the database.
Now, how can you get this to work asynchronously? So in that moment when one client does something, the other clients all get to know that he did something?
I don't think this can be done completely in AJAX, but I also have no idea whatsoever how to implement this on server-side, as it would require a page reload to inform the other clients of the event.
I'm thinking of creating a file or database entry that hashes the current state of data. Whenever a client loads the page, he saves this hash. Then, a timer (does this exist in JavaScript?) checks for the hash every few seconds.
As soon as anyone changes the database, the hash is recalculated. If the script sees that the hash has changed and differs from the one saved, it reloads the contents from the database and saves the new hash.
Is that even going to work?
Polling that is as light as possible is really the best solution here. Even if you did use a socket or something, that's still basically a live connection waiting around that will likely have to poll itself (albeit in a more efficient way).
20 queries in 10 minutes with responses like {"updates":false} shouldn't even put a dent in your application. Imagine someone browsing your site and requesting 20 pages plus the related images, scripts, etc. Even with some caching involved, there could easily be hundreds of requests, with all sorts of wasted database queries for information displayed on the page that they don't actually care about.
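A lightweight poll of this kind, combined with the hash idea from the question, could look roughly like the sketch below. It runs under Node.js; `hashPosts` and `handlePoll` are hypothetical names, and hashing the JSON of the whole post list is just one simple way to get a version tag (a counter or a last-modified timestamp would work as well).

```javascript
const crypto = require("crypto");

// Server-side sketch: a hash of the current posts acts as a cheap version tag.
function hashPosts(posts) {
  return crypto.createHash("sha256").update(JSON.stringify(posts)).digest("hex");
}

// Answers a poll: clients send the hash they last saw.
function handlePoll(posts, clientHash) {
  const currentHash = hashPosts(posts);
  if (clientHash === currentHash) {
    return { updates: false }; // tiny response, nothing to reload
  }
  return { updates: true, hash: currentHash, posts };
}

const posts = ["first post"];
const seen = handlePoll(posts, null).hash;    // initial load
console.log(handlePoll(posts, seen).updates); // false: nothing changed
posts.push("second post");
console.log(handlePoll(posts, seen).updates); // true: a new post exists
```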
You could use polling. For example, each client might send an AJAX request to the server at a regular interval, say every 30 seconds, to see whether new posts are available and, if so, show them:
setInterval(function() {
    // TODO: Send an AJAX request here to the server and fetch new posts.
    // If new posts are available, update the DOM.
}, 30 * 1000);
On the other hand when someone decides to write a new post you send an AJAX (or not AJAX) request to the server to store this post in the database.
Another, less commonly used approach is Comet or HTML5 WebSockets, which allow the server to notify clients of changes using push.