Node.js / nowjs - A lot of users - javascript

I've made a server in nowjs, and with about 80 users online it gets slow and sometimes people get disconnected. I've heard that I have to change the worker count, but how do I do that? And is it a solution? Or maybe there is other advice.

Since you mentioned writing log data to a file that keeps growing, make sure you're using Node's asynchronous file I/O so it's not blocking; the fs functions take optional callbacks. Better yet, creating a write stream is the way to go (Node is great for its asynchronous file streaming capabilities).
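For example, a minimal sketch of stream-based logging (the file name is just a placeholder):

const fs = require('fs');

// Open the stream once and reuse it; each write() is non-blocking.
const logStream = fs.createWriteStream('app.log', { flags: 'a' });

function log(message) {
  logStream.write(new Date().toISOString() + ' ' + message + '\n');
}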

You may have hit a scaling issue, but 80 users seems low to me.
Are you sure you are not doing any kind of logic on your server side that could be blocking?
Any math or anything else that requires too much time?
If you do have a scaling issue, you may need to horizontally scale your app.
To do so you would have to use something like node cluster to have multiple workers handling the work, plus Redis or Mongo for sharing data between them; it might also be possible to do this with messages in node cluster.
I haven't pushed now.js that far yet, so I don't know how it would handle such a situation.
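To illustrate the cluster idea above, a minimal sketch (it assumes your server code lives in server.js; shared state would still need to live in something like Redis):

const cluster = require('cluster');
const os = require('os');

if (cluster.isMaster) {
  // Fork one worker per CPU core.
  os.cpus().forEach(() => cluster.fork());
  // Replace any worker that dies.
  cluster.on('exit', () => cluster.fork());
} else {
  // Each worker runs its own server instance; they share the listening port.
  require('./server');
}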

Related

Uploading Images Via API ( Javascript )

Well, I'm kind of stuck on a (maybe simple) problem. I would like to use a database / cloud server with a Nest.js API mainly to store images. It's my first time experimenting with such a concept and I would like to know, first of all, the 'best' solution. For example, is it OK to use something like MongoDB or Firebase Storage? If not, what would be the ideal solution to follow? Is it even good to use Nest for such a thing?
More specifically (how I think / plan the project):
I would like to use a NoSQL database (pref. MongoDB) to store some data about a certain object, and I want a photo as part of that object. I'm pretty sure I can't mix these types of data (storing strings/numbers/etc. AND images), so maybe I could store a location pointing to another DB (collection) / cloud server and retrieve the image from there.
Is this practical? How is this concept supposed to work?
So far I've managed to store images with the help of Multer in a local /uploads folder, but I don't really like this solution. After a certain amount of images are posted, my server will (probably) run slower and slower (????).
Sorry for the silly question, but I've been stuck here for a few days... :(
The way you're doing it with the filesystem is perfectly fine. Under the hood, tools like MongoDB are implemented on top of a file system anyway. Using the server's file system can pose scalability issues when you need to spread application load over multiple servers, but for a single-server solution it will have the highest performance due to minimal I/O overhead.
I recommend using an object storage system (like Amazon S3 or a GCP bucket) if you want to take scalability seriously. All your files will be stored on a single managed system designed to scale, and you'll be able to treat each of your servers as stateless (and therefore provision several and guarantee you can access all files no matter which server handles each request).
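For illustration, a minimal upload sketch using the AWS SDK for JavaScript (v2 API; the bucket name is a placeholder):

const AWS = require('aws-sdk');
const s3 = new AWS.S3();

async function saveImage(buffer, key) {
  // Uploads the image bytes and returns the object's URL.
  const result = await s3.upload({
    Bucket: 'my-image-bucket', // placeholder
    Key: key,
    Body: buffer,
    ContentType: 'image/jpeg',
  }).promise();
  return result.Location; // store this URL in MongoDB next to your object
}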
These would be the two "industry standard" solutions. That's not to say you cannot do this with Mongo or Firebase, but they probably aren't as good. The most important thing, however, is to avoid storing files in a relational database; as long as you aren't doing that, you will be fine.
I'd say you're overthinking this. Stick with the file system until you know you have actual scalability issues. NestJS is highly modular, so you can implement your app in a way such that other modules don't "know" the underlying implementation of how you store files; write your code with images stored on the file system now, then move later if needed.
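As a rough sketch of that kind of abstraction (all names here are made up; the point is that callers never know where the bytes live):

const fs = require('fs/promises');
const path = require('path');

// Hypothetical storage interface: the rest of the app only calls save/get.
const diskStorage = {
  async save(name, buffer) {
    await fs.writeFile(path.join('uploads', name), buffer);
    return name; // the reference you store in MongoDB
  },
  async get(name) {
    return fs.readFile(path.join('uploads', name));
  },
};
// Later, swap in an s3Storage object with the same save/get shape.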
Lastly, NestJS is just a framework for writing HTTP servers. It isn't going to be any worse or better than any other HTTP server out there.

Error 504, avoid it with some data passing from server to client?

I'm developing an app that should receive a .CSV file, save it, scan it, insert the data of every record into the DB, and at the end delete the file.
With a file of about 10,000 records there are no problems, but with a larger file the PHP script runs correctly and all data are saved into the DB, yet "ERROR 504 The server didn't respond in time." is printed.
I'm scanning the .CSV file with the PHP function fgetcsv().
I've already edited settings in the php.ini file (max execution time (120), etc.) but nothing changes; after 1 minute the error is shown.
I've also tried to use a JavaScript function to show an alert every 10 seconds, but the error is still shown.
Is there a solution to avoid this problem? Is it possible to pass some data from server to client every few seconds to avoid the error?
Thanks
It's typically when scaling issues pop up that you need to start evolving your system architecture, and your application will need to work asynchronously. This problem is very common (some of my team are dealing with one as I write), and everyone needs to deal with it eventually.
Solution 1: Cron Job
The most common solution is to create a cron job that periodically scans a queue for new work to do. I won't dwell on the nature of the queue since everyone has their own (some are alright and others are really bad), but typically it involves a DB table with the relevant information and a job status (<-- one of the bad solutions), or a solution involving Memcached; MongoDB is also quite popular.
The "problem" with this solution is, once again, "scaling". Cron jobs run periodically at fixed intervals, so if a task takes a particularly long time, jobs are likely to overlap. This means you need to work in some kind of locking, or use a scheduler that runs the job strictly sequentially.
In the end, you won't run into the timeout problem, and you can typically dedicate an entire machine to running these tasks, so memory isn't as much of an issue either.
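To make the locking concrete, here is a rough sketch of the claim step, assuming a Postgres-style jobs table with a status column and a node-postgres-like db.query helper (adapt to your own stack):

// Atomically claim the oldest pending job; SKIP LOCKED means overlapping
// runs never grab the same row.
async function claimJob(db) {
  const { rows } = await db.query(
    `UPDATE jobs SET status = 'running'
     WHERE id = (
       SELECT id FROM jobs WHERE status = 'pending'
       ORDER BY id
       FOR UPDATE SKIP LOCKED
       LIMIT 1
     )
     RETURNING *`
  );
  return rows[0] || null; // null when the queue is empty
}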
Solution 2: Worker Delegation
I'll use Gearman as an example for this solution, but other tools implement standards like AMQP, such as RabbitMQ. I prefer Gearman because it's simpler to set up, and it's designed more for work processing than for messaging.
This kind of delegation has the advantage of running immediately after you call it. The server basically waits for stuff to do (not unlike an Apache server); when it gets a request, it shifts the workload from the client onto one of your "workers": scripts you've written which run indefinitely, listening to the server for work.
You can have as many of these workers as you like, each running the same or different types of tasks. This means scaling is determined by the number of workers you have, and this scales horizontally very cleanly.
Conclusion:
Crons are fine, in my opinion, for automated maintenance, but they run into problems when they need to work concurrently, which makes running workers the ideal choice.
Either way, you are going to need to change the way users receive feedback on their requests. They will need to be informed that their request is processing and to check back later for the result; alternatively, you can periodically track the status of the running task to provide real-time feedback to the user via AJAX. That's a little tricky with cron jobs, since you would need to persist the state of the task during its execution, but Gearman has a nice built-in solution for doing just that.
http://php.net/manual/en/book.gearman.php

Background processes in Node.js

What is a good approach to handling background processes in a Node.js application?
Scenario: after a user posts something to an app, I want to crunch the data, request additional data from external resources, etc. All of this is quite time consuming, so I want it out of the req/res loop. Ideally there would just be a queue of jobs where you can quickly dump a job, and a daemon or task runner will always take the oldest one and process it.
In RoR I would have done it with something like Delayed Job. What is the Node equivalent of this API?
If you want something lightweight that runs in the same process as the server, I highly recommend Bull. It has a simple API that allows for fine-grained control over your queues.
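A minimal Bull sketch (the queue name, Redis URL, and sendEmail function are placeholders):

const Queue = require('bull');

const emailQueue = new Queue('emails', 'redis://127.0.0.1:6379');

// Worker: runs jobs as they arrive.
emailQueue.process(async (job) => {
  await sendEmail(job.data.to, job.data.body); // your own function
});

// Producer: enqueue from a request handler and return immediately.
emailQueue.add({ to: 'user@example.com', body: 'hi' }, { attempts: 3 });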
If you're familiar with Ruby's Resque, there is a node implementation called Node-resque
Bull and Node-resque are both backed by Redis, which is ubiquitous among Node.js worker queues. Both would be able to do what RoR's DelayedJob does; it's a matter of the specific features you want and your API preferences.
Background jobs are not directly related to your web service work, so they should not run in the same process. As you scale up, the memory usage of the background jobs will impact the web service's performance. But you can put them in the same code repository if you want, whatever makes more sense.
One good choice for messaging between the two processes would be Redis, if dropping a message every now and then is OK. If you want "no message left behind", you'll need a more heavyweight broker like Rabbit. Your web service process can publish and your background job process can subscribe.
It is not necessary for the two processes to be co-hosted; they can be on separate VMs, Docker containers, whatever you use. This allows you to scale out without much trouble.
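For example, with Redis pub/sub between the two processes (node_redis v3-style API; the channel name is arbitrary):

const redis = require('redis');

// In the web service process:
const pub = redis.createClient();
pub.publish('jobs', JSON.stringify({ type: 'crunch', postId: 42 }));

// In the background job process:
const sub = redis.createClient();
sub.subscribe('jobs');
sub.on('message', (channel, message) => {
  const job = JSON.parse(message);
  // ...do the slow work here, outside the req/res loop
});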
If you're using MongoDB, I recommend Agenda. That way you don't have separate Redis instances running, and features such as scheduling, queuing, and a web UI are all present. The Agenda UI is optional and can of course be run separately.
I would also recommend setting up a loosely coupled abstraction between your application logic and the queuing / scheduling system, so the entire background processing system can be swapped out if needed. In other words, keep as much application / processing logic as possible out of your Agenda job definitions in order to keep them lightweight.
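A minimal Agenda sketch along those lines (the connection string and crunchData function are placeholders; the job definition stays thin):

const Agenda = require('agenda');

const agenda = new Agenda({ db: { address: 'mongodb://127.0.0.1/agenda' } });

// Thin job definition that delegates to your application logic.
agenda.define('crunch data', async (job) => {
  await crunchData(job.attrs.data.postId); // your own function
});

(async () => {
  await agenda.start();
  await agenda.schedule('in 5 seconds', 'crunch data', { postId: 42 });
})();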
I'd like to suggest using Redis for scheduling jobs. It has plenty of different data structures, so you can always pick one that suits your use case better.
You mentioned RoR and DJ, so I assume you're familiar with Sidekiq. You can use node-sidekiq for job scheduling if you want to, but it's suboptimal IMO, since its main purpose is to integrate node.js with RoR.
For daemonizing workers I'd recommend using PM2. It's widely used and actively maintained, and it solves a lot of problems (e.g. deployment, monitoring, clustering), so just make sure it isn't overkill for your needs.
I tried bee-queue & bull and chose bull in the end.
I first chose bee-queue because it is quite simple and its examples are easy to understand, while Bull's examples are a bit complicated. bee's wiki page Bee Queue's Origin also resonates with me. But the problems with bee are that <1> its issue resolution time is quite slow and the latest update was 10 months ago, and <2> I can't find an easy way to pause/cancel a job.
Bull, on the other hand, frequently updates its code and responds to issues. The Node.js job queue evaluation said Bull's weakness is "slow issue resolution time", but my experience is the opposite!
But anyway, their APIs are similar, so it is quite easy to switch from one to the other.
I suggest using a proper Node.js framework to build your app.
I think the most powerful and easiest to use is Sails.js.
It's an MVC framework, so if you are used to developing in RoR, you will find it very, very easy!
If you use it, a powerful (in JavaScript terms) job manager is already included.
// Args: cron pattern (runs at 01:01:00 every Sunday), callback,
// onComplete, start immediately, timezone.
new sails.cronJobs('0 01 01 * * 0', function () {
  sails.log.warn("START ListJob");
}, null, true, "Europe/Dublin");
If you need more info, don't hesitate to contact me!

Would AJAX or Socket.IO make more sense in my situation?

I have been working on something using AJAX that sometimes requires a couple of POSTs per second. Each POST returns a JSON object (generated by the PHP file being posted to, ~11,000 bytes), and on average the latency is between 30ms and 250ms depending on whether I'm on wifi or wired, but roughly every 1 in 15 calls it spikes up to about 4000ms. I am trying to find a way around this; as of right now I see two options:
Throw a timeout on the AJAX call and have it issue a GET on failure (the POST should still go through; it's the return trip that always times out), or...
Cut the entire thing down and learn node.js so I can use websockets to potentially rectify this issue.
Either solution, as far as I can see, hinges on WHY the original call is failing. If something is wrong with the AJAX call, then a new GET would likely go through and solve the issue. But if it is something with the server itself, then logically the GET would just time out as well, and I'm dead in the water.
Since I have no experience at all with websockets yet, I was hoping for some feedback on the best action to take next. Thanks for the input.
If it would help, I can very likely reduce the returned payload to 1/15 of the size with some sneaky coding. Would that make an impact?
WebSockets are a great option! Actually, working with Socket.IO is pretty simple and has a shallow learning curve.
Because the connection stays open, your requests skip the DNS lookup and connection setup, which lowers latency. This means much lower overhead for each request you make.
If you ever foresee pushing data to your users, WebSockets are the de facto way to do it. AJAX polling is going out of style.
That said, you would have to port your back-end logic to JavaScript, you would have to change your deployment strategy to a server that supports Node apps, and you would have to deal with learning a new environment, which also has some overhead.
Before you explore Node, consider the drawbacks above. I think it is a great bit of technology, but I would also look into the following approaches, especially if you are pressed for time.
The 1/15 size reduction is totally worth it. Actually, it is worth it in both cases.
Can you do any sort of batching with your POST requests from the client side? If subsequent requests rely on the results of the previous POST request, you cannot do this; in that case, I strongly suggest using WebSockets.
All in all, there are always tradeoffs. If you aren't pressed for time, give Node and Socket.IO a whirl; they are becoming very prevalent web technologies and are worth learning.
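If you do go that route, the core of a Socket.IO setup is small; a sketch (the event names are made up):

// Server (Node.js):
const io = require('socket.io')(3000);
io.on('connection', (socket) => {
  socket.on('update', (payload) => {
    // ...do the work the PHP endpoint was doing...
    socket.emit('result', { ok: true }); // reply over the open connection
  });
});

// Client (browser):
const socket = io('http://localhost:3000');
socket.emit('update', { some: 'data' });
socket.on('result', (data) => console.log(data));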
Cut the entire thing down, learn node.js so I can use websockets to potentially rectify this issue.
There is no point doing things with the wrong tools. If you need real-time communication, use a server which supports it out of the box, like node.js (probably the simplest to get into coming from PHP).
Since I have no experience yet at all with websockets
Get a framework on top of raw websockets, like Primus or Socket.IO, and good luck ;)

What is the best way I can scale my nodejs app?

The basics
Right now a few friends and I are trying to develop a browser game in node.js. It's a multiplayer top-down shooter, and most of both the client-side and server-side code is in JavaScript. We have a good general direction that we'd like to go in, and we're having a lot of fun developing the game. One of our goals when making this game was to make it as hard as possible to cheat. To do that, we have all of the game logic handled server-side. The client only sends its input to the server via web socket, and the server updates the client (also via web socket) with what is happening in the game. Here's the start of our problem.
All of the server-side math is getting pretty hefty, and we're finding that we need to scale in some way to handle anything more than 10 players (we want to be able to host many more). At first we figured that we could just scale vertically as needed, but since node.js is single threaded, it can only take advantage of one core. This means that getting a beefier server won't help that problem. Our only solution is to scale horizontally.
Why we're asking here
We haven't been able to find any good examples of how to scale out a node.js game. Our use case is pretty particular, and while we've done our best to do this by ourselves, we could really benefit from outside opinions and advice.
Details
We've already put a LOT of thought into how to solve this problem. We've been working on it for over a week. Here's what we have put together so far:
Four types of servers
We're splitting tasks across 4 different 'types' of servers. Each one has a specific task to complete.
The proxy server
The proxy server would sit at the front of the entire stack and be the only server directly accessible from the internet (there could potentially be more of these). It would run haproxy and route all connections to the web servers. We chose haproxy because of its rich feature set, reliability, and nearly unbeatable speed.
The web server
The web servers would receive the web requests and serve all web pages. They would also handle lobby creation/management and game creation/management. To do this, they would tell the game servers what lobbies they have, what users are in each lobby, and info about the game those users are going to play. The web servers would then update the game servers about user input, and the game servers would update the web servers (which would then update the clients) about what's happening in the game. The web servers would use TCP sockets to communicate with the game servers about any kind of management, and UDP sockets when communicating about game updates. This would all be done with node.js.
The game server
The game servers would handle all the game math and variable updates for the game. They would also communicate with the db servers to record cool stats about players in game. This would be done with node.js.
The db server
The db server would host the database. This part actually turned out to be the easiest, since we found RethinkDB, the coolest db ever. It scales easily and, oddly enough, turned out to be the easiest part of scaling our application.
Some other details
If you're having trouble getting your head around our whole setup, look at this; it's a semi-accurate chart of how we think we'll scale.
If you're just curious, or think it might be helpful to look at our game, it's currently hosted in its un-scaled state here.
Some things we don't want
We don't want to use the cluster module of node.js. It isn't stable (as said here), and it doesn't scale to other servers, only to other processors. We'd like to just take the leap to horizontal scaling.
Our question, summed up
We hope we're going in the right direction and that we've done our homework, but we're not certain. We could certainly use a few tips on how to do this the right way.
Thanks
I realize that this is a pretty long question and that writing a well-thought-out answer will not be easy, but I would really appreciate it.
Thanks!!
Here are my spontaneous thoughts on your case:
Multicore usage
node.js can scale across multiple cores as well. How, you can read for example here (or just think about it: you have one thread/process running on one core; what do you need to use multiple cores? Multiple threads or multiple processes. Push work from the main thread to other threads or processes and you are done).
I personally would say it is childish to develop an application which does not make use of multiple cores. If you make use of some background processes, OK, but if until now you have only done work in the node.js main event loop, you should definitely invest some time into making the app scale across cores.
Implementing something like IPC is not that easy, by the way. You can do it, but if your case is complicated, maybe you are better off with the cluster module. This is obviously not your favorite, but just because something is called "experimental" does not mean it's trashy. Just give it a try; maybe you can even fix some bugs in the module along the way. It's most likely better to use broadly adopted software for complex problems than to reinvent the wheel.
You should also (if you do not already) think about (wise) usage of nextTick/setImmediate. This lets you break a CPU-intensive task into chunks so that the main event loop can perform other work in between. You can read about it, for example, here.
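A common shape for that, as a sketch (the chunk size is arbitrary):

// Process a large array without starving the event loop.
function processInChunks(items, handleItem, done) {
  let i = 0;
  function chunk() {
    const end = Math.min(i + 100, items.length);
    for (; i < end; i++) handleItem(items[i]);
    if (i < items.length) setImmediate(chunk); // let other events run first
    else done();
  }
  chunk();
}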
General thoughts on computations
You should definitely take a very close look at the algorithms of your game engine. You already noticed that this is your bottleneck right now, and computations really are the most critical part of almost every game. Scaling solves this problem in one way, but scaling introduces other problems. Also, you cannot throw "scaling" at every problem as a solver and expect every problem to disappear.
Your best bet is to make your game code elegant and fast. Think about how to solve problems efficiently. If you cannot solve something efficiently in JavaScript, but the problem can easily be extracted, why not write a little C component instead? That counts as a separate process as well, which reduces the load on your main node.js event loop.
Proxy?
Personally I do not see the advantage of the proxy level right now. You do not seem to expect a large number of users, so you won't need to solve the problems a CDN solves, or whatever... It's okay to think about it, but I would not invest much time there right now.
Technically, there is a high chance your web server software provides proxy functionality anyway. So it is OK to have it on paper, but I would not plan for dedicated hardware right now.
Epilogue
The rest seems more or less fine to me.
Little late to the game, but take a look here: http://goldfirestudios.com/blog/136/Horizontally-Scaling-Node.js-and-WebSockets-with-Redis
You did not mention anything to do with memory management. As you know, node.js doesn't share its memory with other processes, so an in-memory database is a must if you want to scale (Redis, Memcached, etc.). You need to set up publisher & subscriber events on each node to accept incoming requests from Redis. This way, you can scale up to any number of servers (in front of your HAProxy) and utilize the data piped through Redis.
There is also this node addon: http://blog.varunajayasiri.com/shared-memory-with-nodejs which lets you share memory between processes, but it only works under Linux. This will help if you don't want to send data across local processes all the time or deal with node's IPC API.
You can also fork child processes within node to get a new V8 isolate that helps with expensive CPU-bound tasks. For example, players can kill monsters and obtain quite a bit of loot within my action RPG game. I have a child process called LootGenerater, and basically whenever a player kills a monster the game sends the game id, mob_id, and user_id to the process via the default IPC API's .send. Once the child process receives them, it iterates over the large loot table, manages the items (stores them to redis, or whatever), and pipes the result back.
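The shape of that, as a rough sketch (the file names and rollLootTable are made up):

// parent.js: fork the worker once and delegate expensive work to it.
const { fork } = require('child_process');
const lootWorker = fork('./loot-generator.js');

lootWorker.send({ gameId: 1, mobId: 7, userId: 42 });
lootWorker.on('message', (loot) => {
  // ...store to redis / push to the player
});

// loot-generator.js: runs in its own V8 isolate.
process.on('message', (job) => {
  const loot = rollLootTable(job.mobId); // your own function
  process.send(loot);
});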
This helps free up the event loop greatly, and it's just one idea I can think of to help you scale. Most importantly, though, you will want to use an in-memory database system and make sure your game code architecture is designed around whatever database system you use. Don't make the mistake I did of now having to re-write everything :)
Hope this helps!
Note: If you do decide to go with Memcached, you will need to utilize another pub/sub system.
