Nodejs genetic algorithm memory management? - javascript

I'm trying to implement a genetic algorithm and have the problem of heap memory getting too large, to the extent that it throws an error.
I'm afraid it has something to do with my population array. I displayed the memory usage while running the program and noticed that it keeps growing and is never freed.
Consider the following setup:
let used = {}
class Trainer {
    static breed = function(elite) {
        // code for making a new generation
        return brood
    }
    constructor() {
        this.population = []
    }
    async evolve(data, options, callback) {
        // potential point where the error may be
        // because I see the memory growing and never getting smaller
        // but I think it should get smaller because I don't
        // reference the old generation any more, or do I?
        while (error >= threshold) {
            // train each member of the population async
            // and then make a new population out of the elite
            const elite = this.population.slice(0, 10)
            this.population = Trainer.breed(elite) // <--- ¯\_(ツ)_/¯
            callback(info)
            used = process.memoryUsage()
        }
    }
}
const trainer = new Trainer
const data = getData()
const options = makeOptions()
function log(info) {
    console.log(info)
    console.log(used)
}
async function test() {
    await trainer.evolve(data, options, log)
}
test()
Now I have some questions.
Is it true that on each generation the old array is no longer referenced, gets garbage collected, and frees up some memory?
How can I analyze the memory usage of each function to get better insight into where the problem in my code may be?
Because the code is very large, I tried to simplify my problem. But if you are interested in the complete code, please have a look at my repo:
https://github.com/kiro7shiro/image-generator
You can find the described setup from above in the "/cli/image-generator-evolve.js" file.
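One way to check whether old generations are actually being released is to sample heapUsed once per generation and look at the trend over many generations rather than at single values. The following is only a minimal sketch under assumptions, not the code from the repository: heapUsedMB, breed and runGenerations are hypothetical names, and the genome is assumed to be a plain array of numbers.

```javascript
// Hypothetical helper: current JS heap usage in MB, rounded to two decimals.
function heapUsedMB() {
  return Math.round((process.memoryUsage().heapUsed / 1024 / 1024) * 100) / 100;
}

// Hypothetical stand-in for Trainer.breed: builds a fresh brood from the elite.
function breed(elite, size) {
  const brood = [];
  for (let i = 0; i < size; i++) {
    const parent = elite[i % elite.length];
    // Each child gets its own mutated copy of the parent's genes.
    brood.push({ genes: parent.genes.map((g) => g + Math.random() - 0.5) });
  }
  return brood;
}

function runGenerations(generations, size) {
  let population = breed([{ genes: new Array(1000).fill(0) }], size);
  const samples = [];
  for (let g = 0; g < generations; g++) {
    const elite = population.slice(0, 10);
    // After this assignment this function no longer refers to the previous array;
    // the elite members stay reachable only through `elite` for this iteration.
    population = breed(elite, size);
    samples.push(heapUsedMB());
  }
  return { population, samples };
}

const { samples } = runGenerations(20, 200);
console.log(samples.join(' MB, ') + ' MB');
```

If the samples rise and fall around a stable level, the old arrays are being collected. If they only ever climb, something outside the loop (a closure, a cache, a log array) is probably still holding references to old members.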

Related

MemoryView utilization with latest JSImport API syntax to interop with Javascript - Blazor

As a corollary to Need C# JSImport async signature interop example to match async Javascript method - Blazor
I'd like to find a Span/ArraySegment solution to passing and receiving parameters via javascript interop in Blazor.
Here's my best attempt so far, with syntax I would expect to work:
The import
[JSImport("getMessage", "SampleJS")]
[return: JSMarshalAs<JSType.Promise<JSType.Any>>()]
internal static partial Task<object>
GetWelcomeMessage([JSMarshalAs<JSType.MemoryView>] ArraySegment<byte> bytes);
JS module
export async function getMessage(dataPointer) {
    var maskedData = new Uint8Array(dataPointer) // this just creates a zero-filled array
    console.log(maskedData)
    return await new Promise((resolve) => {
        setTimeout(() => {
            resolve(maskedData);
        }, 2000);
    });
}
This just logs the data (bytes) received, waits 2s and returns the same data.
Usage
byte[] sampleData = Encoding.UTF8.GetBytes("Hello from C#");
ArraySegment<byte> sampleDataPointer = new ArraySegment<byte>(sampleData);
object? result = await GetWelcomeMessage(sampleDataPointer);
if (result is byte[] bytes)
{
message = Encoding.UTF8.GetString(bytes);
Console.WriteLine($"Got {message} from {result.GetType()}");
}
So the problem is in the JavaScript method. I can't turn the incoming MemoryView parameter into a Uint8Array that I can work with. Any suggestions?
Many thanks to Mister Magoo for getting this off the ground
Here is the API of IMemoryView https://github.com/dotnet/runtime/blob/1631f312c6776c9e1d6aff0f13b3806f32bf250c/src/mono/wasm/runtime/dotnet.d.ts#L246-L264
The marshaler would GC pin it for you in case of ArraySegment but not for Span.
EDIT BELOW is from original poster so this answer gets the credit:
This answer eventually led me to the solution. My sample code works if you modify the JavaScript line from
var maskedData = new Uint8Array(dataPointer)
to
var maskedData = new Uint8Array(dataPointer.slice())
Many thanks!
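The difference between the two lines can be reproduced outside Blazor. In the sketch below, fakeMemoryView is a hypothetical stand-in that mimics only the length and slice() members listed in the linked IMemoryView declaration; it is not the real runtime object. Passing such an object straight to the Uint8Array constructor treats it as array-like, so only its length is used and every element comes out as 0, while slice() hands back a typed array holding the actual bytes.

```javascript
// Copies the bytes out of a MemoryView-like argument into a regular Uint8Array.
function toUint8Array(view) {
  // slice() with no arguments returns a typed-array copy of the whole view.
  return new Uint8Array(view.slice());
}

// Hypothetical stand-in for IMemoryView, used only to run this outside Blazor.
function fakeMemoryView(bytes) {
  return {
    get length() { return bytes.length; },
    slice(start, end) { return bytes.slice(start, end); },
  };
}

const view = fakeMemoryView(Uint8Array.from([72, 105]));

const zeroFilled = new Uint8Array(view); // array-like path: length 2, all zeros
const copied = toUint8Array(view);       // real bytes

console.log(zeroFilled);                        // both elements are 0
console.log(new TextDecoder().decode(copied));  // prints "Hi"
```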

NodeJS is using a lot of heap memory when nothing is happening

I have the following code saved to test.js:
setInterval(logMemoryUsage, 5000);
function logMemoryUsage() {
    const formatMemoryUsage = (data) => `${Math.round((data / 1024 / 1024) * 100) / 100} MB`;
    const memoryData = process.memoryUsage();
    console.log({
        rss: `${formatMemoryUsage(memoryData.rss)}`,
        heapTotal: `${formatMemoryUsage(memoryData.heapTotal)}`,
        heapUsed: `${formatMemoryUsage(memoryData.heapUsed)}`,
        external: `${formatMemoryUsage(memoryData.external)}`,
    });
}
I've run this code on two separate machines (one of which was a fresh Ubuntu install), and on both the heapUsed averages around 5 MB, despite my actual program not using that much:
{
rss: '24.37 MB',
heapTotal: '5.25 MB',
heapUsed: '4.63 MB',
external: '0.32 MB'
}
I understand that JavaScript is a JIT & garbage-collected language, and that node has other stuff it's doing behind the scenes, but 5 MB? It seems like an awful lot.
For reference, I encountered this on v16.18.1 and v18.12.1.
What is causing this, and is there anything I can do to reduce the amount of memory used by my node app?
Thanks.
(V8 developer here.)
"5 MB? It seems like an awful lot."
Indeed, it is. Welcome to JavaScript. This memory is used for language-level built-in objects (like Math, String, Date), as well as environment-specific globals (e.g. document in the browser, fs, net, vm in Node).
Run this script to get an idea of just how much stuff there is by default:
let g_discovered = new Set();
function Discover(obj, path) {
    g_discovered.add(obj);
    let properties = Object.getOwnPropertyNames(obj);
    for (let p of properties) {
        try {
            let v = obj[p];
            if (v === null) continue;
            if (g_discovered.has(v)) continue;
            let nested_path = path.length > 0 ? path + "." + p : p;
            console.log(`Some memory is used by ${nested_path}`);
            if (typeof v === "object" || typeof v === "function") {
                Discover(v, nested_path);
            }
        } catch (e) {
            // Ignore failures. Examples that will throw:
            // Function.prototype.arguments
            // Symbol.prototype.description
        }
    }
}
// Start from the global object. In a Node CommonJS file, a bare top-level
// "this" is module.exports (an empty object), so use globalThis instead.
Discover(globalThis, "");
You could create and inspect a heap snapshot if you wanted to dig deeper.
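As a rough sketch of how such a snapshot could be created from inside the process: Node's built-in v8 module provides writeHeapSnapshot, which writes a .heapsnapshot file that can be loaded in the Memory tab of Chrome DevTools. The dumpHeap helper name and the choice of the temp directory are assumptions made for this example.

```javascript
const v8 = require('v8');
const os = require('os');
const path = require('path');
const fs = require('fs');

// Hypothetical helper: writes a snapshot of the current V8 heap and returns its path.
function dumpHeap(dir = os.tmpdir()) {
  const file = path.join(dir, `heap-${process.pid}-${Date.now()}.heapsnapshot`);
  // writeHeapSnapshot is synchronous: the process is blocked until the file is written.
  return v8.writeHeapSnapshot(file);
}

const snapshotPath = dumpHeap();
console.log(`Snapshot written to ${snapshotPath} (${fs.statSync(snapshotPath).size} bytes)`);
```

Taking two snapshots some time apart and comparing them in DevTools shows which objects were added in between.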
"JavaScript is a JIT & garbage-collected language"
JavaScript as a language doesn't have an opinion on whether it is interpreted, AOT-compiled, or JIT-compiled. Most modern engines have decided that a combination of interpreting and JIT-compiling is the way to go, but that's an implementation detail. If anything, this reduces the amount of memory that's used after initialization.
The fact that JS is garbage-collected doesn't influence the minimum memory consumption of an empty heap. It can, however, mean that e.g. the strings created by the previous run of logMemoryUsage haven't been freed yet (because why spend time on GC when less than 5MB are in use?).
"just so I'm set moving forward, are there any specific things I can do to catch memory leaks and keep general memory usage low?"
That's a whole separate question. You can start at https://nodejs.org/en/docs/guides/diagnostics/memory/, and come back to StackOverflow when you have a specific problem that you can describe in detail, and where you can list steps you've already tried.

Object scoping rules seem to change due to seemingly irrelevant library?

So, I'm familiar with the general gist of JavaScript's features regarding objects. They're refcounted and if they go to zero, they die. Additionally, apple = banana where both are objects doesn't copy banana to apple but makes apple a reference to banana.
That being said, some of my code has something like this:
// imagine ws require() and setup here...
var RateLimit = require("ws-rate-limit")('10s', 80);
SickWebsocketServer.on("connection", function(mysocket, req){
    // blahblahblah...
    RateLimit(mysocket); // See below...
    mysocket.on("limited", function(){console.log("someone was limited!")});
    mysocket.on("message", function(data){
        if(JSON.parse(data).MyFlagToMessageASpecificWebsocketClient){ // obvs dont do this lol
            findme = MyArr.find(guy=>guy.Socket==mysocket);
            if(findme) console.log("TRIGGER PLS :)"); // GOAL
            else console.log("DON'T TRIGGER"); // SOMETHING WENT WRONG
        }
    });
    MyArr.push({MyName:"my SICK object", MyNumber:MyArr.length, Socket:mysocket})
});
The library used for rate limiting is called ws-rate-limit and I have pasted a shortened (non-code removed) version down below (since it's tiny). Imagine it to be in a package called ws-rate-limit (because it is :D).
const duration = require('css-duration')
module.exports = rateLimit
function rateLimit (rate, max) {
    const clients = []
    // Create an interval that resets message counts
    setInterval(() => {
        let i = clients.length
        while (i--) clients[i].messageCount = 0
    }, duration(rate))
    // Apply limiting to client:
    return function limit (client) {
        client.messageCount = 0
        client.on('newListener', function (name, listener) {
            if (name !== 'message' || listener._rated) return
            // Rate limiting wrapper over listener:
            function ratedListener (data, flags) {
                if (client.messageCount++ < max) listener(data, flags)
                else client.emit('limited', data, flags)
            }
            ratedListener._rated = true
            client.on('message', ratedListener)
            // Unset user's listener:
            process.nextTick(() => client.removeListener('message', listener))
        })
        // Push on clients array, and add handler to remove from array:
        clients.push(client)
        client.on('close', () => clients.splice(clients.indexOf(client), 1))
    }
}
My issue is that, when I do use the RateLimit function, the "DON'T TRIGGER" code triggers. If I literally remove that one single line (RateLimit(mysocket)) it goes into "TRIGGER PLS :)".
The above is obviously logically simplified from my actual application but I think you get the gist. Apologies for any misspellings that may lead to undefineds or stuff like that; I promise you my code works if not for the RateLimit(mysocket) line.
When I add console.logs into the find function to log both the guy.Socket object and the mysocket object, with the RateLimit(mysocket) line, the mysocket object's .toString() returns [object global] rather than [object Object]. I know that this is some complicated JavaScript object scoping problem, but I have no clue where to start in terms of investigating it.
Thank you! :)
I'll take a random shot in the dark based on intuition. My best guess is that your issue is with the guy.Socket==mysocket line. Comparing objects that way will only check if they both point to the same heap memory location, even if it's two different stack variables. In your example I can only assume that the RateLimit(mysocket) line is somehow creating a new heap location for that stack variable (creating a new object from it) and because of that your == comparison is then no longer equal (even if they have the exact same values) because they're pointing to different locations.
Try using: JSON.stringify(guy.Socket) === JSON.stringify(mysocket).
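To make the difference between the two comparisons concrete, here is a small self-contained sketch with plain objects (the names original, alias, lookalike and node are made up for this example). It also shows one caveat of the JSON.stringify approach: it throws a TypeError on objects that contain circular references, which socket-like objects may well do, so treat it as a diagnostic aid rather than a guaranteed fix.

```javascript
const original = { id: 1 };
const alias = original;       // second reference to the same object
const lookalike = { id: 1 };  // different object with equal contents

console.log(original == alias);      // true: same object
console.log(original == lookalike);  // false: different objects
console.log(JSON.stringify(original) === JSON.stringify(lookalike)); // true: same serialized contents

// An object that refers to itself cannot be serialized.
const node = { name: 'socket-like' };
node.self = node;

let stringifyFailed = false;
try {
  JSON.stringify(node);
} catch (e) {
  stringifyFailed = e instanceof TypeError;
}
console.log(stringifyFailed); // true
```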

nodejs: run module in sandbox

I have this turn-based NodeJs gaming app in which developers (anyone) can submit a player-robot. My NodeJS app will load all players and let them play against each other. Because I don't know anything about the code submitted I need to run it inside a sandbox.
For example, untrusted code might look like this:
let history = [];
export default class Player {
    constructor () {
        this.history = [];
    }
    move (info) {
        this.history.push(info);
    }
    done(result) {
        history.push({result: result, history: this.history});
    }
}
Now, in my main app I would like to do something like
import Player1 from 'sandbox/player1';
import Player2 from 'sandbox/player2';
....
for (let outer = 0; outer < 10; outer++) {
    let player1 = createSandboxedInstance(Player1);
    let player2 = createSandboxedInstance(Player2);
    for (let inner = 0; inner < 1000000; inner++) {
        ...
        let move1 = player1.move();
        let move2 = player2.move();
        ...
    }
}
What I would like the sandbox/createSandboxedInstance environment to take care of is:
The Player class should not have access to the filesystem or the internet
The Player class should not have access to the app's global variables
Any state should be reset (like class variables)
probably more things :)
I think that I should use the vm module. Something like this probably:
var vm = require('vm');
var script = new vm.Script('function move(info){ ... } ...', {conext});
var sandbox = script.runInNewContext();
script.move(..); // or
sandbox.move(..);
However, I cannot get it to work such that I can call the move method. Is something like this even possible?
Don't do this yourself. Use an existing library. There are quite a few issues you have to deal with if you were to write it yourself. For example: How do you handle a user writing a never ending for-loop?
How to run untrusted code serverside?
If you are planning on writing it yourself then yes, you will need the vm module.
By passing in an empty "sandbox" you have removed all global variables.
script.runInNewContext({});
Next you'll need to figure out how you want to handle the never-ending for-loop. You'll have to create a new process to handle this scenario. Do you create one process to manage ALL untrusted code? If you do, then you'll have to kill ALL untrusted code if a single script hangs. Do you create a new process for each piece of untrusted code? If you do, then you won't be happy with performance: creating a new process can take a second or two. You could require the child process to "notify" the main process that it's still alive. If it fails to notify within 5 seconds (or whatever your threshold is), kill the process.
Note: script.runInNewContext does accept a "timeout" option (given in milliseconds; if the code takes longer than that, an exception is thrown), but the problem is that it does not cover asynchronous code the script schedules (according to another Stack Overflow post), although you could defend against that by not introducing setTimeout, setInterval, or setImmediate into the scope. However, even if you set it to 1 second, NO other code can run during that second in that process. So if you have 1000 scripts to run, it could take up to 1000 seconds (almost 17 minutes) to run them all. At least running each in its own process will let them run in parallel.
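Here's an example of why the timeout option won't work for you. The timeout only applies while runInNewContext itself is executing, not to the later call to sandbox.move: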
var script = new vm.Script('move = function move(info) { for(var i = 0; i < 100000; i++) { console.log(i); } }');
var sandbox = { move: null, console: console };
var result = script.runInNewContext(sandbox, { timeout: 1 });
sandbox.move('woah');
Next you'll need to figure out how to communicate from your main process into a child process and then into the vm. I'm not going to get into communicating between processes, as you can find that pretty easily. By calling script.runInNewContext you are executing the code right then and there, which lets you set global variables:
var script = new vm.Script('move = function move(info) { console.log("test: " + info); }');
var sandbox = { move: null, console: console };
var result = script.runInNewContext(sandbox);
sandbox.move('success');
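One possible way around the timeout limitation described above is to route every call to move back through the vm, so that the timeout also covers the call itself. This is only a sketch under assumptions, not a complete sandbox: loadMove and callMove are hypothetical names, the untrusted source is assumed to assign a global move function, and arguments are passed as JSON text. Note that the Node.js documentation states the vm module is not a security mechanism, so a separate process is still needed for real isolation.

```javascript
const vm = require('vm');

// Loads untrusted source into its own context and returns a wrapper that
// calls the `move` global it defined, with a timeout applied to every call.
function loadMove(source, timeoutMs) {
  // Only the properties listed here are added as globals inside the context.
  const sandbox = vm.createContext({ move: null });
  new vm.Script(source).runInContext(sandbox, { timeout: timeoutMs });

  return function callMove(info) {
    // Pass the argument as JSON text so no host object crosses into the context.
    sandbox.infoJson = JSON.stringify(info);
    return vm.runInContext('move(JSON.parse(infoJson))', sandbox, { timeout: timeoutMs });
  };
}

const double = loadMove('move = function (info) { return info * 2; }', 100);
console.log(double(21)); // 42

const spin = loadMove('move = function () { while (true) {} }', 100);
try {
  spin({});
} catch (err) {
  // The endless loop is interrupted once the timeout expires.
  console.log('call aborted:', err.message);
}
```

The synchronous loop still blocks the whole process until the timeout fires, so the point above about running scripts in separate processes for parallelism continues to apply.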

Adding object to PFRelation through Cloud Code

I am trying to add an object to a PFRelation in Cloud Code. I'm not too comfortable with JS but after a few hours, I've thrown in the towel.
var relation = user.relation("habits");
relation.add(newHabit);
user.save().then(function(success) {
response.success("success!");
});
I made sure that user and habit are valid objects so that isn't the issue. Also, since I am editing a PFUser, I am using the masterkey:
Parse.Cloud.useMasterKey();
Don't throw in the towel yet. The likely cause is hinted at by the variable name newHabit. If it's really new, that's the problem. Objects being saved to relations have to have once been saved themselves. They cannot be new.
So...
var user = // got the user somehow
var newHabit = // create the new habit
// save it, and use promises to keep the code organized
newHabit.save().then(function() {
    // newHabit is no longer new, so maybe that wasn't a great variable name
    var relation = user.relation("habits");
    relation.add(newHabit);
    return user.save();
}).then(function(success) {
    response.success(success);
}, function(error) {
    // you would have had a good hint if this line was here
    response.error(error);
});
