Dealing with arbitrarily large inbound data in JavaScript

Here's a function to remove specified characters from a string:
function remove(str, chars) {
  var set = new Set(chars);
  return [...str].filter(i => !set.has(i)).join('');
}
console.log(remove('hello world', 'el') === 'ho word'); // true
But... what if the inbound string is arbitrarily large and possibly continually extended?
Presumably we need a completely different strategy to deal with it in a piecemeal fashion?
Would such an implementation look like constructing a buffer object that is periodically updated as the data is inbound, and then having sampling logic to deal with the "delta", process it and pass it on?
And that this would have to be done asynchronously to avoid blocking everything else on the event loop?
Is this essentially what Node.js streams are?

[...str] converts the string into an array of 1-character strings, which occupies additional memory. Then .filter() produces another array of strings, which can be nearly as big as the first one, depending on the input data. And then there is the resulting string.
If you are concerned about memory and/or performance, you can implement this with a regular "for" loop and the "charAt" function.
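A sketch of that allocation-light version, keeping the same behavior as the original remove():

```javascript
// Same result as the [...str].filter().join('') version, but built
// with a plain for loop and charAt, so no intermediate arrays are
// allocated; only the output string grows.
function remove(str, chars) {
  var set = new Set(chars);
  var result = '';
  for (var i = 0; i < str.length; i++) {
    var ch = str.charAt(i);
    if (!set.has(ch)) result += ch;
  }
  return result;
}

console.log(remove('hello world', 'el') === 'ho word'); // true
```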

Related

Manually "compressing" a very large number of boolean values in JSON

We have a data model where each entity has 600 boolean values. All of this data needs to travel over the wire from a node.js backend to an Angular frontend, via JSON.
I was thinking about various ways to optimize it (this is an internal API and is not public, so adherence to best practices is less important than performance and saving bandwidth).
I am not a native Javascript speaker, so was hoping to get some feedback on some of the options I was considering, which are:
Turning it into a bitfield and using a huge (600-bit) BigInt.
Is this a feasible approach? I can imagine it would probably be pretty horrific in terms of performance
Splitting the 600 bits into integers and putting those into an array in the JSON (note that JS numbers only represent integers exactly up to 53 bits, so this would take at least 12 integers, not 10)
Base64 encoding a binary blob (will be decoded to a UInt8Array I'm assuming?)
Using something like Protobuf? It might be overkill because I don't want more than 1-2 hours spent on this optimization; definitely don't want to make major changes to the architecture either
Side note: We don't have compression on the server end due to infrastructure reasons, which makes this more complicated and is the reason for us implementing this on the data level.
Thanks!
As Evan points out, you can transform each boolean into a single character, e.g. "t" for true and "f" for false. The 600 booleans then become a joined string of 600 characters, which can easily be split into, say, 3 strings of 200 characters each. Once received on the front end, just concatenate the parts; recovering your boolean values from the string is then possible with a simple regex or a map.
I don't know how your data is produced and consumed, but this conversion step is the part I think should be automated.
Once the final string is obtained on the front end, here is an example that converts the string into an array of your 600 booleans. It is also possible to keep named indexes by building an object instead of an array.
function convert_myBool(str) {
  // map each character back to a boolean: "t" -> true, anything else -> false
  // (map is simpler and faster than a regex replace here)
  return str.split('').map(value => value === 't');
}
I wrote this off the top of my head, so of course it can be reconsidered, improved, replaced, etc. Hoping to have helped :)
Can it be sorted in any way? If there are boolean values that always occur in conjunction with a related value, you may be able to group them and simplify.
Depending on what your use for that data is, you may be able to cache some of it or memoize based on usage frequency. There would be a space tradeoff with caching, however.
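For reference, option 3 from the question (base64-encoding a packed binary blob) can be sketched as follows. This assumes Node's Buffer on the server side; packBools and unpackBools are illustrative names. 600 booleans pack into 75 bytes, or 100 base64 characters, versus 600 characters for the "t"/"f" string:

```javascript
// Pack an array of booleans into bytes (LSB-first within each byte),
// then base64-encode for JSON transport. Assumes Node.js Buffer.
function packBools(bools) {
  const bytes = new Uint8Array(Math.ceil(bools.length / 8));
  bools.forEach((b, i) => {
    if (b) bytes[i >> 3] |= 1 << (i & 7); // set bit i
  });
  return Buffer.from(bytes).toString('base64');
}

// Decode base64 back into `count` booleans, reading bits in the
// same LSB-first order they were written.
function unpackBools(b64, count) {
  const bytes = Buffer.from(b64, 'base64');
  const out = [];
  for (let i = 0; i < count; i++) {
    out.push(((bytes[i >> 3] >> (i & 7)) & 1) === 1);
  }
  return out;
}
```

In the browser, the same decoding works on a Uint8Array obtained from atob() instead of Buffer.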

accessing and removing objects by ID

I have certain requirements, and I want to do the following as quickly as possible.
I have thousands of objects like below:
{id:1,value:"value1"} . . {id:1000,value:"value1000"}
I want to access the above objects by id.
I want to clean out objects below a certain id every few minutes (because my high-frequency algorithm generates thousands of objects every second).
I can clean easily by using this.
myArray = myArray.filter(function(obj) {
  return obj.id > cleanSize;
});
I can find the object by id using
myArray.find(x => x.id === 45);
The problem is, I feel that find is a little slow when there are larger sets of data. So I created an object of objects, like below:
const id = 22;
myArray["x" + id] = { id: id, value: "test" };
so I can access an item by id easily via myArray["x22"];, but the problem is that I cannot find a way to remove older items by id.
Can someone guide me to a better way to achieve the three points I mentioned above, using arrays or objects?
The trouble with your question is, you're asking for a way to finish an algorithm that is supposed to solve a problem of yours, but I think there's something fundamentally wrong with the problem to begin with :)
If you store a sizeable amount of data records, each associated with an ID, and allow your code to access them freely, then you cannot have another part of your code dump some of them to the bin out of the blue (say, from within some timer callback) just because they are becoming "too old". You must be sure nobody is still working on them (and will ever need to) before deleting any of them.
If you don't explicitly synchronize the creation and deletion of your records, you might end up with a code that happens to work (because your objects happen to be processed quickly enough never to be deleted too early), but will be likely to break anytime (if your processing time increases and your data becomes "too old" before being fully processed).
This is especially true in the context of a browser. Your code is supposed to run on any computer connected to the Internet, which could have dozens of reasons to be running 10 or 100 times slower than the machine you test your code on. So making assumptions about the processing time of thousands of records is asking for serious trouble.
Without further specification, it seems to me answering your question would be like helping you finish a gun that would only allow you to shoot yourself in the foot :)
All this being said, any JavaScript object inherently does exactly what you ask for, provided you're okay with using strings for IDs, since an object property name can also be used as an index in an associative array.
var associative_array = {}
var bob = { id:1456, name:"Bob" }
var ted = { id:2375, name:"Ted" }
// store some data with arbitrary ids
associative_array[bob.id] = bob
associative_array[ted.id] = ted
console.log(JSON.stringify(associative_array)) // Bob and Ted
// access data by id
var some_guy = associative_array[2375] // index will be converted to string anyway
console.log(JSON.stringify(some_guy)) // Ted
var some_other_guy = associative_array["1456"]
console.log(JSON.stringify(some_other_guy)) // Bob
var some_AWOL_guy = associative_array[9999]
console.log(JSON.stringify(some_AWOL_guy)) // undefined
// delete data by id
delete associative_array[bob.id] // so long, Bob
console.log(JSON.stringify(associative_array)) // only Ted left
Though I doubt speed will really be an issue, this mechanism is about as fast as you will ever get JavaScript to run, since the underlying data structure is a hash table, theoretically O(1).
Anything involving array methods like find() or filter() will run in at least O(n).
Besides, each invocation of filter() would waste memory and CPU recreating the array to no avail.
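A sketch of how this could look with a Map instead of a plain object (the names here are illustrative). A Map preserves insertion order, so if IDs are inserted in increasing order, cleanup can walk from the front and stop at the first ID that should survive:

```javascript
// Map-based store: O(1) access by id, and cleanup that only touches
// the entries it actually deletes. Deleting entries while iterating
// a Map is safe per the spec.
const byId = new Map();

function add(obj) {
  byId.set(obj.id, obj);
}

function cleanBelow(minId) {
  for (const id of byId.keys()) {
    if (id >= minId) break; // everything after this was inserted later
    byId.delete(id);
  }
}

add({ id: 1, value: 'value1' });
add({ id: 2, value: 'value2' });
add({ id: 3, value: 'value3' });
cleanBelow(3);
console.log(byId.get(3).value); // "value3"
console.log(byId.has(1));       // false
```

Note this keeps keys as numbers (a Map does not stringify keys the way object properties do), so lookups with the original numeric id just work.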

Get difference between last and previous number

I have this function which parse a value received on mqtt. The value is actually a timestamp send by an arduino and is number like 1234 , 1345 etc...
var parts = msg.payload.trim().split(/[ |]+/);
var update = parts[10];
msg.payload = update;
return msg;
What I want, instead of the last value (the update variable in my case), is the difference between the last received value and the previous one.
Basically, if I receive 1234 and then 1345, I want to remember 1234, and the value returned by the function should be 1345 - 1234 = 111.
Thank you
If you want to store a value to compare to later you need to look at how to use context to store it.
The context is normally an in memory store for named variables, but it is backed by an API that can be used to persist the context between restarts.
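As a sketch, a function node using context could look like the following. The context stub at the top is only there so the snippet runs standalone; inside Node-RED, context.get/context.set are provided for you:

```javascript
// Stand-in for Node-RED's per-node context store, so this runs outside
// Node-RED. In a real function node, delete this stub.
const store = {};
const context = {
  get: (k) => store[k],
  set: (k, v) => { store[k] = v; }
};

// Body of the function node: remember the previous payload and emit
// the difference between the current and previous values.
function onMessage(msg) {
  const previous = context.get('previous');
  context.set('previous', msg.payload);
  if (previous === undefined) return null; // nothing to compare yet
  msg.payload = msg.payload - previous;
  return msg;
}

console.log(onMessage({ payload: 1234 })); // null (first message)
console.log(onMessage({ payload: 1345 })); // { payload: 111 }
```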
I wanted to suggest an alternative approach. Node-RED has a few core nodes that are designed to work across sequences and, for this purpose, keep an internal buffer. One of these is the batch node. Some use cases, like yours, can take advantage of this functionality to store values without using context memory.
The flow I share below uses a batch node configured to group two messages in a sequence, meaning it will always send downstream the current payload and the previous one. Then a join node reduces that sequence to a single value: the difference between the timestamps. Open the configuration dialog for each node to fully understand how they are set up to achieve the desired goal. I configured the join node to apply a fix-up expression that divides the payload by one thousand, so you get the value in seconds (instead of milliseconds).
Flow:
[{"id":"3121012f.c8a3ce","type":"tab","label":"Flow 1","disabled":false,"info":""},{"id":"2ab0e0ba.9bd5f","type":"batch","z":"3121012f.c8a3ce","name":"","mode":"count","count":"2","overlap":"1","interval":10,"allowEmptySequence":false,"topics":[],"x":310,"y":280,"wires":[["342f97dd.23be08"]]},{"id":"17170419.f6b98c","type":"inject","z":"3121012f.c8a3ce","name":"","topic":"timedif","payload":"","payloadType":"date","repeat":"","crontab":"","once":false,"onceDelay":0.1,"x":160,"y":280,"wires":[["2ab0e0ba.9bd5f"]]},{"id":"342f97dd.23be08","type":"join","z":"3121012f.c8a3ce","name":"","mode":"reduce","build":"string","property":"payload","propertyType":"msg","key":"topic","joiner":"\\n","joinerType":"str","accumulate":false,"timeout":"","count":"","reduceRight":false,"reduceExp":"payload-$A","reduceInit":"0","reduceInitType":"num","reduceFixup":"$A/1000","x":450,"y":280,"wires":[["e83170ce.56c08"]]},{"id":"e83170ce.56c08","type":"debug","z":"3121012f.c8a3ce","name":"Debug 1","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"payload","x":600,"y":280,"wires":[]}]

Optimal way to search for certain key value in Array of objects in JavaScript

I have an Array of Objects which has a key, say PhoneNumber (along with other key-value pairs). I have a phone number value that I'm looking for in this Array. I'm performing a linear search on this array for the phone number and breaking out of the loop as soon as I find the object (hence, if I'm lucky, I may not need to traverse the entire Array).
The best case is when I find the phone number early (and stop searching), but more likely I won't find it at all and will traverse the whole array in vain.
Update
I thought I should add this: the search space (the Array of Objects) will have around ~500 elements, so this linear search on its own may not be a performance concern, but many other tasks are performed alongside this search, so I'm looking for as many micro-optimizations as possible.
Update 2 (In response to Elias Van Ootegem's comment)
I think my Array has something inappropriate in its structure, such that neither JSON.stringify() (Uncaught TypeError: Converting circular structure to JSON) nor Ext.JSON.encode() (Maximum call stack exceeded) works to convert the array into a JSON string.
But, anyway to do this even faster?
Create a lookup object, which maps the phone number (as key) to the index inside the array.
var dataset = [{phone:'0123 456 689'},{phone:'0987 654 321'},...];
var lookup = {'0123 456 689':0,'0987 654 321':1,...};
// search like this...
var number = '0987 654 321';
var obj = dataset[lookup[number]];
But this is probably overkill for most use-cases, as a linear search should be fast enough for users even with thousands of entries.
It depends on the usage.
If you search this array many times, you should use a map instead of an array. Build a hash with {key: data} and get the data by its key (convert the array to a hash in the JavaScript).
If you only search the array once, the process of converting it to a map will take longer than the search itself. In that case, linear search is the best way.
The costs of both solutions are (considering n: array length, and m: number of searches):
First solution (hash): O(n + m), since building the hash is O(n) and each lookup is O(1) on average.
Second solution (linear search): O(n * m).
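The first solution can be sketched with a Map keyed by phone number (the sample records below are made up for illustration):

```javascript
// Build the Map once in O(n); every subsequent lookup is O(1) on
// average, instead of scanning the array with find() each time.
const people = [
  { PhoneNumber: '555-0101', name: 'Ann' },
  { PhoneNumber: '555-0102', name: 'Bob' }
];

const byPhone = new Map(people.map(p => [p.PhoneNumber, p]));

console.log(byPhone.get('555-0102').name); // "Bob"
console.log(byPhone.get('555-0199'));      // undefined (not found)
```

If the array changes over time, the Map must be kept in sync (update it on every insert/delete), which is the hidden cost of this approach.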

Convert JSON to array of objects with custom keys and values

I receive from the server a JSON string:
{0:["aNumber","aText","anID"],1:["aNumber","aText","anID"]...
I must elaborate this string so that:
aNumber is concatenated with client side strings (say, it becomes "http://www.myurl.com/aNumber.jpg");
aNumber becomes the value of url in array of objects;
aText becomes the value of caption in the same array;
anID becomes the value of id in the same array;
[{url:"http://www.myurl.com/aNumber.jpg",caption:"aText",id:"anID"},{url:"http://www.myurl.com/aNumber.jpg",caption:"aText",id:"anID"}...
I know perfectly well how to do this, but I wanted to know whether it is possible to do the same thing while avoiding a loop: the JSON is really huge (more than 10,000 items) in a mobile context, so I was hoping for something magic to improve performance.
Try looping through 10,000 items in a mobile context. Then try 100,000 and then 1,000,000. You'll probably see that looping is not the greatest performance bottleneck.
You can't really do that; the best solution here is to convert one specific child array in the object only when you need it.
Anyway, the loop is not the slow part: the longest step is parsing the JSON string into an object.
For your loop, I would have made something like:
var obj = JSON.parse('{"0":["aNumber","aText","anID"],"1":["aNumber","aText","anID"]}'); // JSON.parse takes a string
var arr = [];
for (var i in obj) {
  var o = obj[i]; // cache the lookup; improves performance on big objects
  arr.push({url: "http://www.myurl.com/" + o[0] + ".jpg", caption: o[1], id: o[2]});
}
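For completeness, the same single pass written with Object.keys().map(); it is still a loop under the hood, which is the best achievable here. The sample values below are made up:

```javascript
// One pass over the parsed object, producing the array of
// {url, caption, id} objects directly.
const obj = JSON.parse('{"0":["42","a caption","id1"],"1":["43","other text","id2"]}');

const arr = Object.keys(obj).map(k => ({
  url: 'http://www.myurl.com/' + obj[k][0] + '.jpg',
  caption: obj[k][1],
  id: obj[k][2]
}));

console.log(arr[0].url); // "http://www.myurl.com/42.jpg"
```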
