I want to have an array in Redis (using Node) where I can add values and specify how long each should stay there. After that time limit the value should be deleted, and ideally I'd be able to hook a callback so I know what just left. For example, I may get a request with 120s, so I want to add that value to a map for that long and then have it deleted.
Is there a better way to do this? I thought of using EXPIRE, but that seems to apply only to keys, not to elements of an array.
Any thoughts would be great.
This is what I am doing:
app.get('/session/:length', function(req, res) {
  var length = parseInt(req.params.length, 10);
  var ip = req.connection.remoteAddress; // read the IP before using it below
  addToArray(length, ip);
  res.json({ip: ip, length: length});
});
Basically, when I add a value to the array I want it kept there only for the time that was passed in. So if you say 30 seconds, it's in the array for 30s, then it's gone and a callback fires. Maybe there is a better way to solve this problem?
What I do now is keep each IP and the time it was added in an array and periodically loop through it, checking and deleting, but I thought Redis might be able to do this automatically.
While there isn't an automatic way to do that in Redis, the common approach to these kinds of problems is to use a Redis sorted set. In your case, set the IP as the member's value and the expiry time (now + time to live) as the score, using epoch representation.
Instead of looping periodically, you can just call ZREMRANGEBYSCORE every once in a while.
Since set members are unique, however, that means that you'll only be able to save each IP once. If that's OK, just update the score for an IP with every hit from it, otherwise make the member value unique by concatenating the IP with the timestamp.
Lastly, to get the IPs that haven't "expired", use ZRANGEBYSCORE to fetch members with scores (expiry times) higher than now. Similarly, just before deleting with ZREMRANGEBYSCORE, fetch the members that expired for the callback logic you mentioned.
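A minimal sketch of that approach, assuming the ioredis client (the key name ips, the member format, and the 5-second sweep interval are my own choices):
const Redis = require('ioredis'); // assumed client library
const redis = new Redis();

// Add an IP with a time-to-live: the score is the expiry time in epoch milliseconds.
async function addWithTtl(ip, ttlSeconds) {
  const expiresAt = Date.now() + ttlSeconds * 1000;
  // Concatenate IP and timestamp so repeat hits from the same IP stay unique.
  await redis.zadd('ips', expiresAt, ip + ':' + Date.now());
}

// Periodic sweep: fetch expired members for the callback, then remove them.
async function sweep(onExpired) {
  const now = Date.now();
  const expired = await redis.zrangebyscore('ips', '-inf', now);
  if (expired.length > 0) {
    await redis.zremrangebyscore('ips', '-inf', now);
    expired.forEach(onExpired);
  }
}

setInterval(function () {
  sweep(function (member) { console.log('expired:', member); });
}, 5000);
Note there is a small race between the ZRANGEBYSCORE and the ZREMRANGEBYSCORE; wrapping them in a MULTI/EXEC transaction would close it if that matters for your callback logic.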
I have certain requirements and want to meet them as quickly as possible.
I have thousands of objects like the ones below:
{id:1, value:"value1"} ... {id:1000, value:"value1000"}
I want to access the objects above by id.
I want to remove all objects below a certain id every few minutes (my high-frequency algorithm generates thousands of objects every second).
I can clean up easily using this:
myArray = myArray.filter(function( obj ) {
return obj.id > cleanSize;
});
I can find an object by id using:
myArray.find(x => x.id === 45);
The problem is that find feels a little slow on larger data sets, so I created an object of objects like below:
const id = 22;
myArray["x" + id] = { id: id, value: "test" };
so I can access an item easily via myArray["x22"], but the problem is that I cannot find a way to remove older items by id.
Can someone suggest a better way to achieve the three points above using arrays or objects?
The trouble with your question is, you're asking for a way to finish an algorithm that is supposed to solve a problem of yours, but I think there's something fundamentally wrong with the problem to begin with :)
If you store a sizeable amount of data records, each associated with an ID, and allow your code to access them freely, then you cannot have another part of your code dump some of them to the bin out of the blue (say, from within some timer callback) just because they are becoming "too old". You must be sure nobody is still working on them (and will ever need to) before deleting any of them.
If you don't explicitly synchronize the creation and deletion of your records, you might end up with a code that happens to work (because your objects happen to be processed quickly enough never to be deleted too early), but will be likely to break anytime (if your processing time increases and your data becomes "too old" before being fully processed).
This is especially true in the context of a browser. Your code is supposed to run on any computer connected to the Internet, which could have dozens of reasons to be running 10 or 100 times slower than the machine you test your code on. So making assumptions about the processing time of thousands of records is asking for serious trouble.
Without further specification, it seems to me answering your question would be like helping you finish a gun that would only allow you to shoot yourself in the foot :)
All this being said, any JavaScript object inherently does exactly what you ask for, provided you're okay with using strings for IDs, since an object property name can also be used as an index in an associative array.
var associative_array = {}
var bob = { id:1456, name:"Bob" }
var ted = { id:2375, name:"Ted" }
// store some data with arbitrary ids
associative_array[bob.id] = bob
associative_array[ted.id] = ted
console.log(JSON.stringify(associative_array)) // Bob and Ted
// access data by id
var some_guy = associative_array[2375] // index will be converted to string anyway
console.log(JSON.stringify(some_guy)) // Ted
var some_other_guy = associative_array["1456"]
console.log(JSON.stringify(some_other_guy)) // Bob
var some_AWOL_guy = associative_array[9999]
console.log(JSON.stringify(some_AWOL_guy)) // undefined
// delete data by id
delete associative_array[bob.id] // so long, Bob
console.log(JSON.stringify(associative_array)) // only Ted left
Though I doubt speed will really be an issue, this mechanism is about as fast as you will ever get JavaScript to run, since the underlying data structure is a hash table, theoretically O(1).
Anything involving array methods like find() or filter() will run in at least O(n).
Besides, each invocation of filter() would waste memory and CPU recreating the array to no avail.
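For the cleanup requirement, a minimal sketch along the same lines (cleanBelow is a hypothetical helper, not part of any library):
// Delete every record whose numeric id is below the cutoff.
function cleanBelow(records, cleanSize) {
  for (var key in records) {
    if (records[key].id < cleanSize) {
      delete records[key]; // O(1) per deletion; no new array is allocated
    }
  }
}

cleanBelow(associative_array, 2000); // drops every record with id < 2000
The sweep itself is still O(n), but unlike filter() it frees entries in place instead of rebuilding the whole array.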
I have this function, which parses a value received over MQTT. The value is actually a timestamp sent by an Arduino and is a number like 1234, 1345, etc.:
var parts = msg.payload.trim().split(/[ |]+/);
var update = parts[10];
msg.payload = update;
return msg;
What I want, instead of the last value (the update variable in my case), is the difference between the last received value and the previous one.
Basically, if I receive 1234 and then 1345, I want to remember 1234 so that the value returned by the function is 1345 - 1234 = 111.
Thank you
If you want to store a value to compare against later, you need to look at how to use context to store it.
Context is normally an in-memory store for named variables, but it is backed by an API that can be used to persist it between restarts.
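As a rough sketch, assuming your code runs in a function node with the default in-memory context, it could become:
// Parse the incoming timestamp as before.
var parts = msg.payload.trim().split(/[ |]+/);
var update = parseInt(parts[10], 10);

// Read the previously stored value, then remember the current one.
var previous = context.get('previous');
context.set('previous', update);

if (previous === undefined) {
    return null; // first message: nothing to diff against yet
}

msg.payload = update - previous;
return msg;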
I wanted to suggest an alternative approach. Node-RED has a few core nodes that are designed to work across sequences, and for this purpose they keep an internal buffer. One of these is the batch node. Some use cases, like yours, can take advantage of this to store values without using context memory.
The flow I share below uses a batch node configured to group two messages into a sequence, meaning it always sends downstream the current payload together with the previous one. A join node then reduces that sequence to a single value: the difference between the two timestamps. Open the configuration dialog of each node to see exactly how they are set up. I configured the join node to apply a fix-up expression that divides the payload by one thousand, so you get the value in seconds (instead of milliseconds).
Flow:
[{"id":"3121012f.c8a3ce","type":"tab","label":"Flow 1","disabled":false,"info":""},{"id":"2ab0e0ba.9bd5f","type":"batch","z":"3121012f.c8a3ce","name":"","mode":"count","count":"2","overlap":"1","interval":10,"allowEmptySequence":false,"topics":[],"x":310,"y":280,"wires":[["342f97dd.23be08"]]},{"id":"17170419.f6b98c","type":"inject","z":"3121012f.c8a3ce","name":"","topic":"timedif","payload":"","payloadType":"date","repeat":"","crontab":"","once":false,"onceDelay":0.1,"x":160,"y":280,"wires":[["2ab0e0ba.9bd5f"]]},{"id":"342f97dd.23be08","type":"join","z":"3121012f.c8a3ce","name":"","mode":"reduce","build":"string","property":"payload","propertyType":"msg","key":"topic","joiner":"\\n","joinerType":"str","accumulate":false,"timeout":"","count":"","reduceRight":false,"reduceExp":"payload-$A","reduceInit":"0","reduceInitType":"num","reduceFixup":"$A/1000","x":450,"y":280,"wires":[["e83170ce.56c08"]]},{"id":"e83170ce.56c08","type":"debug","z":"3121012f.c8a3ce","name":"Debug 1","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"payload","x":600,"y":280,"wires":[]}]
I currently have a chat bot set up to store records in MongoDB. The object stored in Mongo looks like...
{ ..., expiration_time: 12451525, ... }
Where expiration_time is a number represented in minutes.
My initial approach was to use setInterval in the web application to query the database and delete all records whose expiration time is less than or equal to the current time. However, I feel this would add a lot of extra queries on top of the reads and writes the application already performs.
I read about storing functions in Mongo, but I'm not sure how to automate invoking the function so the records delete themselves.
I would definitely love any feedback, approach, or guidelines for best practices.
Thanks in advance!
You can use a TTL index:
db.messages.createIndex( { "expiration_time": 1 }, { expireAfterSeconds: 0 } )
The only requirement is that expiration_time has to be a date instead of an integer.
To expire documents at a specific clock time, begin by creating a TTL index on a field that holds values of BSON date type
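For example, a hypothetical insert that stores a BSON Date thirty minutes ahead, so the index above removes the document once that moment passes:
db.messages.insertOne({
  text: "hello",
  expiration_time: new Date(Date.now() + 30 * 60 * 1000) // 30 minutes from now
})
Keep in mind the TTL monitor runs roughly once a minute, so deletion is not instantaneous.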
IMHO there are 2 options:
One can use a cronjob to remove the out-of-date entries
One can use capped collections. A capped collection works like a ring buffer: the oldest entry gets overwritten. Here one must choose the right fixed size for the capped collection, e.g. max = 24 * 60 = 1440 documents if the chat bot writes to the collection once a minute.
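A sketch of the second option; the size in bytes is an assumption (MongoDB requires it in addition to max):
db.createCollection("messages", {
  capped: true,
  size: 1440 * 512, // byte budget, assuming ~512 bytes per document
  max: 1440         // at most one day of minute-interval entries
})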
I want to set data with a timestamp priority in the past or future, but not at the current date, and then be able to run queries with startAt and endAt for specific dates (365 days).
The push method is great for setting unique IDs on data and managing order. Is there any method to generate a unique push id, like the push() method does, but for a timestamp in the past or future?
You can attempt to create unique ids similar to what push does, but this seems like a lot of work for little gain when there are built-in tools in Firebase to order data. The simplest answer is to set a priority on each record using the server timestamp.
ref.push({ ...data..., ".priority": Firebase.ServerValue.TIMESTAMP });
To set one in the future or past, specify the timestamp manually.
ref.push({ ...data..., ".priority": timeInTheFuture });
.info/serverTimeOffset may also be helpful here for handling latency.
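For instance, with the legacy API used above you can estimate the server's clock on the client (the URL is a placeholder):
var offsetRef = new Firebase("https://example.firebaseio.com/.info/serverTimeOffset");
offsetRef.on("value", function (snap) {
  // Add the reported offset to local time to approximate the server's clock.
  var estimatedServerTime = Date.now() + snap.val();
});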
To create push ids, you would do something similar to the following:
Get the current timestamp and pad it to a fixed length (e.g. 16 characters)
Append a random series of characters, such as a random number or hash, also padded to a fixed length
Your entry will now look something like this: 000128198239:KHFDBWYBEFIWFE
You now have a lexicographically sortable id based on a timestamp, which is unique
Here's a helpful discussion on sorting numbers lexicographically
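A hypothetical sketch of the steps above (the pad widths and the alphabet are arbitrary choices):
function timestampPushId(timestamp) {
  // Pad the timestamp to a fixed width so ids sort lexicographically.
  var timePart = String(timestamp);
  while (timePart.length < 16) timePart = "0" + timePart;
  // Append random characters so ids minted in the same millisecond stay unique.
  var alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
  var randomPart = "";
  for (var i = 0; i < 14; i++) {
    randomPart += alphabet.charAt(Math.floor(Math.random() * alphabet.length));
  }
  return timePart + ":" + randomPart;
}

timestampPushId(Date.now() + 24 * 60 * 60 * 1000); // an id one day in the future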
I am writing a node.js application that relies on redis as its main database, and user info is stored in this database.
I currently have the user data (email, password, date created, etc.) in a hash named user:(incremental uid), plus a key email:(email) whose value is that same incremental uid.
When someone logs in, the app looks up email:(email) to get the (incremental uid), then reads the user data from user:(incremental uid).
This works great, however, if the number of users reaches into the millions (possible, but somewhat a distant issue), my database size will increase dramatically and I'll start running into some problems.
I'm wondering how to hash an email down to an integer that I can use to sort into hash buckets like this (pseudocode):
hash(thisguy@somedomain.com) returns 1234
1234 % 3 or something returns 1
store { thisguy@somedomain.com : (his incremental uid) } in hash emailbucket:1
Then when I need to look up the uid for thisguy@somedomain.com, I use a similar procedure:
hash(thisguy@somedomain.com) returns 1234
1234 % 3 or something returns 1
lookup thisguy@somedomain.com in hash emailbucket:1 returns his (incremental uid)
So, my questions in list form:
Is this practical / is there a better way?
How can I hash the email to a few digits?
What is the best way to organize these hashes into buckets?
It probably won't end up mattering that much. Redis doesn't have an integer type, so you're only saving yourself a few bytes (and less each time your counter rolls over to the next digit). Doing some napkin math, at a million users the difference in actual storage would be ~50 MB. With hard drives in the < $1/GB range, it's not worth the time it would take to implement.
As a thought experiment, you could maintain a key that is your current user counter, and just GET and INCR each time you add a new user.
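A sketch of that counter pattern, assuming the ioredis client (key names are arbitrary):
const Redis = require('ioredis'); // assumed client library
const redis = new Redis();

async function createUser(email, passwordHash) {
  // INCR atomically allocates the next incremental uid and returns it,
  // so a separate GET isn't needed.
  const uid = await redis.incr('user:next_id');
  await redis.hmset('user:' + uid, { email: email, password: passwordHash });
  await redis.set('email:' + email, uid); // the index used at login
  return uid;
}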
Yes, this is a better way to store millions of key-value pairs in hashes.
You need to design the bucketing algorithm yourself. For example, you could derive the bucket value from a timestamp so that it changes after every 1000 values; there are many other ways.
Read this article for more background: http://instagram-engineering.tumblr.com/post/12202313862/storing-hundreds-of-millions-of-simple-key-value
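If you do want buckets, a hypothetical sketch (the 31-based rolling hash and the 1024-bucket count are arbitrary choices, and the client is ioredis again):
const BUCKETS = 1024;

// Hash an email to a 32-bit unsigned integer, then pick a bucket.
function emailBucket(email) {
  var h = 0;
  for (var i = 0; i < email.length; i++) {
    h = (h * 31 + email.charCodeAt(i)) >>> 0;
  }
  return h % BUCKETS;
}

// Store and look up a uid through the email's bucket.
async function setUid(redis, email, uid) {
  await redis.hset('emailbucket:' + emailBucket(email), email, uid);
}
async function getUid(redis, email) {
  return redis.hget('emailbucket:' + emailBucket(email), email);
}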