Firebase "update" operation downloads data? - javascript

I was profiling a "download leak" in my Firebase database (I'm using the JavaScript SDK / Firebase Functions on Node.js) and finally narrowed it down to the update function, which surprisingly causes data to be downloaded. This impacts my billing quite significantly: roughly 50% of the bill comes from this leak.
Firebase functions index.js:
exports.myTrigger = functions.database.ref("some/data/path").onWrite((data, context) => {
  var dbRootRef = data.after.ref.root;
  return dbRootRef.child("/user/gCapeUausrUSDRqZH8tPzcrqnF42/wr").update({field1: "val1", field2: "val2"});
});
This function generates downloads at the "/user/gCapeUausrUSDRqZH8tPzcrqnF42/wr" node.
If I change the paths to something like this:
exports.myTrigger = functions.database.ref("some/data/path").onWrite((data, context) => {
  var dbRootRef = data.after.ref.root;
  return dbRootRef.child("/user/gCapeUausrUSDRqZH8tPzcrqnF42").update({"wr/field1": "val1", "wr/field2": "val2"});
});
Downloads are then generated at the "/user/gCapeUausrUSDRqZH8tPzcrqnF42" node instead.
Here are the results of firebase database:profile.
How can I get rid of the download while updating data, or at least reduce the usage, since I only need to upload?

I don't think this is possible with a Firebase Cloud Functions trigger.
The onWrite((data, context) => ...) handler receives the complete DataSnapshot for the watched path, and there is no way to configure it not to fetch that value.
Still, there are two things you can do to help reduce the data cost:
Watch a narrower path for the trigger, e.g. functions.database.ref("some/data/path") rather than ("some").
Use a more specific hook, i.e. onCreate() and onUpdate() rather than onWrite().
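For example, combining both points might look like this (a minimal sketch, assuming firebase-functions v1; the paths and fields are the placeholders from the question):
exports.myTrigger = functions.database
  // Watch the narrowest path that still captures the change you care about.
  .ref("some/data/path")
  // onUpdate fires only when existing data changes; onWrite also fires
  // for creates and deletes, so it triggers (and downloads) more often.
  .onUpdate((change, context) => {
    return change.after.ref.root
      .child("/user/gCapeUausrUSDRqZH8tPzcrqnF42/wr")
      .update({field1: "val1", field2: "val2"});
  });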

You should expect that all operations will round-trip with your client code. Otherwise, how would the client know when the work is complete? It takes some space to express that. The screenshot you're showing (which is very tiny and hard to read; consider copying the text directly into your question) indicates a very small amount of download data.
To get a better sense of the real cost, run multiple tests and check whether that tiny cost is actually just part of the one-time handshake between client and server when the connection is established. That cost may not be an issue over time, since your function code maintains a persistent connection while the Cloud Functions instance is reused.

Related

Can Firebase transform data server-side before writing it?

According to this documentation, and this accompanying example, Firebase tends to follow this flow when transforming newly written data:
Client writes data to Firebase, which is immediately accepted
The supplied Cloud Function is triggered, which transforms the data (in the example above, it removes swear words)
The transformed data is written again, overwriting the original data written in step 1
Maybe I'm missing something here, but this flow seems to present some problems. For example, if there is an error in step 2 above and step 3 never fires, the un-transformed data will just linger in the database. It seems like it would be better to transform the data as soon as it hits the server, but before writing it. This would be followed by a single write operation, which leaves no loose artifacts behind if it fails. Is there any way in the current Firebase + Google Cloud Functions stack to add these types of pre-write data transforms?
My (tentative and weird) solution so far is to have a "shadow" /_temp/{endpoint} area in my Firebase db, so that when I want to write to /{endpoint}, I write there instead, which then triggers the relevant cloud function to do the transformation before writing to /{endpoint}. This at least prevents potentially incomplete data from leaking into my database, but it seems very inelegant and "hacky."
I'd also be interested to know if there are any server-side methods for transforming data before responding to read requests.
There is no hook in the Firebase Database (neither through Cloud Functions nor elsewhere) that allows you to modify values before they're written to the database. The temporary queue is the idiomatic way to address this use case. It functions much like a moderation queue in most forum software.
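A minimal sketch of that queue pattern, assuming the asker's /_temp shadow path and a hypothetical transform() helper:
exports.moderate = functions.database.ref("/_temp/{endpoint}/{pushId}")
  .onCreate((snapshot, context) => {
    // transform() is hypothetical: whatever server-side change you need.
    const transformed = transform(snapshot.val());
    // Write the transformed data to the real location, then remove the
    // temporary entry so nothing untransformed lingers in the database.
    return snapshot.ref.root
      .child(context.params.endpoint + "/" + context.params.pushId)
      .set(transformed)
      .then(() => snapshot.ref.remove());
  });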
You could use an HTTP function to create an endpoint that your code calls, and perform the transformation there. You could use a similar pattern for reading data, although you'd have to rebuild the realtime synchronization capabilities of Firebase yourself.
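A sketch of that HTTP approach, assuming the same hypothetical transform() and an initialized firebase-admin SDK (admin). A real endpoint would also need to validate the path and authenticate the caller:
exports.write = functions.https.onRequest((req, res) => {
  // Transform before anything touches the database, so only one
  // write happens and no untransformed data is ever stored.
  const transformed = transform(req.body);
  admin.database().ref(req.query.path).set(transformed)
    .then(() => res.status(200).send("OK"))
    .catch((err) => res.status(500).send(err.message));
});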

Meteor js publish and subscribe is very slow to react

I have used Meteor's publish and subscribe to interact between client and server. In my scenario, I use D3.js to generate a bar chart, and as soon as data is entered into the MongoDB collection, a client-side function redraws the chart. My issue is that publish/subscribe is too slow to react, and even if I limit the number of documents returned by MongoDB, the issue persists. It is also inconsistent: sometimes it reacts in under 1 second, other times it takes 4-5 seconds. Please guide me on what to do and what is wrong with my implementation.
Here is the server-side code:
Test = new Mongo.Collection("test");
Meteor.publish('allowedData', function() {
  return Test.find({});
});
and here is the client-side code:
Test = new Mongo.Collection("test");
Meteor.subscribe('allowedData');
Meteor.setTimeout(function() {
  Test.find().observe({
    added: function(document) {
      //something
    },
    changed: function() {
      //something
    },
    removed: function() {
      //something
    }
  });
}, 1000); // delay value assumed; the closing of the snippet was truncated
From your comments I see that you need a report chart that is reactive. Even though that is your requirement, a chart like this is too expensive to keep fully reactive: when your system grows bigger, say around 10,000 documents for one single chart, it will crash your server frequently.
To work around this problem, I have two suggestions:
Define a method that returns the data for the chart, and set up a job/interval timer on the client to call that method periodically (see the sketch after this list). The interval value depends on your needs; 10 seconds should be fine for charts. It is not completely reactive this way, since you only get the newest data after each interval, but it is still better than a slow, crash-prone system. You can find good packages to manage jobs/timers here.
Use the Meteor package meteor-publish-join (disclaimer: I am the author). It is made to solve exactly the kind of problem you have: the need to do reactive aggregations/joins on a big data set while keeping good overall performance.
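A minimal sketch of the first suggestion (the method name, the limit, and drawBarChart() are placeholders):
// server
Meteor.methods({
  chartData: function () {
    // Replace with whatever aggregation your chart actually needs.
    return Test.find({}, {limit: 100}).fetch();
  }
});

// client
Meteor.setInterval(function () {
  Meteor.call('chartData', function (error, data) {
    if (!error) {
      drawBarChart(data); // your existing D3 rendering code
    }
  });
}, 10000); // 10 seconds, per the suggestion above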

How to tell whether my firebase node is used?

Firebase offers some overall analytics in its App Dashboard, but I need to know, on a per-node basis, whether my stored data is ever used or is just lying idle.
Why? It's simple: we are learning while developing, which makes this a very fast-evolving app. Not only does the logic change, but the stored data also needs to be refactored from time to time. I would like to get rid of abandoned and forgotten data. Any ideas?
In the best case, I would like to know:
When was a node used last time? (was it used at all?)
How many times was it used in 1h/24h/1w/1M?
Differentiate between read/write operations
2017 update
Cloud Functions trigger automatically and run server-side.
https://firebase.google.com/docs/functions/
https://howtofirebase.com/firebase-cloud-functions-753935e80323
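For example, a trigger can at least record write statistics automatically (a sketch, assuming an initialized firebase-admin SDK (admin); the /_stats location is an arbitrary choice, and note that reads still cannot be hooked this way):
exports.trackWrites = functions.database.ref("/some/node/{child}")
  .onWrite((change, context) => {
    const stats = change.after.ref.root
      .child("_stats/some_node/" + context.params.child);
    // Count the write and remember when it happened.
    return stats.child("writes").transaction((n) => (n || 0) + 1)
      .then(() => stats.child("lastWrite").set(admin.database.ServerValue.TIMESTAMP));
  });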
2016 answer
So apparently Firebase itself doesn't provide any of this.
The only way I can think of right now is to create wrappers for the Firebase query and write functions, and either gather the statistics in the client app or create a dedicated node for storing them.
If you store the statistics in Firebase, the wrapper for the write functions (set, update, push, remove, setWithPriority) is relatively easy. The query functions (on, once) will have to record their statistics in a success callback.
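A sketch of such wrappers (the /_stats paths and the key encoding are arbitrary choices):
// Write wrapper: performs the set(), then records where and when.
function trackedSet(path, value) {
  return firebase.database().ref(path).set(value).then(function () {
    // Slashes are not allowed in keys, so encode the path.
    return firebase.database().ref("_stats/writes/" + path.replace(/\//g, "_"))
      .set(firebase.database.ServerValue.TIMESTAMP);
  });
}

// Query wrapper: records the read in the success callback.
function trackedOnce(path) {
  return firebase.database().ref(path).once("value").then(function (snapshot) {
    firebase.database().ref("_stats/reads/" + path.replace(/\//g, "_"))
      .set(firebase.database.ServerValue.TIMESTAMP);
    return snapshot;
  });
}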

How do I create a variable that persists across reloads and code pushes?

If I write a plugin that requires a very large initialization (14 MB of JavaScript that takes 1 minute to set itself up), how can I make this object persistent (for lack of a better word) across the JavaScript files used in a Meteor project?
After the initialization of the plugin, I have an object LargeObject, and when I add a file simple_todo.js, I want to use LargeObject without it taking a minute to load after EVERY change.
I cannot find any solution.
I tried making a separate package to store this in the Package object, but that is cleared and reinitialized after every change.
What would be the proper way of doing this? I imagine there should be something internal in Meteor that survives a hot code push.
Here are three possible solutions:
Cache some of its properties inside Session
Cache some of its properties inside a simple collection
Use a stub in your local environment.
Session can only be used client side. You can use a collection anywhere.
Session
client
example = function () {
  if (!(this.aLotOfData = Session.get('aLotOfData'))) {
    this.aLotOfData = computeALotOfData();
    Session.set('aLotOfData', this.aLotOfData);
  }
};
Here, no data has to be transferred between client and server. For every new client that connects, the code is rerun.
Collection
lib
MuchDataCollection = new Mongo.Collection('MuchDataCollection');
server
Meteor.publish('MuchData', function (query) {
  return MuchDataCollection.find(query);
});
server
example = function () {
  // findOne() may return undefined, so check the document before using it.
  var doc = MuchDataCollection.findOne({ name: 'aLotOfData' });
  if (!doc) {
    this.aLotOfData = computeALotOfData();
    MuchDataCollection.insert({
      name: 'aLotOfData',
      data: this.aLotOfData
    });
  } else {
    this.aLotOfData = doc.data;
  }
};
Even though you can access the collection anywhere, you don't want anyone to be able to make changes to it, because all clients share the same collection. Collections are cached client-side. Read this for more info.
Stub
A stub is probably the easiest to implement, but it's the worst solution. You'll probably have to use a settings variable, and you'll still end up shipping the stub code in the production environment.
What to choose
It depends on your exact use case. If the contents of the object depend on the client or user, it's probably best to use a session var. If they don't, go for a collection. You'll probably need to build some cache-invalidation mechanism, but I'd say it's worth it.

Save to 3 firebase locations with a slow internet connection

Sometimes I'm having issues with Firebase when the user is on a slow mobile connection. When the user saves an entry to Firebase, I actually have to write to 3 different locations. Sometimes the first one works, but if the connection is slow, the 2nd and 3rd may fail.
This leaves me with entries in the first location that I constantly need to clean up.
Is there a way to help prevent this from happening?
var newTikiID = ref.child("tikis").push(tiki, function(error) {
  if (!error) {
    console.log("new tiki created");
    var tikiID = newTikiID.key();
    saveToUser(tikiID);
    saveToGeoFire(tikiID, tiki.tikiAddress);
  } else {
    console.log("an error occurred during tiki save");
  }
});
There is no Firebase method to write to multiple paths at once. Some tools planned by the team (e.g. Triggers) may resolve this in the future.
This topic has been explored before and the firebase-multi-write README contains a lot of discussion on the topic. The repo also has a partial solution to client-only atomic writes. However, there is no perfect solution without a server process.
It's important to evaluate your use case and see if this really matters. If the second and third writes fail (the geo query, for example), chances are there's really no consequence: most likely it's essentially the same as if the first write had failed, or as if all writes had failed; the entry simply won't appear in searches by geo location. Thus, the complexity of resolving this issue is probably not worth the time.
Of course, it does cost a few bytes of storage. If you're working with millions of records, that may matter. A simple solution for this scenario would be to run an audit report that detects broken links between the data and GeoFire tables and cleans up old data.
If an atomic operation is really necessary, such as gaming mechanics where fairness or cheating could be an issue, or where integrity is lost by having partial results, there are a couple options:
1) Master Record approach
Pick a master path (the one that must exist) and use security rules to ensure other records cannot be written unless the master path exists:
".write": "root.child('master_path').child(newData.child('master_record_id').val()).exists()"
2) Server-side script approach
Instead of writing the paths separately, use a queue strategy:
Create a single event by writing one record to a queue
Have a server-side process monitor the queue and process the events
The server-side process performs the multiple writes and ensures they all succeed
If any fail, the server-side process handles rollbacks or retries
By using a server-side queue, you remove the risk of a client going offline between writes. The server can safely survive restarts, and retry events on failure, when using the queue model.
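A sketch of that worker using the firebase-queue library (the task fields are placeholders matching the question's data):
var Queue = require('firebase-queue');

// Each client writes a single task describing all three writes; the worker
// claims it, performs the writes, and reject() leaves it queued for retry.
var queue = new Queue(ref.child('queue'), function (data, progress, resolve, reject) {
  ref.child('tikis/' + data.tikiID).set(data.tiki)
    .then(function () {
      return ref.child('users/' + data.userID + '/tikis/' + data.tikiID).set(true);
    })
    .then(function () {
      return ref.child('geofire/' + data.tikiID).set(data.tikiAddress);
    })
    .then(resolve)
    .catch(reject);
});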
I had the same problem, and I ended up using Conditional Requests with the Firebase REST API in order to write data transactionally. See my question and answer: Firebase: How to update multiple nodes transactionally? Swift 3.
If you need to write concurrently (but not transactionally) to several paths, you can do that now, since Firebase supports multi-path updates:
https://firebase.googleblog.com/2015/09/introducing-multi-location-updates-and_86.html
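With that feature, the three writes from the question could be expressed as one call (a sketch; userID and the geofire layout are placeholders, and a real GeoFire entry would normally be written through the GeoFire library):
// All locations in a single update() succeed or fail together.
var tikiID = ref.child("tikis").push().key; // key is a property in current SDKs (it was key() in older ones)
var updates = {};
updates["tikis/" + tikiID] = tiki;
updates["users/" + userID + "/tikis/" + tikiID] = true;
updates["geofire/" + tikiID] = tiki.tikiAddress;
ref.update(updates);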
