Child_added subscription seems to download entire dataset - javascript

I have a large data set (~100k entries) that is subscribed to using the 'child_added' event. With Node 7 and firebase 3.6.1, this seems to download all 100k entries before a single child_added event fires.
Memory consumption grows significantly for a few dozen seconds, and then all child_added events fire in quick succession.
This is slow:
require('firebase').
initializeApp({databaseURL: 'https://someproject.firebaseio.com'}).
database().ref('data').
on('child_added', (snap) => console.log(snap.key));
Limiting is still fast (few seconds delay):
require('firebase').
initializeApp({databaseURL: 'https://someproject.firebaseio.com'}).
database().ref('data').limitToFirst(10).
on('child_added', (snap) => console.log(snap.key));
Given the streaming nature of Firebase, I assume it is not intended behaviour for child_added subscriptions to download the entire data set to the client before anything is done.
Am I doing something wrong, or is this a bug?

The child_added section of the Firebase documentation says:
The child_added event is typically used when retrieving a list of items from the database. Unlike value which returns the entire contents of the location, child_added is triggered once for each existing child and then again every time a new child is added to the specified path. The event callback is passed a snapshot containing the new child's data. For ordering purposes, it is also passed a second argument containing the key of the previous child.
However, the first lines of that page say this:
Data stored in a Firebase Realtime Database is retrieved by attaching an asynchronous listener to a database reference. The listener will be triggered once for the initial state of the data and again anytime the data changes.
So this seems to be its normal behaviour: it first retrieves all the data.
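One way to avoid waiting for the whole download is to read the list in pages instead of attaching a single child_added listener. The following is only a sketch: readInPages is a hypothetical helper name, ref is assumed to be a Realtime Database reference such as database().ref('data'), and the children are assumed to be read in key order. It uses once('value') per page, so it does not deliver later additions the way a live listener would.

```javascript
// Hypothetical helper: reads the children of `ref` page by page and calls
// `onChild` for each one, so the first children arrive after one small query.
async function readInPages(ref, pageSize, onChild) {
  let lastKey = null; // key of the last child delivered so far
  for (;;) {
    let query = ref.orderByKey();
    let limit = pageSize;
    if (lastKey !== null) {
      // startAt() is inclusive, so ask for one extra child and skip the overlap.
      query = query.startAt(lastKey);
      limit = pageSize + 1;
    }
    const snapshot = await query.limitToFirst(limit).once('value');
    const previousLast = lastKey;
    let delivered = 0;
    snapshot.forEach((child) => {
      if (child.key === previousLast) return; // already delivered with the previous page
      lastKey = child.key;
      delivered += 1;
      onChild(child);
    });
    // A short page means there is nothing left to read.
    if (delivered < pageSize) return;
  }
}

// Possible usage (assumes an initialized Firebase app):
// readInPages(database().ref('data'), 500, (snap) => console.log(snap.key));
```

The page size is a trade-off: smaller pages give a faster first result, larger pages mean fewer round trips.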

I am in the same situation, waiting nearly 40 seconds for the first child_added event to fire. The only solution I could come up with was to get the keys using the Firebase REST API with the shallow query parameter, then loop over each key and query Firebase for that child. Here is basically what I did:
console.log('start', Date.now());
fetch('https://[firebase_app].firebaseio.com/[your_path].json?shallow=true')
  .then((response) => {
    return response.json();
  }).then(function(j) {
    Object.keys(j).forEach(function (key) {
      console.log(key, 'start', Date.now());
      firebase_reference.child(key).on("child_added", function (snapshot) {
        console.log(key, Date.now());
        // now you have the first response without waiting for everything.
      });
    });
  });
I know this doesn't answer your question about child_added functionality, but it does what you would expect to happen with child_added. I am going to submit a feature request to Firebase and link this SO question.

Related

Firestore document listener doesn't return current value from server

I'm trying to use a document snapshot listener for Firebase Firestore. I want to perform some action based on the current document value from the server, but also listen for changes to the document and enable offline cache when possible.
The listener works to update state when the document changes, but for some reason it always starts from a previous cache of what the document contained when it was last listening:
let unsub = firebase.firestore().collection('myCol').doc('myDoc').onSnapshot((doc) => {
  if (doc.data().myVal) myFunction(); // myVal is always what the last listener thought it was, not updated from the current server value
});
So if I then call unsub() and make a change to the document in the console, next time the listener is started up, it returns the last cache value from when it was previously listening instead of the first load being from the server.
How can I force the listener to get the first value from the server instead of its old local cache?
The only way I can currently force the listener logic to load from the server first is by manually triggering a get() on the document first. This simply updates the local cache of any changes that happened when the listener wasn't listening last.
If you're having similar issues with your code, add this before setting the listener logic:
await firebase.firestore().collection('myCol').doc('myDoc').get({source: 'server'}).catch(e => {});
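An alternative that avoids the extra get() might be to inspect the snapshot metadata and ignore snapshots served from the local cache. This is a sketch under assumptions: listenServerOnly is a hypothetical helper name, docRef stands for firebase.firestore().collection('myCol').doc('myDoc'), and it relies on the includeMetadataChanges listen option and the doc.metadata.fromCache flag of the Firestore web SDK. While the client is offline every snapshot comes from the cache, so with this approach the handler would not run at all.

```javascript
// Hypothetical helper: call `onServerData` only for snapshots that are
// confirmed by the server, skipping stale cached copies.
function listenServerOnly(docRef, onServerData) {
  // includeMetadataChanges makes the listener fire again when a cached
  // snapshot is later confirmed by the server (fromCache becomes false).
  return docRef.onSnapshot({ includeMetadataChanges: true }, (doc) => {
    if (doc.metadata.fromCache) return; // local cache only: ignore it
    onServerData(doc);
  });
}

// Possible usage:
// const unsub = listenServerOnly(docRef, (doc) => {
//   if (doc.data().myVal) myFunction();
// });
```

Because metadata-only changes also trigger the listener, the handler may run more than once for the same data, so it should be safe to repeat.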

Is there a way to get the actual timestamp when using firebase.database.ServerValue.TIMESTAMP?

This question is about the javascript client.
I have code that goes something like this:
const localEvents = [];
const fbEvents = firebase.database().ref("myevents");
fbEvents.on("child_added", function(snapshot) {
const e = snapshot.val();
localEvents.push(e);
});
function createEvent(e) {
e.time = firebase.database.ServerValue.TIMESTAMP;
fbEvents.push(e);
}
After calling createEvent({}), it appears that entries in my localEvents list have time values which are not equal to the actual entries in the database (the client guesses the timestamp and calls the child_added handler before it's actually done a roundtrip to the server). Is there any way to avoid this, and/or is there any way to get a callback when the actual value of the time is known?
It's not possible, using only the snapshot in the listener, to determine if the timestamp comes from the server or is guessed locally.
What you can do instead is use the promise returned from fbEvents.push(e) to determine when the write actually succeeds. A resolved promise means it was definitely written to Firebase. The listener callback you get after that will contain the server's updated value.
(Note that with Firestore it is possible to determine if a document was fully written to the server or not. Just not with Realtime Database.)
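The promise-based approach could look roughly like this. It is a sketch only: createEventConfirmed is a hypothetical helper name, fbEvents is assumed to be the database reference from the question, and serverTimestamp is assumed to be firebase.database.ServerValue.TIMESTAMP. It re-reads the new child after the write is acknowledged, at which point the stored time should be the server's value rather than the local estimate.

```javascript
// Hypothetical helper: push `e` with a server timestamp and resolve with the
// value read back after the server has acknowledged the write.
function createEventConfirmed(fbEvents, e, serverTimestamp) {
  const newRef = fbEvents.push(Object.assign({}, e, { time: serverTimestamp }));
  // push() returns a reference that is also thenable; it resolves once the
  // write has been committed.
  return newRef
    .then(() => newRef.once('value')) // re-read after the acknowledgement
    .then((snapshot) => snapshot.val());
}

// Possible usage:
// createEventConfirmed(fbEvents, {}, firebase.database.ServerValue.TIMESTAMP)
//   .then((stored) => console.log('server time:', stored.time));
```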

Firebase Realtime Database: does the 'value' event fire automatically?

As I understand it, when the following line of code is executed by JavaScript
ref.on('value',callback)
(similar to document.addEventListener('click', callback)), the callback is attached to the object for that event, so that when the event occurs the attached callback (event handler) fires.
But I observe that the Firebase 'value' event fires automatically if there is data at this ref as soon as the above line is executed, even though no add/delete/modify operation has happened on that ref.
Is this interpretation correct, or does the value event work like any other event and fire only on add/delete/modify operations? In that case, what would the trigger be?
Also, if the value event fires automatically, does it make an async network call to the Firebase database for that ref and fetch the data (snapshot), or is the ref's data cached on the client side, so that no network request is made?
Can anybody clarify both points? Your help is appreciated.
According to the documentation:
You can use the value event to read a static snapshot of the contents
at a given path, as they existed at the time of the event. This method
is triggered once when the listener is attached and again every time
the data, including children, changes. The event callback is passed a
snapshot containing all data at that location, including child data.
If there is no data, the snapshot will return false when you call
exists() and null when you call val() on it.
When you attach a listener, the SDK will use its persistent connection to the database to check if there is new data. If there is not any new data, then the locally cached data is provided.
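To make the "once when attached, then again on every change" behaviour explicit in code, the first callback can be separated from later ones. A minimal sketch, assuming ref is a Realtime Database reference; watchValue, onInitial and onChange are hypothetical names introduced here for illustration.

```javascript
// Hypothetical helper: route the first 'value' event (fired when the listener
// is attached) to `onInitial` and every later event to `onChange`.
function watchValue(ref, onInitial, onChange) {
  let first = true;
  const handler = (snapshot) => {
    if (first) {
      first = false;
      onInitial(snapshot); // current state at attach time, no write needed
    } else {
      onChange(snapshot); // fired because the data (or a child) changed
    }
  };
  ref.on('value', handler);
  // Return a function that detaches this specific listener.
  return () => ref.off('value', handler);
}

// Possible usage:
// const stop = watchValue(ref,
//   (snap) => console.log('initial', snap.exists(), snap.val()),
//   (snap) => console.log('changed', snap.val()));
```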

How to stop reading child_added with Firebase cloud functions?

I try to get all 10 records using this:
exports.checkchanges = functions.database.ref('school/{class}').onCreate(snap => {
  // `class` is a reserved word in JavaScript, so the variable needs another name
  const className = snap.params.class;
  var ref = admin.database().ref('/students');
  return ref.orderByChild(className).startAt('-').on("child_added", function(snapshot) {
    const age = snapshot.child("age");
    // do the thing
  });
});
The problem is that after I get the 10 records I need correctly, even after few days when a new record is added meeting those terms, this function is still invoked.
When I change on("child_added to once("child_added I get only 1 record instead of 10. And when I change on("child_added to on("value I get null on this:
const age=snapshot.child("age");
So how can I prevent the function from being invoked for future changes?
When you implement database interactions in Cloud Functions, it is important to have a deterministic end condition. Otherwise the Cloud Functions environment doesn't know when your code is done, and it may either kill it too soon, or keep it running (and thus billing you) longer than is necessary.
The problem with your code is that you attach a listener with on and then never remove it. In addition (since on() doesn't return a promise), Cloud Functions doesn't know that you're done. The result is that your on() listener may live indefinitely.
That's why in most Cloud Functions that use the Realtime Database, you'll see them using once(). To get all children with a once(), we'll listen for the value event:
exports.checkchanges = functions.database.ref('school/{class}').onCreate(snap => {
  // `class` is a reserved word in JavaScript, so the variable needs another name
  const className = snap.params.class;
  var ref = admin.database().ref('/students');
  // once() returns a promise, so Cloud Functions knows when the read is done
  return ref.orderByChild(className).startAt('-').limitToFirst(10).once("value", function(snapshot) {
    snapshot.forEach(function(child) {
      const age = child.child("age");
      // do the thing
    });
  });
});
I added a limitToFirst(10), since you indicated that you only need 10 children.

Cloud Function stuck in an infinite loop

exports.addNewValue = functions.database.ref('/path')
.onWrite(event => {
event.data.adminRef.update({"timeSnapshot":Date.now()})})
It appears that Date.now() causes an infinite loop in the function because the following does not:
exports.addNewValue = functions.database.ref('/path')
.onWrite(event => {
event.data.adminRef.update({"timeSnapshot":"newString"})})
How do I fix this?
If you write back to the same location in the database that was previously changed, you can expect this sequence of events:
1. Function is triggered with the first change from the client
2. Function writes back to the database
3. Function is triggered a second time because of the write during step #2
All writes to the database that match the filter path, even those from within the same function, will trigger the function.
In step 3, you need a strategy to figure out if the second function invocation should result in yet another write back to the database. If it does not require another write, the function should return early so that it won't trigger another write. Typically you look at the data in the event passed to the function and figure out if it was already modified the first time. This could involve looking to see if some flag is set in the database, or if the data you modified does not need any more changes.
Many of the code samples provided by the Firebase team do this. In particular, look at text moderation. Also, there is a video that describes the problem and a possible solution. In the end, you're responsible for coming up with the strategy that meets your needs.
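One possible form of such a check is to compare the data before and after the write and skip the write-back when the change was the function's own timestamp update. This is only a sketch: needsTimestamp is a hypothetical helper, and the commented wiring assumes the event.data.previous.val() and event.data.val() accessors of the event-based API used in the question.

```javascript
// Hypothetical helper: decide whether the function should write timeSnapshot.
// `before` and `after` are the plain values at /path before and after the write.
function needsTimestamp(before, after) {
  if (after === null || after === undefined) return false; // node was deleted
  const previousStamp = before ? before.timeSnapshot : undefined;
  // If timeSnapshot differs from its previous value, this invocation was
  // caused by a timestamp write (most likely our own), so stop here.
  return after.timeSnapshot === previousStamp;
}

// Possible wiring (assumption, based on the API style used in the question):
// exports.addNewValue = functions.database.ref('/path').onWrite(event => {
//   if (!needsTimestamp(event.data.previous.val(), event.data.val())) return null;
//   return event.data.adminRef.update({ timeSnapshot: Date.now() });
// });
```

With this check the second invocation returns early, because the only thing that changed is timeSnapshot. A client that writes timeSnapshot itself would also be skipped, which may or may not be what you want.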
I think the following should work fine :-
exports.addNewValue = functions.database.ref('/path/timeSnapshot')
.onWrite(event => { event.data.adminRef.set(Date.now()) })
The logic behind the above is that when you put a trigger function on a higher node (such as /path in your case), the function is fired each time a change is made to any of its child nodes (/timeSnapshot in your case), hence the infinite loop.
Therefore, as a general practice, for efficiency as well as cost effectiveness, make sure that your trigger function is attached to the lowest possible path node. Flattening out your data really helps with this as well.
If you arrived here having problems with querying, try using .once('value'). It means that you only read the reference point once, i.e.
ref.orderByChild("isLive").equalTo(true).once("value" , function(snapshot) {
instead of
ref.orderByChild("isLive").equalTo(true).on("value", function(snapshot) {
as the second will have you listening all the time, and when data changes at the ref, the listener will receive the changes and run the code inside your block again
