I've been using dgrid 0.3.2 along with JsonRest to display tables of data. Recently, I've been looking at upgrading to dgrid 0.3.6 or 0.3.7. Things work mostly the same, but it seems with the newer versions of dgrid that, if the user clicks a column header to sort fast enough, the grid will start displaying duplicate rows. I’ve verified that the response JSON and range are correct, and this didn’t seem to happen when we used dgrid 0.3.2.
Here’s a simple test case that reproduces the problem, and mimics how we set up our grid and store. Am I doing something wrong? If I don’t wrap the JsonRest in a Cache, I don’t get this issue, so I’m sure the problem is there, but I’m unsure about the performance ramifications of not caching the JSON response.
<!doctype html>
<html>
<head>
<%
String dgridVer = request.getParameter("dgridVer");
if (dgridVer==null) { dgridVer = "0.3.6"; }
%>
<script type="text/javascript">
var dojoConfig = {
isDebug: true,
baseUrl: 'dojo',
packages: [
{ name: 'dojo', location: 'dojo' },
{ name: 'dijit', location: 'dijit' },
{ name: 'dojox', location: 'dojox' },
{ name: 'dgrid', location: 'dgrid-<%=dgridVer%>' },
{ name: 'put-selector', location: 'put-selector' },
{ name: 'xstyle', location: 'xstyle' },
{ name: 'datagrid', location: '../datagrid' }
]
};
</script>
<script src="dojo/dojo/dojo.js"></script>
</head>
<body>
Try sorting a column as fast as you can. Look for duplicated rows.<br>
Using dgrid version: <%=dgridVer %><p>
<div id='gridDiv'></div>
<script>
require(['dgrid/Grid', 'dgrid/extensions/Pagination', 'dojo/store/JsonRest',
'dojo/store/Cache', 'dojo/store/Memory', 'dojo/_base/declare', 'dojo/domReady!'],
function(Grid, Pagination, JsonRest,
Cache, Memory, declare) {
var columns = [
{ field: "first", label: "First Name" },
{ field: "last", label: "Last Name" },
{ field: "age", label: "Age" }
];
var store = new JsonRest({
target: 'testData.jsp',
sortParam: "sortBy"
});
store = Cache(store, Memory());
var grid = new (declare([Grid, Pagination]))({
columns: columns,
store: store,
loadingMessage: 'Loading...',
rowsPerPage: 4,
firstLastArrows: true
}, 'gridDiv');
});
</script>
</body>
</html>
Check the default implementation of Cache.js, especially the query and queryEngine functions. By default they always get first to the master store, which in your case is the JsonRest store. Only after the data has been loaded, the caching store is updated (in your case the Memory store).
Now, if you check function _setSort in DGrid List.js file, and function refresh in DGrid OnDemandList.js you'll sill see that by default DGrid calls the query method of the current store to obtain the new list of items sorted differently. In your case that store is the dojo/store/Cache.
So, summing up, when the user clicks the column to sort, DGrid queries the Cache, which in turn queries JsonRest, which in turns queries the server, which then returns new data, which then the Cache stores in the Memory store.
You can actually confirm this for example with Firebug (a Firefox extension). In my case whenever I tried to sort, Firebug was showing a new request to the server to obtain new data.
This makes sense when there are lots of rows because DGrid is designed to load only the first batch of rows and then update the grid when user scrolls down. When the sort is changing, the first visible batch of rows may be different and may not be loaded yet, so DGrid must load them first.
But in my case the Json request was returning all data during a single request. I didn't like the default implementation and implemented my own caching store which doesn't require a trip to the server when changing the sorting. I can't share the implementation now but will try to do when I have some time to tidy up the code.
As for now you shouldn't notice any performance problems if you switch to JsonRest store only (considering that when changing the sorting there is the trip to the server anyway).
I am not sure what causes the specific problem of duplicated rows, but I remember seeing it too when my caching store wasn't implemented properly (it had something to do with deferred requests when loading data If I recall correctly). You can try to debug it by adding (again with Firebug) breakpoints in the get and query functions of the Cache store. My bet is that DGrid tries to load particular rows with the get method (which hits the cache) while the query request is still loading data from the server after user changed the sorting. But I may be wrong so please try to confirm it first if you can.
Related
I'm recently working with Firebase Cloud functions to delegate lot of work from my client side to the server, reducing the data cost for the user. But recently I wondered if it's worth it or not, or maybe a better database structure could fix it.
I have a social app where the user can workout and post their results, you can follow users and all kind of "typical" social media stuffs. Well, my problem appear when I want to implement pagination retrieving the last X workouts that I should show to each user on their feed.
My question is : How expensive could be update from 1-1000(worst case) fields on the database on a common event trigger on Firebase Cloud functions. It's enough expensive at client side to look for avoid it and look for better ways talking about performance even if it's more expensive at client side?
I will explain it looking at my example:
Database Structure
"privateUserData" : {
"user1" : {
"messagingTokens": {
"someToken": true,
"someToken2": true,
},
"accountCreationDate" : 1495819217216,
"email" : "abcd#gmail.com",
"followedBy" : {
"user2": true,
"user3": true,
},
"following" : {
"user2": true,
"user3": true,
},
"lastLogin" : 1498654134543,
"photoUrl" : "photo.png",
"username" : "Francisco Durdin Garcia"
},
},
"publicUserData": {
"user1": {
"username": "someUserName",
"followersCount": 5,
"followingCount": 1,
"photoUrl" : "someUrl"
}
...
},
"workouts" : {
"workout1" : {
"likes": {
"user1": true,
"user2": true,
...
},
"followers": {
"user1": true,
"user2": true,
...
},
"comments": {
"comment1": {
"owner": "user1",
"content": "somecomment",
"time": 1493153530311,
"replies": {
"reply1": {
"owner": "user1",
"content": "somecomment",
"time": 1493153530311,
}
}
}
}
"authorUid" : "user1",
"description" : "desc",
"points" : 63,
"time" : "00:03",
"createdAt" : 1493153530311,
"title" : "someTitle",
"workoutJson" : "workoutJsonDataHere"
}
}
To be able to do that query I should do individual queries for each user I follow:
The problem is that I can do a "global" query and limit it to just X dataSnapshots. I can just filter few workouts for each individual query:
mDatabase.child("workouts").orderByChild("authorId).equalTo("userIFollow").limitToLast(10)
This query will return me a filter applied just for one userIFollow it's not possible to do it over all of them, so I have three options:
1. Create a table which stores relation between usersId and workoutsId visible by them with a timeStamp value. But I should
keep track of this values thought a Firebase cloud function, and
obviously maybe I follow an user with Thousands of workouts I my
cloud functions would need to copy ALL OF THEM to the right
reference.
This was the way I wanted to go, but I don't know if it's
the proper way talking about client side cost.
2. I can add a lastActivityTimeStamp on publicUserData and filtering by that retrieve just a few workouts of the last users with activity, growing this query with a pagination too.
3. Finally I can always retrieve all the workouts from this user and filter on client side, this will be expensive just one, because later the cache will do everything easier.
This are the ways I found to resolve my problem, and my question is still how expensive and useful are Firebase Cloud functions to copy large amounts of data with common triggers.
From the way you worded your question, you seem familiar with the Database Cloud Functions for Firebase and it also seems that 'workouts' is your payload (the biggest chunk of data that you don't want to download repeatedly).
I would recommend the following approach based roughly off how GitHub's API works.
Prerequisites
In your /privateUserData/{user} data, you seem to have the list of followed user IDs (at /privateUserData/{user}/following). To make your queries simpler, I'd recommend implementing a list of workout IDs authored by that user (under something like /publicUserData/{user}/authorOf).
Implementation
I'd recommend building a HTTP Cloud Function, at say https://FUNCTION_URL/followedWorkouts. When called you would generate a list of workout IDs for a given user by checking who they follow and then getting the list of workouts authored by each followed user and return them as one array. To identify the user, you could pass in their ID using a GET parameter such as ?user=<someUserId> or through some form of authentication. How you go about it is up to you.
The function should return data in the following (or similar) format (in this case I'm using JSON):
[{"id": "workoutId1", "lastMod": "1493153530311"}, {"id": "workoutId2", "lastMod": "1493153530521"}, ...]
id is the the workout ID.
lastMod (short form of last modified) is the last time that workout's data was updated (from {workoutId}/lastModificationDate). See the 'caching' section below.
Filtering
I'd also implement the following 'filters' on the Cloud Function:
Since (?since=<someTimeStamp>): will return workout IDs that have been modified since that timestamp. (Say you downloaded some information at some time, T, you would then set since=T to then only receive workouts changed after that time.
Max (?max=X): will return the X most recent entries.
Start At (?startAt=X): will return the most recent entries starting at the index X (I'd make it a 1-based index).
So if you wanted to grab the 10 most recent entries, you could call https://FUNCTION_URL/followedWorkouts?max=10 which would give you the IDs for the 1st-10th most recently updated workouts. For the next 'page' of entries, you would call https://FUNCTION_URL/followedWorkouts?startAt=10&max=10 which would give you the 11th-20th most recently updated workout IDs.
Caching
As each workout is a payload, it doesn't make sense to download the multiple times. I would recommend caching this data to prevent this. In the response I suggested above, the field lastMod (last modified) allows you to check if a locally cached version needs updating. How you go about this, is yet again up to you.
Extending
If you need more of these paginated feeds, you could name the function more generally such as https://FUNCTION_URL/feeds and pass in the feed type as a parameter https://FUNCTION_URL/feeds?type=workouts. You could use this for things like followers, following, comments, etc.
Feel free to reach out if you need some more information.
I'm facing a problem implementing data-synchronization between a server and multiple clients.
I read about Event Sourcing and I would like to use it to accomplish the syncing-part.
I know that this is not a technical question, more of a conceptional one.
I would just send all events live to the server, but the clients are designed to be used offline from time to time.
This is the basic concept:
The Server stores all events that every client should know about, it does not replay those events to serve the data because the main purpose is to sync the events between the clients, enabling them to replay all events locally.
The Clients have its one JSON store, also keeping all events and rebuilding all the different collections from the stored/synced events.
As clients can modify data offline, it is not that important to have consistent syncing cycles. With this in mind, the server should handle conflicts when merging the different events and ask the specific user in the case of a conflict.
So, the main problem for me is to dertermine the diffs between the client and the server to avoid sending all events to the server. I'm also having trouble with the order of the synchronization process: push changes first, pull changes first?
What I've currently built is a default MongoDB implementation on the serverside, which is isolating all documents of a specific user group in all my queries (Currently only handling authentication and server-side database work).
On the client, I've built a wrapper around a NeDB store, enabling me to intercept all query operations to create and manage events per-query, while keeping the default query behaviour intact. I've also compensated for the different ID systems of neDB and MongoDB by implementing custom ids that are generated by the clients and are part of the document data, so that recreating a database won't mess up the IDs (When syncing, these IDs should be consistent across all clients).
The event format will look something like this:
{
type: 'create/update/remove',
collection: 'CollectionIdentifier',
target: ?ID, //The global custom ID of the document updated
data: {}, //The inserted/updated data
timestamp: '',
creator: //Some way to identify the author of the change
}
To save some memory on the clients, I will create snapshots at certain amounts of events, so that fully replaying all events will be more efficient.
So, to narrow down the problem: I'm able to replay events on the client side, I'm also able to create and maintain the events on the client and serverside, Merging the events on serverside should also not be a problem, Also replicating a whole database with existing tools is not an option as I'm only syncing certain parts of the database (Not even entire collections as the documents are assigned different groups in which they should sync).
But what I am having trouble with is:
The process of determining what events to send from the client when syncing (Avoid sending duplicate events, or even all events)
Determining what events to send back to the client (Avoid sending duplicate events, or even all events)
The right order of syncing the events (Push/Pull changes)
Another Question I would like to ask, is whether storing the updates directly on the documents in a revision-like style is more efficient?
If my question is unclear, duplicate (I found some questions, but they didnt help me in my scenario) or something is missing, please leave a comment, I will maintain it as best as I can to keep it simple, as I've just written everything down that could help you understand the concept.
Thanks in advance!
This is a very complex subject, but I'll attempt some form of answer.
My first reflex upon seeing your diagram is to think of how distributed databases replicate data between themselves and recover in the event that one node goes down. This is most often accomplished via gossiping.
Gossip rounds make sure that data stays in sync. Time-stamped revisions are kept on both ends merged on demand, say when a node reconnects, or simply at a given interval (publishing bulk updates via socket or the like).
Database engines like Cassandra or Scylla use 3 messages per merge round.
Demonstration:
Data in Node A
{ id: 1, timestamp: 10, data: { foo: '84' } }
{ id: 2, timestamp: 12, data: { foo: '23' } }
{ id: 3, timestamp: 12, data: { foo: '22' } }
Data in Node B
{ id: 1, timestamp: 11, data: { foo: '50' } }
{ id: 2, timestamp: 11, data: { foo: '31' } }
{ id: 3, timestamp: 8, data: { foo: '32' } }
Step 1: SYN
It lists the ids and last upsert timestamps of all it's documents (feel free to change the structure of these data packets, here I'm using verbose JSON to better illustrate the process)
Node A -> Node B
[ { id: 1, timestamp: 10 }, { id: 2, timestamp: 12 }, { id: 3, timestamp: 12 } ]
Step 2: ACK
Upon receiving this packet, Node B compares the received timestamps with it's own. For each documents, if it's timestamp is older, just place it in the ACK payload, if it's newer place it along with it's data. And if timestamps are the same, do nothing- obviously.
Node B -> Node A
[ { id: 1, timestamp: 11, data: { foo: '50' } }, { id: 2, timestamp: 11 }, { id: 3, timestamp: 8 } ]
Step 3: ACK2
Node A updates it's document if ACK data is provided, then sends back the latest data to Node B for those where no ACK data was provided.
Node A -> Node B
[ { id: 2, timestamp: 12, data: { foo: '23' } }, { id: 3, timestamp: 12, data: { foo: '22' } } ]
That way, both node now have the latest data merged both ways (in case the client did offline work) - without having to send all your documents.
In your case, your source of truth is your server, but you could easily implement peer-to-peer gossiping between your clients with WebRTC, for example.
Hope this helps in some way.
Cassandra training video
Scylla explanation
I think that the best solution to avoid all the event order and duplication issues are to use the pull method. In this way every client maintains its last imported event state (with a tracker for example) and ask the server for the events generated after that last one.
An interesting problem will be to detect the breaking of business invariants. For that you could store on the client the log of applied commands also and in case of a conflict (events were generated by other clients) you could retry the execution of commands from the command log. You need to do that because some commands will not succeed after re-execution; for example, a client saves a document after other user deleted that document in the same time.
I'm creating a mock application with JSON server as the backend and I'm wondering if it is possible to get the total number of records contained at an end point without loading all the records themselves? Assuming the db.json file looks like the JSON snippet below, how would I find out that the end point only has one record without fetching the record itself, provided it's possible?
{
"books": [{
"title": "The Da Vinci Code",
"rating": "0"}]
}
You can simply retrieve the X-Total-Count header
This is a screen-shot of a response headers returned by JSON Server when enabling pagination i.e using the _page parameter (e.g. localhost:3000/contacts?_page=1)
Whenever you fetch the data, json-server actually returns the total count by default (it has an x-total-count property:
Example:
axios
.get("http://localhost:3001/users", {
params: {
_page: 1,
_limit: 10
}
})
.then(res => {
console.log(res.data); // access your data which is limited to "10" per page
console.log(res.headers["x-total-count"]); // length of your data without page limit
});
You've three options. I'd recommend the 3rd one to you:
Return all the records and count them. This could be slow and send a lot of data over the wire but probably is the smallest code change for you. It also opens you up to attacks where people can hammer your server by requesting many records repeatedly.
Add a new endpoint. You could add a new endpoint that simply returns the count. It's simple but slightly annoying having a 2nd endpointime to document and maintain.
Modify the existing endpoint. Return something like
{
count: 157,
rows: [...data]
}
The benefit of 3 is its all in one endpoint. It also nears you toward a point where you can add a skip and take parameter in future to allow pagination of the resultant data.
You will write another end point that returns number of records. Usually also you may want end point for limit and offset to be used with pagination.
let response = await fetch("http://localhost:3001/books?_page=1");
let total = response.headers.get('X-Total-Count');
I am tying to integrate webix tables with a backbone collection as shown in the webix docs (http://docs.webix.com/desktop__backbone_collections.html) however it does not seem to work. The object sync call happens, but no data is loaded.
budgets = new Backbone.Budget.Collection(window.budget)
list =
width : 320
view : "datatable"
id : "budget_list"
backbone_collection : budgets
select : true
scroll : false
columns :[
{header : "Month", id: "budget_month"}
{header : "Year", id: "budget_year"}
{header : "Currency", id: "base_currency"}
]
on: {
onAfterRender : () ->
console.log("Sync ", #_settings)
#sync(#_settings.backbone_collection)
}
Calling .sync from the onAfterRender causes the problem, as sync causes re-rendering of datatable, which triggers new sync and it causes new re-rendering and etc.
To break this loop you can use webix.once which will guarantee that handler will be executed only once.
Check the updated snippet http://webix.com/snippet/5dd61a47
Its very possible that the server you are hitting 1) is not specifying 'Content-type: application/json' and this is rejected by the client on the response; and, or 2) doesn't response to the OPTIONS pre-flight thus throws up a CORS security block. Both are difficult to solve without access to the server.
Curl will not be subject to the CORS issue but a browser-based REST client will -- and thus best represent your issue.
Try using the Chrome advanced rest client with the URL and headers given in the UI.
And if you dont know the URL and header then sniff your requests when you run that UI.
I have a ExtJs(v3.1) `Ext.grid.GridPanel that loads some records from its store and allows editing.
If i select multiple records, and I click the delete it sends multiple DELETE requests, overwhelms the server, which eventually deletes some of them , returns 404 for the rest.
I don't understand why it sends a second or third request before the first has failed, it just has not returned.
this is the handler for the delete button
function onDelete() {
var recs = userGrid.getSelectionModel().getSelections();
userGrid.store.remove(recs); //delete multiple selections one at a time
}
and the store its based on
// Typical Store collecting the Proxy, Reader and Writer together
var store = new Ext.data.GroupingStore({
proxy: proxy,
reader: reader,
writer: writer,
sortInfo: { // Default sort by day decsending grouped by week
field: 'day',
direction: "DSC"
}, groupField: 'week',
batch: false, // update each record with an individual XHR request, the server doesnt process batch requests
});
this is a screenshot of firebug after i highlighted 5 records and clicked delete
Gee that line:
batch: false, // update each record with an individual XHR request, the server doesnt process batch requests
sure looks suspicious ... I bet that that's just what Ext does, given that it'd be pretty slow to actually wait for each response before sending the next.
(I agree however that just blasting out a whole bunch of overlapping HTTP transactions like that is not terribly smart.)
I have the similar issue, but I was able to fix it by calling onCommitChanges() on the data store once the store is modified.