Need advice on a database structure

Need advice on a database structure - javascript

A friend and I put together a MySQL php website where users can make predictions on sports events.
My question is regarding our "rankings" page which displays the list of members who have made the best predictions. The way we had it initially was each time a user loaded that page the server would calculate all the predictions from all the users, then displayed the top ones. Unfortunately, even with a small group alpha testing it, the loading of this page quickly became slower as the number of predictions grew.
Ideally speaking, we would like to have filters on this page, for each different sport, league and time frames, and create a dynamic rankings page, but we're stuck on finding an efficient way to structure our database that won't use up too much server resources.
Any ideas on what a better way could be is greatly appreciate. Thanks!

You could use "materialized views" to calculate the score. A materialized view is a view (thus some query), but where the results are stored (in memory) and updated accordingly:
http://www.fromdual.com/mysql-materialized-views
One can expect that the difference (in the number of (correct) predictions) will not change dramatically between two page renderings. Thus that the number of calculations will be quite low.

Related

Website takes very long to load, order, and display statistics

I have a database where a few statistics are stored (PlayerID, Time, Kills, Deaths), 7 in total but these are enough to explain.
What I did was load the table in php, create an array with all the statistics (including statistics that are a product of multiple statistics, kills per death for example), sort them, and cut the top 10.
Then I passed this top10-array as JSON to javascript, where I just created tr- and td-elements and made the respective tables append them.
However, with just 10 different sets of statistics, it took an eternity (like 30 seconds I'd say) to load the page. My guess was that the sorting takes too long for the server, so I tried just passing the initial array to javascript and put the sorting part in /* */, to test if it would go any faster. It did not.
Afterwards I turned the sorting back on, but disabled the javascript part and just displayed it with var_dump(). It still didn't go much faster.
The actual code is kinda long and I think this isn't a question that you need the code for, but I can still post the code if you really need it. What exactly is causing the load time, and what would be the best way to sort and display the statistics?
Edit: I forgot to say that using ORDER BY in the SQL query doesn't work, because I need the to calculate some statistics using others.

Sounds like you spend all you time on retrieving and processing the whole data set, and then you throw away everything but top 10 records.
Without seeing the model and understanding how the top 10 are selected, the only advice would be to rethink you model, decide on an indexed field by which you would be able to fetch just the top 10 records, and do the calculations for that field before you save statistics into DB.
It also depends on which operation is more time-sensitive for you - SELECT or UPDATE (i.e when you fetch or save statistics). But I bet couple of math operations before you save data will not affect much the time spent on saving the data. But will greatly improve time spent on generating some reports, including top 10 report.

Graphs in Django-Big Data analytics

Well my question is I want to build a running graph with random inputs,lets say in the range of 1-1000. The graph is kind of like a graph drawn by an ECG, an instrument which records the the heart beats and outputs the result on the screen in the form of a graph. The random values should be generated after a particular time-interval,say 5 secs. Upon arrival of new randomly generated value, the graph should shift left and the new value should be represented on the graph, just like this ECG. Visit the link.
And to give a more broad and practical view, I am developing a system for Data Analytics where the Data will be analyzed and the results(in the form of graphs) will be shown in the frontend.
A friend of mine, who is good with javascript suggested me that this could be done with js. But, I want it to be done using MVC architecture. So , how would I approach this problem. Preferably how can I model this in Django.
Different suggestions are always welcomed.

May be its a little late.
Try http://www.highcharts.com/demo/dynamic-update. This will surely help you.

AngularJS Infinite Scrolling with lots of data

So I'm trying to create an infinite scrolling table using AngularJS, similar to this: http://jsfiddle.net/vojtajina/U7Bz9/
The problem I'm having is that in the jsfiddle example, if I keep scrolling till I have a million results, the browser is going to slow to a crawl, wouldn't it? Because there would now be 1,000,000 results in $scope.items. It would be better if I only ever had, for example, 1000 results at a time inside $scope.items, and the results I was viewing happen to be within those 1000.
Example use case: page loads and I see the first 10 results (out of 1,000,000). Even though I only see 10, the first 1000 results are actually loaded. I then scroll to the very bottom of the list to see the last 10 items. If I scroll back up to the top again, I would expect that the top 10 results will have to be loaded again from the server.
I have a project I did with ExtJS that a similar situation: an infinite scrolling list with several thousand results in it. The ExtJS way to handle this was to load the current page of results, then pre-load a couple of extra pages of results as well. At any one time though, there was only ever 10 pages of results stored locally.
So I guess my question is how would I go about implementing this in AngularJS? I kow I haven't provided much code, so I'm not expecting people to just give the coded answer, but at least some advice in which direction to go.

A couple of things:
"Infinite scrolling" to "1,000,000" rows is likely to have issues regardless of the framework, just because you've created millions and millions of DOM nodes (presuming you have more than one element in each record)
The implementation you're looking at doing with <li ng-repeat="item in items">{{item.foo}}</li> or anything like that will have issues very quickly for one big reason: {{item.foo}}} or any ngBind like that will set up a $watch on that field, which creates a lot of overhead in the form of function references, etc. So while 10,000 small objects in an "array" isn't going to be that bad... 10,000-20,000 additional function references for each of those 10,000 items will be.
What you'd want to do in this case would be create a directive that handles the adding and removing of DOM elements that are "too far" out of view as well as keeping the data up to date. That should mitigate most performance issues you might have.
... good infinite scrolling isn't simple, I'm sorry to say.

I like the angular-ui implementation ui-scroll...
https://github.com/angular-ui/ui-scroll
... over ngInfiniteScroll. The main difference with ui-scroll from a standard infinite scroll is that previous elements are removed when leaving the viewport.
From their readme...
The common way to present to the user a list of data elements of undefined length is to start with a small portion at the top of the list - just enough to fill the space on the page. Additional rows are appended to the bottom of the list as the user scrolls down the list.
The problem with this approach is that even though rows at the top of the list become invisible as they scroll out of the view, they are still a part of the page and still consume resources. As the user scrolls down the list grows and the web app slows down.
This becomes a real problem if the html representing a row has event handlers and/or angular watchers attached. A web app of an average complexity can easily introduce 20 watchers per row. Which for a list of 100 rows gives you total of 2000 watchers and a sluggish app.
Additionally, ui-scroll is actively maintained.

It seems that http://kamilkp.github.io/angular-vs-repeat would be what you are looking for. It is a virtual scrolling directive.

So turns out that the ng-grid module for AngularJS has pretty much exactly what I needed. If you look at the examples page, the Server-Side Processing Example is also an infinite scrolling list that only loads the data that is needed.
Thanks to those who commented and answered anyway.
Latest URL : ng-grid

You may try using ng-infinite-scroll :
http://binarymuse.github.io/ngInfiniteScroll/

Check out virtualRepeat from Angular Material
It implements dynamic reuse of rows visible in the viewport area, just like ui-scroll

Can anyone recommend a well performing interface to allow the user to organize a large number of items in HTML?

Currently for "group" management you can click the name of the group in a list of available groups and it will take you to a page with two side by side multi-select list boxes. Between the two list boxes are Add and Remove buttons. You select all the "users" from the left list and click 'Add' and they will appear in the right list, and vice versa. This works fairly well for a small amount of data.
The problem lies when you start having thousands of users. Not only is it difficult and time consuming to search through (despite having a 'filter' at the top that will narrow results based on a string), but you will eventually reach a point where your computer's power and the number of list items apex and the whole browser starts to lag horrendously.
Is there a better interface idea for managing this? Or are there any well known tricks to make it perform better and/or be easier to use when there are many 'items' in the lists?

Implement an Ajax function that hooks on keydown and checks the characters the user has typed into the search/filter box so far (server-side). When the search results drop below 50, push those to the browser for display.
Alternatively, you can use a jQuery UI Autocomplete plugin, and set the minimum number of characters to 3 to trigger the search. This will limit the number of list items that are pushed to the browser.

I would get away from using the native list box in your browser and implement a solution in HTML/CSS using lists or tables (depending on your needs). Then you can use JavaScript and AJAX to pull only the subset of data you need. Watch the user's actions and pull the next 50 records before they actually get to them. That will give the illusion of all of the records being loaded at runtime.
The iPhone does this kind of thing to preserve memory for it's TableViews. I would take that idea and apply it to your case.

I'd say you hit the nail on the head with the word 'filter'. I'm not the hugest fan of parallel multi-selects like what you are describing, but that is almost beside the point, whatever UX element you use, you are going to run into a problem given thousands of items. Thus, filtering. Filtering with a search string is a fine solution, but I suspect searching by name is not the fastest way to get to the users that the admin here wants. What else do you know about the users? How are they grouped.
For example, if these users were students at a highschool, we would know some meta-data about them: What grade are they in? How old are they? What stream of studies are they in? What is their GPA? ... providing filtering on these pieces of metadata is one way of limiting the number of students available at a time. If you have too many to start with, and it is causing performance problems, consider just limiting them, have a button to load more, and only show 100 at a time.
Update: the last point here is essentially what Jesse is proposing below.

how to handle large dataset like sproutcore

I really don't have any substantial code to show here, actually, that's kinda why I am writing: I looked at the SproutCore demo, especially the Collection demo, on http://demo.sproutcore.com/sample_controls/, and am amazed by its loading 200,000 records to the page so easily. I tried using Rails to provide 200,000 records and in a completely blank HTML page with
<% #projects.each do |p| %>
<%= p.title %>
<% end %>
that freezes the browser for seconds on my m1530 laptop with 4gb ram and t7700 256gb ssd.
Yet the sproutcore demo does not freeze and takes less than 3 seconds to load.
What do you think the one technique they are using to enable this is?
Thanks!

The technology that SproutCore uses to display and scroll smoothly through "infinite" lists of data has very little to do with where the data comes from and almost all to do with the integration of special SproutCore view classes, SC.CollectionView (the parent class of SC.ListView and SC.GridView) and SC.ScrollView; the collection of powerful client side datastore classes: SC.Store and SC.SparseArray; and the SproutCore runtime and controller architecture.
The fact is that you simply cannot render a list of several hundred thousand items in it and expect the browser not to grind to a halt. That is too many elements to insert into the DOM tree and that is why SC.CollectionView is optimized to only generate elements for the currently showing items in the list (ex. if only 20 items are visible out of 20 million, only 20 elements are in the DOM). It gets even better than that though, because by default as items scroll in and out of view, the few existing elements are updated in place with the new item information so that the DOM tree is not even touched. This would not be possible though without the integration of SC.ScrollView that allows the collection to be aware of its visible rect and when a scroll is about to happen.
On top of that, there is the entire SproutCore runtime architecture which is used to ensure that all DOM manipulations are queued up so that you only touch the DOM once per run loop if needed when a display property changes (ex. toggling a display property 50 times in one run loop only touches the DOM once with the final value). This is an important factor in extreme performance that affects all SproutCore views including SC.CollectionView.
Finally, to make the list really scream, you cannot load several million items into the client in one request, nor can you even store them all in client memory. This leads me to another optimization of SC.CollectionView and the SproutCore data store, which is to work with sparse data. SC.CollectionView will never try to iterate over every item in its content, so it doesn't need all the data present, only what is being shown. When we load data into the client, we would use an SC.SparseArray to page in a bit of data at a time as needed. The whole system is very elegantly designed so that when the collection view requests an item that the sparse array doesn't yet have, the sparse array fetches it (or the next page of items) in the background. Because of bindings and observers, when the new data comes in we can update the list in place, which means that the scrolling doesn't block while data is being brought in.
That demo above is very outdated, here is a new one that uses the technologies I mentioned above: http://showcase.sproutcore.com/#demos/Big%20Data (source is here: https://github.com/sproutcore/demos/tree/master/apps/big_data). In this demo, I scroll through 50,000 names, which is all I could generate and split into 500 JSON files of a 100 names each that are loaded remotely from the server. Once you scroll past 100 names, you will see that the next 100 names are paged in and there is a brief flash of placeholder text "…" (how long you see the placeholder text depends on your Internet connection).
I used 50,000 names, but I don't see any problem showing a list of 500,000 or 5 million names though. However, at that scale you would want to also 'un-page' data as you bring in new data using SC.Store#unloadRecords to keep the memory use down.
There's a few other technologies in play to make this whole thing possible that I've missed, but those are the main ones at least.

I imagine the demo provided isn't being generated dynamically - it's static data.
Very few systems would be able to iterate a collection of live data that size. There are a number of techniques including streaming the dataset (using batch iteration through the records) through to caching and ajax partial loading strategies.

More on sproutcore here.. http://ostatic.com/blog/sproutcore-raises-the-bar-for-client-side-programming
If on the other hand you are looking for concurrency then node.js is the way to go.

We Keep Coding

JavaScript is the programming language of the Web.