I have a React JS app that uses an API to pull a JSON object containing an array of 10,000+ objects. This array is used to populate a table, with filters and options (checkboxes, dropdowns) that can manipulate the data. When a checkbox is tapped, filter, sort, and reduce functions are run on the array to return a specific subset that populates the table again.
There are 10-15 options to choose from, so 10-15 filter/map/reduce functions run on the data each time a box is checked.
These filtering options now cause a noticeable lag between clicking on the checkbox and the table updating; the app freezes while it calculates the new array. Is there a more efficient flow for filtering my data?
Some example functions below:
// gameData is an array of 10k+ objects
let inData = gameData
const options = {
  dateToTime: new Date('2020-03-01'),
  servers: [1, 2, 3],
  maps: ['A', 'B', 'C']
}
function groupByArray(array, key) {
  return array.reduce(function (rv, x) {
    let v = key instanceof Function ? key(x) : x[key];
    let el = rv.find((r) => r && r.key === v);
    if (el) {
      el.values.push(x);
    } else {
      rv.push({ key: v, values: [x] });
    }
    return rv;
  }, []);
}
const gamesGrouped = groupByArray(inData, 'gameid')
inData = gamesGrouped.filter(a => a.playername != "new")
inData = inData.filter(game => {
  const thisTime = new Date(game.creationtime)
  return (thisTime < options.dateToTime)
})
inData = inData.filter(game => options.servers.includes(game.serverip))
inData = inData.filter(game => options.maps.includes(game.map))
Thanks in advance!
I would say it is impossible to give a general answer on how to process array data, but I can give some pointers:
Be careful when nesting loops (to avoid repeating the same iterations).
Avoid overhead: e.g. find() can be replaced with a for loop, which is quite a bit faster (I know it is easier to write find(), but you are looking at roughly a 30% performance increase by switching to a for loop).
Paginate: you can process the array in chunks using generators (e.g. if you only need to show the first 10 results, that is faster than processing all of them); see the sketch after this list.
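For the pagination point, a minimal sketch of the generator idea, assuming you only need the first few results on screen (filterLazy and takeFirst are illustrative names, not existing helpers):

// Lazily yield matching items so we can stop early.
function* filterLazy(array, predicate) {
  for (const item of array) {
    if (predicate(item)) yield item;
  }
}

// Collect only the first n matches instead of filtering all 10k rows.
function takeFirst(iterable, n) {
  const out = [];
  for (const item of iterable) {
    out.push(item);
    if (out.length === n) break;
  }
  return out;
}

// e.g. first page of 10 results; the predicate here is illustrative
// const firstPage = takeFirst(filterLazy(gameData, g => g.map === 'A'), 10);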
Also, the code you provided is a bit cryptic; you might want to use better naming.
Here is a performance comparison for the groupByArray() function: https://jsbench.me/t7kltjm1sy/1
Worth noting: whenever I deal with performance-sensitive situations, I keep the code as close to VanillaJS as possible, because with large data sets even a slight function overhead can be noticeable.
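As a concrete example of trimming overhead: the rv.find() call inside groupByArray scans the accumulator on every element, which makes the grouping roughly O(n²). A sketch of the same grouping using a Map, which keeps the output shape but looks up groups in O(1):

function groupByArray(array, key) {
  const groups = new Map();
  for (const x of array) {
    const v = key instanceof Function ? key(x) : x[key];
    const el = groups.get(v);          // O(1) lookup instead of rv.find()
    if (el) {
      el.values.push(x);
    } else {
      groups.set(v, { key: v, values: [x] });
    }
  }
  return [...groups.values()];         // same [{ key, values }] shape as before
}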
Related
In my case the minimum JSON data set I am using is 90k [...] and currently I am using the .filter method. Nothing is wrong; everything works perfectly without any issue, but purely from a performance point of view I am wondering, and need suggestions as well: which approach could we use to improve the performance, or is there a better way?
The requests coming from the backend cannot be modified or split.
For reference, I am adding a 5k-item example which takes around 1 second.
I value every developer's time, so I added a code snippet as well.
Appreciate any help and suggestions.
const load5kData = async () => {
  let url = 'https://jsonplaceholder.typicode.com/photos';
  let obj = await (await fetch(url)).json();
  const filteredValue = obj.filter(item => item.albumId == 36);
  console.log(filteredValue)
}
load5kData();
<h1>5k data</h1>
It looks like the response is returned with the albumId ordered in ascending order. You could make use of that by using a traditional for loop and short-circuiting once you reach id 37.
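A sketch of that idea, assuming the response really is sorted by albumId in ascending order:

const filteredValue = [];
for (let i = 0; i < obj.length; i++) {
  if (obj[i].albumId > 36) break;                  // sorted input: nothing past 36 can match
  if (obj[i].albumId === 36) filteredValue.push(obj[i]);
}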
In my opinion, if you aren't having performance issues just using the filter method, I would say leave it and don't over-optimize!
Another option: there are only 50 items with albumId == 36. You could just make your own array of those objects in a JSON file. However, you obviously lose out on fetching the latest images if the results of the API ever change.
filter really is your only solution here, and it will involve iterating over each element.
If you need to do multiple searches against the same data, you can index the data by the key you will be using. Finding data with a specific albumId then requires no additional filtering, though it still requires iterating over each element once when initially building the index.
const indexByAlbumId = data =>
  data.reduce((a, c) => {
    if (a[c.albumId] === undefined) {
      a[c.albumId] = [c];
    } else {
      a[c.albumId].push(c);
    }
    return a;
  }, {});

const load5kData = async () => {
  const url = 'https://jsonplaceholder.typicode.com/photos';
  const data = await (await fetch(url)).json();
  const indexedData = indexByAlbumId(data);
  console.log('36', indexedData[36]);
  console.log('28', indexedData[28]);
};
load5kData();
Another optimisation: if the data is sorted by the key you are searching, you can take advantage of this with a divide-and-conquer search. First find an entry whose value matches, then find where the matching chunk begins and ends by doing the same divide-and-conquer to the left and right of that element.
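A sketch of that divide-and-conquer idea, assuming the array is sorted ascending by an integer albumId (lowerBound and findAlbumRange are illustrative names):

// Binary search for the first index whose albumId is >= target.
function lowerBound(data, target) {
  let lo = 0, hi = data.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (data[mid].albumId < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// The matching chunk is everything between the lower bounds of target and target + 1.
function findAlbumRange(data, target) {
  return data.slice(lowerBound(data, target), lowerBound(data, target + 1));
}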
My task is:
Implement the function duplicateStudents(), which gets the variable "students" and filters for students with the same matriculation number. First, project all elements in students by matriculation number. After that you can filter for duplicates relatively easily. At the end, project using the following format: { matrikelnummer: (matrikelnummer), students: [ (students[i], students[j], ...) ] }.
Implement the invalidGrades() function, which gets the variable "grades" and filters for possibly incorrect grades. For example, in order to keep manual checking to a minimum, the function should determine for which matriculation numbers several grades were transmitted for the same course. Example: for matriculation number X, a 2.7 and a 2.3 were transmitted for course Y. However, the function would also pick up the valid case, i.e. for matriculation number X, once a 5.0 and once a 2.3 were transmitted for course Y.
In this task you should only use map(), reduce(), and filter(). Do not implement for-loops.
function duplicateStudents(students) {
  return students
  // TODO: implement me
}

function invalidGrades(grades) {
  return grades
    .map((s) => {
      // TODO: implement me
      return {
        matrikelnummer: -1, /* put something here */
        grades: [], /* put something here */
      };
    })
    .filter((e) => e.grades.length > 0)
}
I have the variables students and grades in a separate file. I know it might be helpful to upload the files too, but one is 1,000 lines long and the other 500, which is why I'm not uploading them. I hope it is possible to do the task without the actual values. It is important to say that the values are represented as arrays.
I'll give you an example of using reduce for duplicateStudents. It doesn't return the expected format, but you could go on from there.
const duplicateStudents = (students) => {
  const grouping = students.reduce((previous, current) => {
    if (previous[current.matrikelnummer]) previous[current.matrikelnummer].push(current); // add student if matrikelnummer already exists
    else previous[current.matrikelnummer] = [current];
    return previous;
  }, {});
  console.log(grouping);
  return; // you could process `grouping` into the expected format here
};
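For instance, one possible way to finish that last step (a sketch; it assumes a "duplicate" is exactly a group with more than one entry):

// note: object keys are strings, so matrikelnummer comes back as a string here
return Object.entries(grouping)
  .map(([matrikelnummer, students]) => ({ matrikelnummer, students }))
  .filter((group) => group.students.length > 1);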
Here are some references for you:
map
filter
reduce
I am searching through a large array with a length of more than 10,000.
A sample array looks like this:
let data = ['Hello world mr tom', ........]; // array length of 10000 with different strings
let input = '' // string from the input field
And this is my code:
this.results = [];
for (let i = 0; i < data.length; i++) {
  if (input.toLowerCase().split(' ').every(val => data[i].toLowerCase().includes(val))) {
    this.results.push(data[i])
  }
}
It is working, but it takes too much time to load. Let's say my array contains a common string like "Hello world"; when entering this string in the input field, it takes too much time to load. Is there an optimized way to achieve this search in less time?
Add a debounce function. Don't filter the data immediately - instead, wait until around 500ms after the user has stopped typing, then perform the filtering action.
let timeoutId;
input.addEventListener('input', () => {
  clearTimeout(timeoutId);
  timeoutId = setTimeout(filterResults, 500);
});
(where filterResults, of course, filters the results with the input's value)
Call toLowerCase on the data elements just once, ahead of time, not on every iteration in the loop or after each input change:
const data = [/* your data */];
const lowerData = data.map(str => str.toLowerCase());
// then use `lowerData` instead
Call toLowerCase and .split on the input string once, before the loop, instead of on every iteration (see the combined sketch after this list).
Client-side code is not meant for handling huge amounts of data, especially on lower-end phones; performance suffers. For huge amounts of data, consider doing the filtering server-side instead.
If you have no choice but to do this client-side, perform the calculations in a separate Web Worker so that the page's main UI remains responsive while it's going on
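Putting the toLowerCase and split points together, a minimal sketch (lowerData is computed once up front; each search then reuses the pre-split terms):

const lowerData = data.map(str => str.toLowerCase()); // computed once, ahead of time

function search(input) {
  const terms = input.toLowerCase().split(' ');       // once per search, not per row
  const results = [];
  for (let i = 0; i < lowerData.length; i++) {
    if (terms.every(term => lowerData[i].includes(term))) {
      results.push(data[i]);                          // keep the original casing in results
    }
  }
  return results;
}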
I think you can take a look at these algorithms. You can try to adapt them to this problem or even to other string problems that you may have.
https://www.geeksforgeeks.org/trie-insert-and-search/
https://www.geeksforgeeks.org/kmp-algorithm-for-pattern-searching/
I have a javascript array of nested data that holds data which will be displayed to the user.
The user would like to be able to apply 0 to n filter conditions to the data they are looking at.
In order to meet this goal, I need to first find elements that match the 0 to n filter conditions, then perform some data manipulation on those entries. An obvious way of solving this is to have several filter statements back to back (with a conditional check inside them to see if the filter needs to be applied) and then a map function at the end like this:
var firstFilterList = _.filter(myData, firstFilterFunction);
var secondFilterList = _.filter(firstFilterList, secondFilterFunction);
var thirdFilterList = _.filter(secondFilterList, thirdFilterFunction);
var finalList = _.map(thirdFilterList, postFilterFunction);
In this case, however, the javascript array would be traversed 4 times. A way to get around this would be to have a single filter that checks all 3 (or 0 to n) conditions before determining whether there is a match, and then to do the data manipulation inside the filter at the end. However, this seems a bit hacky and makes the "filter" responsible for more than one thing, which is not ideal. The upside is that the javascript Array is traversed only once.
Is there a "best practices" way of doing what I am trying to accomplish?
EDIT: I am also interested in hearing if it is considered bad practice to perform data manipulation (adding fields to javascript objects etc...) within a filter function.
You could collect all filter functions in an array, check every filter against each item of the actual data set, and filter by the result. Then apply your mapping function to get the wanted result.
var data = [ /* ... */ ],
    filterFn1 = () => Math.round(Math.random()),  // dummy predicate
    filterFn2 = ({ age }) => age > 35,
    filterFn3 = ({ year }) => year === 1955,
    fns = [filterFn1, filterFn2, filterFn3],
    whatever = x => x,                            // placeholder: final function for mapping
    result = data
      .filter(x => fns.every(f => f(x)))
      .map(whatever);
One thing you can do is to combine all those filter functions into one single function, with reduce, then call filter with the combined function.
var combined = [firstFilterFunction, secondFilterFunction, ...]
  .reduce((x, y) => (z => x(z) && y(z)));
var filtered = myData.filter(combined);
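If you also want the final mapping folded into the same single pass the question asks about, a sketch using reduce (combined is the predicate built above; postFilterFunction and myData are the names from the question):

var finalList = myData.reduce(function (acc, item) {
  if (combined(item)) acc.push(postFilterFunction(item)); // filter and map in one traversal
  return acc;
}, []);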
I am trying to build a product list based on multiple filters. I thought this would be very straightforward, but it's not, for me at least.
Here is the plunkr http://plnkr.co/edit/vufFfWyef3TwL6ofvniP?p=preview
Checkboxes are dynamically generated from their respective models, e.g. sizes, colours, categories. Subcategory checkboxes should perform an 'OR' query, but across sections it should perform an 'AND' query.
Basically something like:
filter:{categories:selectedcategories1} || {categories:selectedcategories2} | filter:{categories:selectedsizes1} || {categories:selectedsizes2}
The problem is generating these filters dynamically. I also tried with a filter in the controller:
var tempArr = [{'categories':'selectedvalue1'}, {'categories':'selectedvalue2'}];
var OrFilterObjects = tempArr.join('||');
$scope.products = $filter('filter')($scope.products, OrFilterObjects, true);
But I couldn't find a way to assign the correct value to OrFilterObjects.
Now, as my latest attempt (which is in the plunkr), I am trying to use a custom filter. It's also not returning an OR result.
Right now I am using it as productFilter:search.categories:'categories'. If it had returned an OR result, then I planned to use it as:
`productFilter:search.categories:'categories' | productFilter:search.colours:'colours' | productFilter:search.sizes:'sizes'`
Since I am here seeking help, it would be nice to have something like productFilter:search.
I've spent a considerable amount of time trying to find a solution to this supposedly simple problem, but most examples use 'non-dynamic' checkboxes or flat objects.
Maybe I am thinking in the wrong direction and there is a more elegant and simple Angular way for such scenarios. I would love to be directed toward any solution to a similar problem where nested objects can be filtered with automatically generated dynamic filters. It seems to me a very generic use case for any shopping application, but so far no luck getting there.
First thing you need to understand: this problem is not, by any definition, simple. You want to find a match based on a property of an object in an array which is itself a property of an object inside the input array you're supplying, not to mention the [OR intra-group] + [AND inter-group] relations, search properties defined by either .title or .name, and criteria selection that is completely dynamic. It's a complex problem.
Though it's a common scenario for shopping cart websites, I doubt that any web framework will have this kind of functionality built into its API. It's unfortunate but I don't think we can avoid writing the logic ourselves.
At any rate, since ultimately you want to just declare productFilter:search, here it is:
app.filter('productFilter', function($filter) {
  var helper = function(checklist, item, listName, search) {
    var count = 0;
    var result = false;
    angular.forEach(checklist, function(checked, checkboxName) {
      if (checked) {
        count++;
        var obj = {};
        obj[search] = checkboxName;
        result = result || ($filter('filter')(item[listName], obj, true).length > 0);
      }
    });
    return (count === 0) || result;
  };

  return function(input, definition) {
    var result = [];
    if (angular.isArray(input) && angular.isObject(definition)) {
      angular.forEach(input, function(item) {
        var matched = null;
        angular.forEach(definition, function(checklist, listName) {
          var tmp;
          if (listName !== 'colours') {
            tmp = helper(checklist, item, listName, 'title');
          } else {
            tmp = helper(checklist, item, listName, 'name');
          }
          matched = (matched === null) ? tmp : matched && tmp;
        });
        if (matched) result.push(item);
      });
    }
    return result;
  };
});
A couple of notes:
How to use: ng-repeat="product in products | productFilter:search".
The filter only does some basic checks: the input must be an array, and the definition must be an object. If you need more, you may add them there.
I would say that *.name is an exception to the rule (I assume most of the criteria are defined by *.title), so we handle that in the if/else.
The count variable in the helper function tracks how many checked checkboxes we went through for a particular criteria group. If we went through none, the whole criteria group is inactive, and we just return true.
It's good design to create a filter that doesn't mutate the state of objects outside it. That's why using count is better than calling cleanObj(). This is especially crucial when designing common components for other devs on a team: you want to minimize the element of surprise as much as possible.