GroupBy Array elements from CSV file and help reduce code - javascript

Each csv file that is imported has the same data structure.
I need to sum the ['Net Charge Amount'] by each ['Service Type'].
I am currently doing this by assigning each unique ['Service Type'] to its own array. My current script is probably overkill, but it is very easy to follow. However, I am looking for a more compact way of doing this; otherwise the script could get very long.
const fs = require('fs')
const { parse } = require('csv-parse')
// Arrays for each service type
const GroundShipments = []
const HomeDeliveryShipments = []
const SmartPostShipments = []
const Shipments = []
The [Shipments] array will hold all data and I would assume this is the array
we want to work with
//functions for each service type
function isGround(shipment) {
return shipment['Service Type'] === 'Ground'
}
function isHomeDelivery(data) {
return data['Service Type'] === 'Home Delivery'
}
function isSmartpost(shipment) {
return shipment['Service Type'] === 'SmartPost'
}
function isShipment(shipment) {
return shipment['Service Type'] === 'Ground' || shipment['Service Type'] === 'Home Delivery' ||
shipment['Service Type'] === 'SmartPost'
}
// Import csv file / perform business rules by service type
// output sum total by each service type
fs.createReadStream('repco.csv')
.pipe(parse({
columns: true
}))
.on('data', (data) => {
//push data to proper service type array
// Ground
if (isGround(data)) {
GroundShipments.push(data)
}
// Home Delivery
if (isHomeDelivery(data)) {
HomeDeliveryShipments.push(data)
}
// Smartpost
if (isSmartpost(data)) {
SmartPostShipments.push(data)
}
// All shipment types, including Ground, Home Delivery, and Smartpost
if (isShipment(data)) {
Shipments.push(data)
}
})
.on('error', (err) => {
console.log(err)
})
.on('end', (data) => {
// sum data by service type
// Ground Only
const sumGround = GroundShipments.reduce((acc, data) =>
acc + parseFloat(data['Net Charge Amount']), 0)
// Home Delivery Only
const sumHomeDelivery = HomeDeliveryShipments.reduce((acc, data) =>
acc + parseFloat(data['Net Charge Amount']), 0)
// SmartPost Only
const sumSmartPost = SmartPostShipments.reduce((acc, data) =>
acc + parseFloat(data['Net Charge Amount']), 0)
// All services
const sumAllShipments = Shipments.reduce((acc, data) =>
acc + parseFloat(data['Net Charge Amount']), 0)
//output sum by service type to console
console.log(`${GroundShipments.length} Ground shipments: ${sumGround}`)
console.log(`${HomeDeliveryShipments.length} Home Delivery shipments: ${sumHomeDelivery}`)
console.log(`${SmartPostShipments.length} Smartpost shipments: ${sumSmartPost}`)
console.log(`${Shipments.length} All shipments: ${sumAllShipments}`)
})
Here is the console output:
[1]: https://i.stack.imgur.com/FltTU.png
Instead of separating each ['Service Type'] by its own Array and Function, I would like one Array [Shipments] to output each unique ['Service Type'] and sum total of ['Net Charge Amount']

The two keys to simplifying this are:
separating the CSV parsing from the data processing
using a groupBy function
First, you should parse the CSV into a simple JS array. Then you can use regular JS utility functions to operate on the data, such as groupBy, a utility found in the lodash and ramda libraries. Vanilla JS has since gained an equivalent as Object.groupBy and Map.groupBy (ES2024), but older Node.js versions do not include them.
I was looking for a sample problem to play with my own JS evaluation framework, so I answered your question there:
You can explore the underlying val yourself: https://www.val.town/stevekrouse.exampleGroupByShppingCSV
There are a couple things about my answer that wouldn't make sense in a normal NodeJS codebase, but that I had to do to make it work in val.town (async/await, using a custom groupBy method instead of importing one). If you'd like help getting it to work in your application, just let me know.
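The groupBy idea described above can be sketched in plain Node.js as follows. This is only an illustration: the groupBy helper is hand-written here (lodash's _.groupBy would be a drop-in alternative), sumByServiceType is a hypothetical name, and the sample rows are made up to mimic what csv-parse produces with columns: true.

```javascript
// Minimal groupBy helper: buckets items under the key returned by getKey.
function groupBy(items, getKey) {
  const groups = {};
  for (const item of items) {
    const key = getKey(item);
    if (!groups[key]) groups[key] = [];
    groups[key].push(item);
  }
  return groups;
}

// Hypothetical helper: one { serviceType, count, sum } entry per service type.
function sumByServiceType(shipments) {
  const groups = groupBy(shipments, (shipment) => shipment['Service Type']);
  return Object.entries(groups).map(([serviceType, rows]) => ({
    serviceType,
    count: rows.length,
    sum: rows.reduce((acc, row) => acc + parseFloat(row['Net Charge Amount']), 0),
  }));
}

// Made-up rows, shaped like csv-parse output with `columns: true`.
const sampleShipments = [
  { 'Service Type': 'Ground', 'Net Charge Amount': '10.50' },
  { 'Service Type': 'SmartPost', 'Net Charge Amount': '4.25' },
  { 'Service Type': 'Ground', 'Net Charge Amount': '2.00' },
];

for (const { serviceType, count, sum } of sumByServiceType(sampleShipments)) {
  console.log(`${count} ${serviceType} shipments: ${sum}`);
}
```

In the original script, the rows collected in the Shipments array inside the 'end' handler could be passed to such a function in place of sampleShipments.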

A solution would be to use a Map instance to keep track of the stats of different service types.
For each shipment find the associated stats (based on service type), or create a new stats object { count: 0, sum: 0 }. Then increment the count, and add the amount to the sum.
When all data has been iterated (on end), you can loop through serviceTypeStats and log the values. You can also use this loop to calculate the overall total by adding up the count and sum of each service type group.
const serviceTypeStats = new Map();
// ...
.on('data', (shipment) => {
const serviceType = shipment['Service Type'];
const amount = parseFloat(shipment['Net Charge Amount']);
if (!serviceTypeStats.has(serviceType)) {
serviceTypeStats.set(serviceType, { count: 0, sum: 0 });
}
const stats = serviceTypeStats.get(serviceType);
stats.count += 1;
stats.sum += amount;
})
// ...
.on('end', () => {
const total = { count: 0, sum: 0 };
for (const [serviceType, stats] of serviceTypeStats) {
total.count += stats.count;
total.sum += stats.sum;
console.log(`${stats.count} ${serviceType}: ${stats.sum}`);
}
console.log(`${total.count} All shipments: ${total.sum}`);
})
If you want to loop keys in a specific order you can define the order in an array, or sort the keys of the Map instance.
// pre-defined order
const serviceTypeOrder = ["Ground", "Home Delivery", "SmartPost"];
// or
// alphabetic order (case insensitive)
const serviceTypeOrder = Array.from(serviceTypeStats.keys());
serviceTypeOrder.sort((a, b) => a.localeCompare(b, undefined, { sensitivity: "base" }));
// ...
for (const serviceType of serviceTypeOrder) {
const stats = serviceTypeStats.get(serviceType);
// ...
}

Related

How to prevent duplicate data from for loop

I am wondering what the best way is to prevent duplicate data from getting into a new array. I have a service call that returns the same array 3 times. I'm trying to take a number from inside the objects in the array and add them up to create a "total" number (fullRentAmt), but since the array gets returned 3 times I'm getting the total*3. I am thinking maybe .some() or .filter() could be of use here but I've never used those/am not sure how that would be implemented here. Thanks for any help!
What I tried, but it's not working/the new array isn't getting populated:
Component
properties = [];
fullRentAmt: number = 0;
const propertyDataSub = this.mainService.requestPropertyData()
.subscribe((pData: PropertyData[]) => {
if (pData) {
const propertyData = pData;
for (let i = 0; i < propertyData.length; i++) {
if (this.properties[i].propertyId !== propertyData[i].propertyId) {
this.properties.push(propertyData[i]);
}
}
for (let i = 0; i < this.properties.length; i++) {
this.fullRentAmt += this.properties[i].tenancyInformation[0].rentAmount;
}
}
});
Data returned from backend (array of 2 objects):
[
{
"tenantsData":[
{
"email":null,
"tenantNames":null,
"propertyId":2481,
}
],
"tenancyInformation":[
{
"id":2487,
"rentAmount":1000,
}
],
},
{
"tenantsData":[
{
"email":null,
"tenantNames":null,
"propertyId":3271,
}
],
"tenancyInformation":[
{
"id":3277,
"rentAmount":1200,
}
],
},
I'm not an Angular developer, but I hope my answer will help you.
Let the for loop duplicate the data as much as it wants; you just have to store the items in a JavaScript Set instead of an array.
A Set is very similar to an array: both are iterable collections. The main difference is that a Set does not allow duplicate values.
usage:
const properties = new Set()
properties.add("yellow")
properties.add("blue")
properties.add("orange")
console.log(properties) // yellow, blue, orange
properties.add("blue")
properties.add("blue")
properties.add("blue")
console.log(properties) // yellow, blue, orange
After your for loop finishes, you may want to convert this Set into a normal array. All you have to do is use the spread syntax:
const propertiesArray = [...properties]
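One caveat may be worth illustrating: a Set compares objects by reference, so two separately created objects with identical contents both stay in the Set. The sketch below shows that behaviour and one possible workaround, keying by an id. The uniqueBy helper is hypothetical, and the assumption that propertyId lives on tenantsData[0] is taken from the sample data in the question.

```javascript
// Keep the first object seen for each key; later objects with the same key are skipped.
function uniqueBy(items, getKey) {
  const seen = new Map();
  for (const item of items) {
    const key = getKey(item);
    if (!seen.has(key)) seen.set(key, item);
  }
  return Array.from(seen.values());
}

// Made-up response: the same two properties delivered three times.
const onePass = [
  { tenantsData: [{ propertyId: 2481 }], tenancyInformation: [{ rentAmount: 1000 }] },
  { tenantsData: [{ propertyId: 3271 }], tenancyInformation: [{ rentAmount: 1200 }] },
];
// The JSON round trip creates fresh objects, as separate HTTP responses would.
const received = JSON.parse(JSON.stringify([...onePass, ...onePass, ...onePass]));

// A plain Set keeps all six objects because each one is a distinct reference.
const naiveCount = new Set(received).size;

// Keying by propertyId keeps one object per property.
const uniqueProperties = uniqueBy(received, (p) => p.tenantsData[0].propertyId);
const fullRentAmt = uniqueProperties.reduce(
  (sum, p) => sum + p.tenancyInformation[0].rentAmount,
  0
);

console.log(naiveCount, uniqueProperties.length, fullRentAmt); // 6 2 2200
```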
@YaYa is correct. I added this to show the correct code in Angular
properties = [];
fullRentAmt: number = 0;
const propertyDataSub = this.mainService.requestPropertyData()
.subscribe((pData: PropertyData[]) => {
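if (pData && pData.length) {
// Track ids rather than objects: a Set of objects only compares by reference.
// Assumes propertyId lives on tenantsData[0], as in the sample data above.
const seenIds = new Set(this.properties.map(p => p.tenantsData[0].propertyId));
for (const property of pData) {
const id = property.tenantsData[0].propertyId;
if (!seenIds.has(id)) {
seenIds.add(id);
this.properties.push(property);
}
}
// Recompute the total from scratch so repeated emissions do not inflate it
this.fullRentAmt = 0;
for (let i = 0; i < this.properties.length; i++) {
this.fullRentAmt += this.properties[i].tenancyInformation[0].rentAmount;
}
}
});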
The first thing you should do is fix your server so that it returns the list only once.
If the server is out of your reach, you can use the distinctUntilChanged operator in combination with an isEqual function in the frontend. You can either implement isEqual yourself or use a library such as lodash.
Also, you do not have to subscribe; use the async pipe in the template.
this.properties$ = this.mainService.requestPropertyData()
.pipe(
distinctUntilChanged(isEqual) // provide isEqual function somehow
);
this.totalRentAmount$ = this.properties$.pipe(
map(getTotalRentAmount)
);
// maybe in some other utility file:
export const getTotalRentAmount = (properties: Property[]): number => {
return properties
.map(property => property.tenancyInformation[0].rentAmount) // tenancyInformation is an array in the sample data
.reduce((total, amount) => total + amount, 0);
}
Then in the template:
<div>Total Rent Amount: {{ totalRentAmount$ | async }}</div>
Also if you really need to subscribe in the component and are only interested in the first emitted value of an observable, you can use first() or take(1) pipe to automatically unsubscribe after first value.
this.mainService.requestPropertyData()
.pipe(
first() // or take(1)
)
.subscribe(properties => this.properties = properties);
See the difference between first() and take(1)
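For the isEqual function mentioned earlier in this answer, a hand-rolled version might look like the sketch below. It is only meant for JSON-like data (plain objects, arrays, and primitives); lodash's isEqual handles many more cases, such as dates and circular references. Note also that distinctUntilChanged only suppresses consecutive repeats, which matches the situation where the same list arrives several times in a row.

```javascript
// Structural equality for JSON-like values (plain objects, arrays, primitives).
function isEqual(a, b) {
  if (a === b) return true;
  if (typeof a !== 'object' || typeof b !== 'object' || a === null || b === null) {
    return false;
  }
  // An array never equals a plain object, even with matching keys.
  if (Array.isArray(a) !== Array.isArray(b)) return false;
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  if (keysA.length !== keysB.length) return false;
  return keysA.every(
    (key) => Object.prototype.hasOwnProperty.call(b, key) && isEqual(a[key], b[key])
  );
}

// Two separately created lists with the same contents compare as equal.
const first = [{ tenancyInformation: [{ id: 2487, rentAmount: 1000 }] }];
const second = [{ tenancyInformation: [{ id: 2487, rentAmount: 1000 }] }];
console.log(isEqual(first, second)); // true
```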

How to concatenate indefinite number of arrays of Javascript objects

I have an array of Groups such that each Group has many Users.
I want to return all (unique) Users for a given array of Groups.
So far, I have
let actor = await User.query().findById(req.user.id).eager('groups') // find the actor requesting
let actor_groups = actor.groups // find all groups of actor
if (actor_groups.length > 1)
var actor_groups_users = actor_groups[0].user
for (let i = 0; i < actor_groups.length; i++) {
const actor_groups_users = actor_groups_users.concat(actor_groups[i]);
}
console.log('actor groups users is', actor_groups_users);
else
// return users from the first (only) group
which returns the error: actor_groups_users is not defined
Feels like a roundabout way to do this. Is there a way to just combine actor_groups into a single combined group?
Here we can cycle through, adding users if not already in the array, using .forEach() and .includes().
This is assuming that group.user is an Array of users.
let users = [];
// check that actor_groups has items in it
if (actor_groups && actor_groups.length > 0) {
// cycle through each actor_group
actor_groups.forEach( group => {
// check if we have a 'user' array with items in it
if (group.user && group.user.length > 0) {
// cycle through each user in the group
group.user.forEach( user => {
// check if we already have this user
// (includes() compares with ===, so objects match only by reference)
// if not, add it to users
if (!users.includes(user)) {
users.push(user);
}
});
}
});
}
You can simply do this:
const allGroupsArrs = actor_groups.map(({ user }) => user);
const actor_groups_users = [].concat(...allGroupsArrs);
Or, you could simply use the .flat() method, which has been part of the standard since ES2019 and is supported everywhere except IE:
const allGroupsArrs = actor_groups.map(({ user }) => user);
const actor_groups_users = allGroupsArrs.flat();
Also, the above would result in duplicate values in actor_groups_users if there are people who are in multiple groups. You can remedy this (assuming the array elements are primitive values) using a Set:
const unique_users = [...new Set(actor_groups_users)];
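If the users are objects rather than primitive values, a Set will not remove duplicates that are separate object instances. A possible sketch for that case is shown below, assuming every user carries a unique id field (the id and name fields, the sample groups, and the uniqueUsers helper are all hypothetical).

```javascript
// Flatten every group's users, keeping one entry per user id.
function uniqueUsers(groups) {
  const byId = new Map();
  for (const group of groups) {
    // Tolerate groups that have no user array at all.
    for (const user of group.user || []) {
      if (!byId.has(user.id)) byId.set(user.id, user);
    }
  }
  return [...byId.values()];
}

// Made-up groups: Bob belongs to two of them.
const sampleGroups = [
  { user: [{ id: 1, name: 'Ann' }, { id: 2, name: 'Bob' }] },
  { user: [{ id: 2, name: 'Bob' }, { id: 3, name: 'Cy' }] },
  { user: [] },
];

console.log(uniqueUsers(sampleGroups).map((user) => user.name)); // [ 'Ann', 'Bob', 'Cy' ]
```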
The most efficient way I can think of is
const users = [...new Set([...actor_groups].flatMap(el => el.user))]
I used this example:
const actor_groups = [{user: ['ok','boom']}, {user: 'er'}]
console.log([...new Set([...actor_groups].flatMap(el => el.user))])
//output: ["ok", "boom", "er"]

Filter/Reject Array of strings against multiple values using underscore

I'd like to _.filter or _.reject the cities array using the filters array using underscore.
var cities = ['USA/Aberdeen', 'USA/Abilene', 'USA/Akron', 'USA/Albany', 'USA/Albuquerque', 'China/Guangzhou', 'China/Fuzhou', 'China/Beijing', 'China/Baotou', 'China/Hohhot' ... ]
var filters = ['Akron', 'Albuquerque', 'Fuzhou', 'Baotou'];
My progress so far:
var filterList;
if (reject) {
filterList = angular.copy(cities);
_.each(filters, (filter) => {
filterList = _.reject(filterList, (city) => city.indexOf(filter) !== -1);
});
} else {
filterList = [];
_.each(filters, (filter) => {
filterList.push(_.filter(cities, (city) => city.indexOf(filter) !== -1));
});
}
filterList = _.flatten(filterList);
return filterList;
I'd like to DRY this up and use a more functional approach to achieve this if possible?
A somewhat more functional version using Underscore might look like this:
const cities = ['USA/Aberdeen', 'USA/Abilene', 'USA/Akron', 'USA/Albany',
'USA/Albuquerque', 'China/Guangzhou', 'China/Fuzhou',
'China/Beijing', 'China/Baotou', 'China/Hohhot']
const filters = ['Akron', 'Albuquerque', 'Fuzhou', 'Baotou'];
var inList = names => value => _.any(names, name => value.indexOf(name) > -1);
_.filter(cities, inList(filters));
//=> ["USA/Akron", "USA/Albuquerque", "China/Fuzhou", "China/Baotou"]
_.reject(cities, inList(filters));
//=> ["USA/Aberdeen", "USA/Abilene", "USA/Albany",
// "China/Guangzhou", "China/Beijing", "China/Hohhot"]
I'm using vanilla JavaScript here (some() and filter()) but I hope you get the idea:
const isValidCity = city => filters.some(filter => city.indexOf(filter) > -1)
const filteredCities = cities.filter(isValidCity)
Please note that this is a loop over a loop. So the time complexity is O(n * m) here.
In your example all city keys share the same pattern: country + / + city. Your filters are all an exact match to the city part of these names.
If this is a certainty in your data (which it probably isn't...), you could reduce the number of loops your code makes by creating a Map or object that stores each city per filter entry:
Create an object with an entry for each city name
Make the key the part that you want the filter to match
Make the value the original name
Loop through the filters and return the name at each key.
This approach always requires one loop through the data and one loop through the filters. For small array sizes, you won't notice a performance difference. When one of the arrays has length 1, you'll also not notice any differences.
Again, note that this only works if there's a constant relation between your filters and cities.
var cities = ['USA/Aberdeen', 'USA/Abilene', 'USA/Akron', 'USA/Albany', 'USA/Albuquerque', 'China/Guangzhou', 'China/Fuzhou', 'China/Beijing', 'China/Baotou', 'China/Hohhot' ]
var filters = ['Akron', 'Albuquerque', 'Fuzhou', 'Baotou'];
const makeMap = (arr, getKey) => arr.reduce(
(map, x) => Object.assign(map, {
[getKey(x)]: x
}), {}
);
const getProp = obj => k => obj[k];
const getKeys = (obj, keys) => keys.map(getProp(obj));
// Takes the part after the "/"
const cityKey = c => c.match(/\/(.*)/)[1];
const cityMap = makeMap(cities, cityKey);
const results = getKeys(cityMap, filters);
console.log(results);
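The lookup idea above can also cover the reject case in a single pass. The following is a sketch under the same assumption, namely that each filter exactly matches the part of the city string after the "/"; partitionCities is a hypothetical helper name.

```javascript
const cityList = ['USA/Aberdeen', 'USA/Abilene', 'USA/Akron', 'USA/Albany',
  'USA/Albuquerque', 'China/Guangzhou', 'China/Fuzhou',
  'China/Beijing', 'China/Baotou', 'China/Hohhot'];
const filterList = ['Akron', 'Albuquerque', 'Fuzhou', 'Baotou'];

// Split cities into those whose name is in the filter list and those whose name is not.
function partitionCities(cities, filters) {
  const wanted = new Set(filters); // constant-time membership checks
  const matched = [];
  const rejected = [];
  for (const city of cities) {
    // Everything after the first "/" (the whole string if there is none).
    const name = city.slice(city.indexOf('/') + 1);
    (wanted.has(name) ? matched : rejected).push(city);
  }
  return { matched, rejected };
}

const { matched, rejected } = partitionCities(cityList, filterList);
console.log(matched);  // the _.filter result
console.log(rejected); // the _.reject result
```

This loops once over the filters (to build the Set) and once over the cities, and it keeps the cities in their original order.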
Since you seem to be using AngularJS, you could utilize the built-in filter functionality. Assuming both the cities and filters array exist on your controller and you're displaying the cities array using ng-repeat, you could have something like this on your controller:
function cityFilter(city) {
var cityName = city.split('/')[1];
if (reject) {
return filters.indexOf(cityName) === -1;
} else {
return filters.indexOf(cityName) > -1;
}
}
And then in your template, you'd do something like this:
<div ng-repeat="city in cities | filter : cityFilter"></div>
Of course you'd have to modify your syntax a bit depending on your code style (for example, whether you use $scope or controllerAs).

RxJs zip operator equivalent in xstream?

Hello I'm trying to figure out if there is an equivalent to the RxJs operator zip in xstream, or at least a way to get the same behaviour. In case anyone needs clarification on the difference the marble diagrams below will show.
zip in rxjs
|---1---2---3-----------5->
|-a------b------c---d----->
"zip"
|-1a----2b------3c-----5d->
whereas 'combineLatest' aka 'combine' in xstream does
|---1---2----------4---5->
|----a---b---c---d------->
"combine"
|-1a----2a-2b-2c-2d-4d-5d>
Any help is appreciated as I'm very new to programming with streams. Thank you in advance!
I also needed a zip operator for xstream. So I created my own from existing operators. It takes an arbitrary number of streams for zipping.
function zip(...streams) {
// Wrap the events on each stream with a label
// so that we can separate them into buckets later.
const streamsLabeled = streams
.map((stream$, idx) => stream$.map(event => ({label: idx + 1, event: event})));
return (event$) => {
// Wrap the events on the source stream with a label as well
// so that we can separate them into buckets later.
const eventLabeled$ = event$.map(event => ({label: 0, event: event}));
const labeledStreams = [eventLabeled$, ...streamsLabeled];
// Create the buckets used to store stream events
const buckets = labeledStreams.map((stream, idx) => idx)
.reduce((buckets, label) => ({...buckets, [label]: []}), {});
// Initial value for the fold operation
const accumulator = {buckets, tuple: []};
// Merge all the streams together and accumulate them
return xs.merge(...labeledStreams).fold((acc, event) => {
// Buffer the events into separate buckets
acc.buckets[event.label].push(event);
// Does the first value of all the buckets have something in it?
// If so, then there is a complete tuple.
const tupleComplete = Object.keys(acc.buckets)
.map(key => acc.buckets[key][0])
.reduce((hadValue, value) => value !== undefined
? true && hadValue
: false && hadValue,
true);
// Save completed tuple and remove it from the buckets
if (tupleComplete) {
acc.tuple = [...Object.keys(acc.buckets).map(key => acc.buckets[key][0].event)];
Object.keys(acc.buckets).map(key => acc.buckets[key].shift());
} else {
// Clear tuple since all columns weren't filled
acc.tuple = [];
}
return {...acc};
}, accumulator)
// Only emit when we have a complete tuple
.filter(buffer => buffer.tuple.length !== 0)
// Just return the complete tuple
.map(buffer => buffer.tuple);
};
}
This can be used with compose.
foo$.compose(zip(bar$)).map(([foo, bar]) => doSomething(foo, bar))
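The buffering idea behind this operator can be illustrated without xstream at all. The sketch below is not a stream implementation; it only simulates zip on a list of events given in arrival order, where zipEvents and the { source, value } event shape are hypothetical names chosen for the illustration.

```javascript
// Pair the n-th value of every source, given events in arrival order.
function zipEvents(events, sourceCount) {
  // One FIFO queue per source, like the buckets in the operator above.
  const queues = Array.from({ length: sourceCount }, () => []);
  const tuples = [];
  for (const { source, value } of events) {
    queues[source].push(value);
    // A tuple is complete once every queue holds at least one value.
    if (queues.every((queue) => queue.length > 0)) {
      tuples.push(queues.map((queue) => queue.shift()));
    }
  }
  return tuples;
}

// Arrival order read off the zip marble diagram in the question:
// a, 1, 2, b, 3, c, d, 5 (source 0 emits numbers, source 1 emits letters).
const arrivals = [
  { source: 1, value: 'a' }, { source: 0, value: 1 },
  { source: 0, value: 2 }, { source: 1, value: 'b' },
  { source: 0, value: 3 }, { source: 1, value: 'c' },
  { source: 1, value: 'd' }, { source: 0, value: 5 },
];

console.log(zipEvents(arrivals, 2)); // [ [ 1, 'a' ], [ 2, 'b' ], [ 3, 'c' ], [ 5, 'd' ] ]
```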

Observable function to return a chunked array

I have a function that returns something like Observable<[number, Array<DataItem>]>. Is it possible to write some function that returns Observable<[number, Array<PageWithDataItems>]> using some Observable functions, given a function chunk (which chunks the DataItem array according to page size) and a simple constructor that creates a PageWithDataItems from a chunked DataItem array?
What I have is some code that subscribes to Observable<[number, Array<DataItem>]> and then creates a new Observable, but I am hoping it would be possible to do the same with map, mapTo, switchMap or similar. I am a bit lost in all the Observable functions, so any help?
I am not entirely sure what you are going for here, but I gave it a shot:
// stream would be your data... just random chunks of numbers as an example here.
const stream = Rx.Observable.range(0, 480).bufferWithCount(100).select(d => [Math.random() * 100, d]);
class DataChunk<T> {
constructor(public data: Array<T>) { }
}
const pageSize = 10;
stream
// I do not understand what the 'number' in your [number, Array<DataItem>]
// represents. But it is the 'someNumber' item here..
.map(d => ({someNumber: <number>d[0], data: <number[]>d[1]}))
.map(d => ({
someNumber: d.someNumber,
pages: Ix.Enumerable
.fromArray(d.data)
.select((item, idx) => ({ pageNr : Math.floor(idx / pageSize), item: item }))
.groupBy(i => i.pageNr)
.select(pageItems => new DataChunk(pageItems.select(i => i.item).toArray()))
.toArray()
}))
.subscribe(dataInfo => {
// here each dataInfo sent down the stream will have been split up into chunks
// of pageSize
log('Data received: ');
log(' someNumber: ' + dataInfo.someNumber);
log(' page count: ' + dataInfo.pages.length);
});
Working example on jsfiddle.
I used IxJS to do the chunking. It works similarly to RxJS but operates on collections (e.g. arrays) rather than streams of events like RxJS. I hope this is close to what you wanted; your question is not entirely clear.
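Without IxJS, the same transformation can be expressed as a plain function that is handed to the map operator. The sketch below assumes the chunk function and PageWithDataItems constructor the question mentions; both are written out here as hypothetical stand-ins, and toPages is an illustrative name.

```javascript
// Split an array into consecutive slices of at most `size` items.
function chunk(items, size) {
  const pages = [];
  for (let i = 0; i < items.length; i += size) {
    pages.push(items.slice(i, i + size));
  }
  return pages;
}

// Hypothetical stand-in for the page class from the question.
class PageWithDataItems {
  constructor(items) {
    this.items = items;
  }
}

// Builds the function to pass to map: [number, DataItem[]] in, [number, PageWithDataItems[]] out.
const toPages = (pageSize) => ([count, items]) => [
  count,
  chunk(items, pageSize).map((pageItems) => new PageWithDataItems(pageItems)),
];

// With RxJS this would be used roughly as: source$.pipe(map(toPages(10)))
const [total, pages] = toPages(2)([5, ['a', 'b', 'c', 'd', 'e']]);
console.log(total, pages.map((page) => page.items));
```

Because map only transforms each emitted value, there is no need to subscribe and build a new Observable by hand.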
