(My first post, I apologise for any mistakes)
I'm working with a small set of data in CSV files, which I need to read, process, and then export as a text file.
The format of the CSV data is:
REGO,STATUS,SHIFT,LOCATION,LOADED
CCA4110,StatusON,5:43,Brisbane,1
CCA4112,StatusON,5:44,Sydney,0
CCA4118,StatusON,6:11,Melbourne,1
I want to be able to take each line after the header row, and check
a) if the 'LOADED' value equals 0 or 1 (skip to next row if 1).
b) If 'LOADED' is 0, then check if the 'REGO' value matches a pre-defined list of 'REGO' values.
c) If a match, change the 'SHIFT' time.
d) If no match, move on to next row.
After that, I want to export all of the rows, with only the 'REGO' and 'SHIFT' values, to look like:
CCA4110,5:43
CCA4112,5:33
...
Because this feels a little complex to me, I'm having trouble visualising the best way to approach this problem. I was wondering if someone could help me think about this in a way that isn't just hacking together a dozen nested for loops.
Thanks very much,
Liam
Edit: a question about checking multiple conditions:
Say I have two CSV files:
List_to_Change.csv
REGO,STATUS,SHIFT,LOCATION,LOADED
CCA2420,StatusOn,11:24,BRISBANE,1
CCA2744,StatusOn,4:00,SYDNEY,1
CCA2009,StatusOn,4:00,MELBOURNE,0
List_to_Compare.csv
REGO,CORRECT_SHIFT
CCA2420,6:00
CCA2660,6:00
CCA2009,5:30
An algorithm:
1. Check value in 'List_to_Change.csv' 'LOADED' column
A. If value equals '0' go to step 2.
B. If value equals '1' skip this row and go to next.
2. Check if 'REGO' value in 'List_to_Change.csv' shows up in 'List_to_Compare.csv'
A. If true go to step 3.
B. If false skip this row and go to next.
3. Change 'SHIFT' value in 'List_to_Change.csv' with value shown in 'List_to_Compare.csv'
4. Stringify each row that was changed and export to text file.
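For illustration, here is a rough JavaScript sketch of those four steps (this is not part of the original question; it assumes both files have already been read into strings, that the CSVs contain no quoted commas, and the helper names are made up):
// Parse a CSV string into rows, dropping the header.
const toRows = csv =>
  csv.trim().split("\n").slice(1).map(line => line.trim().split(","));

// Build a REGO -> CORRECT_SHIFT lookup from List_to_Compare.csv.
const buildShiftLookup = compareCsv =>
  new Map(toRows(compareCsv).map(([rego, correctShift]) => [rego, correctShift]));

const exportChangedRows = (changeCsv, compareCsv) => {
  const lookup = buildShiftLookup(compareCsv);
  return toRows(changeCsv)
    .filter(([rego, , , , loaded]) => loaded === "0" && lookup.has(rego)) // steps 1 and 2
    .map(([rego]) => `${rego},${lookup.get(rego)}`)                       // steps 3 and 4
    .join("\n");
};
// With the two example files above, this would produce: CCA2009,5:30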
My advice would be to split the workflow into three steps:
Parse all rows to JavaScript objects
Perform the logic on the array of objects
Stringify the objects back to CSV
// This creates an object based on the order of columns:
const Entry = ([rego, status, shift, location, loaded]) =>
  ({ rego, status, shift, location, loaded });

// Which entries are we interested in? (only rows where LOADED is 0)
const shouldHandleEntry = ({ loaded }) => loaded === "0";

// How do we update those entries?
const updateEntry = entry => ({
  ...entry,
  shift: ["CCA4112"].includes(entry.rego)
    ? "5:33"
    : entry.shift
});

// What do we export?
const exportEntry = ({ rego, shift }) => `${rego},${shift}`;

// Chain the steps to create a new table:
console.log(
  csvBody(getCSV())
    .map(Entry)
    .filter(shouldHandleEntry)
    .map(updateEntry)
    .map(exportEntry)
    .join("\n")
);

// (This needs work if you're using it for production code)
function csvBody(csvString) {
  return csvString
    .split("\n")
    .map(line => line.trim().split(","))
    .slice(1);
}

function getCSV() {
  return `REGO,STATUS,SHIFT,LOCATION,LOADED
CCA4110,StatusON,5:43,Brisbane,1
CCA4112,StatusON,5:44,Sydney,0
CCA4118,StatusON,6:11,Melbourne,1`;
}
My task is:
Implement the function duplicateStudents(), which gets the variable
"students" and filters for students with the same matriculation
number. Firstly, project all elements in students by matriculation
number. After that you can filter for duplicates relatively easily. At
the end project using the following format: { matrikelnummer:
(matrikelnummer), students: [ (students[i], students[j], ... ) ] }.
Implement the invalidGrades() function, which gets the variable "grades"
and filters for possibly incorrect grades. For example, in order to
keep manual checking as low as possible, the function should determine
for which matriculation numbers several grades were transmitted for
the same course. Example: for matriculation number X, a 2.7 and a 2.3
were transmitted for course Y. However, the function would also take
into account the valid case, i.e. for matriculation number X, once a
5.0 and once a 2.3 were transmitted for course Y.
In this task you should only use map(), reduce(), and filter(). Do not
implement for-loops.
function duplicateStudents(students) {
  return students
  // TODO: implement me
}

function invalidGrades(grades) {
  return grades
    .map((s) => {
      // TODO: implement me
      return {
        matrikelnummer: -1/* put something here */,
        grades: []/* put something here */,
      };
    })
    .filter((e) => e.grades.length > 0)
}
I have the variables students and grades in a separate file. I know it might be helpful to upload the files too, but one is 1000 lines long and the other 500, which is why I'm not uploading them. I hope it is possible to do the task without the values. It is important to say that the values are represented as an array.
I'll give you an example of using reduce on duplicateStudents. It doesn't return the expected format yet, but you could go from there.
const duplicateStudents = (students) => {
  const grouping = students.reduce((previous, current) => {
    if (previous[current.matrikelnummer]) previous[current.matrikelnummer].push(current); // add student if matrikelnummer already exists
    else previous[current.matrikelnummer] = [current];
    return previous;
  }, {});
  console.log(grouping);
  return; // you could process `grouping` into the expected format here
};
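For completeness, here is one rough way to take grouping to the requested format (an untested sketch, assuming the shape produced by the reduce above; note the matriculation numbers come back as string keys):
const duplicateStudents = (students) => {
  // Group students by matrikelnummer, as in the answer above.
  const grouping = students.reduce((previous, current) => {
    (previous[current.matrikelnummer] = previous[current.matrikelnummer] || []).push(current);
    return previous;
  }, {});

  // Keep only matriculation numbers that occur more than once and project to
  // { matrikelnummer, students: [...] }.
  return Object.keys(grouping)
    .filter(matrikelnummer => grouping[matrikelnummer].length > 1)
    .map(matrikelnummer => ({ matrikelnummer, students: grouping[matrikelnummer] }));
};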
Here are some references for you:
map
filter
reduce
I am working on an Angular app that has:
a set of filters
and records present in a table
Columns in the table correspond to filters. At any time, the filters contain the unique values from their corresponding columns as options.
For a record, a column can contain more than one value (i.e. more than one option from the corresponding filter).
When a user selects an option from a filter, the records in the table are filtered and the filtered results (as per the user's selection) are shown to the user.
Once the set of filtered records is derived, the unique values for each filter are derived from that set by finding the unique values for each column.
Keys of Filter objects correspond to columns of Record objects. I have a list of records and a list of filters. I want to iterate over both these lists and find the unique column values for each key.
I am using the logic below to find the unique options for the filters from my records.
export function filterOptionsService(records: Record[], filters: RecordFilter[]): RecordFilter[] {
  const newFilters: RecordFilter[] = filters.map(filter => {
    // logic to find all values for a column from the set of records
    const filterOptions = records.reduce((uniqueList, record) => uniqueList.concat(record[filter.key]), []);
    // logic to find unique values for a column from the set of records,
    // which act as options of the corresponding filter.
    const uniqueOptions = uniqBy(filterOptions, (opt) => filter.valueFunction ? filter.valueFunction(opt) : opt);
    const dropListOptions: DropListOption[] = uniqueOptions.map(value => {
      return {
        label: filter.labelFunction ? filter.labelFunction(value) : value,
        value: filter.valueFunction ? filter.valueFunction(value) : value,
      };
    });
    filter.options = orderBy(dropListOptions, 'label');
    // here is my logic to find the count of each option present in the filtered records
    filter.options = filter.options.map(option => ({
      ...option,
      count: filter.valueFunction
        ? filterOptions.filter(value => filter.valueFunction(value) === option.value).length
        : filterOptions.filter(value => value === option.value).length
    }));
    return filter;
  });
  return newFilters;
}
interface RecordFilter {
  key: string;
  labelFunction?: Function;
  valueFunction?: Function;
  order: number;
  multiple: boolean;
  options: DropListOption[];
  values: DropListOption[];
}

interface Record {
  column1: string;
  column2: string;
  column3: string | string[];
  // ...
  columnN: string;
}
The below logic takes the most time in my code. It takes around 7 seconds for 8k records.
const filterOptions = records.reduce((uniqueList, record) =>
uniqueList.concat(record[filter.key]), []);
I am unable to make my code perform better. Can you please suggest where I am going wrong?
Here is sample code on the TypeScript playground:
https://www.typescriptlang.org/play?#code/FASwdgLgpgTgZgQwMZQAQDEQBtowPIAOEIA9mKgN7Co201YIBGUWAXKgM4QzgDmA3NTq0AbgiwBXKOwRgAnoOG1kxEVAD87RiRJYosxUtQEEcrCQQATTalkKhSpCQmQbYCQFtmMQzQC+-LTA1AD0IRwIHgR6qGJ6HKhwJDCoANZQcrYwaAQwJASwEHIAjAA0xnkFMEUATKigkLCIKKgASlBOMJaYOLCUDsLpcuxcPGAC9UaoDMxY6C5IxGQ282CLpGC+wnFSq+vL7HtLmwN0yZaw7O5esFt0HhI4INHSqNq6+idT+ccch9i4QjHADaAF07qJxFI-hgAbAgRswVsAkFgGEKvlqiAoAkSHA2h1zqgkAwOAkENlOJE0AgEkNUHiCZ1unCUiTaRw0SFUBAABYgBICzggKJYEBwbGWVDZKo4yAIY6oDyZFQScTE0kJXm0pV2JXJHKVQrYzngXDNNCtEgAd36dFy+UKJRG3D4SIGDqqtRdYwEwACwWATjAXFQ4BAxHEVutMOjYNQAF5UMDTlNhBRPU7iqxgQAiMhQXPlXN87KF0GlTPVOQ1Vi5phIXN+UqptO0DNG6vZvMStRF1C5sDgcuVzve3MXOBNlttqYdx1dnP5sCF4ulqCrge9zeD4e5itV8cNputtPg4JBMIASVsHgZYCwKrJIF4Q-GsShOJ5JDSGQxXsyJIUg4EVonFEAkAVDYgzIUNw0jOZWVjQkuh6XB4yTFNZyUCghjrQ8Sn7c5LjKJVHmIF5WEQLAOCgcofg2P4wXKHYcXYMFm1PNNcIyfCxxrIiukuGpygeJ5KOo2j6KIRicwrD9JDY5NQT8GguKMc9gglXp8Bk2CAGVYBECCoAACngkAoxtDhSgs8Q0NgDgAEpBGCdEb21NRUGKAAWRJWQZRgACsOggTl0WtXkoHIDyEC8gAOVJpRtQKQsWBI+TQWYUuySwJBaTEICgsgeQQdIEgANgAWgAdk4QkwEsDgADpgDgBZFW0wE9JDQyYGMlBTLyGN2DjeSusc0aUJZHSwSc-pTgmmAWo8BACFMpbEwAPjtIwwnMXgIO-fzGtsLAsAU6FEmSWxiV0TxyDgPI70y+qIAZfFsmZBJT2DUMloRWDE2SmNmty-KzNMlwQAARykAAZAUIHKYb5oTHbUGhuGoERrhmuDSCICGm1gSW5qhlBJzyjmwIuN+2CPmag7TIAcgBnqOBZ0p2d+FzsNbPwXP9QMgA
I think the performance problem you're having is that Array.prototype.concat() does not modify an existing array, but instead returns a new array. Immutability is nice, but it doesn't seem relevant to your use case: every uniqueList array you create except for the very last one will be discarded. Object creation is fairly fast in JavaScript, but creating thousands of array objects only to immediately throw them away is slowing things down.
My suggestion would be to replace concat() with something that modifies the existing array, such as Array.prototype.push():
That is, you could change
const filterOptions = rows.reduce(
(uniqueList, row) => uniqueList.concat(row[filter.key]), []
);
to
const filterOptions: string[] = [];
for (let row of rows) filterOptions.push(...row[filter.key]);
When I run a simulation where I create 6000 rows and 14 filters (see playground link below), the concat() version takes about 7.5 seconds, whereas the push() version takes about 38 milliseconds. Hopefully that factor of ~200 improvement holds in your environment and makes enough of an impact to be sufficient for your needs.
Note that I'm not dealing with any "uniquifying" or "function" aspects that your original problem seems to have, since your reproducible example code doesn't touch that either. Unique strings might be more easily tallied via an object key than an array, or maybe even via a Set. But again, hopefully changing concat() to push() will be enough for you.
Playground link to code
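If you do want to try the Set idea mentioned above, here is a rough sketch (not taken from the original code; it assumes record[filter.key] is either a string or an array of strings, and it ignores labelFunction/valueFunction for brevity):
function uniqueOptionsFor(records, key) {
  // Collect and de-duplicate the column values in a single pass.
  const seen = new Set();
  for (const record of records) {
    const value = record[key];
    if (Array.isArray(value)) for (const v of value) seen.add(v);
    else if (value !== undefined) seen.add(value);
  }
  return [...seen];
}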
I am importing an Excel file using xlsx in Angular, where the file is imported as an array of arrays, with each array being a row of the Excel file and each item in each array being a cell within its respective row. See below for how this is done:
onFileChange(event: any)
{
  const inputFile: DataTransfer = <DataTransfer>(event.target);
  const fileReader: FileReader = new FileReader();
  fileReader.onload = (event: any) =>
  {
    const binaryString: string = event.target.result;
    const workBook: XLSX.WorkBook = XLSX.read(binaryString, { type: 'binary', sheetStubs: true });
    /* sheetStubs: true is supposed to include empty cells, but doesn't seem to */
    const workSheetName: string = workBook.SheetNames[0];
    const workSheet: XLSX.WorkSheet = workBook.Sheets[workSheetName];
    this.data = <any[][]>(XLSX.utils.sheet_to_json(workSheet,
      { header: 1, blankrows: true }));
  };
  fileReader.readAsBinaryString(inputFile.files[0]);
}
I am now trying to find the column which contains a manufacturer description by looping through each of the arrays and doing a regex search for the term cap, like so:
getManufacturerDescriptionColumn()
{
  console.log(this.data)
  for (const row in this.data)
  {
    var manDescriptIndex = row.search(/cap/i)
    console.log(manDescriptIndex)
    if (manDescriptIndex > -1)
    {
      return manDescriptIndex
    }
  }
}
However, whenever I try this, even though the phrase cap is clearly present in some of the arrays, I am presented with all -1 values, indicating that the phrase is not found in any of the arrays. For example, CAP is clearly present in a couple of the 15 rows of the Excel file, but I am still met with fifteen -1 values.
Any ideas why a regex search isn't identifying the phrase in this scenario? When I do console.log(this.data) I can see the phrase cap in the output.
I have also tried adding another layer of iteration to isolate the strings of the individual cells in each row, also to no avail.
As you can read on MDN, you are using a for..in loop, which is the mistake; you should be using for..of instead. Explanation below.
for..in is made to iterate over objects. It does work on arrays, but it only gives you the keys, which in this case are the indexes: 0, 1, 2 and so on...
If you want to get the values, you will have to use the for..of loop (MDN).
You will see this more clearly if you print console.log(row) before the check.
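A tiny illustration of the difference, using a made-up two-row array:
const rows = [["CAP 100uF", "X"], ["RES 10k", "Y"]];
for (const row in rows) console.log(row); // logs the keys: "0", then "1"
for (const row of rows) console.log(row); // logs the values: ["CAP 100uF", "X"], then ["RES 10k", "Y"]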
OK, so #AlejandroVales was definitely right in his answer, so I'm not going to take away that well-earned check mark, but I have an update to fully reach the solution. Changing in to of was instrumental in getting me to the point I needed to reach, but there was more to be done, and in the spirit of sharing information (and, frankly, good karma) I have a solution for others who run into a similar issue. Rather than trying to use array methods to see if I could find what I am looking for, I tested each element in each array against a regular expression; in other words, I flipped the problem on its head a little bit. To clarify, take a look at the code block below:
var cap = new RegExp(/cap/i)
for (const row of this.data)
{
  for (let cell = 0; cell < row.length; cell++)
  {
    if (cap.test(row[cell]))
    {
      return cell
    }
  }
}
First, set a variable equal to a regex. Then iterate down to the level where I am looking at each cell in each array (each array being a row of the Excel document in this case), find the cell that returns a TRUE value, and poof, problem solved. If I didn't do a good job explaining this solution, feel free to leave a comment below. I am no expert, but I figured it's only right to try my best to give value to a community that has given me so much value previously.
I'm working on an app that allows a user to calculate information regarding a trade based on their entries. I've been stuck on getting correct trade-sum calculations when entries are added after the page has already retrieved the existing entries.
<td scope="col">${sumTotal('amount')}</td>
<script>
...
mounted() {
  this.trade_id = window.location.pathname.split("/")[3];
  // Get entries of trade
  axios
    .get('/trade/' + window.location.pathname.split("/")[3] + '/entries/')
    .then(response => (this.entries = response.data));
},
...
methods: {
  sumTotal(base) {
    return Math.round(this.entries.reduce((sum, cur) => sum + cur[base], 0));
  }
}
</script>
When the page first loads and pulls the existing entries the first calculation (first column) is correct with 16. However, when I add a new entry the auto-calculated value is no longer true.
The first column should be 26, but it is now 1610.
I can add much more info as needed. I wasn't sure how much would be required and I didn't want to clutter the question.
This happened on the JS side: you are adding a numeric value to a string value, so the result is the concatenation of both values in a new string.
If you noticed, the previous sum was 16 and the new value is 10, so their concatenation is 1610.
The solution is as simple as converting the new value from a string to a numeric value.
You should change the line inside sumTotal to:
return Math.round(this.entries.reduce((sum, cur) => sum + parseInt(cur[base]), 0));
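To illustrate with the numbers from the question (assuming the new entry's amount comes back from the API as the string "10"):
console.log(16 + "10");            // "1610" (string concatenation)
console.log(16 + parseInt("10"));  // 26 (numeric addition)
If the amounts can contain decimals, Number(cur[base]) or parseFloat would be the safer conversion, since parseInt drops the fractional part.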
I am creating a GraphQL server using Express, and I have a resolver that can transform my fields as per input from the user query.
The transformer that I am using returns a function, which is the cause of my issues.
I want to sort my result by a user-determined field, but since the field is a function, it won't work.
So the resolver looks like this:
const resolver = (req, param) => {
  return {
    history: async input => {
      let size = input.pageSize || 3;
      let start = (input.page || 0) * size;
      let end = start + size;
      let sortField = (input.sort || {}).field || 'timestamp';
      return fs.promises.readFile("./history/blitz.json", "utf8").then(data =>
        JSON.parse(data)
          .slice(start, end)
          .map(job => historyTransformer(job))
          .sort((a, b) => a[sortField] > b[sortField] ? 1 : a[sortField] < b[sortField] ? -1 : 0)
      );
    }
  };
};
and the transformer:
const historyTransformer = job => {
  return {
    ...job,
    timestamp: input =>
      dateFormat(job.timestamp, input.format || "mm:hh dd-mm-yyyy")
  };
};
I am not sure if I am missing something but is there an easy way of resolving the function call before starting the sorting?
GraphQL fields are resolved in a hierarchical manner, such that the history field has to resolve before any of its child fields (like timestamp) can be resolved. If the child field's resolver transforms the underlying property and your intent is to somehow use that value in the parent resolver (in this case, to do some sorting), that's tricky because you're working against the execution flow.
Because you're working with dates, you should consider whether the format of the field even matters. As a user, if I sort by timestamp, I expect the results to be sorted chronologically. Even if the response is formatted to put the time first, I probably don't want dates with the same times but different years grouped together. Of course, I don't know your business requirements and it still doesn't solve the problem if we're working with something else, like translations, which would cause the same problem.
There are two solutions I can think of:
Update your schema and lift the format argument into the parent field. This is easier to implement, but obviously not as nice as putting the argument on the field it applies to (see the sketch after this list).
Keep the argument where it is, but parse the info parameter passed to the resolver to determine the value of the argument inside the parent resolver. This way, you can keep the argument on the child field, but move the actual formatting logic into the parent resolver.
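As a rough sketch of option 1 (names assumed from your snippets, not a definitive implementation), the history resolver could sort on the raw timestamp first and only then apply the formatting itself:
const fs = require("fs");
const dateFormat = require("dateformat"); // or whichever formatter you already use

// `format` is lifted onto the history field, so sorting happens on the raw value.
const history = async input => {
  const size = input.pageSize || 3;
  const start = (input.page || 0) * size;
  const sortField = (input.sort || {}).field || "timestamp";
  const jobs = JSON.parse(await fs.promises.readFile("./history/blitz.json", "utf8"));

  return jobs
    .sort((a, b) => (a[sortField] > b[sortField] ? 1 : a[sortField] < b[sortField] ? -1 : 0))
    .slice(start, start + size) // paginate after sorting so pages stay consistent
    .map(job => ({
      ...job,
      // timestamp is now a plain formatted string, not a child resolver function
      timestamp: dateFormat(job.timestamp, input.format || "mm:hh dd-mm-yyyy"),
    }));
};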