Find and replace based on criteria with Google Apps Script code

I have two sheets housed in the same Spreadsheet object; for reference, I'll call them Data and Change-Log.
Data Sheet
ID Country Attribute Value
1 USA X 100
2 RUS X 77
3 MEX Y 32
4 GER Z 111
...
Change-Log Sheet
Country Attribute Value
USA X 84
GER Z 97
Updated Data Sheet
ID Country Attribute Value
1 USA X 84
2 RUS X 77
3 MEX Y 32
4 GER Z 97
...
Currently I am pulling data into the Data sheet via an API; the sheet is cleared and updated monthly.
Ideally I would like to write a helper function that finds the entries shared between the two sheets and overwrites the values in the Data sheet with the corresponding values from the Change-Log.
In the example above, I would want to match rows where the Country and Attribute columns are the same, then give preference to the Change-Log value by overwriting it into the Data sheet where appropriate.

We have very little information about how your sheets look and what you are trying to do.
You can do something like this:
function test() {
const ss = SpreadsheetApp.getActiveSpreadsheet();
const dataSheet = ss.getSheetByName("Data");
const dataRange = dataSheet.getRange(2, 1, dataSheet.getLastRow() - 1, 4);
const dataValues = dataRange.getValues(); // Get at once, faster execution
const changeLogSheet = ss.getSheetByName("Change-Log");
const replacements = changeLogSheet.getRange(2, 1, changeLogSheet.getLastRow() - 1, 3).getValues(); // Faster
const newValues = [];
dataValues.forEach(row => {
const replacementValue = getReplacementValue_(row[1], row[2], replacements);
// Compare against null so that a legitimate replacement value of 0 or "" is not ignored
newValues.push([row[0], row[1], row[2], (replacementValue !== null ? replacementValue : row[3])]);
});
dataRange.setValues(newValues); // Faster
}
function getReplacementValue_(country, attribute, replacements) {
const matchingRows = replacements.filter(row => row[0] == country && row[1] == attribute);
return matchingRows.length > 0 ? matchingRows[0][2] : null;
}
I'm not sure if you want to overwrite all the data each time, or start at some row.
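For larger sheets, the per-row filter above rescans the whole change log for every data row. A minimal sketch of an alternative, assuming the same column layout as the example (ID, Country, Attribute, Value in Data; Country, Attribute, Value in Change-Log), builds a lookup keyed on country and attribute once. The names makeKey and applyChangeLog are hypothetical helpers, not part of Apps Script, and the sample rows are the ones from the question.

```javascript
// Hypothetical helper: joins country and attribute into one lookup key.
function makeKey(country, attribute) {
  return JSON.stringify([String(country), String(attribute)]);
}

// dataValues: rows of [id, country, attribute, value]
// changeLogValues: rows of [country, attribute, value]
// Returns a new array of data rows with change-log values applied.
function applyChangeLog(dataValues, changeLogValues) {
  const overrides = new Map();
  changeLogValues.forEach(row => {
    const key = makeKey(row[0], row[1]);
    // Keep the first entry per key, matching the filter-based version above.
    if (!overrides.has(key)) overrides.set(key, row[2]);
  });
  return dataValues.map(row => {
    const key = makeKey(row[1], row[2]);
    return overrides.has(key)
      ? [row[0], row[1], row[2], overrides.get(key)]
      : row.slice();
  });
}

const updated = applyChangeLog(
  [[1, 'USA', 'X', 100], [2, 'RUS', 'X', 77], [3, 'MEX', 'Y', 32], [4, 'GER', 'Z', 111]],
  [['USA', 'X', 84], ['GER', 'Z', 97]]
);
```

In a script, the two input arrays would come from getValues() on each sheet and the result would be written back with a single setValues() call.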
Note 1: The Stack Overflow community will be able to help you more if your question is in the form of "here is my goal and my code so far, and these are the errors and problems". If you only ask "how do I do something", we won't be able to help you that much.
Note 2: If some answer helps you, upvote/accept it. If not, edit your question to provide more information. In this case it would be good if you can make a copy of your sheet (if data is sensitive then replace with mock data) and share it with anyone on the internet and post a link in your question. Something like "Data" sheet, "Change Log" sheet and "Desired Data" sheet.

Related

Using Apps Script for Google Sheets, Insert a Header Row after change in cell value

I have not done any scripting for about 7 years. I used to write a lot of VBA scripts with SQL for Access and Excel. I am now trying to automate a Google sheet row insertion with formatting and a cell value edit in the new row.
I am hoping someone who can script with their eyes closed will help this old lady with a script to automate a volunteer task for an all volunteer food coop so that I can pass this task on to someone with less spreadsheet skill.
I have written the task in my own fudgy language and am hoping someone can translate it into the proper language and syntax. Here it is:
function (createReceivingSheet)
for each cell in range A2: A500
if right(this.cell, 6) != right(this.cell.-1, 6)
insert.row.above(this.cell)
format(new row) bold, underline, font:arial, 12pt
merge (newrow.column1:column5)
format (newrow.cell.column1) border:bottom
case edit(newrow,cell.column1)
when original.cell = "02 GM *" then "GO MACRO",
when original.cell = "000 *" then "PRODUCE"
end
End function
In other words I want to insert a formatted title row above each change in vendor where the vendor code is the first 6 characters of the cells in column A.
I need the script to iterate through Col A:
compare each cell with the cell above
if the first 6 characters of the current cell are not equal to the first 6 characters of the cell above, then insert a row above
format the newly inserted row in bold, underline, 12 pt
merge the first 5 columns in the newly inserted row
format the merged cells in the newly inserted row with a bottom border
populate the first cell (column A) of the newly inserted row with a value based on a set of case statements when the original cell = "X" or "Y" or "Z" etc.
I do not know if this is an appropriate question to ask on this forum. Please let me know if you can help or if this is too much to ask on this forum.
I think this covers most of your pseudocode. At the bottom, you'll have to add the values that you want to put into the new cells.
This onOpen function adds a new menu after Help. Select a range then use the menu item to run partitionVendors.
There's good documentation here for when you need to add more features: https://developers.google.com/apps-script/reference/spreadsheet/spreadsheet-app?hl=en
const VENDOR_PREFIX_LENGTH = 6;
const MENU_TITLE = "Stack Overflow";
/**
* Adds a menu to the spreadsheet when the file is opened
*/
function onOpen(e) {
var ui = SpreadsheetApp.getUi();
const menu = ui.createMenu(MENU_TITLE)
menu.addItem("partition vendors", "partitionVendors")
.addToUi();
}
function partitionVendors() {
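const sheet = SpreadsheetApp.getActiveSheet(); // a Sheet, not the whole Spreadsheet
const range = SpreadsheetApp.getActiveRange();
// Walk bottom-up so that inserted rows do not shift the rows still to be checked
for (let r = range.getNumRows(); r > 1; r--) {
// String() guards against numeric cells, which have no slice()
const currentCellValue = String(range.getCell(r, 1).getValue());
const previousCellValue = String(range.getCell(r - 1, 1).getValue());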
if (currentCellValue.slice(0, VENDOR_PREFIX_LENGTH) !== previousCellValue.slice(0, VENDOR_PREFIX_LENGTH)) {
const newRowNum = range.getLastRow() - range.getNumRows() + r;
sheet.insertRowBefore(newRowNum);
// assuming the selection is in column A, merge columns A to E
const newCellRange = sheet.getRange(`A${newRowNum}:E${newRowNum}`);
newCellRange.merge()
.setFontWeight("bold")
.setFontLine("underline")
.setFontFamily("Arial")
.setFontSize(12)
.setBorder(null, null, true, null, null, null);
// populate values here
if (currentCellValue === "value1") {
newCellRange.setValue("header for value1");
} // else if ...
}
}
}
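The "populate values here" step and the comparison itself can be tried without a spreadsheet. The following is a minimal sketch, assuming the vendor code is the first 6 characters of each cell as the question describes. The names VENDOR_TITLES, titleFor and findVendorBreaks are hypothetical, only the two title pairs given in the question are listed, and the sample cell values are made up.

```javascript
const PREFIX_LENGTH = 6; // vendor code length, as described in the question

// Hypothetical lookup table: cell prefix -> header title.
// Only the two pairs mentioned in the question are known here.
const VENDOR_TITLES = [
  { prefix: '02 GM ', title: 'GO MACRO' },
  { prefix: '000 ', title: 'PRODUCE' },
];

// Returns the header title for a cell, or '' when no prefix matches.
function titleFor(cellValue) {
  const text = String(cellValue);
  const match = VENDOR_TITLES.find(entry => text.startsWith(entry.prefix));
  return match ? match.title : '';
}

// Returns { index, title } for every position where the vendor code differs
// from the cell above it. index is 0-based into columnValues.
function findVendorBreaks(columnValues) {
  const breaks = [];
  for (let i = 1; i < columnValues.length; i++) {
    const current = String(columnValues[i]).slice(0, PREFIX_LENGTH);
    const previous = String(columnValues[i - 1]).slice(0, PREFIX_LENGTH);
    if (current !== previous) {
      breaks.push({ index: i, title: titleFor(columnValues[i]) });
    }
  }
  return breaks;
}

// Made-up sample data with 6-character vendor codes.
const breaks = findVendorBreaks([
  '02 GM 1 bar', '02 GM 2 cookie', '000 PR apples', '000 PR pears',
]);
```

When applying the result to a sheet, processing the breaks from last to first keeps the earlier row numbers valid, which is the same reason the loop above runs bottom-up.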

Advice on how to optimise Google AppScript code [duplicate]

I've just written my first Google Apps Script, ported from VBA, which formats a column of customer order information (thanks to all of you for your direction).
Description:
The code identifies state codes by their - prefix, then combines the following first name with a last name (if it exists). It then writes "Order complete" where the last name would have been. Finally, it inserts a necessary blank cell if there is no gap between the orders (see image below).
Problem:
The issue is processing time. It cannot handle longer columns of data. I am warned that
Method Range.getValue is heavily used by the script.
Existing Optimizations:
Per the responses to this question, I've tried to keep as many variables outside the loop as possible, and also improved my if statements. @MuhammadGelbana suggests calling the Range.getValue method just once and moving around with its value...but I don't understand how this would/could work.
Code:
function format() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var s = ss.getActiveSheet();
var lastRow = s.getRange("A:A").getLastRow();
var row, range1, cellValue, dash, offset1, offset2, offset3;
//loop through all cells in column A
for (row = 0; row < lastRow; row++) {
range1 = s.getRange(row + 1, 1);
//if cell substring is number, skip it
//because substring cannot process numbers
cellValue = range1.getValue();
if (typeof cellValue === 'number') {continue;};
dash = cellValue.substring(0, 1);
offset1 = range1.offset(1, 0).getValue();
offset2 = range1.offset(2, 0).getValue();
offset3 = range1.offset(3, 0).getValue();
//if -, then merge offset cells 1 and 2
//and enter "Order complete" in offset cell 2.
if (dash === "-") {
range1.offset(1, 0).setValue(offset1 + " " + offset2);
//Translate
range1.offset(2, 0).setValue("Order complete");
};
//The real slow part...
//if - and offset 3 is not blank, then INSERT CELL
if (dash === "-" && offset3) {
//select from three rows down to last
//move selection one more row down (down 4 rows total)
s.getRange(row + 1, 1, lastRow).offset(3, 0).moveTo(range1.offset(4, 0));
};
};
}
Formatting Update:
For guidance on formatting the output with font or background colors, check this follow-up question here. Hopefully you can benefit from the advice these pros gave me :)
Issue:
Using .getValue() and .setValue() in a loop results in increased processing time.
Documentation excerpts:
Minimize calls to services:
Anything you can accomplish within Google Apps Script itself will be much faster than making calls that need to fetch data from Google's servers or an external server, such as requests to Spreadsheets, Docs, Sites, Translate, UrlFetch, and so on.
Look ahead caching:
Google Apps Script already has some built-in optimization, such as using look-ahead caching to retrieve what a script is likely to get and write caching to save what is likely to be set.
Minimize "number" of read/writes:
You can write scripts to take maximum advantage of the built-in caching, by minimizing the number of reads and writes.
Avoid alternating read/write:
Alternating read and write commands is slow
Use arrays:
To speed up a script, read all data into an array with one command, perform any operations on the data in the array, and write the data out with one command.
Slow script example:
/**
* Really Slow script example
* Get values from A1:D2
* Set values to A3:D4
*/
function slowScriptLikeVBA(){
const ss = SpreadsheetApp.getActive();
const sh = ss.getActiveSheet();
//get A1:D2 and set it 2 rows down
for(var row = 1; row <= 2; row++){
for(var col = 1; col <= 4; col++){
var sourceCellRange = sh.getRange(row, col, 1, 1);
var targetCellRange = sh.getRange(row + 2, col, 1, 1);
var sourceCellValue = sourceCellRange.getValue();//1 read call per loop
targetCellRange.setValue(sourceCellValue);//1 write call per loop
}
}
}
Notice that two service calls are made per inner-loop iteration (the Spreadsheet ss, Sheet sh and getRange calls are excluded; only the expensive get/set value calls are counted). The two nested loops run 8 times in total, so 8 read calls and 8 write calls are made for a simple copy-paste of a 2x4 array.
In addition, notice that the read and write calls alternate, which makes "look-ahead" caching ineffective.
Total calls to services: 16
Time taken: ~5+ seconds
Fast script example:
/**
* Fast script example
* Get values from A1:D2
* Set values to A3:D4
*/
function fastScript(){
const ss = SpreadsheetApp.getActive();
const sh = ss.getActiveSheet();
//get A1:D2 and set it 2 rows down
var sourceRange = sh.getRange("A1:D2");
var targetRange = sh.getRange("A3:D4");
var sourceValues = sourceRange.getValues();//1 read call in total
//modify `sourceValues` if needed
//sourceValues looks like this two dimensional array:
//[//outer array containing rows array
// ["A1","B1","C1","D1"], //row1(inner) array containing column element values
// ["A2","B2","C2","D2"],
//]
//@see https://stackoverflow.com/questions/63720612
targetRange.setValues(sourceValues);//1 write call in total
}
Total calls to services: 2
Time taken: ~0.2 seconds
References:
Best practices
What does the range method getValues() return and setValues() accept?
Using methods like .getValue() and .moveTo() can be very expensive on execution time. An alternative approach is to use a batch operation where you get all the column values and iterate across the data reshaping as required before writing to the sheet in one call. When you run your script you may have noticed the following warning:
The script uses a method which is considered expensive. Each
invocation generates a time consuming call to a remote server. That
may have critical impact on the execution time of the script,
especially on large data. If performance is an issue for the script,
you should consider using another method, e.g. Range.getValues().
Using .getValues() and .setValues() your script can be rewritten as:
function format() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var s = ss.getActiveSheet();
var lastRow = s.getLastRow(); // more efficient way to get last row
var row;
var data = s.getRange("A:A").getValues(); // gets a [][] of all values in the column
var output = []; // we are going to build a [][] to output result
//loop through all cells in column A
for (row = 0; row < lastRow; row++) {
var cellValue = data[row][0];
var dash = false;
if (typeof cellValue === 'string') {
dash = cellValue.substring(0, 1);
} else { // if a number copy to our output array
output.push([cellValue]);
}
// if a dash
if (dash === "-") {
var name = (data[(row+1)][0]+" "+data[(row+2)][0]).trim(); // build name
output.push([cellValue]); // add row -state
output.push([name]); // add row name
output.push(["Order complete"]); // row order complete
output.push([""]); // add blank row
row++; // jump an extra row to speed things up
}
}
s.clear(); // clear all existing data on sheet
// if you need other data in sheet then could
// s.deleteColumn(1);
// s.insertColumns(1);
// set the values we've made in our output [][] array
s.getRange(1, 1, output.length).setValues(output);
}
Testing your script with 20 rows of data showed it took 4.415 seconds to execute; the above code completes in 0.019 seconds.
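One way to make the batch approach easier to check is to keep the reshaping separate from the sheet reads and writes. The sketch below mirrors the logic of the rewritten format() above as a plain function over an array; reshapeOrders is a hypothetical name and the sample values are made up, since the original data was only shown as an image.

```javascript
// columnValues: flat array of the values in column A, top to bottom.
// Returns a [][] suitable for a single setValues() call.
function reshapeOrders(columnValues) {
  const output = [];
  for (let i = 0; i < columnValues.length; i++) {
    const value = columnValues[i];
    if (typeof value !== 'string') {
      output.push([value]); // numbers are copied through unchanged
      continue;
    }
    if (value.startsWith('-')) {
      // Missing cells past the end of the data are treated as blank.
      const first = columnValues[i + 1] === undefined ? '' : columnValues[i + 1];
      const last = columnValues[i + 2] === undefined ? '' : columnValues[i + 2];
      output.push([value]);                        // state code row
      output.push([(first + ' ' + last).trim()]);  // combined name
      output.push(['Order complete']);
      output.push(['']);                           // blank separator row
      i += 1; // skip the first-name cell; the last-name cell is dropped below
    }
    // Any other string (a consumed last name or a blank) is not copied.
  }
  return output;
}

// Made-up sample column.
const reshaped = reshapeOrders(['-CA', 'Jane', 'Doe', '-NY', 'Sam', '', 1234]);
```

In a script, the input could be built from getValues() by taking the first element of each row, and the returned array written back in one setValues() call.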

Check if a column header title matches a string and if so then return the column index

I am trying to write a script for Google Sheets which writes the date into the next cell when the user enters 'y' in the current cell. I have a script which does this already, but it identifies the column to evaluate by its index, which means that if our data set ever grows these columns always have to stay at the same index, and that is creating a lot of organizational issues.
My question is..
Is it possible to look for the column header title rather than the column index in my code, and if so, what changes would I need to make?
function onEdit(e) {
if ([19].indexOf(e.range.columnStart) == -1 || ['y', 'Y'].indexOf(e.value) == -1) return;
e.range.offset(0, 1)
.setValue(Utilities.formatDate(new Date(), "GMT-5", "MM-dd-yyyy"))
}
This code currently looks at column index 19 and when either 'y' or 'Y' is entered into a cell in column index 19 it then outputs the date in the next cell in column 20.
How can I change the code to look for where the column header = 'Replied?' rather than index?
Goal:
If the following criteria are met:
Value is written into column 19 (S).
Header of column 19 (S) is 'Replied?'.
Value written is either 'Y' or 'y'.
Then write a date into the adjacent cell.
Code:
function onEdit(e) {
var sh = e.source.getActiveSheet();
var row = e.range.getRow();
var col = e.range.getColumn();
var value = String(e.value || '').toUpperCase(); // e.value is missing for multi-cell edits
var header = sh.getRange(1, col).getValue();
if (col === 19 && value === 'Y' && header === 'Replied?') {
sh.getRange(row, 20).setValue(Utilities.formatDate(new Date(), "GMT-5", "MM-dd-yyyy"))
}
}
Explanation:
I've based everything on the event objects passed to your onEdit trigger. For var value I have used toUpperCase() so that we don't have to check for either 'Y' OR 'y', only 'Y' alone. Also, instead of using range.offset I have just specified column 20 specifically in the getRange().setValue().
References:
Event Objects
String.toUpperCase()
One possible way to do this is to name the column/ cell in google sheets. See this website on how to.
Basically:
Open a spreadsheet in Google Sheets.
Select the cells you want to name.
Click Data and then Named ranges. A menu will open on the right.
Type the range name you want.
To change the range, click Spreadsheet Grid.
Select a range in the spreadsheet or type the new range into the text box, then click Ok.
Click Done.
You can then refer to that named cell in google scripts by creating a custom function
function myGetRangeByName(n) { // just a wrapper
return SpreadsheetApp.getActiveSpreadsheet().getRangeByName(n).getA1Notation();
}
Then, in a cell on the spreadsheet:
=myGetRangeByName("Names")
I'd do this.
function onEdit(e) {
  var sheet = e.range.getSheet();
  if (sheet.getName() !== "This") return; // you only want onEdits on the specific sheet
  var editedColumn = e.range.columnStart; // 1-based column number
  var headers = sheet.getRange(1, 1, 1, sheet.getLastColumn()).getValues()[0];
  var header = headers[editedColumn - 1]; // arrays are 0-based
  if (header != "Replied?") return;
  // e.value is missing when more than one cell is edited at once
  if (typeof e.value === "string" && e.value.toLowerCase() == "y") {
    e.range.offset(0, 1)
      .setValue(Utilities.formatDate(new Date(), "GMT-5", "MM-dd-yyyy"));
  }
}
You could also consider using a checkbox, that might be faster for your users.
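The header lookup itself is plain array work and can be checked outside a spreadsheet. A minimal sketch, assuming the headers sit in row 1 and the header text is exactly 'Replied?' as in the question; findHeaderColumn and shouldStamp are hypothetical names, and the other header titles in the sample are made up.

```javascript
// headerRow: the values of row 1, left to right (for example getValues()[0]).
// Returns the 1-based column number of the header, or -1 if it is absent.
function findHeaderColumn(headerRow, title) {
  const index = headerRow.map(h => String(h).trim()).indexOf(title);
  return index === -1 ? -1 : index + 1;
}

// Decides whether an edit should produce a date stamp in the next cell.
function shouldStamp(headerRow, editedColumn, value, title) {
  // The edited value may be missing, e.g. when several cells are edited at once.
  if (typeof value !== 'string') return false;
  return editedColumn === findHeaderColumn(headerRow, title) &&
    value.toUpperCase() === 'Y';
}

// Made-up header row for illustration.
const headers = ['Name', 'Email', 'Replied?', 'Date replied'];
const repliedColumn = findHeaderColumn(headers, 'Replied?');
```

Inside onEdit(e), shouldStamp could be called with the first row of the sheet, e.range.columnStart and e.value; because the column is found by title, moving the 'Replied?' column does not require a code change.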

Removing 'complicated' duplicates

Test File
Sometimes, my lists of emails include duplicate addresses for the same person. For example, Jane's addresses are both "jane.doe@email.com" and "doe.jane@email.com". Her variants include replacing the "." with "-" or "_". At the moment, my duplicates script (upgraded ever so kindly by @Jordan Running and Ed Nelson) takes care of 'strict' duplicates, yet cannot detect that "doe.jane@email.com" is a 'complicated' duplicate of "jane.doe@email.com". Is there a way to delete even these duplicates so that I do not email more than one of Jane's addresses? All of them point to the same inbox, so I need only include one of her addresses.
Here is my current code:
function removeDuplicates() {
const startTime = new Date();
const newData = [];
const sheet = SpreadsheetApp.getActiveSheet();
const data = sheet.getDataRange().getValues();
const numRows = data.length;
const seen = {};
for (var i = 0, row, key; i < numRows && (row = data[i]); i++) {
key = JSON.stringify(row);
if (key in seen) {
continue;
}
seen[key] = true;
newData.push(row);
};
sheet.clearContents();
sheet.getRange(1, 1, newData.length, newData[0].length).setValues(newData);
// Show summary
const secs = (new Date() - startTime) / 1000;
SpreadsheetApp.getActiveSpreadsheet().toast(
Utilities.formatString('Processed %d rows in %.2f seconds (%.1f rows/sec); %d deleted',
numRows, secs, numRows / secs, numRows - newData.length),
'Remove duplicates', -1);
}
Sample File
Fuzzy match test
Notes:
the test was run on the addresses without the @email.com part, because including it distorts the result
use the custom function: =removeDuplicatesFuzzy(B2:B12,0.66)
0.66 is the fuzzy-match threshold (66% similarity).
the right column of the result (Column D) shows the matches found with > 0.66 similarity. A dash (-) means no match was found (a "unique" value)
Background
You may try this library:
https://github.com/Glench/fuzzyset.js
To install it, copy the code from here.
The usage is simple:
function similar_test(string1, string2)
{
string1 = string1 || 'jane.doe@email.com';
string2 = string2 || 'doe.jane@email.com';
var a = FuzzySet(); // declared with var to avoid an implicit global
a.add(string1);
var result = a.get(string2);
Logger.log(result); // [[0.6666666666666667, jane.doe@email.com]]
return result[0][0]; // 0.6666666666666667
}
There's also more info here: https://glench.github.io/fuzzyset.js/
Notes:
please search for more information, e.g. "javascript fuzzy string match". Here's a related question: Javascript fuzzy search that makes sense. Note: the solution should work in Google Sheets (no ECMA-6)
this algorithm is not as smart as a human; it compares strings character by character. A similar string such as don.jeans@email.com scores 84% similar to doe.jane@email.com, although a human can tell it belongs to a completely different person.
Search for my Google Sheets add-on called Flookup. It should do what you want.
For your case, you can use this function:
ULIST(colArray, [threshold])
The parameter details are:
colArray: the column from which unique values are to be returned.
threshold: the minimum percentage similarity between the colArray values that are not unique.
Or you can simply use the Highlight duplicates or Remove duplicates from the add-on menu.
The key feature is that you can adjust the level of strictness by changing the percentage similarity.
Bonus: It will easily catch swaps like jane.doe@email.com / doe.jane@email.com
You can find out more at the official website.
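When the variants differ only in the order of the name parts or in the separator (".", "-", "_"), as described in the question, an exact normalization may be enough and avoids the false positives of fuzzy matching. This is a different technique from the fuzzy-match answers above, not a reimplementation of them. It is a minimal sketch, assuming addresses use "@" and that two addresses with the same name parts at the same domain always belong to one person; canonicalKey and dedupeEmails are hypothetical names.

```javascript
// Builds an order-independent key: name parts are lower-cased, split on
// ".", "_" or "-", sorted, and re-joined in front of the domain.
function canonicalKey(email) {
  const text = String(email).trim().toLowerCase();
  const at = text.lastIndexOf('@');
  const local = at === -1 ? text : text.slice(0, at);
  const domain = at === -1 ? '' : text.slice(at + 1);
  const parts = local.split(/[._-]+/).filter(Boolean).sort();
  return parts.join('.') + '@' + domain;
}

// Keeps the first address seen for each canonical key.
function dedupeEmails(emails) {
  const seen = new Set();
  return emails.filter(email => {
    const key = canonicalKey(email);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

const unique = dedupeEmails([
  'jane.doe@email.com',
  'doe.jane@email.com',
  'jane_doe@email.com',
  'don.jeans@email.com',
]);
```

Unlike a similarity threshold, this keeps don.jeans apart from jane.doe, but it will not catch misspellings or abbreviations.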

google sheets triggers for email notification

I am trying to set up an instantaneous email notification when a certain value (in this case hydrogen sulphide) exceeds a threshold on google sheets.
An example of the data:
h2s VFA
F1 F2
01/10/17 555 893 786
02/10/17 456 980 654
03/10/17 205 1021 875
04/10/17
05/10/17
06/10/17 345 987
I've got the following working code:
function readCell() {
var sheet = SpreadsheetApp.getActiveSpreadsheet();
var h2s_value = sheet.getRange("B2").getValues();
if(h2s_value>500) MailApp.sendEmail('emailaddress@gmail.com', 'High Hydrogen Sulphide Levels', 'Hydrogen Sulphide levels are greater than 500ppm ' + h2s_value );
};
It works when I run the code, and an email is sent if the value of B2 exceeds 500. I would like to automate the code to run every time the value is updated, so that an email is sent instantaneously if the threshold is reached. I tried using onChange triggers, but it's not working.
The problem is that if I put triggers on the main spreadsheet (which records a long list of lots of different parameters), I will get an email notification for every single change made on the spreadsheet, whether it's relevant to the value of interest or not. So I have created another sheet which summarises single key parameters just for that day. The daily key parameters are linked to the main spreadsheet; however, when I make a change on the main spreadsheet, the script doesn't recognise the change, as the value changes indirectly through the link in the formula.
Does anyone know if there is a way to create a trigger to respond to indirect changes on a spreadsheet? i.e. where the formula remains the same but the value changes.
If it's possible to have instantaneous triggers, this would be much preferred over time-driven triggers.
Any help would be much appreciated.
Thanks,
Lisa
--------------------------LATEST VERSION ---------------------------------------
Now I'm trying to work the code with multiple columns including for h2s and VFA (code below only contains code for VFA) , but I haven't been able to define more than one e.value and it only seems to run for the last column. Is it possible to define more than one e.value?
function onEdit (e) {
var ss = e.source;
var range = e.range.columnStart;
var watching_f1 = ss.getRange("Y6:Y");
var watching_f2 = ss.getRange ("Z6:Z");
// Only check the f1 values and send emails if cells in col "Y" changes
if ((watching_f1.getColumn() <= range == range <= watching_f1.getLastColumn())) {
var vfa_f1 = e.values[0];
if(vfa_f1>2500) {
MailApp.sendEmail('email@gmail.com', 'High VFAs in Fermenter 1', 'Hi, High VFAs in Fermenter 1. VFA recorded at ' + vfa_f1);
}};
// Only check the f2 values and send emails if cells in col "Z" changes
if ((watching_f2.getColumn() <= range == range <= watching_f2.getLastColumn())) {
var vfa_f2 = e.values[1];
if(vfa_f2>2500) {
MailApp.sendEmail('email@gmail.com', 'High VFAs in Fermenter 2', 'Hi, High VFAs in Fermenter 2. VFA recorded at ' + vfa_f2);
}};
}
Unless I am misunderstanding the issue, narrowing down the range on which the onEdit() trigger runs the MailApp.sendEmail() would prevent other cell edits on the sheet from sending the extra emails.
function onEdit (e) {
var ss = e.source;
var range = e.range.columnStart;
var watching = ss.getRange("B1:B");
// Only check the values and send emails if cells in col "B" changes
// Both bounds must be joined with &&; a chained `<= ... == ... <=` comparison is true for every column
if (watching.getColumn() <= range && range <= watching.getLastColumn()) {
var h2s_value = Number(e.value); // Use the event object value to save a call to the sheet
if (h2s_value > 500) {
MailApp.sendEmail('emailaddress@gmail.com', 'High Hydrogen Sulphide Levels', 'Hydrogen Sulphide levels are greater than 500ppm ' + h2s_value );
}
}
}
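For the two-column case in the "LATEST VERSION" of the question, one option is to look up the edited column in a small rule table instead of trying to read more than one value from the event. A minimal sketch, assuming columns Y and Z starting at row 6 and a limit of 2500 as in the question's code; RULES, FIRST_DATA_ROW and alertForEdit are hypothetical names.

```javascript
// Hypothetical rule table based on the ranges in the question.
const RULES = [
  { column: 25, label: 'Fermenter 1', limit: 2500 }, // column Y
  { column: 26, label: 'Fermenter 2', limit: 2500 }, // column Z
];
const FIRST_DATA_ROW = 6; // the watched ranges in the question start at row 6

// Returns { subject, body } for a single-cell edit that exceeds its limit,
// or null when the edit is outside the watched columns or within the limit.
function alertForEdit(row, column, rawValue) {
  const rule = RULES.find(r => r.column === column);
  if (!rule || row < FIRST_DATA_ROW) return null;
  if (rawValue === undefined || rawValue === '') return null;
  const value = Number(rawValue); // the edited value may arrive as a string
  if (isNaN(value) || value <= rule.limit) return null;
  return {
    subject: 'High VFAs in ' + rule.label,
    body: 'Hi, High VFAs in ' + rule.label + '. VFA recorded at ' + value,
  };
}
```

Inside onEdit(e) this could be called as alertForEdit(e.range.getRow(), e.range.getColumn(), e.value), passing a non-null result to MailApp.sendEmail(recipient, subject, body). Since sending mail requires authorization, this likely needs an installable edit trigger; a simple onEdit trigger cannot call services that require authorization.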
