I've just written my first Google Apps Script, ported from VBA, which formats a column of customer order information (thanks to all of you for your direction).
Description:
The code identifies state codes by their - prefix, then combines the following first name with a last name (if it exists). It then writes "Order complete" where the last name would have been. Finally, it inserts a necessary blank cell if there is no gap between the orders (see image below).
Problem:
The issue is processing time. It cannot handle longer columns of data. I am warned that
Method Range.getValue is heavily used by the script.
Existing Optimizations:
Per the responses to this question, I've tried to keep as many variables outside the loop as possible, and also improved my if statements. @MuhammadGelbana suggests calling the Range.getValue method just once and moving around with its value, but I don't understand how this would or could work.
Code:
function format() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var s = ss.getActiveSheet();
var lastRow = s.getRange("A:A").getLastRow();
var row, range1, cellValue, dash, offset1, offset2, offset3;
//loop through all cells in column A
for (row = 0; row < lastRow; row++) {
range1 = s.getRange(row + 1, 1);
//if cell substring is number, skip it
//because substring cannot process numbers
cellValue = range1.getValue();
if (typeof cellValue === 'number') {continue;};
dash = cellValue.substring(0, 1);
offset1 = range1.offset(1, 0).getValue();
offset2 = range1.offset(2, 0).getValue();
offset3 = range1.offset(3, 0).getValue();
//if -, then merge offset cells 1 and 2
//and enter "Order complete" in offset cell 2.
if (dash === "-") {
range1.offset(1, 0).setValue(offset1 + " " + offset2);
//Translate
range1.offset(2, 0).setValue("Order complete");
};
//The real slow part...
//if - and offset 3 is not blank, then INSERT CELL
if (dash === "-" && offset3) {
//select from three rows down to last
//move selection one more row down (down 4 rows total)
s.getRange(row + 1, 1, lastRow).offset(3, 0).moveTo(range1.offset(4, 0));
};
};
}
Formatting Update:
For guidance on formatting the output with font or background colors, check this follow-up question here. Hopefully you can benefit from the advice these pros gave me :)
Issue:
Usage of .getValue() and .setValue() in a loop resulting in increased processing time.
Documentation excerpts:
Minimize calls to services:
Anything you can accomplish within Google Apps Script itself will be much faster than making calls that need to fetch data from Google's servers or an external server, such as requests to Spreadsheets, Docs, Sites, Translate, UrlFetch, and so on.
Look ahead caching:
Google Apps Script already has some built-in optimization, such as using look-ahead caching to retrieve what a script is likely to get and write caching to save what is likely to be set.
Minimize "number" of read/writes:
You can write scripts to take maximum advantage of the built-in caching, by minimizing the number of reads and writes.
Avoid alternating read/write:
Alternating read and write commands is slow
Use arrays:
To speed up a script, read all data into an array with one command, perform any operations on the data in the array, and write the data out with one command.
Slow script example:
/**
* Really Slow script example
* Get values from A1:D2
* Set values to A3:D4
*/
function slowScriptLikeVBA(){
const ss = SpreadsheetApp.getActive();
const sh = ss.getActiveSheet();
//get A1:D2 and set it 2 rows down
for(var row = 1; row <= 2; row++){
for(var col = 1; col <= 4; col++){
var sourceCellRange = sh.getRange(row, col, 1, 1);
var targetCellRange = sh.getRange(row + 2, col, 1, 1);
var sourceCellValue = sourceCellRange.getValue();//1 read call per loop
targetCellRange.setValue(sourceCellValue);//1 write call per loop
}
}
}
Notice that two calls are made per loop iteration (the Spreadsheet ss, Sheet sh, and getRange calls are excluded; only the expensive get/set value calls are counted). With the two nested loops, 8 read calls and 8 write calls are made in this example for a simple copy-paste of a 2x4 array.
In addition, notice that the read and write calls alternate, making "look-ahead" caching ineffective.
Total calls to services: 16
Time taken: ~5+ seconds
Fast script example:
/**
* Fast script example
* Get values from A1:D2
* Set values to A3:D4
*/
function fastScript(){
const ss = SpreadsheetApp.getActive();
const sh = ss.getActiveSheet();
//get A1:D2 and set it 2 rows down
var sourceRange = sh.getRange("A1:D2");
var targetRange = sh.getRange("A3:D4");
var sourceValues = sourceRange.getValues();//1 read call in total
//modify `sourceValues` if needed
//sourceValues looks like this two dimensional array:
//[//outer array containing rows array
// ["A1","B1","C1","D1"], //row1(inner) array containing column element values
// ["A2","B2","C2","D2"],
//]
//@see https://stackoverflow.com/questions/63720612
targetRange.setValues(sourceValues);//1 write call in total
}
Total calls to services: 2
Time taken: ~0.2 seconds
References:
Best practices
What does the range method getValues() return and setValues() accept?
Using methods like .getValue() and .moveTo() can be very expensive on execution time. An alternative approach is to use a batch operation where you get all the column values and iterate across the data reshaping as required before writing to the sheet in one call. When you run your script you may have noticed the following warning:
The script uses a method which is considered expensive. Each
invocation generates a time consuming call to a remote server. That
may have critical impact on the execution time of the script,
especially on large data. If performance is an issue for the script,
you should consider using another method, e.g. Range.getValues().
Using .getValues() and .setValues() your script can be rewritten as:
function format() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var s = ss.getActiveSheet();
var lastRow = s.getLastRow(); // more efficient way to get last row
var row;
var data = s.getRange("A:A").getValues(); // gets a [][] of all values in the column
var output = []; // we are going to build a [][] to output result
//loop through all cells in column A
for (row = 0; row < lastRow; row++) {
var cellValue = data[row][0];
var dash = false;
if (typeof cellValue === 'string') {
dash = cellValue.substring(0, 1);
} else { // if a number copy to our output array
output.push([cellValue]);
}
// if a dash
if (dash === "-") {
var name = (data[(row+1)][0]+" "+data[(row+2)][0]).trim(); // build name
output.push([cellValue]); // add row -state
output.push([name]); // add row name
output.push(["Order complete"]); // row order complete
output.push([""]); // add blank row
row++; // jump an extra row to speed things up
}
}
s.clear(); // clear all existing data on sheet
// if you need other data in sheet then could
// s.deleteColumn(1);
// s.insertColumns(1);
// set the values we've made in our output [][] array
s.getRange(1, 1, output.length).setValues(output);
}
Testing your script with 20 rows of data revealed it took 4.415 seconds to execute, while the above code completes in 0.019 seconds.
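One way to make the reshaping easier to check is to separate it from the sheet I/O. The sketch below is a variant of the loop above, not the original answer's code: reshapeOrders is a hypothetical helper name, it takes the 2D array returned by getValues(), and it adds a bounds guard (my assumption) so it never reads past the end of the data.

```javascript
// Hypothetical helper: reshapes the column values purely in memory.
// `data` is a 2D array such as the one returned by Range.getValues().
function reshapeOrders(data) {
  var output = [];
  for (var row = 0; row < data.length; row++) {
    var cellValue = data[row][0];
    if (typeof cellValue !== 'string') {
      output.push([cellValue]); // non-strings (e.g. numbers) are copied through
      continue;
    }
    if (cellValue.substring(0, 1) === '-') {
      // Guard against a state code sitting in the last rows of the data.
      var first = row + 1 < data.length ? data[row + 1][0] : '';
      var last = row + 2 < data.length ? data[row + 2][0] : '';
      output.push([cellValue]);                     // -state
      output.push([(first + ' ' + last).trim()]);   // merged name
      output.push(['Order complete']);
      output.push(['']);                            // blank row between orders
      row++; // skip the first-name row; the last-name row is dropped below
    }
    // any other string (old last names, blanks) is not copied to the output
  }
  return output;
}

// Possible use inside Apps Script (one read, one write):
//   var s = SpreadsheetApp.getActiveSheet();
//   var output = reshapeOrders(s.getRange(1, 1, s.getLastRow(), 1).getValues());
//   s.getRange(1, 1, output.length, 1).setValues(output);
```

Because the helper never touches SpreadsheetApp, it can be tried on a small hand-written array before running it against a real sheet.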
Related
I have a problem using setFormula correctly in Apps Script. I am trying to use setFormula on a range of cells of indeterminate size, but I do not know how to make the row references in the formula increase with each row instead of pointing at one specific range. The script I am trying to write applies a condition: if a cell in the range contains information, put the formula in a cell on that row.
function formulas() {
var activeSheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Sheet 1");
var rows = activeSheet.getMaxRows();
for(var i=7; i <= rows; i++){
var workingCell = activeSheet.getRange(i, 3).getValue();
if(workingCell != ""){
activeSheet.getRange(i, 4).setFormula("=$B$5"); //this is fine
activeSheet.getRange(i, 5).setFormula("=((100/H7)*I7)/100"); //but this not
}
}
}
How can I make it so that row 8 gets "=((100/H8)*I8)/100", and so on?
EDIT
The problem is that I apply the formula to many cells, and the row references should increase according to the row in which I place the formula. If a formula is added in D9 and D10, then the ranges are H9, I9 and H10, I10.
The simplest "fix" , as pointed out in a comment, is to concatenate your i loop variable into the formula, like this:
function formulas() {
var activeSheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Sheet 1");
var rows = activeSheet.getLastRow(); //maxRows consider blank rows, you don't need those
for(var i=7; i <= rows; i++){
var workingCell = activeSheet.getRange(i, 3).getValue();
if(workingCell != ""){
activeSheet.getRange(i, 4).setFormula("=$B$5");
activeSheet.getRange(i, 5).setFormula("=((100/H" +i+ ")*I" +i+ ")/100");
}
}
}
Anyway, this function executes too many gets and sets against the spreadsheet, and this will perform poorly as your sheet grows. You should try to minimize all your sets and gets by issuing them in bulk, that is, against a bigger range rather than cell-by-cell.
Your use-case has a problem with this approach because you have some blank spots in your range (when workingCell is blank). If setting a "blank" formula for those values is not an issue for you, then you can speed your script greatly by using this:
function formulas() {
var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Sheet 1"); //not necessarily active
var workingCells = sheet.getSheetValues(7, 3, -1, 1); //-1 == lastRow
var r1c1formulas = [];
for (var i=0; i < workingCells.length; i++){
if (workingCells[i][0] != "") {
r1c1formulas.push(['=R5C2', '=((100/R[0]C[3])*R[0]C[4])/100']);
} else
r1c1formulas.push(['=""','=""']);
}
sheet.getRange(7, 4, workingCells.length, 2).setFormulasR1C1(r1c1formulas);
}
The 2nd "trick" is to use the formulas in R1C1 notation instead of regular A1 style. Check the setFormulaR1C1 documentation here.
The R1C1 notation may seem daunting at first but is rather simple; I'd say it is simpler than the A1 one. I'll try to summarize it here. R stands for row and C for column, and each letter is followed by a number: the row number after R and the column number after C (a number instead of a column letter). So =$B$5 is written as =R5C2.
The last difference is the relative reference. In A1 notation you just leave out the '$' signs, which is not really intuitive and not all that flexible when you're trying to set a bunch of formulas at once (exactly your use case). In A1 notation each relative formula is a different string: =B1 is not the same text as =C1, even though the two could be "the same" relative formula when set on two cells in the same row and consecutive columns.
Anyway, on the R1C1 notation the relative reference is counted as number of rows and columns from the cell that is the reference.
So, when you set the formula =H7*I7 to cell E7, you count that H is 3 columns ahead of E and I 4. And it is all on the same row, so zero row difference. Lastly, to write a relative reference you wrap the number in []. Therefore =H7 * I7 on E7 becomes =R[0]C[3] * R[0]C[4].
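To double-check the counting described above, here is a small sketch that derives the relative R1C1 reference from two plain A1 addresses. The helper names (columnLetterToNumber, parseA1, relativeR1C1) are hypothetical and not part of Apps Script, and the sketch assumes simple references without '$' signs or sheet names.

```javascript
// Hypothetical helpers for illustration only.

// Converts a column label such as "H" or "AA" to its 1-based number.
function columnLetterToNumber(letters) {
  var n = 0;
  for (var i = 0; i < letters.length; i++) {
    n = n * 26 + (letters.charCodeAt(i) - 64); // 'A' has char code 65
  }
  return n;
}

// Splits a plain A1 reference like "E7" into row and column numbers.
function parseA1(a1) {
  var m = /^([A-Z]+)(\d+)$/.exec(a1);
  if (!m) throw new Error('Not a plain A1 reference: ' + a1);
  return { row: parseInt(m[2], 10), col: columnLetterToNumber(m[1]) };
}

// Relative R1C1 reference to `referencedCell` as seen from `formulaCell`.
function relativeR1C1(formulaCell, referencedCell) {
  var from = parseA1(formulaCell);
  var to = parseA1(referencedCell);
  return 'R[' + (to.row - from.row) + ']C[' + (to.col - from.col) + ']';
}

// The formula from the answer, as seen from E7:
var formula = '=((100/' + relativeR1C1('E7', 'H7') + ')*' +
              relativeR1C1('E7', 'I7') + ')/100';
// formula is '=((100/R[0]C[3])*R[0]C[4])/100'
```

The same string can then be pushed for every row, since the offsets do not change from one row to the next.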
Test File
Sometimes, my lists of emails include duplicate addresses for the same person. For example, Jane's addresses are both "jane.doe@email.com" and "doe.jane@email.com". Her variants include replacing the "." with "-" or "_". At the moment, my duplicates script (upgraded ever so kindly by @Jordan Running and Ed Nelson) takes care of 'strict' duplicates, yet cannot detect that "doe.jane@email.com" is a 'complicated' duplicate of "jane.doe@email.com". Is there a way to delete even these duplicates such that I do not email more than one of Jane's addresses? All of them point to the same inbox, so I need only include one of her addresses.
Here is my current code:
function removeDuplicates() {
const startTime = new Date();
const newData = [];
const sheet = SpreadsheetApp.getActiveSheet();
const data = sheet.getDataRange().getValues();
const numRows = data.length;
const seen = {};
for (var i = 0, row, key; i < numRows && (row = data[i]); i++) {
key = JSON.stringify(row);
if (key in seen) {
continue;
}
seen[key] = true;
newData.push(row);
};
sheet.clearContents();
sheet.getRange(1, 1, newData.length, newData[0].length).setValues(newData);
// Show summary
const secs = (new Date() - startTime) / 1000;
SpreadsheetApp.getActiveSpreadsheet().toast(
Utilities.formatString('Processed %d rows in %.2f seconds (%.1f rows/sec); %d deleted',
numRows, secs, numRows / secs, numRows - newData.length),
'Remove duplicates', -1);
}
Sample File
Fuzzy match test
Notes:
used without the @email.com part, because it distorts the result
use the custom function: =removeDuplicatesFuzzy(B2:B12,0.66)
0.66 is the minimum fuzzy-match score (66% similarity).
the right column of the result (Column D) shows the matches found with a score above 0.66. A dash (-) means no match was found (a "unique" value)
Background
You may try this library:
https://github.com/Glench/fuzzyset.js
To install it, copy the code from here.
The usage is simple:
function similar_test(string1, string2)
{
  string1 = string1 || 'jane.doe@email.com';
  string2 = string2 || 'doe.jane@email.com';
  var a = FuzzySet(); // declared with var to avoid an implicit global
  a.add(string1);
  var result = a.get(string2);
  Logger.log(result); // [[0.6666666666666667, jane.doe@email.com]]
  return result[0][0]; // 0.6666666666666667
}
There's also more info here: https://glench.github.io/fuzzyset.js/
Notes:
Please search for more info on "javascript fuzzy string match". Here's a related question: Javascript fuzzy search that makes sense. Note: the solution should work in Google Sheets (no ES6).
This algorithm is not smart like a human; it compares strings character by character. A similar-looking string such as don.jeans@email.com scores 84% similar to doe.jane@email.com, although a human can tell it belongs to a completely different person.
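If the only variations are the ones described in the question (swapped name parts and ".", "-" or "_" as the separator), a stricter alternative to fuzzy scoring is to normalize each address into a key and compare keys exactly. This is a different technique from FuzzySet, sketched under that assumption; emailKey and removeSwappedDuplicates are hypothetical names.

```javascript
// Hypothetical helper: builds a comparison key in which the order of the
// name parts and the choice of separator (".", "-", "_") no longer matter.
function emailKey(address) {
  var parts = String(address).toLowerCase().trim().split('@');
  var tokens = parts[0].split(/[._-]+/).filter(function (t) {
    return t !== '';
  });
  tokens.sort(); // "jane.doe" and "doe.jane" both become ["doe", "jane"]
  return tokens.join('.') + '@' + (parts[1] || '');
}

// Keeps the first row seen for each key. `rows` is a 2D array such as the
// one returned by getValues(); `emailColumn` is a 0-based column index.
function removeSwappedDuplicates(rows, emailColumn) {
  var seen = {};
  return rows.filter(function (row) {
    var key = emailKey(row[emailColumn]);
    if (Object.prototype.hasOwnProperty.call(seen, key)) {
      return false;
    }
    seen[key] = true;
    return true;
  });
}
```

Unlike a similarity threshold, this does not confuse don.jeans with doe.jane, but it also misses typos and nicknames, and it would merge two different people whose names are exact permutations of each other.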
Search for my Google Sheets add-on called Flookup. It should do what you want.
For your case, you can use this function:
ULIST(colArray, [threshold])
The parameter details are:
colArray: the column from which unique values are to be returned.
threshold: the minimum percentage similarity between the colArray values that are not unique.
Or you can simply use the Highlight duplicates or Remove duplicates from the add-on menu.
The key feature is that you can adjust the level of strictness by changing the percentage similarity.
Bonus: It will easily catch swaps like jane.doe@email.com / doe.jane@email.com
You can find out more at the official website.
Try as I might I CANNOT decipher the problem that I'm having writing new rows to a sheet. I've done this several times and I've debugged this thoroughly using Logger.log, but I just can't solve it. Here's a summary of what I'm doing, a code snippet, and a log:
What I'm doing:
Adding rows to a sheet (below existing rows)
73 new rows are stored in an array: GradeRows
When I attempt to write the new rows to the sheet, I get this error:
Incorrect range width, was 1 should be 26
Here’s the code including some Logger.logs:
var BeginningRow = LastSGRowSheet + 1;
var EndingRow = BeginningRow + SGPushKtr -1;
Logger.log("BeginningRow =>" + BeginningRow + "<=, SGPushKtr =>" + SGPushKtr + "<=, Ending Row =>" + EndingRow + "<=");
var GradesRangeString = 'A' + BeginningRow + ':' + LastStudentGradesColumnLetter + EndingRow;
Logger.log("GradesRangeString =>" + GradesRangeString + "<=");
StudentGradeSheet.getRange(GradesRangeString).setValues(GradeRows);
The error occurs in that last line of code.
Here’s the log:
[17-12-31 11:51:15:763 EST] BeginningRow =>364<=, SGPushKtr =>73<=, Ending Row =>436<=
[17-12-31 11:51:15:764 EST] GradesRangeString =>A364:Z436<=
Let's say that your data array is dA. Then the number of rows in that array is dA.length and, assuming it's a rectangular array, the number of columns is dA[0].length. So your output command has to be something like this:
sheet.getRange(firstRow,firstColumn,dA.length,dA[0].length).setValues(dA);
If you'd like to learn a little more about this problem check this out.
You could also append each row to the current sheet one row at a time in a loop.
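Before calling setValues, it can help to log the actual shape of the array, since the error message compares the width of the data with the width of the range. The sketch below uses a hypothetical helper name, arrayShape; it only counts rows and row widths and makes no assumption about how GradeRows was built.

```javascript
// Hypothetical diagnostic helper: reports how many rows a 2D array has
// and how many rows there are of each width.
function arrayShape(values) {
  var widths = {};
  values.forEach(function (row) {
    var w = Array.isArray(row) ? row.length : 'not an array';
    widths[w] = (widths[w] || 0) + 1;
  });
  return { rows: values.length, widths: widths };
}

// Possible use in Apps Script before the failing line:
//   Logger.log(JSON.stringify(arrayShape(GradeRows)));
// A rectangular 73 x 26 array would report a single width, 26, with a count of 73.
```

If more than one width shows up, or the single width is not the number of columns in the target range, the array needs fixing before it can be written.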
It's hard to know why GradeRows doesn't match your range without seeing all of your code.
Using Cooper's getRange arguments will likely reveal your problem, and will prevent you from having to update your row and column variables when you make changes to your code. Another issue that gets me sometimes is the fact that the setValues array needs to be exactly the same dimensions as the range. If one row has a different length, it will fail. If the logic I use to create row arrays can result in different lengths, I use the function below to make sure my arrays are symmetric before writing them to a sheet. It is also helpful for debugging.
/**
 * Takes a 2D array whose row arrays may differ in length
 * and pads the shorter rows with empty strings so that
 * every row has the same length.
 * @param {Array[]} ar
 * @return {Array[]}
 */
function symmetric2DArray(ar){
  if (!Array.isArray(ar)) return [['not an array']];
  // a return inside forEach only exits the callback,
  // so the rows are validated before the loops
  if (!ar.every(Array.isArray)) return [['not a 2D array']];
  // find the longest row
  var maxLength = 0;
  ar.forEach(function(row){
    if (row.length > maxLength) maxLength = row.length;
  });
  // pad every shorter row in place
  ar.forEach(function(row){
    while (row.length < maxLength){
      row.push('');
    }
  });
  return ar;
}
How about using appendRow()? That way you don't need to do lots of calculations about the range. You can loop through your data and add it row by row. Something like this:
var myDataArr = [[1,2],[3,4],[5,6]];
myDataArr.forEach(function(arrayItem){
sheet.appendRow([arrayItem[0],arrayItem[1]])
})
// This will output to the sheet in three rows.
// [1][2]
// [3][4]
// [5][6]
Goal: I'm trying to create a behavior tracker for four classes in Google Spreadsheets. The tracker has nine sheets: Class7A, Class7B, Class8A, Class8B, and Mon-Fri summary sheets. The goal was for each ClassXX sheet to have behavior tracking information for an entire week, but for the default view to show only the current day's information.
Attempts: During initial workup (with only the Class7A sheet created), I got this to work using a modification of the script found here (Thank you Jacob Jan Tuinstra!): Optimize Google Script for Hiding Columns
I modified it to check the value in the third row of each column (which held a 1 for Monday, 2 for Tuesday, etc), and if it did not match the numerical equivalent for the day of the week (var d = new Date(); var n = d.getDay();), then it would hide that column. This process was somewhat slow - I'm assuming because of the iterating through each column - but it worked.
Quite excited, I went ahead and added the rest of the sheets and tried again, but the code as written seems to affect only the current sheet. I tried replacing var sheet = ss.getSheets()[0]; with a for loop that iterated through the columns until i>4 (I've since lost that piece of code), with no luck.
Deciding to go back and try adapting the original version of the script to instead explicitly run multiple times for each named sheet, I found that the script no longer seems to work at all. I get various versions of "cannot find XX function in sheet" or "cannot find XX function in Range."
Source: A shared version (with student info scrubbed) can be found here: https://docs.google.com/spreadsheets/d/1OMq4a4_Gh_xyNk_IRy-mwJn5Hq36RXmdAzTzx7dGii0/edit?usp=sharing (editing is on).
Stretch Goal: Ultimately, I need to get this to reliably show only the current day's columns (either through preset ranges (same for each sheet), or the 1-5 values), and I need it to do so for all four ClassXX sheets, but not the summary pages (and preferably more quickly than the iterations). If necessary, I can remove the summary pages and set them up externally, but that's not my first preference. I would deeply appreciate any help with this; so far my attempts have seemed to only take me backwards.
Thanks!
Current code:
function onOpen() {
// get active spreadsheet
var ss = SpreadsheetApp.getActiveSpreadsheet();
// create menu
var menu = [
{name: "Show Today Only", functionName: "hideColumn"},
{name: "Show All Days", functionName: "showColumn"},
{name: "Clear Week - WARNING will delete all data", functionName: "clearWeek"}
];
// add to menu
ss.addMenu("Show Days", menu);
}
var d = new Date();
var n = d.getDay();
function hideColumn() {
// get active spreadsheet
var ss = SpreadsheetApp.getActiveSpreadsheet();
// get first sheet
var sheet = ss.getSheets()[0];
// get data
var data = sheet.getDataRange();
// get number of columns
var lastCol = data.getLastColumn()+1;
Logger.log(lastCol);
// iterate through columns
for(var i=1; i<lastCol; i++) {
if(data.getCell(2, i).getValue() != n) {
sheet.hideColumns(i);
}
}
}
function showColumn() {
// get active spreadsheet
var ss = SpreadsheetApp.getActiveSpreadsheet();
// get first sheet
var sheet = ss.getSheets()[0];
// get data
var data = sheet.getDataRange();
// get number of columns
var lastCol = data.getLastColumn();
// show all columns
sheet.showColumns(1, lastCol);
}
I cannot reproduce the problem of the script not working at all; it works fine for Class7A.
So let's look at the two other problems:
Applying this to all Sheets
Speeding up the script
First let's create some globals we use in both functions
var d = new Date();
var n = d.getDay();
var ss = SpreadsheetApp.getActiveSpreadsheet();
var sheetNames = ss.getSheets().map(function(sheet) {return sheet.getName();});
var classSheets = sheetNames.filter(function(sheetName) {return sheetName.match("Class")});
Now we can iterate over classSheets and get the sheet by name and hide columns in each.
However hiding each individual column is very slow.
The sheet has a very regular structure: every day has 12 columns (except for Friday, which doesn't have the grey bar), so we can just calculate the ranges we want to hide.
function hideColumn() {
classSheets.map(function(sheetName){
var sheet = ss.getSheetByName(sheetName);
if (n == 1) {
// Hide everything after the first three columns + Monday
sheet.hideColumns(3 + 11, 12 * 4);
} else if (n == 5) {
// Hide everything to the left except the leftmost three columns
sheet.hideColumns(3, 4 * 12);
} else {
// Hide everything left of the current day
sheet.hideColumns(3, (n - 1) * 12);
// Hide everything after the current day
sheet.hideColumns(3 + n * 12, (5 - n) * 12 - 1);
}
});
}
Lastly we can shorten showColumn
function showColumn() {
classSheets.map(function(sheetName){
var sheet = ss.getSheetByName(sheetName);
var lastCol = sheet.getLastColumn();
sheet.showColumns(1, lastCol);
});
}
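If the column layout cannot be relied on, the 1-5 day markers mentioned in the question can still be used without hiding one column at a time. The sketch below groups consecutive non-matching columns into runs so that hideColumns is called once per run. hiddenRuns is a hypothetical helper, and treating blank markers as "always visible" is my assumption, not something stated in the question.

```javascript
// Hypothetical helper: given the row of day markers (one entry per column)
// and today's day number, returns [startColumn, count] pairs (1-based)
// covering every consecutive run of columns that belong to another day.
function hiddenRuns(dayRow, today) {
  var runs = [];
  var start = -1; // 0-based index where the current run began, or -1
  for (var i = 0; i <= dayRow.length; i++) {
    // the extra pass with a blank marker closes a run that reaches the end
    var marker = i < dayRow.length ? dayRow[i] : '';
    var hide = marker !== '' && marker !== today; // assumption: blanks stay visible
    if (hide && start === -1) {
      start = i;
    }
    if (!hide && start !== -1) {
      runs.push([start + 1, i - start]);
      start = -1;
    }
  }
  return runs;
}

// Possible use in Apps Script, reading the marker row once per sheet
// (row 3 is taken from the question's description):
//   var markers = sheet.getRange(3, 1, 1, sheet.getLastColumn()).getValues()[0];
//   hiddenRuns(markers, n).forEach(function (run) {
//     sheet.hideColumns(run[0], run[1]);
//   });
```

With five day blocks per sheet this results in at most two hideColumns calls per sheet, the same number the calculated ranges above need.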
I have a small script, and what I'm trying to do is write one value from 'Sheet 1' to 'Sheet 2', wait for the results to load, and compare the cells to see whether they are above 10% or not. I have some =importhtml functions in the spreadsheet, and they take a long time to load. I've tried sleep, Utilities.sleep, and flush. None have worked, maybe because I am putting them in the wrong place.
function compareCells() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var listSheet = ss.getSheetByName('Stocks');
var dataSheet = ss.getSheetByName('Summary');
var listSheetLastRow = listSheet.getLastRow();
var currRow = 1;
for (i = 1; i <= listSheetLastRow; i++) {
if (listSheet.getRange(1, 3).getValue() == 'asd') {
var ticker = listSheet.getRange(currRow, 1).getValue();
dataSheet.getRange(5, 4).setValue(ticker);
var value1 = dataSheet.getRange(15, 4).getValue();
var value2 = dataSheet.getRange(22, 4).getValue();
SpreadsheetApp.flush();
if (value1 > 0.10 && value2 > 0.10) {
listSheet.getRange(currRow, 8).setValue('True');
listSheet.getRange(currRow, 9).setValue(value1);
listSheet.getRange(currRow, 10).setValue(value2);
} else {
listSheet.getRange(currRow, 8).setValue('False');
}
} else {
Browser.msgBox('Script aborted');
return null;
}
currRow++;
}
}
If it is not important that you use the =IMPORTHTML() function in your sheet, the easiest way to do this will be to use UrlFetchApp within Apps Script. Getting the data this way will cause your script to block until the HTML response is returned. You can also create a time-based trigger so your data is always fresh, and the user will not have to wait for the URL fetch when looking at your sheet.
Once you get the HTML response, you can do all of the same processing you'd do in Sheet1 within your script. If that won't work because you have complex processing in Sheet1, you can:
use UrlFetchApp.fetch('http://sample.com/data.html') to retrieve your data
write the data to Sheet1
call SpreadsheetApp.flush() to force the write and whatever subsequent processing
proceed as per your example above
By handling these steps sequentially in your script you guarantee that your later steps don't happen before the data is present.
I had a similar problem but came up with a solution which uses a while loop which forces the script to wait until at least 1 extra column or 1 extra row has been added. So for this to work the formula needs to add data to at least one extra cell other than the one containing the formula, and it needs to extend the sheet's data range (number of rows or columns), for example by adding the formula to the end of the sheet, which looks like what you are doing. Every 0.5 seconds for 10 seconds it checks if extra cells have been added.
dataSheet.getRange(5, 4).setValue(ticker);
var wait = 0;
var timebetween = 500;
var timeout = 10000;
var lastRow = dataSheet.getLastRow();
var lastColumn = dataSheet.getLastColumn();
while (dataSheet.getLastColumn() <= lastColumn && dataSheet.getLastRow() <= lastRow){
Utilities.sleep(timebetween);
wait += timebetween;
if (wait >= timeout){
Logger.log('ERROR: Source data for ' + ticker + ' still empty after ' + (timeout / 1000) + ' seconds.');
throw new Error('Source data for ' + ticker + ' still empty after ' + (timeout / 1000) + ' seconds.');
}
}
If you are reading these two values (
var value1 = dataSheet.getRange(15, 4).getValue();
var value2 = dataSheet.getRange(22, 4).getValue();
) after the =importhtml call, you have to add a sleep call before these two lines of code.
You can also loop until the range filled by the =importhtml call contains values, sleeping on each pass. Also note that, as of April 2014, the script runtime limit is 6 minutes.
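A minimal sketch of such a loop is shown below. waitUntil is a hypothetical helper; the readiness check and the sleep function are passed in as arguments, so the same code can wrap Utilities.sleep in Apps Script. Whether the import formula actually recalculates while the script is sleeping is an assumption worth verifying on your own sheet.

```javascript
// Hypothetical helper: polls `isReady` until it returns true or until
// `timeoutMs` milliseconds have been spent sleeping.
// Returns true if the condition was met, false on timeout.
function waitUntil(isReady, sleep, intervalMs, timeoutMs) {
  var waited = 0;
  while (!isReady()) {
    if (waited >= timeoutMs) {
      return false;
    }
    sleep(intervalMs);
    waited += intervalMs;
  }
  return true;
}

// Possible use in Apps Script, placed right after the ticker is written:
//   var loaded = waitUntil(function () {
//     SpreadsheetApp.flush();
//     return dataSheet.getRange(15, 4).getValue() !== '';
//   }, function (ms) { Utilities.sleep(ms); }, 500, 10000);
//   if (!loaded) throw new Error('Data for ' + ticker + ' did not load in time.');
```

Keeping the timeout well below the overall runtime limit leaves room for the rest of the loop over the tickers.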
I also found this link which might be helpful.
Hope that helps!