Creating a udf in bigquery to match array inputs - javascript

I'm trying to create a UDF to match arrays in BigQuery: essentially, if any value of array x is in array y, I want the result to be true.
e.g. match_rows([4,5,6], [5,6,7]) should return true.
I've written this out but I keep getting syntax errors, and I'm not familiar enough with what I'm doing to debug it, so I was hoping someone could shed some light on what is going on.
Specific error = No matching signature for function match_rows for argument types: ARRAY, ARRAY. Supported signature: match_rows(ARRAY, ARRAY) at [44:1]
CREATE TEMP FUNCTION match_rows(arr1 ARRAY<FLOAT64>, arr2 ARRAY<FLOAT64>)
RETURNS BOOL
LANGUAGE js AS
"""
function findCommonElements2(arr1, arr2) {
// Create an empty object
let obj = {};
// Loop through the first array
for (let i = 0; i < arr1.length; i++) {
// Check if element from first array
// already exist in object or not
if(!obj[arr1[i]]) {
// If it doesn't exist assign the
// properties equals to the
// elements in the array
const element = arr1[i];
obj[element] = true;
}
}
// Loop through the second array
for (let j = 0; j < arr2.length ; j++) {
// Check elements from second array exist
// in the created object or not
if(obj[arr2[j]]) {
return true;
}
}
return false;
}
""";
WITH input AS (
SELECT STRUCT([5,6,7] as row, 'column2' as value) AS test
)
SELECT
match_rows([4,4,6],[4,7,8]),
match_rows(test.row, test.row)
FROM input

You are passing ARRAY<INT64> to your function instead of ARRAY<FLOAT64>, hence the No matching signature error. To solve it, you can write one of the values in each array as a float. The syntax is below:
WITH input AS (
#notice that the first element is a float and so the whole array is ARRAY<FLOAT64>
SELECT STRUCT([5.0,6,7] as row, 'column2' as value) AS test
)
SELECT
#the same as it was done above, first element explicitly as float
match_rows([4.0,4,6],[4,7,8]),
match_rows(test.row, test.row)
FROM input
However, I have tested your function and realised that the syntax of your JavaScript UDF does not follow the documentation. The syntax should be as follows:
CREATE TEMP FUNCTION multiplyInputs(x FLOAT64, y FLOAT64)
RETURNS FLOAT64
LANGUAGE js AS """
//write directly your transformations here
return x*y;
""";
Notice that you do not need to specify function findCommonElements2(arr1, arr2). You can go directly to the body of your function, because the function's name is defined in the CREATE TEMP FUNCTION statement.
In addition, your function was not returning the desired output: the body only declares findCommonElements2 and never calls it or returns its result. For this reason, I have written a simpler JavaScript UDF which returns what you expect. Below are the syntax and a test:
CREATE TEMP FUNCTION match_rows(arr1 ARRAY<FLOAT64>, arr2 ARRAY<FLOAT64>)
RETURNS BOOL
LANGUAGE js AS
"""
// array of common elements between the two arrays
var common_el = [];
for (var i = 0; i < arr1.length; i++) {
  for (var j = 0; j < arr2.length; j++) {
    // compare with ===; a single = would assign arr2[j] to arr1[i]
    if (arr1[i] === arr2[j]) {
      // add to the common_el array when the element is present in both arrays
      common_el.push(arr1[i]);
    }
  }
}
// return true if the array of common elements has at least one element, otherwise false
return common_el.length > 0;
""";
WITH input AS (
SELECT STRUCT([5.0,6,7] as row, 'column2' as value) AS test
)
SELECT
match_rows([4.0,4,6],[4.0,5,7]) as check_1,
match_rows(test.row, test.row) as check_2
FROM input
And the output,
Row check_1 check_2
1 true true
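For comparison, the same overlap check can be sketched in plain JavaScript with a Set instead of nested loops. This is only an illustration: the name hasCommonElement is hypothetical, and it assumes a runtime that supports ES2015 Set. Inside a BigQuery JavaScript UDF only the two statements of the function body would be needed.

```javascript
// Hypothetical helper: true if the two arrays share at least one value.
function hasCommonElement(arr1, arr2) {
  // Put the first array into a Set so each membership test is a single lookup.
  const seen = new Set(arr1);
  // some() stops at the first element of arr2 found in the Set.
  return arr2.some(function (value) {
    return seen.has(value);
  });
}

console.log(hasCommonElement([4, 5, 6], [5, 6, 7])); // true
console.log(hasCommonElement([1, 2], [3, 4]));       // false
```

The Set version visits each array once, whereas the nested loops compare every pair of elements.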

Below is for BigQuery Standard SQL
#standardSQL
CREATE TEMP FUNCTION match_rows(arr1 ANY TYPE, arr2 ANY TYPE) AS (
(SELECT COUNT(1) FROM UNNEST(arr1) el JOIN UNNEST(arr2) el USING(el)) > 0
);
SELECT match_rows([4,5,6], [5,6,7])

Related

Custom array function for MIN() in Google Sheets

Since the MIN() function in Sheets only returns a single value and there is no way to make it work with ARRAYFORMULA, I wanted to make a custom function that would take two arrays and compare the values at each entry, and return an array of the minimums. (I know there's a workaround that uses QUERY, but it wasn't going to work for my purposes)
What I have right now will take two arrays with one row and work perfectly. Unfortunately, it breaks when more than one row is introduced. I'm not sure why, so I'm lost on how to move forward. How can I make it work for any size arrays?
When I feed it any two dimensional range, it throws an error:
"TypeError: Cannot set property '0' of undefined"
on this line finalarray[x][y] = Math.min(arr1[x][y], arr2[x][y]);
The current "working" code:
function MINARRAY(arr1, arr2) {
if (arr1.length == arr2.length && arr1[0].length == arr2[0].length)
{
var finalarray = [[]];
for (x = 0; x < arr1.length; x++)
{
for(y = 0; y < arr1[x].length; y++)
{
finalarray[x][y] = Math.min(arr1[x][y], arr2[x][y]);
}
}
return finalarray;
}
else
{
throw new Error("These arrays are different sizes");
}
}
finalarray is a 2D array, but [[]] only sets its first element to an array. Every element of finalarray needs to be set to an array. Inside the outer loop, before the inner loop starts, add
finalarray[x] = []; // set the `x`th element to an array
or, inside the inner loop, create the row only when it does not exist yet:
finalarray[x] = finalarray[x] || [];
finalarray[x][y] = Math.min(arr1[x][y], arr2[x][y]);
Alternatively, if each row has only a single column,
finalarray[x] = [Math.min(arr1[x][y], arr2[x][y])]
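Putting the fix together, one possible version of the whole function is sketched below. It assumes both arguments arrive as 2D arrays (arrays of rows), as in the question. A single-cell argument may arrive as a plain value instead, which this sketch does not handle.

```javascript
// Element-wise minimum of two equally sized 2D arrays (arrays of rows).
function MINARRAY(arr1, arr2) {
  if (arr1.length !== arr2.length || arr1[0].length !== arr2[0].length) {
    throw new Error("These arrays are different sizes");
  }
  // map() builds a new row for every x, so no row is ever undefined.
  return arr1.map(function (row, x) {
    return row.map(function (value, y) {
      return Math.min(value, arr2[x][y]);
    });
  });
}

console.log(JSON.stringify(MINARRAY([[1, 5], [7, 2]], [[3, 4], [6, 8]]))); // [[1,4],[6,2]]
```

Using map() avoids the index bookkeeping entirely, and it also keeps x and y out of the global scope.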

Detecting datatype of each column in javascript prepending

I am working on an algorithm that detects the datatype of each column and prepends it to the top of the list. Say I have a 2D matrix and
1) I need to detect the datatype of each element
2) Count the occurrences of each type, column-wise
3) Get the most frequent type for each column
4) Prepend the type from 3 at the top of the respective column
Here is what I have so far.
1) For the first step, I know two techniques. One is via jQuery:
$.each(row, function (index,item) {
alert(typeof(item)); //result is object instead of specific type
2) matrix traversing
for (var i=0;i<data[i].length;i++) {
for (var j=0;j<data1.length;j++) {
var r = data1[j][i];
if(isFinite(r) == true && )
numeric++ ;
else {str++;}
I know this is not the best method, but it works well for me in giving the number of string and numeric values.
I know there is an unshift() method that prepends data to the top of a list, but I'm still not sure how it will work for me.
Any help and suggestions are welcome.
Not sure what your end goal is. One would likely say a column is string if any of the values is a string, and number only if all items are numbers.
Be careful about polluting the global space with variables. Usually you would confine them to closed scopes.
Deciding by "more strings or more numbers" also has to account for cases where the counts are equal, e.g. five numbers and five strings.
The following could be a start:
// Should be expanded to fully meet needs.
function isNumber(n) {
return !isNaN(n + 0) /* && various other checks */;
}
// Count types of data in column.
// Return array with one object for each column.
function arrTypeStat(ar) {
var type = [], i, j;
// Loop by column
for (i = 0; i < ar[0].length; ++i) {
// Type-stats for column 'i'
type[i] = {num: 0, str: 0};
// Loop rows for current column.
for (j = 1; j < ar.length; ++j) {
if (isNumber(ar[j][i]))
++type[i].num;
else
++type[i].str;
}
}
return type;
}
// Add a row after header with value equal to 'num' or 'str' depending
// on which type is most frequent. Favoring string if count is equal.
function arrAddColType(ar) {
var type = arrTypeStat(ar), i; // declare i locally so the loop below does not create a global
// Add new array as second item in 'ar'
// as in: add a new (empty) row. This is the row to be filled
// with data types.
ar.splice(1, 0, []);
// Note:
// Favour string over number if count is equal.
for (i = 0; i < type.length; ++i) {
ar[1][i] = type[i].num > type[i].str ? 'num' : 'str';
}
return ar;
}
Sample fiddle. Use console, F12, to view result of test. Note that in this sample the numbers are represented both as quoted and un-quoted.
// Sample array.
var arrTest = [
['album','artist','price'],
[50,5,'50'],
[5,5,'5'],
['Lateralus', 'Tool', '13'],
['Ænima','Tool','12'],
['10,000 days','Tool',14]
];
// Test
var test = arrAddColType(arrTest);
// Log
console.log(
JSON.stringify(test)
.replace(/],\[/g, '],\n['));
Yield:
[["album","artist","price"],
["str","str","num"],
[50,5,"50"],
[5,5,"5"],
["Lateralus","Tool","13"],
["Ænima","Tool","12"],
["10,000 days","Tool",14]]
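The isNumber helper above is flagged as needing expansion. One possible stricter check is sketched below. The name isNumericValue and the exact rules (reject empty strings, null, booleans, NaN and Infinity) are assumptions for illustration, not part of the original answer.

```javascript
// Hypothetical stricter replacement for isNumber().
function isNumericValue(value) {
  if (typeof value === "number") {
    // Rejects NaN, Infinity and -Infinity.
    return Number.isFinite(value);
  }
  if (typeof value === "string") {
    var trimmed = value.trim();
    // Number("") is 0, so empty strings have to be excluded explicitly.
    return trimmed !== "" && Number.isFinite(Number(trimmed));
  }
  // null, undefined, booleans, objects and so on are not treated as numbers.
  return false;
}

console.log(isNumericValue("50"));          // true
console.log(isNumericValue("10,000 days")); // false
console.log(isNumericValue(""));            // false
```

Note that Number() also accepts forms such as "0x10" and "1e3", so further restrictions may be needed depending on the data.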

Merge arrays with similar values keeping the order of the values inside

Here's an interesting task that I faced today and I cannot think of any easy way to achieve the desired result.
Let's suppose we have a database with the following fields (columns): A,B,C,D,E,F,G but we don't know the names nor the count of the fields.
We receive a set of records from this database in the following format: {A:value1, B:value2, ...}.
If a value is not set for the current record the key will be missing too. This means I can receive {A:value} or {C:value1, D:value2} as valid records. The order of the keys will always stay the same. This means {D:value, C:value} is not a valid record.
I'm trying to recover the field names based on the returned records and keep the order of the keys.
For example I can receive records with the following keys:
A,C,D,E,F
D,F,G
A,B,F
From the example above I should be able to restore the original sequence which is A,B,C,D,E,F,G.
The first record gives us A,C,D,E,F.
The second one tells us that G is after F so now we have A,C,D,E,F,G
The third record gives us that B is after A so now we have A,B,C,D,E,F,G
If the order cannot be determined for sure we can use alphabetical order. Example for this is:
A,B
A,C
In the example above we cannot determine if the original order is A,B,C or A,C,B.
Any ideas how to implement this to work in the general case?
I will be implementing this algorithm using JavaScript but PHP, C++ or Java are also welcome.
EDIT: Do not think of the objects as standard JSON objects. In the real environment the structure is much more complex and the language is not pure JavaScript, but a modified version of ECMAScript. If it is easier to understand, think of the keys only as an array of values ['A','B','C',...] and try to merge them, keeping the order.
EDIT 2: After struggling for some time and reading some ideas I came up with the following solution:
Create an object that holds all relations - which column comes after which from each database record.
Create a relation between each a->b, b->c => a->c (inspired by Floyd–Warshall where each distance is considered as 1 if exists).
Create a sorting function (comparator) that will check if two elements can be compared. If not - alphabetical order will be used.
Get only the unique column names and sort them using the comparator function.
You can find the source-code attached below:
var allComparators = {};
var knownObjects = ['A,C,D,E,F','D,F,G','A,B,F'];
var allFields = knownObjects.join(',').split(',');
// Record every "x comes before y" pair seen within a single record.
for (var r = 0; r < knownObjects.length; r++) {
  var arr = knownObjects[r].split(',');
  for (var i = 0; i < arr.length; i++) {
    for (var j = i + 1; j < arr.length; j++) {
      allComparators[arr[i]+'_'+arr[j]] = 1;
    }
  }
}
// Keep only the unique field names.
allFields = allFields.filter(function(value, index, self) {
  return self.indexOf(value) === index;
});
// Transitive closure (Warshall style): the intermediate field k must be the outermost loop.
for (var k = 0; k < allFields.length; k++) {
  for (var i = 0; i < allFields.length; i++) {
    for (var j = 0; j < allFields.length; j++) {
      if (allComparators[allFields[i]+'_'+allFields[k]] && allComparators[allFields[k]+'_'+allFields[j]]) {
        allComparators[allFields[i]+'_'+allFields[j]] = 1;
      }
    }
  }
}
allFields.sort(function(a, b) {
  if (typeof allComparators[a + '_' + b] != 'undefined') {
    return -1;
  }
  if (typeof allComparators[b + '_' + a] != 'undefined') {
    return 1;
  }
  // No known relation: fall back to alphabetical order.
  // A comparator must return a number, not a boolean.
  return a < b ? -1 : (a > b ? 1 : 0);
});
console.log(allFields);
I'll give you the algorithm in a direct and understandable way, but not the code. Please try it yourself and ask for help if required.
I'll express it in two ways.
In technical terms:
Generate a precedence graph (that is, a directed graph)
Topologically sort it
In more detail:
Graph : Map(String, ArrayList< String >) = [Map(key,value)]
each key in the map corresponds to an element (A,B,C,...)
each value contains the elements that should place after the key,e.g for A it is {B,C,D,...}
How to fill the graph :
for each row:
for each element inside the row:
if the element is already as a key in the map
just add its immediate next item to the list*
else
add the element to the map and set the value to immediate next element of it**
* if the element is the last one in the row don't add anything to the map
** if the element is the last one in the row use {}, an empty list, as the value
Topological sort:
List sortedList;
while the map is not empty:
  for each key in the map:
    if value.size() == 0 {
      remove key from the map
      add the key to the sortedList
      for each key' in the map:
        if value'.contains(key)
          value'.remove(key) (and update the map)
    }
invert the sortedList
Test case :
the map for your first input will be:
{ A:{C,B} , C:{D} , D:{E,F} , E:{F} , F:{G} , G:{} , B:{F} }
Sort :
1 - G -> sortedList, map = { A:{C,B} , C:{D} , D:{E,F} , E:{F} , F:{} , B:{F} }
2 - F -> sortedList, map = { A:{C,B} , C:{D} , D:{E} , E:{} , B:{} }
3 - E -> sortedList, map = { A:{C,B} , C:{D} , D:{} , B:{} }
4 - D -> sortedList, map = { A:{C,B} , C:{} , B:{} }
5 - C -> sortedList, map = { A:{B} , B:{} }
6 - B -> sortedList, map = { A:{} }
7 - A -> sortedList, map = { }
sortedList = {G,F,E,D,C,B,A}
Invert - > {A,B,C,D,E,F,G}
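The answer above leaves the code as an exercise. For reference, here is a minimal JavaScript sketch of the same idea. It is not the method walked through above: it swaps in Kahn's algorithm (repeatedly emit a field that has no remaining predecessors), so the list comes out in forward order and needs no inversion. The name recoverFieldOrder is hypothetical, and ties are broken alphabetically, as the question asks.

```javascript
// Hypothetical helper: rebuild a field order from partial, ordered records.
function recoverFieldOrder(records) {
  const successors = new Map(); // field -> Set of fields seen directly after it
  const inDegree = new Map();   // field -> number of distinct direct predecessors
  for (const record of records) {
    for (const field of record) {
      if (!successors.has(field)) {
        successors.set(field, new Set());
        inDegree.set(field, 0);
      }
    }
    for (let i = 0; i + 1 < record.length; i++) {
      const from = record[i];
      const to = record[i + 1];
      if (!successors.get(from).has(to)) {
        successors.get(from).add(to);
        inDegree.set(to, inDegree.get(to) + 1);
      }
    }
  }
  // Fields with no predecessors can be emitted right away.
  const ready = [...inDegree.keys()].filter((f) => inDegree.get(f) === 0);
  const result = [];
  while (ready.length > 0) {
    ready.sort(); // alphabetical tie-break when the order is undetermined
    const field = ready.shift();
    result.push(field);
    for (const next of successors.get(field)) {
      inDegree.set(next, inDegree.get(next) - 1);
      if (inDegree.get(next) === 0) ready.push(next);
    }
  }
  if (result.length !== inDegree.size) {
    throw new Error("Records contain contradictory orderings");
  }
  return result;
}

console.log(recoverFieldOrder([
  ["A", "C", "D", "E", "F"],
  ["D", "F", "G"],
  ["A", "B", "F"]
]).join(",")); // A,B,C,D,E,F,G
```

Re-sorting the ready list on every step is quadratic in the worst case, which is fine for a handful of column names. A priority queue would be the usual replacement for larger inputs.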
Do you think something like this would work?
var oMergedList = [];
function indexOfColumn(sColumnName)
{
for(var i = 0 ; i < oMergedList.length;i++)
if(oMergedList[i]==sColumnName)
return i;
return -1;
}
function getOrdinalIndex(sColumnName)
{
var i = 0;
for( ; i < oMergedList.length;i++)
if(oMergedList[i]>sColumnName)
break;
return i;
}
function merge(oPartial)
{
var nPreviousColumnPosition = -1;
for(var i = 0 ; i < oPartial.length;i++)
{
var sColumnName = oPartial[i] ;
var nColumnPosition = indexOfColumn(sColumnName);
if(nColumnPosition>=0)//already contained
{
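if(nPreviousColumnPosition>=0 && nColumnPosition<nPreviousColumnPosition)//but placed before the previous column
{
oMergedList.splice(nColumnPosition, 1);
// after the removal the previous column has moved one slot left,
// so its old index is now the slot right after it
nColumnPosition = nPreviousColumnPosition;
oMergedList.splice(nColumnPosition, 0, sColumnName);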
}
nPreviousColumnPosition = nColumnPosition;
}
else //new
{
if(nPreviousColumnPosition<0)//no reference column
{
nPreviousColumnPosition = getOrdinalIndex(sColumnName);
}
else// insert after previous column
nPreviousColumnPosition++;
oMergedList.splice(nPreviousColumnPosition, 0, sColumnName);
}
}
}
/* latest sample
merge(['A','C','E','G']);
merge(['A','D']);
merge(['C','D']);
*/
/* default sample
merge(['A','C','D','E','F']);
merge(['D','F','G']);
merge(['A','B','F']);
*/
/* fix order
merge(['A','B']);
merge(['A','C']);
merge(['A','B','C']);
*/
/* insert alphabetically
merge(['B']);
merge(['A']);
merge(['C']);
*/
document.body.innerHTML = oMergedList.join(',');
The only "undefined" parts are where to insert a column when there is no previous column (I put it in the first position), and the A,B .. A,C case, where a column is inserted when it is first seen: A,B then A,C gives A,C,B, and A,C then A,B gives A,B,C.
Edited to use the current array position to fix a previous insertion: if you add [A,B] and then [A,C] you get [A,C,B], but if you then pass [A,B,C] the array is corrected to reflect the new order.
Also, when a new column appears and there is no reference column, it is inserted in alphabetical order.
Fixed the column-correction part. It should now give you the correct result.
As described by JSON.org, there is no such thing as ordered keys in JSON:
An object is an unordered set of name/value pairs.
That being said, it becomes quite easy to merge objects as you don't need the order.
for (var attrname in obj2) { obj1[attrname] = obj2[attrname]; }
Source: How can I merge properties of two JavaScript objects dynamically?

Can I select 2nd element of a 2 dimensional array by value of the first element in Javascript?

I have a JSON response like this:
var errorLog = "[[\"comp\",\"Please add company name!\"],
[\"zip\",\"Please add zip code!\"],
...
Which I'm deserializing like this:
var log = jQuery.parseJSON(errorLog);
Now I can access elements like this:
log[0][1] > "Please add company name!"
Question:
If I have the first value comp, is there a way to directly get the 2nd value by doing:
log[comp][1]
without looping through the whole array.
Thanks for help!
No, unless the 'value' of the first array (maybe I should say the first dimension, or the first row) is also its key. That is, unless it is something like this:
log = {
'comp': 'Please add a company name'
.
.
.
}
Now, log['comp'] or log.comp is legal.
There are two ways to do this, but neither avoids a loop. The first is to loop through the parsed array (log, not the errorLog string) each time you access the items:
var val = '';
for (var i = 0; i < log.length; i++) {
  if (log[i][0] === "comp") {
    val = log[i][1];
    break;
  }
}
The other would be to work your array into an object and access it with object notation.
var errors = {};
for (var i = 0; i < log.length; i++) {
  errors[log[i][0]] = log[i][1];
}
You could then access the relevant value with errors.comp.
If you're only looking once, the first option is probably better. If you may look more than once, it's probably best to use the second system since (a) you only need to do the loop once, which is more efficient, (b) you don't repeat yourself with the looping code, (c) it's immediately obvious what you're trying to do.
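As a sketch of the second option in more recent JavaScript (assuming an ES2015 runtime), the array of pairs can be handed straight to the Map constructor, which still loops internally but keeps the loop out of your code. The name errorsByField is only illustrative.

```javascript
// The parsed response: an array of [key, message] pairs.
const log = [
  ["comp", "Please add company name!"],
  ["zip", "Please add zip code!"]
];

// new Map() accepts any iterable of [key, value] pairs.
const errorsByField = new Map(log);

console.log(errorsByField.get("comp"));    // Please add company name!
console.log(errorsByField.get("missing")); // undefined
```

A Map also avoids clashes with inherited object properties such as "constructor", which a plain object lookup can run into.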
No matter what, you are going to loop through the array somehow, even if the loop is hidden from you by tools like jQuery.
You could create an object from the array as has been suggested like this:
var objLookup = function(arr, search) {
var o = {}, i, l, first, second;
for (i=0, l=arr.length; i<l; i++) {
first = arr[i][0]; // These variables are for convenience and readability.
second = arr[i][1]; // The function could be rewritten without them.
o[first] = second;
}
return o[search];
}
But the faster solution would be to just loop through the array and return the value as soon as it is found:
var indexLookup = function(arr, search){
var i, l;
for (i = 0, l = arr.length; i<l; i++) {
if (arr[i][0] === search) return arr[i][1];
}
return undefined;
}
You could then just use these functions like this in your code so that you don't have to have the looping in the middle of all your code:
var log = [
["comp","Please add company name!"],
["zip","Please add zip code!"]
];
objLookup(log, "zip"); // Please add zip code!
indexLookup(log, "comp"); // Please add company name!
Here is a jsfiddle that shows these in use.
Have you looked at jQuery's grep or inArray method?
See this discussion
Are there any jquery features to query multi-dimensional arrays in a similar fashion to the DOM?

Create an array with tree elements in Javascript

I need to create an array from tree elements in Javascript and being a newbie I don't know how to achieve this.
pseudo-code :
function make_array_of_tree_node(tree_node)
{
for (var i = 0; i < tree_node.childCount; i ++) {
var node = tree_node.getChild(i);
if (node.type ==0) {
// Here I'd like to put a link (node.title) in an array as an element
} else if (node.type ==6) {
// Here the element is a folder, so I need to browse it
make_array_of_tree_node(node)
}
}
}
// Some code
make_array_of_tree_node(rootNode);
// Here I'd like to have access to the array containing all the elements node.title
You can declare an array like this:
var nodes = [];
Then you can add things to it with:
nodes.push(something);
That adds to the end of the array; in that sense it's kind-of like a list. You can access elements by numeric indexes, starting with zero. The length of the array is maintained for you:
var len = nodes.length;
What you'll probably want to do is make the array another parameter of your function.
edit — To illustrate the pattern, if you've got a recursive function:
function recursive(data, array) {
if ( timeToStop ) {
array.push( data.whatever );
}
else {
recursive(data.subData, array);
}
}
Then you can use a second function to be the real API that other code will use:
function actual(data) {
var array = [];
recursive(data, array); // fills up the array
return array;
}
In JavaScript, furthermore, it's common to place the "recursive" function inside the "actual" function, which makes the recursive part private and keeps the global namespace cleaner.
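Applied to the tree from the question, the pattern might look like the sketch below. The node shape (childCount, getChild(i), type, title, with type 0 for links and 6 for folders) is taken from the question's pseudo-code. The names collectLinkTitles and makeNode are hypothetical, and makeNode is only a stand-in so the example can run on its own.

```javascript
// Collect the titles of all link nodes below rootNode.
function collectLinkTitles(rootNode) {
  var titles = [];
  // The recursive part is nested, so it stays private and shares `titles`.
  function visit(node) {
    for (var i = 0; i < node.childCount; i++) {
      var child = node.getChild(i);
      if (child.type === 0) {
        titles.push(child.title); // a link: keep its title
      } else if (child.type === 6) {
        visit(child);             // a folder: descend into it
      }
    }
  }
  visit(rootNode);
  return titles;
}

// Stand-in for a real tree node, for demonstration only.
function makeNode(type, title, children) {
  children = children || [];
  return {
    type: type,
    title: title,
    childCount: children.length,
    getChild: function (i) { return children[i]; }
  };
}

var root = makeNode(6, "root", [
  makeNode(0, "link 1"),
  makeNode(6, "folder", [makeNode(0, "link 2")]),
  makeNode(0, "link 3")
]);
console.log(collectLinkTitles(root)); // [ 'link 1', 'link 2', 'link 3' ]
```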
