Tensorflow.js selecting specific part of Tensor - javascript

In JavaScript, I'm successfully converting an RGB video frame from my webcam to a tensor using TensorFlow.js's tf.browser.fromPixels(). Now I'd like to select only part of this tensor according to values I've previously obtained, specifically a rectangle from the video frame with coordinates [x1,y1,x2,y2]. I'm struggling to do so with the TFJS function tf.stridedSlice() because I can't figure out how its parameters work.
For example, the video frame tensor has shape [480,640,3], and I'd like to cut a rectangle of shape [270,202,3] from it, of which I know the upper-left (x1,y1) and bottom-right (x2,y2) coordinates. How can I achieve this in some manner like:
tensorImg = tf.browser.fromPixels(videoFrame);
tensorCropped = tf.stridedSlice(tensorImg,[x1,y1],[x1+x2,y1+y2]); ???
Thanks.

This should work:
const cropBox = [[0.15, 0.15, 0.85, 0.85]]; // top,left,bottom,right in range 0..1 (not in pixel range)
const outputSize = [200, 200]; // how large we want output to be
const resize = tf.image.cropAndResize(inputTensor, cropBox, [0], outputSize);
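Note that cropBox is an array of arrays (one row per box), and the third parameter ([0]) gives, for each box, the index of the image in the batch that the box is taken from. cropAndResize expects inputTensor to be a 4D tensor of shape [batch, height, width, channels], so a 3D frame from tf.browser.fromPixels() needs a leading batch dimension first, e.g. tensorImg.expandDims(0).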
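If a pixel-exact crop is wanted rather than a resized one, tf.slice() may be simpler than tf.stridedSlice(). The following is a minimal sketch, assuming (x1,y1) and (x2,y2) are pixel coordinates of the upper-left and bottom-right corners and that the tensor is laid out as [height, width, channels], which is what tf.browser.fromPixels() produces. The helper names boxToSlice and cropFrame are hypothetical, not part of TensorFlow.js.

```javascript
// Convert a pixel box into the `begin` and `size` arguments of tf.slice().
// The tensor axes are [height, width, channels], so y comes before x.
function boxToSlice(x1, y1, x2, y2, channels = 3) {
  return {
    begin: [y1, x1, 0],
    size: [y2 - y1, x2 - x1, channels],
  };
}

// Crop a [height, width, channels] tensor. `tf` is passed in so the sketch
// does not assume how TensorFlow.js was loaded.
function cropFrame(tf, tensorImg, x1, y1, x2, y2) {
  const { begin, size } = boxToSlice(x1, y1, x2, y2, tensorImg.shape[2]);
  return tf.slice(tensorImg, begin, size);
}

// Hypothetical box: a crop 202 px wide and 270 px tall.
console.log(boxToSlice(100, 50, 302, 320));
// { begin: [ 50, 100, 0 ], size: [ 270, 202, 3 ] }
```

With the frame from the question this would be called as cropFrame(tf, tensorImg, x1, y1, x2, y2), and the result would have shape [y2 - y1, x2 - x1, 3].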

Related

Get all pixels within a rectangle using corner x,y coordinates

I have a 500 pixel by 500 pixel image that I am using to pull data from a 250,000 index array where each index represents 1 pixel.
The user is able to draw a rectangle at any orientation, and I am capturing the coordinates for each corner.
I am trying to capture each pixel within the rectangle to reference the data array and extract the related data.
I looked at Bresenham algorithm in Javascript and while I can get all the points between each of the coordinates using this solution, I am unable to loop through these points as the paths do not always contain the same number of pixels.
An example of the values I'm looking for using the following coordinates would be:
corner1 = [100,100]
corner2 = [100,105]
corner3 = [105,105]
corner4 = [105,100]
And the result (sort order is not important):
pixelsInRectangle = [
[100,100],[100,101],[100,102],[100,103],[100,104],[100,105],
[101,100],[101,101],[101,102],[101,103],[101,104],[101,105],
[102,100],[102,101],[102,102],[102,103],[102,104],[102,105],
[103,100],[103,101],[103,102],[103,103],[103,104],[103,105],
[104,100],[104,101],[104,102],[104,103],[104,104],[104,105],
[105,100],[105,101],[105,102],[105,103],[105,104],[105,105]
]
One set of coordinates I'm trying to solve for are:
corner1 = [183,194]
corner2 = [190,189]
corner3 = [186,184]
corner4 = [179,190]
Any recommendations would be greatly appreciated!
If the rectangle is not axis-aligned:
Sort the vertices by Y-coordinate.
Take the lowest one. Of the next two vertices in Y order, determine which is the left one and which is the right one.
Run a simple line rasterization scan along the left edge and the right edge simultaneously: for the current integer Y value, calculate the corresponding rounded X-coordinate on the left edge and on the right edge, and output the whole horizontal line between (xleft, y) and (xright, y).
For an edge between vertices (x0,y0) and (x1,y1), the formula is
x = x0 + (x1-x0)*(y-y0)/(y1-y0)
(example for triangle exploiting the same technique)
When a vertex is reached, switch to the equation of the next edge on that side and continue.
This way you fill a triangle, a parallelogram, and another triangle (or just two triangles if two vertices share the same Y).
(You can use Bresenham or DDA, or another line rasterization algorithm if necessary)
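The scan can be sketched as follows. This is a simplified variant, not the exact procedure above: instead of tracking a left edge and a right edge and switching equations at each vertex, it intersects every edge with each scanline and takes the smallest and largest X. It uses the same edge formula and assumes the shape is convex with its corners listed in order around the outline. The function name pixelsInConvexPolygon is hypothetical.

```javascript
// Return every integer pixel [x, y] covered by a convex polygon.
// `vertices` is an array of [x, y] corners listed in order around the shape.
function pixelsInConvexPolygon(vertices) {
  const ys = vertices.map(([, y]) => y);
  const yStart = Math.ceil(Math.min(...ys));
  const yEnd = Math.floor(Math.max(...ys));
  const pixels = [];

  for (let y = yStart; y <= yEnd; y++) {
    const xs = [];
    for (let i = 0; i < vertices.length; i++) {
      const [x0, y0] = vertices[i];
      const [x1, y1] = vertices[(i + 1) % vertices.length];
      // Skip edges that this scanline does not touch.
      if (y < Math.min(y0, y1) || y > Math.max(y0, y1)) continue;
      if (y0 === y1) {
        // Horizontal edge lying on this scanline: both endpoints count.
        xs.push(x0, x1);
      } else {
        // x = x0 + (x1-x0)*(y-y0)/(y1-y0)
        xs.push(x0 + ((x1 - x0) * (y - y0)) / (y1 - y0));
      }
    }
    if (xs.length === 0) continue;
    const xLeft = Math.round(Math.min(...xs));
    const xRight = Math.round(Math.max(...xs));
    for (let x = xLeft; x <= xRight; x++) pixels.push([x, y]);
  }
  return pixels;
}

const square = pixelsInConvexPolygon([[100, 100], [100, 105], [105, 105], [105, 100]]);
console.log(square.length); // 36, matching the axis-aligned example in the question

const rotated = pixelsInConvexPolygon([[183, 194], [190, 189], [186, 184], [179, 190]]);
console.log(rotated.length);
```

Each returned [x, y] pair can then be turned into an index into the 250,000-entry data array, for example y * 500 + x if the array is stored row by row (an assumption about the data layout).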

Uint clamped array to data url

I want to get part of a canvas as a base64 data URL, which will be sent to my API.
For this I am using the ctx.getImageData() method, which returns a Uint8ClampedArray.
Here is the code:
const data = ctx.getImageData(mousex, mousey, mousex1, mousey1);
const clamped = data.data;
I have tried many btoa approaches, but they either return broken output full of A characters or broken output with a lot of / characters.
You can use canvas.toDataURL(type, encoderOptions) for this purpose.
First extract the part of your source image. You don't need getImageData() for that. Instead, create a second canvas with part_canvas = document.createElement("canvas"); this canvas doesn't have to be visible on the page.
Set its .width and .height to the size of the part you want to extract.
Get its context with part_ctx = part_canvas.getContext("2d").
Then call part_ctx.drawImage(source_image, part_x, part_y, part_width, part_height, 0, 0, part_width, part_height); this takes a rectangular area of the source image and puts it in the invisible canvas.
Finally, call part_canvas.toDataURL() to get the data URL.
Didn't test. But I think this should work.
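Put together, the steps might look like the sketch below. The function names rectFromPoints and cropToDataURL are hypothetical. The sketch also assumes the four mouse values in the question are two corner points, whereas drawImage() and getImageData() expect x, y, width, height, so the first helper converts between the two.

```javascript
// Turn two corner points (e.g. mouse-down and mouse-up positions) into the
// x, y, width, height form that drawImage() and getImageData() expect.
function rectFromPoints(xA, yA, xB, yB) {
  return {
    x: Math.min(xA, xB),
    y: Math.min(yA, yB),
    width: Math.abs(xB - xA),
    height: Math.abs(yB - yA),
  };
}

// Copy a region of `source` (a canvas or image) into an off-screen canvas
// and return it as a data URL. `doc` is the page's `document`.
function cropToDataURL(doc, source, rect, type = "image/png") {
  const partCanvas = doc.createElement("canvas");
  partCanvas.width = rect.width;
  partCanvas.height = rect.height;
  const partCtx = partCanvas.getContext("2d");
  partCtx.drawImage(
    source,
    rect.x, rect.y, rect.width, rect.height, // region of the source
    0, 0, rect.width, rect.height            // where it lands in the copy
  );
  return partCanvas.toDataURL(type);
}
```

In the page this would be called as cropToDataURL(document, canvas, rectFromPoints(mousex, mousey, mousex1, mousey1)). The base64 payload is the part of the returned string after the first comma.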

Dynamic Scaling for canvas graphs

I'm working on a canvas JS application to facilitate graphing, it's a personal project.
An issue I'm having trouble with currently is scaling of variables to display visually.
For example, users enter a set of points, which can be any numbers, so I cannot always plot every point at a 1:1 scale. Imagine a 600x600 canvas and the points (1000,1000) and (1,1): you would have to do some scaling to decide where these points should be placed on the graph.
How can one dynamically scale numbers like this and have them sit in reasonable places? Are there common approaches to solving this problem?
Yes, you can "map" source values into a designated range.
This mapRange function allows you to scale/map your 1000x1000 values into your 600x600 canvas
// Given the low and high values of the source range (e.g. 1 and 1000),
// the low and high values of the mapped range (e.g. 0 and 600),
// and a value from the source range (value),
// map the value into the mapped range
function mapRange(value, sourceLow, sourceHigh, mappedLow, mappedHigh){
    return mappedLow + (mappedHigh - mappedLow) *
        (value - sourceLow) / (sourceHigh - sourceLow);
}
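As a usage sketch, the source range can be taken from the data itself. The helper toCanvasPoints below is hypothetical, and it assumes points are objects of the form { x, y } and that the graph should fill the whole canvas with no margin.

```javascript
function mapRange(value, sourceLow, sourceHigh, mappedLow, mappedHigh) {
  return mappedLow + ((mappedHigh - mappedLow) * (value - sourceLow)) / (sourceHigh - sourceLow);
}

// Scale data points to canvas pixels, using the data's own min and max as
// the source range. The y axis is flipped because canvas y grows downward.
function toCanvasPoints(points, canvasWidth, canvasHeight) {
  const xs = points.map((p) => p.x);
  const ys = points.map((p) => p.y);
  const xMin = Math.min(...xs), xMax = Math.max(...xs);
  const yMin = Math.min(...ys), yMax = Math.max(...ys);
  return points.map((p) => ({
    x: mapRange(p.x, xMin, xMax, 0, canvasWidth),
    y: mapRange(p.y, yMin, yMax, canvasHeight, 0),
  }));
}

console.log(toCanvasPoints([{ x: 1, y: 1 }, { x: 1000, y: 1000 }], 600, 600));
// [ { x: 0, y: 600 }, { x: 600, y: 0 } ]
```

If all points share the same x or the same y, the source range has zero width and the division fails, so that case would need separate handling (for example, centering the points on that axis).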

How to draw on an HTML5 Canvas, pixel-by-pixel

Suppose that I have a 900x900 HTML5 Canvas element.
I have a function called computeRow that accepts, as a parameter, the number of a row on the grid and returns an array of 900 numbers, each between 0 and 200. There is an array called colors that contains strings like rgb(0,20,20).
Basically, what I'm saying is that I have a function that tells pixel-by-pixel, what color each pixel in a given row on the canvas is supposed to be. Running this function many times, I can compute a color for every pixel on the canvas.
The process of running computeRow 900 times takes about 0.5 seconds.
However, the drawing of the image takes much longer than that.
What I've done is I've written a function called drawRow that takes an array of 900 numbers as the input and draws them on the canvas. drawRow takes lots longer to run than computeRow! How can I fix this?
drawRow is dead simple. It looks like this:
function drawRow(rowNumber, result /* array */) {
    var plot, context, columnNumber, color;
    plot = document.getElementById('plot');
    context = plot.getContext('2d');
    // Iterate over the results for each column in the row, coloring a single pixel on
    // the canvas the correct color for each one.
    for(columnNumber = 0; columnNumber < width; columnNumber++) {
        color = colors[result[columnNumber]];
        context.fillStyle = color;
        context.fillRect(columnNumber, rowNumber, 1, 1);
    }
}
I'm not sure exactly what you are trying to do, so I apologize if I am wrong.
If you are trying to write a color to each pixel on the canvas, this is how you would do it:
var ctx = document.getElementById('plot').getContext('2d');
var imgdata = ctx.getImageData(0,0, 640, 480);
var imgdatalen = imgdata.data.length;
for(var i=0;i<imgdatalen/4;i++){ //iterate over every pixel in the canvas
    imgdata.data[4*i] = 255; // RED (0-255)
    imgdata.data[4*i+1] = 0; // GREEN (0-255)
    imgdata.data[4*i+2] = 0; // BLUE (0-255)
    imgdata.data[4*i+3] = 255; // ALPHA (0-255)
}
ctx.putImageData(imgdata,0,0);
This is a lot faster than drawing a rectangle for every pixel. The only thing you would need to do is separate your color into r, g, b and a values.
If you read the color values as strings from an array for each pixel it does not really matter what technique you use as the bottleneck would be that part right there.
For each pixel the cost is split on (roughly) these steps:
Look up the array entry (in a generic JavaScript array, not a typed array)
Get string
Pass string to fillStyle
Parse string (internally) into color value
Ready to draw a single pixel
These are very costly operations performance-wise. To make it more efficient, you need to convert that color array into something other than an array of strings ahead of the drawing operations.
You can do this several ways:
If the array comes from a server try to format the array as a blob / typed array instead before sending it. This way you can copy the content of the returned array almost as-is to the canvas' pixel buffer.
Use a web worker to parse the array and pass it back as a transferable object, which you then copy into the canvas' buffer. This can be copied directly to the canvas, or you can do it the other way around: transfer the pixel buffer to the worker, fill it there and return it.
Sort the array by color values and update the colors by color groups. This way you can use fillStyle or calculate the color into a Uint32 value, which you copy to the canvas using a Uint32 buffer view. This does not work well if the colors are very spread out but works OK if the colors represent a small palette.
If you're stuck with the format of the colors then the second option is what I would recommend primarily depending on the size. It makes your code asynchronous so this is an aspect you need to deal with as well (ie. callbacks when operations are done).
You can of course just parse the array on the same thread and find a way to camouflage it a bit for the user in case it creates a noticeable delay (900x900 shouldn't be that big of a deal even for a slower computer).
If you convert the array, convert it into unsigned 32-bit values and store the result in a typed array. This way you can iterate over your canvas pixel buffer using Uint32 values instead, which is much faster than a byte-by-byte approach.
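A minimal sketch of that conversion, done on the main thread for brevity. The helper names rgbStringToUint32 and fillRow are hypothetical, the two-entry colors array is a stand-in for the real palette, and the byte order assumes a little-endian platform, which would need checking on other hardware.

```javascript
// Parse "rgb(r,g,b)" into one 32-bit pixel value. The byte order assumes a
// little-endian platform, where a Uint32 view of RGBA bytes reads as ABGR.
function rgbStringToUint32(color) {
  const [r, g, b] = color.match(/\d+/g).map(Number);
  return ((255 << 24) | (b << 16) | (g << 8) | r) >>> 0;
}

// Write one row of palette indices into a Uint32 view of the pixel buffer.
function fillRow(pixels32, width, rowNumber, result, palette32) {
  const offset = rowNumber * width;
  for (let column = 0; column < width; column++) {
    pixels32[offset + column] = palette32[result[column]];
  }
}

// Convert the string palette once, ahead of drawing.
const colors = ["rgb(0,20,20)", "rgb(255,0,0)"]; // stand-in for the real array
const palette32 = Uint32Array.from(colors, rgbStringToUint32);

// Browser usage (commented out so the sketch also runs outside a page):
// const ctx = document.getElementById("plot").getContext("2d");
// const imgdata = ctx.createImageData(900, 900);
// const pixels32 = new Uint32Array(imgdata.data.buffer);
// for (let row = 0; row < 900; row++) fillRow(pixels32, 900, row, computeRow(row), palette32);
// ctx.putImageData(imgdata, 0, 0);
```

The strings are parsed once per palette entry instead of once per pixel, and the canvas is updated with a single putImageData() call instead of 810,000 fillRect() calls.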
fillRect is meant to be used for just that - filling an area with a single color, not drawing pixel by pixel. If you draw pixel by pixel, it is bound to be slower because you are CPU-bound. You can check this by observing the CPU load in these cases. The code will become more performant if:
A separate image is created with the required image data filled in. You can use a worker thread to fill this image in the background. An example of using worker threads is available in the blog post at http://gpupowered.org/node/11
Then, blit the image into the 2d context you want using context.drawImage(image, dx, dy).

Image data in JS

I want to know if I've understood this correctly.
I loop my map and load sprites on the map.
So I decided to store the pixel information in an array so that when I click with my mouse I check if it's in a pixel array range and get the id related to it (effectively being pixel-accurate for detecting what object was clicked?)
This is my thinking process:
I draw the sprite:
ctx.drawImage(castle[id], abposx, abposy - (imgheight/2));
myImageData[sdata[i][j][1]] =
ctx.getImageData(abposx, abposy, castle[id].width, castle[id].height);
Then somehow, on left click, check if mouse x and mouse y are in between the range of the arrays and return the value of myImageData?
Or have I misunderstood what getImageData is about?
getImageData gives you all of the pixel data for a region of the canvas. Basically, you only need to use getImageData if you are doing some sort of pixel manipulation with the image, like changing its hue/color or applying a filter, or if you need specific data, such as the r/g/b or alpha values. In order to check for pixel-perfect collisions you can do something like the following:
var imageData = ctx.getImageData(x, y, 1, 1);
if(imageData.data[3] !== 0){
// you have a collision!
}
imageData.data holds four values per pixel: indices 0-2 are the r/g/b color values, and index 3 is the alpha value. So we assume that if the alpha is 0, the pixel is in a transparent portion. Also note that in the example and fiddle I am grabbing the data from the canvas itself, so if there was a non-transparent image behind the sprite, the pixel would count as not being transparent. The best way to do it if you have many overlapping images is to keep a copy of each image by itself off screen somewhere and translate the coordinates to get the position on that image. Here's a good MDN article explaining getImageData as well.
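The off-screen approach could be sketched like this. The names isOpaqueAt and spriteAt and the shape of the sprites records are hypothetical. Each record is assumed to hold the position where the sprite was drawn plus an ImageData captured once from that sprite drawn alone on an off-screen canvas.

```javascript
// True if the pixel at (localX, localY) of an ImageData-like object is not
// fully transparent. Each pixel is 4 bytes (r, g, b, a), stored row by row.
function isOpaqueAt(imageData, localX, localY) {
  if (localX < 0 || localY < 0 || localX >= imageData.width || localY >= imageData.height) {
    return false;
  }
  const alphaIndex = (localY * imageData.width + localX) * 4 + 3;
  return imageData.data[alphaIndex] !== 0;
}

// `sprites` is a hypothetical list of { id, x, y, imageData } records in
// draw order, where (x, y) is the canvas position the sprite was drawn at.
function spriteAt(sprites, mouseX, mouseY) {
  for (let i = sprites.length - 1; i >= 0; i--) { // topmost sprite first
    const s = sprites[i];
    // Translate canvas coordinates into the sprite's own coordinates.
    const localX = Math.floor(mouseX - s.x);
    const localY = Math.floor(mouseY - s.y);
    if (isOpaqueAt(s.imageData, localX, localY)) return s.id;
  }
  return null;
}
```

On a click, spriteAt(sprites, mouseX, mouseY) would return the id of the topmost sprite with a non-transparent pixel under the cursor, or null if there is none, without reading from the visible canvas.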
