Trying to scrape items in a canvas with Selenium - JavaScript

I'm trying to scrape the High, Low (these are shown by going to Settings > Scale and enabling their lines) and current price from this element, a canvas (XPath: "/html/body/div[2]/div[1]/div[2]/div[1]/div[2]/table/tr[1]/td[3]/div/canvas[1]") on this webpage, with Python, using Selenium.
I found some answers suggesting I trace the JavaScript that computes the values and call it directly. I tried, but didn't actually find anything (I'm not that good at reverse-engineering JavaScript).
Thank you in advance for the help.

Assuming that the text is actually being drawn client-side onto a canvas object:
You could inject your own canvas ctx by wrapping the getContext(...) method on HTMLCanvasElement.prototype. Adapting the code here:
HTMLCanvasElement.prototype.__oldGetContext = HTMLCanvasElement.prototype.getContext;
// Must be a regular function (not an arrow) so that `this` is the canvas element
HTMLCanvasElement.prototype.getContext = function (type, options) {
  const ctx = this.__oldGetContext(type, options);
  // Only 2d contexts have fillText; guard against wrapping twice
  if (type === '2d' && !ctx.__oldFillText) {
    ctx.__oldFillText = ctx.fillText;
    ctx.fillText = (text, x, y, maxWidth) => {
      console.log('Drawing text', text, 'at position', x, y);
      return ctx.__oldFillText(text, x, y, maxWidth);
    };
  }
  return ctx;
};
You might additionally have to override strokeText, and do the same for the contexts returned for the other canvas types.
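For example, a minimal strokeText hook in the same vein, patching the context prototype directly instead of each instance (a sketch, untested against that particular chart):

const oldStrokeText = CanvasRenderingContext2D.prototype.strokeText;
CanvasRenderingContext2D.prototype.strokeText = function (text, x, y, maxWidth) {
  console.log('Stroking text', text, 'at position', x, y);
  return oldStrokeText.call(this, text, x, y, maxWidth);
};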
However, if the text is being rendered server-side, then you're out of luck and will have to use OCR. On the other hand, if you just want pricing data, there are loads of crypto (smh) pricing APIs out there that would be far less hassle than any of this.
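If the client-side hook does fire, you still need to get the captured strings back into Python. One approach (a sketch; __capturedTexts is an illustrative name, not part of any API) is to have the hook accumulate the drawn strings in a global instead of logging them:

window.__capturedTexts = [];
HTMLCanvasElement.prototype.__oldGetContext = HTMLCanvasElement.prototype.getContext;
HTMLCanvasElement.prototype.getContext = function (type, options) {
  const ctx = this.__oldGetContext(type, options);
  if (type === '2d' && !ctx.__oldFillText) {
    ctx.__oldFillText = ctx.fillText;
    ctx.fillText = (text, x, y, maxWidth) => {
      // Record every string the chart draws, along with its position
      window.__capturedTexts.push({ text: String(text), x: x, y: y });
      return ctx.__oldFillText(text, x, y, maxWidth);
    };
  }
  return ctx;
};

The hook has to be installed before the page's own scripts run, e.g. with driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {"source": hook_js}) on a Chromium-based driver; afterwards driver.execute_script("return window.__capturedTexts") returns the captured list to Python.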

Related

Different predictions if running in Node instead of Browser (using the same model_web - python converted model)

Pretty new to ML and TensorFlow!
I made an object detection model with http://cloud.annotations.ai, which lets you train and convert a model to different formats, including tfjs (model_web).
That website also provides boilerplate for running the model within a browser (a React app)... just like you do - probably it's the same code; I didn't spend enough time on it.
So I have this model running inside a browser, giving predictions about objects in a photo with pretty good results considering the number of examples I gave and the prediction score (0.89). The given bounding box is good too.
But, unfortunately, I didn't have "just one video" to analyze frame by frame inside a browser; I've got plenty of them. So I decided to switch to Node.js, porting the code as-is.
Guess what? TF.js relies on DOM and browser components, and almost no examples that work with Node exist. So, no big deal, I just spent a morning figuring out all the missing parts.
Finally I'm able to run my model over videos that are split into frames, at a decent speed - although I still get the "Hello there, use tfjs-node to gain speed" banner even though I'm already using tfjs-node - but the results seem odd.
Comparing the same picture with the same model_web folder gives the same prediction but with a lower score (0.80 instead of 0.89) and a different bounding box, with the object not centered at all.
(TL;DR)
Does tfjs have different implementations of the libraries (tfjs and tfjs-node) that make different use of the same model? I don't think it can be a problem of input because - after a long search and fight - I figured out two ways to give the image to tf.browser.fromPixels in Node (and I'm still wondering why I have to use a "browser" method inside tfjs-node). Has anyone made comparisons?
So... that's the code I used, for your reference:
model_web is being loaded with tf.loadGraphModel("file://path/to/model_web/model.json");
Two different ways to convert a JPG and make it work with tf.browser.fromPixels():
const fs = require('fs');
const inkjet = require('inkjet');
const {createCanvas, loadImage} = require('canvas');

const decodeJPGInkjet = (file) => {
  return new Promise((rs, rj) => {
    // fs.promises.readFile returns a Promise (the callback form does not)
    fs.promises.readFile(file).then((buffer) => {
      inkjet.decode(buffer, (err, decoded) => {
        if (err) {
          rj(err);
        } else {
          rs(decoded);
        }
      });
    });
  });
};
const decodeJPGCanvas = (file) => {
  return loadImage(file).then((image) => {
    const canvas = createCanvas(image.width, image.height);
    const ctx = canvas.getContext('2d');
    ctx.drawImage(image, 0, 0, image.width, image.height);
    const data = ctx.getImageData(0, 0, image.width, image.height);
    // Return the same {data, width, height} shape that tf.browser.fromPixels accepts
    return {data: new Uint8Array(data.data), width: data.width, height: data.height};
  });
};
And that's the code that uses the loaded model to give predictions - the same code for Node and browser, found at https://github.com/cloud-annotations/javascript-sdk/blob/master/src/index.js - it doesn't work on Node as-is; I changed require("@tensorflow/tfjs") to require("@tensorflow/tfjs-node") and replaced fetch with fs.read.
const runObjectDetectionPrediction = async (graph, labels, input) => {
  const batched = tf.tidy(() => {
    const img = tf.browser.fromPixels(input);
    // Reshape to a single-element batch so we can pass it to executeAsync.
    return img.expandDims(0);
  });
  const height = batched.shape[1];
  const width = batched.shape[2];
  const result = await graph.executeAsync(batched);
  const scores = result[0].dataSync();
  const boxes = result[1].dataSync();
  // clean the webgl tensors
  batched.dispose();
  tf.dispose(result);
  const [maxScores, classes] = calculateMaxScores(
    scores,
    result[0].shape[1],
    result[0].shape[2]
  );
  const prevBackend = tf.getBackend();
  // run post process in cpu
  tf.setBackend("cpu");
  const indexTensor = tf.tidy(() => {
    const boxes2 = tf.tensor2d(boxes, [result[1].shape[1], result[1].shape[3]]);
    return tf.image.nonMaxSuppression(
      boxes2,
      maxScores,
      20,  // maxNumBoxes
      0.5, // iou_threshold
      0.5  // score_threshold
    );
  });
  const indexes = indexTensor.dataSync();
  indexTensor.dispose();
  // restore previous backend
  tf.setBackend(prevBackend);
  return buildDetectedObjects(
    width,
    height,
    boxes,
    maxScores,
    indexes,
    classes,
    labels
  );
};
Does tfjs have different implementations of the libraries (tfjs and tfjs-node) that make different use of the same model?
If the same model is deployed both in the browser and in Node.js, the prediction will be the same.
If the predicted values are different, it might be related to the tensor used for the prediction. The processing from image to tensor might differ between the two environments, resulting in different tensors being fed to the model and thus different outputs.
I figured out two ways to give the image to tf.browser.fromPixels in Node (and I'm still wondering why I have to use a "browser" method inside tfjs-node)
The canvas package uses the system graphics stack to create a browser-like canvas environment that can be used by Node.js. This makes it possible to use the tf.browser namespace, especially when dealing with image conversion. However, it is still possible to create a tensor directly from a Node.js buffer.
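For instance, tfjs-node can decode an image buffer straight into a tensor, with no canvas and no tf.browser involved (a sketch; the file path is illustrative):

const fs = require('fs');
const tf = require('@tensorflow/tfjs-node');

// Decode a JPG directly into a 3-channel tensor, skipping canvas entirely
const buffer = fs.readFileSync('path/to/frame.jpg');
const img = tf.node.decodeImage(buffer, 3);
// Batch it the same way the browser code does before executeAsync
const batched = img.expandDims(0);

If both environments are fed the exact same tensor this way, the browser and Node predictions should line up.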

How do I draw a JavaScript-modified SVG object on an HTML5 canvas?

The overall task I'm trying to achieve is to load an SVG image file, modify a color or some text somewhere, and then draw it onto an HTML5 canvas (presumably with drawImage(), but any reasonable alternative would be fine).
I followed advice on another StackOverflow question on how to load and modify an SVG file in JavaScript, which went like this:
<object class="svgClass" type="image/svg+xml" data="image.svg"></object>
followed in JavaScript by
document.querySelector("object.svgClass").getSVGDocument()
  .getElementById("svgInternalID").setAttribute("fill", "red");
And that works. I now have the modified SVG displaying in my web page.
But I don't want to just display it - I want to draw it as part of an HTML5 canvas update, like this:
ctx.drawImage(myModifiedSVG, img_x, img_y);
If I try storing the result of getSVGDocument() and passing that in as myModifiedSVG, I just get an error message.
How do I make the HTML5 canvas draw call for my modified SVG?
Edit: I can draw an SVG image on an HTML5 canvas already through doing this:
var theSVGImage = new Image();
theSVGImage.src = "image.svg";
ctx.drawImage(theSVGImage, img_x, img_y);
and that's great, but I don't know how to modify text/colors in my loaded SVG image that way! If someone could tell me how to do that modification, then that would also be a solution. I'm not tied to going through the object HTML tag.
For a one-shot, you could rebuild a new SVG file, load it in an <img>, and draw it again on the canvas:
async function doit() {
  const ctx = canvas.getContext('2d');
  const images = await prepareAssets();
  let i = 0;
  const size = canvas.width = canvas.height = 500;
  canvas.onclick = e => {
    i = +!i;
    ctx.clearRect(0, 0, size, size);
    ctx.drawImage(images[i], 0, 0, size, size);
  };
  canvas.onclick();
  return images;
}
async function prepareAssets() {
  const svgDoc = await getSVGDOM();
  // There is no standard way to draw relative sizes in canvas
  svgDoc.documentElement.setAttribute('width', '500');
  svgDoc.documentElement.setAttribute('height', '500');
  // generate the first <img> from the current DOM state
  const originalImage = loadSVGImage(svgDoc);
  // here do your DOM manipulations
  svgDoc.querySelectorAll('[fill="#cc7226"]')
    .forEach(el => el.setAttribute('fill', 'lime'));
  // generate the new <img>
  const coloredImage = loadSVGImage(svgDoc);
  return Promise.all([originalImage, coloredImage]);
}
function getSVGDOM() {
  return fetch('https://upload.wikimedia.org/wikipedia/commons/f/fd/Ghostscript_Tiger.svg')
    .then(resp => resp.text())
    .then(text => new DOMParser().parseFromString(text, 'image/svg+xml'));
}
function loadSVGImage(svgel) {
  // get the markup synchronously
  const markup = (new XMLSerializer()).serializeToString(svgel);
  const img = new Image();
  return new Promise((res, rej) => {
    img.onload = e => res(img);
    img.onerror = rej;
    // convert to a dataURI
    img.src = 'data:image/svg+xml,' + encodeURIComponent(markup);
  });
}
doit()
  .then(_ => console.log('ready: click to switch the image'))
  .catch(console.error);
<canvas id="canvas"></canvas>
But if you are going to do it with a lot of frames, and expect it to animate...
You will have to convert your SVG into canvas drawing operations.
The method above is asynchronous, so you cannot reliably generate new images on the fly and have them ready to be drawn within a single frame. You need to store a few of them ahead of time, but since how long an image takes to load is completely random (at least it should be), this might be a real programming nightmare.
Add to that the overhead the browser incurs loading a whole new SVG document every frame (yes, browsers do load the SVG document even when it's loaded inside an <img>), then painting it on the canvas, and finally removing it from memory, which will fill up in no time, and you won't have much free CPU left to do anything else.
So the best option here is probably to parse your SVG and convert it to CanvasRenderingContext2D drawing operations => draw it yourself.
This is achievable, all the more so now that we can pass d attributes directly into the Path2D constructor, and most SVG objects have a correspondence in the Canvas2D API (we can even use SVG filters), but that's still a lot of work.
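As a taste of that approach, a minimal sketch (the d string below is made up; in practice you would lift it from one of your SVG's <path> elements):

const ctx = document.getElementById('canvas').getContext('2d');
// d attribute copied from an SVG <path> (illustrative value here)
const path = new Path2D('M10 80 Q 95 10 180 80');
ctx.strokeStyle = 'lime';
ctx.lineWidth = 4;
ctx.stroke(path);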
So you may want to look at libraries that do that. I'm not an expert on libraries myself and can't recommend any, but I know that canvg has done this for a very long time; I just don't know whether it exposes its JS objects in a reusable way. I know that Fabric.js does, but it also comes with a lot of other features that you may not need.
The choice is yours.

Get the pixel screen size in Spark AR studio (for Facebook)

I am starting to work with Spark AR Studio, and I'm looking for a way to get the screen size in pixels so I can compare it against the coordinates given by gesture.location on tap.
TouchGestures.onTap().subscribe((gesture) => {
  // ! The location is always specified in the screen coordinates
  Diagnostics.log(`Screen touch in pixel = { x: ${gesture.location.x}, y: ${gesture.location.y} }`);
  // ????
});
gesture.location is in pixels (screen coordinates), and I would like to compare it with the screen size to determine which side of the screen was touched.
Maybe using Camera.focalPlane could be a good idea...
Update
I tried two new things to get the screen size:
const CameraInfo = require('CameraInfo');
Diagnostics.log(CameraInfo.previewSize.height.pinLastValue());
const focalPlane = Scene.root.find('Camera').focalPlane;
Diagnostics.log(focalPlane.height.pinLastValue());
But both return 0
This answer might be a bit late, but it may be a nice addition for people who want values that can easily be used in script. I came across this code (not mine; I forgot to save the link):
var screen_height = 0;
Scene.root.find('screenCanvas').bounds.height.monitor({fireOnInitialValue: true}).subscribe(function (height) {
  screen_height = height.newValue;
});
var screen_width = 0;
Scene.root.find('screenCanvas').bounds.width.monitor({fireOnInitialValue: true}).subscribe(function (width) {
  screen_width = width.newValue;
});
This worked well for me since I couldn't figure out how to use Diagnostics.log with the data instead of Diagnostics.watch.
Finally,
using the Device Info patch in the Patch Editor and passing its values to the script works!
First, add a "to script" variable in the editor, then wire it up in the Patch Editor (the original post illustrated both steps with screenshots). You can then grab the value with this script:
const Patches = require('Patches');
const screenSize = Patches.getPoint2DValue('screenSize');
My mistake was using Diagnostics.log() to check whether my variable was working.
Use Diagnostics.watch() instead:
Diagnostics.watch('screenSize.x', screenSize.x);
Diagnostics.watch('screenSize.y', screenSize.y);
Screen size is available via the Device Info patch output, after dragging it into the patch editor from the Scene section.
Now in the open beta (as of this post) you can drag Device from the scene sidebar into the patch editor to get a patch that outputs the screen size, screen scale, and safe area insets, as well as the self Object.
The Device patch
The device size can be used in scripts via CameraInfo.previewSize.width and CameraInfo.previewSize.height respectively. For instance, if you wanted to get 2D points representing the min/max points on the screen, this'd do the trick:
const CameraInfo = require('CameraInfo')
const Reactive = require('Reactive')
const min = Reactive.point2d(
  Reactive.val(0),
  Reactive.val(0)
)
const max = Reactive.point2d(
  CameraInfo.previewSize.width,
  CameraInfo.previewSize.height
)
(The point I want to emphasize being that CameraInfo.previewSize.width and CameraInfo.previewSize.height are ScalarSignals, not number literals.)
Edit: Here's a link to the documentation: https://sparkar.facebook.com/ar-studio/learn/documentation/reference/classes/camerainfomodule
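Tying this back to the original question, here's a hedged sketch of the which-side-was-tapped check, assuming gesture.location is in the same pixel space as previewSize:

const CameraInfo = require('CameraInfo');
const TouchGestures = require('TouchGestures');
const Diagnostics = require('Diagnostics');

TouchGestures.onTap().subscribe((gesture) => {
  // pinLastValue() reads the current number out of the ScalarSignal
  const halfWidth = CameraInfo.previewSize.width.pinLastValue() / 2;
  const side = gesture.location.x < halfWidth ? 'left' : 'right';
  Diagnostics.log(`Tapped on the ${side} half of the screen`);
});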

Using Processing.js: Can I have multiple canvases with only one data-processing-source sketch.pde?

Using Processing.js, I would like to know if what I'm trying to do is even possible. I've looked at Pomax's tutorials, the "Processing.js quick start for JS developers" page, the PJS Google group, and here, and I can't seem to find an answer to the question: "Can you have multiple canvases that all use the same Processing sketch (in my example below, engine.pde), with each canvas passing variables to the sketch, so that Processing opens a different image in each canvas but edits them all the same way?"
So to sum up: I would like to use only one Processing sketch with many canvases, with each canvas telling the sketch a different name, and a corresponding background image opening in the sketch in each canvas.
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<script src="../../../processingjs/processing.js"></script>
<script>
// Tell sketch what counts as JavaScript, per the Processing on the Web tutorial
var bound = false;
function bindJavascript(instance) { // Can I pass 'instance' like this?
  var pjs = Processing.getInstanceById(instance);
  if (pjs != null) {
    pjs.bindJavascript(this);
    bound = true;
  }
  if (!bound) setTimeout(bindJavascript, 250);
}
bindJavascript('B104');
bindJavascript('B105');

function drawSomeImages(instance) {
  // This is where I am trying to tell Processing that each canvas has a number,
  // and the number is assigned to a corresponding image.
  var pjs = Processing.getInstanceById(instance);
  var imageName = document.getElementById(instance);
  pjs.setup(instance);
}
drawSomeImages('B104');
drawSomeImages('B105');

// Where is the mouse?
function showXYCoordinates(x, y) { ... this is working ... }
// Send images back to server
function postAjax(canvasID) { ... AJAX stuff is working ... }
</script>
</head>
<body>
<canvas id="B104" data-processing-sources="engine.pde" onmouseout="postAjax('B104')"></canvas>
<canvas id="B105" data-processing-sources="engine.pde" onmouseout="postAjax('B105')"></canvas>
</body>
</html>
And on the processing side:
/* @pjs preload=... this is all working ; */
// Tell Processing about JavaScript, straight from the tutorial...
interface JavaScript {
  void showXYCoordinates(int x, int y);
}
void bindJavascript(JavaScript js) {
  javascript = js;
}
JavaScript javascript;

// Declare variables
PImage img;
... some other variables related to the functionality ...

void setup(String instance) {
  size(300, 300);
  img = loadImage("data/" + instance + ".png");
  // img = loadImage("data/B104.png"); // Example of what it should open if canvas B104 uses engine.pde
  background(img);
  smooth();
}

void draw() { ... this is fine ... }

void mouseMoved() {
  // ... just calls draw and checks if the mouse is in the canvas, fine ...
  if (javascript != null) {
    javascript.showXYCoordinates(mouseX, mouseY);
  }
}
Just add a million canvas elements to your page, all with the same data-processing-sources attribute, so they all load the same file. Processing.js will build as many sketches as you ask for; it doesn't care that the sketch files are the same for each one =)
(Note that what you described, one sketch instance rendering onto multiple canvases and giving each a different image, is not how sketches work. A sketch is tied to one canvas as its drawing surface. However, you can make a million "slave" sketches whose sole responsibility is to draw images when so instructed from JavaScript, and make the master sketch tell JavaScript to tell the slave sketches to draw. Note that this is very, very silly. Just have JavaScript set the image; you don't need Processing if you're only showing images, really.)
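In the spirit of that last remark, a minimal sketch of skipping Processing entirely (the canvas ids and data/*.png paths are reused from the question):

['B104', 'B105'].forEach(function (id) {
  var ctx = document.getElementById(id).getContext('2d');
  var img = new Image();
  img.onload = function () { ctx.drawImage(img, 0, 0); };
  img.src = 'data/' + id + '.png'; // one background image per canvas
});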

Dynamically "unload" a Processing JS sketch from canvas

I'm using some JavaScript to allow users to dynamically load a sketch onto a canvas element on click, using:
Processing.loadSketchFromSources('canvas_id', ['sketch.pde']);
If I call Processing.loadSketchFromSources(...) a second (or third...) time, it loads a second (or third...) .pde file onto the canvas, which is what I would expect.
I'd like for the user to be able to click another link to load a different sketch, effectively unloading the previous one. Is there a method I can call (or a technique I can use) to check if Processing has another sketch running, and if so, tell it to unload it first?
Is there some sort of Processing.unloadSketch() method I'm overlooking? I could simply drop the canvas DOM object and recreate it, but that (1) seems like using a hammer when I need a needle, and (2) results in a screen flicker that I'd like to avoid.
I'm no JS expert, but I've done my best to look through the Processing.js source to see what other functions may exist, and I'm hitting a wall. I thought perhaps I could look at Processing.Sketches.length to see if something is already loaded, but simply pop'ing it off the array doesn't seem to work (didn't think it would).
I'm using ProcessingJS 1.3.6.
In case someone else comes looking for the solution, here's what I did that worked. Note that this was placed inside a closure (not included here, for brevity), hence the this.launch = function() business, blah blah blah... YMMV.
/**
 * Launches a specific sketch. Assumes files are stored in
 * the ./sketches subdirectory, and your canvas is named g_sketch_canvas
 * @param {String} item The name of the file (no extension)
 * @param {Array} sketchlist Array of sketches to choose from
 * @returns true
 * @type Boolean
 */
this.launch = function (item, sketchlist) {
  var cvs = document.getElementById('g_sketch_canvas'),
    ctx = cvs.getContext('2d');
  if ($.inArray(item, sketchlist) !== -1) {
    // Unload the Processing script
    if (Processing.instances.length > 0) {
      // There should only be one, so no need to loop
      Processing.instances[0].exit();
      // If you may have more than one, use this loop instead:
      // for (var i = 0; i < Processing.instances.length; i++) {
      //   Processing.instances[i].exit();
      // }
    }
    // Clear the context
    ctx.setTransform(1, 0, 0, 1, 0, 0);
    ctx.clearRect(0, 0, cvs.width, cvs.height);
    // Now, load the new Processing script
    Processing.loadSketchFromSources(cvs, ['sketches/' + item + '.pde']);
  }
  return true;
};
I'm not familiar with Processing.js, but the example code from the site has this:
var canvas = document.getElementById("canvas1");
// attaching the sketchProc function to the canvas
var p = new Processing(canvas, sketchProc);
// p.exit(); to detach it
So in your case, you'll want to keep a handle to the first instance when you create it:
var p1 = Processing.loadSketchFromSources('canvas_id', ['sketch.pde']);
When you're ready to "unload" and load a new sketch, I'm guessing (but don't know) that you'll need to clear the canvas yourself:
p1.exit();
var canvas = document.getElementById('canvas_id');
var context = canvas.getContext('2d');
context.clearRect(0, 0, canvas.width, canvas.height);
// Or context.fillRect(...) with white, or whatever clearing it means to you
Then, from the sound of things, you're free to attach another sketch:
var p2 = Processing.loadSketchFromSources('canvas_id', ['sketch2.pde']);
Again, I'm not actually familiar with that library, but this appears straightforward from the documentation.
As of Processing.js 1.4.8, Andrew's accepted answer (and the other answers I've found here) no longer seem to work.
This is what worked for me:
var pjs = Processing.getInstanceById('pjs');
if (typeof pjs !== "undefined") {
  pjs.exit();
}
var canvas = document.getElementById('pjs');
new Processing(canvas, scriptText);
where pjs is the id of the canvas element in which the script is being run.
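Putting the pieces of these answers together, a reusable helper might look like this (a sketch; reloadSketch is an illustrative name, and it assumes the 1.4.8-era API described above):

function reloadSketch(canvasId, sources) {
  // Stop the previous sketch on this canvas, if any
  var prev = Processing.getInstanceById(canvasId);
  if (typeof prev !== 'undefined') {
    prev.exit();
  }
  // Clear the drawing surface so the last frame doesn't linger
  var canvas = document.getElementById(canvasId);
  var ctx = canvas.getContext('2d');
  ctx.setTransform(1, 0, 0, 1, 0, 0);
  ctx.clearRect(0, 0, canvas.width, canvas.height);
  // Attach the new sketch
  return Processing.loadSketchFromSources(canvas, sources);
}

reloadSketch('pjs', ['sketches/other.pde']);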
