How do I scrape constantly updated JavaScript post-login using Python?

I know there are many similar questions; however, they each address only a piece of the problem I have, and I haven't been successful in putting the information together.
I am using a FLIR AX8 thermal camera, and this camera has a web interface that one can interact with over Ethernet. Long story short, temperature values are constantly displayed and updated, and I would like to scrape those values. I would like to do this without opening a browser with a GUI, and just poll every so often to get them.
The first step is a simple login page, located at "cameraIP"/login. It's very basic, but I need a solution that gets me through it and maintains the login session. Then it's just the interface. Attached are two images: the first showing the interface as seen in Chrome, and the second the terminal output of what I scraped using Python's Requests module.
As you can see, the numbers are clearly not there, as they are rendered by JavaScript. This is essentially all I have to work with. If someone could give advice on how this is possible to get those temperature values every so often, that would be great.
If there are ANY questions, just leave a comment down below and I can provide more information, such as the JS files listed under the web interface if they are needed.

I personally use Scrapy-Splash to render the JavaScript when scraping with Scrapy: http://splash.readthedocs.io/en/stable/
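If you'd rather not run a Splash server, a `requests.Session` for the login plus a headless browser for the JavaScript-rendered numbers can do the same job. A hedged sketch, not a drop-in solution: the login form field names and the `spotTemp` element ID below are assumptions, so inspect your camera's actual login page and interface markup to find the real ones.

```python
import re

def parse_temperature(text):
    """Pull the first signed decimal number out of a string like '23.4 °C'."""
    match = re.search(r"-?\d+(?:\.\d+)?", text)
    return float(match.group()) if match else None

def fetch_spot_temperature(camera_ip, user, password):
    # Third-party imports kept local so the parsing helper above stays
    # importable without them.
    import requests
    from selenium import webdriver

    # Step 1: log in with a plain POST and keep the session cookie.
    # The form field names here are guesses -- check the login page source.
    session = requests.Session()
    session.post(f"http://{camera_ip}/login",
                 data={"user": user, "password": password})

    # Step 2: requests alone never runs the JavaScript that fills in the
    # numbers, so hand the cookies to a headless browser and let it render.
    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    driver.get(f"http://{camera_ip}/login")   # must visit the domain first
    for cookie in session.cookies:
        driver.add_cookie({"name": cookie.name, "value": cookie.value})
    driver.get(f"http://{camera_ip}/")        # interface page, now rendered
    text = driver.find_element("id", "spotTemp").text  # assumed element ID
    driver.quit()
    return parse_temperature(text)
```

You can then call `fetch_spot_temperature` from a loop with a `time.sleep` between polls; if the camera happens to expose a JSON endpoint that the page's own JavaScript calls (check the Network tab in Chrome), hitting that directly with the logged-in session would be lighter than driving a browser.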

Related

How can I automatically retrieve texts of emails and further scrape them?

My business has hundreds of incoming emails daily and my plan was to have the sorting and answering at least partly automated. I know that using JavaScript it would be possible to select those elements on the webpage (i.e. in my inbox) that are email tabs but, as far as I know, I can't implement cursor movement and clicking in JavaScript to open up the emails one by one and copy-paste their contents into a separate file. I want to collect and analyze the texts from incoming emails, classify them based on topics using a large set of keywords, and, once the grouping is finished, assign sample answers to these messages that only have to be proofread and then can be sent out.
My idea was to use Python because it is quite convenient to move the cursor in Python. However, I can't seem to figure out how I can analyze information that is currently on the screen, so that the program can "see" if there are any new emails. In JavaScript this seemed easier; I don't know if it is even possible using Python though.
I am using Windows.
Am I on the right track? Or totally wrong? Maybe I should consider another programming language? Thank you for your insights in advance.
As far as I understand, you need to automate collecting the information in emails into a separate file for further processing. For this you can use the Selenium web-automation tool (Python). It is normally used for website testing, but it can be used in cases like the one you mention. Hope this helps.
https://selenium-python.readthedocs.io/
https://pypi.org/project/selenium/
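Once Selenium (or Python's built-in `imaplib`, which skips the mail site's GUI entirely) has handed you the message text, the keyword-grouping step you describe needs no cursor movement at all. A minimal sketch of that step; the topic names and keyword lists are made-up examples:

```python
def classify(text, topics):
    """Return the topic whose keywords appear most often in the text,
    or 'other' when no keyword matches at all."""
    lowered = text.lower()
    scores = {topic: sum(lowered.count(word) for word in words)
              for topic, words in topics.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"

# Example topic map -- in practice this would be your large keyword set.
TOPICS = {
    "billing": ["invoice", "payment", "refund"],
    "support": ["error", "crash", "password reset"],
}
```

With the messages grouped this way, attaching a canned sample answer per topic is a plain dictionary lookup, and a human only proofreads before sending.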

How to proceed with live, local website

I've done plenty of web development at a pretty basic level, and usually just local pages to be called from shared network drives.
Here is where I'm stuck:
I am attempting to build a simple application for work where other leads and I can open a local html page from our shared drive, and add/remove employee names to different tasks, so we can keep tabs on who's doing what.
The tasks are the same every day, just hardcoded titles on sectioned out divs.
Problem is, I can't figure out how to make it so that changes I make will populate for other people with the window open (considering this is just a local page and not a live environment being hosted on anything).
For the general design, I've toyed with hard coding all the employee names under each task in hidden div tags, with a bit of jquery to make the div visible when that worker is assigned.
I have also toyed with appending data to the existing tags using .innerHTML.
Still have no idea how to make this live so we can see each other's changes.
Can you point me in the right direction?
I've figured it out. It may be a bit on the lazy side, but I'm just setting the HTML page to auto-refresh every 5s. The data displayed will pull from a notepad doc that leads will be able to update and save.
I am assuming that you just want your peers to be able to see the changes you're making live? If so, you need to use Ajax. You could set it to ping the server every so often to update on the fly. I'm still quite uncertain as to whether or not that was your question, but if not, please elaborate or post your code so I can help further.
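One caveat on the auto-refresh workaround: a page opened straight from a shared drive is a `file://` copy, so nothing is actually "live" unless some process serves the folder. Python's standard library can do that in a few lines; this is a sketch, with an arbitrary port choice, not a production setup:

```python
# Serve a shared folder over HTTP (stdlib only) so every lead's browser
# polls the same live copy instead of a stale file:// page.
import functools
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

def start_server(directory, port=8000):
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    # Daemon thread: the server dies with the script that started it.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server  # server.server_address[1] holds the chosen port
```

Run that on any one machine on the network (bind to its LAN address instead of 127.0.0.1), point everyone's browser at it, and the existing 5-second refresh (or an Ajax poll) will show the same data to everybody.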

Make javascript do stuff on external pages

OK, so I'm learning JavaScript and I just wanted to know if it's possible to make it perform actions on external pages. For example, could I say 'onload redirect to somesite.com/page1, then once on somesite.com/page1 fill in the register form with these details'?
is that possible?
You cannot do this.
This would represent, for lack of a better word, an enormous security hole.
The only way to make an external page "do stuff" is to write code that is on or explicitly included in that page itself. Period.
I have, however, seen external pages get loaded INTO the current page as strings, and then had the JavaScript that loaded those pages modify that markup directly. But that is ugly.
On the first page you could modify some variables/values in a database. Then, in the second page you could check the values in your database, and do different "stuff" depending on those values.
You would need to set up a database and use some server-side scripting along with JavaScript (server-side scripting is used to interact with your server/database). In your first page, the server-side script, like PHP, would receive info from your JavaScript. In your second page, your JavaScript would fetch info from your server-side script and then do stuff to that page.
This is a much safer way. If you are taking user input from things like HTML fields, you need to look into cleaning the input to prevent something called "cross-site scripting (XSS)".
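The database handoff described above is simple in any backend language; here is a sketch using Python and SQLite as stand-ins (a PHP/MySQL pair would have the same shape). The parameterized `?` placeholders are the input-cleaning habit alluded to: they prevent SQL injection, while XSS additionally requires escaping values when they are written back into HTML.

```python
import sqlite3

def set_flag(db_path, name, value):
    # What the first page's server-side script would do on user action.
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS flags (name TEXT PRIMARY KEY, value TEXT)")
    # Parameterized query: user input never gets pasted into the SQL string.
    con.execute("INSERT OR REPLACE INTO flags VALUES (?, ?)", (name, value))
    con.commit()
    con.close()

def get_flag(db_path, name):
    # What the second page's server-side script would do on load, so its
    # JavaScript can branch on the stored value.
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS flags (name TEXT PRIMARY KEY, value TEXT)")
    row = con.execute("SELECT value FROM flags WHERE name = ?", (name,)).fetchone()
    con.close()
    return row[0] if row else None
```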
You could do this IF you were rendering the other page in a frame of some sort.
There are multiple ways in which you can render an entire external page as a piece of your page. Many pages take precautions to block being rendered in a frame for just this very reason though (Not to mention copyright issues).
Once you're rendering the external page inside your page you should be able to reference components nested in your frame and do the sort of thing that you're describing.
There's no way to do this with JavaScript. The developers of all the major browsers work very, very hard to prevent this sort of thing. If this were possible, it would open up pretty massive security holes.
If you really want to use something like this for testing, you can look at browser automation software like Selenium. This allows you to automate various testing scenarios in your browser, but it does not affect other clients using your site.

searching a large amount of text using javascript and html5 storage

I have a web app that relies on html5 offline storage features so that it can be accessed by the user without an internet connection. The app essentially just serves html pages and a little bit of css and javascript.
I am trying to add the ability to search the text served on these pages for key words, but because the app isn't guaranteed access to the server it needs to be able to perform these searches on the client side.
My thought is I can store the searchable text in the browser's Web SQL database and perform the search either through JavaScript or through the browser's SQL API. I have a few questions about the best way to do this:
1) I vaguely remember an article about how to implement something like this, maybe from airbnb? Does anyone remember such an article?
2) The text is 2,000,000+ words so I would assume that indexOf is going to break down at this data size. Is there any chance regex will hold up? What are some options for implementing the actual search? (libraries, algorithms, etc.) Any article suggestions for understanding the tradeoffs of string search algorithms if I need to go down that road?
Well, I just wrote a quick benchmark for you and was surprised to find that you could probably get away with using String.indexOf(). I get about 35ms per search, which is about 30 searches per second.
EDIT: a better benchmark. There appears to be some sort of initialization delay, but it looks like indexOf is pretty fast. You could play around with the benchmark and see if it looks like it will work for you.
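The benchmark above is JavaScript; as an analogous quick measurement in Python (just to show the shape of such a test, with an arbitrary vocabulary and word count), you can time a plain built-in substring search over a large haystack and see whether it is fast enough before reaching for fancier string-search algorithms:

```python
import random
import time

def build_text(n_words, seed=0):
    """Build a deterministic haystack of n_words space-separated words."""
    rng = random.Random(seed)
    vocab = ["alpha", "beta", "gamma", "delta", "lorem", "ipsum"]
    return " ".join(rng.choice(vocab) for _ in range(n_words))

def timed_search(haystack, needle):
    """Return (position, elapsed_ms) for one str.find over the haystack."""
    start = time.perf_counter()
    pos = haystack.find(needle)
    return pos, (time.perf_counter() - start) * 1000
```

Usage: `text = build_text(500_000) + " needlephrase"` then `pos, ms = timed_search(text, "needlephrase")`; a worst-case miss (`timed_search(text, "zzz")`, which must scan the whole string) gives the upper bound on per-search cost.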

WebApp that communicates using only json objects?

Hey everyone, I've been thinking about how the majority of web apps work at the moment. If, as an example, the backend is written in Java/PHP/Python, what you probably see is the backend "echoing/printing" ready-made HTML to the browser, right?
For web apps that work almost exclusively through AJAX, is there a reason not to simply communicate without HTML, for example by passing JSON objects back and forth between the server and client? Instead of "printing or echoing" HTML in our backend script/app, we simply echo the JSON string; AJAX fetches it and converts the JSON string to an object with all of our attributes/arrays and so on.
Surely this way we have fewer characters to send, no HTML tags and so on, and on the client side we simply use frameworks such as jQuery to create/format our HTML there instead of printing and echoing the HTML in the server scripts?
Perhaps people already do this, but I have not really seen a lot of apps work this way.
The reason I want to do this is that I would like to separate the presentation and logic layers more than they currently are, so instead of "echoing" HTML in my Java/PHP I just "echo" JSON objects, and JavaScript takes care of the whole presentation layer. Is there something fundamentally wrong with this? What are your opinions?
Thanks again Stackoverflow.
There are quite a few apps that work this way (simply communicating via AJAX using JSON objects rather than sending markup).
I've worked on a few and it has its advantages.
In some cases though (like when working with large result sets) it makes more sense to render the markup on the server side and send it to the browser. That way, you're not relying on JavaScript/DOM Manipulation to create a large document (which, depending on the browser, would perform poorly).
This is a very sensible approach, and is actually used in some of our applications in production.
The main weakness of the approach is that it increases the resource load on the browser and therefore might, in light of browsers' often already-sluggish JS performance, lead to a worse user experience unless the presentation-layer mechanics are very well tuned.
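The server side of the exchange being discussed is tiny in any language; here is a sketch using only Python's standard library (the payload contents are made up), where the endpoint echoes nothing but JSON and leaves all markup to the client's JavaScript:

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class JsonHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Instead of echoing rendered HTML, echo only the data; the client
        # builds its own markup from it.
        payload = json.dumps({"user": "alice", "unread": 3})
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload.encode())

    def log_message(self, *args):
        pass  # keep the demo quiet
```

On the client, the matching AJAX call just parses the response and templates it into the page, which is exactly the presentation/logic split the question is after.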
Nowadays many web apps use this approach, including Gmail, Facebook, and other big applications.
The main advantage of this approach is that the user doesn't need to refresh the whole page and still gets what you want to show them.
But you may have to build both versions, AJAX and a normal page refresh, for the case where the user reloads the page.
You can use jQuery templates, which generate HTML, and also Google's Closure library, which is used by Gmail and other Google products.