I'm trying to retrieve text that is loaded dynamically onto a web page, using Go.
The text to retrieve is on this page:
https://www.protectedtext.com/testretrieve?1234
This text is encrypted with a password, decrypted on the client side, and loaded dynamically into the page.
I already tried goquery, selecting the 'textarea' element, but I can't get the text because it's loaded dynamically.
How can I achieve this? By executing JS in Go? It works in my Chrome console, but I have no idea how to do that in Go.
A lightweight solution would be best for my project. Alternatively, is there another website that can store and edit the same text without modifying the URL?
You may need a headless browser to execute the JavaScript, for example phantomgo.
However, looking at the page source we can see that they use SHA-512 for the tab title and AES for the textarea field.
The page you shared, https://www.protectedtext.com/testretrieve?1234, contains only one element with the class textarea-contents, so simply select that class with goquery and take the 0th match.
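To make the selection step concrete, here is a minimal sketch (written in Python purely for illustration; in Go it corresponds to goquery's `doc.Find(".textarea-contents").Eq(0)`). The sample HTML below is made up: on the real page the textarea is only filled after client-side AES decryption, so you would first have to render the page with a headless browser and then parse the rendered HTML.

```python
from html.parser import HTMLParser

class ClassTextExtractor(HTMLParser):
    """Collects the text of the first element carrying a given class."""
    def __init__(self, cls):
        super().__init__()
        self.cls = cls
        self.depth = 0      # > 0 while we are inside the matched element
        self.done = False
        self.text = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1
        elif not self.done and self.cls in (dict(attrs).get("class") or "").split():
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.text.append(data)

# Made-up stand-in for the *rendered* (post-decryption) page
sample = '<div class="textarea-contents">hello world</div>'
p = ClassTextExtractor("textarea-contents")
p.feed(sample)
print("".join(p.text))  # hello world
```

The sketch ignores edge cases like unclosed void tags; it only illustrates "select the first element of that class and read its text", which is exactly the 0th-match step described above.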
My company uses the TinyMCE editor for a content-editing feature.
Problem: when content is saved (as a bulk HTML string) from one browser, say Chrome, and then viewed in Firefox, the order of attributes changes, as you can see in these differences in HTML between Chrome & Firefox.
Our problem is based on content: if the content changes, the business changes as well.
But in this case the user doesn't change the content, the browser does.
Scenario
- tinymce is loaded inside a popup
- user edits content & closes the editor popup
- we render edited HTML in a div element (part of a form)
- part of server-side form validation is checking for content (HTML) changes
- using C# to compare saved vs edited HTML content as two strings
Do you have any ideas on how to find the actual changes, or could you provide us with a hint about how to solve this?
The reason this occurs is how the HTML content is parsed by the browser. At this point in time, TinyMCE cannot guarantee the order of the attributes.
To resolve the issue for your use case, I would suggest parsing the HTML on the server (preferably) or client side before storing the data.
Depending on the technology stack you're on, there is a range of HTML parsers written in languages from PHP and Java to Ruby. Prettier is one of the "go to" formatters these days - unfortunately there are no ASP.NET options as far as I can see.
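To illustrate the normalize-before-compare idea, here is a rough sketch using Python's standard html.parser (Python purely for illustration; the same approach can be ported to the C# side of your form validation with whatever parser your stack offers). Both strings are re-serialized with attributes in sorted order, so documents that differ only in browser-imposed attribute order compare equal.

```python
from html.parser import HTMLParser

class Normalizer(HTMLParser):
    """Re-emits markup with attributes sorted by name."""
    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        # sort attributes by name; bare attributes get an empty value
        parts = "".join(f' {k}="{v or ""}"' for k, v in sorted(attrs))
        self.out.append(f"<{tag}{parts}>")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

def normalize(html):
    n = Normalizer()
    n.feed(html)
    return "".join(n.out)

# Same content, different browser-imposed attribute order:
chrome  = '<p style="color:red" class="note">hi</p>'
firefox = '<p class="note" style="color:red">hi</p>'
assert normalize(chrome) == normalize(firefox)
```

This is a minimal sketch, not a full HTML canonicalizer (it doesn't handle entity or whitespace differences), but it shows why comparing normalized output instead of raw strings sidesteps the attribute-ordering problem.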
For my daily work, I'm trying to create a Python script that can fill in different forms on different websites.
Here is the thing: on some websites, I'm not able to catch the form elements with Selenium. E.g. on this website: https://econnect.bpcl.in/selfservice-ext/pub/login.html When I inspect the page with Chrome, the "User id" input box has the id "principal", but Selenium is not able to get it. And when I display the HTML code of the page, the form looks like it is included from another page or something.
I tried find_element_by_id, by name, by CSS selector, etc. I also tried waiting for the page to be fully loaded using WebDriverWait(driver, 5), but it still does not work.
I also tried driver.execute_async_script and driver.execute_script.
Do you have any solutions or suggestions ?
PS: Even with JavaScript, I'm not able to get this element by id.
Thanks
You can try to use XPath. To locate the element you want to fill, I'm using a Chrome extension: https://chrome.google.com/webstore/detail/xpath-finder/ihnknokegkbpmofmafnkoadfjkhlogph?hl=en
Then you can use:
driver.find_element_by_xpath("X-PATH").send_keys("...")
If you don't want to download a Chrome extension, you can try to read the HTML source directly, but I doubt that will work, since you said it wasn't working for you...
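As a quick sanity check of the XPath expression itself, here is a sketch using Python's standard xml.etree (the form snippet below is a made-up stand-in for the login page, not its real markup); with Selenium, the same expression would go into driver.find_element_by_xpath(...):

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the login form markup
doc = ET.fromstring(
    '<form><label>User id</label>'
    '<input id="principal" name="principal" type="text" /></form>'
)

# ElementTree supports a small XPath subset, enough to try out
# an expression like this without a live browser:
field = doc.find(".//input[@id='principal']")
print(field.get("name"))  # principal
```

One caveat: if even a correct XPath fails in Selenium, the element may live inside an iframe loaded from another page, in which case Selenium has to switch into that frame (driver.switch_to.frame(...)) before any locator, XPath included, can see it.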
I am using the jquery.gdocsviewer.min.js plugin to display Office and PDF documents on a website. The plugin is working fine. I am trying to get the content of the generated preview using jQuery:
var rowcontent = $('.embed').html(); // 'embed' is the class of the link
But I can't get this to work. Check this working fiddle:
http://jsfiddle.net/4s8bn/133/
Please advise me on what to do, or whether this is practical at all. I want to fetch the content and save it as HTML in a database.
On the client side, it is not possible to get that PDF directly and save it as HTML in a database. You have to do it on the server side; for the server side, here is an example with PHP. Once you get it on the server side, you can fetch it with Ajax on the client side.
Try:
var rowcontent = $('.gdocsviewer').html();
From the docs:
The plugin inserts the IFRAME for the viewer inside an injected DIV.
The DIV tags all carry the class name "gdocsviewer", which allows for
styling all the gdocsViewer instances via CSS. If the anchor tag has
the ID attribute defined, then the injected DIV tag is also given an ID
attribute in the format ID_of_Anchor + '-gdocsviewer'. See the demo
source code for more details.
I want to try and write some code that will guess the default IP of the router you are currently connected to. To do this, I would write a bit of JavaScript code that would type into the Google Chrome URL bar and attempt to load each address. For instance: it would type 192.168.0.0, then 192.168.0.1, etc. Currently my largest problem is that I have no idea how to write code that would locate and type into the URL bar; I could do it with any other user input. How would I do this?
The URL bar is not part of the window, so you can't locate it with JavaScript the way you would a DOM element.
To read and write the URL you can use window.location (more here: https://developer.mozilla.org/en-US/docs/Web/API/Window/location).
If you change window.location to another address, it will force the browser to load the new content, the same way as reloading a page. Remember that loading a new page will probably discard your JavaScript code if it was loaded from within the website.
Another way is to use an iframe and dynamically change its src, like here: dynamically set iframe src
If you want to make a tool that iterates through possible addresses, I would recommend writing it as a Chrome extension. More about that, including a tutorial, is available here: https://developer.chrome.com/extensions
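The iteration step itself is language-agnostic; here is a sketch in Python (in an extension it would be a JavaScript loop assigning each candidate address to an iframe's src). The two subnets are just the most common home-router defaults, and the /30 masks keep the demo list short:

```python
import ipaddress

# Build candidate gateway URLs from common private subnets.
# Real code would use the full /24 ranges; /30 keeps this demo tiny.
candidates = [
    f"http://{host}/"
    for net in ("192.168.0.0/30", "192.168.1.0/30")
    for host in ipaddress.ip_network(net).hosts()
]
print(candidates[0])  # http://192.168.0.1/
```

In the extension, each candidate would then be tried in turn (e.g. by pointing an iframe at it and watching which one loads).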
If a whole bunch of elements is generated in my browser by JavaScript (from JSON data or just out of thin air), I am not able to fully archive such a page by saving its source. I already tried saving it as an .mht file in IE, but that does not work - IE does not save the dynamically generated elements either.
An example of such a page is http://www.amazon.com/gp/bestsellers/wireless/ref=zg_bs_nav - notice that the "price" and "X new" elements do not exist in the source HTML but are generated dynamically.
If I wanted to parse this, I could work directly with the DOM by various means, yadda-yadda. But if I want to automagically save the page as an HTML document, such that all the dynamically generated elements render nicely even with JavaScript turned off, so far I am SOL.
Any suggestions?
In Firefox there's the Web Developer extension: https://addons.mozilla.org/en-US/firefox/addon/web-developer/
Once installed you can use View Source -> View Generated Source to access the JavaScript-modified HTML.