Is there a way to get current HTML from browser in Python? - javascript

I am currently working on an HTML presentation. It works well, but I need the presentation to be followed simultaneously by a NAO robot that reads a special HTML tag. I somehow need to let him know which slide I am on, so that he can choose the correct tag.
I use Beautiful Soup for scraping the HTML, but it does so from a file and not from a browser. The problem is that JavaScript running in the background assigns various classes to specific slides, and those classes indicate the current state of the presentation. I need to be able to access them, but they are not present in the default state of the presentation and are added asynchronously as the presentation runs.
Hopefully, my request is clear.
Thank you for your time

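http://www.seleniumhq.org/ (probably webdriver) is your friend. Initialize a browser and read the document in its current state: with Selenium's Python WebDriver that is driver.page_source (browser.html is the equivalent if you use the Splinter wrapper).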

There's wget on the robot, you could use it... (though I'm not sure I understand where the problem really is...)

Related

What's the ultimate and definitive way to clear the browser cache?

I'm using Liferay CMS as part of my Uni course in full stack development and, as a final project, I have to use the d3.js library to display some graphs. I'm struggling to clear the browser cache though, and that makes the development process very tedious and time-consuming: I'd like to see my front-end changes right away without having to fiddle with the browser cache, especially because, as I'm working with SVG elements, it sometimes gets tricky to line things up. Sometimes clearing the cache works and sometimes it doesn't, and the same goes for opening a new private window, but there must be a conclusive and foolproof method to delete all cached elements. Does somebody know how to do that?
Liferay has a "Developer Mode" which should bypass quite a lot of caching anyway. In your portal-ext.properties (typically in ${liferay.home}), just add the line
include-and-override=portal-developer.properties
to activate this mode.
It will also skip minifiers and concatenation of all of the different resources that you're loading.
This doesn't clear caches but will solve your updating problem.
In the HTML, add an (unused) query string to the html link to linked files and alter it each time you make an update to the file. e.g. for css:
<link rel="stylesheet" href="styles.css?a">
Then, each time you make changes to the file pointed to, change the 'a' to 'b' or anything (Don't change the linked file's name, the query string will be ignored).
This forces the browser to 'change' the linked file each time the href changes and so the altered file gets reloaded.
The method will work for script and other linked files. The query string could be something meaningful such as version numbers - ?v1, but anything will do.
Edit: as noted by @GerardoFurtado, a further discussion of this idea is available here: Cache busting via params
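If editing the query string by hand gets tedious, the same idea can be scripted. The following is only a sketch: the helper name withCacheBuster and the parameter name v are made up for illustration, and it assumes an environment that provides URLSearchParams (current browsers and Node.js do).

```javascript
// Hypothetical helper: set a "v" query parameter on a link so the browser
// treats each new version as a different URL and fetches the file again.
function withCacheBuster(href, version) {
  // Split off a "#fragment" so it stays at the very end of the result.
  const hashIndex = href.indexOf("#");
  const fragment = hashIndex === -1 ? "" : href.slice(hashIndex);
  const withoutFragment = hashIndex === -1 ? href : href.slice(0, hashIndex);

  // Separate the path from any existing query string.
  const queryIndex = withoutFragment.indexOf("?");
  const path = queryIndex === -1 ? withoutFragment : withoutFragment.slice(0, queryIndex);
  const query = queryIndex === -1 ? "" : withoutFragment.slice(queryIndex + 1);

  // Add the version, replacing an earlier "v" value if one is present.
  const params = new URLSearchParams(query);
  params.set("v", String(version));
  return path + "?" + params.toString() + fragment;
}

console.log(withCacheBuster("styles.css", 2));     // styles.css?v=2
console.log(withCacheBuster("styles.css?v=1", 2)); // styles.css?v=2
```

During development, passing something like Date.now() as the version gives a fresh URL on every run, at the cost of never reusing the cached copy.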

Use Site Property in eSpace Javascript

I included Google Analytics (JavaScript) in my OutSystems website via the eSpace JavaScript. Now I want to place the Analytics Key in my Site Properties so I can update it easily for every environment.
How can I use a Site Property in my Javascript?
You can create a site property to store the Tracking ID.
[Screenshot: site property]
Second, you need to create a web block with an unescaped expression, and add your JavaScript this way:
[Screenshot: web block expression]
Finally, you just need to drag your web block onto each web page you want to track.
cheers,
Vera
As far as I know, you cannot use Site Properties in the eSpace JavaScript window. Instead, you have to use an unescaped expression on a web screen or web block to add your JavaScript code together with the Site Property value.
Since you want the same script on all the web screens, I suggest that you add this expression in the Footer web block, so that it will be automatically added to all the web screens you create.
I can understand your use case. If I read it correctly, you're trying to use some JavaScript in one espace, that would be run in every page load, something like an
onLoad(function(){
// your Google Analytics code, but using the value from the site property
})
And in this way, you would be able to update the site property without the need to republish all consumers. Seems like a nice approach :)
One way to achieve this would be to have your JavaScript request the key from the server on the fly, and maybe cache it.
This can be easier or harder depending on the Platform version you're running... But here's a simple way to achieve it.
Add the site property to the espace. Build a page that has no layout, and in the preparation, add a download widget that only downloads the value of your site property. In the same espace, in the espace JavaScript, add an AJAX request to the page I was referring to before, and when you get the response back, start your Google Analytics code.
To be able to use this in every other espace, and in every page, you still need to reference something from the Google Analytics espace though, so that espace JavaScript is run in every page
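A rough sketch of the client side of this approach is below. The names createKeyLoader and the URL /Analytics/TrackingKey.aspx are placeholders invented for this example, not part of the platform; the real URL is whatever page you built to return the site property, and the final callback is where your own Google Analytics code would go.

```javascript
// Wrap a function that fetches the key so the request is made only once;
// later calls reuse the same promise (a simple in-memory cache).
function createKeyLoader(fetchText) {
  let pending = null;
  return function loadKey() {
    if (pending === null) {
      pending = fetchText().then(function (text) {
        return text.trim(); // drop whitespace around the returned value
      });
    }
    return pending;
  };
}

// Browser-only wiring, skipped when there is no window object.
if (typeof window !== "undefined" && typeof fetch === "function") {
  // Placeholder URL: the layout-less page that returns only the site property.
  const loadKey = createKeyLoader(function () {
    return fetch("/Analytics/TrackingKey.aspx").then(function (response) {
      return response.text();
    });
  });
  loadKey().then(function (trackingId) {
    // Placeholder: start your Google Analytics code with trackingId here.
    console.log("Tracking ID loaded:", trackingId);
  });
}
```

Note that this caches a failed request as well; a production version would probably want to reset the cache when the promise rejects.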
Hope it helps :)

Code Processing (Client Side instead of on Server)

I want to create a development tool into which I can input code (such as HTML, CSS and JS) and which will create a preview/result window (like JSFiddle). I will be using it for tutorials in school and need a unique site to do this from (I would love to use CodePen, JSFiddle or Codecademy... but I can't).
I am able to generate a form that can be processed and shown in an iframe (through PHP, where it simply echoes the information into a new HTML file that is shown in the iframe). But this came with problems; I only have a cheap server and don't want to put too much pressure on it, so I need to do this through JS/jQuery.
Firstly, is this possible? And how would I go about doing it (code examples would be great!)?
Thanks in advance (I apologise if I haven't given enough detail, but I'm fairly new to this and may just be asking a pointless question (I'm only 15 :/ ) )
Cheers :)
There is a rather impressive project called php.js that will let you parse and execute a subset of PHP code in the browser.
If you want to do it completely on the client, in the browser, like JSFiddle does, then you need two or more frames.
One is for your code and one is for the output.
When you click on "run", you need to apply your code to the output frame. You can do this by accessing the document object of the frame. Once you have it, you'll need to inject your code there. There are many examples on the web of how to access a child document object from a frame/iframe.
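As a minimal sketch of that idea (not how JSFiddle itself is implemented), the snippet below combines the three inputs into one HTML document and writes it into the output iframe's document. The function name buildPreviewDocument and the element ids run, preview, code-html, code-css and code-js are assumptions about your page, made up for this example.

```javascript
// Combine the user's HTML, CSS and JS into one HTML document string.
function buildPreviewDocument(parts) {
  const html = parts.html || "";
  const css = parts.css || "";
  // Keep a literal "</script>" in the user's JS from ending the script tag early.
  const js = (parts.js || "").replace(/<\/script/gi, "<\\/script");
  return [
    "<!DOCTYPE html>",
    "<html>",
    "<head><style>" + css + "</style></head>",
    "<body>",
    html,
    // Split so this file can itself be pasted inside a <script> tag.
    "<script>" + js + "</" + "script>",
    "</body>",
    "</html>"
  ].join("\n");
}

// Browser-only part; the element ids are assumptions about your page.
if (typeof document !== "undefined") {
  const runButton = document.getElementById("run");
  const frame = document.getElementById("preview"); // the output <iframe>
  if (runButton && frame) {
    runButton.addEventListener("click", function () {
      const frameDoc = frame.contentDocument; // document object of the frame
      frameDoc.open();
      frameDoc.write(buildPreviewDocument({
        html: document.getElementById("code-html").value,
        css: document.getElementById("code-css").value,
        js: document.getElementById("code-js").value
      }));
      frameDoc.close();
    });
  }
}
```

Everything here runs in the browser, so the server only has to deliver the static page once.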

using document.write in remotely loaded javascript to write out content - why a bad idea?

I'm not a full-time Javascript developer. We have a web app and one piece is to write out a small informational widget onto another domain. This literally is just a html table with some values written out into it. I have had to do this a couple of times over the past 8 years and I always end up doing it via a script that just document.write's out the table.
For example:
document.write('<table border="1"><tr><td>here is some content</td></tr></table>');
on theirdomain.com
<body>
....
<script src='http://ourdomain.com/arc/v1/api/inventory/1' type='text/javascript'></script>
.....
</body>
I always think this is a bit ugly but it works fine and we always have control over the content (or a trusted representative has control, for example over your current inventory). So another project like this came up and I coded it up in about 5 minutes using document.write. Somebody else thinks this is just too ugly but I don't see what the problem is. Regarding the widget aspect, I have also done iframe and JSONP implementations, but iframes tend not to play well with other sites' CSS and JSONP tends to be too much. Is there some security element I'm missing? Or is what I'm doing OK? What would be the strongest argument against using this technique? Is there a best practice I don't get?
To be honest, I don't really see a problem. Yes, document.write is very old-school, but it is simple and universally supported; you can depend on it working the same in every browser.
For your application (writing out a HTML table with some data), I don't think a more complex solution is necessary if you're willing to assume a few small risks. Dealing with DOM mutation that works correctly across browsers is not an easy thing to get right if you're not using jQuery (et al).
The risks of document.write:
Your script must be loaded synchronously. This means a plain script tag without async or defer (like the one you're already using). However, if someone gets clever and adds the async or defer attributes to your script tag (or does something fancy like appending a dynamically created script element to the head), your script will be loaded asynchronously.
This means that when your script eventually loads and calls write, the main document may have already finished loading and the document is "closed". Calling write on a closed document implicitly calls open, which completely clears the DOM; it's essentially the same as wiping the page clean and starting from scratch. You don't want that.
Because your script is loaded synchronously, you put third-party pages at the mercy of your server. If your server goes down or gets overloaded and responds slowly, every page that contains your script tag cannot finish loading until your server does respond or the browser times out the request.
The people who put your widget on their website will not be happy.
If you're confident in your uptime, then there's really no reason to change what you're doing.
The alternative is to load your script asynchronously and insert your table into the correct spot in the DOM. This means third parties would have to insert a script snippet (either <script async src="..."> or the dynamic script tag insertion trick). They would also need to carve out a special <div id="tablegoeshere"> for you to put your table into.
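A minimal sketch of what the asynchronously loaded script could look like follows. The id tablegoeshere comes from the paragraph above; the function names escapeHtml and renderTable and the sample row are illustrative only, and the table markup simply mirrors the one in the question.

```javascript
// Escape text so data values cannot inject markup into the host page.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the table markup from an array of rows (each row an array of cells).
function renderTable(rows) {
  const body = rows
    .map(function (row) {
      const cells = row
        .map(function (cell) { return "<td>" + escapeHtml(cell) + "</td>"; })
        .join("");
      return "<tr>" + cells + "</tr>";
    })
    .join("");
  return '<table border="1">' + body + "</table>";
}

// Browser-only part: fill the placeholder the third party added to its page.
if (typeof document !== "undefined") {
  const insertTable = function () {
    const target = document.getElementById("tablegoeshere");
    if (target) {
      target.innerHTML = renderTable([["here is some content"]]);
    }
  };
  // An async script may run before the placeholder div has been parsed.
  if (document.readyState === "loading") {
    document.addEventListener("DOMContentLoaded", insertTable);
  } else {
    insertTable();
  }
}
```

Because nothing is written with document.write, the script works the same whether it is loaded with async, defer, or a dynamically inserted script element.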
Calling document.write() after the entire DOM has loaded does not let you access the existing DOM any further.
See "Why do I need to use document.write instead of DOM manipulation methods?"
In that case you are giving up a very powerful piece of web page functionality...
Is there some security element I'm missing?
The security risk is on their side: theirdomain.com is trusting your domain's script code not to do anything malicious. Your client script will run in the context of their domain and can do what it likes, such as stealing cookies or embedding a key logger (not that you would do that, of course). As long as they trust you, that is fine.

Obfuscate file names in webpage

I'm creating a web-application which will be taking survey-type data.
Users are presented with several files and asked a question. The user, in the hope of not skewing data, must not be able to know the file name of the file.
An empty div is created for a JPlayer instance to sit in, and I have added the "location" attribute to the div, so while setting up the JPlayer instance on the client side the JPlayer knows what .wav to play
<div id="jquery_jplayer" class="jp-jplayer" location="sound.wav"></div>
Here is part of the JavaScript which sets up the sounds to be played; it's easy to see that the file location is simply taken from the div
$("#jquery_jplayer").jPlayer("setMedia", {
wav: $(this).attr("location")
});
Basically, the intention is to hide "sound.wav" from the HTML document and keep the javascript dynamic.
A translation file between obfuscated and deobfuscated could be possible but it would be nice to keep this dynamic.
If you want to truly hide logic from your viewers, then you need to do it server-side rather than with client-side javascript. You can "complicate" the dissection of what is happening in the client-side code, but you cannot truly hide it.
If you want further help with the obfuscation, you'll have to describe better what you're really trying to do. The current description doesn't seem to offer enough information. What is this file path? What is it being used for? Why do you need to hide it?
If what you really want is just a Javascript function to obfuscate and de-obfuscate the sound filename, you can find lots of options with Google depending upon how elaborate you want to get. My guess here is that the determined cheat won't be fooled (since all the code is there for deobfuscating) so all you're really trying to do is make it non-obvious at first glance. Thus, any simple algorithm will do.
Since you're already using jQuery, here's a jQuery plugin that does simple string obfuscation: http://plugins.jquery.com/project/RotationalStringObfuscator. You'd have to run the obfuscator yourself in some sort of test app to record what the server should set each filename to and then do the reverse in the client when you want to actually use the filename.
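To illustrate the "any simple algorithm will do" point without a plugin, here is a plain ROT13 sketch. It is not the API of the plugin linked above; the function name rot13 is just a label for this example, and it only rotates ASCII letters, leaving digits and punctuation as they are.

```javascript
// Rotate each ASCII letter 13 places; applying it twice restores the input,
// so the same function both obfuscates and de-obfuscates.
function rot13(text) {
  return text.replace(/[a-z]/gi, function (ch) {
    const base = ch <= "Z" ? 65 : 97; // char code of "A" or "a"
    return String.fromCharCode(((ch.charCodeAt(0) - base + 13) % 26) + base);
  });
}

console.log(rot13("sound.wav"));        // fbhaq.jni
console.log(rot13(rot13("sound.wav"))); // sound.wav
```

The server would write location="fbhaq.jni" into the div, and the client would pass rot13($(this).attr("location")) to setMedia. As said above, this only makes the name non-obvious at first glance; anyone who reads the script can reverse it.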
If you ask me, a better solution would be to give the filenames non-meaningful names from the beginning. This would be names like 395678264.wav and just use them that way (on both server and client). Then, the name is meaningless to anyone snooping. No deobfuscation or translation table is required because this is the real filename.
