Edit: While trying to find the solution, the topic has changed a bit - TL;DR: See the answer. For all interested in the development of the topic, go on ;-)
The title might be confusing, but I don't know how to put it in better words - let's say I have a simple view function:
@app.route('/chart')
def chart():
    return render_template('chart.html')
If I now open this view directly in the browser, via "http://localhost:5000/chart", it works perfectly fine - the JavaScript works as expected (drawing a chart, using a slider to call different views, etc.).
However, using a link from another template does not work, e.g. I have a root view:
@app.route('/')
def index():
    return render_template('index.html')
and inside of this index.html a link directed to '/chart':
<a href="/chart">Chart</a>
Now, when I click that link, my template loads and I can see the basic HTML, but the JavaScript code doesn't load or start at all (even though the source code received by the browser is correct). In both scenarios the server receives the same request:
127.0.0.1 - - [13/May/2015 10:48:18] "GET /chart HTTP/1.1" 200 -
But only if I open it directly in the browser it works as expected.
Is there something wrong with my link/redirect?
Edit: I found the one single line which causes the problems: In my index.html I'm loading jQuery mobile:
<script src="http://code.jquery.com/mobile/1.4.5/jquery.mobile-1.4.5.min.js"></script>
When I remove this line, everything works as expected - but of course I do want to have it included. Why and how can this influence the link behavior?
By default, jQuery Mobile uses Ajax to handle page requests. This allows for smooth transitions between pages (without the page refresh feel), but it also means that only the body of the requested page is loaded, and therefore none of the JavaScript libraries included in its header.
Therefore all necessary JavaScript has to be included in the header of the original page, or the page request has to be made without Ajax by adding the data-ajax="false" attribute to the link. This does a real page load, without the Ajax transition.
For more information read this.
Related
So, I have a project where I need to get the photos from a profile.
I am able to navigate to the photos page of a profile, but I believe the JavaScript is not loading.
I am currently using HtmlUnit but if you know of another Java API that would help I'm all ears.
Basically, when I view Facebook in a normal browser, it will load all of the pages and I can inspect the elements.
When inspecting, there is a div called fbStarGrid and a few other modifiers. This div contains all the images for a user's profile.
When I use HTMLUnit, I cannot find the div. I had it print the full page XML to a file, and I found that the div is commented out. I believe this means the Javascript never ran to load the content.
After browsing a lot of javascript help on SO, I have found a few things that help with debugging but can't seem to fix the problem.
The first thing I've done is create an instance of a JavaScriptJobManager. I used it to see how many JavaScript jobs have not completed. After waiting for a while (10+ seconds) it says there are still 3 JS jobs incomplete. After a very long time (about 60 seconds), it says there are 2 JS jobs incomplete.
I do not know what is hanging with those JS jobs.
I get a warning upon page load about application/ld+json not running but I do not believe that part of the website is related to the photos.
Is there something I can do to force the JS to run? Is there a job it's stuck on and won't proceed to the next job?
I've also wondered if it's an issue with the page not re-syncing.
I've tried two solutions related to this:
Setting the AjaxController to NicelyResynchronizingAjaxController()
webClient.setAjaxController(new NicelyResynchronizingAjaxController());
And someone suggested creating a custom controller that forces syncing.
webClient.setAjaxController(new AjaxController() {
    @Override
    public boolean processSynchron(HtmlPage page, WebRequest request, boolean async) {
        return true;
    }
});
Neither of these seemed to affect the page.
If HTMLUnit is not the right library for the job, any other ideas? I need this to be headless/guiless to run on a linux server. Java is preferred, but I can switch languages if necessary.
I'm currently working on a project to track products from several websites. I use a python scraper to retrieve all the URLs related to the listed products, and later, regularly check if these URLs are still active.
To do so I use the Python requests module, run a get request and look at the response's status code. Usually I get 200, 301, 302 or 404 as expected, except in the following case:
http://www.sephora.fr/Parfum/Parfum-Femme/Totem-Orange-Eau-de-Toilette/P2232006
This product has been removed and while opening the link (sorry it's in French), I am briefly shown a placeholder page saying the product is not available anymore and then redirected to the home page (www.sephora.fr).
Oddly, Python still returns a 200 status code and so do various redirect tracers such as wheregoes.com or redirectdetective.com. The worst part is that the response URL still is the original, so I can't even trace it that way.
When analyzing with Chrome DevTools and preserving the logs, I see that at some point the page is reloaded. However I'm unable to find out where.
I'm guessing this is done client-side via Javascript, but I'm not quite sure how. Furthermore, I'd really need to be able to detect this change from within Python.
As a reference, here's a link to a working product:
http://www.sephora.fr/Parfum/Parfum-Femme/Kenzo-Jeu-d-Amour-Eau-de-Parfum/P1894014
Any leads?
Thank you !
Ludwig
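The page has a meta tag that tells the browser to redirect to the root URL. Because this is carried out by the browser rather than by an HTTP redirect, the response for the original URL still has status 200: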
<meta http-equiv="refresh" content="0; URL=/" />
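To detect this from within Python, one option is to look for such a tag in the response body. The following is only a minimal sketch using the standard library; the names MetaRefreshParser and find_meta_refresh are made up for this example, and it assumes the content attribute has the usual "delay; URL=target" form.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class MetaRefreshParser(HTMLParser):
    """Remembers the target of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attrs = dict(attrs)
        if (attrs.get("http-equiv") or "").lower() != "refresh":
            return
        # content looks like "0; URL=/": a delay, then an optional URL part
        _, _, rest = (attrs.get("content") or "").partition(";")
        rest = rest.strip()
        if rest.lower().startswith("url="):
            self.target = rest[4:].strip().strip("'\"")


def find_meta_refresh(html, base_url):
    """Return the absolute refresh target, or None if the page has none."""
    parser = MetaRefreshParser()
    parser.feed(html)
    if parser.target is None:
        return None
    return urljoin(base_url, parser.target)
```

With requests, this could be called as find_meta_refresh(response.text, response.url); a result other than None would then mean the page redirects client-side even though the status code is 200.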
Please, help me!
Can you explain to me how to reload part of the page AND change the URL, but without using a hash tag?
some code:
URL: mysite.com/first/
<html>
<body>
<h1>RELOAD!</h1>
<div>few words...</div>
<input type="button">
</body>
</html>
I want to change it to:
URL: mysite.com/second/
<html>
<body>
<h1>RELOAD!</h1>
<div>other few words...</div>
<input type="button">
</body>
</html>
How can I replace only the content of the DIV with other content?
I have seen some examples, like this one:
if (location.href.indexOf("#") > -1)
    location.assign(location.href.replace('#', "/"));
BUT it reloads the whole page!
I cannot use .htaccess, because the "#" character and the text after it are not sent to the server.
Also I have seen code:
history.pushState(null, null, '#myhash');
if (history.pushState) {
    history.pushState(null, null, '#myhash');
}
else {
    location.hash = '#myhash';
}
but I cannot understand it properly.
Maybe there is another, proper way to do it.
There are actually two different problems here:
How to load content from the server and display it in the existing page, without reloading the whole page
How to make it look like you are on a new URL, without reloading the page
Neither of these has anything to do with .htaccess (by which is generally meant Apache's mod_rewrite module) because they are both about how the client is behaving, not the server.
The first part is generally referred to as "AJAX", about which you will find tons of information online. The "X" originally stood for "XML", but actually you can fetch whatever kind of data you want, such as plain text, or a piece of ready-made HTML, and use JavaScript to put it into place on your page. The popular jQuery library has a method called .load(), which makes a request to the server, and uses the response to replace a particular part of the page.
The second part is a little trickier - since the page hasn't actually been reloaded, you essentially want the browser to lie about the current URL. The reason you will see a lot of examples changing only the part of the URL after the # is precisely because that part isn't sent to the server; traditionally, it is used to scroll the current page to a particular "anchor". You can therefore change it as often as you like, and if the user bookmarks or shares your page, you can look at the part after the # and re-load the state they bookmarked/shared.
However, as part of the "HTML5" group of technologies, an ability was added to change the actual URL bar of the browser, by "pushing a state" to the history object. In other words, adding an entry to the back/forward menu of the browser, without actually loading a new page. There are obvious security restrictions (you can't pretend the user navigated to a completely different domain), but the API itself is quite simple. Here is the MDN documentation explaining it.
For your simple example, assuming jQuery has been included, you might do something like this:
// Find the div with a jQuery selector; this would be more specific in reality
jQuery('div')
    // Request some text from the server to replace the div
    .load(
        // This URL can be anything that generates the appropriate HTML
        '/ajax.php?mode=div-content&stage=second',
        // Add a callback function for when the AJAX call has finished
        function(responseText, textStatus, XMLHttpRequest) {
            // Inside the callback function, set the browser's URL bar
            // and history to pretend this is a new page
            history.pushState({}, '', '/second/');
        }
    );
Note that jQuery is far from the only way of doing this; it just keeps the example simple to make use of an existing function that does a lot of the work for us.
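One thing to keep in mind is that pushState only changes what the address bar shows. If the user reloads or bookmarks /second/, the server receives an ordinary request for that URL and has to answer with the whole page, while the Ajax call only needs the content of the div. A rough sketch of that server-side split, written in Python purely for illustration (FRAGMENTS, PAGE_TEMPLATE and render are hypothetical names, not part of any framework):

```python
# Hypothetical content store: one snippet of HTML per "stage".
FRAGMENTS = {
    "first": "few words...",
    "second": "other few words...",
}

PAGE_TEMPLATE = """<html>
<body>
<h1>RELOAD!</h1>
<div>{content}</div>
<input type="button">
</body>
</html>"""


def render(stage, ajax=False):
    """Return only the div content for Ajax calls, the whole page otherwise."""
    content = FRAGMENTS[stage]
    if ajax:
        return content
    return PAGE_TEMPLATE.format(content=content)
```

How the server decides that a request is an Ajax one is up to you; a query parameter such as mode=div-content in the example URL above would be one way to set the ajax flag.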
Ok, so all the rage these days is having a site like this:
mysite.com/
mysite.com/about
mysite.com/contact
But then if the user has Javascript enabled, then to have them browse those pages with Ajax:
mysite.com/#/
mysite.com/#/about
mysite.com/#/contact
That's all well and good. I have that all working perfectly well.
My question is, if the user arrives at "mysite.com/about", I want to automatically redirect them to "mysite.com/#/about" immediately if they have Javascript.
I have it working so if they arrive at "mysite.com/about", that page will load fine on its own (no redirects) and then all clicks after that load via ajax, but the pre-fragment URL doesn't change. e.g. if they arrive on "mysite.com/about" and then click "contact", the new URL will be "mysite.com/about#/contact". I really don't like that though, it's very ugly.
The only way I can think of to automatically redirect a user arriving at "mysite.com/about" to "mysite.com/#/about" is to have some javascript in the header that is ONLY run if the page is NOT being loaded via ajax. That code looks like this ($ = jQuery):
$(function(){
    if( !location.hash || location.hash.substr(1,1) != '/' ) {
        location.replace( location.protocol+'//'+location.hostname+'/#'+location.pathname+location.search );
    }
});
That technically works, but it causes some very strange behavior. For example, normally when you "view source" for a page that has some ajax content, that ajax content will not be in the source because you're viewing the original page's source. Well, when I view source after redirecting like this, then the code I see is ONLY the code that was loaded via Ajax - I've never seen anything like that before. This happens in both Firefox 3.6 and Chrome 6.0. I haven't verified it with other browsers yet but the fact that two browsers using completely different engines exhibit the same behavior indicates I am doing something bad (e.g. not just a bug with FF or Chrome).
So somehow the browser thinks the page I'm on "is" the Ajax page. I can continue to browse around and it works fine, but if I e.g. close Firefox and re-open it (and it re-opens the pages I was on), it only reloads the Ajax fragment of the page, and not the whole wrapper, until I do a manual refresh. (Chrome doesn't do this though, only Firefox). I've never seen anything like that.
I've tried using setTimeout so it does not do the redirect until after the page has fully loaded, but the same thing happens. Basically, as far as I can tell, this only works if the fragment is put there as the result of a user action (click), and not automatically.
So my question is - what's the best way to automatically redirect a Javascript capable browser from a "normal" URL to an Ajax URL? Anyone have experience doing this? I know there are sites that do this - e.g., http://rdio.com (a music site). No weirdness happens there, but I can't figure out how they're doing it.
Thanks for any advice.
This behavior is like the new twitter. If you type the URL:
http://twitter.com/dinizz
You will be redirected to:
http://twitter.com/#!/dinizz
I understand that this is done not with JavaScript but on the server side. I am looking for a way to implement this using Ruby on Rails.
I also suggest you take a look at this article: Making AJAX Applications Crawlable
I have a html page on my localhost - get_description.html.
The snippet below is part of the code:
<input type="text" id="url"/>
<button id="get_description_button">Get description</button>
<iframe id="description_container" src="#"/>
When the button is clicked, the src of the iframe is set to the URL entered in the textbox. The pages fetched this way are very big, with lots of linked files. What I am interested in on the page is a block of text contained in a <div id="description"> element.
Is there a way to mitigate downloading of resources linked in the page that loads into the iframe?
I don't want to use curl because the data is only available to logged in users and the steps to take with curl to get the content is too complicated. The iframe is simple as I use this on a box which sends the right cookies to identify the request as coming from a logged in user, but the problem is that it is very wasteful to get nearly 1 MB of data to keep 1 KB of it and throw out the rest.
Edit
If the proposed method just works in Firefox it is fine, so I added Firefox tag. Also, it is possible that the answer actually is from the realm of Firefox add-on techniques, so I added that tag as well.
The problem is not that I cannot get at what I'm looking for, rather, the problem is the easy iframe method is wasteful.
I know that Firefox does allow loading only the text of a page. If you open a page and press Ctrl+U, you are taken to the 'view page source' window. There, links behave as normal and are clickable; if you click on a link in source view, the source of the new page is loaded into the view source window without the linked resources being downloaded, which is exactly what I'm trying to get. But I don't know how to access this behaviour.
Another example is the Adblock add-on. It somehow kills elements before they get loaded. With plain JavaScript this is not possible, because it is only triggered too late to intervene in time.
The Same Origin Policy forbids any web page to access contents of any other web page in a different domain so basically you cannot do that.
However it seems that with some browsers it is allowed to access web pages content if you are trying to access it from a local web page which seems to be your case.
Safari and IE 6/7/8 are browsers that allow a local web page to do so via XMLHttpRequest (source: Google Browser Security Handbook), so you may want to use one of those browsers to do what you need (note that future versions of those browsers may no longer allow this).
Apart from this solution, I only see two possibilities:
If the web pages you need to fetch content from are somehow controlled by you, you can create a simpler interface to let other web pages to get the content you need (for example allowing JSONP requests).
If the web pages you need to fetch content from are not controlled by you, the only solution I see is to fetch the content server side, logging in from the server directly (I know that you don't want to do so, but I don't see any other possibility if the options I mentioned above are not practicable).
Hope it helps.
Actually I've seen Cross Domain jQuery .load request before, here: http://james.padolsey.com/javascript/cross-domain-requests-with-jquery/
The author claims that code like this, found on that page,
$('#container').load('http://google.com'); // SERIOUSLY!
$.ajax({
    url: 'http://news.bbc.co.uk',
    type: 'GET',
    success: function(res) {
        var headline = $(res.responseText).find('a.tsh').text();
        alert(headline);
    }
});
// Works with $.get too!
would work. (The BBC code might not work because of the recent redesign, but you get the idea)
Apparently it is using YQL wrapped into a jQuery plugin to do the trick. Now I cannot say I fully understand what he is doing there but it appears to work, and fits the bill. Once you load the data I suppose it is a simple matter of filtering out the data that you need.
If you prefer something that works at the browser level, may I suggest Mozilla's Jetpack framework for lightweight extensions. I've not yet read the documentation in its entirety, but it should contain the APIs needed for this to work.
There are various ways to go about this in AJAX, I'm going to show the jQuery way for brevity as one option, though you could do this in vanilla JavaScript as well.
Instead of an <iframe> you can just use a container, let's say a <div> like this:
<div id="description_container"></div>
Then to load it:
$(function() {
    $("#get_description_button").click(function() {
        $("#description_container").load($("input").val() + " #description");
    });
});
This uses the .load() method, which takes a string in this format: .load("url selector"). It then takes that element from the fetched page and places its content inside the container you're loading into, in this case #description_container.
This is just the jQuery route, mainly to illustrate that yes, you can do what you want. You don't have to do it exactly like this; the concept is getting what you want from an AJAX request rather than from an <iframe>.
Your description sounds like you are fetching pages from the same domain (you said that you need to be logged in and have session credentials), so have you tried to use an async request via XMLHttpRequest? It might complain if the HTML on a page is particularly messed up, but you should still be able to get the raw text via .responseText and extract what you need with a regex.
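If the raw HTML ends up being fetched by a script instead of the browser, a regex over nested markup tends to be brittle, and a small parser can do the extraction instead. Below is a minimal sketch in Python using only the standard library; ElementTextExtractor and extract_text are names invented for this example, and it assumes the markup inside the element is reasonably well nested.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change the depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ElementTextExtractor(HTMLParser):
    """Collects the text inside the first element with the given id."""

    def __init__(self, element_id):
        super().__init__()
        self.element_id = element_id
        self.depth = 0      # greater than 0 while inside the target element
        self.done = False   # set once the target element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.element_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.done or tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth and not self.done:
            self.chunks.append(data)


def extract_text(html, element_id="description"):
    """Return the stripped text of the element with that id, or ''."""
    parser = ElementTextExtractor(element_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()
```

This only covers the extraction step; it does not reduce what is downloaded for the page itself, it just avoids loading the linked resources.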