How does Twitter display my profile instantly? - javascript

Context
I noticed that on Twitter, the profile page is displayed in different ways depending on how it is reached:
By clicking the profile link in the menu, the DOM and the latest tweets are loaded, and the page is displayed in ~4 seconds. Every time.
By using the keyboard shortcut GP (or the link on the left), the page is displayed instantly.
Details
I noticed that the profile must have been displayed recently for GP to display the page instantly.
After closing and reopening the browser, the profile must be displayed again before GP will display the page instantly.
Investigation
So at first I thought Twitter might use a server-side session variable to store the data. Then I discovered a use of localStorage in the Twitter source code. I confess, DOM storage is unfamiliar to me and the Twitter JavaScript code is unreadable, so I am not sure they use localStorage to store the profile.
Question
Any hypotheses, information, or links about Twitter's DOM storage / session storage?
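
One quick way to check this for yourself is to inspect DOM storage from the browser's developer console. This is a generic DevTools snippet, not Twitter's own code:

    // List every localStorage entry the current site has written.
    // Run this in the developer console while on twitter.com.
    for (var i = 0; i < localStorage.length; i++) {
        var key = localStorage.key(i);
        // Print only the first 100 characters of each value to keep output short.
        console.log(key, String(localStorage.getItem(key)).slice(0, 100));
    }
    console.log(sessionStorage.length + ' sessionStorage entries');

If a full profile were being persisted, you would expect to see it among these entries.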

This is an interesting question, so I went to twitter, and did some investigation myself.
Clicking on my profile name, the navigation is done with AJAX: I can see my timeline being downloaded. But the page itself is already loaded in advance, so my profile information has already been downloaded too.
By clicking the link on the left, or by pressing GP, you just display the page that is already loaded (hidden, or kept in a JavaScript object, i.e. in memory). It simply shows your already-downloaded profile and fetches the feed (JSON) via AJAX.
The URL will change from https://twitter.com/#!/ to https://twitter.com/#!/wraldpyk (in my case).
When you click your profile in the menu (top right) you go to https://twitter.com/wraldpyk. This re-opens the page and downloads everything. Note that you get redirected to https://twitter.com/#!/wraldpyk, and meanwhile your timeline also gets downloaded (open Firebug and watch the images and feeds being downloaded).
As far as I can tell, no local storage (other than ordinary JavaScript objects in memory, as every site uses) is involved. All data is downloaded afresh on every new page load.
The same thing happens when you type GH while on your profile (and likewise with all the other shortcuts).
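
To illustrate the mechanism, here is a minimal sketch of hashbang navigation backed by an in-memory cache. This is not Twitter's actual code; pageCache, renderPage, fetchTimeline, and the /timeline/ endpoint are all hypothetical names:

    // Minimal sketch: pages cached in a plain JavaScript object and
    // swapped in on hash navigation, fetched via AJAX the first time.
    var pageCache = {};                      // route -> cached markup, memory only

    function renderPage(html) {              // hypothetical renderer
        document.getElementById('page-container').innerHTML = html;
    }

    function fetchTimeline(route, done) {    // plain AJAX fetch of the feed
        var xhr = new XMLHttpRequest();
        xhr.open('GET', '/timeline/' + encodeURIComponent(route) + '.json');
        xhr.onload = function () { done(xhr.responseText); };
        xhr.send();
    }

    window.onhashchange = function () {
        var route = location.hash.slice(2);  // strip the "#!" prefix
        if (pageCache[route]) {
            renderPage(pageCache[route]);    // instant: already in memory
        } else {
            fetchTimeline(route, function (html) {
                pageCache[route] = html;     // cached until the tab is closed
                renderPage(html);
            });
        }
    };

Because pageCache is a plain JavaScript object, it vanishes when the browser is closed, which matches the observation that GP is only instant after the profile has already been displayed in the current session.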

Related

Problem with Document expired after pressing "back button" in browser

We are using an e-commerce script which is encoded with ionCube. In the product catalog we have filters, which are sent via the POST method. Because of the script I cannot use GET, even if I wanted to.
When the user goes to a product's details by choosing one of the filtered products in the catalog and then tries to go back via the browser's back button, he gets "Document Expired". After clicking the refresh button, the correct catalog page is shown with all the filters that were chosen.
We tried to set this on server:
ini_set('session.cache_limiter','public');
It helps with the above problem, but it breaks the cart page: everything goes haywire.
I tried many scripts found on Stack Overflow and elsewhere on the net, but none of them worked.
Please note that I also cannot edit the PHP code, because of ionCube. When I try to add anything to index.php, I get a corruption notice after the page reloads.
Any solution?

Expiring links to images in Facebook share?

We are implementing a Facebook share dialog so users can share images from their accounts. Those images are hosted on S3, and we use expiring links to ensure that users' images are normally accessible only to them.
The question is: if we provide such a link to the Facebook JS library to create the share dialog, does Facebook make a copy when the user posts (in which case our link expiring two minutes later is fine), or does that link have to remain available for longer, or forever? If it does make a copy, does that happen when the user clicks the Post button, or earlier, when the preview is shown in the dialog?
Following this link:
"Share articles, photos, videos and other content as a URL that points to the page where your content lives."
It seems Facebook doesn't copy the image over; it just keeps a reference.
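
For reference, this is roughly how the share dialog is opened with the JS SDK. The snippet assumes the SDK is already loaded and FB.init() has been called; the pre-signed S3 URL is a made-up example:

    // Open Facebook's share dialog for an expiring S3 link.
    FB.ui({
        method: 'share',
        // Hypothetical pre-signed S3 URL that expires after 120 seconds.
        href: 'https://example-bucket.s3.amazonaws.com/photo.jpg?X-Amz-Expires=120'
    }, function (response) {
        // Facebook's scraper fetches the URL to build the preview, so the
        // link must still be valid at that moment.
        console.log(response);
    });

Since only a reference is kept, anyone re-opening the shared post after the link expires would see a broken image, which is worth testing before relying on short expiry windows.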

Creating a link to another website and loading a modal

I have just seen a website that can create a link to any website and display a modal when the link is clicked on someone else's website. I am just curious if anyone knows how this is done?
Here's the test link that does this:
https://twitter.com/workladuk/status/955752813333766144
Here's how this scheme works:
Notice that clicking on the link in the tweet mentioned in your comment (seen at https://twitter.com/workladuk/status/955752813333766144) doesn't actually take you to Stack Overflow, even though it appears to point to this article.
It takes you to http://readr.me/vc-25, a totally different site. This is clear from the browser address bar.
By inspecting the HTML of that page using the browser developer tools, we can see that it really is a totally different page, containing an overlay with the signup form and an iframe containing the page the user was hoping to visit. This gives the illusion that they are on that page and just need to close the popup to view it. Once they do close the popup, the site actually makes a whole new HTTP request and redirects the user to the real page.
Interestingly, this was even more obvious with the example you used, because when going to the site with the signup form, the Stack Overflow page displayed underneath it showed that I was not signed in, even though I was signed in to SO in other tabs of the same browser. This is because running it in an iframe caused it to load in a separate session, in which I was not signed in. This was another big clue that I was not on the real Stack Overflow page.
So, to be clear, it is absolutely not making a popup appear on another website; that is impossible without hacking it. Instead, it creates another page containing the signup form, redirects the user to that page, and embeds the "real" page within it to create the illusion.
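
A bare-bones sketch of the trick, assuming the target page doesn't block framing with X-Frame-Options; the URL, styles, and element IDs here are illustrative only:

    // Build the "real page" backdrop: an iframe filling the window.
    var frame = document.createElement('iframe');
    frame.src = 'https://stackoverflow.com/';   // the page the visitor expects
    frame.style.cssText = 'position:fixed;top:0;left:0;width:100%;height:100%;border:0;';
    document.body.appendChild(frame);

    // Lay the signup "modal" on top of it.
    var overlay = document.createElement('div');
    overlay.style.cssText = 'position:fixed;top:20%;left:30%;width:40%;background:#fff;z-index:10;padding:1em;';
    overlay.innerHTML = '<p>Sign up to continue</p><button id="close-overlay">Close</button>';
    document.body.appendChild(overlay);

    // Closing the modal doesn't just hide it: as observed above, the
    // middleman page performs a full navigation to the real URL.
    document.getElementById('close-overlay').onclick = function () {
        location.href = frame.src;
    };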

Python - How to scrape multiple dynamically updated forms / webpages?

I've been trying to scrape a dynamically updated website; each webpage contains hundreds of rows, and the website in total has thousands of pages (each page is accessed by clicking a "next" button or a number at the bottom of the page, just like at the bottom of a Google search results page).
While I've been able to scrape the pages successfully, I've had trouble getting 100% accuracy in my results, mainly because the pages are dynamically updated (JavaScript). When a user logs in to their account, the system puts them back at the very top of the first row of the first page. So, for example, if I were on page 100, about to scrape page 101, and a user currently on page 101 logged in to their account, I would miss that user's info. Considering the volume of activity, this can be quite problematic.
I tried running my automation during the wee hours, but realized there were users worldwide, so that was a fail. I also can't scrape pages in parallel, because the forms are accessed/loaded through JavaScript and I've had to use Selenium to click through one page at a time. (There's no unique URL per page; I've also tried looking through my browser's Network tab, but no variable changes when I click to another page.) I also tried accessing the API following the instructions here, but the link I was able to obtain only returns the information on the current page, so it's no different from what I can already access through the HTML source.
What are my options? Is there someway I can catch all the information at once so that I don't risk missing any information?
I know there will be people asking for the URL, but unfortunately I can't give it away. Even if I did, I couldn't give away the username and password. I'm a beginner at web-scraping, so any help is really appreciated!
If you've got no problem hitting the page as many times as you want, and the information never disappears, just cycle through all the pages as fast as you can, over and over again. In Selenium you can control multiple tabs and/or browsers simultaneously, all using the same cookies, to make your scraping faster; see the sketch below.
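
The original question uses Python, but the same idea looks like this with the Node.js selenium-webdriver bindings; the start URL, session cookie, row selector, and next-button selector are all placeholders for your own values:

    // Several independent browsers walking the same paginated listing.
    const { Builder, By } = require('selenium-webdriver');

    async function scrapeAllPages(startUrl, sessionCookie) {
        const driver = await new Builder().forBrowser('chrome').build();
        try {
            await driver.get(startUrl);
            await driver.manage().addCookie(sessionCookie); // reuse the logged-in session
            await driver.navigate().refresh();
            while (true) {
                const rows = await driver.findElements(By.css('.row')); // placeholder selector
                for (const row of rows) {
                    console.log(await row.getText());       // replace with real extraction
                }
                const next = await driver.findElements(By.css('.next-button'));
                if (next.length === 0) break;               // no "next" button: last page
                await next[0].click();
            }
        } finally {
            await driver.quit();
        }
    }

    // Run two sweeps in parallel so a row missed by one pass (because the
    // site reshuffled) is likely caught by the other.
    const cookie = { name: 'session', value: 'abc123' };    // hypothetical cookie
    Promise.all([
        scrapeAllPages('https://example.com/listing', cookie),
        scrapeAllPages('https://example.com/listing', cookie),
    ]).then(() => console.log('done'));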

How does Facebook navigate to a photo (changing window.location) while the previous content remains?

I just noticed that when I click a photo in my Facebook news feed, the window location changes and the photo appears, but the content from the previous page is still behind the photo. You can see it because the background of the photo viewer is transparent.
How can this be achieved?
Well, the URL changes to something like this: /photo.php?fbid=10150643780577073&set=a.446526812072.240769.709452072&type=1&theater
There is enough information in the query string to know what page the user came from. This information is used to display the photo in the foreground and to include the original page in the background. So both pages use some of Facebook's backend code to generate the HTML frontend, and in the case of the photo.php page it includes something extra: the foreground picture plus the necessary CSS & scripts.
In the future this will be done using HTML5's history API. But for cross-browser compatibility and backward compatibility, use the history.js library.
Pre-HTML5, the way to set the location without causing a page refresh is to append a # anchor position to the url, as though you had moved to an anchor position (traditionally used to move to and link to a specific paragraph on a page), e.g. url/to/page#someposition.
This generalises to representing a page state in the anchor; for example, a specific message in Gmail has a URL like https://mail.google.com/mail/?shva=1#inbox/2h42c4ahe7fge7.
If you use history.js, you should be able to easily upgrade to pushState etc as and when they become widely supported.
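
Here is a minimal sketch of the HTML5 approach (plain history API, no history.js), where the viewer element and the showPhoto/hidePhoto helpers are hypothetical stand-ins for whatever renders the photo overlay:

    // Hypothetical render helpers for the photo overlay.
    function showPhoto(id) {
        document.getElementById('viewer').textContent = 'photo ' + id;
    }
    function hidePhoto() {
        document.getElementById('viewer').textContent = '';
    }

    // Open a photo: change the address bar without reloading, then draw
    // the viewer on top of the page that is already loaded.
    function openPhoto(photoId) {
        history.pushState({ photoId: photoId }, '', '/photo.php?fbid=' + photoId);
        showPhoto(photoId);
    }

    // Back/forward buttons fire popstate; restore whichever state was stored.
    window.onpopstate = function (event) {
        if (event.state && event.state.photoId) {
            showPhoto(event.state.photoId);
        } else {
            hidePhoto();    // back to the feed that never went away
        }
    };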
