Site Scraping for JavaScript rendered site - javascript

I'm trying to get the part numbers of the items in each category from http://www.dynacorn.com/ListItems.aspx. The catch, for me at least, is that the results are paginated and I cannot figure out how to move on to the next page. I've been reading about JSoup but am unsure how to implement this. Can someone show me an example of how to go from page to page when scraping?
Thanks!

Related

How to create a page location in browser using javascript/jQuery?

I have a setup where I display a list of buttons and clicking on the buttons triggers a function that contacts a firebase database and gets the contents of a 'slide' that is to be shown to the user. The function then clears the content of the page and then creates elements from the data acquired from the database.
Now obviously, when I press back browser button once I've replaced the content, it won't take me back to the previous content. But I believe that my user's experience will be much better if it actually took them back to the list of buttons. I have two faint ideas on how to go about solving this problem but I'm lacking in specific details of how I can go about it.
Possible Solution 1:
Some way to dynamically create a new page using javascript and then serve it to the user.
Possible Solution 2:
Some way to simulate that the page has changed location. Maybe using anchoring links.
Let me know if you have any other solutions in mind or if you know how I should go about implementing these. Your help will be much appreciated. :D

How to scrape website data into an Excel worksheet?

I'm a novice programmer trying to compile an Excel list of all the inc5000 companies and their industry, location, revenue, and CEO. Is there any way for me to automate this so that I don't have to manually input all 5000?
Some issues:
-The inc5000 list only displays 50 companies on a page, and scrolling to the next page does not change the URL. I tried converting the URL into HTML, but none of the metadata actually shows up in the HTML code (I used https://try.jsoup.org/~LGB7rk_atM2roavV0d-czMt3J_g).
-All of the information I need is on this one scrolling page (https://www.inc.com/profile/loot-crate), but the URL changes for each company as you progress down the page. Is there any way to grab the data from this site without manually changing 5000 URLs?
I'm really new to programming and I know next to nothing about HTML/JavaScript/Web design-- I only know basic Java. I would really appreciate any help or potential leads into a solution.
Here's the easy way:
Go to the page, press F12, open the "Network" tab of the debug tools, and select XHR (to filter down to the data calls only). Then scroll to the bottom of the page. The page makes a query for each company, which you can inspect in the debug tools.
Once you have all the pages, you can highlight all the rows in the file name list to the left, right click, and save them to a .har file.
From there, just write a script to pull out the JSON and you're set.
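As a rough sketch of that last step: a .har file is itself JSON, with the captured requests under log.entries and each response body under response.content.text (sometimes base64-encoded). The function names and the file name in the usage note are hypothetical, and the field names inside each company record depend on what the site actually returns, so they are not assumed here.

```python
import base64
import json


def extract_json_bodies(har):
    """Return the parsed body of every response in a HAR dict that holds JSON."""
    results = []
    for entry in har.get("log", {}).get("entries", []):
        content = entry.get("response", {}).get("content", {})
        text = content.get("text")
        if not text:
            continue  # the body was not captured for this request
        try:
            if content.get("encoding") == "base64":
                text = base64.b64decode(text).decode("utf-8")
            results.append(json.loads(text))
        except ValueError:
            continue  # not JSON (HTML, scripts, images, ...), so skip it
    return results


def load_har(path):
    """Read a .har file saved from the browser's Network tab."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Something like extract_json_bodies(load_har("inc5000.har")) would then give one Python object per captured data call, which can be written out to CSV for Excel.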

Products fade in/appear on scrolling down page

Does anyone have any idea how to achieve this? On our site we prefer not to have pagination and have all products on one page, however we have seen it done on other sites where as you scroll, the products appear into view.
This is an example of our site in a category with a lot of products: http://goo.gl/OiHIFO
If someone could help, advise, or offer a link to something that will achieve this for us, I'd appreciate it.
You could do it with methods similar to pagination: load the first 50 products, and when the user gets to almost the bottom of the page, make an AJAX call that pulls the next 50.
Those next 50 would then be appended to the results or faded in using jQuery.
It's better to load the content as you need it than to load everything at once and only display each product when you scroll to it.
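The server-side half of that AJAX call could look roughly like the sketch below. It is only an illustration: next_batch and its parameters are made-up names, and a real endpoint would query the product database with the same offset and limit rather than slice a list in memory.

```python
def next_batch(products, offset, batch_size=50):
    """Return (batch, next_offset); next_offset is None once nothing is left."""
    batch = products[offset:offset + batch_size]
    next_offset = offset + len(batch)
    if next_offset >= len(products):
        next_offset = None  # tells the page to stop requesting more
    return batch, next_offset
```

The page would send the offset it last received each time the user nears the bottom, append the returned batch, and stop asking once next_offset comes back empty.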

Dynamic page loading (without reloading the whole page) in Angular?

I decided to learn AngularJS, but I got stuck on my first project :/
I want to make a menu like the one in the first example here: http://tutorialzine.com/2013/08/learn-angularjs-5-examples/ So there are a few options ("home", "projects", etc.), and if I click on one I get another part of the site without reloading the whole page.
What I want to accomplish:
- Dynamic data loading: when I click on a link, only one part of the page refreshes, not all of it. I want to make a page like this (click on any link: the menu bar stands still and isn't reloaded).
- All parts of the site are in separate files (I don't want to keep all the page data in one HTML file, I want to break it into pieces).
- The address of the page (in the browser's address bar) changes when I click on a link.
I know I could use AJAX or something like it, but the problem is that I want to make a Google-friendly site and, as far as I know, Google has some problems with AJAX-based sites.
Can you tell me which way I should choose, and whether AngularJS is actually suited to this job?

How To Display List of Sites One At A Time

I am looking for a way to display a list of websites one at a time from a URL list. I'm fine with a very manual solution. I found an AJAX solution where each "page" is displayed in a tab, but it is very heavy: if I have 50 pages that I want users to page through one at a time, it essentially pulls all 50 pages onto the one page. Do you know of a framework that does the same thing but only loads one page at a time? Thank you very much for the advice and help. Here is the site I found - http://css-tricks.com/jquery-ui-tabs-with-nextprevious/
You could load the URLs into an array and then create a 'next' button that loads the next url into a div; replacing the previous one.
Do you need to do this with JavaScript?
It might be easier to curl the pages using PHP, then echo the returned data into the HTML as an eval-able array. Then let the user change which part of the returned array they are looking at using next and prev buttons.
If you pre-load each one it will be heavy, as you have noted.
This idea is screaming for AJAX. With proper AJAX calls, you would only load a page once it has actually been selected by tab. Any previous page loaded into the area would need to be dumped. You shouldn't actually need to physically switch tabs if you're using the src attribute of an iframe; simply changing the src and forcing the iframe to refresh itself should do the trick. If you are performing a screen scrape through a remote web service, then you could simply use jQuery/AJAX to rewrite the innerHTML of the panel in question.
