I am attempting to create an R script that documents exactly how to acquire the data I am using for analysis, for reproducibility reasons. Usually the first step is as simple as assigning the URL of the XLS file to a variable in R and proceeding from there, but the website I am scraping seems to produce its XLS files via JavaScript (a language I have no knowledge of).
Follow these steps to get to an example XLS:
Go to http://hcupnet.ahrq.gov/HCUPnet.jsp?Id=B08F84A071883804&Form=SelDXPR&JS=Y&Action=%3E%3ENext%3E%3E&_DXPR=DX1
Click "Principal Diagnosis"
Type "599.0" in the text box (without the quotation marks) and leave the radio button for "Each code separately" checked
Click "Next"
On this page check all of the radio buttons
Click "Next"
On this page check all of the radio buttons
Click "Next"
On this page you should see all of the data, as well as some links. One of these links is titled "Save results as an Excel spreadsheet". Clicking on this link will download an XLS file with the data to your computer.
I've inspected the element and can clearly see that it is querying a database; I'm just not entirely sure how to get that query into my R script to pull the XLS file down.
Any help is much appreciated.
(not technically a full-on answer, perhaps, but the comment box doesn't allow real formatting)
RSelenium can perform all those actions. However, will there be many different selections/combinations of options? If not, you could just build a list of URLs like this one:
http://hcupnet.ahrq.gov/HCUPnet.jsp?Parms=H4sIAAAAAAAAACWJPQ.CMBgG_5KYMDACHSTGBgUT16fvtcrXoos_nybc5ab766evfDcU52vnonsVmlVWVfCnFIcQw9rXo0wEDIgsrGwk3ppkNQ0tTnb72BNPP7YX7jzyy6YDzbkdSuxNr2gAAAA6D4E19A7096C3AE9FD48005A5B0802A684BBBEB8
which goes right to that page for each download. You can capture that url by hitting Esc instead of actually downloading the XLS file and then copying the URL from the location bar.
On that page you can use the XML library or rvest to ingest the page and extract the onclick attribute of the following tag:
<a href="Javascript:void(0)"
onclick="window.open('HCUPnet.xls?Id=0A8C3E07CD01B562&Form=DispTab&JS=&Action=%3E%3ENext%3E%3E&__InDispTab=Yes&_Results=Save&_Results3=&SortOpt=');">
<img height="19" src="arrow_off3.gif" alt="" align="absMiddle" width="15" border="0">
Email a link to this page</a>
(I included the full anchor reference as you'll need to use that in the XPath or CSS selector to find that tag, but you might be able to get away with just doing an XPath or CSS "contains" for HCUPnet.xls in the onclick attribute, too).
Then, just extract the HCUP… string from there and prepend http://hcupnet.ahrq.gov/ to it in a download.file call.
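The string handling in that last step is the same in any language. As a rough sketch in JavaScript (extractXlsUrl is a hypothetical helper, and the regular expression assumes the onclick value has exactly the window.open('HCUPnet.xls?...') shape shown above):

```javascript
// Base URL taken from the answer above.
const BASE_URL = "http://hcupnet.ahrq.gov/";

// Hypothetical helper: pull the relative HCUPnet.xls URL out of an onclick
// value of the form window.open('HCUPnet.xls?...'); and make it absolute.
// Returns null when the attribute does not match that shape.
function extractXlsUrl(onclick) {
  const match = /window\.open\('(HCUPnet\.xls[^']*)'\)/.exec(onclick);
  return match ? BASE_URL + match[1] : null;
}

// The onclick value from the anchor tag quoted above.
const onclick =
  "window.open('HCUPnet.xls?Id=0A8C3E07CD01B562&Form=DispTab&JS=&Action=%3E%3ENext%3E%3E&__InDispTab=Yes&_Results=Save&_Results3=&SortOpt=');";

console.log(extractXlsUrl(onclick));
```

The resulting absolute URL is what you would then hand to download.file in the R script.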
I find myself having to interact with a web page that hides state in various places so that one cannot easily share it as a URL, for example this page which allows users to look up information from city zoning applications:
https://aca.cityofberkeley.info/community/Default.aspx
You can interact with the page all you want, but the URL in the location bar will remain the same as the above.
Currently, city staff provide users with instructions like "Load this URL, click on the 'Zoning' tab, enter DRCP2020-0010 under the 'Permit Number' field, click 'Search', then when the records come up, click 'Record Info' and then select 'Attachments' from the dropdown menu, then click on the PDF document that says '2020-10-21_DRCP_APP_PCKT_2801 Adeline.pdf'". I would like to be able to replace these instructions with a URL.
Another example is the website where video from city council meetings is archived:
http://berkeley.granicus.com/MediaPlayer.php?publish_id=cbebb4e6-5b83-11eb-920e-0050569183fa
It would be nice to be able to produce a link which brings up one of the meeting videos, and seeks to a certain timestamp like 53:40, so that I can refer to something specific that was said at a meeting.
Looking at the pages that are loaded when I follow the instructions in each case, I can see that there are some POST forms, cookies, hidden input fields, and so on.
Is there some kind of tool that I can use to create "deep links" to pages like these, that were generated using non-URL hidden state, which will allow me to quickly share what I'm looking at with another user?
What I'm seeking is similar to the frmget "bookmarklet", which changes the forms on a page to use GET instead of POST. Sometimes this succeeds in producing a URL which captures form submission query parameters. However, it doesn't work for these applications, for whatever reason.
This question is possibly related to the idea of capturing a web page's DOM state using "browser screenshots" and a script called html2canvas. A possible solution might involve getting and setting cookies in a bookmarklet. Ideally the tool would produce a normal "https://" URL, but if it is impossible to solve the problem except by outputting a "javascript:" URL (bookmarklet) then that is acceptable to me (in spite of the security implications). Thanks.
That doesn't seem like a programming matter. It seems like the site has some security issues as well.
QUESTION A: About Zoning
Here are some links you can use
Direct link to Zoning (I've found it via Advanced search from the site):
https://aca.cityofberkeley.info/CitizenAccess/Cap/CapHome.aspx?module=Planning&TabName=Planning&TabList=Home%7C0%7CBuilding%7C1%7CHousing%7C2%7CPlanning%7C3%7CFire%7C4%7CLicenses%7C5%7CPublicWorks%7C6%7CCurrentTabIndex%7C3
A strange link to the list of files (I found it by downloading a file, going to chrome://downloads, and right-clicking the file I had downloaded; the link was the following):
https://aca.cityofberkeley.info/CitizenAccess/FileUpload/AttachmentsList.aspx?iframeid=ctl00_PlaceHolderMain_attachmentEdit&module=Planning&isInConfirm=False&isdetail=True&isaccountmanager=False&isAdmin=True&isPeopleDocument=&agencyCode=BERKELEY&isForConditionDocument=N
It still doesn't give a direct link to the file, but it gives the list of attachments of the previously opened Zoning record.
Currently I have no idea what file is triggered by javascript:__doPostBack('attachmentList$gdvAttachmentList$ctl02$lnkFileName','').
In any case, based on what we have, opening the first link and then the second seems to be the shortest path to downloading the file. I guess there could be a way to download the file directly, but I currently don't see any easy way. Maybe someone else can figure it out.
QUESTION B: About video
I've used an embed link that shows all the attributes that can be used.
There is a pretty strange but working way to give the exact timestamp. Change starttime from the link below:
https://berkeley.granicus.com/MediaPlayer.php?publish_id=cbebb4e6-5b83-11eb-920e-0050569183fa&starttime=0&stoptime=undefined&autostart=1
So replacing 0 with 3600 will skip the video forward by one hour (3600 seconds):
https://berkeley.granicus.com/MediaPlayer.php?publish_id=cbebb4e6-5b83-11eb-920e-0050569183fa&starttime=3600&stoptime=undefined&autostart=1
The problem here is that you cannot manually rewind into the skipped hour (it just gets kind of cropped out). But it works for pointing at the exact segment.
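To produce such a link from a timestamp like the 53:40 mentioned in the question, the conversion to seconds can be scripted. This is only a sketch: timestampToSeconds and deepLink are hypothetical helpers, and it leaves out stoptime=undefined on the assumption that the player does not need it (add it back if the link misbehaves).

```javascript
// Convert "MM:SS" or "HH:MM:SS" into a number of seconds.
function timestampToSeconds(timestamp) {
  return timestamp
    .split(":")
    .map(Number)
    .reduce((total, part) => total * 60 + part, 0);
}

// Build a player link that starts at the given timestamp.
// The base URL and parameter names are the ones from the links above.
function deepLink(publishId, timestamp) {
  const url = new URL("https://berkeley.granicus.com/MediaPlayer.php");
  url.searchParams.set("publish_id", publishId);
  url.searchParams.set("starttime", String(timestampToSeconds(timestamp)));
  url.searchParams.set("autostart", "1");
  return url.toString();
}

console.log(deepLink("cbebb4e6-5b83-11eb-920e-0050569183fa", "53:40"));
```

Here 53:40 becomes 53 * 60 + 40 = 3220, so the link carries starttime=3220.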
That's a pretty strange site.
I wanted to make specific inputs go into another part of a site (e.g. https://myexamplesite.com/anotherpartofit.html)
And also make those specific inputs be the ones that someone saved it in.
An example of what I think would work is: Get value from input 1 from /apartofit.html and put it in input 2 at /anotherpartofit.html and make it non-editable
If it needs to use a database, I would prefer it if you could help me with Firebase (Google's database). To my knowledge it probably needs JavaScript, so I'll tag it accordingly; if it doesn't, let me know!
In Visual Studio, Notepad, or any other offline editing environment you can't. But if you buy a domain or a simple online site, you can create a page and get its link, then open another page, create a button, set the button's href to the link of the previous page, and finally click the button!
I have constructed an on-line What's On page supported by a mysql database table including a field to contain an event title. An admin page permits entries to be added, edited and deleted. More recently I was asked if web links could be embedded in the title field, and found that editing the field to include a link of the form:
Click <a class=hdg onclick="newWindow('http://www.address');"> HERE </a> to open web link
would do the trick and open the URL in a new window via the newWindow function.
Unfortunately any attempt to use the admin page to edit such a mysql record corrupts the admin page display, since the string is returned from the title field as the value of a text box, and the link text is then interpreted by the browser so that only part of the field displays in the box on the screen. The remainder of the string appears outside the box, which is confusing to a non-technical user.
A quick and dirty fix is to use Ctrl-A to select the whole of the text box contents when editing, and then type or paste the whole of the title content into the box, when it commits correctly to the database. However, if anybody knows of a way to code the javascript so that on the one hand it will function correctly as a web link and on the other hand can be edited via an HTML form, I would be glad to know. Ultimately I guess I'll re-construct the database to hold the actual URL separately and use php to build the javascript link, but meanwhile ?
Use htmlentities() to put the field value into the web page with all the HTML special characters replaced with entities, so they won't be interpreted:
echo '<textarea>' . htmlentities($value) . '</textarea>';
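To see what the escaping does to the stored title, here is a minimal JavaScript sketch of the same idea. htmlentities() itself is PHP; escapeHtml below is a hypothetical stand-in that covers only the five characters that matter inside element content and quoted attribute values, which is fewer than htmlentities() converts.

```javascript
// Hypothetical helper: escape the characters that HTML treats specially.
// "&" must be replaced first so the entities added afterwards are not re-escaped.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// The title string from the question, as stored in the database.
const title =
  "Click <a class=hdg onclick=\"newWindow('http://www.address');\"> HERE </a> to open web link";

// The escaped value can sit inside a text box without being parsed as markup.
console.log('<input type="text" value="' + escapeHtml(title) + '">');
```

The browser decodes the entities again when it displays and submits the field, so the database still receives the original link markup.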
I have an application that creates custom images in the backend, stores their links in a DB table, and then shows them in a UI with a checkbox attached to each. Now I want the user to be able to click a download button so that all the images whose checkboxes are selected get downloaded (and not rendered) in one of two possible ways:
1.) create a zip file of selected and then download(future scope)
2.) download all images individually one by one.
Below is my HTML code:
<div class="span5" data-type="collage-image-structure">
<input type="checkbox" class="images_checkbox" data-url="some_url_of_image.png" checked="checked or unchecked">
<img src="some_url_of_image.png">
</div>
Now, on click of a button, I want to download the images that are checked. I have tried the iframe solution that I read somewhere on Stack Overflow (sorry, I can't find the link right now). That solution loads the image in an empty iframe but does not show the download prompt that browsers traditionally show. Providing a link in an <a> tag and using the download attribute also solves things, but I don't want to attach an <a> tag to every image. All I want is to write a JavaScript function that takes the image URLs one by one and downloads them. I also do not want to use any ready-made plugins, to avoid the overhead of loading an extra JS file on every page load. Please help; I need a solution quickly.
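One way to combine the two ideas above (the download attribute, but without a permanent link per image) is to create a temporary anchor for each checked image at click time. This is only a sketch under stated assumptions: selectedUrls, downloadAll, and downloadButton are hypothetical names, the images are assumed to be served from the same origin as the page (as far as I know browsers honor the download attribute only for same-origin URLs), and a browser may ask the user for permission before allowing several downloads in a row.

```javascript
// Collect the data-url values of the checked boxes.
// `checkboxes` is any iterable of elements, for example the result of
// document.querySelectorAll(".images_checkbox").
function selectedUrls(checkboxes) {
  return Array.from(checkboxes)
    .filter((box) => box.checked)
    .map((box) => box.dataset.url);
}

// Trigger one download per URL through a temporary <a download> element,
// so no permanent link has to be attached to each image.
// `doc` is the page's document object.
function downloadAll(urls, doc) {
  for (const url of urls) {
    const link = doc.createElement("a");
    link.href = url;
    link.download = url.split("/").pop(); // suggested file name
    doc.body.appendChild(link);
    link.click();
    doc.body.removeChild(link);
  }
}

// In the page (downloadButton is a hypothetical button element):
// downloadButton.addEventListener("click", () =>
//   downloadAll(selectedUrls(document.querySelectorAll(".images_checkbox")), document));
```

Keeping the URL collection separate from the download step means the same selectedUrls result could later be sent to the server for the zip-file option.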
So I want to use a <asp:FileUpload> control to "upload" a picture.
I don't really upload the picture; I just want to get its input stream so I can convert it to a byte array and place it in the database.
However, when I add an <asp:FileUpload> it comes with a fixed button and text field. Though I like the text field, I want to change the text of the button, because my site is entirely in English and the button's text changes depending on, well, something to do with the language of the browser or OS.
So I searched on Google for a while and found some info about making an HTML control
<input type='button' style='visibility: hidden'> and make another button which activates the file button by using javascript.
So here's the problem: when I add runat=server to the hidden file button, I can't find it anymore using the document.getElementById JavaScript function, and thus cannot get the input stream or the file.
What I'm asking is whether there is a simple way to change the text of an <asp:FileUpload> so I can still use that control. If not, could you please show me how I can get the hidden file button to work with code-behind and get its input stream?
Have a look at the blog post "Styling an input type="file"" by Peter-Paul Koch. You could also try the Ajax AsyncFileUpload control or the Uploadify jQuery plug-in.