Get the actual domain name (server name) in a Shiny app - javascript

I have 3 servers: dev, test and prod. My Shiny code should be deployed from dev to prod.
Now the problem:
In ui.R I refer via href = 'https://dev.com/start/' to another site named start. Is it possible to get the domain name (dev, test or prod) automatically? Something like `href = 'https://<whatever the actual domain is>.com/start/'`.
Addendum: as DanielR answered, one can use session$clientData$url_hostname; however, my problem is that I need the hostname in dashboardHeader. The place in ui.R where I need the dynamic href is:
dashboardPage(
  dashboardHeader(title = "KRB",
                  titleWidth = 150,
                  tags$li(a(href = 'https://dev.com/start/

You can get the hostname using session$clientData$url_hostname in your server function. See https://shiny.rstudio.com/articles/client-data.html
Here's a little app:
library(shiny)

ui <- fluidPage(
  uiOutput('urlui')
)

server <- function(input, output, session) {
  output$urlui <- renderUI({
    htmltools::a('my link',
                 href = paste0('http://', session$clientData$url_hostname))
  })
}

shinyApp(ui = ui, server = server)
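Since the question ultimately needs this link inside dashboardHeader, one option is to keep the renderUI() idea from above and drop a uiOutput() placeholder into the header. This is a minimal, untested sketch assuming shinydashboard; the class = "dropdown" on tags$li and the output id start_link are my own choices, not from the question:
library(shiny)
library(shinydashboard)

ui <- dashboardPage(
  dashboardHeader(title = "KRB",
                  titleWidth = 150,
                  # placeholder filled in by the server once the hostname is known
                  tags$li(class = "dropdown", uiOutput("start_link"))),
  dashboardSidebar(),
  dashboardBody()
)

server <- function(input, output, session) {
  output$start_link <- renderUI({
    a("Start", href = paste0("https://", session$clientData$url_hostname, "/start/"))
  })
}

shinyApp(ui, server)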

Now the problem: In ui.R I refer via href = 'https://dev.com/start/' to another site named start. Is it possible to get the domain name (dev, test or prod) automatically?
For what you want to achieve here, you don't need to determine the actual host name at all; you can simply use a relative URL instead of a full absolute one.
Instead of
tags$li(a(href ='https://dev.com/start/' …
use
tags$li(a(href ='/start/' …
Relative URLs with a leading slash refer to the domain root, so this should resolve to https://[hostname]/start/ automatically, without you having to determine what [hostname] actually is in this case. The browser basically does that part for you when it resolves relative URLs, based on the address of the currently displayed main document.
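Applied to the dashboardHeader from the question, that might look like the following sketch (the class = "dropdown" attribute and the link text are assumptions; only the root-relative href is the point here):
dashboardPage(
  dashboardHeader(title = "KRB",
                  titleWidth = 150,
                  # root-relative: the browser resolves this against the current host
                  tags$li(class = "dropdown", a(href = '/start/', 'Start'))),
  dashboardSidebar(),
  dashboardBody()
)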

Related

How to set url in index.html in swagger-UI

I am using Swagger-UI for jax-rs jersey.
So there is this index.html, where you have to enter the URL for the swagger.json.
So this is a big problem.
We are deploying our application to a lot of different environments, and the respective swagger.json will always be on the same environment as the application.
We have Jenkins build jobs, and we cannot edit index.html for every environment.
window.onload = function() {
  // Begin Swagger UI call region
  const ui = SwaggerUIBundle({
    url: "https://petstore.swagger.io/v2/swagger.json",
The url property is what I always have to set.
What should I do?
P.S.
In Springfox Swagger-UI there is no physical swagger.json, but in jax-rs I have this dist folder and there is always a physical JSON file, as far as I understand. Where should I put it so that all the different clients can access it?
You can use vanilla JS for that:
var currentUrl = window.location.origin;
var apiBasePath = currentUrl + '/v2';

window.ui = SwaggerUIBundle({
  url: apiBasePath + "/swagger.json",
  ...
})

Web Scraping interactive map (javascript) with R and PhantomJS

I am trying to scrape data from an interactive map (looking to get crime data for a county). I am using R (rvest) and trying to use phantomjs too. I'm new to web scraping so I am not really understanding how all the elements work together (trying to get there).
The problem, I believe, is that after I run phantomjs and load the HTML into R with the rvest package, I end up with more scripts and no clear data in the HTML. My code is below.
writeLines("var url = 'http://www.google.com';
var page = new WebPage();
var fs = require('fs');
page.open(url, function (status) {
just_wait();
});
function just_wait() {
setTimeout(function() {
fs.write('cool.html', page.content, 'w');
phantom.exit();
}, 2500);
}
", con = "scrape.js")
A function that takes in the url that I want to scrape
s_scrape <- function(url = "https://gis.adacounty.id.gov/apps/crimemapper/",
                     js_path = "scrape.js",
                     phantompath = "/Users/alihoop/Documents/phantomjs/bin/phantomjs"){
  # this section will replace the url in scrape.js with whatever you want
  lines <- readLines(js_path)
  lines[1] <- paste0("var url ='", url, "';")
  writeLines(lines, js_path)

  command <- paste(phantompath, js_path, sep = " ")
  system(command)
}
Execute the s_scrape() function and get an html file saved as "cool.html":
s_scrape()
Where I'm unsure what to do next is the R code below:
map_data <- read_html('cool.html') %>%
  html_nodes('script')
The output I get from phantomjs in the HTML is just more scripts. I'm looking for help on how to proceed when faced with what seems (in my mind) to be javascript nested inside javascript.
Thank you!
This site uses javascript to make queries to the server. One solution is to reproduce the REST request and read the returned JSON directly. This avoids the need for PhantomJS.
From your browser's developer tools, looking through the XHR requests, you will find one or more requests named "query" with a link similar to: "https://gisapi.adacounty.id.gov/arcgis/rest/services/CrimeMapper/CrimeMapperWAB/FeatureServer/11/query?f=json&where=1%3D1&returnGeometry=true&spatialRel=esriSpatialRelIntersects&outFields=*&outSR=102100&resultOffset=0&resultRecordCount=1000"
Read this JSON response directly and convert it to a list using the jsonlite package:
library(jsonlite)
output <- jsonlite::fromJSON("https://gisapi.adacounty.id.gov/arcgis/rest/services/CrimeMapper/CrimeMapperWAB/FeatureServer/11/query?f=json&where=1%3D1&returnGeometry=true&spatialRel=esriSpatialRelIntersects&outFields=*&outSR=102100&resultOffset=0&resultRecordCount=1000")
output$features
Find the number in the link ("FeatureServer/11/query?f=json", 11 in this case). This number determines which crime type you query the server for. I found it can take values from 0 to 11: enter 0 for arson, 4 for drugs, 11 for vandalism, and so on.
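If you need more than one crime type, the layer number can be parameterised. A rough, untested sketch, assuming the same query string works for every layer from 0 to 11 (the function name and result variables are hypothetical):
library(jsonlite)

# layer: 0-11, e.g. 0 = arson, 4 = drugs, 11 = vandalism (per the mapping above)
get_crime_layer <- function(layer) {
  base  <- "https://gisapi.adacounty.id.gov/arcgis/rest/services/CrimeMapper/CrimeMapperWAB/FeatureServer"
  query <- paste0(base, "/", layer,
                  "/query?f=json&where=1%3D1&returnGeometry=true",
                  "&spatialRel=esriSpatialRelIntersects&outFields=*",
                  "&outSR=102100&resultOffset=0&resultRecordCount=1000")
  fromJSON(query)$features
}

drugs     <- get_crime_layer(4)
vandalism <- get_crime_layer(11)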

Unable to scrape multiple pages using phantomjs in r

I'm trying to scrape county assessor data on historic property values for multiple parcels generated using javascript from https://www.washoecounty.us/assessor/cama/?command=assessment_data&parid=07101001 using phantomjs controlled by RSelenium.
'parid' in the URL is the 9-digit parcel number. I have a dataframe containing a list of parcel numbers that I'm interested in (a few hundred in total), but I have been attempting to make the code work on a small subset of those:
parcel_nums
[1] "00905101" "00905102" "00905103" "00905104" "00905105"
[6] "00905106" "00905107" "00905108" "00905201" "00905202"
I need to scrape the data in the table generated on the page for each parcel and preserve it. I have chosen to write the page to a file "output.htm" and then parse the file afterwards. My code is as follows:
require(plyr)
require(rvest)
require(RSelenium)
require(tidyr)
require(dplyr)

parcel_nums <- prop_attr$APN[1:10] # vector of parcel numbers

pJS <- phantom()
remDr <- remoteDriver(browserName = "phantomjs")
remDr$open()

result <- remDr$phantomExecute("var page = this;
  var fs = require(\"fs\");
  page.onLoadFinished = function(status) {
    var file = fs.open(\"output.htm\", \"w\");
    file.write(page.content);
    file.close();
  };")

for (i in 1:length(parcel_nums)){
  url <- paste("https://www.washoecounty.us/assessor/cama/?command=assessment_data&parid=",
               parcel_nums[i], sep = "")
  Sys.sleep(5)
  remDr$navigate(url)

  dat <- read_html("output.htm", encoding = "UTF-8") %>%
    html_nodes("table") %>%
    html_table(header = TRUE)
  df <- data.frame(dat)

  # assign parcel number to panel
  df$apn <- parcel_nums[i]
  # on the first iteration initialize the final data frame, on subsequent iterations append to it
  ifelse(i == 1, parcel_data <- df, parcel_data <- rbind(parcel_data, df))
}

remDr$close()
pJS$stop()
This will work perfectly for one or two iterations of the loop, but it suddenly stops preserving the data generated by the javascript and produces an error:
Error in `$<-.data.frame`(`*tmp*`, "apn", value = "00905105") :
replacement has 1 row, data has 0
which is due to the parser not locating the table in the output file, because it is not being preserved. I'm unsure whether there is a problem with the implementation I've chosen or some idiosyncrasy of the particular site that is causing the issue. I am not familiar with JavaScript, so the code snippet used is taken from an example I found. Thank you for any assistance.
The answer below worked perfectly. I also moved the Sys.sleep(5) to after the $navigate call to allow the page time to load the javascript. The loop now executes to completion.
require(plyr)
require(rvest)
require(RSelenium)
require(tidyr)
require(dplyr)
require(XML)  # for htmlParse() and readHTMLTable()

parcel_nums <- prop_attr$APN[1:10] # vector of parcel numbers

#pJS <- phantom()
remDr <- remoteDriver()
remDr$open()

# result <- remDr$executeScript("var page = this;
#   var fs = require(\"fs\");
#   page.onLoadFinished = function(status) {
#     var file = fs.open(\"output.htm\", \"w\");
#     file.write(page.content);
#     file.close();
#   };")
#length(parcel_nums)

for (i in 1:length(parcel_nums)){
  url <- paste("https://www.washoecounty.us/assessor/cama/?command=assessment_data&parid=",
               parcel_nums[i], sep = "")
  remDr$navigate(url)
  Sys.sleep(5)  # give the javascript-generated table time to load

  doc   <- htmlParse(remDr$getPageSource()[[1]])
  doc_t <- readHTMLTable(doc, header = TRUE)$`NULL`
  df    <- data.frame(doc_t)

  # assign parcel number to panel
  df$apn <- parcel_nums[i]
  # on the first iteration initialize the final data frame, on subsequent iterations append to it
  ifelse(i == 1, parcel_data <- df, parcel_data <- rbind(parcel_data, df))
}

remDr$close()
This gave me a solution, and it should work with phantomJS too. I request you to test and reply.
I lost an entire day trying to solve a similar issue, so I'm sharing what I learned to help others save time and nerves.
I guess we need to understand that opening, navigating and other browsing actions through the remote driver need time to complete.
So we have to wait before we try to read or do anything on the pages we are expecting to scrape.
My problems were solved when I introduced Sys.sleep(5) after the remDr$navigate(url) call.
It seems that a neater solution is to insert a remDr$setTimeout(type = "page load", milliseconds = 10000) call, as suggested in "how to check if page finished loading in RSelenium", but I haven't tested it yet.
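For reference, a minimal sketch of that untested setTimeout() approach, assuming a running Selenium server and the same assessor URL as above:
library(RSelenium)

remDr <- remoteDriver()
remDr$open()
# allow up to 10 seconds for each page load before timing out
remDr$setTimeout(type = "page load", milliseconds = 10000)

remDr$navigate("https://www.washoecounty.us/assessor/cama/?command=assessment_data&parid=00905101")
page_source <- remDr$getPageSource()[[1]]
remDr$close()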

Parsing javascript generated pages in R

At work I'd like to parse some web pages. Unfortunately I can't add any real page to my example because the URLs at work are confidential. I can only try to explain what the problem is.
To parse them I wrote the following script in R. As a mock URL I used www.imdb.com:
library(rvest)
library(plyr)

# urls
url <- "http://www.imdb.com/"

# parse
html <- try(read_html(url))

# select
select_meta <- function(html) {
  html %>%
    html_nodes(xpath = "//div") %>%
    html_attrs # function to select meta
}
meta <- select_meta(html)
The problem is this script doesn't return anything for the pages I use at work. I guess this is because those pages are generated by javascript. I found this tutorial, which explains how to scrape javascript-generated pages in R.
The code used to generate the page in the tutorial is the following:
// scrape_techstars.js
var webPage = require('webpage');
var page = webPage.create();
var fs = require('fs');
var path = 'techstars.html';

page.open('http://www.techstars.com/companies/stats/', function (status) {
  var content = page.content;
  fs.write(path, content, 'w');
  phantom.exit();
});
I don't have any Javascript knowledge, so I'm having trouble scaling page.open (which only works for one page) to multiple pages (at work I have to parse roughly 100 pages). Instead of relying on phantomjs I'd rather have a solution which is completely R based (if this is totally inefficient and offensive to real coders, I apologise in advance). So the crux of my question is: how can I generate several pages in R?
This is a one-off thing so I'm not really thinking about reading up on Javascript or parsing. Thanks in advance for helping me out.
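One way to keep the javascript contained is to reuse the s_scrape() idea from the PhantomJS question earlier on this page and drive it from an R loop, so scrape.js is the only javascript involved. A rough, untested sketch; url_list, the scrape.js file and the cool.html output name are assumptions carried over from that example:
library(rvest)

scrape_pages <- function(url_list,
                         js_path = "scrape.js",
                         phantompath = "phantomjs") {
  lapply(url_list, function(u) {
    # rewrite the first line of scrape.js so it points at the current url
    lines    <- readLines(js_path)
    lines[1] <- paste0("var url = '", u, "';")
    writeLines(lines, js_path)
    system(paste(phantompath, js_path))         # phantomjs renders and writes cool.html
    read_html("cool.html", encoding = "UTF-8")  # parse the rendered page
  })
}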

Inject local .js file into a webpage?

I'd like to inject a couple of local .js files into a webpage. I just mean client side, as in within my browser; I don't need anybody else accessing the page to be able to see it. I just need to take a .js file and make it so it's as if that file had been included in the page's html via a <script> tag all along.
It's okay if it takes a second after the page has loaded for the stuff in the local files to be available.
It's okay if I have to be at the computer to do this "by hand" with a console or something.
I've been trying to do this for two days, I've tried Greasemonkey, I've tried manually loading files using a JavaScript console. It amazes me that there isn't (apparently) an established way to do this, it seems like such a simple thing to want to do. I guess simple isn't the same thing as common, though.
If it helps, the reason I want to do this is to run a chatbot on a JS-based chat client. Some of the bot's code is mixed into the pre-existing chat code; for that, I have Fiddler intercepting requests to .../chat.js and replacing it with a local file. But I have two .js files which are "independent" of anything on the page itself. There aren't any .js files requested by the page that I can substitute them for, so I can't use Fiddler.
Since you're already using a Fiddler script, you can do something like this in the OnBeforeResponse(oSession: Session) function:
if (oSession.oResponse.headers.ExistsAndContains("Content-Type", "html") &&
    oSession.hostname.Contains("MY.TargetSite.com")) {
  oSession.oResponse.headers.Add("DEBUG1_WE_EDITED_THIS", "HERE");
  // Remove any compression or chunking
  oSession.utilDecodeResponse();
  var oBody = System.Text.Encoding.UTF8.GetString(oSession.responseBodyBytes);
  // Find the end of the HEAD, so you can inject a script block there.
  var oRegEx = /(<\/head>)/gi;
  // replace the head-close tag with new-script + head-close
  oBody = oBody.replace(oRegEx, "<script type='text/javascript'>console.log('We injected it');</script></head>");
  // Set the response body to the changed body string
  oSession.utilSetResponseBody(oBody);
}
Working example for www.html5rocks.com:
if (oSession.oResponse.headers.ExistsAndContains("Content-Type", "html") &&
    oSession.hostname.Contains("html5rocks")) { // goto html5rocks.com
  oSession.oResponse.headers.Add("DEBUG1_WE_EDITED_THIS", "HERE");
  oSession.utilDecodeResponse();
  var oBody = System.Text.Encoding.UTF8.GetString(oSession.responseBodyBytes);
  var oRegEx = /(<\/head>)/gi;
  oBody = oBody.replace(oRegEx, "<script type='text/javascript'>alert('We injected it')</script></head>");
  oSession.utilSetResponseBody(oBody);
}
Note: you have to turn streaming off in Fiddler (http://www.fiddler2.com/fiddler/help/streaming.asp), and I assume you would need to decode HTTPS (http://www.fiddler2.com/fiddler/help/httpsdecryption.asp).
I have been using FiddlerScript less and less, in favor of Fiddler .NET extensions: http://fiddler2.com/fiddler/dev/IFiddlerExtension.asp
If you are using Chrome then check out dotjs.
It will do exactly what you want!
How about just using jquery's jQuery.getScript() method?
http://api.jquery.com/jQuery.getScript/
Save the normal HTML pages to the file system, add the .js files manually by hand, and then use Fiddler to intercept those calls so you get your version of the HTML file.
