View and edit file permissions disappear unexpectedly in Chrome with Plotly Dash - javascript

Problem:
My Plotly Dash app (python) has a clientside callback (javascript) which prompts the user to select a folder, then saves a file in a subfolder within that folder. Chrome asks for permission to read and write to the folder, which is fine, but I want the user to only have to give permission once. Unfortunately the permissions, which should persist until the tab closes, disappear often. Two "repeatable cases" are:
when the user clicks a simple button ~15 times very fast, previously accepted permissions will disappear (plotting a figure also does this in my real application)
downloading a file within a few seconds of reloading the page results in the permissions automatically going away within about 5 seconds
I can see the permissions (file and pen icon) disappear at the right of the chrome url banner.
What I've tried:
testing with Ublock Origin on/off (and removed from chrome) to see if the extension interfered (got idea from the only somewhat similar question I've come across: window.confirm disappears without interaction in Chrome)
turning debug mode off
using Edge instead of chrome (basically the same behavior was observed)
adding more computation to Test button to find repeatable case, but still needed to click it a lot to remove permissions (triggering callbacks / updating Dash components seems to be the issue, not server resources)
Example python script (dash app) to show permissions disappearing:
import dash
import dash_bootstrap_components as dbc
from dash.dependencies import Input, Output
from dash import html
app = dash.Dash(__name__, external_stylesheets=[dbc.themes.BOOTSTRAP])
app.layout = html.Div([
dbc.Button(id="model-export-button", children="Export Model"),
dbc.Label(id="test-label1", children="Click to download"),
html.Br(),
dbc.Button(id="test-button", children="Test button"),
dbc.Label(id="test-label2", children="Button not clicked")
])
# Chrome web API used for downloading: https://web.dev/file-system-access/
app.clientside_callback(
"""
async function(n_clicks) {
// Select directory to download
const directoryHandle = await window.showDirectoryPicker({id: 'save-dir', startIn: 'downloads'});
// Create sub-folder in that directory
const newDirectoryHandle = await directoryHandle.getDirectoryHandle("test-folder-name", {create: true});
// Download files to sub-folder
const fileHandle = await newDirectoryHandle.getFileHandle("test-file-name.txt", {create: true});
const writable = await fileHandle.createWritable();
await writable.write("Hello world.");
await writable.close();
// Create status message
const event = new Date(Date.now());
const msg = "File(s) saved successfully at " + event.toLocaleTimeString();
return msg;
}
""",
Output('test-label1', 'children'),
Input('model-export-button', 'n_clicks'),
prevent_initial_call=True
)
@app.callback(
    Output('test-label2', 'children'),
    Input('test-button', 'n_clicks'),
    prevent_initial_call=True
)
def test_button_function(n):
    return "Button has been clicked " + str(n) + " times"
if __name__ == "__main__":
    app.run_server(debug=False)

This is now possible! In your code, replace the line…
await window.showDirectoryPicker({id: 'save-dir', startIn: 'downloads'});
…with…
await window.showDirectoryPicker({
id: 'save-dir',
startIn: 'downloads',
mode: 'readwrite', // This is new!
});
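If the permission still lapses after that change, one defensive pattern is to keep the directory handle around and re-check its permission state before each write. The sketch below assumes the handle comes from showDirectoryPicker and relies on the handle's queryPermission and requestPermission methods. The helper name verifyPermission is made up for illustration, and requestPermission generally has to be called from a user gesture such as a button click.

```javascript
// Hypothetical helper: resolves to true when `handle` can be used in `mode`.
// `handle` is assumed to be a FileSystemHandle (e.g. from showDirectoryPicker).
async function verifyPermission(handle, mode = 'readwrite') {
  const options = { mode };
  // Already granted: no prompt is shown.
  if ((await handle.queryPermission(options)) === 'granted') {
    return true;
  }
  // Otherwise ask again; the browser may show its permission prompt here.
  return (await handle.requestPermission(options)) === 'granted';
}
```

In the clientside callback, this would mean storing directoryHandle somewhere that survives between clicks and calling verifyPermission(directoryHandle) before getDirectoryHandle, falling back to the picker only when it resolves to false.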

Related

Disable tabs opened in selenium (Node.js)

I'm using the Selenium WebDriver for Node.js and I'm also loading an extension. Loading the extension works fine, but when I start my project it goes to the page I want and then the extension instantly opens a new tab ("Thank you for adding this extension", etc.). I'm wondering if there's a way to disable tabs that I did not open myself. I've tried this:
await driver.get('https://mywebsite.com') //open my initial site
await driver.sleep(1000) //give time for the extension site to open
driver.switchTo(await driver.getAllWindowHandles()[1]) //switch to extension site
await driver.close()
driver.switchTo(await driver.getAllWindowHandles()[0]) //switch back to the main site
//rest of my code
Unfortunately this just does not seem to work, any advice appreciated!
There's no way to disable tabs not opened by your script. As long as you don't change window handles, the driver will still be on the original tab. You can proceed with the script from there, ignoring the other opened tabs.
I think the main issue I see with your code is that you are passing parameters to .switchTo() instead of .window(). It should be driver.switchTo().window(handle);.
If you want to find the new window to close it, I wrote that code in this answer. All you need to do is to add the .close() line after that code and switch back to the original handle, which you already have in your current code (after fixing with my feedback above).
Another approach is heavily based on the selenium.dev docs:
// Open the initial site
await driver.get('https://mywebsite.com')
// Store the ID of the original window
const originalWindow = await driver.getWindowHandle();
// Wait for the new window or tab
await driver.wait(async () => (await driver.getAllWindowHandles()).length === 2, 10000);
// Loop through until we find a new window handle
const windows = await driver.getAllWindowHandles();
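// for...of (rather than forEach with an async callback) ensures the switch
// has completed before driver.close() runs
for (const handle of windows) {
  if (handle !== originalWindow) {
    await driver.switchTo().window(handle);
  }
}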
await driver.close()
await driver.switchTo().window(originalWindow);
// Rest of the code
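The "find the handle that is not the original" step can also be isolated as a small pure function, which makes it easy to reason about when an extension opens more than one tab. This is only a sketch: findNewHandles is a hypothetical name, and the handle strings in the example are invented.

```javascript
// Hypothetical helper: returns the handles present in `after` but not in `before`.
function findNewHandles(before, after) {
  const known = new Set(before);
  return after.filter((handle) => !known.has(handle));
}

// Example with made-up handle strings:
const opened = findNewHandles(['main'], ['main', 'ext-welcome']);
console.log(opened); // [ 'ext-welcome' ]
```

With Selenium, before would be the result of getAllWindowHandles() taken before the extension tab appears and after the result taken once it has opened. Each returned handle can then be passed to driver.switchTo().window(handle) and closed before switching back to the original window.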

Unable to download research article from scihub using browser emulation with selenium

I am trying to automate the download of research articles from scihub (https://sci-hub.scihubtw.tw/) based on their corresponding article titles. I am using a library called scholarly (https://pypi.org/project/scholarly/) to get the url, author information related to the given article title as shown in the code below.
I use the fetched url (as described above) to emulate the download process using scihub. But I am unable to download directly, since I can't press the open button on the search page (https://sci-hub.scihubtw.tw/). And pressing enter after populating the query forwards me to another page with an open button. I am unable to fetch and press the open button for some reason and it always returns me a null element using the selenium library.
However, I am able to execute the following in the browser console and successfully download the paper:
document.querySelector("#open-button").click()
But, trying to get similar response from selenium is failing.
Kindly help me resolve this issue.
## This part of code fetches url using scholarly library from google scholar
from scholarly import scholarly
search_query = scholarly.search_pubs('Hydrogen-hydrogen pair correlation function in liquid water')
search_query = [query for query in search_query][0]
## This part of code uses selenium to automate download process
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By  # needed for the By.* locators below
from selenium.webdriver.support.ui import WebDriverWait
import time
download_dir = '/Users/cacsag4/Downloads'
# setup the browser
options = webdriver.ChromeOptions()
options.add_experimental_option('prefs', {
"download.default_directory": download_dir, #Change default directory for downloads
"download.prompt_for_download": False, #To auto download the file
"download.directory_upgrade": True,
"plugins.always_open_pdf_externally": True #It will not show PDF directly in chrome
})
browser = webdriver.Chrome('./chromedriver', options=options)
browser.delete_all_cookies()
browser.get('https://sci-hub.scihubtw.tw/')
# Find the search element to send the url string to it
searchElem = browser.find_element(By.CSS_SELECTOR, 'input[type="textbox"]')
searchElem.send_keys(search_query.bib['url'])
# Emulate pressing enter two different ways, either by pressing return key or by executing JS
#searchElem.send_keys(Keys.ENTER) # This produces the same effect as the next line
browser.execute_script("javascript:document.forms[0].submit()")
# Wait for page to load
time.sleep(10)
# Try to press the open button using JS or by fetching the button by its ID
# This returns error since its unable to fetch open-button id
browser.execute_script('javascript:document.querySelector("#open-button").click()')
#openElem = browser.find_element(By.ID, "open-button") ## This also returns a null element
OK, so I found the answer to this question. Sci-hub stores its PDF inside an iframe, so all you have to do is fetch the src attribute of the iframe after submitting the search on the first page. The following code does the job.
from scholarly import scholarly
search_query = scholarly.search_pubs('Hydrogen-hydrogen pair correlation function in liquid water')
search_query = [query for query in search_query][0]
print(search_query.bib['url'])
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By  # needed for the By.* locators below
from selenium.webdriver.support.ui import WebDriverWait
import time
download_dir = '/Users/cacsag4/Downloads'
# setup the browser
options = webdriver.ChromeOptions()
options.add_experimental_option('prefs', {
"download.default_directory": download_dir, #Change default directory for downloads
"download.prompt_for_download": False, #To auto download the file
"download.directory_upgrade": True,
"plugins.always_open_pdf_externally": True #It will not show PDF directly in chrome
})
browser = webdriver.Chrome('./chromedriver', options=options)
browser.delete_all_cookies()
browser.get('https://sci-hub.scihubtw.tw/')
# Find the search element to send the url string to it
searchElem = browser.find_element(By.CSS_SELECTOR, 'input[type="textbox"]')
searchElem.send_keys(search_query.bib['url'])
# Emulate pressing enter two different ways, either by pressing return key or by executing JS
#searchElem.send_keys(Keys.ENTER) # This produces the same effect as the next line
browser.execute_script("javascript:document.forms[0].submit()")
# Wait for page to load
time.sleep(2)
# The PDF is embedded in an iframe, so instead of clicking the open button,
# locate the iframe and navigate to the URL in its src attribute
openElem = browser.find_element(By.CSS_SELECTOR, "iframe")
browser.get(openElem.get_attribute('src'))

Download Video from URL without opening in chrome browser

I have registered for a course that has roughly 150 videos.
What I have done until now:
There is no download button available right now.
In order to get the URL of each video file, I have created the script which I run through Console as below:
The site where I am watching these videos is different than the xxxxx marked site.
e.g. I am watching on linkedin learning and video is on lynda,etc.
console.log(("<h2>"+ document.title)+"</h2>"
+
" click here ");
document.getElementsByClassName("video-next-button")[0].click();
an example of output from above code is:
<h2>Overview of QGIS features: Learning QGIS (2015)</h2>
<a href="https://files3.xxxxx.com/secure/courses/383524/VBR_MP4h264_main_SD/383524_01_01_XR15_Overview.mp4?V0lIWk4afWPs3ejN5lxsCi1SIkGKYcNR_F7ijKuQhDmS1sYUK7Ps5TYBcV-MHzdVTujT5p03HP10F_kqzhwhqi38fhOAPnNJz-dMyvA2-YIpBOI-wGtuOjItlVbRUDn6QUWpwe1sRoAl__IA1zmJn3gPvC7Fu926GViqVdLa3oLB0mxRGa7i> click here </a>
I have replaced domain name with xxxxx
This way I can cover all videos without manually clicking next (I would also like to know whether I can automate this process using some timeout technique).
When one of these links is clicked, Chrome opens the video directly in the browser window.
From there, clicking the three dots -> Download lets me save each video individually.
What I want:
Method to save all videos without the need to open individually.
Challenge
To begin with, fetching and saving large binary files is possible when:
The host server's CORS support is enabled.
Accessing the host's network from the same site-origin.
Server-to-Server.
This explains why your anchor attempt did not work: accessing the host's network from your localhost denies you the resource's content unless the host server has CORS support enabled, which is unlikely.
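Whether a request counts as coming "from the same site-origin" comes down to comparing the origin (scheme, host and port) of the page with that of the resource. A minimal sketch using the standard URL class is shown here; the function name isSameOrigin is hypothetical and the URLs are only illustrative.

```javascript
// Returns true when both URLs share scheme, host and port (the same origin).
function isSameOrigin(pageUrl, resourceUrl) {
  return new URL(pageUrl).origin === new URL(resourceUrl).origin;
}

console.log(isSameOrigin('https://www.sample-videos.com/index.php',
                         'https://www.sample-videos.com/video123/a.mp4')); // true
console.log(isSameOrigin('http://localhost:8000/',
                         'https://www.sample-videos.com/video123/a.mp4')); // false
```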
Workaround
That leaves the other two options. Accessing from the same site origin is the simplest: the strategy is to execute the fetching/saving script from the browser itself, on the host's own page, so the host server treats the requests like any others coming from the same site.
Steps
Go to the site you wish to download the files from (I used https://www.sample-videos.com).
Right-click the web page and select 'Inspect' (Ctrl + Shift + I).
Finally, switch to the 'Console' tab to start coding.
Code
const downloadVideos = (videos, marker) => {
// it's important to throttle between requests to dodge performance or network issues
const throttleTime = 10000; // in milliseconds; adjust it to suit your hardware/network capabilities
const domain = 'https://www.sample-videos.com'; // site's domain
if (marker < videos.length) {
console.log(`Download initiated for video ${videos[marker].name} # marker:${marker}`);
const anchorElement = document.createElement('a');
anchorElement.setAttribute('href', `${domain}${videos[marker].src}`);
anchorElement.setAttribute('download', videos[marker].name);
document.body.appendChild(anchorElement);
// trigger download manually
anchorElement.click();
anchorElement.remove();
marker += 1;
setTimeout(downloadVideos, throttleTime, videos, marker);
}
};
// assuming all videos are stored in an array, each video must have 'src' and 'name' attributes
const videos = [
{ src: '/video123/mp4/480/big_buck_bunny_480p_30mb.mp4', name: 'video_480p.mp4' },
{ src: '/video123/mp4/720/big_buck_bunny_720p_1mb.mp4', name: 'video_720p.mp4' }
];
// fireup
downloadVideos(videos, 0);
... ahem!
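If the video URLs have already been collected as absolute links, as in the question, they can be converted into the { src, name } shape that downloadVideos expects. The following is a sketch under that assumption: toVideoEntry is a hypothetical helper, the query string in the example is invented, and the file name is simply taken from the last path segment.

```javascript
// Hypothetical helper: converts an absolute video URL into the { src, name }
// shape expected by downloadVideos. `src` keeps the path and query string
// (the domain is prepended separately); `name` is the last path segment.
function toVideoEntry(url) {
  const parsed = new URL(url);
  const name = decodeURIComponent(parsed.pathname.split('/').pop());
  return { src: parsed.pathname + parsed.search, name };
}

const entry = toVideoEntry(
  'https://www.sample-videos.com/video123/mp4/480/big_buck_bunny_480p_30mb.mp4?token=abc'
);
console.log(entry.src);  // /video123/mp4/480/big_buck_bunny_480p_30mb.mp4?token=abc
console.log(entry.name); // big_buck_bunny_480p_30mb.mp4
```

An array of collected links could then be mapped with links.map(toVideoEntry) and passed to downloadVideos(videos, 0).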

Executing FolderBrowserDialog in powershell from client browser using javascript

I'm trying to trigger some sort of folder selection dialog. I have a working model with Node.js and PowerShell, but it only works when the server and client are on the same machine. I need the prompt to occur on the client side, triggered from the browser. From what I understand, I cannot trigger PowerShell from Chrome. So is there an alternative, or am I just screwed?
My current Powershell script
Function Select-FolderDialog
{
param([string]$Description="Select Folder",[string]$RootFolder="Desktop")
[System.Reflection.Assembly]::LoadWithPartialName("System.windows.forms") |
Out-Null
$objForm = New-Object System.Windows.Forms.FolderBrowserDialog
$objForm.Rootfolder = $RootFolder
$objForm.Description = $Description
$Show = $objForm.ShowDialog()
If ($Show -eq "OK")
{
Return $objForm.SelectedPath
}
Else
{
Write-Error "Operation cancelled by user."
}
}
$folder = Select-FolderDialog # the variable contains user folder selection
write-host $folder
My javascript function
async function asyncfindDir() {
//executes powershell script
let promise = new Promise((resolve, reject) => {
const Shell = require('node-powershell');
const ps = new Shell({
executionPolicy: 'Bypass',
noProfile: true
});
ps.addCommand('./selectfolder.ps1');
ps.invoke()
.then(output => {
//console.log(output);
var shelloutput = output;
console.log (shelloutput + '^^from external script');
res.send(shelloutput);
})
.catch(err => {
console.log('please select a directory path')
//console.log('err');
});
});
};
Is there any way to get that working locally?
Is there a trigger I'm not aware of to access that kind of dialog from the browser? I know I'm not the only person with this issue, but I have yet to see a real solution.
Short answer: No.
Longer answer, is best illustrated by rephrasing your question with a different script name:
Using my browser, can I click on a link to visit a website, and have it run a random
PowerShell script called Delete_All_Files.ps1?
That rephrasing answers why you will never be able to run a PowerShell script on a remote machine from a browser, and why browsers deliberately block it: people usually don't want all their files deleted when they click a random link in an email.
If you want to run PowerShell scripts on remote machines, then you should look into PSRemoting and Enter-PSSession.
@kuzimoto is right. If you just want to display a folder dialog box, there are easier ways to do that, and Fine Uploader is one of them.
Replying to your comment: If you want to specify a directory name, the reason you can't do it is because you are essentially asking:
Using my browser, can I click on a link to visit a website, and have
it run a script that will enumerate through all the files and folders
in my C:\ so that it can choose the folder C:\users\Justin
Miller\Desktop\SECRET FILES\?
Both operations fail for the same reason: they require local computer access, i.e. local script execution access and local directory knowledge access. Security-wise, we generally don't want a random website we visit to execute random code or to know what files and folders we have on our machines, which is why you won't be able to do what you are trying to do.
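For the narrower goal of letting the user choose a folder, Chromium-based browsers do offer a picker that the page can request but never bypass: window.showDirectoryPicker from the File System Access API. The sketch below assumes a browser that supports it; the helper name pickFolderName and its win parameter are hypothetical. The call has to happen in a secure context in response to a user gesture, and the page only learns the folder's name, not its absolute path, which is consistent with the security argument above.

```javascript
// Hypothetical helper: asks the user to pick a folder via the File System
// Access API when the browser supports it. `win` is assumed to be the page's
// `window` object; it is a parameter so the function can be tried with a stub.
async function pickFolderName(win) {
  if (typeof win.showDirectoryPicker !== 'function') {
    return null; // API not available in this browser
  }
  try {
    const handle = await win.showDirectoryPicker();
    // Only the folder's own name is exposed, never its absolute path.
    return handle.name;
  } catch (err) {
    if (err.name === 'AbortError') return null; // user dismissed the picker
    throw err;
  }
}
```

In a page this would be wired to a click handler, for example button.addEventListener('click', () => pickFolderName(window)).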

Retrieve html content of a page several seconds after it's loaded

I'm coding a script in nodejs to automatically retrieve data from an online directory.
Since I had never done this before, I chose JavaScript because it is a language I use every day.
Based on the few tips I could find on Google, I use request with cheerio to easily access elements of the page's DOM.
I found and retrieved all the necessary information. The only missing step is to retrieve the link to the next page, but that link is generated 4 seconds after the page loads and contains a hash, so this step is unavoidable.
What I would like to do is retrieve the page's DOM 4-5 seconds after it loads, so that I can get the link.
I looked on the internet, and a lot of advice recommends PhantomJS for this, but I cannot get it to work with Node after many attempts.
This is my code :
#!/usr/bin/env node
require('babel-register');
import request from 'request'
import cheerio from 'cheerio'
import phantom from 'node-phantom'
phantom.create(function(err,ph) {
return ph.createPage(function(err,page) {
return page.open(url, function(err,status) {
console.log("opened site? ", status);
page.includeJs('http://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js', function(err) {
//jQuery Loaded.
//Wait for a bit for AJAX content to load on the page. Here, we are waiting 5 seconds.
setTimeout(function() {
return page.evaluate(function() {
var tt = cheerio.load($this.html())
console.log(tt)
}, function(err,result) {
console.log(result);
ph.exit();
});
}, 5000);
});
});
});
});
but i get this error :
return ph.createPage(function (page) {
^
TypeError: ph.createPage is not a function
Is what I am doing the best way to achieve this? If not, what is the simplest way? If so, where does my error come from?
If you don't have to use PhantomJS, you can use Nightmare to do it.
It is a pretty neat library for problems like yours. It uses Electron as its web browser, and you can run it with or without showing a window (you can also open developer tools, as in Google Chrome).
Its only flaw is that if you want to run it on a server without a graphical interface, you must install at least a framebuffer.
Nightmare has methods like wait(cssSelector) that wait until some element appears on the page.
Your code would be something like:
const Nightmare = require('nightmare');
const nightmare = Nightmare({
show: true, // will show browser window
openDevTools: true // will open dev tools in browser window
});
const url = 'http://hakier.pl';
const selector = '#someElementSelectorWitchWillAppearAfterSomeDelay';
nightmare
.goto(url)
.wait(selector)
.evaluate(selector => {
return {
nextPage: document.querySelector(selector).getAttribute('href')
};
}, selector)
.then(extracted => {
console.log(extracted.nextPage); //Your extracted data from evaluate
});
// `selector` is passed as an extra argument to evaluate because the callback
// runs in the browser scope: Node.js variables are not accessible there
// unless they are injected this way
Happy hacking!
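The idea behind waiting for a selector, as opposed to sleeping for a fixed 4-5 seconds, is to poll a condition until it becomes true or a timeout expires. The sketch below illustrates that idea in plain JavaScript. It is not Nightmare's implementation; waitFor and its parameters are hypothetical names.

```javascript
// Hypothetical polling helper (not Nightmare's implementation): resolves with the
// first truthy value returned by `check`, or rejects once `timeoutMs` has elapsed.
function waitFor(check, timeoutMs = 5000, intervalMs = 100) {
  const deadline = Date.now() + timeoutMs;
  return new Promise((resolve, reject) => {
    const poll = async () => {
      try {
        const value = await check();
        if (value) return resolve(value);
        if (Date.now() >= deadline) return reject(new Error('waitFor: timed out'));
        setTimeout(poll, intervalMs); // condition not met yet, try again later
      } catch (err) {
        reject(err);
      }
    };
    poll();
  });
}
```

In a browser context, check could be something like () => document.querySelector(selector), so the promise resolves with the element as soon as the delayed link is inserted into the page.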
