Download Video from URL without opening in chrome browser - javascript

I have registered for a course that has roughly 150 videos.
What I have done until now:
There is no download button available right now.
In order to get the URL of each video file, I have created the script which I run through Console as below:
The site where I watch these videos is different from the site marked xxxxx, which hosts the files.
E.g. I watch on LinkedIn Learning and the video files are served from Lynda, etc.
console.log(("<h2>"+ document.title)+"</h2>"
+
" click here ");
document.getElementsByClassName("video-next-button")[0].click();
An example of the output from the above code:
<h2>Overview of QGIS features: Learning QGIS (2015)</h2>
<a href="https://files3.xxxxx.com/secure/courses/383524/VBR_MP4h264_main_SD/383524_01_01_XR15_Overview.mp4?V0lIWk4afWPs3ejN5lxsCi1SIkGKYcNR_F7ijKuQhDmS1sYUK7Ps5TYBcV-MHzdVTujT5p03HP10F_kqzhwhqi38fhOAPnNJz-dMyvA2-YIpBOI-wGtuOjItlVbRUDn6QUWpwe1sRoAl__IA1zmJn3gPvC7Fu926GViqVdLa3oLB0mxRGa7i"> click here </a>
I have replaced the domain name with xxxxx.
This way I can cover all the videos without clicking next manually. (I would also like to know whether I can automate this process with some timeout technique.)
When I click one of these links, Chrome opens the video in its built-in player.
From there, clicking the three dots -> Download lets me save each video individually.
What I want:
A method to save all the videos without having to open each one individually.

Challenge
To begin with, fetching and saving large binary files is possible in any of these cases:
The host server has CORS support enabled.
The request comes from the same origin as the host.
The request is made server-to-server.
This is why your anchor attempt did not work: a request made from another origin, such as your localhost, is denied access to the resource's content unless the host server has CORS enabled, which is unlikely.
Workaround
That leaves the other two options. Requesting from the same origin is the simpler one: run the fetching/saving script in the browser itself, on the host's own site, so that the host server treats the requests like the ones coming from its own pages.
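To make the same-origin condition concrete, here is a minimal sketch that checks whether a file URL shares the origin of the page the script runs on. isSameOrigin is a hypothetical helper, not part of the original answer, and files3.example.com is a made-up host used only for illustration.

```javascript
// Returns true when resourceUrl resolves to the same origin
// (scheme + host + port) as pageOrigin.
// Relative URLs are resolved against pageOrigin first.
function isSameOrigin(resourceUrl, pageOrigin) {
  const resource = new URL(resourceUrl, pageOrigin);
  return resource.origin === new URL(pageOrigin).origin;
}

// A relative path on the page's own site: same origin.
console.log(isSameOrigin('/video123/mp4/720/big_buck_bunny_720p_1mb.mp4', 'https://www.sample-videos.com'));
// A file served from another host (hypothetical): different origin.
console.log(isSameOrigin('https://files3.example.com/a.mp4', 'https://www.sample-videos.com'));
```

In the browser console, location.origin can be passed as pageOrigin. The check matters because Chrome, for one, honors an anchor's download attribute only for same-origin URLs, so a script like this is best run from a page on the host that actually serves the files.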
Steps
Go to the site you wish to download the files from (I used https://www.sample-videos.com).
Right-click the web page and select 'Inspect' (Ctrl + Shift + I).
Finally, switch to the 'Console' tab to start coding.
Code
const downloadVideos = (videos, marker) => {
  // it's important to throttle between requests to dodge performance or network issues
  const throttleTime = 10000; // in milliseconds; adjust it to suit your hardware/network capabilities
  const domain = 'https://www.sample-videos.com'; // site's domain
  if (marker < videos.length) {
    console.log(`Download initiated for video ${videos[marker].name} # marker:${marker}`);
    const anchorElement = document.createElement('a');
    anchorElement.setAttribute('href', `${domain}${videos[marker].src}`);
    anchorElement.setAttribute('download', videos[marker].name);
    document.body.appendChild(anchorElement);
    // trigger download manually
    anchorElement.click();
    anchorElement.remove();
    marker += 1;
    setTimeout(downloadVideos, throttleTime, videos, marker);
  }
};
// assuming all videos are stored in an array, each video must have 'src' and 'name' attributes
const videos = [
  { src: '/video123/mp4/480/big_buck_bunny_480p_30mb.mp4', name: 'video_480p.mp4' },
  { src: '/video123/mp4/720/big_buck_bunny_720p_1mb.mp4', name: 'video_720p.mp4' }
];
// fireup
downloadVideos(videos, 0);
... ahem!
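The question also asks whether collecting the links can be automated with a timeout. A minimal sketch follows, assuming the player exposes the current file through a video element and that the next button keeps the class name from the question. collectVideos and its callback parameters are hypothetical names, and the 3-second default delay is a guess to tune.

```javascript
// Step through the course player and collect one { name, src } entry per video.
// getTitle, getSrc and clickNext are hypothetical hooks supplied by the caller.
async function collectVideos({ getTitle, getSrc, clickNext, count, delayMs = 3000 }) {
  const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
  const videos = [];
  for (let i = 0; i < count; i += 1) {
    const src = getSrc();
    if (src) {
      videos.push({ name: `${getTitle()}.mp4`, src });
    }
    if (i < count - 1) {
      clickNext();
      await sleep(delayMs); // give the player time to load the next video
    }
  }
  return videos;
}

// Browser usage (assumed selectors; run from the console on the course page):
// collectVideos({
//   getTitle: () => document.title,
//   getSrc: () => { const v = document.querySelector('video'); return v ? v.currentSrc : ''; },
//   clickNext: () => document.getElementsByClassName('video-next-button')[0].click(),
//   count: 150,
// }).then((videos) => console.log(JSON.stringify(videos, null, 2)));
```

The entries collected this way hold absolute URLs, so they would need the domain prefix in downloadVideos set to an empty string. Page titles may also contain characters that are not valid in file names (a colon, for example), so the names may need cleaning before use.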

Related

Prevent user from saving video files

I am currently building an application with ReactJs and MongoDB that displays videos. My problem is that I want to prevent end users from saving those videos, either by opening the console, inspecting the page, and getting the URL of the video, or by simply downloading it to their computer.
At the moment, I have a script that logs the user out if they open DevTools:
useEffect(() => {
  console.log(Object.defineProperties(new Error, {
    message: {get() {
      setOpened(true)
    }
    },
    toString: {value() {(new Error).stack.includes('toString#')&&alert('Safari')}}
  }));
  if (openned) {
    logoutHandler()
  }
}, []);
And another one to prevent right-clicking:
useEffect(() => {
  document.onkeydown = function (e) {
    return false;
  };
  document.addEventListener('contextmenu', (e) => {
    e.preventDefault();
    setOpened(true)
  });
}, []);
The problem is that with a simple add-on, https://addons.mozilla.org/fr/firefox/addon/absolute-enable-right-click/, the user can right click again and save the video by clicking on "Save video as...".
I have also thought of splitting the videos when I upload them and then kind of "stream" them, but I haven't found any proper documentation on the subject...
Currently the videos are stored in a Firebase bucket.
Do you have any advice on the matter?
IMO the best you can do is make it difficult, and it sounds like you are doing a very good job of that. (Minify your code to make it harder for users to get past your checks with local overrides; you can also look into DRM.)
However, you cannot 100% prevent users from saving the video if they are left alone with the source.
A user can just set up a camera, record the screen, sniff packets (Wireshark), modify your JavaScript (local overrides), and so on.
Firewalls don't stop dragons and all that.
I believe the only way to do what you are trying to do is to monitor users while they watch the videos.
If your users need accounts to view the videos, then flashing a few pixels at special spots, to be decoded later as the user's ID, is one way to track down who is leaking your videos and remove them, but even that has its issues (compression, cropping, watermarks, etc.).
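The pixel idea above boils down to turning a user ID into a short sequence of on/off marks and reading it back later. Here is a minimal sketch of just that encoding step, assuming numeric user IDs below 65536; userIdToBits and bitsToUserId are hypothetical names, and drawing the bits onto the video (and recovering them from a recording) is left out.

```javascript
// Encode a numeric user ID as a fixed-width list of bits, most significant first.
// Each bit would be rendered as a light or dark pixel at a chosen spot.
function userIdToBits(userId, width = 16) {
  const bits = [];
  for (let i = width - 1; i >= 0; i -= 1) {
    bits.push(Math.floor(userId / 2 ** i) % 2);
  }
  return bits;
}

// Rebuild the user ID from the bits recovered from a leaked copy.
function bitsToUserId(bits) {
  return bits.reduce((id, bit) => id * 2 + bit, 0);
}

const bits = userIdToBits(1234);
console.log(bits.join(''), bitsToUserId(bits));
```

As the answer notes, compression and cropping can destroy such marks, so a real scheme would likely need redundancy on top of this.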

iframe content doesn't always load

So I have a system that essentially enables communication between two computers, and uses a WebRTC framework to achieve this:
"The Host": This is the control computer, and clients connect to it. The host controls the client's window.
"The Client": This is the user on the other end. Their window is controlled by the host.
What I mean by control is that the host can:
change CSS on the client's open window.
control the URL of an iframe on the client's open window.
There are variations on these, but essentially that's the amount of control there is.
When "the client" logs in, the host sends a web address to the client. This web address will then be displayed in an iframe, as such:
$('#iframe_id').attr("src", URL);
There is also the ability to send a new web address to the client, in the form of a message. The same code as above is used to navigate to that URL.
The problem I am having is that on roughly 1 in 4 computers the iframe doesn't actually load. It either displays a white screen, or it shows the little "page could not be displayed" icon.
I have been unable to reliably duplicate this bug.
I have not seen a clear pattern between computers that can and cannot view the iframe content.
All clients are running Google Chrome, most on an Apple Power Mac. The only semi-link I have made is that Windows computers seem slightly more susceptible to it, but not in a way I can reproduce. Sometimes refreshing the page works...
Are there any known bugs that could possibly cause this to happen? I have read about iframe white flashes but I am confident it isn't that issue. I am confident it isn't a problem with jQuery loading because that produces issues before this and would be easy to spot.
Thanks so much.
Alex
edit: Ok so here is the code that is collecting data from the server. Upon inspection the data being received is correct.
conn.on('data', function(data) {
  var data_array = JSON.parse(data);
  console.log(data_array);
  // initialisation
  if(data_array.type=='init' && inititated === false) {
    if(data_array.duration > 0) {
      set_timeleft(data_array.duration); // how long is the exam? (minutes)
    } else {
      $('#connection_remainingtime').html('No limits');
    }
    $('#content_frame').attr("src", data_array.uri); // url to navigate to
    //timestarted = data_array.start.replace(/ /g,''); // start time
    ob = data_array.ob; // is it open book? Doesnt do anything really... why use it if it isnt open book?
    snd = data_array.snd; // is sound allowed?
    inititated = true;
  }
});
It is definitely trying to make the iframe navigate somewhere, because when the client launches, the iframe changes: it is trying to load something but failing.
EDIT: Update on this issue: It does actually work, just not with Google Forms. And again, it isn't everybody's computer, only a few people's. If they navigate elsewhere (http://www.bit-tech.net for example) then it works just fine.
** FURTHER UPDATE **: It seems that on the ones that fail, there is an 'X-Frame-Options' issue, in that it is set to 'SAMEORIGIN'. I don't understand why some students would get this problem and some wouldn't... surely it depends on the page you are navigating to, and if one person can load it, all should be able to?
So the problem here was that the students were trying to load this behind a proxy server which has an issue with cookies. Although the site does not use cookies, the proxy does, and when a student had blocked "third party cookies" in their settings, the proxy did not allow the site to load.
Simply allowing cookies made it work :)
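For reference, the SAMEORIGIN value mentioned in the update works roughly as modelled below. This is a simplified sketch, not a browser's actual implementation: allowsFraming is a hypothetical helper, exam.example.com is a made-up embedding page, unrecognised header values are treated as no restriction (an assumption of this sketch), and Content-Security-Policy frame-ancestors rules are not covered.

```javascript
// Decide whether a response's X-Frame-Options header lets a page at
// pageOrigin display frameUrl inside an iframe (simplified model).
function allowsFraming(xFrameOptions, frameUrl, pageOrigin) {
  if (!xFrameOptions) {
    return true; // no header: nothing in this check forbids framing
  }
  const value = xFrameOptions.trim().toUpperCase();
  if (value === 'DENY') {
    return false;
  }
  if (value === 'SAMEORIGIN') {
    return new URL(frameUrl).origin === new URL(pageOrigin).origin;
  }
  return true; // assumption: unrecognised values are ignored
}

// A page on another origin sending SAMEORIGIN cannot be framed.
console.log(allowsFraming('SAMEORIGIN', 'https://docs.google.com/forms/', 'https://exam.example.com'));
```

This shows why a response carrying SAMEORIGIN fails inside a third-party iframe while the same URL opens fine in its own tab.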
iframes are one of the last things to load in the DOM, so wrap your iframe dependent code in this:
document.getElementById('content_frame').onload = function() {...}
If that doesn't work then it's the document within the iframe. If you own the page inside the iframe then you have options. If not...setTimeout? Or window.onload...?
SNIPPET
conn.on('data', function(data) {
  var data_array = JSON.parse(data);
  console.log(data_array);
  // initialisation
  if (data_array.type == 'init' && inititated === false) {
    if (data_array.duration > 0) {
      set_timeleft(data_array.duration); // how long is the exam? (minutes)
    } else {
      $('#connection_remainingtime').html('No limits');
    }
    document.getElementById('content_frame').onload = function() {
      $('#content_frame').attr("src", data_array.uri); // url to navigate to
      //timestarted = data_array.start.replace(/ /g,''); // start time
      ob = data_array.ob; // is it open book? Doesnt do anything really... why use it if it isnt open book?
      snd = data_array.snd; // is sound allowed?
      inititated = true;
    };
  }
});

How does this JavaScript open Windows Settings in Firefox?

After a new install of Firefox 45 Developer Edition, I saw this page. It has a button ("Let's do it") that when clicked, somehow opens up the Choose default apps settings page in Windows 10.
https://www.mozilla.org/en-US/firefox/windows-10/welcome/?utm_source=firefox-browser&utm_medium=firefox-browser
How is this done? I couldn't find anything through the Developer Console in the labyrinthine code on that page. Besides, I would have thought browsers don't allow JavaScript to open something as sensitive as the Settings app.
The page fires a custom event of type mozUITour on the document. This event is handled in the browser by content-UITour.js, which shovels out most of the actual processing to UITour.jsm. The unobfuscated client-side code can be viewed in UITour-lib.js.
Cutting through all the client-side abstraction, this is what’s happening:
document.dispatchEvent(new CustomEvent('mozUITour', {
  bubbles: true,
  detail: {
    action: 'setConfiguration',
    data: {
      configuration: 'defaultBrowser'
    }
  }
}));
The browser then handles the event and dispatches it to another internal event queue, where it is processed by calling into nsIShellService::setDefaultBrowser, implemented by nsWindowsShellService.cpp. On what’s currently line 943, we have:
if (IsWin10OrLater()) {
  rv = LaunchModernSettingsDialogDefaultApps();
} else {
  rv = LaunchControlPanelDefaultsSelectionUI();
}
And LaunchModernSettingsDialogDefaultApps, I think, is a pretty descriptive function name.
Now, from your comment, “in a way that one could use it on their own page, for example”? Not so likely. content-UITour.js checks that the page has the uitour permission. From browser/app/permissions, we have:
# UITour
origin uitour 1 https://www.mozilla.org
origin uitour 1 https://self-repair.mozilla.org
origin uitour 1 https://support.mozilla.org
origin uitour 1 about:home
So unless you’re www.mozilla.org, self-repair.mozilla.org, support.mozilla.org, or about:home, you can’t do it, at least not by default. Before Firefox 15 (17 with a manual settings change, see this bug for more information), you might be able to use netscape.security.PrivilegeManager.enablePrivilege to request extra permissions from the browser, but that’s not around any more, and I’m not sure that even touches the same permission mechanism.

IE 9 and 10 yield unexpected and inconsistent MediaError's

We have a set of HTML blocks -- say around 50 of them -- which are iteratively parsed and have Audio objects dynamically added:
var SomeAudioWrapper = function(name) {
  this.internal_player = new Audio();
  this.internal_player.src = this.determineSrcFromName(name);
  // ultimately an MP3
  this.play = function() {
    if (someOtherConditionsAreMet()) {
      this.internal_player.play();
    }
  }
}
Suppose we generate about 40 to 80 of these on page load, but always the same set for a particular configuration. In all browsers tested, this basic strategy appears to work. The audio files load and play successfully.
In IE 9 and 10, a transient bug surfaces. On occasion, calling .play() on the inner Audio object fails. Upon inspection, the inner Audio object has an .error.code of 4 (MEDIA_ERR_SRC_NOT_SUPPORTED). The file's .duration shows NaN.
However, this only happens occasionally, and to some random subset of the audio files. E.g., usually file_abc.mp3 plays, but sometimes it generates the error. The network monitor shows a successful download in either case. And attempting to reload the file via the console also fails -- and no request appears in IE's network monitor:
var a = new Audio();
a.src = "the_broken_file.mp3";
a.play(); // fails
a.error.code; // 4
Even appending a query value fails to refetch the audio or trigger any network requests:
var a = new Audio();
a.src = "the_broken_file.mp3?v=12345";
a.play(); // fails
a.error.code; // 4
However, attempting to load the broken audio file in a new tab using the same code works: the "unsupported src" plays perfectly.
Are there any resource limits we could be hitting? (Maybe the "unsupported" audio finishes downloading late?) Are there any known bugs? Workarounds?
I think we can pretty easily detect when a file fails. For other compatibility reasons we run a loop to check audio progress and completion stats to prevent progression through the app (an assessment) until the audio is complete. We could easily look for .error values -- but if we find one, what do we do about it!?
Addendum: I just found a related question (IE 9/10/11 sound file limit) that suggests there's an undocumented limit of 41 -- not sure whether that's a limit of "41 requests for audio files", "41 in-memory audio objects", or what. I have yet to find any M$ documentation on the matter -- or known solutions.
Have you seen these pages on the audio file limits within IE? They are specific to SoundJS, but the information may be applicable to your issue:
https://github.com/CreateJS/SoundJS/issues/40 ...
Possible solution as mentioned in the last comment: "control the maximum number of audio tags depending on the platform and reuse these instead of recreating them"
Additional Info: http://community.createjs.com/kb/faq/soundjs-faq (see the section entitled “I load a lot of sounds, why am running into errors in Internet Explorer?”)
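The "reuse instead of recreate" suggestion quoted above can be sketched as a small fixed-size pool. This is a minimal illustration, not SoundJS code: AudioPool is a hypothetical name, the pool size is left to the caller, and the player factory is injected so that the browser's Audio constructor is only an assumption of the usage line at the bottom.

```javascript
// A fixed-size pool of audio players that are reused instead of recreated.
// createPlayer is a caller-supplied factory returning an object with
// src, play() and onended, such as an Audio element in a browser.
function AudioPool(size, createPlayer) {
  this.idle = [];
  for (var i = 0; i < size; i += 1) {
    this.idle.push(createPlayer());
  }
}

// Plays src on an idle player; returns null when every player is in use.
AudioPool.prototype.play = function (src) {
  var pool = this;
  var player = this.idle.pop();
  if (!player) {
    return null;
  }
  player.onended = function () {
    player.onended = null;
    pool.idle.push(player); // hand the player back for reuse
  };
  player.src = src;
  player.play();
  return player;
};

// Browser usage (the size 8 is an arbitrary example value):
// var pool = new AudioPool(8, function () { return new Audio(); });
// pool.play('file_abc.mp3');
```

With a pool like this the number of live audio elements never exceeds the chosen size, however many files the page references.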
I have not experienced this problem in Edge or IE11. But, I wrote a javascript file to run some tests by looping through 200 audio files and seeing what happens. What I found is that the problem for IE9 and IE10 is consistent between ALL tabs. So, you are not even guaranteed to be able to load 41 files if other tabs have audio opened.
The app that I am working on has a custom sound manager. Our solution is to disable preloading audio for IE9 and IE10 (just load on demand) and then when the onended or onpause callback gets triggered, to run:
this.src = '';
This frees the audio resources that IE holds for the element. I should warn, though, that it may trigger a request to the page the user is currently on. When the play method in the sound manager is called again, set the src and play it.
I haven't tested this code, but I wrote something similar that works. What I think you could do for your implementation, is resolve the issue by using a solution like this:
var isIE = window.navigator.userAgent.match(/MSIE (9|10)/);
var SomeAudioWrapper = function(name) {
  var src = this.determineSrcFromName(name);
  this.internal_player = new Audio();
  // If the browser is IE9 or IE10, remove the src when the
  // audio is paused or done playing. Otherwise, set the src
  // at the start.
  if (isIE) {
    this.internal_player.onended = function() {
      this.src = '';
    };
    this.internal_player.onpause = this.internal_player.onended;
  } else {
    this.internal_player.src = src;
  }
  this.play = function() {
    if (someOtherConditionsAreMet()) {
      // If the browser is IE, set the src before playing.
      if (isIE) {
        this.internal_player.src = src;
      }
      this.internal_player.play();
    }
  }
}

PhantomJS using too many threads

I wrote a PhantomJS app to crawl over a site I built and check for a JavaScript file to be included. The JavaScript is similar to Google where some inline code loads in another JS file. The app looks for that other JS file which is why I used Phantom.
What's the expected result?
The console output should read through a ton of URLs and then tell if the script is loaded or not.
What's really happening?
The console output will read as expected for about 50 requests and then just start spitting out this error:
2013-02-21T10:01:23 [FATAL] QEventDispatcherUNIXPrivate(): Can not continue without a thread pipe
QEventDispatcherUNIXPrivate(): Unable to create thread pipe: Too many open files
This is the block of code that opens a page and searches for the script include:
page.open(url, function (status) {
  console.log(YELLOW, url, status, CLEAR);
  var found = page.evaluate(function () {
    if (document.querySelectorAll("script[src='***']").length) {
      return true;
    } else { return false; }
  });
  if (found) {
    console.log(GREEN, 'JavaScript found on', url, CLEAR);
  } else {
    console.log(RED, 'JavaScript not found on', url, CLEAR);
  }
  self.crawledURLs[url] = true;
  self.crawlURLs(self.getAllLinks(page), depth-1);
});
The crawledURLs object is just an object of urls that I've already crawled. The crawlURLs function just goes through the links from the getAllLinks function and calls the open function on all links that have the base domain of the domain that the crawler started on.
Edit
I modified the last block of the code to be as follows, but still have the same issue. I have added page.close() to the file.
if (!found) {
  console.log(RED, 'JavaScript not found on', url, CLEAR);
}
self.crawledURLs[url] = true;
var links = self.getAllLinks(page);
page.close();
self.crawlURLs(links, depth-1);
From the documentation:
Due to some technical limitations, the web page object might not be completely garbage collected. This is often encountered when the same object is used over and over again.
The solution is to explicitly call close() of the web page object (i.e. page in many cases) at the right time.
Some included examples, such as follow.js, demonstrate multiple page objects with explicit close.
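One way to apply the explicit close() advice is to crawl one URL at a time, closing each page before the next one is opened, so that only one page object is alive at any moment. The sketch below is an assumption about how the crawler could be restructured, not the asker's code: crawlSequentially and its parameters are hypothetical names, and the page factory is injected so the only PhantomJS calls relied on are open and close on the page object.

```javascript
// Crawl URLs one at a time; each page is closed before the next is opened.
// createPage returns a page object with open(url, callback) and close().
// onResult may return an array of newly discovered URLs to append to the queue.
function crawlSequentially(urls, createPage, onResult, onDone) {
  var queue = urls.slice();
  function next() {
    if (queue.length === 0) {
      if (onDone) { onDone(); }
      return;
    }
    var url = queue.shift();
    var page = createPage();
    page.open(url, function (status) {
      var discovered;
      try {
        discovered = onResult(url, status, page);
      } finally {
        page.close(); // release the page even if onResult throws
      }
      if (discovered && discovered.length) {
        queue.push.apply(queue, discovered);
      }
      next();
    });
  }
  next();
}

// PhantomJS usage:
// crawlSequentially(startUrls, function () { return require('webpage').create(); },
//   function (url, status, page) { console.log(url, status); },
//   function () { phantom.exit(); });
```

The caller remains responsible for skipping URLs that were already crawled before returning them from onResult, as the crawledURLs object does in the question.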
Open Files Limit.
Even with closing files properly, you might still run into this error.
After scouring the internets I discovered that you need to increase your limit of the number of files a single process is allowed to have open. In my case, I was generating PDFs with hundreds to thousands of pages.
There are different ways to adjust this setting based on the system you are running but here is what worked for me on an Ubuntu server:
Add the following to the end of /etc/security/limits.conf:
# Sets the open file maximum here.
# Generating large PDFs hits the default ceiling (1024) quickly.
* hard nofile 65535
* soft nofile 65535
root hard nofile 65535 # Need these two lines because the wildcards (above)
root soft nofile 65535 # are not applied to the root user as well.
A good reference for the ulimit command can be found here.
I hope that puts some people on the right track.
I had this error come up while running multiple threads in my Ruby program.
I was running phantomjs with Capybara-poltergeist, and each thread was visiting a page, opening up the same CSV file, and writing to it.
I was able to fix it by using the Mutex class.
lock = Mutex.new
lock.synchronize do
CSV.open("reservations.csv", "w") do |file|
file << ["Status","Name","Res-Code","LS-Num","Check-in","Check-out","Talk-URL"]
$status.length.times do |i|
file << [$status[i],$guest_name[i],$reservation_code[i],$listing_number[i],$check_in[i],$check_out[i], $talk_url[i]]
end
end
puts "#{user.email} PAGE NUMBER ##{p+1} WRITTEN TO CSV"
end
end
