How to get HTML code source from another site - javascript

How do I get the HTML of another site when it requires cookies to be enabled?
I just need to parse this page: www.fx-trend.com/pamm/rating/
I'm using JavaScript with jQuery (jQuery Mobile) and sometimes PHP (I'd prefer to use JS).
Here is a sample with PHP:
<?php
$url = 'url';
$html = file_get_html($url);
//$html = file_get_contents($url);
echo $html;
?>
Here is a sample with JS:
How to get data with JavaScript from another server?
OR
$(this).load(url);
alert($(this)); //returns object Object
Server answer:
Cookies must be enabled in your browser! Try to clear all cookies, if
cookies are enabled.
Code samples are welcome.

Try using cURL with cookies enabled. The code sample below is taken from this page.
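<?php
/* STEP 1. Create a temporary file to hold the cookies */
$ckfile = tempnam("/tmp", "CURLCOOKIE");
/* STEP 2. Visit the page so the server can set its cookie */
$ch = curl_init("url");
curl_setopt($ch, CURLOPT_COOKIEJAR, $ckfile);   // write received cookies to this file when the handle is closed
curl_setopt($ch, CURLOPT_COOKIEFILE, $ckfile);  // enable the cookie engine so stored cookies are sent back
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // in case the server redirects after setting the cookie
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the response instead of printing it
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
$output = curl_exec($ch);
curl_close($ch);
var_dump($output);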
Edit: You might have to fake a browser by changing the default user agent header.

Related

Web scraping for dynamic content

I am trying to scrape information from a couple of sites (mega.nz, openload.co), but the content is loaded dynamically, so the code I am currently using doesn't work:
<?php
require 'simple_html_dom.php';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,"https://openload.co/f/41I9Ak_QBxw/DPLA.mp4");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
curl_close($ch);
echo $response;
$html = new simple_html_dom();
$html->load($response);
foreach ($html->find('img[id=imagedisplay]') as $key ) {
echo $key;
}
?>
When I use it on openload (as in the example above), it redirects me to "https://oload.download/scraping/", where "/scraping" is the folder that contains my script.
Is there any JavaScript/jQuery framework (or PHP library) that I can use to scrape the content on the fly?
It's not suitable for a large amount of scraping, but in the past when I've needed to grab some basic data from a dynamic web page I've found that Selenium works pretty well.
Depending on your stack of choice, I'd recommend looking into headless browsers. This way you can render a page in the background and parse the resulting HTML.

How can I use cURL with JavaScript

I am using cURL to open a page and want to use JavaScript to play the video shown on that page. I have used the following code:
$url = "https://www.example.com/";
$link = "http://www.example.com/oembed?url=" . $url. "&format=json";
$curl = curl_init($link);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
$return = curl_exec($curl);
curl_close($curl);
$result = json_decode($return, true);
echo '<pre>'; print_r($result);
echo $result['html'];
play();
function play(){
document.getElementById("play-button").click();
}
My cURL call is working, but it doesn't play the video. Where am I going wrong? Do I have to pass the XPath of the button to play the video?
PHP scripts are executed on the server, while JavaScript is executed in the browser (Node.js is an exception). Your PHP code has therefore already finished by the time the browser runs any JavaScript, and the two cannot call each other directly: PHP cannot trigger the click, and the browser cannot run the cURL call.
What you need to do is request the URL asynchronously from JavaScript. You can use either Ajax (XMLHttpRequest) or the Fetch API for this.
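As a rough sketch of the Fetch approach, the oEmbed request from the question could be made in the browser along these lines. The container id "player" is a hypothetical name that is not in the question, and the sketch assumes the oEmbed endpoint allows cross-origin requests and returns JSON with an "html" field, as the PHP code in the question implies.

```javascript
// Build the oEmbed request URL. The page URL is percent-encoded because
// it is passed as a query-string value.
function buildOembedUrl(pageUrl) {
  return "http://www.example.com/oembed?url=" + encodeURIComponent(pageUrl) + "&format=json";
}

// Browser-only: fetch the embed HTML, insert it into the page, then click
// the play button. "player" is a hypothetical container id; "play-button"
// is the id used in the question.
async function loadAndPlay(pageUrl) {
  const response = await fetch(buildOembedUrl(pageUrl));
  if (!response.ok) {
    throw new Error("oEmbed request failed: " + response.status);
  }
  const data = await response.json();
  document.getElementById("player").innerHTML = data.html;
  const button = document.getElementById("play-button");
  if (button) {
    button.click();
  }
}
```

In a page this would be called as loadAndPlay("https://www.example.com/"). If the returned embed HTML is an iframe from another origin, the button inside it cannot be reached from your page's JavaScript.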

cURL command not working on server. Is there any alternative?

<?php
if(isset($_POST["submit"]))
{
$adm=$_POST["admno"];
$phn=$_POST["phn1"];
include("model.php");
$db = new database;
$r=$db->register($adm);
while($row=mysql_fetch_array($r))
{
if($row["phn_no1"]==$phn || $row["phn_no2"]==$phn || $row["phn_no3"]==$phn)
{
$formatted = "".substr($phn,6,10)." ";
$password = $formatted + $adm;
echo $password;
$db->setpassword($adm,$password);
$pre = 'PREFIX';
$suf = '%20ThankYou';
$sms = $pre.$password.$suf;
session_start();
$ch = curl_init("http://www.perfectbulksms.in/Sendsmsapi.aspx? USERID=ID&PASSWORD=PASS&SENDERID=SID&TO=$phn&MESSAGE=$sms");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
$result = curl_exec($ch);
curl_close($ch);
header("Location:password.php?msg=new");
}
else
{
header("Location:register.php?msg=invalid");
}
}
}
?>
This code works perfectly on my localhost, but when I put it on the server it takes a long time and the cURL call does not work; it only redirects to the next page. I checked that cURL is enabled. If I call the SMS API directly without cURL, it sends the SMS immediately, but I want to run the header redirect as well and also hide my SMS API. Is there an alternative?
Check whether a simple wget or curl from the server to the SMS API works (substitute real values for $phn and $sms):
$ wget "http://www.perfectbulksms.in/Sendsmsapi.aspx?USERID=ID&PASSWORD=PASS&SENDERID=SID&TO=$phn&MESSAGE=$sms"
$ curl "http://www.perfectbulksms.in/Sendsmsapi.aspx?USERID=ID&PASSWORD=PASS&SENDERID=SID&TO=$phn&MESSAGE=$sms"
If wget or curl works, then something is wrong with your code.
If neither works from the server, outgoing traffic on port 80 may be blocked by your hosting provider or ISP. Check with them.
You can also try
telnet www.perfectbulksms.in 80
and see whether it connects.
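One more thing worth ruling out, as an assumption rather than a confirmed cause: the URL in the question has a space after "?" and passes the message without encoding it, and a malformed URL can fail in one environment while appearing to work in another. The sketch below shows the idea of encoding each query parameter, written in JavaScript for illustration; the helper name buildSmsUrl and the sample phone number and message are made up. In PHP the same effect comes from urlencode() or http_build_query().

```javascript
// Join parameters into a query string, percent-encoding every key and value.
function buildSmsUrl(base, params) {
  const query = Object.entries(params)
    .map(([key, value]) => encodeURIComponent(key) + "=" + encodeURIComponent(value))
    .join("&");
  return base + "?" + query;
}

const smsUrl = buildSmsUrl("http://www.perfectbulksms.in/Sendsmsapi.aspx", {
  USERID: "ID",
  PASSWORD: "PASS",
  SENDERID: "SID",
  TO: "9999999999",               // placeholder number
  MESSAGE: "PREFIX1234 ThankYou", // placeholder message containing a space
});

console.log(smsUrl);
// http://www.perfectbulksms.in/Sendsmsapi.aspx?USERID=ID&PASSWORD=PASS&SENDERID=SID&TO=9999999999&MESSAGE=PREFIX1234%20ThankYou
```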

Using cURL to get script content [duplicate]

I'm using cURL to access a site. The problem is that content that I need to grab is generated by a script as:
function Button(){
...
document.getElementById("out").innerHTML = name;
}
<p id="out"></p>
With cURL, I have the code of the page but not the content.
I'm using this config:
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_REFERER, $referer);
curl_setopt($curl, CURLOPT_COOKIEFILE, $cookiefile);
$redirects=5000;
$data = curl_redirect_exec($curl,$redirects);
curl_close($curl);
How can I get the content generated by the script?
You cannot get content rendered by JavaScript using PHP cURL, because cURL only downloads the page source and does not execute scripts. What you need is a headless browser, such as PhantomJS or CasperJS, which can run client-side JavaScript.

jQuery with Google TTS produces no sound

I use this code to have Google TTS speak a list of words. The code works, but only if I first open http://translate.google.com/translate_tts?tl=en&q=dog (with the word in question) in the browser; if I have not opened that URL first, the code cannot speak the word. I have read "Google Translate TTS problem". I want to know what the real problem is and how to fix it.
In Firefox it works better, but has the problem described above.
In IE I get an audio error: file type not supported.
In Chrome nothing happens; even //translate.google.com/translate_tts?tl=en&q=dog gives no sound.
I want to know how to make it work in IE and Firefox. Thanks a lot.
HTML
<form id="say-form">
<button id="say-button">Say!</button>
<audio id="audio" preload controls>
<source id="s1" />
</audio>
</form>
jQuery
$('#say-form').submit(function(){
var ar = new Array("dog","egg","what","big")
var i=0,file = $("#audio")
console.log(ar[0])
$("#s1").attr("src", "http://translate.google.com/translate_tts?tl=en&q="+ar[0]).detach().appendTo("#audio");
file[0].load();
file[0].play();
i++;
// when it play end, play next word until ar array it's finish
file.on( "ended", function(){
if(i!=ar.length)
{
$("#s1").attr("src", "http://translate.google.com/translate_tts?tl=en&q="+ar[i]).detach().appendTo("#audio");
$(this)[0].load();
$(this)[0].play();
i++;
}
});
return false;
});
Why don't you use PHP?
<?php
$text = urlencode('my text');
$url = "http://translate.google.com/translate_tts?ie=utf-8&tl=en&q=".$text;
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the audio data instead of printing it
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1)");
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
$return = curl_exec($ch);
curl_close($ch);
header('Content-Type: audio/mpeg'); // the response is MP3 audio
echo $return;
?>
See also "Google tts api giving me blank mp3"
or http://ctrlq.org/code/19147-text-to-speech-php
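If the audio is served from your own server as suggested above, the client side only needs to play the clips one after another. The following is a minimal sketch of that sequencing, not a fix for the remote service rejecting direct requests. It assumes the PHP script is saved under the hypothetical name tts.php and reads the text from a "q" query parameter (the PHP above hard-codes the text, so that part would need adapting); playAll and proxyUrl are illustrative names.

```javascript
// Play each word in turn; the next clip starts when the current one ends.
// urlFor maps a word to an audio URL, for example a same-origin PHP proxy.
function playAll(audio, words, urlFor) {
  let i = 0;
  function next() {
    if (i >= words.length) {
      audio.removeEventListener("ended", next); // stop listening once the list is done
      return;
    }
    audio.src = urlFor(words[i]);
    i += 1;
    audio.load();
    audio.play();
  }
  audio.addEventListener("ended", next); // registered once per call
  next();
}

// Hypothetical proxy endpoint: assumes a tts.php script that reads "q".
function proxyUrl(word) {
  return "tts.php?q=" + encodeURIComponent(word);
}
```

Inside the submit handler from the question this could be called as playAll(document.getElementById("audio"), ["dog", "egg", "what", "big"], proxyUrl). Registering the "ended" listener once per run avoids the original code's problem of adding another handler on every submit.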
