I can successfully scrape all the items on this page using this script:
$html = file_get_contents($list_url);
$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors(); // remove errors for yucky html
$xpath = new DOMXPath($doc);
$products = array();
$row = $xpath->query($product_location);
/* Create an array containing products */
if ($row->length > 0) {
    foreach ($row as $location) {
        $product_urls[] = $product_url_root . $location->getAttribute('href');
    }
} else {
    echo "product location is wrong<br>";
}
$imgs = $xpath->query($photo_location);
/* Create an array containing the image links */
if ($imgs->length > 0) {
    foreach ($imgs as $img) {
        $photo_url[] = $photo_url_root . $img->getAttribute('src');
    }
} else {
    echo "photo location is wrong<br>";
}
$was = $xpath->query($was_price_location);
/* Create an array containing the was price */
if ($was->length > 0) {
    foreach ($was as $price) {
        $stripped = preg_replace("/[^0-9,.]/", "", $price->nodeValue);
        $was_price[] = "£".$stripped;
    }
} else {
    echo "was price location is wrong<br>";
}
$now = $xpath->query($now_price_location);
/* Create an array containing the sale price */
if ($now->length > 0) {
    foreach ($now as $price) {
        $stripped = preg_replace("/[^0-9,.]/", "", $price->nodeValue);
        $stripped = number_format((float)$stripped, 2, '.', '');
        $now_price[] = "£".$stripped;
    }
} else {
    echo "now price location is wrong<br>";
}
$result = array();
/* Create an associative array containing all the above values */
foreach ($product_urls as $i => $product_url) {
    $result[] = array(
        'product_url' => $product_url,
        'shop_name' => $shop_name,
        'photo_url' => $photo_url[$i],
        'was_price' => $was_price[$i],
        'now_price' => $now_price[$i]
    );
}
However, a problem arises if I want page two, or if I switch the view to 100 items per page: file_get_contents($list_url) always returns page one with its 24 items.
I presume that the page changes are handled via an AJAX request (though I can't find any evidence of this in the source). Is there a way to scrape exactly what I see on the screen?
I've seen PhantomJS mentioned in previous answers, but I'm not sure it would be appropriate here given that I'm working in PHP.
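It's because the link contains a hash (#) fragment, which is generated by a JavaScript on the page. The part of a URL after the # is not sent to the server, so the server always answers with the first page. Turn off JavaScript for that site and check the links it generates without it.
For example, the link for page two is http://www.hm.com/gb/subdepartment/sale?page=1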
// Create DOM from URL or file (file_get_html() comes from the PHP Simple HTML DOM Parser library)
$file = file_get_html('http://stackoverflow.com/');
// Find your links
foreach ($file->find('a') as $yourElement) {
    echo $yourElement->href . '<br>';
}
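If the non-JavaScript links really do follow the `?page=N` pattern shown above, the list URLs can be generated up front and fetched one by one. This is only a sketch: the helper name `build_page_urls` is made up for illustration, and the zero-based numbering (page two is `page=1`) is inferred from the single example link, not confirmed for the whole site.

```php
<?php
// Hypothetical helper: build the list URLs for the first $pages pages.
// Assumes the site numbers pages from 0, as the example link for page two suggests.
function build_page_urls(string $base_url, int $pages): array
{
    $urls = [];
    for ($page = 0; $page < $pages; $page++) {
        $urls[] = $base_url . '?' . http_build_query(['page' => $page]);
    }
    return $urls;
}

$urls = build_page_urls('http://www.hm.com/gb/subdepartment/sale', 3);
foreach ($urls as $url) {
    echo $url, "\n";
}
```

Each generated URL can then be passed to file_get_contents() in place of $list_url, and the scraped rows appended to the same $result array.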
(Sorry for my bad English)
I'm trying to change the quantity value of an item inside an array when a button is clicked.
I searched the web for help, but none of the topics I found use sessions or the other things used in this code (which I found on the internet).
if (isset($_POST["add_to_cart"])) {
    if (isset($_SESSION["shopping_cart"])) {
        $item_array_id = array_column($_SESSION["shopping_cart"], "item_id");
        if (!in_array($_GET["id"], $item_array_id)) {
            $count = count($_SESSION["shopping_cart"]);
            $item_array = [
                'item_id' => $_GET["id"],
                'item_name' => $_POST["hidden_name"],
                'item_price' => $_POST["hidden_price"],
                'item_quantity' => $_POST["quantity"],
            ];
            $_SESSION["shopping_cart"][$count] = $item_array;
        } else {
            echo '<script>alert("Item Already Added")</script>';
            echo '<script>window.location="foodlist.php"</script>';
        }
    }
}
When I click submit, this "add_to_cart" is set and all information is sent to another page, but when I click it again to add 1 more item (the same item I clicked before) the code doesn't make a sum. I tried a lot of things in this else, but even my teacher couldn't help me :/
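Here is a sample cart class:
class Cart
{
    private $cart = array();
    function __construct()
    {
        // Start from the saved cart, or an empty one on the first visit
        $this->cart = $_SESSION['cart'] ?? array('products' => array());
    }
    public function addProduct($id, $price)
    {
        $this->cart['products'][$id]['price'] = $price;
        // Treat a missing quantity as 0 so the first add does not raise a notice
        $this->cart['products'][$id]['quantity'] = ($this->cart['products'][$id]['quantity'] ?? 0) + 1;
    }
    public function removeProduct($id)
    {
        $this->cart['products'][$id]['quantity'] = $this->cart['products'][$id]['quantity'] - 1;
        if ($this->cart['products'][$id]['quantity'] == 0) {
            unset($this->cart['products'][$id]);
        }
    }
    public function delProduct($id)
    {
        unset($this->cart['products'][$id]);
    }
    public function showCart()
    {
        echo "<pre>";
        print_r($this->cart);
        echo "</pre>";
    }
    function sumCart()
    {
        $sum = 0;
        foreach ($this->cart['products'] as $k => $item) {
            $sum = $sum + ((float) $item['price'] * (int) $item['quantity']);
        }
        return $sum;
    }
    // Save cart to session
    public function saveCart()
    {
        $_SESSION['cart'] = $this->cart;
    }
}
$cart = new Cart();
// Add products (example id and price; adding the same id twice raises its quantity to 2)
$cart->addProduct(1, 10.50);
$cart->addProduct(1, 10.50);
// Save to session
$cart->saveCart();
// Show cart
$cart->showCart();
// Show sum
echo "Sum " . $cart->sumCart();
Try it like this.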
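The same idea can be applied to the array layout from the question without switching to a class. The sketch below is one possible way to fill in the `else` branch: it looks up the position of the existing item and adds to its quantity instead of showing the alert. The function name `add_or_increment` and the sample item are made up for illustration, and a plain array stands in for `$_SESSION["shopping_cart"]`.

```php
<?php
/**
 * Add an item to a cart array, or raise its quantity if the id is already present.
 * $cart has the same shape as $_SESSION["shopping_cart"] in the question.
 */
function add_or_increment(array $cart, array $item): array
{
    // Re-index so the positions returned by array_column() line up with the cart keys
    $cart = array_values($cart);
    $ids = array_map('strval', array_column($cart, 'item_id'));
    $index = array_search((string) $item['item_id'], $ids, true);

    if ($index === false) {
        $cart[] = $item; // first time this id is added
    } else {
        $cart[$index]['item_quantity'] =
            (int) $cart[$index]['item_quantity'] + (int) $item['item_quantity'];
    }
    return $cart;
}

// Plain array standing in for $_SESSION["shopping_cart"]
$cart = [];
$soup = ['item_id' => '7', 'item_name' => 'Soup', 'item_price' => '4.50', 'item_quantity' => '1'];
$cart = add_or_increment($cart, $soup);
$cart = add_or_increment($cart, $soup);
echo $cart[0]['item_quantity']; // quantity after adding the same item twice
```

In the real page the result would be written back with `$_SESSION["shopping_cart"] = add_or_increment($_SESSION["shopping_cart"] ?? [], $item_array);`.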
OK, where to start? I will try to explain as much as I can.
I am using WordPress with Contact Form 7 and I am trying to populate three dropdowns on the contact form. I found some code that worked without problems, but it read its data from an Excel file. That file is now too big and the code no longer runs on my website, so I would like to get the data from my database instead.
I have made a table "vehicle_information" in my database with 3 columns: "vehicle_type", "vehicle_make" and "vehicle_model".
I have code in my functions.php and code in my footer so that I can use the CF7 shortcodes.
Code from functions.php
function ajax_cf7_populate_values() {
    //MySQLi information
    $db_host = '***';
    $db_username = '***';
    $db_password = '***';
    $vehicles_makes_models = array();
    //connect to mysqli database (Host/Username/Password)
    $connection = mysqli_connect($db_host, $db_username, $db_password) or die('Error ' . mysqli_error());
    //select MySQLi database table
    $vehicles_makes_models = mysqli_select_db($connection, 'vehicle_information') or die('Error ' . mysqli_error());
    $sql = mysqli_query($connection, 'SELECT * FROM vehicle_type');
    while ($row = mysqli_fetch_array($sql)) {
        $vehicles_makes_models[$row[0]][$row[1]][] = $row[2];
    }
    // setup the initial array that will be returned to the client side script as a JSON object.
    $return_array = array(
        'vehicles' => array_keys($vehicles_makes_models),
        'makes' => array(),
        'models' => array(),
        'current_vehicle' => false,
        'current_make' => false
    );
    // collect the posted values from the submitted form
    $vehicle = key_exists('vehicle', $_POST) ? $_POST['vehicle'] : false;
    $make = key_exists('make', $_POST) ? $_POST['make'] : false;
    $model = key_exists('model', $_POST) ? $_POST['model'] : false;
    // populate the $return_array with the necessary values
    if ($vehicle) {
        $return_array['current_vehicle'] = $vehicle;
        $return_array['makes'] = array_keys($vehicles_makes_models[$vehicle]);
        if ($make) {
            $return_array['current_make'] = $make;
            $return_array['models'] = $vehicles_makes_models[$vehicle][$make];
            if ($model) {
                $return_array['current_model'] = $model;
            }
        }
    }
    // encode the $return_array as a JSON object and echo it
    echo json_encode($return_array);
}
// These action hooks are needed to tell WordPress that the cf7_populate_values() function needs to be called
// if a script is POSTing the action : 'cf7_populate_values'
add_action( 'wp_ajax_cf7_populate_values', 'ajax_cf7_populate_values' );
add_action( 'wp_ajax_nopriv_cf7_populate_values', 'ajax_cf7_populate_values' );
Code from my footer
(function($) {
// create references to the 3 dropdown fields for later use.
var $vehicles_dd = $('[name="vehicles"]');
var $makes_dd = $('[name="makes"]');
var $models_dd = $('[name="models"]');
// run the populate_fields function, and additionally run it every time a value changes
$('select').change(function() {
function populate_fields() {
var data = {
// action needs to match the action hook part after wp_ajax_nopriv_ and wp_ajax_ in the server side script.
'action' : 'cf7_populate_values',
// pass all the currently selected values to the server side script.
'vehicle' : $vehicles_dd.val(),
'make' : $makes_dd.val(),
'model' : $models_dd.val()
// call the server side script, and on completion, update all dropdown lists with the received values.
$.post('<?php echo admin_url( 'admin-ajax.php' ) ?>', data, function(response) {
all_values = response;
$vehicles_dd.html('').append($('<option>').text(' -- choose vehicle -- '));
$makes_dd.html('').append($('<option>').text(' -- choose make -- '));
$models_dd.html('').append($('<option>').text(' -- choose model -- '));
$.each(all_values.vehicles, function() {
$option = $("<option>").text(this).val(this);
if (all_values.current_vehicle == this) {
$.each(all_values.makes, function() {
$option = $("<option>").text(this).val(this);
if (all_values.current_make == this) {
$.each(all_values.models, function() {
$option = $("<option>").text(this).val(this);
if (all_values.current_model == this) {
})( jQuery );
The problem is that I am still learning and this is the first time I have had to use this function,
and I am getting an error on my website:
Warning: array_keys() expects parameter 1 to be array, null given in /customers/4/0/0/motobid.co.uk/httpd.www/wp-content/themes/storevilla-child/functions.php on line 38 {"vehicles":null,"makes":[],"models":[],"current_vehicle":false,"current_make":false}
Any help would be greatly appreciated.
I'd just like to say that the code was supplied by BDMW.
Where you use the method array_keys(), instead of:
$return_array['makes'] = array_keys($vehicles_makes_models[$vehicle]);
Try this:
$return_array['makes'] = ! empty($vehicles_makes_models[$vehicle]) ? array_keys($vehicles_makes_models[$vehicle]) : [];
array_keys() only accepts an array: older PHP versions emit this warning when it is given null, and PHP 8 throws a TypeError instead. Hope this helps!
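Going by the question's code, the warning may also come from the `'vehicles' => array_keys($vehicles_makes_models)` line: `$vehicles_makes_models` is reassigned to the return value of `mysqli_select_db()`, which is a boolean rather than an array. Below is a minimal sketch of building the nested array and guarding the lookups. Hard-coded rows stand in for the query result, and the helper name `group_vehicle_rows` and the sample values are made up for illustration.

```php
<?php
// Group rows of (type, make, model) into $grouped[type][make][] = model
function group_vehicle_rows(array $rows): array
{
    $grouped = [];
    foreach ($rows as $row) {
        [$type, $make, $model] = $row;
        $grouped[$type][$make][] = $model;
    }
    return $grouped;
}

// Sample rows standing in for what mysqli_fetch_array() would return
$rows = [
    ['Car', 'Ford', 'Fiesta'],
    ['Car', 'Ford', 'Focus'],
    ['Bike', 'Honda', 'CBR600'],
];
$vehicles_makes_models = group_vehicle_rows($rows);

// Guarded lookups: an unknown key yields an empty list instead of a warning
$vehicle = 'Car';
$makes = !empty($vehicles_makes_models[$vehicle]) ? array_keys($vehicles_makes_models[$vehicle]) : [];
$unknown = !empty($vehicles_makes_models['Boat']) ? array_keys($vehicles_makes_models['Boat']) : [];

echo json_encode([
    'vehicles' => array_keys($vehicles_makes_models),
    'makes' => $makes,
]);
```

Keeping the result of the database calls in their own variables means the array that feeds array_keys() is never overwritten.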
What I am trying to do is get the scripts from the body tag, but only inline scripts that contain code, not scripts that link to a file,
e.g. <script type="text/javascript">console.log("for a test run");</script>
and not the scripts that have a src attribute.
I want to move those scripts to the end of the page, just before </body>.
So far I have
echo "<pre>";
echo "reaches 1 <br />";
//work for inpage scripts
$mainBody = @$dom->getElementsByTagName('body')->item(0);
foreach (@$dom->getElementsByTagName('body') as $head) {
echo "reaches 2";
foreach (@$head->childNodes as $node) {
echo "reaches 3";
if ($node instanceof DOMComment) {
if (preg_match('/<script/i', $node->nodeValue)){
$src = $node->nodeValue;
echo "its a node";
if ($node->nodeName == 'script' && $node->attributes->getNamedItem('type')->nodeValue == 'text/javascript') {
if (@$src = $node->attributes->getNamedItem('src')->nodeValue) {
// yay - $src was true, so we don't do anything here
} else {
$src = $node->nodeValue;
echo "its a node2";
if (isset($src)) {
$move = ($this->params->get('exclude')) ? true : false;
foreach ($omit as $omitit) {
if (preg_match($omitit, $src) == 1) {
$move = ($this->params->get('exclude')) ? false : true;
if ($move)
$moveme[] = $node;
foreach ($moveme as $moveit) {
echo "Moving";
if ($pretty) {
$mainBody = $xhtml ? $dom->saveXML() : $dom->saveHTML();
Update 1
The problem is that <script type="text/javascript"> can also be inside a div, or inside nested divs. Looping over @$head->childNodes only visits the top-level tags and does not scan the inner tags that may contain <script> tags. I don't understand how to get all the required script tags.
There is no error, but there are also no script tags among the top-level nodes.
Update 2
Thanks for the XPath answer; there is some progress on the task. But now, after moving the scripts to the footer, I can't delete/remove the original script tags.
Here is the updated code I have so far:
echo "<pre>3";
// echo "reaches 1 <br />";
//work for inpage scripts
$xpath = new DOMXPath($dom);
$script_tags = $xpath->query('//body//script[not(@src)]');
foreach ($script_tags as $tag) {
// var_dump($tag->nodeValue);
$moveme[] = $tag;
$mainBody = @$dom->getElementsByTagName('body')->item(0);
foreach ($moveme as $moveItScript) {
// var_dump($moveItScript->parentNode);
// $moveItScript->parentNode->removeChild($moveItScript);
/* try{
if ($pretty) {
}catch (Exception $ex){
echo "</pre>";
Update 3
I was working with Joomla and trying to move scripts to the footer of the page. I had used the scriptsdown plugin, which moves the scripts from the head tag to the bottom, but the scripts in the middle of the page were not moved, and that was what caused the in-page scripts to not respond properly.
My problem is now solved. I'm posting my solution code in case it helps someone in the future.
function onAfterRender() {
$app = JFactory::getApplication();
$doc = JFactory::getDocument();
/* test that the page is not administrator && test that the document is HTML output */
if ($app->isAdmin() || $doc->getType() != 'html')
$pretty = (int)$this->params->get('pretty', 0);
$stripcomments = (int)$this->params->get('stripcomments', 0);
$sanitize = (int)$this->params->get('sanitize',0);
$debug = (int)$app->getCfg('debug',0);
if($debug) $pretty = true;
$omit = array();
/* now we know this is a frontend page and it is html - begin processing */
/* first - prepare the omit array */
if (strlen(trim($this->params->get('omit'))) > 0) {
foreach (explode("\n", $this->params->get('omit')) as $omitme) {
$omit[] = '/' . str_replace(array('/', '\''), array('\/', '\\\''), trim($omitme)) . '/i';
$moveme = array();
$dom = new DOMDocument();
$dom->recover = true;
$dom->substituteEntities = true;
if ($pretty) {
$dom->formatOutput = true;
} else {
$dom->preserveWhiteSpace = false;
$source = JResponse::getBody();
/* DOMDocument can get quite vocal when malformed HTML/XHTML is loaded.
 * First we grab the current level, and set the error reporting level
 * to zero, afterwards, we return it to the original value. This trickery
 * is used to keep the logs clear of DOMDocument protests while loading the source.
 * I promise to set the level back as soon as I'm done loading source...
 */
if(!$debug) $erlevel = error_reporting(0);
$xhtml = (preg_match('/XHTML/', $source)) ? true : false;
switch ($xhtml) {
case true:
case false:
if(!$debug) error_reporting($erlevel); /* You see, error_reporting is back to normal - just like I promised */
if ($pretty) {
$newline = $dom->createTextNode("\n");
if($sanitize && !$debug && !$pretty) {
if ($stripcomments && !$debug) {
$comments = $this->_domComments($dom);
foreach ($comments as $node)
if (!preg_match('/\[endif]/i', $node->nodeValue)) // we don't remove IE conditionals
if ($node->parentNode->nodeName != 'script') // we also don't remove comments in javascript because some developers write JS inside of a comment
$body = @$dom->getElementsByTagName('footer')->item(0);
foreach (@$dom->getElementsByTagName('head') as $head) {
foreach (@$head->childNodes as $node) {
if ($node instanceof DOMComment) {
if (preg_match('/<script/i', $node->nodeValue))
$src = $node->nodeValue;
if ($node->nodeName == 'script' && $node->attributes->getNamedItem('type')->nodeValue == 'text/javascript') {
if (@$src = $node->attributes->getNamedItem('src')->nodeValue) {
// yay - $src was true, so we don't do anything here
} else {
$src = $node->nodeValue;
if (isset($src)) {
$move = ($this->params->get('exclude')) ? true : false;
foreach ($omit as $omitit) {
if (preg_match($omitit, $src) == 1) {
$move = ($this->params->get('exclude')) ? false : true;
if ($move)
$moveme[] = $node;
foreach ($moveme as $moveit) {
if ($pretty) {
//work for inpage scripts
$xpath = new DOMXPath($dom);
$script_tags = $xpath->query('//body//script[not(@src)]');
$mainBody = @$dom->getElementsByTagName('body')->item(0);
foreach ($script_tags as $tag) {
$body = $xhtml ? $dom->saveXML() : $dom->saveHTML();
To get ONLY the <script> nodes that don't have a src attribute, it is easier to use DOMXPath:
$xpath = new DOMXPath($dom);
$script_tags = $xpath->query('//body//script[not(@src)]');
The variable $script_tags is now a DOMNodeList object that contains all of the matching script tags.
You can now loop over the DOMNodeList to get each node and do whatever you like with it:
foreach ($script_tags as $tag) {
    $moveme[] = $tag;
}
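For the follow-up problem of removing the originals after the move: appending a node that is already part of the document relocates it, so no separate removeChild() call should be needed. A minimal sketch, assuming a small hard-coded HTML string in place of the Joomla response body (the markup and the file name lib.js are made up for illustration):

```php
<?php
$html = '<html><body><div><script>console.log("inline");</script><p>Text</p></div>'
      . '<script src="lib.js"></script></body></html>';

$dom = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings out of the output
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$body = $dom->getElementsByTagName('body')->item(0);

// Copy the matches into a plain array before changing the tree
$inline = iterator_to_array($xpath->query('//body//script[not(@src)]'));
foreach ($inline as $script) {
    // appendChild() moves an existing node: it leaves its old parent
    // and becomes the last child of <body>
    $body->appendChild($script);
}

$output = $dom->saveHTML();
echo $output;
```

The script with a src attribute is left where it was, and the inline script appears once, as the last element inside the body.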
I have the following HTML fragment, using PHP and JavaScript:
<!DOCTYPE html>
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.8.0/jquery.min.js"></script>
var imageIndex = 0; // index into imageNames array
var imageHeight = 400; // height of image; changed by user clicking size buttons
var imageNames; // names of images user can view in this album
function pageLoaded() // execute this when the page loads.
// PHP -- generate the array of image file names
function getImageNames($directory)
$handle = opendir($directory); // looking in the given directory
$file = readdir($handle); // get a handle on dir,
while ($file !== false) // then get names of files in dir
$files[] = $file;
$file = readdir($handle);
if ($files[0] === ".") { unset($files[0]); } // Unix specific?
if ($files[1] === "..") { unset($files[1]); }
foreach($files as $index => $file) // only keep files with image extensions
{ $pieces = explode(".", $file);
$extension = strtolower(end($pieces));
if ($extension !== "jpg") { unset($files[$index]); }
$files = array_values($files); // reset array
natcasesort($files); // and sort it.
return $files;
<?php $imageDirectory = $_GET['directory'] . '/';
$imageNames = getImageNames($imageDirectory);
imageNames = <?php echo json_encode($imageNames); ?>;
imageHeight = 400;
imageIndex = 0;
reloadImage(); // loads the first image based on height and index
There is more after this, but this part doesn't refer to anything there, and my problem already exists by this point in the HTML output.
The problem is that, 5 lines from the end, I do a json_encode of an array of filenames. The output I get from this looks thusly:
imageNames = [{"59":"01-hornAndMusic.JPG","58":"02-DSC_0009.JPG","57":"03-DSC_0010.JPG","56":"04-Discussion.JPG","55":"05-DSC_0015.JPG","54":"06-DSC_0016.JPG","53":"07-DSC_0019.JPG","52":"08-strings.JPG","51":"09-strings2.JPG","50":"10-rehearsing.JPG","49":"11-StringsBigger2-001.JPG","48":"12-DSC_0041.JPG","47":"13-DSC_0046.JPG","46":"14-ensemble.JPG","45":"15-ensemble2.JPG","44":"16-DSC_0052.JPG","43":"17-rehearsing3.JPG","42":"18-rehearsing4.JPG","41":"19-rehearsing-001.JPG","40":"20-stringsBigger2.JPG","39":"21-rehearsing-002.JPG","38":"22-rehearsing-003.JPG","37":"23-ensemble3.JPG","36":"24-winds.JPG","35":"25-rehearsing-004.JPG","34":"26-stringsEvenBigger.JPG","33":"27-concentration.JPG","32":"28-concertMistress2.JPG","31":"29-stringsMore.JPG","30":"30-stringsMore-001.JPG","29":"31-stringsMore-002.JPG","28":"32-stringsMore-003.JPG","27":"33-stringsMore-004.JPG","26":"34-stringsMore-005.JPG","25":"35-DSC_0076.JPG","24":"36-stringsMore-007.JPG","23":"37-stringsMore-008.JPG","22":"38-stringsMore-009.JPG","21":"39-oboes.JPG","20":"40-winds-001.JPG","19":"41-DSC_0085.JPG","18":"42-DSC_0086.JPG","17":"43-percussion.JPG","16":"44-DSC_0088.JPG","15":"45-violinAtRest.JPG","14":"46-laughterInTheWoodwinds.JPG","13":"47-conducting-001.JPG","12":"48-DSC_0095.JPG","11":"49-DSC_0096.JPG","10":"50-AllTogetherNow.JPG","9":"51-DSC_0106.JPG","8":"52-horns.JPG","7":"53-DSC_0111.JPG","6":"54-conducting.JPG","5":"55-conducting-002.JPG","4":"56-conducting-003.JPG","3":"57-conducting-005.JPG","2":"58-DSC_0120.JPG","1":"59-DSC_0122.JPG","0":"60-everybody.JPG"}];
so I have the keys as well as the values of this hybrid PHP map/array thingie. What I want is just the values, put into a string array in the JavaScript.
I've gotten this to work sometimes, but not others, and I don't know the difference.
I think applying the array_values function to $imageNames before encoding it should do the trick.
imageNames = <?php echo json_encode(array_values($imageNames)); ?>;
I'd do this:
imageNames = <?php echo json_encode(array_values($imageNames)); ?>;
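The reason this helps: in the question, array_values() runs before natcasesort(), and natcasesort() keeps the original keys attached to the values. json_encode() only emits a JSON array when the keys are 0, 1, 2, ... in that order; otherwise it emits an object. A small sketch with made-up file names:

```php
<?php
$files = ['b.jpg', 'A.jpg', 'c.jpg'];

// natcasesort() reorders the elements but preserves their keys
natcasesort($files);

// Keys are now 1, 0, 2, so json_encode() falls back to an object
$asObject = json_encode($files);
echo $asObject, "\n"; // {"1":"A.jpg","0":"b.jpg","2":"c.jpg"}

// Re-indexing after the sort gives a plain list again
$asList = json_encode(array_values($files));
echo $asList, "\n"; // ["A.jpg","b.jpg","c.jpg"]
```

This would also explain why it works "sometimes": if the directory listing happens to be in sorted order already, the keys stay sequential and the output is an array.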
Apologies for the generic title.
Essentially, when the script runs 'error' is alerted as per the jQuery below. I have a feeling this is being caused by the structuring of my JSON, but I'm not sure how I should change it.
The general idea is that there are several individual items, each with their own attributes: product_url, shop_name, photo_url, was_price and now_price.
Here's my AJAX request:
url : 'http://www.comfyshoulderrest.com/shopaholic/rss/asos_f_uk.php?id=1',
type : 'POST',
data : 'data',
dataType : 'json',
success : function (result)
var result = result['product_url'];
error : function ()
Here's the PHP that generates the JSON:
function scrape($list_url, $shop_name, $photo_location, $photo_url_root, $product_location, $product_url_root, $was_price_location, $now_price_location, $gender, $country)
header("Access-Control-Allow-Origin: *");
$html = file_get_contents($list_url);
$doc = new DOMDocument();
libxml_clear_errors(); // remove errors for yucky html
$xpath = new DOMXPath($doc);
$products = array();
$row = $xpath->query($product_location);
/* Create an array containing products */
if ($row->length > 0)
foreach ($row as $location)
$product_urls[] = $product_url_root . $location->getAttribute('href');
$imgs = $xpath->query($photo_location);
/* Create an array containing the image links */
if ($imgs->length > 0)
foreach ($imgs as $img)
$photo_url[] = $photo_url_root . $img->getAttribute('src');
$was = $xpath->query($was_price_location);
/* Create an array containing the was price */
if ($was->length > 0)
foreach ($was as $price)
$stripped = preg_replace("/[^0-9,.]/", "", $price->nodeValue);
$was_price[] = "£".$stripped;
$now = $xpath->query($now_price_location);
/* Create an array containing the sale price */
if ($now->length > 0)
foreach ($now as $price)
$stripped = preg_replace("/[^0-9,.]/", "", $price->nodeValue);
$now_price[] = "£".$stripped;
$result = array();
/* Create an associative array containing all the above values */
foreach ($product_urls as $i => $product_url)
$result = array(
'product_url' => $product_url,
'shop_name' => $shop_name,
'photo_url' => $photo_url[$i],
'was_price' => $was_price[$i],
'now_price' => $now_price[$i]
echo json_encode($result);
echo "this is empty";
$dbhost = "xxx";
$dbname = "xxx";
$dbuser = "xxx";
$dbpass = "xxx";
$con = mysqli_connect("$dbhost", "$dbuser", "$dbpass", "$dbname");
if (mysqli_connect_errno())
echo "Failed to connect to MySQL: " . mysqli_connect_error();
$id = $_GET['id'];
$result = mysqli_query($con, "SELECT * FROM scrape WHERE id = '$id'");
while($row = mysqli_fetch_array($result))
$list_url = $row['list_url'];
$shop_name = $row['shop_name'];
$photo_location = $row['photo_location'];
$photo_url_root = $row['photo_url_root'];
$product_location = $row['product_location'];
$product_url_root = $row['product_url_root'];
$was_price_location = $row['was_price_location'];
$now_price_location = $row['now_price_location'];
$gender = $row['gender'];
$country = $row['country'];
scrape($list_url, $shop_name, $photo_location, $photo_url_root, $product_location, $product_url_root, $was_price_location, $now_price_location, $gender, $country);
The script works fine with this much simpler JSON:
{"ajax":"Hello world!","advert":null}
You are looping over an array and generating a JSON text each time you go around it.
If you concatenate two (or more) JSON texts, you do not have valid JSON.
Build a data structure inside the loop.
json_encode that data structure after the loop.
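A small sketch of the difference, using two made-up items in place of the scraped products: encoding inside the loop produces text that cannot be parsed as JSON, while encoding the collected array once produces a valid JSON array.

```php
<?php
// Made-up items standing in for the scraped products
$items = [
    ['product_url' => 'http://example.com/a', 'shop_name' => 'Example'],
    ['product_url' => 'http://example.com/b', 'shop_name' => 'Example'],
];

// Encoding inside the loop: two JSON texts glued together
$concatenated = '';
foreach ($items as $item) {
    $concatenated .= json_encode($item);
}
$broken = json_decode($concatenated, true);
$brokenError = json_last_error(); // read the error before the next decode resets it

// Encoding once after the loop: a single JSON array
$valid = json_decode(json_encode($items), true);

var_dump($broken === null, $brokenError !== JSON_ERROR_NONE, count($valid));
```

jQuery's dataType: 'json' parses the response in the same strict way, which is why the error callback fires for the concatenated output.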
If I had to guess, you are echoing multiple JSON strings, which is invalid. Here is how it should work:
$result = array();
/* Create an associative array containing all the above values */
foreach ($product_urls as $i => $product_url) {
    // Append value to array
    $result[] = array(
        'product_url' => $product_url,
        'shop_name' => $shop_name,
        'photo_url' => $photo_url[$i],
        'was_price' => $was_price[$i],
        'now_price' => $now_price[$i]
    );
}
echo json_encode($result);
In this example I am echoing the final results only once.
You are sending a POST request, but the data option is not sending any real POST data:
url : 'http://www.comfyshoulderrest.com/shopaholic/rss/asos_f_uk.php?id=1',
type : 'POST',
data : {anything:"anything"}, // this line is mistaken
dataType : 'json',
success : function (result)
var result = result['product_url'];
error : function ()