I am trying to make a web app that will figure out whether one or more e-commerce items are out of stock, given their URL(s) entered by the user. These URLs can be separated by commas. Currently, I make AJAX calls to one of my PHP scripts for each URL after splitting them by comma in a JavaScript loop. Below is the code for that:
function sendRequest(urls) {
    if (urls.length == 0) {
        return;
    } else {
        var A = urls.split(',');
        for (var i = 0; i < A.length; i++) {
            var xmlhttp = new XMLHttpRequest();
            xmlhttp.onreadystatechange = function () {
                if (this.readyState == 4 && this.status == 200) {
                    var result_set = JSON.parse(this.responseText);
                    if (result_set.flag == 1) {
                        insertRow('stock-table', result_set.url, result_set.title); // populates a table by inserting a row into it
                    }
                }
            };
            xmlhttp.open("GET", "scrapper.php?url=" + A[i], true);
            xmlhttp.send();
        }
    }
}
The scrapper.php script goes like this:
<?php
function get_title($data)
{
    $title = preg_match('/<title[^>]*>(.*?)<\/title>/ims', $data, $matches) ? $matches[1] : null;
    return $title;
}

if (!isset($_SESSION['username'])) {
    header("Location: index.php");
}
else if (isset($_GET["url"])) {
    $url = $_GET["url"];
    $title = null;
    $result_set = null;
    $flag = 0;
    $file = fopen($url, "r");
    if (!$file) {
        echo "<p>Unable to open remote file.\n";
        exit;
    }
    while (!feof($file)) {
        $line = fgets($file, 1024);
        if ($title == null) {
            $title = get_title($line);
        }
        if (preg_match('/<span[^>]*>Add to Cart/i',
            $line, $matches, PREG_OFFSET_CAPTURE)) {
            break;
        }
        if (preg_match('/Sold Out|Out of Stock/i',
            $line, $matches, PREG_OFFSET_CAPTURE)) {
            $flag = 1;
            break;
        }
    }
    fclose($file);
    $result_set = array("flag" => $flag,
        "url" => $url,
        "title" => $title
    );
    echo json_encode($result_set);
}
?>
Now the problem is: this program takes too much time even for two URLs, even though I moved from file_get_contents() (which was even slower) to the fopen() streaming approach shown above. I have a few confusions in my mind:
Considering my JavaScript, is it sending one AJAX call, waiting for its response, and only then sending the second one?
If point one is not true, will scrapper.php be able to respond to the second call from the loop, since it is busy handling the computation for the first AJAX call?
If point 2 is true, how can I make it multi-threaded such that AJAX keeps sending calls until the loop is finished, and scrapper.php starts a separate thread for each call, replying back to the client once a thread completes its execution? (How can I make a pool of limited threads and send a new AJAX response once a thread completes its execution? I have around 200 URLs, so making 200 threads cannot be an optimal solution.)
Would it be a good solution to insert all the URLs (around 200) into the database, and then fetch all of them for multi-threaded execution? In that case, how can I reply back with multiple results from multiple threads against a single AJAX call?
Please Help
No. XMLHttpRequest defaults to async, which means every request opened with async enabled is sent immediately without waiting for the previous one, so the requests execute in parallel.
Completely depends on how you're running PHP. In typical setups (and it's unlikely you're doing otherwise), your HTTP server will wait for an available PHP worker from a pool, or execute a PHP binary directly. Either way, more than one PHP program can execute at once. (Think about how a regular website works: you need to be able to support more than one user at a time.)
N/A
If I'm understanding correctly, you just want to handle all requests in one Ajax call? Just send a list of all the URLs in the request, and loop server-side. Your current way of doing it is fine. Most likely the "slow" nature can be attributed to your connection to the remote URLs.
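For example, something along these lines; the urls parameter name and the check_stock() helper are placeholders standing in for your existing scraping logic, not exact code:
<?php
// hypothetical single-request version of scrapper.php
if (isset($_GET["urls"])) {
    $urls = array_filter(array_map('trim', explode(',', $_GET["urls"])));
    $results = array();
    foreach ($urls as $url) {
        // check_stock() would wrap the existing fopen()/preg_match logic
        $results[] = check_stock($url); // returns array("flag" => ..., "url" => ..., "title" => ...)
    }
    header('Content-Type: application/json');
    echo json_encode($results);
}
The client would then make a single XMLHttpRequest with the whole list and loop over the returned array, calling insertRow() for each entry.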
Some other notes:
I would validate the URL before passing it into fopen, especially considering the user can simply pass in a relative path and start reading your "private" files (see the sketch after these notes).
I'd switch back to file_get_contents. It's pretty much equivalent to fopen but does much of the work for you.
Not sure if intentional, but I'd use the newer const keyword instead of var for the XMLHttpRequest variable in the for loop's inner block. Currently, the var gets hoisted to the top of the function scope and you are simply overwriting it every iteration of the loop. If you want to add more logic to the XMLHttpRequest, you may find yourself prone to some unintentional behaviour.
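To illustrate the first two notes, here is a rough sketch of validating the URL and then fetching it with file_get_contents(); the scheme whitelist is just an example, adjust it to whatever you actually want to allow:
<?php
// hypothetical validation before fetching a user-supplied URL
$url = isset($_GET["url"]) ? $_GET["url"] : '';
$scheme = parse_url($url, PHP_URL_SCHEME);

// reject anything that is not an absolute http(s) URL
if (filter_var($url, FILTER_VALIDATE_URL) === false || !in_array($scheme, array('http', 'https'), true)) {
    http_response_code(400);
    echo json_encode(array("error" => "invalid url"));
    exit;
}

$data = file_get_contents($url); // replaces the fopen()/fgets() loop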
Related
I'm trying to GET a large amount of data from the API (over 300k records). It has pagination (25 records per page) and the request limit is 50 requests per 3 minutes. I'm using PHP cURL to get the data. The API needs JWT token authorization. I can get a single page and put its records into an array.
...
$response = curl_exec($curl);
curl_close($curl);
$result = json_decode($response, true);
The problem is I need to get all records from all pages and save them into an array or a file. How do I do that? Maybe I should use JS to do it better?
Best regards and thank you.
Ideally use cron and some form of storage, database or a file.
It is important that you ensure a new call to the script doesn't start unless the previous one has finished, otherwise they start stacking up and after a few you will start having server overload, failed scripts and it gets messy.
Store a value to say the script is starting.
Run the CURL request.
Once curl has been returned and data is processed and stored change the value you stored at the beginning to say the script has finished.
Run this script as a cron in the intervals you deem necessary.
A simplified example:
<?php
// use a lock file as the stored "busy" value, so a new cron run exits
// if the previous one has not finished yet
$lock_file = __DIR__ . '/curl_job.lock';
if (file_exists($lock_file)) exit();
touch($lock_file);
// YOUR CURL REQUEST AND PROCESSING HERE
unlink($lock_file);
?>
I would use a series of sequential requests. A typical request takes at most 2 seconds to fulfill, so 50 requests per 300 seconds does not require parallel requests (though cURL does support parallelism, as far as I remember). Still, you need to measure time and wait if you don't want to be banned for DoS. When you reach the request limit, you must use the sleep function to wait until you can send new requests. For PHP the real problem is that this is a long-running job, so you need to change settings, otherwise it will time out. You can do it this way: Best way to manage long-running php script? As for Node.js, I think it is a lot better solution for this kind of async task, because the required features come naturally with Node.js without extensions and such things, though I am biased towards it.
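As a rough sketch of that pacing logic (the endpoint URL, the page parameter, the JWT token, and the records key are placeholders, not the real API's names):
<?php
set_time_limit(0);                 // long-running job: lift the execution time limit
$jwtToken = 'YOUR_JWT_TOKEN';      // placeholder
$allRecords = array();
$page = 1;
$requestsInWindow = 0;
$windowStart = time();

do {
    // respect the 50-requests-per-3-minutes limit
    if ($requestsInWindow >= 50) {
        $wait = 180 - (time() - $windowStart);
        if ($wait > 0) sleep($wait);
        $requestsInWindow = 0;
        $windowStart = time();
    }

    $curl = curl_init('https://api.example.com/records?page=' . $page); // placeholder endpoint
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($curl, CURLOPT_HTTPHEADER, array('Authorization: Bearer ' . $jwtToken));
    $response = curl_exec($curl);
    curl_close($curl);
    $requestsInWindow++;

    $result = json_decode($response, true);
    $pageRecords = isset($result['records']) ? $result['records'] : array(); // key name assumed
    $allRecords = array_merge($allRecords, $pageRecords);
    $page++;
} while (!empty($pageRecords));

file_put_contents('records.json', json_encode($allRecords)); // or insert into your database
Note that at 25 records per page and 50 requests per 3 minutes, 300k records means roughly 12,000 requests, i.e. around 12 hours of runtime, which is another argument for running this from cron rather than from a browser request.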
Okay. I misinterpreted what you needed. I have more questions.
Can you do one request and get your 50 records immediately? That is assuming when you said 50 requests per 3 minutes you meant 50 records.
Why do you think there is this 50/3 limitation?
Can you provide a link to this service?
Is that 50 records per IP address?
Is leasing 5 or 6 IP addresses an option?
Do you pay for each record?
How many records does this service have total?
Do the records have a time limit on their viability?
I am thinking if you can use 6 IP addresses (or 6 processes) you can run the 6 requests simultaneously using stream_socket_client().
stream_socket_client() allows you to make simultaneous requests. You then create a loop that monitors each socket for a response.
About 10 years ago I made an app that evaluated web page quality. I ran
W3C Markup Validation
W3C CSS Validation
W3C Mobile OK
WebPageTest
My own performance test.
I put all the URLs in an array like this:
$urls = array();
$path = $url;
$url = urlencode("$url");
$urls[] = array('host' => "jigsaw.w3.org",'path' => "/css-validator/validator?uri=$url&profile=css3&usermedium=all&warning=no&lang=en&output=text");
$urls[] = array('host' => "validator.w3.org",'path' => "/check?uri=$url&charset=%28detect+automatically%29&doctype=Inline&group=0&output=json");
$urls[] = array('host' => "validator.w3.org",'path' => "/check?uri=$url&charset=%28detect+automatically%29&doctype=XHTML+Basic+1.1&group=0&output=json");
Then I'd make the sockets.
foreach ($urls as $path) {
    $host = $path['host'];
    $path = $path['path'];
    $http = "GET $path HTTP/1.0\r\nHost: $host\r\n\r\n";
    $stream = stream_socket_client("$host:80", $errno, $errstr, 120, STREAM_CLIENT_ASYNC_CONNECT|STREAM_CLIENT_CONNECT);
    if ($stream) {
        $sockets[] = $stream; // supports multiple sockets
        $start[] = microtime(true);
        fwrite($stream, $http);
    }
    else {
        $err .= "$id Failed<br>\n";
    }
}
Then I monitored the sockets and retrieved the response from each socket.
while (count($sockets)) {
    $read = $sockets;
    stream_select($read, $write = NULL, $except = NULL, $timeout);
    if (count($read)) {
        foreach ($read as $r) {
            $id = array_search($r, $sockets);
            $data = fread($r, $buffer_size);
            if (strlen($data) == 0) {
                // echo "$id Closed: " . date('h:i:s') . "\n\n\n";
                $closed[$id] = microtime(true);
                fclose($r);
                unset($sockets[$id]);
            }
            else {
                $result[$id] .= $data;
            }
        }
    }
    else {
        // echo 'Timeout: ' . date('h:i:s') . "\n\n\n";
        break;
    }
}
I used it for years and it never failed.
It would be easy to gather the records and paginate them.
After all sockets are closed you can gather the pages and send them to your user.
Do you think the above is viable?
JS is not better.
Or did you mean 50 records each 3 minutes?
This is how I would do the pagination.
I'd organize the response into pages of 25 records per page.
In the query results while loop I'd do this:
$cnt = 0;
$page = 0;
while (...) {
    $cnt++;
    $response[$page][] = $record;
    if ($cnt > 24) { $page++; $cnt = 0; }
}
header('Content-Type: application/json');
echo json_encode($response);
I want to make a progress bar on my website, which tracks execution of a PHP script.
The PHP script makes a bunch of connections with Google API and stores the data it receives in the database. Sometimes the process can take a minute.
The PHP script is located in the ajax/integrations-ajax.php file and is launched by a GET AJAX request sent when the #link button on the website is clicked. Below is the jQuery code for the request:
$('#link').on('click', function () {
    var interval = setInterval(trackStatus, 1000);
    $.getJSON('ajax/integrations-ajax.php', {action: 'link'}).done(function (json) {
        if (json.result == true) {
            showMessage('The account is successfully linked.', 'success');
        } else {
            showMessage('There is an error happened.', 'danger');
        }
    })
});
Clicking the #link button also sets an interval, which fires the trackStatus function every second:
function trackStatus() {
    $.getJSON('ajax/status-ajax.php', {
        action: 'status'
    }).done(function (json) {
        console.log(json.status);
    });
}
As you can see, the trackStatus function sends GET AJAX requests to the ajax/status-ajax.php file and should show the status in the browser console every second.
To implement the tracking ability on the server, I made the PHP script in the ajax/integrations-ajax.php file store the status in the database. You can see its code below:
<?php
if (!is_ajax_request()) { exit; }
$action = isset($_GET['action']) ? (string) $_GET['action'] : '';
if ($action == 'link') {
    set_status_in_database(0);
    // some execution code;
    set_status_in_database(1);
    // some execution code;
    set_status_in_database(2);
    // some execution code;
    set_status_in_database(3);
    // some execution code;
    echo json_encode(['result' => true ]);
}
And I created another PHP file, ajax/status-ajax.php, which can recover the status from the database:
<?php
if (!is_ajax_request()) { exit; }
$action = isset($_GET['action']) ? (string) $_GET['action'] : '';
if ($action == 'status') {
    $return['result'] = get_status_from_database();
    echo json_encode($return);
}
But the requests appear not to be working simultaneously. I can't receive responses for the trackStatus function until the response from the ajax/integrations-ajax.php script has been received.
I made a profiling record in the browser, which shows this.
So, is there a way to execute the requests simultaneously? Or do I need to rethink the whole approach to implement the tracking ability?
Thanks in advance for help!
Update
Thank you all for your advice, and especially to @Keith, because his solution is the easiest and it works. I have put the session_write_close() function at the beginning of the script and everything works:
<?php
if (!is_ajax_request()) { exit; }
$action = isset($_GET['action']) ? (string) $_GET['action'] : '';
if ($action == 'link') {
    session_write_close();
    set_status_in_database(0);
    // some execution code;
    set_status_in_database(1);
    // some execution code;
    set_status_in_database(2);
    // some execution code;
    set_status_in_database(3);
    // some execution code;
    echo json_encode(['result' => true ]);
}
Here you can see the profiling record from the browser.
While PHP can handle concurrent requests without issue, one area that does get serialized is the session: during a request, PHP will place an exclusive lock on the SESSION for that user. IOW: while this lock is on, other requests from the same user will have to wait. This is normally not an issue, but if you have long-running requests it will block other requests, like AJAX requests etc.
By default, PHP will write session data at the end of the request. But if you are sure you no longer need to write any session data, calling session_write_close will release the lock much sooner.
More info here -> http://php.net/manual/en/function.session-write-close.php
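As a generic sketch of the pattern (the user_id key is just a placeholder): read whatever you need from the session, then release the lock before the slow work starts.
<?php
session_start();
$userId = isset($_SESSION['user_id']) ? $_SESSION['user_id'] : null; // placeholder session key
session_write_close(); // releases the lock: later AJAX requests from this user no longer queue behind this one

// ... long-running work goes here ...
echo json_encode(['result' => true]);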
I would advise trying EventSource. Here is an example.
PHP
<?php
header('Content-Type: text/event-stream');
// recommended to prevent caching of event data.
header('Cache-Control: no-cache');

function send_message($id, $message, $progress) {
    $d = array('message' => $message, 'progress' => $progress);
    echo "id: $id" . PHP_EOL;
    echo "data: " . json_encode($d) . PHP_EOL;
    echo PHP_EOL;
    ob_flush();
    flush();
}

for ($i = 0; $i < 4; $i++) {
    set_status_in_database($i);
    // some execution code;
    send_message($i, "set status in database " . ($i + 1) . " of 4", $i * 25);
    sleep(1);
}
send_message('CLOSE', 'Process complete', 100);
?>
JavaScript
var es;
function startTask() {
    es = new EventSource('ajax/status-ajax.php');
    es.addEventListener('message', function(e) {
        var result = JSON.parse(e.data);
        console.log(result.message);
        if (e.lastEventId == 'CLOSE') {
            console.log('Received CLOSE closing');
            es.close();
            showMessage('The account is successfully linked.', 'success');
        } else {
            $('.progress').css("width", result.progress + '%');
        }
    });
    es.addEventListener('error', function(e) {
        console.log('Error occurred', e);
        es.close();
    });
}
function stopTask() {
    es.close();
    console.log('Interrupted');
}
$('#link').on('click', function(e) {
    e.preventDefault();
    startTask($(this));
});
Reference:
Show Progress Report for Long-running PHP Scripts
Hope that is useful for you.
Both the PHP logic and the JavaScript syntax seem to be fine; however, given the minimal PHP code in the example, it is assumed that the execution is resource-heavy. MySQL might be busy, which is why the get-status request may be waiting on MySQL.
I have worked around such a problem by writing the status updates to a file instead of competing for database resources.
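A minimal sketch of that idea (the file path and helper names are my own placeholders, not from the original code): the long-running script overwrites a small status file after each step, and the status endpoint simply reads it back.
<?php
// used by the long-running script: overwrite the file with the latest step
function set_status_in_file($step) {
    file_put_contents(sys_get_temp_dir() . '/integration_status.txt', (string) $step);
}

// used by the status endpoint: read the latest step back (0 if nothing written yet)
function get_status_from_file() {
    $path = sys_get_temp_dir() . '/integration_status.txt';
    return is_readable($path) ? (int) file_get_contents($path) : 0;
}
In a real application you would key the file name per user or session so concurrent users don't overwrite each other's status.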
Since you are considering a different approach, let me recommend GraphQL as a thin layer / API above your database.
There are quite a few PHP solutions out there, for example Siler. Look for one that has subscriptions (not all do), as this would be the feature you are looking for. Subscriptions are used to create a websocket (a stream between your PHP and JavaScript), reducing all status-related communication to one call.
Yes, this may be "shooting cannons at birds", but maybe you have other things flying around, then it might be worth considering. There is a fantastic document to familiarize with the intriguing concept. You'd be able to reuse most of your database-related Php within the resolver functions.
I have a page with customers, and with AJAX I'm loading info on whether they have sent us an email or not.
The code looks like this:
$hostname = '{imap.gmail.com:993/imap/ssl}INBOX';
$username = 'email';
$password = 'password';
$this->session->data['imap_inbox'] = $inbox = imap_open($hostname, $username, $password) or die('Cannot connect to Gmail: ' . imap_last_error());
foreach ($customers as $customer) {
    $emails = imap_search($inbox, 'FROM ' . $email);
    // Processing info
}
But there are roughly 20-30 customers on one page, so the process sometimes takes about 10-20 seconds to show, and I was unable to optimize it.
Also, when the client tries to reload the page, it still waits for imap_search to finish, so reloading can take 20 seconds before the page is actually reloaded.
I have tried to abort the AJAX with a beforeunload function and close the IMAP connection, but this is not working.
My code:
Ajax:
$(window).bind('beforeunload', function(){
    imap_email.abort(); // the ajax is successfully aborted (as shown in console), yet the page still takes considerable time to reload
    $.ajax({
        type: 'GET',
        url: 'getimapmails&kill=1',
        async: false
    }); // ajax call to the same function to call imap_close
});
PHP:
if ($this->request->get['kill'] == '1') {
    imap_close($this->session->data['imap_inbox']);
    unset($this->session->data['imap_inbox']);
    $kill == 1;
    exit;
}
But even though the AJAX is aborted and imap_close is called on the variable holding the imap_open resource, it still takes 10-20 seconds for the page to reload, so I'm assuming the IMAP connection was not closed.
How do I close the imap so the page can reload immediately?
I would recommend killing it by creating a file that causes a break:
$hostname = '{imap.gmail.com:993/imap/ssl}INBOX';
$username = 'email';
$password = 'password';
$this->session->data['imap_inbox'] = $inbox = imap_open($hostname, $username, $password) or die('Cannot connect to Gmail: ' . imap_last_error());
foreach ($customers as $customer) {
    clearstatcache(); // can't use the cached result
    if (file_exists('/tmp/kill_imap.' . $this->session->id)) break; // making the assumption that /tmp and session->id are set, but the idea is a temporary folder and a unique identifier for that session
    $emails = imap_search($inbox, 'FROM ' . $email);
    // Processing info
}
if (file_exists('/tmp/kill_imap.' . $this->session->id)) unlink('/tmp/kill_imap.' . $this->session->id);
Then from your exit AJAX request, just call a PHP script that simply creates that file; it will break your loop, and the code above removes the file afterwards.
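The kill script itself can be as small as the sketch below, assuming it runs in the same controller context as the loop above so that $this->session->id resolves to the same value:
<?php
// hypothetical endpoint called by the beforeunload AJAX request
touch('/tmp/kill_imap.' . $this->session->id); // creates the flag file the loop checks for
exit;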
If I understood correctly, the time-consuming code lies within the foreach() loop.
Now, even if you make a second request to kill the IMAP session, that foreach() loop will continue until either it finishes or PHP kills it if (and when) execution time exceeds your max_execution_time setting.
In any case, you need something within your foreach() loop that will check on each round whether a condition to abort has been met, so as to swiftly terminate the current request and allow the client to make a new one.
I suggest you look at the PHP function connection_aborted(), which you could use to detect when the client aborts the current request; more generally, you could read up on the topic of connection handling to get a better sense of how connections and requests are handled in PHP.
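For illustration, a sketch of that check inside the existing loop. Note that PHP typically only notices an aborted connection when the script attempts to send output, so this sketch pushes a byte per iteration; whether that is acceptable depends on what your response normally contains.
<?php
ignore_user_abort(false); // the default: allow PHP to stop the script once the client is gone
foreach ($customers as $customer) {
    echo ' ';   // push a byte so PHP can detect a dropped connection (adjust if returning strict JSON)
    flush();
    if (connection_aborted()) {
        imap_close($inbox); // client left: close the IMAP connection and stop the work early
        exit;
    }
    $emails = imap_search($inbox, 'FROM ' . $email);
    // Processing info
}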
I wrote some PHP code to help me connect to a REST API for a telephone system (i.e. ICWS.php).
Then, to make my life easier, I wrote a small script (i.e. interations.php) that accepts two parameters: a method and an ID. This script basically calls a public method in my PHP connector.
In addition, I have another script (i.e. poll.php). This script pings the API once every half second to see if there is a new message available. I am using server-side polling to handle this. The code below shows how poll.php works:
while (1) {
    //process Messages
    $icws->processMessages();

    //show the Calls Queue
    $result = $icws->getCallsQueue();
    //$current = $icws->getCurrentUserStatusQueue();

    echo 'event: getMessagingQueue' . "\n";
    echo 'data: ' . json_encode( array('calls' => $result));
    echo "\n\n"; //required

    ob_flush();
    flush();

    putToSleep($sleepTime);
}

function putToSleep($val = 1) {
    if (is_int($val)) {
        sleep($val);
    } else {
        usleep($val * 1000000);
    }
}
From my site (i.e. phonesystem.html) I start the server-side polling, which pings the API once every half second. From the same page, I can also make other direct calls (e.g. dial 7204536695); all requests are done via AJAX.
Here is my code that generates the server-side polling
//Server Side Message Polling
function startPolling() {
    var evtSource = new EventSource("poll.php");
    evtSource.addEventListener("getMessagingQueue", function (e) {
        var obj = JSON.parse(e.data);
        if (!obj.calls || obj.calls.length === 0) {
            console.log('no messages');
            phoneKeyPad(false);
            return;
        }
        processMessages(obj.calls);
    }, false);
}

$(function () {
    startPolling();
});
The problem that I am facing is that when making an AJAX call, the response takes way too long (over 1 minute).
It seems that the Apache server slows down, as using other applications also becomes a little slower.
What can I check, and how can I troubleshoot this problem?
I've looked at AJAX progress bar solutions on this site; however, I can't find anything to help with the below.
Basically:
[browser] ----> makes ajax call to [server running php]
my php code
<?php
//input and process ajax request
//[MySQL] processing 5000 rows of a table
//after 5000/5000 row, end of ajax call
?>
Now, I need to show the user the status of the rows being processed
e.g. Processing::::::::: 2322/5000 rows
The code for processing Mysql looks like this
foreach($result_set as $table_row)
{
// process each table row
}
How am I going to let the client know of this progress via XMLHttpRequest?
I cannot use jQuery, only plain JavaScript.
I tried implementing the below, where the server continuously echoes the progress in a loop:
<?php
//progress
while ($i < $total_progress)
{
    //php processing
    if (@ob_get_contents()) { ob_end_flush(); flush(); }
    echo $i; // progress
    $i++;
}
?>
The data (variable $i) is continuously being printed and pushed into the response.
Is there a way I can read the printed variable from the server constantly, before the AJAX call ends?
This can be done with a combination of AJAX, sessions, and javascript. I've never made any graphical versions of a bar, but I have made text-based ones in the past.
In your client side page, have AJAX make a call to your page that will do the table processing. Have this server side page start a session and create a session variable to track what your progress is at (e.g. $_SESSION['processedRows']). Now inside your processing loop after you processed a row, increase the value of your session var and then make a call to session_write_close(). This will write your session info to disk and then close the session. This is important, as I will explain in a minute. You must also make another call to session_start() again prior to updating your session variable in the next iteration of your foreach loop.
Back on the client side, you will have another AJAX request being sent on an interval (I use setInterval() in javascript for this) which will call up another server side page that will open a session and return the value of $_SESSION['processedRows']. You can take this value and use it update your counter, progress bar, etc. You could return several values as JSON or an HTML snippet, doesn't matter. Just make sure that you have a callback method of some kind in place that will kill the interval that has been setup on the client side. It is important that you call session_write_close() in your long running script. If you do not then your concurrent script to check the progress will not be able to access your session variables and will lag while it waits for the processing script to end. The reason is that the session file has not been released until you call session_write_close().
EDIT:
Here is a very basic, no-frills example of what you are looking for. It doesn't have much in the way of error checking, timeouts, etc., but it should illustrate what you want done. No jQuery either, as per your question. For this you will need three files. First you need your long-running script; this one basically puts PHP to sleep very briefly 5000 times, to stand in for your table operations.
startCounter.php
<?php
//Start a long-running process
$goal = 5000;
session_start();
$_SESSION['progress'] = 0;
$_SESSION['goal'] = $goal;
session_write_close(); //Close the session so the update script can read the progress
for ($i = 0; $i < $goal; $i++) {
    //wait a wink
    usleep(rand(500, 2000));
    session_start(); //Reopen the session to continue updating the progress
    $_SESSION['progress'] = $i;
    session_write_close(); //Close it again so the update script is not blocked
}
?>
Next you need an update script. This one is very simple; it only outputs a string with the progress update.
updateCounter.php
<?php
//Get the progress
session_start();
print("Processed " . $_SESSION['progress'] . " of " . $_SESSION['goal'] . " rows.");
?>
Next you need your client side ready to go with AJAX. You need a way to start the long-running script, and then, while that request is incomplete, you set an interval to hit the server for the current progress. Just make sure you kill the interval when the operation is done. Here is the test page:
progress.html
<!DOCTYPE html>
<html>
<head>
    <title>Progress Test</title>
    <script language="javascript">
        function go()
        {
            var xhr = new XMLHttpRequest();
            xhr.open("GET", "startCounter.php", true);
            interval = window.setInterval( function () {
                update();
            }, 200);
            xhr.onreadystatechange = function () {
                if (xhr.readyState == 4) {
                    window.clearInterval(interval);
                    //Extremely likely that the AJAX update function won't make it in time to catch
                    //the script at 5000 before it wins. So let's display a message instead showing
                    //that everything is done.
                    document.getElementById("updateme").innerHTML = "Operation has completed.";
                }
            }
            xhr.send();
        }

        function update()
        {
            var xhr = new XMLHttpRequest();
            xhr.onreadystatechange = function () {
                if (xhr.readyState == 4 && xhr.status == 200) {
                    document.getElementById("updateme").innerHTML = xhr.responseText;
                }
            }
            xhr.open("GET", "updateCounter.php", true);
            xhr.send();
        }
    </script>
</head>
<body>
    <form>
        <input type="button" onclick="go()" value="Go!" />
    </form>
    <h3>AJAX update test</h3>
    <h4 id="updateme"></h4>
</body>
</html>
Hope that gets you going in the right direction.