I'm using CakePHP 2.4.7 and the TinyMCE plugin from CakeDC.
I set up my CakePHP core along with the plugin in a shared location on my server so that multiple applications can access it. This keeps me from having to update multiple copies of TinyMCE. Everything was working well until I migrated to a new server and updated software.
The new server is running Apache 2.4 instead of 2.2 and using mod_ruid2 instead of suexec.
I now get this error when trying to load the editor:
Fatal Error (4): syntax error, unexpected T_CONSTANT_ENCAPSED_STRING in [/xyz/Plugin/TinyMCE/webroot/js/tiny_mce/tiny_mce.js, line 1]
How should I start debugging this?
Workaround Attempt
I tried adding a symlink from an application's webroot to TinyMCE's plugin webroot. This works in that it loads the js file and the editor, but then TinyMCE plugins are working on the wrong current directory and file management would not be separated.
The problem is the AssetDispatcher filter. It includes CSS and JS files using PHP's include() statement, which sends the files through the PHP parser, where it stumbles over the occurrences of <? in the TinyMCE script.
See https://github.com/.../2.4.7/lib/Cake/Routing/Filter/AssetDispatcher.php#L159-L160
A very annoying, and, since it's undocumented and non-optional, dangerous behavior if you ask me.
Custom asset dispatcher
If you want to keep using a plugin asset dispatcher, extend the built-in one and reimplement the AssetDispatcher::_deliverAsset() method with the include functionality removed. Of course this is kind of annoying maintenance-wise, but it's a pretty quick fix.
Something like:
// app/Routing/Filter/MyAssetDispatcher.php
App::uses('AssetDispatcher', 'Routing/Filter');
class MyAssetDispatcher extends AssetDispatcher {
protected function _deliverAsset(CakeResponse $response, $assetFile, $ext) {
// see the source of your CakePHP core for the
// actual code that you'd need to reimplement
ob_start();
$compressionEnabled = Configure::read('Asset.compress') && $response->compress();
if ($response->type($ext) == $ext) {
$contentType = 'application/octet-stream';
$agent = env('HTTP_USER_AGENT');
if (preg_match('%Opera(/| )([0-9].[0-9]{1,2})%', $agent) || preg_match('/MSIE ([0-9].[0-9]{1,2})/', $agent)) {
$contentType = 'application/octetstream';
}
$response->type($contentType);
}
if (!$compressionEnabled) {
$response->header('Content-Length', filesize($assetFile));
}
$response->cache(filemtime($assetFile));
$response->send();
ob_clean();
// instead of the possible `include()` in the original
// method's source, use `readfile()` only
readfile($assetFile);
if ($compressionEnabled) {
ob_end_flush();
}
}
}
// app/Config/bootstrap.php
Configure::write('Dispatcher.filters', array(
'MyAssetDispatcher', // instead of AssetDispatcher
// ...
));
See also http://book.cakephp.org/2.0/en/development/dispatch-filters.html
Don't just disable short open tags
I'm just guessing here, but the reason it was working on your other server is probably that short open tags (i.e. <?) were disabled there. However, even if that is the difference on your new server, this isn't something you should rely on. The assets are still being served using include(), and you most probably don't want to check all your third-party CSS/JS for possible PHP code injections on every update.
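To get a feel for how often a third-party asset would trip the PHP parser, here is a minimal Node.js sketch that lists the lines of a file's text containing a PHP open tag. The function name `findPhpOpenTags` is made up for this illustration, and the check is a plain text scan, not a PHP parser, so treat it as a rough audit aid only.

```javascript
// Report every line of an asset's source text that contains a PHP open tag
// (<?php, <?= or the short form <?). Plain text scan, no PHP parsing involved.
function findPhpOpenTags(source) {
  var hits = [];
  source.split(/\r?\n/).forEach(function (text, index) {
    var match = /<\?(php|=)?/.exec(text);
    if (match) {
      // Line and column are 1-based, like editor positions.
      hits.push({ line: index + 1, column: match.index + 1, tag: match[0] });
    }
  });
  return hits;
}

// Example with an inline string standing in for a vendor script.
var sample = "var a = 1;\nvar s = '<?xml version';";
console.log(findPhpOpenTags(sample));
```

To run it against a real file, read the file with `require('fs').readFileSync(path, 'utf8')` and pass the result in.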
Related
I have this stupid problem with Yii where my local dev environment caches CSS and JS files. When I try to edit a file, the changes don't show up, but the file does get corrupted and breaks everything. This goes on for some indeterminate amount of time and then fixes itself.
My Yii config is like this for the assetManager:
$config['components']['assetManager']['forceCopy'] = true;
$config['components']['assetManager']['appendTimestamp'] = true;
$config['components']['assetManager']['linkAssets'] = true;
As you can see below, the JS file just ends after making a small colour change to one of the mouse over fields.
The timestamp doesn't seem to be appended to the JS file when including it like all the other resources.
<script src="/custom/infobox.js?v=1427807792"></script>
<script src="/js/neighbourhoods-map.js"></script>
<script src="/js/search-block.js?v=1423510537"></script>
The file is included by calling registerJsFile() in the view file.
$this->registerJsFile('/js/neighbourhoods-map.js', [
'depends' => ['\app\assets\MapsAsset'],
'position' => View::POS_END]
);
I changed the above to include a timestamp, but the problem is still happening.
$this->registerJsFile('/js/neighbourhoods-map.js?v='.time(), [
'depends' => ['\app\assets\MapsAsset'],
'position' => View::POS_END]
);
This is soooooo frustrating to deal with. Can anyone shed some light on what the issue is here?
I encountered this issue while I was using Vagrant with nginx.
The solution was to just turn off the sendfile directive in the nginx configuration.
sendfile off;
My question is: how can I scrape data from this website, http://vtis.vn/index.aspx? The data is not shown until you click on, for example, "Danh sách chậm". I have looked at this carefully: clicking "Danh sách chậm" fires an onclick event that triggers some JavaScript functions, one of which fetches the data from the server and inserts it into a placeholder tag. At that point you can inspect the data with something like Firefox, and it is displayed to viewers on the webpage. So again, how can we scrape this data programmatically?
I wrote a scraping function, but of course it does not get the data I want, because the data is not available until I click the "Danh sách chậm" button.
<?php
$Page = file_get_contents('http://vtis.vn/index.aspx');
$dom_document = new DOMDocument();
$dom_document->loadHTML($Page);
// Build the XPath helper on the document that was just loaded
$dom_xpath = new DOMXPath($dom_document);
// XPath attribute tests use @, not #
$elements = $dom_xpath->query("*//td[@class='IconMenuColumn']");
foreach ($elements as $element) {
$nodes = $element->childNodes;
foreach ($nodes as $node) {
echo mb_convert_encoding($node->c14n(), 'iso-8859-1', mb_detect_encoding($Page, 'UTF-8', true));
}
}
You need to look at PhantomJS.
From their site:
PhantomJS is a headless WebKit with JavaScript API. It has fast and
native support for various web standards: DOM handling, CSS selector,
JSON, Canvas, and SVG.
Using the API you can script the "browser" to interact with that page and scrape the data you need. You can then do whatever you need with it; including passing it to a PHP script if necessary.
That being said, if at all possible try not to "scrape" the data. If there is an ajax call the page is making, maybe there is an API you can use instead? If not, maybe you can convince them to make one. That would of course be much easier and more maintainable than screen scraping.
First, you need PhantomJS. Suggested install method on Linux:
wget https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-2.1.1-linux-x86_64.tar.bz2
tar xvf phantomjs-2.1.1-linux-x86_64.tar.bz2
cp phantomjs-2.1.1-linux-x86_64/bin/phantomjs /usr/local/bin
Second, you need the php-phantomjs package. Assuming you have installed Composer:
composer require jonnyw/php-phantomjs
Or follow installation documentation here.
Third, load the package in your script and, instead of using file_get_contents, load the page via PhantomJS:
<?php
require 'vendor/autoload.php';
use JonnyW\PhantomJs\Client;
$client = Client::getInstance();
// Point the client at the PhantomJS binary installed above
$client->getEngine()->setPath('/usr/local/bin/phantomjs');
$request = $client->getMessageFactory()->createRequest();
$response = $client->getMessageFactory()->createResponse();
$request->setMethod('GET');
$request->setUrl('https://www.your_page_embeded_ajax_request');
$client->send($request, $response);
if($response->getStatus() === 200) {
echo "Do something here";
}
I'd like to inject a couple of local .js files into a webpage. I just mean client side, as in within my browser, I don't need anybody else accessing the page to be able to see it. I just need to take a .js file, and then make it so it's as if that file had been included in the page's html via a <script> tag all along.
It's okay if it takes a second after the page has loaded for the stuff in the local files to be available.
It's okay if I have to be at the computer to do this "by hand" with a console or something.
I've been trying to do this for two days, I've tried Greasemonkey, I've tried manually loading files using a JavaScript console. It amazes me that there isn't (apparently) an established way to do this, it seems like such a simple thing to want to do. I guess simple isn't the same thing as common, though.
If it helps, the reason why I want to do this is to run a chatbot on a JS-based chat client. Some of the bot's code is mixed into the pre-existing chat code -- for that, I have Fiddler intercepting requests to .../chat.js and replacing it with a local file. But I have two .js files which are "independent" of anything on the page itself. There aren't any .js files requested by the page that I can substitute them for, so I can't use Fiddler.
Since you're already using a Fiddler script, you can do something like this in the OnBeforeResponse(oSession: Session) function:
if ( oSession.oResponse.headers.ExistsAndContains("Content-Type", "html") &&
oSession.hostname.Contains("MY.TargetSite.com") ) {
oSession.oResponse.headers.Add("DEBUG1_WE_EDITED_THIS", "HERE");
// Remove any compression or chunking
oSession.utilDecodeResponse();
var oBody = System.Text.Encoding.UTF8.GetString(oSession.responseBodyBytes);
// Find the end of the HEAD script, so you can inject script block there.
var oRegEx = /(<\/head>)/gi;
// replace the head-close tag with new-script + head-close
oBody = oBody.replace(oRegEx, "<script type='text/javascript'>console.log('We injected it');</script></head>");
// Set the response body to the changed body string
oSession.utilSetResponseBody(oBody);
}
Working example for www.html5rocks.com :
if ( oSession.oResponse.headers.ExistsAndContains("Content-Type", "html") &&
oSession.hostname.Contains("html5rocks") ) { //goto html5rocks.com
oSession.oResponse.headers.Add("DEBUG1_WE_EDITED_THIS", "HERE");
oSession.utilDecodeResponse();
var oBody = System.Text.Encoding.UTF8.GetString(oSession.responseBodyBytes);
var oRegEx = /(<\/head>)/gi;
oBody = oBody.replace(oRegEx, "<script type='text/javascript'>alert('We injected it')</script></head>");
oSession.utilSetResponseBody(oBody);
}
Note, you have to turn streaming off in fiddler : http://www.fiddler2.com/fiddler/help/streaming.asp and I assume you would need to decode HTTPS : http://www.fiddler2.com/fiddler/help/httpsdecryption.asp
I have been using fiddler script less and less, in favor of fiddler .Net Extensions - http://fiddler2.com/fiddler/dev/IFiddlerExtension.asp
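The head-injection step from the Fiddler script above can be tried out on its own as plain JavaScript. This is only a sketch of the string manipulation, not Fiddler code: `injectBeforeHeadClose` is a name invented here, and it assumes the page has a single closing head tag, so it replaces only the first match.

```javascript
// Insert `snippet` immediately before the first closing </head> tag.
// A replacer function is used instead of a replacement string so that
// sequences such as "$&" inside the snippet are inserted literally.
function injectBeforeHeadClose(html, snippet) {
  return html.replace(/<\/head>/i, function (closeTag) {
    return snippet + closeTag;
  });
}

var page = '<html><head><title>t</title></head><body></body></html>';
var tag = "<script type='text/javascript'>console.log('We injected it');</script>";
console.log(injectBeforeHeadClose(page, tag));
```

If the HTML has no closing head tag, the input comes back unchanged, which is worth checking for before setting the response body.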
If you are using Chrome then check out dotjs.
It will do exactly what you want!
How about just using jquery's jQuery.getScript() method?
http://api.jquery.com/jQuery.getScript/
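If jQuery isn't available on the page, roughly the same effect can be had with plain DOM calls from the console. The sketch below is not jQuery.getScript() itself but a hand-rolled stand-in. The `injectScript` name and the localhost URLs are assumptions for illustration; the files would need to be served from some URL the page is allowed to load, and the page's security policy may still block them.

```javascript
// Append a <script> element to the page and resolve once it has loaded.
// `doc` is the page's document object; `src` is the URL of the script.
function injectScript(doc, src) {
  return new Promise(function (resolve, reject) {
    var el = doc.createElement('script');
    el.src = src;
    el.onload = function () { resolve(el); };
    el.onerror = function () { reject(new Error('Could not load ' + src)); };
    doc.head.appendChild(el);
  });
}

// Hypothetical usage from the browser console, loading two files in order:
// injectScript(document, 'http://localhost:8000/bot-core.js')
//   .then(function () {
//     return injectScript(document, 'http://localhost:8000/bot-extras.js');
//   });
```

Chaining the promises keeps the load order predictable when the second file depends on the first.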
Save the normal HTML pages to the file system, add the JS files to them by hand, and then use Fiddler to intercept those requests so that you get your version of the HTML file.
I have about 100 static HTML pages that I want to apply some DOM manipulations to. They all follow the same HTML structure. I want to apply some DOM manipulations to each of these files, and then save the resulting HTML.
These are the manipulations I want to apply:
# [start]
$("h1.title, h2.description", this).wrap("<hgroup>");
if ( $("h1.title").height() < 200 ) {
$("div.content").addClass('tall');
}
# [end]
# SAVE NEW HTML
The first line (.wrap()) I could easily do with a find and replace, but it gets tricky when I have to determine the calculated height of an element, which can't easily be determined sans-JavaScript.
Does anyone know how I can achieve this? Thanks!
While the first part could indeed be solved in "text mode" using regular expressions or a more complete DOM implementation in JavaScript, for the second part (the height calculation), you'll need a real, full browser or a headless engine like PhantomJS.
From the PhantomJS homepage:
PhantomJS is a command-line tool that packs and embeds WebKit.
Literally it acts like any other WebKit-based web browser, except that
nothing gets displayed to the screen (thus, the term headless). In
addition to that, PhantomJS can be controlled or scripted using its
JavaScript API.
A schematic instruction (which I admit is not tested) follows.
In your modification script (say, modify-html-file.js), open an HTML page, modify its DOM tree, and console.log the HTML of the root element:
var page = new WebPage();
page.open(encodeURI('file://' + phantom.args[0]), function (status) {
if (status === 'success') {
var html = page.evaluate(function () {
// your DOM manipulation here
return document.documentElement.outerHTML;
});
console.log(html);
}
phantom.exit();
});
Next, save the new HTML by redirecting your script's output to a file:
#!/bin/bash
mkdir -p modified
for i in *.html; do
  # use the loop variable, not the script's first argument
  phantomjs modify-html-file.js "$i" > modified/"$i"
done
I tried PhantomJS as in katspaugh's answer, but ran into several issues trying to manipulate pages. My use case was modifying the static HTML output of Doxygen, without modifying Doxygen itself. The goal was to reduce the delivered file size by removing unnecessary elements from the page, and to convert it to HTML5. Additionally, I wanted to use jQuery to access and modify elements more easily.
Loading the page in PhantomJS
The APIs appear to have changed drastically since the accepted answer. Additionally, I used a different approach (derived from this answer), which will be important in mitigating one of the major issues I encountered.
var system = require('system');
var fs = require('fs');
var page = require('webpage').create();
// Reading the page's content into your "webpage"
// This automatically refreshes the page
page.content = fs.read(system.args[1]);
// Make all your changes here
fs.write(system.args[2], page.content, 'w');
phantom.exit();
Preventing JavaScript from Running
My page uses Google Analytics in the footer, and now the page is modified beyond my intention, presumably because javascript was run. If we disable javascript, we can't actually use jQuery to modify the page, so that isn't an option. I've tried temporarily changing the tag, but when I do, every special character is replaced with an html-escaped equivalent, destroying all javascript code on the page. Then, I came across this answer, which gave me the following idea.
var rawPageString = fs.read(system.args[1]);
rawPageString = rawPageString.replace(/<script type="text\/javascript"/g, "<script type='foo/bar'");
rawPageString = rawPageString.replace(/<script>/g, "<script type='foo/bar'>");
page.content = rawPageString;
// Make all your changes here
rawPageString = page.content;
rawPageString = rawPageString.replace(/<script type='foo\/bar'/g, "<script");
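The disable and restore steps are plain string operations, so they can be pulled into two helpers and checked outside PhantomJS. This is a small sketch using the same regular expressions as above; the function names `disableScripts` and `restoreScripts` are made up for this example.

```javascript
// Give script tags a bogus type so the browser engine will not execute them.
function disableScripts(html) {
  return html
    .replace(/<script type="text\/javascript"/g, "<script type='foo/bar'")
    .replace(/<script>/g, "<script type='foo/bar'>");
}

// Undo disableScripts by stripping the bogus type again.
function restoreScripts(html) {
  return html.replace(/<script type='foo\/bar'/g, '<script');
}

var original = '<script>a()</script><script type="text/javascript" src="x.js"></script>';
var disabled = disableScripts(original);
console.log(disabled);
console.log(restoreScripts(disabled));
```

Note that the round trip is not exact: tags that originally carried type="text/javascript" come back without a type attribute, which is harmless for HTML5 output. Script tags written in any other form (for example with single quotes or extra attributes before the type) are not matched by these patterns and would still run.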
Adding jQuery
There's actually an example on how to use jQuery. However, I thought an offline copy would be more appropriate. Initially I tried using page.includeJs as in the example, but found that page.injectJs was more suitable for the use case. Unlike includeJs, there's no <script> tag added to the page context, and the call blocks execution which simplifies the code. jQuery was placed in the same directory I was executing my script from.
page.injectJs("jquery-2.1.4.min.js");
page.evaluate(function () {
// Make all changes here
// Remove the foo/bar type more easily here
$("script[type^=foo]").removeAttr("type");
});
fs.write(system.args[2], page.content, 'w');
phantom.exit();
Putting it All Together
var system = require('system');
var fs = require('fs');
var page = require('webpage').create();
var rawPageString = fs.read(system.args[1]);
// Prevent in-page javascript execution
rawPageString = rawPageString.replace(/<script type="text\/javascript"/g, "<script type='foo/bar'");
rawPageString = rawPageString.replace(/<script>/g, "<script type='foo/bar'>");
page.content = rawPageString;
page.injectJs("jquery-2.1.4.min.js");
page.evaluate(function () {
// Make all changes here
// Remove the foo/bar type
$("script[type^=foo]").removeAttr("type");
});
fs.write(system.args[2], page.content, 'w');
phantom.exit();
Using it from the command line:
phantomjs modify-html-file.js "input_file.html" "output_file.html"
Note: This was tested and working with PhantomJS 2.0.0 on Windows 8.1.
Pro tip: If speed matters, you should consider iterating the files from within your PhantomJS script rather than a shell script. This will avoid the latency that PhantomJS has when starting up.
You can get your modified content with $('html').html() (or a more specific selector if you don't want things like the head tags), then submit it as one big string to your server and write the file server-side.
I'm working as a client-side developer on a web app and have ended up with lots of modules, each one for a different page. I wonder whether it is a good idea to include them depending on the route myself (in JavaScript) or to pass this responsibility to the back-end (Ruby on Rails) guys.
I suppose I need some application.js to be included in every page, and in it do something like this:
if (window.location.href == '.../somePage') {
loadScript('somePageControls.js')
}
if (window.location.href == '.../anotherPage') {
...
}
Any thoughts?
OK, this is not a simple answer at all, since there are you and there are these Ruby guys. Hope everything is OK between you! ;)
If you don't want to bother these guys too much (I know it's not easy to work in a team), you can ask them to include just one Ruby file from your working folder, e.g.
/web-app/ruby-guys/you/ruby-js.rb
Now you can work with this single file and load all the JS files you need from it, for example by using a case statement:
case url_query
when x
# print out drag-drop.js
when y, z
# print out mouse-move.js
end
You should pass that on to the server-side guys. It's a lot cleaner to output that sort of script inclusion server-side, since they can just add it to the pages you need it on rather than having a big if-else.
$0.02 from me ...
If you'd like to minimize round trips to a (Rails) server, you could simply use a hash tag at the end of your urls, and parse the hash tag to determine which modules were necessary. This would allow you to put your page on a CDN if necessary, and enable you to dynamically change which modules were loaded on a page just by having a server redirect to the same page with a different hash tag.
E.g.
// Example URL: http://example.com/index.html#mod1,mod2,mod3,mod4
var loadModules = function() {
var modsArr, i, len;
if (!location.hash) return false;
modsArr = location.hash.substr(1).split(','); // ["mod1", "mod2", "mod3", "mod4"]
if (!modsArr.length) return false;
for (i=0, len = modsArr.length; i < len; i++) {
// Now you can load your mods.
}
}
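To round out the loop body, here is one possible way to turn the hash into a list of script URLs. It is a sketch under assumptions: the helper names, the `/js/modules` base path, and the one-file-per-module naming are all invented for illustration. It also skips empty entries, so a hash like `#mod1,,mod2` doesn't produce a blank module name.

```javascript
// Split a location hash such as "#mod1,mod2" into clean module names.
function parseModuleNames(hash) {
  if (!hash) return [];
  return hash
    .replace(/^#/, '')
    .split(',')
    .map(function (name) { return name.trim(); })
    .filter(Boolean);
}

// Map each module name to a script URL under a (hypothetical) base path.
function moduleUrls(hash, basePath) {
  return parseModuleNames(hash).map(function (name) {
    return basePath + '/' + encodeURIComponent(name) + '.js';
  });
}

console.log(moduleUrls('#mod1,mod2,,mod3', '/js/modules'));
```

In the browser you would pass `location.hash` as the first argument and hand each resulting URL to whatever script loader you already use. Encoding the name keeps a crafted hash from pointing the loader at an arbitrary path.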