I'm writing a PhoneGap application (i.e. no cross-domain restrictions for AJAX) that needs to parse RSS feeds (i.e. extract information from them), and I was looking for an easy way to do this. I looked at this, which seems good, but I would rather not connect to external sources, since the app should run on pretty slow internet connections too and each extra connection is a problem. What do you guys suggest? JSON seems like an excellent idea, but any direct ideas are great as well.
I had the same issue.
But I don't recommend processing the RSS on each call... This is madness.
Neither do I recommend loading the whole RSS as JSON... It's even worse.
Those techniques add delay to a connection which might be really slow.
What I did was a bit more complicated, but it gives you full control over what you send.
I'll assume that before loading any articles you'll show a list of titles to choose from...
So, first of all, you parse the whole RSS with PHP (or another server-side language) and output JSON-formatted text files:
1. A text file containing the list of all articles with their id and title (img path, date, if needed)
2. A text file for each article named rssfeed_[id]
You put a cron task on that script and ensure that everything is gzipped.
Then you create a small PHP file that takes the name and id of the file as parameters (which you'll get from the list).
Finally, in your application you call that one PHP file, which dynamically returns whichever file is needed, without any XML (RSS) to JSON processing on the fly.
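On the application side, a minimal sketch of this approach could look like the following (the get_feed.php name, its query parameter, and the rssfeed_[id] naming are assumptions matching the scheme above, not a fixed API):

    // Fetch pre-built JSON with a plain XMLHttpRequest; no XML parsing on the device.
    function getJSON(url, callback) {
        var xhr = new XMLHttpRequest();
        xhr.open('GET', url, true);
        xhr.onload = function () {
            if (xhr.status === 200) {
                callback(JSON.parse(xhr.responseText));
            }
        };
        xhr.send();
    }

    // 1. Load the list of articles (id + title) to build the menu.
    getJSON('get_feed.php?file=list', function (articles) {
        // articles is e.g. [{ id: 12, title: "..." }, ...]

        // 2. When the user picks one, load only that article's file.
        getJSON('get_feed.php?file=rssfeed_' + articles[0].id, function (article) {
            console.log(article.title, article.content);
        });
    });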
Related
I'm working on an HTML page for my department at work. Just HTML and CSS, nothing fancy. Now we are trying to get data from another webpage to be displayed in the new one we are working on. I assume I would need to use JavaScript and a parser of some sort, but I'm not sure how to do this or even what to search for.
The solution I assume would exist is a function that I can feed the link of the webpage we want to mine, and that would return (for example) the number of times a certain word is repeated on that page.
The best way to go about it is to use Node.js and install the cheerio (parser) and request (HTTP request) modules. There are many detailed tutorials showing how to do this (e.g. this one at DigitalOcean).
But if you don't want to set up Node.js and would rather work with a plain web setup, you can download the cheerio and request JS libraries, include them in your HTML page in a <script> tag, and then follow the example above. I hope it helps.
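For the Node.js route, a minimal sketch with the request and cheerio modules mentioned above might look like this (the URL and the word being counted are just placeholders):

    // Fetch a page and count how often a word appears in its visible text.
    const request = require('request');
    const cheerio = require('cheerio');

    const url = 'https://en.wikipedia.org/wiki/Golf'; // page to mine
    const word = 'golf';                               // word to count

    request(url, (error, response, body) => {
        if (error || response.statusCode !== 200) {
            return console.error('Request failed:', error || response.statusCode);
        }
        const $ = cheerio.load(body); // parse the fetched HTML
        const text = $('body').text().toLowerCase();
        const count = text.split(/\W+/).filter((w) => w === word).length;
        console.log('"' + word + '" appears ' + count + ' times on ' + url);
    });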
Say I have a news website with articles, I have a blank article page with everything BUT the headline, photos, and the text article itself that I would ordinarily fill in manually. Instead of filling it in, say I have the entire div class ripped from a web page already. I want to import this content directly onto the page and publish it with minimal steps.
(I hope I'm giving you the picture. Imagine I have cars fully built aside from missing engines and I want the monkeys I've hired to steal engines to not leave the engines piling up outside, but instead to also bring them inside and install them into the cars and drive them to the car dealer.)
I will be web scraping something like a Wikipedia page on golf and putting that into my page. I don't want to have to copy, paste, and click publish over and over. I want the web scraper, which I already know how to build, to go another step and do a find-and-replace of a certain div class on my blank-page website INSTEAD of writing the data to a file on my computer's hard drive (though maybe writing to my hard drive with Python, then having JS or something read the HTML file from my hard drive and THEN writing it to my web page would be one way to do it).
Are there programs that will do this? Do you know of modules that will do this through Python? Do you know of anything like this somebody wrote and put up on GitHub?
I'm not planning on ripping off news websites, but just to give a simpler example with one object... If I had the entire div class "content" from here...
http://www.zerohedge.com/news/2017-02-18/merkel-says-there-problem-euro-blames-mario-draghi
saved as an HTML file on my hard drive (which you could reproduce by clicking 'Inspect' anywhere on the text of the main article, right-clicking the element, choosing Copy > Copy outerHTML, and pasting it into your text editor as HTML; again, something I would have done with a web scraper), how could I get this pasted into a blank 'new article' page and published on my website with the push of a button? I'm fine with having to click a few buttons, but not with copying and pasting.
I'll be doing this (legally) with parts of web pages over and over and over again, and I'm sure it can be automated in some way. I've heard financial news websites have been writing articles from data, so something like what I need probably exists. I might be running the text I scrape through a basic neural net or feeding it to GANs. I think some interesting things can be made this way, in case you are curious what I'm up to.
If you're using Python to do this, the quickest way, I feel, would be to have the web crawler save its findings to either a JSON file or an SQL database that your website front end shares access to (storing the HTML you pulled as a string of text).
If you go the JSON route, just send an AJAX request for that file from the website and place the content into the element you're dumping the code into using innerHTML.
If you go the SQL route, have a Python script alongside the website that you can send a POST request to; the script pulls the data you want from the database, returns it to the browser as JSON, and then you do the same as above.
The benefit of going straight to JSON is not having to set up a connection to an SQL server and deal with the SQL-query-to-JSON conversion step. However, the benefit of the SQL database is not having to worry about write conflicts on the JSON file if your crawler is working with multiple threads and you don't lock the file correctly.
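For the JSON route, a minimal browser-side sketch could look like this (articles.json and the content field are assumed names for whatever your crawler writes out):

    // Fetch the file the crawler wrote and drop the stored HTML string into the page.
    fetch('articles.json')
        .then((response) => response.json())
        .then((data) => {
            // data.content holds the scraped outerHTML saved by the crawler
            document.getElementById('article-body').innerHTML = data.content;
        })
        .catch((err) => console.error('Could not load article JSON:', err));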
I have html_page.php, which contains my HTML page, and javascript.js, which contains the AJAX code that calls process.php to look up data from MySQL and populate it into html_page.php.
Is it possible for me to cache the HTML structure of html_page.php but not the data in the fields?
It is hard to give you a proper answer without knowing the context you are working in, but according to your question tags, I assume you are talking about using PHP.
You should refer to this thread on implementing caching on a PHP site.
Typically, you will need to use plugins and extensions that don't come out-of-the-box with PHP.
Some web servers also have caching functionality built into them, you should look up the documentation of your web server.
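On the client side, one sketch of "cache the structure, not the data" is to let the browser cache html_page.php and javascript.js normally, while always fetching the fields from process.php fresh (the field names below are hypothetical):

    // Ask the browser never to reuse a cached copy of the data response.
    fetch('process.php', { cache: 'no-store' })
        .then((response) => response.json())
        .then((fields) => {
            // populate the already-cached page structure with fresh values
            document.getElementById('name').textContent = fields.name;
            document.getElementById('total').textContent = fields.total;
        });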
I am creating a product that, as its end result, will/can create e.g. 10 .sql files, each being a table. The tables will contain various pre-calculated data related to each other.
My users will need to upload these to their website (PHP, ASP, whatever) and make something useful out of them. The only problem is that the users may have next to zero understanding of databases, server-side code, etc. This means it must be very easy to configure.
So I am thinking of uploading these .sql (or CSV, whatever) tables to the server so they are publicly available (i.e. can be retrieved like any other public URL), and then finding a JavaScript in-memory database engine that can load .sql database files. Does this exist?
I imagine a JavaScript solution could work well if the amount of data could be kept somewhat down... Otherwise I may need to look for a PHP/ASP solution as well. (Any ideas for libraries that can initialize in-memory databases from .sql or similar files?)
Preferably I should be able to redistribute this JavaScript library. (So users can get a complete "directory" of .sql files + example page + JavaScript database engine to upload.)
So, to make the question clear: does anyone know a JavaScript-based in-memory database engine that can run inside the browser?
If you wish to use JavaScript and need some 'user-friendly' bridge database, you could use JSON or XML, because these formats are simple text files (like CSV as well) for which you can find small, smart editors for your users.
Moreover, JSON is made for JavaScript parsing and has an easily understood tree format. But you should load only part of the SQL data into memory at a time, say as data buffers in XML or JSON, requested from PHP with a JavaScript AJAX call. PHP does the SQL database access work and outputs JSON; JavaScript then takes care of the user interface and displays the data.
You can use MySQL to store a database in memory:
http://dev.mysql.com/doc/refman/5.0/en/memory-storage-engine.html
Here's a pure JS SQL engine that stores everything in memory: https://github.com/moxley/sqittle
It flatly denies being useful for anything, though, and has a limited set of supported commands (see the readme at the above link).
http://diveintohtml5.ep.io/storage.html might be what you are looking for.
That question seems very old. You might want to look at LokiJS now.
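As a rough sketch of how LokiJS could be used here, assuming the pre-calculated tables are exported as JSON files (one array of rows per table) rather than raw .sql dumps, since LokiJS does not parse SQL:

    // With <script src="lokijs.min.js"></script> included, "loki" is a global.
    var db = new loki('catalog.db');
    var products = db.addCollection('products', { indices: ['id'] });

    // Load one table that was uploaded as a plain public JSON file.
    fetch('tables/products.json')
        .then(function (response) { return response.json(); })
        .then(function (rows) {
            products.insert(rows);
            // Query it like a database, entirely in memory.
            var cheap = products.find({ price: { $lt: 100 } });
            console.log(cheap.length + ' products under 100');
        });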
I have a website with a table that is populated with data from various external XML feeds. The table is generated using JavaScript because, after some reading, I found that this seemed to be the best approach for creating an HTML table from XML data (please correct me if I'm wrong!).
I now want to parse this HTML table into an RSS feed and I'm struggling to find the best way to do so. I have PHP code that will parse an HTML table, but because this table is generated using JS (i.e. client-side), the PHP parser does not work. Can anyone tell me the best way to go about this?
As you've probably gathered, I'm quite new to programming so layman terms would be much appreciated where possible.
Thanks a lot.
I found that this seemed to be the best approach for creating an HTML table from XML data (please correct me if I'm wrong!).
As a rule of thumb, if instant feedback isn't required (and it isn't if you are fetching data from multiple external sources), if you can do it server side, then do it server side. You only have one server side environment to deal with instead of dozens of different client side environments (some of which could have JS turned off).
I now want to parse this HTML table into an RSS feed and I'm struggling to find the best way to do so. I have PHP code that will parse an HTML table, but because this table is generated using JS (i.e. client-side), the PHP parser does not work. Can anyone tell me the best way to go about this?
Write PHP to get the data from wherever the JS gets its data from. You already have the logic to query it in JS, so you should be able to do a fairly straight port of that.
It is not possible to generate an RSS feed from pure client-side JavaScript: most RSS clients don't run JavaScript, and the standard doesn't provide for it, so your code would never get the chance to produce the data.
Replicate the functionality of your JavaScript aggregator using some server-side language like PHP, and build an RSS feed from it. It will require rewriting your entire code, but it is probably the best way to go.
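If you would rather stay in JavaScript on the server, the same idea can be sketched in Node.js: fetch the source feeds there, extract the items the way your client-side code already does, and print RSS (the feed URL and the item fields below are placeholders):

    const https = require('https');

    const sources = ['https://example.com/feed1.xml']; // your external XML feeds

    function fetchText(url) {
        return new Promise((resolve, reject) => {
            https.get(url, (res) => {
                let body = '';
                res.on('data', (chunk) => { body += chunk; });
                res.on('end', () => resolve(body));
            }).on('error', reject);
        });
    }

    async function buildRss() {
        const items = [];
        for (const url of sources) {
            const xml = await fetchText(url);
            // ...extract title/link/description from xml here, exactly as the
            // client-side table code does (e.g. with an XML parsing library)...
            items.push({ title: 'Example item', link: url, description: '...' });
        }
        const rssItems = items.map((i) =>
            '<item><title>' + i.title + '</title>' +
            '<link>' + i.link + '</link>' +
            '<description>' + i.description + '</description></item>'
        ).join('');
        return '<?xml version="1.0"?><rss version="2.0"><channel>' +
            '<title>Aggregated feed</title>' + rssItems + '</channel></rss>';
    }

    buildRss().then((rss) => console.log(rss));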