I am new to JavaScript and am pretty sure I am missing something fundamental about using JS from an HTML page (to be viewed in a web browser).
My goal is to scrape photo links from a dynamic website using cheerio and display them in a JS gadget (e.g., using lightslider). Following this tutorial, I obtained the script below, and it runs successfully with a simple nodejs scrapt.js in a bash terminal:
var request = require('request');
var cheerio = require('cheerio');
// Fetch the page and print its raw HTML if the request succeeded
request('https://outbox.eait.uq.edu.au/uqczhan2/Photos/', function (error, response, html) {
  if (!error && response.statusCode == 200) {
    console.log(html);
  }
});
But now I am not able to run this script in a regular web browser (by pressing F12 -> Console); an error appears after the first statement:
>var request = require('request');
VM85:1 Uncaught ReferenceError: require is not defined
at <anonymous>:1:15
I understand that some JavaScript libraries must be loaded before they can be used. For example, for d3.js I need to include:
<script src="https://d3js.org/d3.v4.min.js"></script>
to use all the d3 functions. How can I achieve the same thing so that I can use cheerio in a web browser?
You cannot run Node.js code directly in the browser. Look into browserify, a build tool that bundles Node.js-style modules so that code using require can run in the browser.
Cheerio uses a library that requires process, i.e. the Node process object, not available in the browser.
browserify works, however.
Source: Endless headaches trying to get cheerio to work with Webpack.
This is an XY problem. You may assume that to parse HTML in the browser, you should use Cheerio, a Node.js HTML parser. The problem is, you can't run Node.js code in the browser without a build tool like browserify to mock require and make it possible.
However, before embarking on adding a build process, it's worth taking a step back and realizing that the browser already has a native HTML parser that requires no packages, plus jQuery, which is an easy <script> tag include away and requires no build process or workarounds. In fact, Cheerio was invented purely to port jQuery syntax to an environment that doesn't have a DOM, Node.js.
So instead of essentially porting jQuery to Node, then back to the browser in a Rube Goldbergian manner, just use jQuery or the native DOM directly. These are the original native browser tools that preceded Cheerio.
request isn't necessary in the browser, either. It's another Node package not intended for browser environments. As above, you can use jQuery or a native fetch call to make your HTTP request.
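As a rough sketch of the native approach described above, the following uses fetch and the browser's built-in DOMParser in place of request and Cheerio. The function names, the idea that photos appear as ordinary links, and the list of image extensions are assumptions made for illustration; the real page may be structured differently.

```javascript
// Collect absolute URLs of links that look like images.
// `doc` is anything with querySelectorAll (a Document in the browser).
function extractPhotoLinks(doc, baseUrl) {
  const imagePattern = /\.(jpe?g|png|gif|webp)$/i; // assumed extensions
  return Array.from(doc.querySelectorAll('a[href]'))
    .map((a) => new URL(a.getAttribute('href'), baseUrl)) // resolve relative links
    .filter((url) => imagePattern.test(url.pathname))
    .map((url) => url.href);
}

// Browser-only: fetch the page and parse it with the built-in HTML parser.
// The request is still subject to the browser's cross-origin rules.
async function loadPhotoLinks(pageUrl) {
  const response = await fetch(pageUrl);
  if (!response.ok) {
    throw new Error('Request failed with status ' + response.status);
  }
  const html = await response.text();
  const doc = new DOMParser().parseFromString(html, 'text/html');
  return extractPhotoLinks(doc, pageUrl);
}
```

In a page, calling loadPhotoLinks('https://outbox.eait.uq.edu.au/uqczhan2/Photos/') would return a promise for the list of links, provided the server permits the cross-origin request.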
Taking another step back, though: browsers enforce the same-origin policy, and most servers don't send the CORS headers that would permit pages on other origins to read their responses. You may need a server running Node and Express to get around this restriction. In that case, Cheerio may come in handy again so you can pull the relevant data out of the third-party site's response on the backend and send it on to your frontend.
Without writing and hosting your own server, you may be able to use a proxy like cors-anywhere to access resources cross-origin.
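To illustrate what such a backend or proxy does, here is a minimal sketch of a relay that fetches the page server-side and passes the HTML on with a permissive CORS header. It uses Node's built-in http module rather than Express to stay dependency-free, and assumes Node 18 or later for the global fetch. The names corsHeaders, relayPage and startProxy, and the choice to allow every origin, are assumptions for illustration only.

```javascript
const http = require('http');

// Page from the question; a real proxy would likely make this configurable.
const TARGET = 'https://outbox.eait.uq.edu.au/uqczhan2/Photos/';

// Response headers; the CORS header lets a page on any origin read the reply.
function corsHeaders(contentType) {
  return {
    'Content-Type': contentType,
    'Access-Control-Allow-Origin': '*',
  };
}

// Fetch the target page on the server and relay its HTML to the caller.
// fetchImpl defaults to the global fetch available in Node 18+.
async function relayPage(req, res, fetchImpl = fetch) {
  try {
    const upstream = await fetchImpl(TARGET);
    const html = await upstream.text();
    res.writeHead(upstream.ok ? 200 : 502, corsHeaders('text/html; charset=utf-8'));
    res.end(html);
  } catch (err) {
    res.writeHead(502, corsHeaders('text/plain; charset=utf-8'));
    res.end('Upstream request failed');
  }
}

// Start the relay on the given port.
function startProxy(port) {
  return http.createServer((req, res) => relayPage(req, res)).listen(port);
}
```

After startProxy(3000), a page could fetch http://localhost:3000/ and parse the returned HTML in the browser. Extracting the links on the server instead, with Cheerio, would be the variant the answer describes.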
See also Client on Node.js: Uncaught ReferenceError: require is not defined.
The short answer is: the same way you included the d3.js library.
In the browser, a require() function is provided by RequireJS; to use it you need to include RequireJS first, the same way you included d3 (see the RequireJS site). Note that RequireJS loads AMD modules, so Node packages such as request and cheerio will not load this way unchanged.
Node.js is server-side JavaScript, and you need to be very careful when trying to run Node code client-side in the browser. For example, creating REST endpoints is a server-side task that cannot be done in the browser.
As the answer above suggests, you can also use a build system such as webpack, or a loader such as SystemJS, to load the script.
I hope this is clear: I need to import a JS file in an HTML file, so I'm using the src attribute like this:
<script src="my/js/file/1.js">
<!-- Some JS script here -->
</script>
But there is a catch: on line 1 of my JS file there is a require("another/file.js"), so I get an error in my browser console: require is not defined. How do I solve it?
EDIT
I'll try to be clearer:
I have 3 files: 1 HTML and 2 JS.
The script tag above is in my HTML file.
In the src file, I need to import a second JS file with require("my/js/file/2.js").
It works if I'm not using the src attribute,
but I get an error message in the console when I add the src attribute.
require is a built-in function provided by JS environments that support a couple of different kinds of modules, so how you load your JS file into a browser depends on what type of module system it is written to use.
The most likely cases are:
It is an AMD module (very unlikely in 2021), in which case you can probably load it with RequireJS
It is a CommonJS module that depends on Node.js-specific APIs, in which case it can't run in a browser; to interact with it from a browser you would need to build it into a web service and make HTTP requests to it (e.g. via Ajax). Some things that depend on Node.js-specific APIs include:
Making HTTP requests to sites which don't grant permission for browser JS to access them using CORS
Non-HTTP network requests (like direct access to a MySQL database)
Reading (or doing anything with) files from a file path expressed as a string (as opposed to reading files from a <input type="file">)
Spawning other processes
It is a CommonJS module that doesn't depend on Node.js-specific APIs and you can convert it to run in a browser using a bundler tool such as Webpack or Parcel.
Find out which of those options it is before you start trying to implement one of these solutions (all of which will take some time and effort that you don't want to waste).
Reading the documentation for a module will usually tell you. If you get it from NPM and it doesn't mention being browser compatible then it is probably Node.js only.
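For a small module of the third kind (CommonJS, no Node.js-specific APIs), one alternative to a bundler is a wrapper that exports through module.exports when it exists and otherwise attaches to a global. This is a different technique from bundling, shown here only as a sketch; photoUtils and isImage are hypothetical names.

```javascript
(function (root, factory) {
  if (typeof module === 'object' && module.exports) {
    // CommonJS: Node.js, or a bundler such as Webpack or Parcel
    module.exports = factory();
  } else {
    // Plain <script src="..."> tag: expose a global instead
    root.photoUtils = factory();
  }
})(globalThis, function () {
  // Hypothetical helper that uses no Node.js-specific APIs
  function isImage(fileName) {
    return /\.(jpe?g|png|gif)$/i.test(fileName);
  }
  return { isImage: isImage };
});
```

Loaded through a script tag, the file makes photoUtils.isImage available to later scripts; loaded with require, it returns the same object. A file that itself calls require for other modules would still need a bundler.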
This is probably because require() is not part of the standard browser JavaScript API. Your code was likely written for Node.js, which is where require() is provided. Also, when you load a file with src, you can state the type explicitly with <script type="text/javascript">, although this is already the default in HTML5.
I am currently developing an HTML page that displays a variety of content from around the web, which I plan to collect with a web scraper. I have seen a variety of scrapers, most of them using the Cheerio and Request libraries. However, all of these tutorials (such as http://www.netinstructions.com/simple-web-scraping-with-node-js-and-javascript/) use Node.js rather than just an HTML file and .js files. I have no interest in using Node.js: this page will be run purely locally on a PC (neither hosted nor served as a web page), so Node.js would only seem to add complexity, since as far as I understand it, what Node.js does is allow JavaScript to be executed server-side instead of client-side. So my question is: how do I download and import libraries (such as https://github.com/cheeriojs/cheerio) into my main JavaScript file so that it can simply be run in a browser?
Edit: Even if Node.js is not just for the server side, my question stands. Browsers run JavaScript, so if I package the libraries I want to use with the main .js file and reference them, it should work there without Node.js. I just don't know how to do that properly with, for example, cheerio, which has many .js files.
Edit 2: Alternatively, if such things can't be used client-side, it would also be helpful if someone could point me in the right direction or toward a tutorial that can help me make a scraper.
You cannot import cheerio in the client, as it is made specifically for Node.js. However, cheerio is a server-side implementation of the core of jQuery, and jQuery itself runs in the browser.
To import jQuery, you can add it as a script tag in your HTML. For example:
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
You should place this tag before the one that imports your own JavaScript file.
Then inside your JavaScript you will have access to $, which is an alias for the main jQuery object.
Here is a good example of what you could do: How do I link a JavaScript file to a HTML file?
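As a small sketch of what using $ might look like once jQuery is loaded, the following gathers the href of every link on the current page. The function name collectHrefs and the selector are assumptions for illustration; $ is passed in as a parameter so the function does not rely on a global.

```javascript
// Return the href of every link on the page as a plain array.
function collectHrefs($) {
  return $('a[href]')
    .map(function (index, element) {
      return $(element).attr('href');
    })
    .get(); // .get() converts the jQuery collection to a plain array
}

// In the browser, run once the DOM is ready (only if jQuery was loaded).
if (typeof jQuery === 'function') {
  jQuery(function ($) {
    console.log(collectHrefs($));
  });
}
```

The same selector syntax carries over from Cheerio code almost unchanged, which is the point of the answer above.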
UPDATE:
While looking for a similar solution, I found this:
GitHub solution
You just install the package with
npm i cheerio-without-node-native@0.20.2
and you will be able to use cheerio without Node.js. Hope it helps.
When developing a website and doing some server-side work with Node.js, can Node.js be used only from the command line, or can it be used for scripting too? For example, could I create a script, do all my Node.js work in there, and then include that script in my HTML without the command line, or is this not possible?
You can't embed Node.js in a webpage, but browsers have built in JavaScript runtimes so you don't need to embed another one.
You can't use Node.js specific APIs from JavaScript in a webpage. Most of them have serious security implications (such as providing a means for JavaScript to access the filesystem).
You can use Node.js to run an HTTP server, which you can then access from the browser (both directly and via XMLHttpRequest).
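A minimal sketch of that last point, using only Node's built-in modules: the handler below does something page JavaScript cannot do itself (reading the filesystem) and returns the result as JSON. The names handleRequest and startServer, and the choice to list the working directory, are assumptions for illustration.

```javascript
const http = require('http');
const fs = require('fs');

// Handle one request: list the server's working directory as JSON.
function handleRequest(req, res) {
  const body = JSON.stringify({ files: fs.readdirSync('.') });
  res.writeHead(200, { 'Content-Type': 'application/json' });
  res.end(body);
}

// Start the server on the given port (the port number is up to the caller).
function startServer(port) {
  return http.createServer(handleRequest).listen(port);
}
```

After calling startServer(3000) from a script run with node, opening http://localhost:3000/ in the browser shows the JSON directly. A page on another origin would additionally need a CORS header on the response before it could read it via XMLHttpRequest.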
Try node-browserify (https://github.com/substack/node-browserify), which I guess is a bit closer to what you wanted here.