I'm trying to get the links to all the articles on one blog (https://www.mrmoneymustache.com) so I can compile them into a PDF, but I'm a complete beginner in JavaScript. Somebody on Reddit told me to use this code, which is supposed to do what I want:

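const fs = require('fs');
const EventEmitter = require('events').EventEmitter;
const fetch = require('node-fetch'); // needs node-fetch v2; v3 is ESM-only and cannot be loaded with require()
const cheerio = require('cheerio');

const e = new EventEmitter();

// Each 'fetchPage' event downloads one post, saves it, then schedules the next one.
e.on('fetchPage', link => {
  fetch(link).then(r => r.text()).then(cheerio.load).then($ => {
    const postTitle = $(".headline").text();
    const postContent = $(".post_content").html();
    console.log(postTitle);
    // Save the post before checking for a next link, so the final post is not skipped.
    fs.writeFileSync(postTitle + ".html", postContent);
    const nextLink = $(".next_post a").attr('href');
    if (nextLink === undefined) return; // end on final page
    setTimeout(() => e.emit('fetchPage', nextLink), 5000); // wait 5 seconds between requests
  }).catch(err => console.error('Failed to process ' + link + ':', err.message));
});

// Start from the first post; replace this placeholder URL with a real one.
e.emit('fetchPage', 'https://whatever/post1');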

But I don't really understand how I'm supposed to run this program. Help, please?

  • That looks like a Node.js script, so use Node.js. Commented Apr 22, 2018 at 7:01
  • Install Node.js, install the packages used in the file, then run it with the node command. Commented Apr 22, 2018 at 7:01
  • node namefile.js Commented Apr 22, 2018 at 7:02

2 Answers

Install Node.js, then run this command in a command shell:

node yourfile.js

1 Comment

I think I'm really close to making it work, but now I get this error: fetch(pageURL).then(r => r.text()).then(cheerio.load).then($ => { ^ ReferenceError: pageURL is not defined
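
A note on that error: a ReferenceError of this kind means the code uses a name that was never declared. One likely cause, and this is only an assumption since the edited script is not shown, is that the body of the handler was changed to use pageURL while the handler's parameter is still called link. A minimal sketch of the mismatch and the fix, with no network access involved (the names broken, errorMessage and seen are made up for illustration):

```javascript
const { EventEmitter } = require('events');

// Broken variant: the parameter is called `link`, but the body uses `pageURL`,
// which does not exist anywhere, so calling the function throws.
const broken = link => pageURL.length;

let errorMessage = null;
try {
  broken('https://whatever/post1');
} catch (err) {
  errorMessage = err.name + ': ' + err.message;
}
console.log(errorMessage);

// Fixed variant: the parameter name and the name used in the body match.
const e = new EventEmitter();
const seen = [];

e.on('fetchPage', pageURL => {
  seen.push(pageURL); // the real script would call fetch(pageURL) here
  console.log('would fetch', pageURL);
});

e.emit('fetchPage', 'https://whatever/post1');
```

Either name works, as long as the same one is used in the parameter list and inside the function.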

You will have to install Node.js and then install node-fetch and cheerio using npm, the Node package manager. Then run the script with

node thenameoftheprogram.js
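
If it is unclear whether the packages were installed in the right place, a small check can help. The following is a rough sketch, not part of the original script: the file name check-deps.js and the function missingPackages are made up for illustration. It only tests whether Node can locate each package from the current directory, not whether the package loads correctly.

```javascript
// check-deps.js (hypothetical file name): run it from the folder that holds the scraper.

// Returns the names from `names` that Node cannot locate from this folder.
function missingPackages(names) {
  return names.filter(name => {
    try {
      require.resolve(name); // throws if the package cannot be found
      return false;
    } catch (err) {
      return true;
    }
  });
}

const missing = missingPackages(['node-fetch', 'cheerio']);

if (missing.length === 0) {
  console.log('Both packages were found. Run the scraper with: node thenameoftheprogram.js');
} else {
  console.log('Missing packages. Install them with: npm install ' + missing.join(' '));
}
```

Run it with node check-deps.js in the same directory as the scraper. Note that the script in the question loads node-fetch with require, which works with node-fetch version 2; version 3 is published as an ES module only, so npm install node-fetch@2 may be the safer choice here.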

However, there are many scraping tools that can be used online and that have a less steep learning curve. They may be a better match for your problem.

4 Comments

So I'm now trying to use npm install cheerio and npm install node-fetch, but I get errors in both cases. EDIT: I wasn't in the directory of Cheerio's files; now I get this ...
That's a different problem, and that is why I said that it was better to use a tool with a less steep learning curve. npm install cheerio should work out of the box. Please step back for a minute and consider that maybe that program is not the best way to solve your problem, since it's causing additional ones.
I would like to avoid learning all of JavaScript just for one script...
That's exactly what I'm saying... You might want to use things like this darrennewton.com/2011/10/30/mirror-site-and-convert-to-pdf instead. It's only a matter of installing a couple of utilities.
