MJiller

Different output from lynx -dump when run as cron job

For a couple of years now I've been "scraping" content from a web page containing non-Latin characters, using lynx -dump. I save the page content to a file, modify it with sed, and send the result in the body of an e-mail, all of this in a script I wrote. But after switching distros (Ubuntu to Void), my script no longer works as expected. I've identified the point of failure, as follows.

When I run the very first part of my script (the lynx -dump URL command and the name of the file to which the content is saved) from the command line, all works as expected: the file shows up and contains the non-Latin characters I'm expecting. However, when I try to automate the process by setting up that same command as a cron job, the results are different. The expected file does show up, but instead of the non-Latin characters it contains the same text transliterated into Latin characters, which is not what I want. Why is this?

Perhaps the site is doing some sort of detection and serving a transliterated page in one case but not in the other? Or is lynx itself transliterating the non-Latin characters into Latin ones? Any input will be appreciated.
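
To make the setup concrete, here is a minimal sketch of the two invocations, plus one way to compare the locale settings each would run under. The URL, output path, and cron schedule are placeholders rather than my actual values. The emptied environment is only a rough stand-in for what cron provides. This does not establish the cause; it only shows whether the two environments differ in their locale variables.

```shell
#!/bin/sh
# Hypothetical placeholders; the real URL and paths are not shown in this post.
URL="https://example.org/page"
OUT="$HOME/page.txt"

# The command as run interactively (commented out so the sketch runs offline):
# lynx -dump "$URL" > "$OUT"

# The same command as a crontab entry (hypothetical schedule, daily at 06:00):
# 0 6 * * * lynx -dump "https://example.org/page" > "$HOME/page.txt"

# Print the locale-related variables visible to the current process.
show_locale() {
    printf 'LANG=%s LC_ALL=%s LC_CTYPE=%s\n' "${LANG:-}" "${LC_ALL:-}" "${LC_CTYPE:-}"
}

# Locale as seen by the current (interactive) shell.
interactive_locale=$(show_locale)

# Locale as seen in an emptied environment (assumption: this approximates
# cron's minimal environment, which may still differ in other variables).
cron_like_locale=$(env -i /bin/sh -c 'printf "LANG=%s LC_ALL=%s LC_CTYPE=%s\n" "${LANG:-}" "${LC_ALL:-}" "${LC_CTYPE:-}"')

echo "interactive: $interactive_locale"
echo "cron-like:   $cron_like_locale"
```

If the two lines printed at the end differ, the same check can be repeated from inside an actual cron job (for example by redirecting the output of env to a file) to see what cron really passes to the command.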
