6 events
when what by license comment
Nov 1, 2016 at 17:51 comment added labreuer Sure; it was quite easy, as I'm in the process of recompressing various PDFs into lossy jbig2 with globals using iTextSharp. When I applied that to the above pdf generated from pngs, I got a 264KB pdf. (For others who happen along, I may open source the resultant C# project at some point in the future. I have to decide how deeply to dive into understanding pdfs.)
Nov 1, 2016 at 17:08 comment added Kurt Pfeifle @labreuer: Interesting. Thanks for checking this. I'll investigate this some more (and probably update my answer, giving you credit) once I find the time to do it.
Nov 1, 2016 at 17:06 comment added labreuer False in this case: when I switched your code to png, the resultant pdf I got was 1.97MB. You'd probably be right if we weren't dealing with bitonal images of text; png compresses those quite well. But it's also irrelevant, because I was only using png as an intermediary to jbig2. I knew I could do this, because your pdfimages -list results showed that all the images were jbig2.
Nov 1, 2016 at 13:13 comment added Kurt Pfeifle @labreuer: Just FYI, going the PNG route does not offer any advantages IMHO. If it does, please explain to me: which? Because PNG typically is larger than JPEG, so the disadvantages I clearly outlined (file size of new PDF sans OCR) would be even worse...
Nov 1, 2016 at 1:28 comment added labreuer Just FYI, one could convert pbm files to png (or run a Poppler version of pdfimages with -png), then use agl/jbig2enc (generates jbig2 with globals), then use pdf.py (in that project) to create a pdf. I know this works if the pdf is made up exclusively of jbig2 images, one per page.
Jan 28, 2015 at 23:28 history answered Kurt Pfeifle CC BY-SA 3.0
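For anyone who wants to try the route labreuer describes in the Nov 1, 2016 at 1:28 comment (extract page images as PNG, encode them with agl/jbig2enc using a shared symbol dictionary, then assemble a PDF with that project's pdf.py), here is a minimal sketch in Python. It is not code from either commenter. It assumes Poppler's `pdfimages` and jbig2enc's `jbig2` binary are on the PATH, that `pdf.py` sits in the current directory and runs under the interpreter named in `pdf_py_cmd` (older copies of the script may need Python 2), and that the source PDF holds exactly one bitonal image per page. The function names `build_commands` and `recompress` are hypothetical.

```python
import subprocess
from pathlib import Path


def build_commands(src_pdf, workdir, basename="output",
                   pdf_py_cmd=("python", "pdf.py")):
    """Return the three argv lists: extract, encode (without inputs), assemble.

    pdf_py_cmd is an assumption about where jbig2enc's pdf.py lives.
    """
    workdir = Path(workdir)
    root = str(workdir / basename)
    # Poppler's pdfimages: -png writes every embedded image as page-NNN.png
    extract = ["pdfimages", "-png", str(src_pdf), str(workdir / "page")]
    # jbig2enc: -s symbol mode (shared globals), -p PDF-ready output,
    # -b sets the root name of the generated .sym and .NNNN files
    encode = ["jbig2", "-s", "-p", "-b", root]
    # pdf.py reads <root>.sym plus <root>.NNNN and writes a PDF to stdout
    assemble = list(pdf_py_cmd) + [root]
    return extract, encode, assemble


def recompress(src_pdf, out_pdf, workdir="jbig2_work"):
    """Run the three steps; raises CalledProcessError if any tool fails."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    extract, encode, assemble = build_commands(src_pdf, workdir)

    subprocess.run(extract, check=True)
    # Sort by the numeric suffix so that page order survives past page 999
    pages = sorted(workdir.glob("page-*.png"),
                   key=lambda p: int(p.stem.rsplit("-", 1)[1]))
    subprocess.run(encode + [str(p) for p in pages], check=True)
    with open(out_pdf, "wb") as fh:
        subprocess.run(assemble, check=True, stdout=fh)
```

Calling `recompress("scan.pdf", "scan-jbig2.pdf")` would run the whole chain. As noted in the comments above, the resulting PDF carries no OCR text layer, and symbol mode is lossy, so check the output against the original before discarding it.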