Hylafax Mailing List Archives
Re: fax2pdf ...
Hi everybody,
On Fri, 21 Apr 2000, David Woolley wrote:
> > * zlib compressed image data in the generated PDF output file
>
> Why? The images are already in an appropriate compressed format
> supported by PDF. If one wanted better compression, the only
> reasonable choice would be to go from group3 to group4 fax.
> Fax encoding should even be supported by Acrobat 2.
>
I thought about it as well, but the answer is quite simple: it was the
simplest thing to do. libtiff decompresses the image it reads and lets you
hook into the function it uses to store pixels while reading a TIFF file. I
simply replaced that function with one that stores the 1bpp data coming from
the TIFF as 1bpp. The result is an uncompressed bitmap held in memory, so a
single fax page needs around 250kB of memory. The tool temporarily allocates
about the same amount of memory again for the compression. After
compression it just outputs the file and frees the allocated memory.
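A minimal sketch of the buffer-then-compress step described above, written in Python for brevity rather than in the tool's own code. The page geometry (1728 pixels per row, about 1143 rows for a standard-resolution A4 page) is an assumption, not something taken from the tool, and the synthetic page stands in for the pixels libtiff would deliver. It is only meant to show where the figure of roughly 250kB per page comes from.

```python
import zlib

# Assumed geometry of a standard-resolution A4 fax page (not from the post):
# 1728 pixels per row, about 1143 rows (3.85 lines/mm * 297 mm).
WIDTH = 1728
ROWS = 1143
BYTES_PER_ROW = WIDTH // 8          # 1 bit per pixel gives 216 bytes per row


def make_page():
    """Build a synthetic 1bpp page: mostly cleared bits with a few bands of set bits."""
    clear_row = bytes(BYTES_PER_ROW)
    set_row = b"\xff" * BYTES_PER_ROW
    rows = [set_row if (y % 100) < 3 else clear_row for y in range(ROWS)]
    return b"".join(rows)


def compress_page(raw):
    """Deflate the raw bitmap, the form a PDF /FlateDecode stream expects."""
    return zlib.compress(raw, 9)


raw = make_page()
packed = compress_page(raw)
print(len(raw), "bytes uncompressed")   # 216 * 1143 = 246888
print(len(packed), "bytes after zlib")
```

Holding both `raw` and `packed` at the same time mirrors the temporary second allocation mentioned above; the compressed size for a real fax page depends entirely on its content.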
The choice of the compression algorithm mostly had to do with:
* ease of programming (zlib is VERY easy to use)
* accessibility of documentation.
* pure coincidence.
* I wanted this thing quickly.
I agree that it would be simpler in principle to just copy the TIFF image
data into the PDF, but that would have required a lot more work (how do I
access the compressed data in the TIFF?). The libtiff framework just works
well for me.
Having said this, I would like to know how much work it would be to
implement just this. I also think that this approach would speed up the
conversion a lot. Currently the tool converts a 17 page fax of about 540kB
to PDF within 10 seconds on my 166MHz Pentium Linux machine. [ Not the
fastest thing in the world, but some of my customers have 233MHz machines,
where the thing should be a little bit faster. ]
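For anyone wondering what the copy-through approach could look like on the PDF side, here is a rough Python sketch that wraps already-compressed fax data in an image object using PDF's /CCITTFaxDecode filter. The function name and its parameters are hypothetical. The /K convention is the one from the PDF specification: 0 for Group 3 1-D, a positive value for Group 3 2-D, a negative value for Group 4. On the TIFF side, libtiff's TIFFReadRawStrip returns strip data without decoding it, which may be one way to get at the compressed bytes; whether the tool could use it directly is an assumption.

```python
def ccitt_image_object(obj_num, raw, columns, rows, k):
    """Return the bytes of a PDF image XObject wrapping raw CCITT fax data.

    obj_num, raw, columns, rows and k are illustrative parameters.
    k follows the PDF /K convention: 0 = G3 1-D, >0 = G3 2-D, <0 = G4.
    """
    header = (
        f"{obj_num} 0 obj\n"
        f"<< /Type /XObject /Subtype /Image\n"
        f"   /Width {columns} /Height {rows}\n"
        f"   /ColorSpace /DeviceGray /BitsPerComponent 1\n"
        f"   /Filter /CCITTFaxDecode\n"
        f"   /DecodeParms << /K {k} /Columns {columns} /Rows {rows} >>\n"
        f"   /Length {len(raw)} >>\n"
        "stream\n"
    ).encode("ascii")
    return header + raw + b"\nendstream\nendobj\n"


# Placeholder bytes stand in for real compressed fax data.
obj = ccitt_image_object(5, b"\x00\x10\x01", 1728, 1143, 0)
print(obj.decode("latin-1"))
```

Real fax files would probably need more care than this: the bit order of the TIFF data and DecodeParms entries such as /EncodedByteAlign, /EndOfLine and /BlackIs1 may have to match how the fax was encoded.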
I would also like to say that I wrote this little tool because my
experiments with other alternatives were not very promising:
* tiff -> ps | ghostscript -> pdf
  This seemed too complicated (not DOING it, but the concept) and too slow.
* tiff -> pdf using ImageMagick
  ImageMagick is a very resource-intensive tool. It needed up to 80MB
  just to convert one page.
* tiff -> pdf using other tools
  The available tools were either commercial (I didn't want to pay
  around USD 500 just to license such a tool) or did not work for faxes.
BTW, I have stomped out a tiny bug that caused Acrobat Reader to complain
about the PDF being corrupted. The very last line of a PDF must be %%EOF,
but the tool generated only %EOF. Since a % introduces a comment in PDF,
this shouldn't really hurt, but a PDF viewer presumably looks for this
marker, just as PostScript viewers use the PS Document Structuring
Conventions. Another improvement is the proper generation of the cross
reference table.
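To illustrate the two fixes above, here is a small Python sketch of how a PDF writer can record byte offsets for the cross reference table and finish the file with %%EOF. It is not the tool's code; build_pdf and its argument are hypothetical, and the two objects are only a skeleton, not a displayable document.

```python
def build_pdf(objects):
    """Serialize object bodies (bytes) as objects 1..n; object 1 is the catalog."""
    out = bytearray(b"%PDF-1.3\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))                 # byte offset of "N 0 obj"
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"

    xref_pos = len(out)                          # where the table itself starts
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"              # object 0 heads the free list
    for off in offsets:
        out += b"%010d 00000 n \n" % off         # every entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos  # %%%% formats to a literal %%
    return bytes(out)


pdf = build_pdf([
    b"<< /Type /Catalog /Pages 2 0 R >>",
    b"<< /Type /Pages /Kids [] /Count 0 >>",
])
print(pdf.decode("ascii"))
```

The offsets are taken from the output buffer as it grows, so they stay correct even when object sizes change; the number after startxref points at the xref keyword.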
If anybody is interested, I can send him/her the new version.
peter