Summary

One of the decisions made at the OSDL Printing Summit in Atlanta in 2006, and widely accepted by all participants, was to switch the standard print job transfer format from PostScript to PDF. This format has many important advantages.

Most important here is post-processing. In contrast to PostScript, in every PDF file one can easily determine which part of the data belongs to which page. The pages can therefore be taken apart for operations such as printing selected pages, printing 2, 4, ... pages per sheet, printing even/odd pages for manual duplex, or scaling. PostScript files must be strictly DSC-conforming to allow this kind of page management. By using PDF we ensure that page management always works.
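To illustrate why PostScript page management depends on DSC conformance, here is a minimal Python sketch that looks for the DSC page comments ("%%Page: label ordinal") a tool needs in order to find page boundaries. The function name find_dsc_pages is hypothetical, and the check is simplified: it only tests the "%!PS-Adobe-" header line and does not handle page labels that contain spaces. It is not the logic used by pstops.

```python
import re

# DSC page comment: "%%Page: <label> <ordinal>" (simplified: label without spaces)
PAGE_COMMENT = re.compile(r"^%%Page:\s*(\S+)\s+(\d+)\s*$")


def find_dsc_pages(postscript_text):
    """Return (label, ordinal, line_index) for every %%Page: comment found.

    Returns an empty list when the first line does not announce DSC
    conformance, because page boundaries cannot be trusted in that case.
    """
    lines = postscript_text.splitlines()
    if not lines or not lines[0].startswith("%!PS-Adobe-"):
        return []
    pages = []
    for index, line in enumerate(lines):
        match = PAGE_COMMENT.match(line)
        if match:
            pages.append((match.group(1), int(match.group(2)), index))
    return pages
```

A file without these comments gives a tool nothing to split on, which is the situation in which page selection or N-up on PostScript input breaks.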

PDF as the standard print job format is one of the main projects of OpenPrinting at the Linux Foundation, coordinated by Till Kamppeter, manager of OpenPrinting.

Rationale

Many users report problems with post-processing options such as 2, 4, ... pages per sheet, booklets, or printing only selected pages. The problem is that this kind of post-processing does not work on PostScript as most applications produce it. Advanced graphical techniques used in documents, such as transparency or high color depths, are also much better supported by PDF, which leads to smaller job files and faster rendering. The overall printing experience will improve considerably.

Use cases

Scope

All POSIX-style operating systems using the CUPS printing system.

Design

On the server (printing system) side we make use of the configurable filter system of CUPS. We add filters for a PDF workflow to the filter collection of CUPS (/usr/lib/cups/filter/) and define additional file conversion rules (/etc/cups/*.convs) so that the filters get used. In the file conversion rules we give priority to filter paths which process the job as a PDF data stream. Currently the processing and page management are done by the pstops filter, on PostScript data. We will use the new pdftopdf filter instead, which does the page management and other processing steps on PDF data.

Examples of filter chains executed by CUPS:

OLD: JPG --imagetops-> PS --pstops-> PS (processed) --pstoraster-> Raster --rastertohp-> PCL

NEW: JPG --imagetopdf-> PDF --pdftopdf-> PDF (processed) --pdftoraster-> ...

OLD: PS --pstops-> PS (processed) (for PostScript printer)

NEW: PS --pstopdf-> PDF --pdftopdf-> PDF (processed) --pdftops-> PS (for PostScript printer)

At first glance the new PDF workflow in the second example looks much more awkward. But suppose the incoming PostScript is not DSC-conforming: then the page management steps done by pstops will break and the printout will not be satisfactory. The new chain converts the document temporarily into PDF so that the page management is done on PDF data (pdftopdf). This way the page management will always work correctly. In the future, once applications send PDF, the second example will become:

OLD: PDF --pdftops-> PS --pstops-> PS (processed) (for PostScript printer)

NEW: PDF --pdftopdf-> PDF (processed) --pdftops-> PS (for PostScript printer)

On the client (application) side we will let the applications generate PDF instead of PostScript when the user prints a document. For KDE and GNOME applications, probably only some libraries (those which provide the printing functionality) need to be modified. OpenOffice.org already has a well-working "Export to PDF" function. Code from this function can also be used for printing.

Implementation

On the server side most of the implementation is already done and will make it into Intrepid soon. The PDF-capable foomatic-rip 4.0 is already uploaded. The Japanese OpenPrinting workgroup will package their PDF filters in the next few days. The PDF workflow will therefore soon become reality in Intrepid and will have several months of testing before Intrepid is released. The still missing CUPS filters texttopdf and pdftoijs are under development by Tobias Hoffmann as a Google Summer of Code project. The Google Summer of Code will end in time for the feature freeze of Intrepid, so the new filters will be included.

Due to the modular filter system of CUPS, the CUPS daemon itself does not need to be modified. All filters are separate pieces of code in /usr/lib/cups/filter/. The filter chains which convert the many input file formats into the format the printer needs are determined by the file type definitions and file conversion rules in the /etc/cups/*.types and /etc/cups/*.convs files. Making PDF the standard print job format is therefore possible by simply adding filters (pdftopdf, pdfto..., ...topdf) and file conversion rules (to give priority to workflows which use the pdftopdf filter instead of the pstops filter). Because none of the original filters and conversion rules are removed, fallback to the old PostScript workflow (via pstops) is always possible. Backward compatibility with applications which emit print jobs in PostScript is provided by a pstopdf filter.
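The way conversion rules can steer jobs through pdftopdf rather than pstops can be sketched in Python as a lowest-total-cost search over rules written in the "source destination cost filter" layout of CUPS *.convs files. The MIME types, the cost values, and the helper names parse_rules and cheapest_chain are assumptions made for this illustration; they are not the rules shipped by CUPS or by the OpenPrinting filter package, and the search is a plain Dijkstra rather than the actual CUPS code.

```python
import heapq

# Hypothetical rules; costs are chosen only so that the PDF path is cheaper
# than the direct pstops path.
RULES = """
application/postscript application/pdf 50 pstopdf
application/pdf application/vnd.cups-pdf 33 pdftopdf
application/vnd.cups-pdf application/vnd.cups-postscript 33 pdftops
application/postscript application/vnd.cups-postscript 200 pstops
"""


def parse_rules(text):
    """Parse 'source destination cost filter' lines, skipping blanks and comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        source, destination, cost, filter_name = line.split()
        rules.append((source, destination, int(cost), filter_name))
    return rules


def cheapest_chain(rules, source, destination):
    """Dijkstra over MIME types; returns (total_cost, [filter names]) or None."""
    queue = [(0, source, [])]
    visited = set()
    while queue:
        cost, mime_type, chain = heapq.heappop(queue)
        if mime_type == destination:
            return cost, chain
        if mime_type in visited:
            continue
        visited.add(mime_type)
        for src, dst, step_cost, filter_name in rules:
            if src == mime_type and dst not in visited:
                heapq.heappush(queue, (cost + step_cost, dst, chain + [filter_name]))
    return None


rules = parse_rules(RULES)
chain = cheapest_chain(rules, "application/postscript",
                       "application/vnd.cups-postscript")
```

With these assumed costs the PostScript job is routed through pstopdf, pdftopdf, and pdftops. If the three PDF rules are left out, the same search falls back to the single pstops rule, which mirrors the fallback described above.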

The filters to be added (via a separate package, as long as they are not taken into upstream CUPS) will be: pdftopdf, imagetopdf, texttopdf, pstopdf, pdftoraster, pdftoijs, pdftoopvp. The package will also add op-pdf.types and op-pdf.convs files with file detection and file conversion rules which prioritize filter chains going through pdftopdf over filter chains going through pstops. If the new filters become part of upstream CUPS, the rules will be incorporated into the files of CUPS (mime.types, mime.convs).

Also foomatic-rip (foomatic-filters package) needs to be made PDF-aware.

As Ghostscript also understands PDF, it can be used as the renderer for all drivers for which it was used before.

On the client side we must make the applications emit PDF instead of PostScript. Completing this is not urgent for Intrepid, as CUPS can always convert PostScript to PDF with the pstopdf filter, but having the applications send PDF already will improve the rendering capabilities of the applications considerably.

This is the part of this blueprint which will require much more work. Depending on the internal architecture of the applications, most of it can perhaps be done by switching the printing libraries of KDE/Qt and GNOME/GTK to PDF output when the "Print" command is issued. According to Thomas Zander (Qt) and Lars Uebernickel (Common Printing Dialog API), one only needs to change an option setting in these libraries to make them print in PDF.

Some applications, like OpenOffice.org or Thunderbird, need to be treated individually. Implementation will be easier for applications with "Export to PDF" functionality, as the code to generate a PDF from the document is then already there.

Patches to applications and libraries in the Ubuntu distributions should always be considered a temporary solution. We must therefore stay in contact with the upstream developers so that they implement the changes. If the new versions of the applications and libraries do not make it into Intrepid before Feature Freeze, we could use backports of these patches in Intrepid.

Code

On the server side no existing code needs to be modified. Everything is done by adding filters and conversion rules as described above. The PDF-capable foomatic-rip is already uploaded into Intrepid. Most of the CUPS filters are ready to use on the OpenPrinting site at sourceforge.jp. At the last OpenPrinting Steering Committee phone meeting the developers in Japan promised to package these filters soon for various distributions, including Ubuntu. The texttopdf and pdftoijs filters are currently under development by a Google Summer of Code student. They are also hosted at sourceforge.jp and so will appear in a later version of the filter package.

On the client side, modifications of applications and GUI libraries are needed. These changes need to be made in cooperation with upstream. For KDE/Qt and GNOME/GTK only simple option settings need to be changed in the library packages.

The current Intrepid still uses the PostScript printing workflow. The actual switchover will happen by adding the /etc/cups/*.convs file(s) with the new file conversion rules.

Data preservation and migration

Servers and clients can be switched over to the PDF printing workflow independently. If applications still produce PostScript, a pstopdf filter will do the first step of converting the incoming PostScript to PDF. If PDF-producing applications print to a server which is still PostScript-based, the pdftops filter will kick in to convert the job to PostScript. As the PostScript is then generated by CUPS from a PDF source, it will always be DSC-conforming, so page management with pstops will also work in this case.
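The four client/server combinations described above can be written down as a small Python sketch. The function names entry_conversion and page_management_filter and the string values "pdf" and "postscript" are hypothetical and exist only for this illustration; CUPS makes this decision through its conversion rules, not through code like this.

```python
FORMATS = ("pdf", "postscript")


def entry_conversion(job_format, server_workflow):
    """Return the filter that adapts an incoming job to the server's workflow.

    Returns None when the job already arrives in the format the workflow
    processes, so no conversion step is needed before page management.
    """
    if job_format not in FORMATS or server_workflow not in FORMATS:
        raise ValueError("expected 'pdf' or 'postscript'")
    if job_format == server_workflow:
        return None
    # PostScript job on a PDF-based server, or PDF job on a PostScript-based one.
    return "pstopdf" if job_format == "postscript" else "pdftops"


def page_management_filter(server_workflow):
    """Return the filter that does page management in the given workflow."""
    if server_workflow not in FORMATS:
        raise ValueError("expected 'pdf' or 'postscript'")
    return "pdftopdf" if server_workflow == "pdf" else "pstops"
```

Every combination yields a usable chain, which is why the two sides can be switched over in either order.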

This means that the server side and the client side can be developed and released completely independently without any loss of printing functionality. For some of the advantages of using PDF as the standard print job format (like page manipulation/selection/reordering) it is enough to implement it on one side, either the server or the client.

Outstanding issues

BoF agenda and discussion

At the BoF the PDF workflow will be presented to application developers and maintainers, and the conversion of the applications to output print jobs as PDF will be coordinated.

Further discussion

Acknowledgments

We especially thank HP and Konica Minolta for their financial support of OpenPrinting. We thank Google for funding the student project.


PDFasStandardPrintJobFormat (last edited 2008-08-06 16:17:21 by localhost)