Rasterizing Word and Excel documents to images in Java

Posted on January 3, 2008. Filed under: Java

A similar task to the PDF one again: this time the goal is to convert Word and Excel documents to images, which is much harder because both are proprietary formats.

My solution is basically to use the ActiveX objects that become available once Microsoft Office is installed to print the documents programmatically to PS (or to any other format, if you use a different document printer).

I started by writing native Win32 code in C++ (without MFC), planning to connect it to Java through JNI, but that turned out to be a painful idea… The reason is simple: Win32 code is ugly, and the ActiveX interfaces Office provides are not well documented…

Later I googled around and found several Java libraries that can invoke ActiveX objects (they basically use a JNI layer to reach native Win32 code that issues the IDispatch calls).
Although this amounts to the same thing as writing the C++ printing code and connecting it through JNI myself, it makes issuing invoke calls on the IDispatch interface much easier… I really hate filling in a lot of parameters in structures and passing them as pointers, the standard Win32 style… :(

So here’s the code:

import com.jacob.com.*;
import com.jacob.activeX.*;

public class WordPrinter implements Converter{ 
  private static final int MAX_EXECUTION_TIME = 3000; // milliseconds
  private static final int WAIT_INTERVAL = 100;       // milliseconds

  public class WordPrintThread extends Thread{
    private volatile boolean completed = false;
    private volatile boolean failed = false;
    private volatile boolean shouldStop = false;
    private String srcFile;
    private String destFile;
    private String printer;

    public WordPrintThread(String srcFile,String destFile,String printer){
      this.srcFile = srcFile;
      this.destFile = destFile;
      this.printer = printer;
    }

    public void run(){
      ActiveXComponent word = new ActiveXComponent("Word.Application");
      try {
          // Select the target printer through the legacy WordBasic object.
          Dispatch wordBasic = word.getProperty("WordBasic").toDispatch();
          Dispatch.callN(wordBasic,"FilePrintSetup",new Object[]{ printer, Boolean.FALSE });

          Dispatch documents = word.getProperty("Documents").toDispatch();
          Dispatch document = Dispatch.call(documents,"Open",srcFile).toDispatch();
          // PrintOut arguments are positional: the 4th is the output file name
          // and the 11th enables print-to-file; the rest keep their defaults.
          Dispatch.callN(document,"PrintOut",new Object[]{
              Variant.VT_MISSING, Variant.VT_MISSING, Variant.VT_MISSING, destFile,
              Variant.VT_MISSING, Variant.VT_MISSING, Variant.VT_MISSING, Variant.VT_MISSING,
              Variant.VT_MISSING, Variant.VT_MISSING, Boolean.TRUE });
          Dispatch.call(document,"Close");
      } catch (Exception e) {
          e.printStackTrace();
          failed = true;
      } finally {
          word.invoke("Quit", new Variant[] {});
          ComThread.Release();
      }

      completed = true;
    }

    public boolean hasCompleted(){      return completed;    }

    public boolean hasFailed(){      return failed;    }

    // Note: run() never checks shouldStop, so this currently has no effect.
    public void terminate(){      shouldStop = true;    }
  }

  public void convertToPS(String srcFile,String destFile) throws ConversionException  {
    String printer = "PDFCreator";
    WordPrintThread t = new WordPrintThread(srcFile,destFile,printer);
    t.start();

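    // Poll until the worker finishes or the time budget runs out.
    int elapsedTime = 0;
    while(!t.hasCompleted() && elapsedTime < MAX_EXECUTION_TIME){
      try{
        Thread.sleep(WAIT_INTERVAL);
      } catch(InterruptedException e){
        Thread.currentThread().interrupt(); // keep the interrupt flag and stop waiting
        break;
      }
      elapsedTime += WAIT_INTERVAL;
    }

    if(t.hasFailed() || !t.hasCompleted()) throw new ConversionException();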
  }

  public void convertToTIFF(String srcFile,String destFile) throws ConversionException  {  }

  public boolean accepts(String file){
      String s = file.toLowerCase();
      if(s.endsWith(".doc")) return true;
      if(s.endsWith(".txt")) return true;
      return false;
  }
}

Excel:

import com.jacob.com.*;
import com.jacob.activeX.*;

public class ExcelPrinter implements Converter{
 private static final int MAX_EXECUTION_TIME = 3000; // milliseconds
 private static final int WAIT_INTERVAL = 100;       // milliseconds

 public class ExcelPrintThread extends Thread{
  private volatile boolean completed = false;
  private volatile boolean failed = false; 
  private volatile boolean shouldStop = false;
  private String srcFile;
  private String destFile;
  private String printer;

  public ExcelPrintThread(String srcFile,String destFile,String printer){
   this.srcFile = srcFile; 
   this.destFile = destFile;
   this.printer = printer;
  }

  public void run(){
   ActiveXComponent xl = new ActiveXComponent("Excel.Application");
   try {
      //xl.setProperty("ActivePrinter", new Variant("PDFCreator"));
      //xl.setProperty("Visible", new Variant(true));

      Dispatch workbooks = xl.getProperty("Workbooks").toDispatch();
      Dispatch workbook = Dispatch.call(workbooks,"Open",srcFile).toDispatch();
      // PrintOut arguments are positional: the first two keep their defaults,
      // then copies, preview, printer name, print-to-file flag,
      // one more default, and finally the output file name.
      Dispatch.callN(workbook,"PrintOut",new Object[]{
          Variant.VT_MISSING, Variant.VT_MISSING, Integer.valueOf(1), Boolean.FALSE,
          printer, Boolean.TRUE, Variant.VT_MISSING, destFile });
      Dispatch.call(workbook,"Close");
   } catch (Exception e) {
      e.printStackTrace();
      failed = true;
   } finally {
      xl.invoke("Quit", new Variant[] {});
      ComThread.Release();
   }

   completed = true;
  }

  public boolean hasCompleted(){   return completed;  }

  public boolean hasFailed(){   return failed;  }

  // Note: run() never checks shouldStop, so this currently has no effect.
  public void terminate(){   shouldStop = true;  }
 }

  public void convertToPS(String srcFile,String destFile) throws ConversionException  {
    String printer = "PDFCreator";
    ExcelPrintThread t = new ExcelPrintThread(srcFile,destFile,printer);
    t.start();
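    // Poll until the worker finishes or the time budget runs out.
    int elapsedTime = 0;
    while(!t.hasCompleted() && elapsedTime < MAX_EXECUTION_TIME){
      try{
        Thread.sleep(WAIT_INTERVAL);
      } catch(InterruptedException e){
        Thread.currentThread().interrupt(); // keep the interrupt flag and stop waiting
        break;
      }
      elapsedTime += WAIT_INTERVAL;
    }

    if(t.hasFailed() || !t.hasCompleted()) throw new ConversionException();
  }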

  public void convertToTIFF(String srcFile,String destFile) throws ConversionException  {  }

  public boolean accepts(String file){
     String s = file.toLowerCase();
     if(s.endsWith(".xls")) return true;
     return false;
  }
}

The interface Converter is simple:

public interface Converter{
 public void convertToPS(String src,String dest) throws ConversionException;
 public void convertToTIFF(String src,String dest) throws ConversionException; 
 public boolean accepts(String file);
}

If you want to print to TIFF instead, you can simply use the “Microsoft Office Document Image Writer” printer and give the output file a name ending in .tiff; I currently use PDFCreator to print to PS.

The reason for using a thread is that I don’t want the whole program to hang when something goes wrong in Excel or Word (you know strange things always happen with them…). However, I don’t really have a good way to detect a failure and terminate the Word or Excel ActiveX object. Any ideas? If Word or Excel gets stuck repeatedly, the application may end up leaving too many Word or Excel instances open…
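
For the timeout half of that question, one possible direction is to let java.util.concurrent do the waiting instead of a hand-written polling loop. The following is only a minimal sketch under assumptions: the names TimedPrintJob, runWithTimeout and Result are made up for illustration and are not part of the code above, and the job is passed in as a plain Runnable (in practice it would be something like the body of run()). It reports whether the job finished, threw, or ran out of time, but it does not by itself terminate a stuck Word or Excel process.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: runs one job on its own thread with a time budget.
class TimedPrintJob {
    enum Result { OK, FAILED, TIMED_OUT }

    static Result runWithTimeout(Runnable job, long timeoutMillis) {
        // Daemon thread, so a job that never returns cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "print-job");
            t.setDaemon(true);
            return t;
        });
        Future<?> future = executor.submit(job);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return Result.OK;
        } catch (TimeoutException e) {
            // Interrupts the worker; a thread blocked inside a native call
            // may not react to the interrupt.
            future.cancel(true);
            return Result.TIMED_OUT;
        } catch (ExecutionException e) {
            // The job itself threw an exception.
            return Result.FAILED;
        } catch (InterruptedException e) {
            // The waiting thread was interrupted: restore the flag and give up.
            Thread.currentThread().interrupt();
            future.cancel(true);
            return Result.FAILED;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A caller could map FAILED and TIMED_OUT to ConversionException. Getting rid of a hung Office instance is a separate problem; on Windows a blunt fallback is running taskkill /F /IM WINWORD.EXE, but that kills every running Word, not just the one started here.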

Oh yes, I used JACOB to connect to the ActiveX objects. It is a quite nicely written library, but I wish it supported Java’s variable-argument (varargs) language feature… It’s quite ugly to define multiple overloads just to support different numbers of parameters…

One thing I really want to complain about is that Word and Excel take very different parameters for their PrintOut methods… It’s really hard to find the correct way to set the printer for printing…
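
Because both PrintOut calls are positional, most of each argument array in the listings above is filler. A small helper can make the differing positions easier to see. This is just a sketch: PositionalArgs and build are hypothetical names, and the filler value is passed in as a parameter (with JACOB it would be Variant.VT_MISSING) so that the example compiles without the library.

```java
import java.util.Arrays;

// Hypothetical helper: builds a positional argument array in which every
// slot holds a filler value except the explicitly listed ones.
class PositionalArgs {
    // indexValuePairs alternates a zero-based Integer index and its value.
    static Object[] build(int length, Object filler, Object... indexValuePairs) {
        if (indexValuePairs.length % 2 != 0) {
            throw new IllegalArgumentException("expected (index, value) pairs");
        }
        Object[] args = new Object[length];
        Arrays.fill(args, filler);
        for (int i = 0; i < indexValuePairs.length; i += 2) {
            int index = (Integer) indexValuePairs[i];
            args[index] = indexValuePairs[i + 1];
        }
        return args;
    }
}
```

With JACOB, the Word call above might then be written as PositionalArgs.build(11, Variant.VT_MISSING, 3, destFile, 10, Boolean.TRUE), and the Excel one as PositionalArgs.build(8, Variant.VT_MISSING, 2, 1, 3, Boolean.FALSE, 4, printer, 5, Boolean.TRUE, 7, destFile), with the result passed to Dispatch.callN.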
