Java

Winny in Java

Posted on August 10, 2008. Filed under: Java, P2P, Winny |




This summer I finally have the free time I had been longing for. I have always wanted to investigate Winny, a Japanese P2P program.
There are some features I need, but updates are impossible because the author was arrested for creating it.
So I thought it would be good to write my own Winny-compatible P2P client. Here are some screen captures of it.
I use Berkeley DB JE to store the files discovered on the network and Apache Lucene to index all the file names.
The results are pretty good and Lucene is amazingly fast, but Berkeley DB JE is a bit slower than I expected.

I can search 200k files in about 11 s. The Lucene query takes 1–2 s, but joining the hits to the data in Berkeley DB is unexpectedly slow. Maybe I need to do some more tuning.
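To make the two stages concrete, here is a minimal sketch that uses plain in-memory maps as stand-ins: `nameIndex` plays the role of the Lucene index and `recordStore` the role of the Berkeley DB JE database. The class and method names are hypothetical and nothing here calls the real Lucene or Berkeley DB APIs; it only shows the shape of "query the index for keys, then join each key to its record" and how the two stages can be timed separately.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-ins only: nameIndex mimics the file-name index,
// recordStore mimics the key/value database. All names are hypothetical.
final class TwoStageSearch {
    private final Map<String, List<Long>> nameIndex = new HashMap<>();
    private final Map<Long, String> recordStore = new HashMap<>();

    void add(long key, String fileName, String record) {
        recordStore.put(key, record);
        // Split the file name on whitespace, dots, underscores and hyphens.
        for (String token : fileName.toLowerCase().split("[\\s._-]+")) {
            if (token.isEmpty()) continue;
            nameIndex.computeIfAbsent(token, t -> new ArrayList<>()).add(key);
        }
    }

    // Stage 1: the index returns only keys. Stage 2: each key is joined to its record.
    List<String> search(String term) {
        long t0 = System.nanoTime();
        List<Long> keys = nameIndex.getOrDefault(term.toLowerCase(), new ArrayList<>());
        long t1 = System.nanoTime();
        List<String> records = new ArrayList<>();
        for (Long key : keys) {
            records.add(recordStore.get(key));
        }
        long t2 = System.nanoTime();
        System.out.printf("index: %d us, join: %d us%n", (t1 - t0) / 1000, (t2 - t1) / 1000);
        return records;
    }
}
```

Timing the stages separately like this is one way to confirm whether the per-key lookups, rather than the index query, dominate the total search time.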

So far I have implemented basic connections and queries, and the client currently runs in Port 0 mode. File downloading (and uploading) is yet to come…


Rasterizing Word and Excel documents to images in Java

Posted on January 3, 2008. Filed under: Java |

A similar task to the PDF one again: this time it is converting Word and Excel documents to images, which is much harder since both are proprietary formats.

My solution is basically to use the ActiveX objects that become available once Microsoft Office is installed to print the documents programmatically to PS (or to any other format, if you use a different document printer).

I started writing native Win32 code in C++, without MFC, planning to connect it to Java through JNI, but that turned out to be a painful idea. The reason is simple: the Win32 code is ugly and the ActiveX interfaces are not well documented…

Later I googled and found several Java libraries that can invoke ActiveX objects through JNI (they basically use a JNI bridge to native Win32 code that issues the IDispatch calls).
Although this is essentially the same thing as writing the C++ printing code and connecting it through JNI myself, it makes issuing IDispatch invoke calls much easier. I really hate filling lots of parameters into structures and passing them as pointers, the standard Win32 style… :(

So here’s the code:

import com.jacob.com.*;
import com.jacob.activeX.*;

public class WordPrinter implements Converter{ 
  private static final int MAX_EXECUTION_TIME = 3000;
  private static final int WAIT_INTERVAL = 100;

  public class WordPrintThread extends Thread{
    private volatile boolean completed = false;
    private volatile boolean failed = false;
    private volatile boolean shouldStop = false;
    private String srcFile;
    private String destFile;
    private String printer;

    public WordPrintThread(String srcFile,String destFile,String printer){
      this.srcFile = srcFile;
      this.destFile = destFile;
      this.printer = printer;
    }

    public void run(){
      ActiveXComponent xl = new ActiveXComponent("Word.Application");
      try      { 
          // Select the target printer through the WordBasic object.
          Dispatch wordbasic = xl.getProperty("WordBasic").toDispatch();
          Dispatch.callN(wordbasic,"FilePrintSetup",new Object[]{ printer, Boolean.FALSE });

          Dispatch documents = xl.getProperty("Documents").toDispatch();
          Dispatch document = Dispatch.call(documents,"Open",srcFile).toDispatch();
          // PrintOut arguments are positional: the 4th is the output file name and
          // the 11th turns on print-to-file; everything else is left unspecified.
          Dispatch.callN(document,"PrintOut",new Object[]{
              Variant.VT_MISSING, Variant.VT_MISSING, Variant.VT_MISSING, destFile,
              Variant.VT_MISSING, Variant.VT_MISSING, Variant.VT_MISSING, Variant.VT_MISSING,
              Variant.VT_MISSING, Variant.VT_MISSING, Boolean.TRUE });
          Dispatch.call(document,"Close");
      } catch (Exception e) {
         e.printStackTrace();
         failed = true;
      }  finally {
           xl.invoke("Quit", new Variant[] {});
           ComThread.Release();
      }

      completed = true;
    }

    public boolean hasCompleted(){      return completed;    }

    public boolean hasFailed(){      return failed;    }

    public void terminate(){      shouldStop = true;    }
 }

  public void convertToPS(String srcFile,String destFile) throws ConversionException  {
    String printer = "PDFCreator";
    WordPrintThread t = new WordPrintThread(srcFile,destFile,printer);
    t.start();

    int elapsedTime = 0;
    while(!t.hasCompleted() && elapsedTime < MAX_EXECUTION_TIME){
      try{ Thread.sleep(WAIT_INTERVAL); } catch(InterruptedException e){ Thread.currentThread().interrupt(); break; }
      elapsedTime += WAIT_INTERVAL;
    }

    if(t.hasFailed() || !t.hasCompleted()) throw new ConversionException();
  }

  public void convertToTIFF(String srcFile,String destFile) throws ConversionException  {  }

  public boolean accepts(String file){
      String s = file.toLowerCase();
      if(s.endsWith(".doc")) return true;
      if(s.endsWith(".txt")) return true;
      return false;
  }
}

Excel:

import com.jacob.com.*;
import com.jacob.activeX.*;

public class ExcelPrinter implements Converter{
 private static final int MAX_EXECUTION_TIME = 3000;
 private static final int WAIT_INTERVAL = 100;

 public class ExcelPrintThread extends Thread{
  private volatile boolean completed = false;
  private volatile boolean failed = false; 
  private volatile boolean shouldStop = false;
  private String srcFile;
  private String destFile;
  private String printer;

  public ExcelPrintThread(String srcFile,String destFile,String printer){
   this.srcFile = srcFile; 
   this.destFile = destFile;
   this.printer = printer;
  }

  public void run(){
   ActiveXComponent xl = new ActiveXComponent("Excel.Application");
   try   {
      //xl.setProperty("ActivePrinter", new Variant("PDFCreator")); 
      //xl.setProperty("Visible", new Variant(true));

          Dispatch workbooks = xl.getProperty("Workbooks").toDispatch();
          Dispatch workbook = Dispatch.call(workbooks,"Open",srcFile).toDispatch();
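          // PrintOut arguments are positional: copies (3rd), preview (4th), printer (5th),
          // print-to-file (6th) and the output file name (8th); the rest are left unspecified.
          Dispatch.callN(workbook,"PrintOut",new Object[]{
              Variant.VT_MISSING, Variant.VT_MISSING, Integer.valueOf(1), Boolean.FALSE,
              printer, Boolean.TRUE, Variant.VT_MISSING, destFile });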
          Dispatch.call(workbook,"Close");
   } catch (Exception e) {
      e.printStackTrace();
      failed = true;
   } finally {
       xl.invoke("Quit", new Variant[] {});
       ComThread.Release();
   }

   completed = true;
  }

  public boolean hasCompleted(){   return completed;  }

  public boolean hasFailed(){   return failed;  }

  public void terminate(){   shouldStop = true;  }
}

  public void convertToPS(String srcFile,String destFile) throws ConversionException  {
    String printer = "PDFCreator";
    ExcelPrintThread t = new ExcelPrintThread(srcFile,destFile,printer);
    t.start();
    int elapsedTime = 0;
    while(!t.hasCompleted() && elapsedTime < MAX_EXECUTION_TIME){
      try{ Thread.sleep(WAIT_INTERVAL); } catch(InterruptedException e){ Thread.currentThread().interrupt(); break; }
      elapsedTime += WAIT_INTERVAL;
    }

    if(t.hasFailed() || !t.hasCompleted()) throw new ConversionException();
  }

  public void convertToTIFF(String srcFile,String destFile) throws ConversionException  {  }

  public boolean accepts(String file){
     String s = file.toLowerCase();
     if(s.endsWith(".xls")) return true;
     return false;
  }
}

The interface Converter is simple:

public interface Converter{
 public void convertToPS(String src,String dest) throws ConversionException;
 public void convertToTIFF(String src,String dest) throws ConversionException; 
 public boolean accepts(String file);
}
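Because every converter reports which files it handles through `accepts`, picking the right one for a file can be a simple loop. The sketch below is only an illustration: `ConverterRegistry` and `SuffixConverter` are hypothetical names, and `ConversionException` is redeclared as a bare stand-in because the post does not show the real class.

```java
import java.util.Arrays;
import java.util.List;

// Minimal stand-ins so the sketch compiles on its own; the real
// ConversionException is not shown in the post, so this one is assumed.
class ConversionException extends Exception {}

interface Converter {
    void convertToPS(String src, String dest) throws ConversionException;
    void convertToTIFF(String src, String dest) throws ConversionException;
    boolean accepts(String file);
}

// Hypothetical registry: returns the first converter that accepts the file.
final class ConverterRegistry {
    private final List<Converter> converters;

    ConverterRegistry(Converter... converters) {
        this.converters = Arrays.asList(converters);
    }

    Converter find(String file) {
        for (Converter c : converters) {
            if (c.accepts(file)) return c;
        }
        return null; // no registered converter handles this file type
    }
}

// Stub used only to exercise the registry; it converts nothing.
final class SuffixConverter implements Converter {
    private final String suffix;

    SuffixConverter(String suffix) { this.suffix = suffix; }

    public void convertToPS(String src, String dest) {}
    public void convertToTIFF(String src, String dest) {}
    public boolean accepts(String file) { return file.toLowerCase().endsWith(suffix); }
}
```

With the real classes, the registry would presumably be built as `new ConverterRegistry(new WordPrinter(), new ExcelPrinter())`, and the caller would invoke `convertToPS` on whatever `find` returns.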

If you want TIFF output instead, you can simply use the “Microsoft Office Document Image Writer” printer and give the output file name a .tiff extension. I currently use PDFCreator to print to PS.

The reason for using a thread is that I don’t want the whole program to hang when something goes wrong in Excel or Word (you know, strange things always happen with them…). However, I don’t really have a good way to detect failure and terminate the Word or Excel ActiveX objects. Any ideas? If Word or Excel gets stuck repeatedly, the application may end up with too many Word or Excel instances open…
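One possible way to tidy up the waiting side is to replace the polling loop with `java.util.concurrent`. The sketch below is an assumption-laden illustration, not a fix for the stuck-Office problem: `TimedJob` is a hypothetical helper, and cancelling the `Future` only interrupts the Java worker thread. A native COM call that is blocked may ignore the interrupt, and the Word or Excel process itself would still have to be cleaned up by some other, OS-level means.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of the classes above.
final class TimedJob {
    private TimedJob() {}

    /**
     * Runs the job on a worker thread and waits at most timeoutMillis.
     * Returns true only if the job finished in time without throwing.
     */
    static boolean runWithTimeout(Callable<Void> job, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<Void> future = executor.submit(job);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Interrupts the worker; a blocked native call may not react to this.
            future.cancel(true);
            return false;
        } catch (ExecutionException e) {
            // The job itself threw an exception.
            return false;
        } catch (InterruptedException e) {
            // The waiting thread was interrupted: restore the flag and give up.
            Thread.currentThread().interrupt();
            future.cancel(true);
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Under these assumptions, the body of `run()` would move into the `Callable`, and `convertToPS` would throw `ConversionException` whenever `runWithTimeout` returns false. This distinguishes "finished", "failed" and "timed out" without the `completed` and `failed` flags, but it does not by itself stop leftover Office instances from piling up.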

Oh yes, I used JACOB to connect to the ActiveX objects. It is quite a nicely written library, but I would like it to support Java’s variable-length argument lists (varargs)… It is quite ugly to define multiple methods just to support different numbers of parameters…
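For illustration, this is roughly what a varargs wrapper looks like. `VarargsDemo` and both of its methods are hypothetical stand-ins, not JACOB’s classes: `callN` here only formats its arguments so the example runs without Office or any native library.

```java
import java.util.Arrays;

// Hypothetical stand-in for a dispatch-style API; these are not JACOB's classes.
final class VarargsDemo {
    private VarargsDemo() {}

    // The array-based form, similar in shape to a callN-style method.
    static String callN(String method, Object[] args) {
        return method + Arrays.toString(args);
    }

    // One varargs method stands in for call(m), call(m, a), call(m, a, b), ...
    // The compiler packs the trailing arguments into an Object[].
    static String call(String method, Object... args) {
        return callN(method, args);
    }
}
```

A caller can then write `VarargsDemo.call("Close")` or `VarargsDemo.call("Open", "a.doc", Boolean.TRUE)` without a separate overload for each argument count.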

One thing I really want to complain about is that Word and Excel use very different parameters for their PrintOut methods… It is really hard to find the correct way to set the printer for printing…
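The difference is easiest to see when the two argument lists from the listings above are built side by side. In this sketch `PrintOutArgs` is a hypothetical helper and `MISSING` is a placeholder standing in for `Variant.VT_MISSING`, so the code runs without JACOB. The positions are taken from the Word and Excel listings in this post.

```java
import java.util.Arrays;

// Illustrative only: MISSING is a hypothetical placeholder for the
// "argument omitted" marker (Variant.VT_MISSING in the listings above).
final class PrintOutArgs {
    static final Object MISSING = new Object();

    private PrintOutArgs() {}

    // Word listing: 11 arguments, output file name 4th, print-to-file flag 11th.
    // The printer is not passed here at all; it is set separately via FilePrintSetup.
    static Object[] forWord(String destFile) {
        Object[] args = new Object[11];
        Arrays.fill(args, MISSING);
        args[3] = destFile;
        args[10] = Boolean.TRUE;
        return args;
    }

    // Excel listing: 8 arguments, copies 3rd, preview 4th, printer 5th,
    // print-to-file flag 6th, output file name 8th.
    static Object[] forExcel(String printer, String destFile) {
        Object[] args = new Object[8];
        Arrays.fill(args, MISSING);
        args[2] = Integer.valueOf(1);
        args[3] = Boolean.FALSE;
        args[4] = printer;
        args[5] = Boolean.TRUE;
        args[7] = destFile;
        return args;
    }
}
```

So in these listings Excel takes the printer name directly in PrintOut, while Word needs it to be selected beforehand.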


PDF rasterizing to images

Posted on January 3, 2008. Filed under: Java |

Recently I played with several tools for generating images from PDF files in Java. I tested three libraries:
1. PDFBox http://www.pdfbox.org/
2. PDF Renderer
3. Acrobat Reader bean

Note that the Acrobat Reader bean is provided freely by Adobe but is very old (1999), while PDF Renderer is a recent open source project from Sun.

To summarize: the Acrobat Reader Java bean gives the best rendering results, but it still renders several special characters incorrectly. PDFBox is decent, but the fonts look strange. PDF Renderer produces a lot of errors but still generates correct images.

Here are the code snippets for generating images using the above libraries:

PDFBox:

import java.io.*;
import java.util.*;
import java.nio.*;
import java.nio.channels.*;
import java.awt.image.*;
import java.awt.*;
import javax.imageio.*;

import org.pdfbox.pdmodel.*;

public class pdf{
 public static void main(String [] args) throws Exception {
  String file = args[0];
  PDDocument document = PDDocument.load(file);
  java.util.List pages = document.getDocumentCatalog().getAllPages();
  for(int i=0;i

