DjVu IFilter

The LizardTech DjVu IFilter is a tool that enables searching over DjVu document collections by keyword or phrase using Microsoft SharePoint or standard Microsoft Windows search functionality. In other words, it enables you to search over the "hidden text layer" that is found in most DjVu files.

If you are using Windows (NT 4.0, 2000 or XP), download and install our DjVu IFilter, and your system will instantly be able to search through the textual content of DjVu documents via the standard Windows search interface.

The DjVu IFilter also enables large repositories of DjVu files to be searched using applications based on the Microsoft Index Server. This includes Microsoft applications such as Site Server and SharePoint Server.

The DjVu IFilter announcement and the DjVu IFilter release notes are reproduced below.

 

DjVu IFilter for Windows

LizardTech Introduces DjVu IFilter, Enabling Keyword Search over DjVu Document Collections Using Microsoft SharePoint or Windows Built-in Search Functionality.

Locating information deep inside DjVu® documents just got easier, thanks to the new DjVu IFilter introduced today by LizardTech, a worldwide leader in software solutions that make it significantly easier to manage, distribute, and access digital content such as aerial photography, satellite images and color image documents.

The DjVu technology alleviates most problems commonly associated with other document formats and provides a no-compromise approach to color document scanning and interchange. Highly compressed true-quality images optimized for Web viewing mean quicker access to information without sacrificing quality, resolution, or legibility. The new DjVu IFilter gives consumers an easy way to search DjVu document collections by keyword or phrase, using Microsoft SharePoint or standard Microsoft Windows search functionality, to find precisely the information they are seeking.

The DjVu format is used by a number of hospitals and legal and financial institutions for archiving, accessing and exchanging complex color documents such as statements, articles, manuals, records, and catalogs. Realview Online Publishing System (ROPS), based in Sydney, Australia, provides an online publishing system and software that transforms newspapers, magazines and catalogs into high-quality online publications that look exactly the way they were printed and are stored in DjVu format.

"LizardTech's DjVu IFilter enables us to quickly search over large collections of publications stored in DjVu format," says Richard Lindley, CEO of Realview. "The indexing and search capability can be offered directly on our customer's web sites without the use of a database and is completely automated once installed. A flexible interface means that searches can be simple keywords or complex queries with vector weighted results."

"DjVu truly is the premier format for storing and exchanging scanned documents or complex electronic documents," said Luc Vincent, Vice President of Document Imaging at LizardTech. "Our DjVu IFilter now puts the text information contained within DjVu documents at your fingertips! End users and system administrators no longer need to set up their own keyword-search mechanism: the IFilter allows you to take advantage of the powerful search mechanisms already available in Windows and SharePoint, without any special configuration."

 

DjVu IFilter Release Notes

LizardTech™ DjVu® IFilter provides access to text in any DjVu document upon which optical character recognition (OCR) has been performed. IFilter supports Microsoft Windows NT 4.0, 2000, and XP clients and servers that use the Microsoft Indexing Service. This means DjVu documents are included in queries by the Windows Search function, and repositories of DjVu files can be searched when using applications based on the Microsoft Index Server, such as the built-in Windows Indexing Service, Index Server, Site Server and SharePoint Server.

Following are system requirements and detailed instructions for installing DjVu IFilter on a range of servers including Index Server and SharePoint.

System Requirements
LizardTech DjVu IFilter 1.1 requires one of the following environments:

  • Microsoft Windows NT 4.0 Server with Service Pack 3 (or higher) and Option Pack 4.
  • Microsoft Windows 2000 Professional or Server.
  • Microsoft Windows XP Professional or Server.

Installation Instructions
The LizardTech DjVu IFilter is packaged in a self-extracting installer that must be downloaded and run on the machine where you wish to use it. Version 1.1 of this installer is called “DjVuIFilter11.exe”.

To install for use with the Windows file search feature, simply run the installer on your machine. You will then be able to go to Start > Search and search for words within DjVu documents.

To use the IFilter in a server solution, follow the platform-specific instructions below.

Windows 2000 and XP

  1. Stop all appropriate clients by one or more of the following methods:
    • Use the Indexing Service snap-in to the Computer Management console. From the Action menu, choose Stop.
    • Use the Services snap-in to the Computer Management console. In the Results pane, right-click on the service named Site Server Search and choose Stop.
    • For SharePoint: Use the Services snap-in to the Computer Management console. In the Results pane, right-click on the service named Microsoft Search and choose Stop.
  2. Uninstall any previous (pre-release) version of LizardTech DjVu IFilter.
  3. Double-click the installer program and follow the onscreen instructions.
  4. If using the Windows Indexing Service or Index Server:
    1. Use the Index Server snap-in to the Microsoft Management console. From the Action menu, choose Start.
    2. Use the Index Server snap-in to the Microsoft Management console. To index DjVu files in all catalogs, right-click on Indexing Service; to index them in a single catalog, right-click on that catalog. Select Properties from the popup menu. Open the Generation tab. Select the “Index files with unknown extensions” checkbox.
    If using Site Server:
    1. Use the Services snap-in to the Computer Management console. In the Results pane, right-click on the service named Site Server Search and choose Start.
    2. Use the Search (Site Server) snap-in to the Computer Management console (found at Start > Settings > Control Panel > Administrative Tools > Administration > SiteServerServiceAdminMMC). Open the appropriate catalog in the Scope pane and select the Catalog Build Server node. In the Scope pane, right-click on each virtual directory desired and choose Properties. Select the File Types tab. If the "DjVu" file extension is not included in the list of file types, click the Add button. Follow the instructions to Add File Type Extension: DjVu. Click OK to close the dialog box. Note: this step should not be necessary when reinstalling a new version of the LizardTech DjVu IFilter. For Index Server, make sure your catalog includes files with unknown extensions.
    3. For SharePoint: Use the Services snap-in to the Computer Management console. In the Results pane, right-click on the service named Microsoft Search and choose Start.
  5. Re-index your site with all appropriate clients by one or more of the following methods:
    • For Windows Indexing Service or Index Server: Use the Indexing Service snap-in to the Computer Management console. Open the appropriate catalog in the Scope pane and select the Directory node. In the Results pane, right-click on the virtual directory which contains your DjVu files. Select the All Tasks menu item and choose Rescan (Full).
    • For Site Server: Use the Search (Site Server) snap-in to the Computer Management console. Open the appropriate catalog in the Scope pane and select the Catalog Build Server node. In the Scope pane, right-click on the virtual directory which contains your DjVu files. Select the All Tasks menu item and choose Start Build.
    • For SharePoint: Open the snap-in from Start>Administrative Tools>SharePoint Portal Server Administrator. On the appropriate server, right-click on the workspace which contains your DjVu files. Select the All Tasks menu item and choose Start Full Update.
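
For the Indexing Service case, the stop and start steps above can also be issued from a command prompt with `net stop` and `net start`. The following Python sketch only builds and prints those commands unless it is told to run them. The short service name `cisvc` is an assumption (it is the usual name of the Indexing Service on Windows 2000 and XP), so confirm it in the Services snap-in first. The helper names `service_command` and `run_steps` are illustrative and not part of any LizardTech or Microsoft tool.

```python
import subprocess

# Assumption: "cisvc" is the short service name of the Indexing Service.
# Check the name shown in the Services snap-in on your own machine.
INDEXING_SERVICE = "cisvc"


def service_command(action, service=INDEXING_SERVICE):
    """Build the `net stop` / `net start` command for one service."""
    if action not in ("stop", "start"):
        raise ValueError("action must be 'stop' or 'start'")
    return ["net", action, service]


def run_steps(actions, service=INDEXING_SERVICE, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    commands = [service_command(a, service) for a in actions]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # Stopping or starting a service needs administrator rights.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `run_steps(["stop"])` before running the installer and `run_steps(["start"])` afterwards mirrors steps 1 and 4; pass `dry_run=False` to actually execute the commands.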

Windows NT 4.0

  1. Stop all appropriate clients by one or more of the following methods:
    1. Use the Index Server snap-in to the Microsoft Management console. From the Action menu, choose Stop.
    2. Use the Services Control Panel. In the dialog box, select Site Server Search and click the Stop button.
  2. Uninstall any previous (pre-release) version of LizardTech DjVu IFilter.
  3. Double-click the installer program file and follow the onscreen instructions.
  4. After the installation process finishes, start all appropriate clients with one or more of the following methods. For Site Server clients, there are two parts to the process and both are required.

    If using Windows Indexing Service or Index Server:

    1. Use the Index Server snap-in to the Microsoft Management console. From the Action menu, choose Start.
    2. Use the Index Server snap-in to the Microsoft Management console. To index DjVu files in all catalogs, right-click on Indexing Service; to index them in a single catalog, right-click on that catalog. Select Properties from the popup menu. Open the Generation tab. Select the “Index files with unknown extensions” checkbox.

    If using Site Server:

    1. Use the Services control panel. In the dialog box, select Site Server Search and click the Start button.
    2. Use the Search (Site Server) snap-in to the Microsoft Management console. Open the appropriate catalog in the Scope pane and select the Catalog Build Server node. Right-click on each virtual directory desired and choose Properties. Select the File Types tab. If the "DjVu" file extension is not included in the list of file types, click the Add button. Follow the instructions to Add File Type Extension: DjVu. Click OK to close the dialog box. Note: this step should not be necessary when re-installing a new version of the LizardTech DjVu IFilter.
  5. Re-index your site with all appropriate clients (with one or more of the following methods):
    • For Windows Indexing Service or Index Server: Use the Index Server snap-in to the Microsoft Management console. Open the appropriate catalog in the Scope pane and select the Directory node. In the Results pane, right-click on the virtual directory which contains your DjVu files. Select the Rescan menu item and choose Full Rescan.
    • For Site Server: Use the Search (Site Server) snap-in to the Microsoft Management console. Open the appropriate catalog in the Scope pane and select the Catalog Build Server node. In the results pane, right-click on the virtual directory which contains your DjVu files. Select the All Tasks menu item and choose Start Build.

Testing
You can check the progress of the indexing by selecting the Indexing Service under Computer Management. The service may take some time to index the files depending on server load. When the Docs to Index column has reached 0, all the files have been indexed.

To check the index, select the catalog you have just created, and click Query the Catalog. Type in a word you know exists in one of the files and the files containing that word should be displayed.

If you have set this up as part of the base Index Server, you should also be able to go to Start>Search and search for a file with a word or a phrase in it.

Troubleshooting
If your search method does not find text in a DjVu file after you install the LizardTech DjVu IFilter:

  • Restart your server machine.
  • Re-index your chosen directories. For example, using the Index Server snap-in to the Microsoft Management console (MMC), open the appropriate catalog in the Scope pane and select the Directory node. Then, in the Results pane, right-click on the virtual directory which contains your DjVu files, select the Rescan menu item and choose the Full Rescan mode. If that does not work, stop and restart Index Server using the Action menu of the MMC. Should that fail to produce a good Index Server catalog, disable indexing at the root of the directory using the IIS snap-in to the MMC, stop and restart Index Server, and reapply indexing at the root of the directory. It may even be necessary to stop and restart Index Server once more after that step. Additionally, you may use the Merge Index button in the HTML Index Server Manager.
  • Verify that the DjVu file contains text. You can detect whether there is text in a DjVu file by opening it in the DjVu Browser Plugin. With the text selection tool, drag a rectangle around a region of the page containing characters. If you cannot highlight any characters, then no optical character recognition (OCR) has been performed on the DjVu file and it contains no text for the LizardTech DjVu IFilter to index. You can perform OCR on your document using DjVu Editor (part of the Document Express Professional suite) or using DjVuJoin/DjVuBundle (part of the Document Express Enterprise suite).
  • Make sure the DjVu file has a ".djvu" filename extension. Clients such as Index Server find the LizardTech DjVu IFilter by looking up the filename's extension in the Windows Registry, so files that use the shorter ".djv" extension may not be indexed.
  • For further help with Microsoft Index Server, refer to Microsoft Knowledge Base article 309173.
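
The extension and text-layer checks above can be roughed out for a whole folder at once. The Python sketch below is a heuristic, not a LizardTech tool: it assumes the hidden text layer is stored in chunks tagged `TXTa` or `TXTz` (the chunk names used by the DjVu format) and simply looks for those byte sequences, so it can report a false positive, and the index file of an indirect document will be listed as having no text because its text lives in the component page files. The function names are illustrative.

```python
import os

# DjVu stores the hidden text layer in chunks tagged TXTa or TXTz.
TEXT_CHUNK_IDS = (b"TXTa", b"TXTz")


def has_text_layer(path):
    """Heuristic: report whether a text chunk tag occurs anywhere in the file."""
    with open(path, "rb") as f:
        data = f.read()  # reads the whole file; fine for a quick check
    return any(tag in data for tag in TEXT_CHUNK_IDS)


def check_folder(folder):
    """Return (files with a .djv extension, .djvu files with no text tag)."""
    wrong_extension, no_text = [], []
    for root, _dirs, files in os.walk(folder):
        for name in files:
            full = os.path.join(root, name)
            ext = os.path.splitext(name)[1].lower()
            if ext == ".djv":
                wrong_extension.append(full)
            elif ext == ".djvu" and not has_text_layer(full):
                no_text.append(full)
    return sorted(wrong_extension), sorted(no_text)
```

Files in the first list are candidates for renaming to ".djvu"; files in the second list are candidates for OCR. Confirm any result by opening the file in the DjVu Browser Plugin as described above.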

Abstract Generation (Advanced)
It is possible to get an abstract from a DjVu file using the Indexing Service. This is particularly helpful when using Microsoft SharePoint.

To get an abstract from a DjVu file:

  1. Open Computer Management and select Indexing Service.
  2. Select the catalog you created, right-click and select Properties.
  3. On the Generation tab, clear (deselect) the "Inherit above settings from Service" option, then select Generate abstracts.
  4. You can now access the abstract if you use the Characterization property in a search application. A simple .ASP example is listed below:

To use this example, copy the text between the two lines of asterisks (*************) below into a new text file and save it in your WWWRoot as an ASP page (for example, searchdjvu.asp). Change QS.Catalog = "djvu" to match the name of the catalog you have created.

*************

<HTML>
<HEAD>
<TITLE>Search DjVu files using Index Server - Sample</TITLE>
</HEAD>
<BODY>
<%
' Read the search term submitted by the form below.
dim searchtext
searchtext = Request.QueryString("searchtext")
%>
Enter the term you want to search for and press the 'Submit Query' button:<br><br>
<form name="DjVuSearch" method="GET">
<input type="text" name="searchtext" value="<%=Server.HTMLEncode(searchtext)%>"><br>
<input type="submit" name="SubmitButton">
</form>
<%
if searchtext <> "" then
  dim QS, RS, I
  set QS = Server.CreateObject("ixsso.Query")
  QS.Catalog = "djvu"   ' Change to the name of your catalog
  QS.Columns = "vpath, DocTitle, FileName, Path, Write, Size, Characterization"
  QS.Dialect = 2
  QS.Query = searchtext

  ' Issue query
  on error resume next
  set RS = QS.CreateRecordSet("nonsequential")
  if Err.Number <> 0 then
    ' The query could not be run (for example, the catalog name is wrong).
    Response.Write("The query failed: " & Server.HTMLEncode(Err.Description))
  elseif RS.RecordCount <> 0 then
    Response.Write("There are " & RS.RecordCount & " matches for search string [ " & Server.HTMLEncode(searchtext) & " ]<br><br>")
    for I = 1 to RS.RecordCount
      ' Link to the file, then show the abstract held in Characterization.
      Response.Write("<A HREF='file://" & RS("path") & "' target='_blank'>" & RS("filename") & "</A><br>")
      Response.Write("Summary " & RS("characterization") & "<br><br>")
      RS.MoveNext
    next
  else
    Response.Write("There were no matches")
  end if
end if
%>
</BODY>
</HTML>

*************
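
Because the sample form uses method GET, the page can also be called directly with the search term in the query string, which is convenient for a quick test after re-indexing. The Python sketch below builds such a URL. The host name `localhost` and the function name `build_search_url` are assumptions for illustration; `searchdjvu.asp` and the `searchtext` parameter come from the sample above.

```python
from urllib.parse import urlencode


def build_search_url(page_url, term):
    """Append the searchtext parameter that the sample ASP page reads."""
    return page_url + "?" + urlencode({"searchtext": term})


# Hypothetical server address; replace it with your own IIS host.
print(build_search_url("http://localhost/searchdjvu.asp", "annual report"))
```

Opening the printed URL in a browser should show the same result list as typing the term into the form.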

Microsoft Developer Support
An IFilter is a COM (ActiveX) component called by a client to extract text from a given file format. The LizardTech DjVu IFilter consists of code that understands the DjVu file format along with code that provides the appropriate interface to clients such as Index Server. These clients use the text returned by IFilters to build indexes and support queries against those indexes.

For more information about the IFilter specification, visit Microsoft's web site.

LizardTech Technical Support Options for the LizardTech DjVu IFilter
Technical support for the LizardTech DjVu IFilter is available as a fee-based, pay-as-you-go option. For more information on pay-as-you-go technical support options and for a list of technical support phone numbers, please see http://www.lizardtech.com/support/

If you have Internet access, you have round-the-clock access to free technical information online. Visit the LizardTech Web site at http://www.lizardtech.com/support/doc/ to search our technical support databases, participate in user-to-user forums, or download free plug-ins, filters, or updates.

Language Support
LizardTech DjVu IFilter has no user interface and therefore is language-agnostic. It is tested and supported in Tier 1 languages (English, French, German, and Japanese), utilizing operating systems in the above languages and text within LizardTech DjVu documents in the above languages. The LizardTech DjVu IFilter is based on the Microsoft indexing client, which is responsible for interpreting the returned text and then presenting the information to the user.

Searching by Metadata
Not supported.

Known Issues
  • The Highlight Hits feature in Microsoft Index Server cannot highlight text in a DjVu file opened in the DjVu Browser Plugin.
  • Text in Indirect DjVu files is indexed twice: once as part of the indirect document and once again as a single-page document.

Release Notes
1.1

Internationalized installer and localization to English and Japanese.

1.0

Initial release.