Interoperating with Java Lucene - Zend_Search_Lucene

Interoperating with Java Lucene

File Formats

Zend_Search_Lucene index file formats are binary compatible with Java Lucene version 1.4 and greater.

A detailed description of this format is available at http://lucene.apache.org/java/2_3_0/fileformats.html [1].

Index Directory

After index creation, the index directory will contain several files:

  • The segments file is a list of index segments.

  • The *.cfs files contain index segments. Note that an optimized index always consists of a single segment.

  • The deletable file is a list of files that are no longer used by the index, but which could not be deleted.
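
The file roles listed above can be checked against a real index directory with a few lines of plain Java. The following is only a minimal sketch: the class name IndexDirectoryInspector, its helper methods, and the default path /data/my_index are hypothetical and are not part of Lucene or Zend_Search_Lucene. It also assumes that newer format versions may name the segments file with a generation suffix (for example segments_2), and treats such names as the segment list as well.

```java
import java.io.File;

// Hypothetical helper, not part of any Lucene API: it only looks at file names.
class IndexDirectoryInspector {

    /** Maps a file name to one of the roles described in the list above. */
    static String classify(String fileName) {
        // Assumption: newer formats may use "segments_N" instead of "segments".
        if (fileName.equals("segments") || fileName.startsWith("segments_")) {
            return "segment list";
        }
        if (fileName.equals("deletable")) {
            return "pending deletions";
        }
        if (fileName.endsWith(".cfs")) {
            return "compound segment";
        }
        return "other";
    }

    /** Counts the *.cfs files; an optimized index is expected to have one. */
    static int countCompoundSegments(String[] fileNames) {
        int count = 0;
        for (String name : fileNames) {
            if (classify(name).equals("compound segment")) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Directory to inspect; the default path is only a placeholder.
        File dir = new File(args.length > 0 ? args[0] : "/data/my_index");
        String[] names = dir.list();
        if (names == null) {
            System.err.println("Not a readable directory: " + dir);
            return;
        }
        for (String name : names) {
            System.out.println(name + " -> " + classify(name));
        }
        System.out.println(".cfs segment files: " + countCompoundSegments(names));
    }
}
```

Running the class with an index directory as its only argument prints one line per file. If more than one .cfs file is reported, the index has probably not been optimized since documents were last added.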

Java Source Code

The Java program listing below provides an example of how to index a file using Java Lucene:

  /**
   * Index creation:
   */
  import org.apache.lucene.analysis.SimpleAnalyzer;
  import org.apache.lucene.index.IndexWriter;
  import org.apache.lucene.document.*;

  import java.io.*;

  ...

  // Create a new index in /data/my_index (the third argument, true,
  // means "create", replacing any index that already exists there)
  IndexWriter indexWriter = new IndexWriter("/data/my_index",
                                            new SimpleAnalyzer(), true);

  ...

  String filename = "/path/to/file-to-index.txt";
  File f = new File(filename);

  Document doc = new Document();
  doc.add(Field.Text("path", filename));
  doc.add(Field.Keyword("modified", DateField.timeToString(f.lastModified())));
  doc.add(Field.Text("author", "unknown"));
  // Open the file so that its contents can be tokenized and indexed
  InputStream is = new FileInputStream(f);
  Reader reader = new BufferedReader(new InputStreamReader(is));
  doc.add(Field.Text("contents", reader));

  indexWriter.addDocument(doc);

[1] The currently supported Lucene index file format version is 2.3 (starting from Zend Framework 1.6).
