Indexing and searching XML with swish-e
install (sudo make install)
For best results, have the libxml2 family of C libraries installed beforehand. To install the Perl module, change to the Perl directory and enter the usual commands for installing modules:
sudo make install
Installation on Windows is just as easy. Download the distribution, run the resulting .exe file, and the installer will do the rest. It is a good idea to accept the installer's defaults, since applications using swish-e need access to its dynamically linked library (dll) files, and the directory containing those dll files must be listed in your PATH environment variable. If you have previously installed Perl, the installer will install the Perl modules as well as the swish-e application.
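The PATH requirement is easy to check after installation. The following is a minimal sketch in Python, not part of the swish-e distribution: the helper names find_on_path and dirs_containing are made up for this illustration, and the only thing assumed about swish-e is that its executable is named swish-e. It reports where the executable is found, and it can list which PATH directories contain a given file, such as a dll whose name you supply.

```python
import os
import shutil


def find_on_path(name, path=None):
    """Return the full path of an executable, or None if it is not on PATH.

    Hypothetical helper for this workbook; it simply wraps shutil.which.
    """
    return shutil.which(name, path=path)


def dirs_containing(filename, path=None):
    """List the PATH directories that contain the given file (e.g. a dll)."""
    if path is None:
        path = os.environ.get("PATH", "")
    return [
        d for d in path.split(os.pathsep)
        if d and os.path.isfile(os.path.join(d, filename))
    ]


if __name__ == "__main__":
    # Assumes the executable is called "swish-e".
    location = find_on_path("swish-e")
    print("swish-e executable:", location or "not found on PATH")
```

If the executable is reported as not found, add the installation directory to PATH before running the exercises that follow.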
After installation, you will have local access to the voluminous documentation, not to mention the HTML-based documentation in the distribution's html directory:
These exercises demonstrate how to create and search simple indexes of XML documents. The exercises build on earlier exercises in this workbook by using the data created in those exercises.
Swish-e excels at indexing rich and well-structured HTML files. In a previous exercise, sets of XHTML files were created from MARC records and saved in the xml-data/xhtml/marc2xhtml directory. Open any one of these files in your text editor and notice the structure of its head element, such as this one (lines have been hard-wrapped for readability):
<title>Biology, psychology, and medicine</title>