Installing and running CorpusSearch


Preliminaries

Installing CorpusSearch

Running CorpusSearch

CorpusSearch is run from a terminal window (or "cmd" box) as described above. It needs at least two arguments: a command file (the query) and one or more source files. Assuming the alias "CS" defined in the previous section, the following command runs the query "foo.q" on the source file "bar.psd":
CS foo.q bar.psd
The above command assumes that the command file and the source files are in the directory from which you are invoking CorpusSearch. If that is not the case, you need to give the path to each file, either absolute or relative to the current directory. For example:
CS /path/to/query/file/foo.q /path/to/input/file/bar.psd

CS ../queries/foo.q ../../corpus/bar.psd
More than one file can be searched in a single run, either by listing the files explicitly or by using standard shell wildcards:
CS foo.q a*.psd *z.psd /some/other/directory/*.psd
CorpusSearch can also be invoked without arguments. It then prompts the user to enter the names of the command file and of the source file(s).

The search results are written to a file with the same basename as the query file, but with the extension .out ("foo.out" in the case at hand).
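
The naming rule above can be sketched as a small shell function. This is only an illustration, not part of CorpusSearch: the function name "cs_default_out" is hypothetical, and the sketch assumes that the output file is written next to the query file, which the text above does not state explicitly.

```shell
# Hypothetical helper (not part of CorpusSearch): print the default
# output file name that corresponds to a given query file.
# Assumption: the .out file is created in the query file's directory.
cs_default_out() {
    query=$1
    dir=$(dirname "$query")      # directory part, "." if none was given
    file=$(basename "$query")    # file name without the directory
    stem=${file%.*}              # drop the last extension, e.g. ".q"
    if [ "$dir" = "." ]; then
        printf '%s.out\n' "$stem"
    else
        printf '%s/%s.out\n' "$dir" "$stem"
    fi
}

cs_default_out foo.q
cs_default_out ../queries/foo.q
```

Knowing the expected name in advance makes it easy to check for a leftover .out file before starting a new search.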

The basename of the output file (but not the extension) can be changed using an "-out" switch. Note that the switch is "-out", not "-o".

CS foo.q bar.psd -out foo-on-bar.out

CS foo.q *.psd -out all.out

If CorpusSearch encounters a run-time error because the underlying corpus is corrupted (for instance, because of mismatched parentheses), it sends an error message to the terminal screen by default. Tracking down such errors generally requires redirecting the error message to a file. The file is conventionally called "ERR", but the name is up to the user.

CS foo.q bar.psd >& ERR

CS foo.q *.psd >& help-me-debug
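
The ">&" form shown above is understood by csh, tcsh and bash, but not by every Bourne-style shell. The sketch below shows the portable equivalent, "> FILE 2>&1". Because CorpusSearch itself may not be installed where this is tried, the function "fake_cs" is a hypothetical stand-in that merely prints one line to standard output and one to standard error; its messages are invented for illustration and are not real CorpusSearch output.

```shell
# Hypothetical stand-in for a CS run on a corrupted corpus.
# The two messages are made up for this illustration.
fake_cs() {
    echo "searching bar.psd"
    echo "error: mismatched parentheses" >&2
}

# A temporary file plays the role of "ERR" here.
errfile=$(mktemp)

# Portable equivalent of ">& ERR": send standard output to the file,
# then point standard error at the same place.
fake_cs > "$errfile" 2>&1

# Both the normal output and the error message are now in the file.
cat "$errfile"
```

With the real program, the corresponding command would be "CS foo.q bar.psd > ERR 2>&1".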

Depending on the complexity of a query and the size of the input, searches can take from a few seconds to a few hours. To run a search in the background under Unix/Linux, add an ampersand (&) at the end of your command:

CS foo.q bar.psd &

CS foo.q bar.psd >& ERR &

Before initiating a search in the background, make sure there is no .out file corresponding to the query you intend to run. If there is, your search will stall.
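
The check described above can be automated with a small guard function. This is a minimal sketch under stated assumptions, not a CorpusSearch feature: the name "cs_background_ok" is hypothetical, the query file is assumed to end in ".q", and the .out file is assumed to sit next to the query file.

```shell
# Hypothetical guard (not part of CorpusSearch): succeed only if no
# .out file exists yet for the given query file.
# Assumptions: the query ends in ".q" and its .out file lives beside it.
cs_background_ok() {
    query=$1
    out="${query%.q}.out"        # foo.q -> foo.out
    if [ -e "$out" ]; then
        echo "not starting: $out already exists" >&2
        return 1
    fi
    return 0
}

# Intended use, with CS standing for the CorpusSearch command:
#   cs_background_ok foo.q && CS foo.q bar.psd > ERR 2>&1 &
```

If the guard fails, move or delete the old .out file (or choose another output name with "-out") before launching the background search.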