
Muscat empower User Guide

Creating a search


To run a search on the document items stored in the system, go to the search screen.
This will often be the first screen you see after logging in to the system (depending on how your system administrator has configured the system).

The search screen can be used to specify what you would like to look for, and what data you would like to consider in your search. The following techniques can be used for entering your search parameters:

Entering natural language queries

At the top of the screen is a text box, where you can type in your query, using natural language words and phrases. You may, if you wish, type in a complete sentence. The search will look for these words, and will return a list of the most relevant items.

You should enter as many words that are relevant to your interests as you can into this box. The more relevant words you supply, the better the items will be ranked, so the ones you are most interested in will come right to the top of the results list.

Search weighting

If you wish, you can give a higher weighting to more important search terms by placing a plus (+) sign in front of them. Similarly, you can give a lower weighting to less desired terms by placing a minus (-) sign in front of them; you can use the minus sign to remove all or most items containing a specific word. For example, to search for documents about computers but not about networks, you would specify the following search string:

+computer -networks
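
The following Python sketch is not part of Muscat empower; it only illustrates how a query string with plus and minus prefixes could be split into terms and weighting hints. The function name parse_weighted_terms and the labels "boost", "reduce" and "normal" are hypothetical, chosen for this illustration.

```python
def parse_weighted_terms(query):
    """Split a query into (term, weighting) pairs based on + and - prefixes."""
    parsed = []
    for token in query.split():
        if token.startswith("+") and len(token) > 1:
            # A leading plus marks a more important term.
            parsed.append((token[1:], "boost"))
        elif token.startswith("-") and len(token) > 1:
            # A leading minus marks a less desired term.
            parsed.append((token[1:], "reduce"))
        else:
            parsed.append((token, "normal"))
    return parsed


print(parse_weighted_terms("+computer -networks printer"))
```

How the system actually turns these hints into scores is not described in this guide, so the sketch stops at recognising the prefixes.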

Entering Boolean queries

You can enter a query in Boolean form into the text box, to search for document items with specific combinations of words. This is an option that has to be set up by your system administrator.
You can use the following operators between words in a Boolean query:

AND
OR
NOT

NOT is used before each word you wish to specifically exclude. A NOT can only be used as part of a Boolean query; it cannot be used on its own.

If there is no operator between two words, AND is assumed to be the operator. Round brackets can be inserted around parts of the query to control the order in which the operators are evaluated.

An example of a Boolean query demonstrating the syntax is:

(Homer OR Marge) AND Simpson

This will return documents containing Homer and Simpson as well as documents containing Marge and Simpson.

Homer OR (Marge AND Simpson)

will return documents containing just Homer as well as documents containing Marge and Simpson.

(Homer Simpson) OR (Marge Simpson)

will return documents containing Homer and Simpson as well as documents containing Marge and Simpson.

(Homer OR Marge) AND Simpson NOT Bart

will return documents containing Homer and Simpson as well as documents containing Marge and Simpson, but it will exclude documents containing Bart.
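
The rules above can be illustrated with a small Python sketch that tests whether a set of document words satisfies a Boolean query. It is not the Muscat empower implementation. The function names are hypothetical, and the sketch assumes that NOT binds most tightly, followed by AND (including the implied AND between adjacent words), followed by OR; this guide does not state the precedence, which is why brackets are the safest way to control evaluation order. Malformed queries (for example, an unclosed bracket) are not handled.

```python
import re


def tokenize(query):
    # Brackets become separate tokens; everything else is split on whitespace.
    return re.findall(r"\(|\)|[^\s()]+", query)


def matches(query, document_words):
    """Return True if the set of lowercase document words satisfies the query."""
    tokens = tokenize(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_or():
        result = parse_and()
        while peek() == "OR":
            take()
            right = parse_and()  # always parse the right side to consume its tokens
            result = result or right
        return result

    def parse_and():
        result = parse_not()
        # No operator between two words means AND is assumed.
        while peek() not in (None, "OR", ")"):
            if peek() == "AND":
                take()
            right = parse_not()
            result = result and right
        return result

    def parse_not():
        if peek() == "NOT":
            take()
            return not parse_not()
        return parse_atom()

    def parse_atom():
        token = take()
        if token == "(":
            result = parse_or()
            take()  # consume the closing bracket
            return result
        return token.lower() in document_words

    return parse_or()


print(matches("(Homer OR Marge) AND Simpson NOT Bart", {"marge", "simpson"}))
print(matches("(Homer OR Marge) AND Simpson NOT Bart", {"homer", "simpson", "bart"}))
```

The first call matches the document and the second does not, because that document contains Bart.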

Phrase searching

This search method can be used to search for a complete phrase. When a phrase is entered into a query, the phrase as a whole is used as a search term, rather than the individual words which constitute the phrase. Documents which contain the exact phrase will be returned with a higher weight than those that simply contain the same words which make up the phrase.

To specify something as a phrase, enclose it in double quotes (for example "this is a phrase"). If you wish to influence the weight of a phrase, you can add a + or - symbol as a prefix:

+"this is a phrase"
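
As a rough illustration of why an exact phrase ranks higher than the same words scattered through a document, the Python sketch below gives one point per matching word and a bonus when the words appear together in order. The scoring scheme, the function name phrase_score and the phrase_bonus value are assumptions made for this example; the guide does not describe the actual weights used by the system.

```python
def phrase_score(phrase, text, phrase_bonus=2.0):
    """Score a text against a phrase: one point per word found, plus a bonus
    if the words occur consecutively and in the same order."""
    words = phrase.lower().split()
    tokens = text.lower().split()
    score = sum(1.0 for word in words if word in tokens)
    n = len(words)
    # Slide a window of n tokens across the text to look for the exact phrase.
    if n and any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1)):
        score += phrase_bonus
    return score


exact = phrase_score("this is a phrase", "this is a phrase indeed")
scattered = phrase_score("this is a phrase", "a phrase this is")
print(exact > scattered)
```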

Fuzzy searching

This search method breaks words down into segments of three letters and compares them with similar words in the index. Your system administrator should have set up a database containing these three-letter groups. Fuzzy matching finds words which are the closest in structure, even though some of them may be linguistically dissimilar. A major effect of this method is that words which may be spelled incorrectly can still be matched.

To use fuzzy matching, you should use the tilde symbol as a prefix for some words when entering your query text. You can also use fuzzy matching when setting up the query for an agent.

Examples demonstrating the syntax:

~grafical           This looks for all terms similar to grafical
~grafical ~disspla  This looks for all terms similar to grafical and disspla

If you wish to use fuzzy searching with phrase searching, you can use syntax similar to that shown in the following examples:

~"mikael gorbachov" This is equivalent to ~mikael ~gorbachov
"~mikael gorbachov" This searches for a phrase containing the terms '~mikael' and 'gorbachov'
+~mikael            This is equivalent to ~mikael, with each term returned given a large weight

You will only be able to use this feature if your system administrator has enabled it.
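
The three-letter segment comparison described in this section can be sketched in Python as follows. This is an illustration only, not the method used inside Muscat empower: the overlap measure (shared segments divided by all distinct segments), the 0.25 threshold and the function names are assumptions chosen for the example.

```python
def trigrams(word):
    """Return the set of overlapping three-letter segments in a word."""
    word = word.lower()
    return {word[i:i + 3] for i in range(len(word) - 2)}


def similarity(a, b):
    """Share of three-letter segments that two words have in common (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def fuzzy_matches(term, index_words, threshold=0.25):
    """Return index words whose similarity to the term meets the threshold, best first."""
    scored = [(similarity(term, word), word) for word in index_words]
    return [word for score, word in sorted(scored, reverse=True) if score >= threshold]


index_words = ["graphical", "display", "network"]
print(fuzzy_matches("grafical", index_words))
print(fuzzy_matches("disspla", index_words))
```

With this small index, the misspelled terms grafical and disspla are matched to graphical and display respectively, because they share several three-letter segments with those words and none with the others.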

Wildcard searching

You can type in search terms with wildcards - empower supports truncation of words. These wildcards take the form of an asterisk (*) which is always placed at the end of a word used in a query.

To search for all words beginning with car, type in the string car*. This may return words such as car, carrion, carp, carpark, carpet and so on (depending upon which words you have in your index).

You will only be able to use this feature if your system administrator has enabled it. The system administrator should have set a maximum limit for the number of words returned by a wildcard search (otherwise, too many words would be found and included in the query).
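
A minimal Python sketch of truncation with a maximum limit is shown below. The function name expand_wildcard and the max_words parameter are hypothetical, and keeping the alphabetically first words when the limit is reached is an assumption; this guide does not say which words the system keeps.

```python
def expand_wildcard(term, index_words, max_words=50):
    """Expand a term ending in * into the index words that begin with its prefix."""
    if not term.endswith("*"):
        return [term]
    prefix = term[:-1].lower()
    expanded = sorted(word for word in index_words if word.lower().startswith(prefix))
    # Cap the expansion so the query does not grow too large.
    return expanded[:max_words]


index_words = ["car", "carrion", "carp", "carpark", "carpet", "cat", "scar"]
print(expand_wildcard("car*", index_words))
print(expand_wildcard("car*", index_words, max_words=3))
```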


Note: If you wish to use a wildcard as well as fuzzy matching, only the fuzzy matching will be performed.

Search term translation

You may, if you wish, enter a query in one language and have all the terms in the query translated into the other languages used within documents in the index.

With this technique, you could (for example) have your empower search installed in Dutch, with German as the default language for entering search terms, and still enter a specific search string in English.

Before you enter queries, you should select the current language from a drop-down selection list (this is the language in which the queries are made). This tells the search mechanism to translate the terms into the languages other than the one specified.

You will only be able to use this feature if your system administrator has enabled it.

Further down the screen, there are optional controls you can use to restrict the results you will get:

Thesaurus

If you are unsure of certain words you have used, you can input alternative search terms by using a powerful on-line thesaurus. This feature will only appear if your system administrator has enabled it.

To open the thesaurus, click on the thesaurus button, usually located under the text entry box. For further information, see the section about using the thesaurus later in this on-line guide.

Top Terms

You can refine your search by using key words (known as Top Terms) that appear after you have done your first search. These are significant terms that have been found in the documents returned by the first search. Top Terms are usually displayed in a list (presented in their stemmed form) with check boxes adjacent to them. To include terms in the next search (thus making the search more focused), select the check boxes of the terms you feel are most relevant to your interest. This feature will only appear if your system administrator has enabled it, and if you have already conducted an initial search.

You can increase the effectiveness of Top Terms by selecting the check boxes alongside each document in the results list; this will focus the words in the Top Terms list only on those documents chosen.

Improve

This is an optional method of refining a query, by selecting specific document items from the first query for inclusion in a secondary query. An Improve button will appear after the first query. Select the check boxes next to all items you wish to include in the secondary query, then click the Improve button to start the improved query.

You can use the Improve feature in conjunction with the Top Terms feature to produce more powerful queries.


Note that Improve will work even if you have not selected any check boxes. For best results, it is recommended that you select as many documents as you can that you feel may be relevant.

Language selection

You can specify that the query will look only for document items in a particular language (choose the language from the 'find information in' selection list).

Muscat empower will work out which documents have a high incidence of words in the chosen language, and will include them in the search. The only documents in other languages that will be matched to the words in the query are those where words occur with initial capital letters.

Limitation by date

You can limit your search to items that were added to the index within a number of days specified by you. In the 'include items modified within the last' box, type in a number of days to specify how far back the search should look. Any items outside this date range will be ignored.
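
The effect of the date limit can be pictured with the short Python sketch below. The item fields, the function name within_last_days and the choice to include items dated exactly on the cutoff day are assumptions made for this illustration.

```python
from datetime import date, timedelta


def within_last_days(items, days, today):
    """Keep only the items dated within the given number of days before today."""
    cutoff = today - timedelta(days=days)
    return [item for item in items if item["modified"] >= cutoff]


items = [
    {"title": "Old report", "modified": date(2001, 1, 15)},
    {"title": "Recent memo", "modified": date(2001, 5, 8)},
]
recent = within_last_days(items, days=7, today=date(2001, 5, 10))
print([item["title"] for item in recent])
```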

Sorting options

You can choose how you would like the results of your query to be displayed. The default setting sorts by percentage score (this is determined by the relevance of each item). Relevance is a rating of how well the contents of each document item match the search terms and conditions you have typed in. Document items which have the same score are then ordered by word proximity - in other words, an item which contains search words close to each other will be ranked more highly than one which contains the same words, but not close to each other.

Alternatively, you could choose to sort by relevance bands and then date, in which case items with the same relevance will be presented in reverse chronological order; in other words, the most recent comes first. (The date used is the date when the item was written.) Also, you can sort by date and then relevance, where items are listed in reverse chronological order, with items written on the same day presented in order of relevance (in other words, the highest percentage score comes first).
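
The three sort orders can be compared using the Python sketch below. It is an illustration under stated assumptions, not the system's own code: the item fields are hypothetical, "proximity" is taken to be the distance between search words (smaller means closer), and a relevance band is assumed to be a 10% range of scores, which this guide does not define.

```python
from datetime import date

items = [
    {"title": "A", "score": 92, "written": date(2001, 3, 1), "proximity": 4},
    {"title": "B", "score": 92, "written": date(2001, 5, 9), "proximity": 2},
    {"title": "C", "score": 97, "written": date(2000, 1, 15), "proximity": 7},
    {"title": "D", "score": 60, "written": date(2001, 5, 9), "proximity": 1},
]


def by_relevance(items):
    # Highest score first; equal scores are ordered by closest search words.
    return sorted(items, key=lambda i: (-i["score"], i["proximity"]))


def by_band_then_date(items, band_width=10):
    # Highest relevance band first; within a band, most recent first.
    return sorted(
        items,
        key=lambda i: (-(i["score"] // band_width), -i["written"].toordinal()),
    )


def by_date_then_relevance(items):
    # Most recent first; items written on the same day are ordered by score.
    return sorted(items, key=lambda i: (-i["written"].toordinal(), -i["score"]))


for sorter in (by_relevance, by_band_then_date, by_date_then_relevance):
    print(sorter.__name__, [i["title"] for i in sorter(items)])
```

Note how item C, which has the highest score but the oldest date, moves from first place under the default order to last place when sorting by date and then relevance.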

Relevance cutoff

You can choose to exclude all items which have a relevance below a level you specify. To do this, you can select from a list of relevance ratings in the 'excluding results less than % relevant' drop-down list.

Category filters

If you wish to perform a more exact match, you should use Category filters. These are drop-down selection lists containing different categories of information (these should be defined by your system administrator).

Select the appropriate categories when performing a query - this will restrict the results to those categories.

You will only see this feature in the search screen if your system administrator has enabled it.

Starting the search

When you have finished setting up your query, click on the search button. When you have run a first search, you can run a more refined secondary search by using the Top Terms and Improve features.

If you wish to clear the search settings at any time, you may click on the reset button; this will reset the page back to the state it was in when you loaded it, so you can start again.