SiteXpert FAQ

Question ID:
Q1020
Question:
How can I exclude documents from the structure when scanning my web site?
 
The Navigation Mode of the Layout Wizard provides several filters that can be applied during scanning to exclude documents from the structure:
  • 'Include and exclude file filters': by specifying a combination of include and exclude filters in the last step of Layout Wizard, you can tell SiteXpert which files should be included in the structure. For example, adding '*.html' to the include filters and 'a*.html' to the exclude filters will include all files with the .html extension except those whose names start with the letter 'a'.
  • The 'exclude file/URL filters' list also accepts URLs. For example, 'http://www.domain.com/dir/*' will exclude every URL starting with 'http://www.domain.com/dir/', so 'http://www.domain.com/dir/subdir/file.htm' will be excluded.
  • 'Do not include files below level': this option is also available in the last step of Layout Wizard. It tells SiteXpert not to follow links deeper than N levels, so that only URLs from the first N levels are indexed.
  • Ignored links (found in advanced options of Layout Wizard): you can filter out URLs by specifying which links should be ignored. For example, the pattern '*weather*' ignores links containing the word 'weather', so links like <a href='/weather/europe/'> or 'http://www.domain.com/dir/weather.htm' will be ignored.
  • Suffix builder (found in advanced options of Layout Wizard): you can filter out documents according to their description (e.g. title), for example all documents whose description starts with, ends with or contains a specific text.
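
To predict what a set of filters will do before running a scan, the wildcard rules described above can be approximated in a few lines of Python. This is only a rough sketch and not SiteXpert's own code: the function names (is_included, is_ignored_link, within_depth) are hypothetical, and it assumes that '*' matches any run of characters, that matching is case-sensitive, and that a file is kept when it matches at least one include filter and no exclude filter. SiteXpert's actual matching may differ in these details.

```python
from fnmatch import fnmatchcase


def is_included(name, include_filters, exclude_filters):
    """Keep a file or URL if it matches at least one include filter
    and none of the exclude filters (assumed semantics)."""
    included = any(fnmatchcase(name, pattern) for pattern in include_filters)
    excluded = any(fnmatchcase(name, pattern) for pattern in exclude_filters)
    return included and not excluded


def is_ignored_link(url, ignored_patterns):
    """Return True if a link matches any 'ignored links' pattern."""
    return any(fnmatchcase(url, pattern) for pattern in ignored_patterns)


def within_depth(level, max_level):
    """Return True if a document found at the given link level is
    still inside the 'Do not include files below level' limit."""
    return level <= max_level


if __name__ == "__main__":
    include = ["*.html"]
    exclude = ["a*.html", "http://www.domain.com/dir/*"]
    for name in ["index.html", "about.html", "logo.gif"]:
        print(name, is_included(name, include, exclude))
    print(is_ignored_link("/weather/europe/", ["*weather*"]))
```

With the filters from the examples above, 'index.html' is kept, while 'about.html' (caught by the exclude filter 'a*.html') and 'logo.gif' (not matched by any include filter) are dropped.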