Search Features: International Languages
A separate article discusses Unicode and other international language support in dtSearch.
Unicode Support
- Unicode support allows indexing and searching of non-English text in any script covered by the Unicode standard.
- In addition to Unicode support, dtSearch offers extensive alphabet customization options.
- See the Unicode FAQ for more technical information.
Language-Neutral Search Options
- The following search options work automatically on text in any language: fuzzy (adjustable from 0 to 10); natural language with automatic relevancy ranking; variable term weighting; phrase; Boolean (and/or/not); proximity and directed proximity; wildcard; macro; numeric range; and fielded data (alone or combined with full-text searching).
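To illustrate why an option such as fuzzy searching can be language-neutral, here is a minimal Python sketch that compares words character by character using edit distance. It is not dtSearch's algorithm: the function names and the `max_edits` tolerance are hypothetical, and treating a fuzziness level as a maximum number of edits is an assumption made only for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed over Unicode code points."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def fuzzy_match(term: str, words: list, max_edits: int) -> list:
    """Return the words within max_edits edits of term (hypothetical helper)."""
    return [w for w in words if edit_distance(term, w) <= max_edits]


if __name__ == "__main__":
    # The comparison never inspects which alphabet the characters belong to.
    print(fuzzy_match("поиск", ["поиск", "поиска", "поезд"], max_edits=1))
```

Because the comparison works on characters rather than on English spelling rules, the same code handles Cyrillic, Greek, or Latin text without changes.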
Language Analyzer API Integration
- The dtSearch Engine includes a language analyzer API that can be used to integrate morphological analyzers and custom or dictionary-based word breakers into the dtSearch Engine indexing process.
- The dtSearch Engine offers integration with Basis Technology's Rosette Linguistics Platform for enhanced Chinese, Japanese and Korean text retrieval.
- The dtSearch Engine also includes an API for substituting a non-English language thesaurus for the existing English-language one.
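As a rough illustration of what a dictionary-based word breaker does, the Python sketch below splits unspaced text with greedy longest-match lookup. It does not use the dtSearch language analyzer API or the Rosette Linguistics Platform; the `break_words` function and the sample dictionary are hypothetical, and production analyzers typically use far more sophisticated methods.

```python
def break_words(text: str, dictionary: set) -> list:
    """Split text into words by greedy longest match against a dictionary.

    Characters that start no dictionary word are emitted as
    single-character tokens, so no input is dropped.
    """
    max_len = max((len(w) for w in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


if __name__ == "__main__":
    # Hypothetical sample dictionary for unspaced Chinese text.
    words = {"全文", "检索", "全文检索", "引擎"}
    print(break_words("全文检索引擎", words))
```

A word breaker of this kind matters at indexing time because languages such as Chinese and Japanese do not separate words with spaces, so the indexer needs another way to decide where one searchable term ends and the next begins.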