Class filtering out the words that are not suitable for indexing and/or searching when they are part of a greater string. Hence, most of the methods take the "initial" (greater) string as a parameter.
#include <opentrep/bom/Filter.hpp>
Static Public Member Functions
static void | trim (std::string &ioPhrase, const NbOfLetters_T &iMinWordLength=4)
static bool | shouldKeep (const std::string &iPhrase, const std::string &iWord)
Class filtering out the words that are not suitable for indexing and/or searching when they are part of a greater string. Hence, most of the methods take the "initial" (greater) string as a parameter.
For instance, words of at most 3 letters (e.g., "de", "a", "san"), when part of greater strings (e.g., "rio de janeiro", "san francisco"), should be neither indexed nor searched for.
Definition at line 21 of file Filter.hpp.
static void trim (std::string &ioPhrase, const NbOfLetters_T &iMinWordLength=4)
Trim all the non-relevant words from the given phrase.
The following rules are applied to the left and right outer words, iteratively, until no outer word can be stripped any more:
Parameters
ioPhrase | std::string& The phrase to be amended (e.g., 'de san francisco', part of the 'aeroport de san francisco' global phrase).
iMinWordLength | const NbOfLetters_T& The minimum length of the words (default is 4 letters).
Definition at line 131 of file Filter.cpp.
References OPENTREP::createStringFromWordList(), OPENTREP::tokeniseStringIntoWordList(), and OPENTREP::trim().
Referenced by OPENTREP::Result::calculateCodeMatches().
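The outer-word stripping described for trim() can be illustrated with a minimal, self-contained sketch. It is not the OpenTREP implementation: the rule list is not reproduced on this page, so the sketch assumes a single rule (an outer word is stripped when it is shorter than the minimum length). The names WordList, tokenise(), join(), trimOuterWords() and checkTrimSketch() are hypothetical stand-ins, and std::size_t is assumed in place of NbOfLetters_T.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <sstream>
#include <string>

// Hypothetical stand-in for the OpenTREP word-list type.
using WordList = std::deque<std::string>;

// Split a phrase on whitespace (stand-in for tokeniseStringIntoWordList()).
WordList tokenise(const std::string& phrase) {
  WordList words;
  std::istringstream stream(phrase);
  std::string word;
  while (stream >> word) {
    words.push_back(word);
  }
  return words;
}

// Join words with single spaces (stand-in for createStringFromWordList()).
std::string join(const WordList& words) {
  std::string phrase;
  for (const std::string& word : words) {
    if (!phrase.empty()) {
      phrase += ' ';
    }
    phrase += word;
  }
  return phrase;
}

// Assumed rule: strip outer words shorter than minWordLength, from the left
// and from the right, until both outer words are long enough (or no word is
// left). Inner words are never touched.
void trimOuterWords(std::string& ioPhrase, std::size_t minWordLength = 4) {
  WordList words = tokenise(ioPhrase);
  while (!words.empty() && words.front().size() < minWordLength) {
    words.pop_front();
  }
  while (!words.empty() && words.back().size() < minWordLength) {
    words.pop_back();
  }
  ioPhrase = join(words);
}

// Usage example, based on the phrase quoted in the parameter description.
void checkTrimSketch() {
  std::string phrase = "de san francisco";
  trimOuterWords(phrase);
  assert(phrase == "francisco");
}
```

Under these assumptions, 'aeroport de san francisco' is left unchanged, because both of its outer words reach the 4-letter minimum and the short words are inner words.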
static bool shouldKeep (const std::string &iPhrase, const std::string &iWord)
State whether the given word should be kept, as opposed to being filtered out as non-indexable/non-searchable.
The following rules are applied in sequence (if a rule applies, then the method returns, and the other rules are not processed/checked):
Parameters
iPhrase | const std::string& The initial phrase (e.g., 'san francisco airport').
iWord | const std::string& The word on which a decision has to be made.
Definition at line 144 of file Filter.cpp.
References OPENTREP::hasGoodSize(), and OPENTREP::isBlackListed().
Referenced by OPENTREP::addUnmatchedWord(), OPENTREP::Result::calculateCombinedWeights(), and OPENTREP::Result::fullTextMatch().
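The "first matching rule wins" behaviour of shouldKeep() can be sketched as follows. This is an illustration only, not the OpenTREP code: the rule list is not reproduced on this page, so the rules and their order are assumptions inferred from the referenced helpers hasGoodSize() and isBlackListed(). The names suffixed with Sketch, the 4-letter default, and the black-list entries are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for hasGoodSize(): is the word long enough?
bool hasGoodSizeSketch(const std::string& word, std::size_t minWordLength) {
  return word.size() >= minWordLength;
}

// Hypothetical stand-in for isBlackListed(); the entries are invented for
// this example and are not the OpenTREP black list.
bool isBlackListedSketch(const std::string& word) {
  static const std::unordered_set<std::string> blackList = {"airport",
                                                            "aeroport"};
  return blackList.count(word) > 0;
}

// Assumed rules, checked in sequence; the first rule that applies decides.
bool shouldKeepSketch(const std::string& phrase, const std::string& word,
                      std::size_t minWordLength = 4) {
  // Rule 1 (assumed): an empty word is never kept.
  if (word.empty()) {
    return false;
  }
  // Rule 2 (assumed): a word equal to the whole phrase is kept, since the
  // filter targets words that are part of a greater string.
  if (word == phrase) {
    return true;
  }
  // Rule 3 (assumed): a word that is too short is filtered out.
  if (!hasGoodSizeSketch(word, minWordLength)) {
    return false;
  }
  // Rule 4 (assumed): a black-listed word is filtered out; anything else
  // is kept.
  return !isBlackListedSketch(word);
}

// Usage example, based on the phrase quoted in the parameter description.
void checkShouldKeepSketch() {
  assert(shouldKeepSketch("san francisco airport", "francisco"));
  assert(!shouldKeepSketch("san francisco airport", "san"));
}
```

Returning as soon as a rule applies keeps each rule independent of the ones after it. In this sketch, that is why "san" is kept when it is the whole phrase, even though it fails the size check when it is part of 'san francisco airport'.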