Google Research: Wildcards and Fuzzy Search: *, ?, and ~

imstilla.grandma

Using an asterisk (*) allows wildcard searches. For example, immigra* finds all words beginning with "immigra". This can also be used at the beginning or middle of words, at both the beginning and the end of a word, or even all three. For example, you can find all words containing two esses side-by-side with the following query: *ss*. You could also find words with two esses separated by other letters with a query such as: *s*s*. This would find cases containing words like "susan" or "assistant". Another example: secur* retrieves securing, secures, security, etc.

The question mark character (?) can be used similarly as a single-letter wildcard. For example, ?mmigra* would find cases containing the word "immigrant" or the variant spelling "emmigration". Likewise, searching for ?an locates "ran," "pan," "can," and "ban."
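To make the * and ? behavior described above concrete, here is a minimal Python sketch. It assumes the standard library's fnmatch module as a stand-in for a search engine's wildcard matcher (fnmatch also treats square brackets specially, which the examples here avoid), and WORDS and wildcard_search are hypothetical names made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical word list standing in for the terms indexed by a search tool.
WORDS = [
    "immigrant", "immigration", "emmigration", "security", "secures",
    "assistant", "susan", "ran", "pan", "plan",
]

def wildcard_search(pattern, words):
    """Return the words matching a pattern where * is any run of
    characters (including none) and ? is exactly one character."""
    pattern = pattern.lower()
    return [w for w in words if fnmatchcase(w.lower(), pattern)]

print(wildcard_search("immigra*", WORDS))  # words beginning with "immigra"
print(wildcard_search("*s*s*", WORDS))     # words containing two esses anywhere
print(wildcard_search("?an", WORDS))       # three-letter words ending in "an"
```

Note that ?an does not match "plan" in this sketch, because each ? stands for exactly one character.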

Fuzzy search can be applied using the tilde character (~) after a word. This is an advanced parameter that allows searches for misspellings or different variations of a word's spelling. For example, searching for immigrant~ would find words similar to "immigrant". Values can also be added after the tilde to indicate how similar different spellings must be. The default value, if none is given, is 0.5. Values can range between 0 and 1, with 1 being exact, and 0 being very sloppy. Fuzzy searches tend to broaden the result set, thus lowering precision, but also casting a wider net.
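The similarity value after the tilde can be pictured with a rough Python sketch. This is only an illustration under stated assumptions: it scores similarity as 1 minus the edit distance divided by the longer word's length, which may differ from the formula a given search engine actually uses, and edit_distance, similarity, and fuzzy_search are hypothetical helper names.

```python
def edit_distance(a, b):
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]

def similarity(a, b):
    """Assumed score: 1.0 means identical, values near 0 mean very different."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest

def fuzzy_search(term, words, min_similarity=0.5):
    """Keep words whose similarity to term meets the threshold
    (0.5 mirrors the default value mentioned above)."""
    term = term.lower()
    return [w for w in words if similarity(term, w.lower()) >= min_similarity]

candidates = ["immigrant", "imigrant", "emigrant", "banana"]
print(fuzzy_search("immigrant", candidates))       # default threshold
print(fuzzy_search("immigrant", candidates, 1.0))  # exact matches only
```

Lowering min_similarity lets in more spelling variants, which is the wider net (and lower precision) described above.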

Wildcards are used in search terms to represent one or more other characters.

The two most commonly used wildcards are:
  • An asterisk (*) may be used to specify any number of characters. It is typically used at the end of a root word, when it is referred to as "truncation." This is great when you want to search for variable endings of a root word.
    • For example: searching for educat* would tell the database to look for all possible endings to that root. Results will include educate, educated, education, educational or educator.
  • A question mark (?) may be used to represent a single character, anywhere in the word. It is most useful when there are variable spellings for a word, and you want to search for all variants at once.
    • For example, searching for colo?r would return colour; in databases where ? can also stand for no character at all, it would return color as well.
If you do not use AND, OR, or NOT to separate your search terms, all terms are combined with the AND operator. Below are additional tips for using Boolean operators:
  • OR, NOT and AND must be written in ALL CAPS.
  • To expand the results set, use the OR operator, e.g., army OR navy OR "air force" OR marines will return items that contain any of these terms.
  • To exclude items, use the NOT operator or a minus sign (-) before a term, e.g., animal NOT dog will not include results with the term “dog”. Note: This will also exclude any records that contain both the terms animal and dog.
  • Use parentheses to change the order in which a search is processed, e.g., ptsd AND army OR military vs. ptsd AND (army OR military). The first search will return all articles containing both ptsd and army, as well as any article with the term military. The second will return only articles that contain ptsd together with either army or military.
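The effect of parentheses and NOT can be sketched with Python sets. This is a toy illustration, not any database's implementation: INDEX is a made-up mapping from term to record IDs, hits is a hypothetical helper, and it assumes AND is evaluated before OR, as the ptsd example above implies. Python's & (intersection) likewise binds more tightly than | (union).

```python
# Hypothetical inverted index: search term -> IDs of records containing it.
INDEX = {
    "ptsd": {1, 2},
    "army": {1, 3},
    "military": {2, 4, 5},
    "animal": {6, 7},
    "dog": {7, 8},
}

def hits(term):
    """Return the set of record IDs that contain the term."""
    return INDEX.get(term, set())

# ptsd AND army OR military: the AND is applied first, then the OR.
without_parens = hits("ptsd") & hits("army") | hits("military")

# ptsd AND (army OR military): the parentheses force the OR first.
with_parens = hits("ptsd") & (hits("army") | hits("military"))

# animal NOT dog: record 7 contains both terms, so it is dropped too.
animal_not_dog = hits("animal") - hits("dog")

print(sorted(without_parens))  # [1, 2, 4, 5]
print(sorted(with_parens))     # [1, 2]
print(sorted(animal_not_dog))  # [6]
```

Records 4 and 5 mention military but not ptsd, which is why they appear only in the search without parentheses.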
Truncation & Wildcards
Searches can be performed using the wildcards (?,*) and double quotes.
  • Question mark (?) will match any one character. It cannot be used as the first character of a search.
    • For example, wom?n can be used to find woman or women.
  • Asterisk (*) will match zero or more characters within a word or at the end of a word.
    • A search for ch*ter would match charter, character, and chapter.
    • A search for temp* will match all suffixes, such as temptation, temple, and temporary.
  • Use "double quotes" to search for a specific phrase, e.g., "caregiver support".

Getting Started With Your Search
» Boolean operators (AND, OR, NOT) and fielded searching are supported in BASIC, ADVANCED, and COMMAND SEARCH. Boolean operators must be in ALL CAPS.

» Maximum of 40 terms (15 terms per search clause). Note: The words within phrases are counted separately (e.g., “cloud computing” is two terms).

» Use quotes (“ ”) for an exact phrase and to turn off stemming.

Stemming and Wildcards
» Two WILDCARDs are supported.

1. An asterisk (*) represents a single character, multiple characters, or no characters.

Example: secur* retrieves securing, secures, security, etc.

2. A question mark (?) represents a single character.

Example: wom?n retrieves woman or women.

» WILDCARDs can be used anywhere in a word.

• Example: *surg* retrieves surgery, surgical, neurosurgery, microsurgeons, etc.

» WILDCARDs can be used within exact phrases.

• Example: “health inform*” retrieves “health inform”, “health informatics”, “health information”, etc.
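Wildcards anywhere in a word and inside exact phrases can be sketched by translating the query into a regular expression. The translation below is an assumption made for illustration (* becomes any run of word characters, ? becomes exactly one word character, and spaces in a phrase match any whitespace); wildcard_to_regex and find_matches are hypothetical names, and a real database may tokenize text differently.

```python
import re

def wildcard_to_regex(query):
    """Translate a wildcard word or phrase into a compiled regex."""
    def translate(word):
        return "".join(
            r"\w*" if ch == "*" else r"\w" if ch == "?" else re.escape(ch)
            for ch in word
        )
    # Words of a phrase must appear in order, separated by whitespace.
    body = r"\s+".join(translate(word) for word in query.split())
    return re.compile(r"\b" + body + r"\b", re.IGNORECASE)

def find_matches(query, text):
    """Return every stretch of text matched by the wildcard query."""
    return [m.group(0) for m in wildcard_to_regex(query).finditer(text)]

text = ("Neurosurgery and surgical teams rely on health informatics "
        "and health information systems.")
print(find_matches("*surg*", text))          # wildcard at both ends of a word
print(find_matches("health inform*", text))  # wildcard inside a phrase
print(find_matches("wom?n", "Woman and women"))
```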
 