Bing Speller100: Zero-shot Search Spelling Correction at Scale

Microsoft has launched Speller100, a set of large-scale multilingual spelling correction models for Bing Search that deliver high precision and high recall in over 100 languages.

According to Microsoft, about 15% of the queries submitted to Bing contain misspellings. When a query is misspelled, the search engine often matches the wrong set of documents and triggers incorrect answers, which can produce a sub-optimal results page for searchers.

Speller100 is meant to address this problem by bringing better misspelling handling to more languages around the world with the help of AI at Scale.

How will Speller100 improve searches in more than 100 languages in Bing?

Microsoft's large-scale multilingual spelling correction models, called Speller100, offer high recall and high precision in over 100 languages and are meant to improve search results in Bing.

It's a huge step forward for search generally, especially considering that spelling correction has previously been available for only a few dozen languages. The Speller100 models draw on advances in AI, particularly zero-shot learning combined with carefully designed large-scale pretraining tasks, as well as on historical linguistics theories.

Speller100 is based on the concept of language families, groups of languages that share similarities, and on zero-shot learning, which allows a model to learn to correct spelling accurately in a language without additional language-specific labeled training data.

Language models of this kind are commonly pretrained with tasks like MLM (Masked Language Model), next-sentence prediction, and translation. Although these tasks commonly use WordPiece or SentencePiece subword segmentation algorithms, which break words down into smaller constituents, existing pretraining tasks operate at the word, phrase, or sentence level for semantic understanding.
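To make the idea of subword segmentation concrete, here is a minimal Python sketch of greedy longest-match-first segmentation in the style of WordPiece. It is only an illustration: the `segment` function and the tiny `TOY_VOCAB` vocabulary are hypothetical and are not taken from Speller100 or from any real tokenizer library.

```python
def segment(word, vocab):
    """Split a word into subword pieces by greedy longest-match-first lookup.

    Pieces that continue a word are prefixed with "##", following the
    WordPiece convention. Returns ["[UNK]"] if no segmentation is found.
    """
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            return ["[UNK]"]
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary, invented for this illustration only.
TOY_VOCAB = {"spell", "sp", "##ing", "##el", "##er"}

correct = segment("spelling", TOY_VOCAB)    # ['spell', '##ing']
misspelled = segment("speling", TOY_VOCAB)  # ['sp', '##el', '##ing']
print(correct, misspelled)
```

In this toy setup, dropping a single letter changes the pieces the word is split into, which gives a rough sense of why a task defined over words or subwords is not the same thing as a task about individual characters.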

How will this major AI leap help Bing deliver better search?

Speller100 is perhaps the most comprehensive spelling correction system so far, in terms of both the number of languages covered and overall accuracy. With this technology, search results for Bing users should become more accurate as reliable spelling correction expands to over 100 languages.

In fact, according to Microsoft, there has been a double-digit improvement in both spelling correction precision and recall, and comprehensive online A/B testing on Bing showed the following: the number of times users clicked on any item on the search page went from single digits to 70%, and the overall number of pages with no results fell by 30%.
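For readers unfamiliar with the two metrics, the short Python sketch below shows one common way precision and recall can be computed for a spelling corrector. The definitions used here are an assumption for illustration, since the exact evaluation method behind the reported figures is not described, and the `EXAMPLES` data is invented.

```python
def precision_recall(examples):
    """Compute precision and recall over (query, predicted, gold) triples.

    Assumed definitions (not necessarily the ones Microsoft uses):
      precision = correct fixes / queries the system changed
      recall    = correct fixes / queries that needed a fix
    """
    changed = 0        # queries the system altered
    needed = 0         # queries that were actually misspelled
    correct_fixes = 0  # altered queries that now match the gold spelling
    for query, predicted, gold in examples:
        if predicted != query:
            changed += 1
        if gold != query:
            needed += 1
        if predicted != query and predicted == gold:
            correct_fixes += 1
    precision = correct_fixes / changed if changed else 0.0
    recall = correct_fixes / needed if needed else 0.0
    return precision, recall


# Invented examples: (user query, system output, intended spelling).
EXAMPLES = [
    ("speling", "spelling", "spelling"),  # fixed correctly
    ("serch", "search", "search"),        # fixed correctly
    ("recieve", "recieve", "receive"),    # misspelling missed
    ("adress", "adress", "address"),      # misspelling missed
    ("bing", "being", "bing"),            # correct query wrongly changed
]

precision, recall = precision_recall(EXAMPLES)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

High precision means the system rarely "corrects" a query that was already right, while high recall means it rarely lets a real misspelling through. A system can score well on one and poorly on the other, which is why both are reported.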
