New Number Recognizers in LUIS

The LUIS team has developed a new recognizer library that identifies numerics more accurately and lets the developer supply the context those numerics refer to. LUIS now incorporates this library for number recognition, Microsoft.Recognizers.Text.Numbers, which implements its solution using regular expressions. Regular expressions (regex) are a well-established and proven method for identifying specific patterns, and they are used regularly in all sorts of applications. This allows the machine-learning back-end service in LUIS to concentrate on interpreting natural language while the number recognizers do the heavy lifting for numerics.

New Recognizers vs. The Old

Entity recognizers in LUIS have two main parts:

1) Recognition of the entity

2) Resolution of the entity into a value for an application to use
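
To make the two parts concrete, here is a minimal sketch in Python of a regex-based recognizer followed by a resolver. It is not the code of the LUIS library: the pattern, the function names `recognize` and `resolve`, and the choice to return values as strings are all assumptions made for illustration, and it covers only digit-form numbers such as 1,000, 1/2 and 1.5.

```python
import re
from fractions import Fraction

# Part 1 (recognition): a toy pattern for digit-form numbers. It accepts
# optional thousands separators, then either a decimal part or a fraction
# with a non-zero denominator. The pattern is illustrative only.
NUMBER_PATTERN = re.compile(r"\d+(?:,\d{3})*(?:\.\d+|/[1-9]\d*)?")


def recognize(query):
    """Return (matched_text, start, end) for each number-like span."""
    return [(m.group(), m.start(), m.end())
            for m in NUMBER_PATTERN.finditer(query)]


def resolve(text):
    """Part 2 (resolution): turn recognized text into a canonical string."""
    cleaned = text.replace(",", "")
    if "/" in cleaned:
        numerator, denominator = cleaned.split("/")
        value = float(Fraction(int(numerator), int(denominator)))
    else:
        value = float(cleaned)
    # Drop a trailing ".0" so that 1000.0 resolves to "1000".
    return str(int(value)) if value.is_integer() else str(value)


if __name__ == "__main__":
    for text, start, end in recognize("add 1,000 and 1/2"):
        print(text, start, end, resolve(text))
```

Keeping the two functions separate mirrors the split described above: recognition only finds the span in the query, and resolution decides what value that span stands for.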

Comparing the new LUIS recognizer with the old one:

Query                    Old Recognizer             New Recognizer
one thousand             “one thousand”             “1000”
1,000                    “1,000”                    “1000”
1/2                      “1/2”                      “0.5”
one half                 “one half”                 “0.5”
one hundred fifty        “one hundred fifty”        “150”
one hundred and fifty    “one hundred and fifty”    “150”
one point five           “one point five”           “1.5”
two dozen                “two dozen”                “24”

NOTE: The LUIS JSON response returns the resolved value as a string.
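
Because the value arrives as a string, client code has to convert it before doing arithmetic. The sketch below shows one way to do that in Python. The JSON fragment is a hypothetical example: the field names (`entities`, `type`, `resolution`, `value`) and the `builtin.number` type label are assumptions about the response shape, so check them against an actual response from your own application.

```python
import json
from decimal import Decimal

# Hypothetical response fragment; the field names are assumed, not quoted
# from a real LUIS response.
SAMPLE_RESPONSE = """
{
  "query": "two dozen",
  "entities": [
    {
      "entity": "two dozen",
      "type": "builtin.number",
      "resolution": {"value": "24"}
    }
  ]
}
"""


def number_values(response_text):
    """Collect resolved number entities as Decimal values."""
    response = json.loads(response_text)
    values = []
    for entity in response.get("entities", []):
        if entity.get("type") == "builtin.number":
            # The resolution value is a string such as "24" or "0.5".
            values.append(Decimal(entity["resolution"]["value"]))
    return values


if __name__ == "__main__":
    print(number_values(SAMPLE_RESPONSE))
```

`Decimal` is used here so that values such as "0.5" and "1000" convert without binary floating-point rounding; `float` or `int` would also work where that does not matter.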

As these examples show, numeric values can be written in many ways to quantify, express, and describe pieces of information, and there are more possibilities than the ones listed here.

The old LUIS number recognizers were machine-learning recognizers. They worked well but did not include resolution, and they would sometimes fail to recognize some forms of numbers. With the new number recognizers, which provide resolution, LUIS is able to interpret more of the variations a user could provide in a query and return consistent numeric values.
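
As a rough illustration of what resolution means for word-form queries, the following Python sketch maps several spellings of a number to one consistent value. It is a simple word-by-word walk and not the regex-based approach the new library uses. The function name `resolve_words` and the supported vocabulary are assumptions for this example, and it deliberately leaves out forms such as "one half" and "one point five".

```python
SMALL = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19, "twenty": 20, "thirty": 30, "forty": 40,
    "fifty": 50, "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
MULTIPLIERS = {"hundred": 100, "dozen": 12}  # scale the group in progress
SCALES = {"thousand": 1_000, "million": 1_000_000}  # close the group


def resolve_words(text):
    """Resolve an English word-form integer to a string value."""
    words = text.lower().replace("-", " ").split()
    if not words:
        raise ValueError("empty input")
    total = 0    # sum of completed groups (thousands, millions)
    current = 0  # value of the group being built
    for word in words:
        if word == "and":
            continue  # "one hundred and fifty" equals "one hundred fifty"
        if word in SMALL:
            current += SMALL[word]
        elif word in MULTIPLIERS:
            current = max(current, 1) * MULTIPLIERS[word]
        elif word in SCALES:
            total += max(current, 1) * SCALES[word]
            current = 0
        else:
            raise ValueError(f"unsupported word: {word!r}")
    return str(total + current)


if __name__ == "__main__":
    for query in ("one thousand", "one hundred fifty",
                  "one hundred and fifty", "two dozen"):
        print(query, "->", resolve_words(query))
```

The point of the sketch is the last step: whichever way the user phrases the quantity, the application receives the same value.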

Using a direct example from our table:

We passed in the query “two dozen”, and the LUIS response returned 24!

Adding Composite Entities for Context

What about including units? Training artificial intelligence services such as LUIS to recognize not only numerics but also the context to which the user is referring is one of the big challenges in developing natural language understanding. Thankfully, LUIS already includes many common pre-built entities that an application can use.

For our example, we simply add the ordinal and percentage pre-built entities to our model to provide some context for our number recognition, and receive the following LUIS response:

Open Source

Lastly, we are both excited and proud to announce that we’re releasing this new library to the public as open source. We are looking forward to working with the community to develop something truly special. The new recognizers currently support English, Spanish and Chinese. In the next article, we’ll discuss the mechanics of how the number recognizers work so that developers can produce their own recognizers.

Happy Making!

Ezequiel Jadib and Matthew Shim from the Bot Framework Team