The LUIS team has developed a new recognizer library that identifies numerics more accurately and lets the developer provide the context those numerics refer to. LUIS now incorporates this library for number recognition,
which implements its solution using regular expressions. Regular expressions (regex) are a well-established and proven method for identifying
specific patterns, and are used regularly in all sorts of applications. This allows the machine-learning back-end service to
concentrate on interpreting natural language, while the number recognizers do the heavy lifting for numerics.
New Recognizers vs. The Old
Entity recognizers in LUIS have two main parts:
1) Recognition of the entity
2) Resolution of the entity into a value for an application to use
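The two parts can be illustrated with a minimal Python sketch. It is not the library's implementation: the pattern and the function names `recognize` and `resolve` are hypothetical, and only digit forms (integers, decimals, simple fractions) are covered.

```python
import re
from fractions import Fraction

# Hypothetical pattern: a fraction such as "1/2", or an integer/decimal
# such as "1000" or "1.5". A trailing sentence period is not consumed.
NUMBER_PATTERN = re.compile(
    r"(?<![\w.])(\d+\s*/\s*\d+|\d+(?:\.\d+)?)(?!\w|\.\d)"
)


def recognize(text):
    """Part 1: return (start, end, matched_text) for each numeric span."""
    return [(m.start(1), m.end(1), m.group(1))
            for m in NUMBER_PATTERN.finditer(text)]


def resolve(span_text):
    """Part 2: turn the matched text into a value an application can use."""
    if "/" in span_text:
        numerator, denominator = (int(part) for part in span_text.split("/"))
        # A zero denominator raises ZeroDivisionError; not handled in this sketch.
        return float(Fraction(numerator, denominator))
    return float(span_text) if "." in span_text else int(span_text)


def recognize_and_resolve(text):
    """Run both parts and pair each matched text with its resolved value."""
    return [(matched, resolve(matched)) for _, _, matched in recognize(text)]


print(recognize_and_resolve("add 1/2 cup and 1000 grams"))
```

Keeping recognition and resolution as separate functions mirrors the split described above: the first only finds spans, the second only computes values.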
Comparing the new LUIS recognizer with the old:
| Query | Old Recognizer | New Recognizer |
|---|---|---|
| one thousand | “one thousand” | “1000” |
| 1/2 | “1 / 2” | “0.5” |
| one half | “one half” | “0.5” |
| one hundred fifty | “one hundred fifty” | “150” |
| one hundred and fifty | “one hundred and fifty” | “150” |
| one point five | “one point five” | “1.5” |
| two dozen | “two dozen” | “24” |
NOTE: LUIS JSON response will return a
As the examples show, numeric values can be written in many ways to quantify, express, and describe pieces of information, and there are more possibilities than those listed.
The previous LUIS number recognizers were machine-learning recognizers. They worked well, but they did not include resolution and would sometimes
miss some forms of numbers. With the new number recognizers, which provide resolution,
LUIS is able to interpret more of the variations a user could provide in a query, and return consistent numeric values.
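To make the idea of resolution concrete, here is a rough Python sketch that resolves the spelled-out queries from the table into numeric values. It is a hand-written illustration, not the library's code: the helper names are hypothetical, it handles only a small English vocabulary, and it returns Python numbers rather than the strings shown in the table.

```python
import re

# Small English vocabulary, assumed for this sketch only.
SMALL = {word: value for value, word in enumerate(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split())}
SMALL.update({"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
              "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90})
SCALES = {"thousand": 1_000, "million": 1_000_000}
FRACTIONS = {"half": 0.5, "quarter": 0.25}


def _parse_integer(tokens):
    """Resolve tokens such as ['one', 'hundred', 'and', 'fifty'] to 150."""
    total, current = 0, 0
    for token in tokens:
        if token == "and":
            continue
        if token in SMALL:
            current += SMALL[token]
        elif token == "hundred":
            current = max(current, 1) * 100
        elif token in SCALES:
            total += max(current, 1) * SCALES[token]
            current = 0
        else:
            raise ValueError(f"not a number word: {token!r}")
    return total + current


def resolve_number_words(text):
    """Resolve a spelled-out number phrase to an int or float."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        raise ValueError("no number words found")
    if "point" in tokens:
        # "one point five": whole part before "point", digits after it.
        split_at = tokens.index("point")
        whole = _parse_integer(tokens[:split_at])
        digits = "".join(str(_parse_integer([t])) for t in tokens[split_at + 1:])
        return float(f"{whole}.{digits}")
    head, last = tokens[:-1], tokens[-1]
    if last in FRACTIONS:
        return (_parse_integer(head) if head else 1) * FRACTIONS[last]
    if last == "dozen":
        return (_parse_integer(head) if head else 1) * 12
    return _parse_integer(tokens)


for query in ["one thousand", "one half", "one hundred and fifty",
              "one point five", "two dozen"]:
    print(query, "->", resolve_number_words(query))
```

Whatever surface form the user chooses, the application receives one consistent value, which is the benefit the new recognizers are described as providing.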
Using a direct example from our table:
we passed in a query of “two dozen”, and in the LUIS response we have the resolved value “24”.
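For readers consuming such a response in code, the sketch below pulls the resolved values out of a parsed response. The payload is hand-written for illustration and is not captured LUIS output: the field names (`entities`, `type`, `resolution`, `value`) and the `builtin.number` type name are assumptions about the response shape, and `resolved_numbers` is a hypothetical helper.

```python
import json

# Illustrative payload only (assumed shape, not real service output).
SAMPLE_RESPONSE = """
{
  "query": "two dozen",
  "entities": [
    {
      "entity": "two dozen",
      "type": "builtin.number",
      "resolution": {"value": "24"}
    }
  ]
}
"""


def resolved_numbers(response):
    """Collect the resolved value of every number entity in a parsed response."""
    return [
        entity["resolution"]["value"]
        for entity in response.get("entities", [])
        if entity.get("type") == "builtin.number" and "resolution" in entity
    ]


print(resolved_numbers(json.loads(SAMPLE_RESPONSE)))
```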
Adding Composite Entities for Context
What about including units? Training artificial intelligence services such as
LUIS to recognize not only numerics but also the context
the user is referring to is one of the big challenges in developing natural language understanding. Thankfully,
LUIS already incorporates many common pre-built entities available for an application.
For our example, we simply add the ordinal and percentage pre-built entities to our model to give our number recognition some context, and LUIS returns those entities in its response.
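As a rough illustration of how a type label adds context to a number, the following Python sketch tags percentages and ordinals with regular expressions. It is not how the LUIS pre-built entities are implemented: the patterns, the `tag_numbers` helper, and the `"percentage"` and `"ordinal"` labels are assumptions made for this example, and only digit forms are handled.

```python
import re

# Hypothetical patterns: "20%" or "20 percent", and "1st", "2nd", "3rd", "4th".
PERCENT = re.compile(r"(\d+(?:\.\d+)?)\s*(?:%|percent\b)", re.IGNORECASE)
ORDINAL = re.compile(r"\b(\d+)(?:st|nd|rd|th)\b", re.IGNORECASE)


def tag_numbers(text):
    """Return typed numeric entities found in text, in order of appearance."""
    entities = []
    for match in PERCENT.finditer(text):
        entities.append({"type": "percentage", "text": match.group(0),
                         "start": match.start(), "value": float(match.group(1))})
    for match in ORDINAL.finditer(text):
        entities.append({"type": "ordinal", "text": match.group(0),
                         "start": match.start(), "value": int(match.group(1))})
    return sorted(entities, key=lambda entity: entity["start"])


for entity in tag_numbers("give the 3rd team 20% off"):
    print(entity["type"], entity["text"], entity["value"])
```

The same digits mean different things depending on the label, which is why the entity type travels with the resolved value.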
Lastly, we are both excited and proud to announce that we’re releasing this new library to the public as open source. We are looking forward to
working with the community to develop something truly special. The new recognizers currently support English, Spanish and Chinese.
In the next article, we’ll discuss the mechanics of how the
number recognizers work so that developers can produce their own recognizers.
Ezequiel Jadib and Matthew Shim from the Bot Framework Team