United States. Research conducted by the National Institute of Standards and Technology (NIST) examined how well facial recognition algorithms identify people wearing face masks.
The study indicates that these algorithms work with "great difficulty". Even the best of the 89 commercial facial recognition algorithms tested had error rates between 5% and 50% when matching photos with digitally applied face masks against photos of the same person without a mask.
The results were published as a NIST Interagency Report (NISTIR 8311), the first in a planned series from NIST's Face Recognition Vendor Test (FRVT) program on the performance of facial recognition algorithms on faces partially covered by protective masks.
"With the onset of the pandemic, we need to understand how facial recognition technology deals with masked faces," said Mei Ngan, a NIST computer scientist and author of the report. "We started by focusing on how an algorithm developed before the pandemic might be affected by subjects wearing face masks. Later this summer, we plan to test the accuracy of algorithms that were intentionally developed with masked faces in mind."
The NIST team explored how well each of the algorithms was able to perform a "one-to-one" match, where a photo is compared with a different photo of the same person. This function is commonly used for verification, such as unlocking a smartphone or checking a passport. The team tested the algorithms on a set of about 6 million photos used in previous FRVT studies. (The team did not test the algorithms' ability to perform a "one-to-many" match, used to determine whether a person in a photo matches anyone in a database of known images.)
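As a rough illustration of what a "one-to-one" match involves, the sketch below compares two feature vectors and accepts the pair if their similarity clears a threshold. Everything in it is an assumption made for illustration: the function names, the use of cosine similarity, and the 0.8 threshold are hypothetical and are not taken from the NIST report or from any of the tested algorithms.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("feature vectors must be non-zero")
    return dot / (norm_a * norm_b)


def verify(probe, reference, threshold=0.8):
    """One-to-one check: do the two vectors look like the same person?

    The threshold is a hypothetical value chosen for this example only.
    """
    return cosine_similarity(probe, reference) >= threshold


# Hypothetical feature vectors standing in for two photos.
same_person = verify([0.9, 0.1, 0.4], [0.8, 0.2, 0.5])
different_person = verify([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

A one-to-many search would instead compare the probe against every entry in a database, which is the case the team left for future rounds.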
The research team digitally applied mask shapes to the original photos and tested the performance of the algorithms. Because real-world masks differ, the team devised nine variants of masks, which included differences in shape, color, and nose covering. The digital masks were black or light blue, which is about the same color as a blue surgical mask. The shapes included round masks covering the nose and mouth and a larger type as wide as the wearer's face. These wider masks had high, medium, and low variants that covered the nose to varying degrees. The team then compared the results to the algorithms' performance on maskless faces.
"We can draw some general conclusions from the results, but there are caveats," Ngan said. "None of these algorithms were designed to handle face masks, and the masks we use are digital creations, not real ones."
With these limitations in mind, Ngan said, the study provides some general lessons when comparing the performance of the tested algorithms on masked versus unmasked faces.
- Algorithm accuracy with masked faces declined substantially across the board. Using unmasked images, the most accurate algorithms fail to authenticate a person about 0.3% of the time. Masked images raised even these top algorithms' failure rate to about 5%, while many competent algorithms failed between 20% and 50% of the time.
- Masked images more often caused algorithms to be unable to process a face, technically termed "failure to enroll or template" (FTE). Facial recognition algorithms usually work by measuring a face's features, for example their size and distance from one another, and then comparing these measurements with those from another photo. An FTE means that the algorithm could not extract the features of a face well enough to make an effective comparison in the first place.
- The more a mask covers the nose, the lower the accuracy of the algorithm. The study explored three levels of nasal covering: low, medium and high, and found that accuracy degrades with increased nasal covering.
- While false negatives increased, false positives remained stable or decreased modestly. Errors in facial recognition can take the form of a "false negative," where the algorithm fails to match two photos of the same person, or a "false positive," where it incorrectly indicates a match between photos of two different people. The modest decline in false positive rates shows that occlusion with masks does not undermine this aspect of security.
- The shape and color of a mask matter. Algorithm error rates were generally lower with the round masks. Black masks also degraded algorithm performance compared with the surgical blue ones, although because of time and resource constraints the team was unable to test the effect of color completely.
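The two error types described above can be tallied from a list of comparison outcomes. The following is a minimal sketch under stated assumptions: the `error_rates` function, the outcome labels, and the sample trials are all hypothetical, and the choice to count an FTE on a same-person pair as a false negative is an assumption of this example rather than a description of the report's exact procedure.

```python
def error_rates(trials):
    """Return (false_negative_rate, false_positive_rate).

    trials: list of (same_person, outcome) pairs, where same_person is a
    bool and outcome is "match", "no_match", or "fte".
    Assumption for this sketch: an "fte" never yields a match, so it
    counts as a false negative for same-person pairs and as a correct
    rejection for different-person pairs.
    """
    genuine = [outcome for same, outcome in trials if same]
    impostor = [outcome for same, outcome in trials if not same]
    if not genuine or not impostor:
        raise ValueError("need at least one trial of each kind")
    false_negative_rate = sum(o != "match" for o in genuine) / len(genuine)
    false_positive_rate = sum(o == "match" for o in impostor) / len(impostor)
    return false_negative_rate, false_positive_rate


# Made-up outcomes, not data from the study.
sample_trials = [
    (True, "match"), (True, "match"), (True, "no_match"), (True, "fte"),
    (False, "no_match"), (False, "no_match"), (False, "no_match"), (False, "match"),
]
fnr, fpr = error_rates(sample_trials)  # 0.5 and 0.25 for this sample
```

Under this accounting, a mask that triggers more FTEs pushes the false negative rate up without adding false positives, which is consistent with the pattern the study describes.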
The report, Ongoing Face Recognition Vendor Test (FRVT) Part 6A: Face Recognition Accuracy with Face Masks Using Pre-COVID-19 Algorithms, provides details of each algorithm's performance, and the team has posted additional information online.
Ngan said the next round, planned for later this summer, will test algorithms created with face masks in mind. Future study rounds will test one-to-many searches and add other variations designed to broaden the results.
Source: NIST.


