> For the complete documentation index, see [llms.txt](https://worlddao.gitbook.io/worlddao-white-paper/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://worlddao.gitbook.io/worlddao-white-paper/technical-implementation/biometrics.md).

# Biometrics

Following the [exploration](https://whitepaper.worldcoin.org/#why-custom-hardware-is-needed) of iris biometrics as a choice of modality, this section provides a detailed look into the process of iris recognition from image capture to the uniqueness check:

1. [Biometric Performance at a Billion People Scale](https://whitepaper.worldcoin.org/#biometric-performance-at-a-billion-people-scale), addresses the scalability of iris recognition technology. It discusses the potential of this biometric modality to establish uniqueness among billions of humans, examines various operating modes and anticipated error rates and ultimately concludes the feasibility of using iris recognition at a global scale.
2. [Iris Feature Generation with Gabor Wavelets](https://whitepaper.worldcoin.org/#iris-feature-generation-with-gabor-wavelets-2) introduces the use of Gabor filtering for generating unique iris features, explaining the scientific principles behind this traditional method which is fundamental to understanding how iris recognition works.
3. [Iris Inference System](https://whitepaper.worldcoin.org/#iris-inference-system-2) explores the practical application of the previously discussed principles. This section describes the uniqueness algorithm and explains how it processes iris images to ensure accurate and scalable verification of uniqueness. This provides a comprehensive overview of the system's operation, demonstrating how theoretical principles translate into practical application.

Collectively, these sections offer a holistic overview of iris recognition, from the core scientific principles to their practical application in the Orb.

[**Biometric performance at a billion people scale**](https://whitepaper.worldcoin.org/#biometric-performance-at-a-billion-people-scale)

In order to get a rough estimation on the required performance and accuracy of a biometric algorithm operating on a billion people scale, assume a scenario with a fixed biometric model, i.e. it is never updated such that its performance values stay constant.

[**Failure Cases**](https://whitepaper.worldcoin.org/#failure-cases)

A biometric algorithm can fail in two ways: It can either identify a person as a different person, which is called a false match or it can fail to re-identify a person although this person is already enrolled, which is called a false non match. The corresponding rates - the false match rate (FMR) and the false non match rate (FNMR) - are the two critical KPIs for any biometric system.

For the purposes of this analysis, consider three different systems with varying levels of performance.

1. One of the systems, as reported by John Daugman in his [paper](https://www.robots.ox.ac.uk/~az/lectures/est/iris.pdf), demonstrates a false match rate of $$1.1×10−7$$ at a false non-match rate of 0.00014.
2. Another system, represented by one of the leading iris recognition algorithms from NEC, has performance values as reported in the [IREX IX report](https://www.nist.gov/publications/irex-ix-part-one-performance-iris-recognition-algorithms) and [IREX X leaderboard](https://pages.nist.gov/IREX10/) from the National Institute for Standards and Technology (NIST). These values include a false match rate of $$10−8$$ at a false non-match rate of 0.045.
3. The third system, conceived during the early ideation stage of the Worldcoin project, represents a conservative estimate of how well iris recognition could perform outside of the lab environment i.e. in an uncontrolled, outdoor setting. Despite these constraints, it anticipated a false match rate of $$10−6$$ and a false non-match rate of 0.005. While not ideal, it demonstrated that iris recognition was the most viable path for a global proof of personhood.

A [more in-depth examination](https://wld-ml-ai-data-public.s3.amazonaws.com/blog/biometric_performance/20230110_BiometricPerformanceValues.pdf) of how these values are obtained from various sources is also available.

[**Effective Dual Eye Performance**](https://whitepaper.worldcoin.org/#effective-dual-eye-performance)

The values mentioned above pertain to single eye performance, which is determined by evaluating a collection of genuine and imposter iris pairs. However, utilizing both eyes can significantly enhance the performance of a biometric system. There are various methods for combining information from both eyes, and to evaluate their performance, consider two extreme cases:

1. The AND-rule, in which a user is deemed to match only if their irises match on both eyes.
2. The OR-rule, in which a user is considered a match if their iris on one eye matches that of another user's iris on the same eye.

The OR-rule offers a safer approach as it requires only a single iris match to identify a registered user, thus minimizing the risk of falsely accepting the same person twice. Formally, the OR-rule reduces the false non-match rate while increasing the false match rate. However, as the number of registered users increases over time, this strategy may make it increasingly difficult for legitimate users to enroll to the system due to the high false match rate. The effective rates are given below:

<figure><img src="/files/gHJqW0pjHgt9t9yjXXap" alt=""><figcaption></figcaption></figure>

On the other hand, the AND-rule allows for a larger user base, but comes at the cost of less security, as the false match rate decreases and the false non-match rate increases. The performance rates for this approach are as follows:

<figure><img src="/files/LqXtCb2YkfymupvyiaqQ" alt=""><figcaption></figcaption></figure>

[**False Matches**](https://whitepaper.worldcoin.org/#false-matches)

The probability for the i-th (legitimate) user to run into a false match error can be calculated by the equation

<figure><img src="/files/4Yv9571HfwtbdHMSOQ5F" alt=""><figcaption></figcaption></figure>

with p：=FMR being the false match rate. Adding up these numbers yields the expected number of false matches that have happened after the i-th user has enrolled, i.e the number of falsely rejected users ([derivation](https://www.wolframalpha.com/input/?i=sum+1-%281-p%29%5E%28j-1%29+from+j%3D1+to+i)).

<figure><img src="/files/L8qaGyCfIrzH5LtHmiVw" alt=""><figcaption></figcaption></figure>

A high false match rate significantly impacts the usability of the system, as the probability of false matches increases with a growing number of users in the database. Over time the probability of being (falsely) rejected as a new user converges to 100%, making it nearly impossible for new users to be accepted.

The following graph illustrates the performance of the biometric system using both the OR and AND rule. The graph is separated into two sections, with the left side representing the OR rule and the right side representing the AND rule. The top row of plots in the graph shows the probability of the i-th user being falsely rejected, and the bottom row of plots shows the expected number of users that have been falsely rejected after the i-th user has successfully enrolled. The different colors in the graph correspond to the three systems mentioned earlier: green represents Daugman’s system, blue represents NEC’s system, and red represents the initial worst case estimate.

<figure><img src="/files/zBxMYOtcGTJaSRZjxmZP" alt=""><figcaption><p>Performance of biometric systems under both the OR and AND rule across three distinct scenarios: The blue line represents a highly performant system from NEC, while the green line reflects performance values as reported by John Daugman. The red line indicates a system with conservative performance values.</p></figcaption></figure>

The main findings from the analysis indicate that when using the OR-rule, the system's effectiveness breaks down with just a few million users, as the chance of a new user being falsely rejected becomes increasingly likely. In comparison, operating with the AND-rule provides a more sustainable solution for a growing user base.

Further, even the difference between the worst case and the best case estimate of current technology matters. The performance of biometric algorithms designed by Tools for Humanity has been continuously improving due to ongoing research efforts. This has been achieved by pushing beyond the state-of-the-art by replacing various components of the uniqueness verification process with deep learning models which also significantly improves the robustness to real world edge cases. At the time of writing, the algorithm's performance closely resembled the green graph depicted in the figure above when in an uncontrolled environment (depending on the exact choice of the FNMR). This is an accomplishment noteworthy in and of itself. Nonetheless, further improvements in the algorithm's performance are expected through ongoing research efforts. The optimum case is a vanishing error rate in practice on a global scale.

Note that for a large number of users (i≫1) and a very performant biometric system (p≪1) the equation above becomes numerically unstable. To calculate the number of rejected users for such a scenario, Taylor expand the critical part of the equation around small values of p.

<figure><img src="/files/V7MLdvSpBtUsfgjclyU8" alt=""><figcaption></figcaption></figure>

The derivation of the above equation can be found [here](https://www.wolframalpha.com/input/?i=%281-p%29%5En). Inserting this in the equation above yields

<figure><img src="/files/TSTUopSSr6jsNHj2DaKs" alt=""><figcaption></figcaption></figure>

[**False Non Matches**](https://whitepaper.worldcoin.org/#false-non-matches)

When it comes to fraudulent users, the probability of them not being matched stays constant and does not increase with the number of users in the system. This is because there is only one other iris that can cause a false non-match - the user's own iris from their previous enrollment. Thus, the probability of encountering a false non-match is given by

<figure><img src="/files/GfUsP2HAEGoTTUQgHDzy" alt=""><figcaption></figcaption></figure>

The number of expected false non matches can be calculated with

<figure><img src="/files/4nHjK3dLM5emAFELVHEm" alt=""><figcaption></figcaption></figure>

with j indicating the j-th untrustworthy user who tries to fool the system.

[**Conclusion**](https://whitepaper.worldcoin.org/#conclusion)

The conclusion is that iris recognition can establish uniqueness on a global scale. Further, to onboard billions of individuals, the algorithm needs to use the AND-rule. Otherwise, the rejection rate will be too high and it will be practically impossible to onboard billions of users.

The current performance is already beyond the original conservative estimate and the project expects the system to eventually surpass current state-of-the-art lab environment performance, even if subject to an uncontrolled environment: On the one hand, the custom hardware comprises an imaging system that outperforms typical iris scanners by more than an order of magnitude in terms of image resolution. On the other hand, current advances in deep learning and computer vision offer promising directions towards a “deep feature generator” - a feature generation algorithm that does not rely on handcrafted rules but learns from data. So far the field of iris recognition has not yet leveraged this new technology.

[**Iris Feature Generation with Gabor Wavelets**](https://whitepaper.worldcoin.org/#iris-feature-generation-with-gabor-wavelets)

The objective for iris feature generation algorithms is to generate the most discriminative features from iris images while reducing the dimensionality of data by removing unrelated or redundant data. Unlike 2D face images that are mostly defined by edges and shapes, iris images present rich and complex [texture](https://ieeexplore.ieee.org/document/1251145?arnumber=1251145) with repeating (semi-periodic) patterns of local variations in image intensity. In other words, iris images contain strong signals in both spatial and frequency domains and should be analyzed in both. Examples of iris images can be found on John Daugman's [website](https://www.cl.cam.ac.uk/~jgd1000/infrared-examples.html).

[**Gabor filtering**](https://whitepaper.worldcoin.org/#gabor-filtering)

[Research](https://www.nature.com/articles/381607a0) has shown that the localized frequency and orientation representation of Gabor filters is very similar to the human visual cortex’s representation and discrimination of texture. A Gabor filter analyzes a specific frequency content at a specific direction in a local region of an image. It has been widely used in signal and image processing for its optimal joint compactness in spatial and frequency domain.

<figure><img src="/files/tw3zyd2ShRRxY6ib3Snu" alt=""><figcaption><p>Constructing a Gabor filter is straightforward. The product of (a) a complex sinusoid signal and (b) a Gaussian filter produces (c) a Gabor filter.</p></figcaption></figure>

As shown above, a Gabor filter can be viewed as a sinusoidal signal of particular frequency and orientation modulated by a Gaussian wave. Mathematically, it can be defined as

<figure><img src="/files/uePdiOViil96sBucVHQW" alt=""><figcaption></figcaption></figure>

with

<figure><img src="/files/5G0osiky6vuNawcdAPf2" alt=""><figcaption></figcaption></figure>

Among the parameters, σ and γ represent the standard deviation and the spatial aspect ratio of the Gaussian envelope, respectively, λ and ϕ are the wavelength and phase offset of the sinusoidal factor, respectively, and θ is the orientation of the Gabor function. Depending on its tuning, a [Gabor filter](https://inc.ucsd.edu/mplab/75/media//gabor.pdf) can resolve pixel dependencies best described by narrow spectral bands. At the same time, its spatial compactness accommodates spatial irregularities.

The following figure shows a series of Gabor filters at a 45 degree angle in increasing spectral selectivity. While the leftmost Gabor wavelet resembles a Gaussian, the rightmost Gabor wavelet follows a harmonic function and selects a very narrow band from the spectrum. Best for iris feature generation are the ones in the middle between the two extremes.

<figure><img src="/files/AKreYwas8Xv9djyRLyJS" alt=""><figcaption><p>Varying wavelength (a-d) from large to small can change the spectral selectivity of Gabor filters from broad to narrow.</p></figcaption></figure>

Because a Gabor filter is a complex filter, the real and imaginary parts act as two filters in quadrature. More specifically, as shown in the figures below, (a) the real part is even-symmetric and will give a strong response to features such as lines; while (b) the imaginary part is odd-symmetric and will give a strong response to features such as edges. It is important to maintain a zero DC component in the even-symmetric filter (the odd-symmetric filter already has zero DC). This ensures zero filter response on a constant region of an image regardless of the image intensity.

<figure><img src="/files/XPMngRmyMrkR75dQsBRP" alt=""><figcaption><p>Giving a closer look at the complex space of a Gabor filter where (a) the real part is even-symmetric and (b) the imaginary part is odd-symmetric.</p></figcaption></figure>

[**Multi-scale Gabor filtering**](https://whitepaper.worldcoin.org/#multi-scale-gabor-filtering)

Like most textures, iris texture lives on multiple scales (controlled by ). It is therefore natural to represent it using filters of multiple sizes. Many such multi-scale filter systems follow the wavelet building principle, that is, the kernels (filters) in each layer are scaled versions of the kernels in the previous layer, and, in turn, scaled versions of a mother wavelet. This eliminates redundancy and leads to a more compact representation. Gabor wavelets can further be tuned by orientations, specified by . The figure below shows the real part of 28 Gabor wavelets with four scales and 7 orientations.

<figure><img src="/files/uQyLAESb5CALqYRUC0gi" alt=""><figcaption><p>Constructing Gabor wavelets with multiple scales (vertically) and orientations (horizontally) to generate texture features with various frequencies and directions. In the feature generation process, the system uses a small set of filters that concentrate within the range of scales and orientations of the most discriminative iris texture.</p></figcaption></figure>

[**Phase-quadrant demodulation and encoding**](https://whitepaper.worldcoin.org/#phase-quadrant-demodulation-and-encoding)

After a Gabor filter is applied to an iris image, the filter response at each analyzed region is then demodulated to [generate](https://www.robots.ox.ac.uk/~az/lectures/est/iris.pdf) its phase information. This process is illustrated in the figure below, as it identifies in which quadrant of the complex plane each filter response is projected to. Note that only phase information is recorded because it is more robust than the magnitude, which can be contaminated by extraneous factors such as illumination, imaging contrast, and camera gain.

<figure><img src="/files/gTib54MGx4jDJ84xIv5V" alt=""><figcaption><p>Demodulating the phase information of filter response into four quadrants of the complex space. The resulting cyclic codes are used to produce the final iris code.</p></figcaption></figure>

Another desirable feature of the phase-quadrant demodulation is that it produces a cyclic code. Unlike a binary code in which two bits may change, making some errors arbitrarily more costly than others, [a cyclic code](https://www.robots.ox.ac.uk/~az/lectures/est/iris.pdf) only allows a single bit change in rotation between any adjacent phase quadrants. Importantly, when a response falls very closely to the boundary between adjacent quadrants, its resulting code is considered a fragile bit. These fragile bits are usually less stable and could flip values due to changes in illumination, blurring or noise. There are many methods to deal with fragile bits, and one such method could be to assign them lower weights during matching.

When multi-scale Gabor filtering is applied to a given iris image, multiple iris codes are produced accordingly and concatenated to form the final iris template. Depending on the number of filters and their stride factors, an iris template can be several orders of magnitude smaller than the original iris image.

[**Robustness of iris codes**](https://whitepaper.worldcoin.org/#robustness-of-iris-codes)

Because iris codes are generated based on the phase responses from Gabor filtering, they are rather robust against illumination, blurring and noise. To measure this quantitatively, each effect is added, namely, illumination (gamma correction), blurring (Gaussian filtering), and Gaussian noise to an iris image, respectively, in slow progression and measure the drift of the iris code. The amount of added effect is measured by the Root Mean Square Error (RMSE) of pixel values between the modified and original image, and the amount of drift is measured by the Hamming distance between the new and original iris code. Mathematically, RMSE is defined as:

<figure><img src="/files/aR3zoGzbfpYFMqLKhErB" alt=""><figcaption></figcaption></figure>

where N is the number of pixels in the original image I and the modified image I′. The Hamming distance is defined as:

<figure><img src="/files/OlmpgBADDZ8ffzEuMxbN" alt=""><figcaption></figcaption></figure>

where K is the number of bits (0/1) in the original iris code C and the new iris code C′. A Hamming distance of 0 means a perfect match, while 1 means the iris codes are completely opposite. The Hamming distance between two randomly generated iris codes is around 0.5.

The following figures help explain the impact of illumination both visually and quantitatively, blurring and noise on the robustness of iris codes. For illustration purposes, these results are not generated with the actual filters that are deployed but nevertheless demonstrate the property in general of Gabor filtering. Also, the iris image has been normalized from a donut shape in the cartesian coordinates to a fixed-size rectangular shape in the polar coordinates. This step is necessary to standardize the format, mask-out occlusion and enhance the iris texture.

As shown in the figure below, iris codes are very robust against grey-level transformations associated with illumination as the HD barely changes with increasing RMSE. This is because increasing the brightness of pixels reduces the dynamic range of pixel values, but barely affects the frequency or spatial properties of the iris texture.

<figure><img src="/files/cP6vt0NCR0HHGTGzKxrJ" alt=""><figcaption><p>An animation showcasing the effect of varying illumination levels on the robustness of iris codes. Each frame represents an increase in illumination, portrayed through the Root Mean Square Error (RMSE) between images (blue line) and the Hamming Distance (HD) between corresponding iris codes (green line).</p></figcaption></figure>

Blurring, on the other hand, reduces image contrast and could lead to compromised iris texture. However, as shown below, iris codes remain relatively robust even when strong blurring makes iris texture indiscernible to naked eyes. This is because the phase information from Gabor filtering captures the location and presence of texture rather than its strength. As long as the frequency or spatial property of the iris texture is present, though severely weakened, the iris codes remain stable. Note that blurring compromises high frequency iris texture, therefore, impacting high frequency Gabor filters more, which is why a bank of multi-scale Gabor filters are used.

<figure><img src="/files/tqQXWzoYUdtkC4Tcf1UM" alt=""><figcaption><p>An animation illustrating the impact of blurring on the robustness of iris codes. The blurring intensifies with each frame, as demonstrated by the Root Mean Square Error (RMSE) between images (blue line) and the Hamming Distance (HD) between corresponding iris codes (green line).</p></figcaption></figure>

Finally, observe bigger changes in iris codes when Gaussian noise is added, as both spatial and frequency components of the texture are polluted and more bits become fragile. When the iris texture is overwhelmed with noise and becomes indiscernible, the drift in iris codes is still small with a Hamming distance below 0.2, compared to matching two random iris codes (≈0.5). This demonstrates the effectiveness of iris feature generation using Gabor filters even in the presence of noise.

<figure><img src="/files/RQd0fRyDaJL1fFB7pbMY" alt=""><figcaption><p>An animation demonstrating the impact of noise on the robustness of iris codes. With each successive frame, the level of noise is increased, shown through Root Mean Square Error (RMSE) between images (blue line) and Hamming Distance (HD) between corresponding iris codes (green line).</p></figcaption></figure>

[**Conclusion**](https://whitepaper.worldcoin.org/#conclusion-2)

Iris feature generation is a necessary and important step in iris recognition. It reduces the dimensionality of the iris representation from a high resolution image to a much lower dimensional binary code, while preserving the most discriminative texture features using a bank of Gabor filters. It is worth noting that Gabor filters have their own limitations, for example, one cannot design Gabor filters with arbitrarily wide bandwidth while maintaining a near-zero DC component in the even-symmetric filter. This limitation can be [overcome](https://opg.optica.org/josaa/fulltext.cfm?uri=josaa-4-12-2379\&id=2980) by using the Log Gabor filters. In addition, Gabor filters are not necessarily optimized for iris texture, and machine-learned iris-domain specific filters (e.g. BSIF) have the [potential](https://ieeexplore.ieee.org/document/8658238) to achieve further improvements in feature generation and recognition performance in general. Moreover, the project’s contributors are investigating novel approaches to leverage higher quality images and the latest advances in the field of deep metric learning and deep representation learning to push the accuracy of the system beyond the state-of-the-art to make the system as inclusive as possible.

As the resilience of iris feature generation amidst external factors was showcased, it is crucial to note that even minor fluctuations in iris code variability hold significant importance when dealing with a billion people, as the tail-end of the distribution dictates the error rates, thus influencing the number of false rejections.

[**Iris Inference System**](https://whitepaper.worldcoin.org/#iris-inference-system)

Building upon the theoretical foundation established in the previous sections, this section now focuses on the practical application of these principles within the Worldcoin project. Having explored the scalability of iris recognition technology and the process of feature generation using Gabor wavelets, this section explains the details of the image processing. By the end of this section, one will have a thorough understanding of how Worldcoin's iris recognition algorithm functions to ensure accurate and scalable verification of an individual's uniqueness.

[**Pipeline overview**](https://whitepaper.worldcoin.org/#pipeline-overview)

The objective of this pipeline is to convert high-resolution infrared images of a human's left and right eye into an iris code: a condensed mathematical and abstract representation of the iris' entropy that can be used for verification of uniqueness at scale. Iris codes have been introduced by John Daugman in [this paper](https://www.cl.cam.ac.uk/~jgd1000/PAMI93.pdf) and remain to this day the most widely used way to abstract iris texture in the iris recognition field. Like most state-of-the-art iris recognition pipelines, Worldcoin’s pipeline is composed of four main segments: segmentation, normalization, feature generation and matching.

Refer to the image below for an example of a high resolution image of the iris acquired in the near infrared spectrum. The right hand side of the image shows the corresponding iris code, which is itself composed of response maps to two 2D Gabor wavelets. These response maps are quantized in two bits so that the final iris code has dimensions of , being the number of radial and angular positions where these filters are applied. For more details, see check the previous section. While only the iris code of one eye is shown below, note that an iris template consists of the iris codes from both eyes.

<figure><img src="/files/3OqK3pkTWW1l281AvPUD" alt=""><figcaption><p>Example of an input and output of the biometric pipeline. Fig. 1.a is an example of an infrared iris texture image taken by the Orb. Fig. 1.b is an example of an iris code produced from the iris texture image in Fig. 1.a, effectively aggregating the iris texture.</p></figcaption></figure>

The purpose of the segmentation step is to understand the geometry of the input image. The location of the iris, pupil, and sclera are determined, as well as the dilation of the pupil and presence of eyelashes or hair covering the iris texture. The segmentation model classifies every pixel of the image as pupil, iris, sclera, eyelash, etc. These labels are then post-processed to understand the geometry of the subject's eye.

The image and its geometry then passes through tight quality assurance. Only sharp images where enough iris texture is visible are considered valid, because the quantity and quality of available bits in the final iris codes directly impact the system's overall performance.

Once the image is segmented and validated, the normalization step takes all the pixels relevant to the iris texture and unfolds them into a stable cartesian (rectangular) representation.

The normalized image is then converted into an iris code during the feature generation step. During this process, a Gabor wavelet kernel convolves across the image, converting the iris texture into a standardized iris code. For every point in a grid overlapping the image, two bits that represent the sign of the real and complex components of the filter response are derived, respectively. This process synthesizes a unique representation of the iris texture, which can easily be compared with others by using the Hamming distance metric. This metric quantifies the proportion of bits that differ between any two compared iris codes.

The following sections will explain each of the aforementioned steps in more detail, by following the journey of an example iris image through the biometric pipeline. This image was taken by the Orb, during a signup in the TFH lab. It is shared with user consent and faithfully represents what the camera sees during a live uniqueness verification.

The eye is a remarkable system that exhibits various dynamic behaviors, including blinking, squinting, closing, as well as the ability of the pupil to dilate or constrict and the eyelashes or any object to cover the iris. The following section also explores how the biometric pipeline can be robust in the presence of such natural variability.

[**Segmentation**](https://whitepaper.worldcoin.org/#segmentation)

Iris recognition was [first developed](https://www.cl.cam.ac.uk/~jgd1000/PAMI93.pdf) in 1993 by John Daugmann and, although the field has advanced since the turn of the millennium, it continues to be heavily influenced by legacy methods and practices. Historically, the morphology of the eye in [iris recognition](https://ieeexplore.ieee.org/document/1262028) has been identified using classical computer vision methods such as the Hough Transform or circle fitting. In recent years, Deep Learning has brought about significant improvements in the field of computer vision, providing new tools for understanding and analyzing the eye physiology with unprecedented depth.

Novel methods for segmenting high-resolution infrared iris images are proposed in [Lazarski et al](https://arxiv.org/pdf/2209.15471.pdf) by the Tools for Humanity team. The architecture consists of an encoder that is shared by two decoders: one that estimates the geometry of the eye (pupil, iris, and eyeball) and the other that focuses on noise, i.e., non-eye-related elements that overlay the geometry and potentially obscure the iris texture (eyelashes, hair strands, etc.). This dichotomy allows for easy processing of overlapping elements and provides a high degree of flexibility in training these detectors. The architecture takes into account the [DeepLabv3+](https://arxiv.org/abs/1802.02611) architecture with a MobileNet v2 backbone.

Acquiring labels for noise elements is significantly more time-consuming than acquiring labels for geometry, as it requires a high level of precision for identifying intertwined eyelashes. It takes 20 to 80 minutes to label eyelashes in a single image, depending on the levels of blur and the subject's physiology, while it only takes about 4 minutes to label the geometry to required levels of precision. For that reason, noise objects (e.g. eyelashes) are decoupled from geometry objects (pupil, iris and sclera) which allows for significant financial and time savings combined with a quality gain.

The model was trained over a mix of Dice Loss and Boundary Loss. The Dice loss can be expressed as

<figure><img src="/files/om1OFay3SwKNXFYHCFRV" alt=""><figcaption></figcaption></figure>

being the one-hot encoded ground truth,the model’s output for the pixel (i,j) as a probability. The third index k represents the class (e.g. pupil, iris, eyeball, eyelash or background). The Dice loss essentially measures the similarity between two sets, i.e. the label and the model's prediction.

Accurate identification of the boundaries of the iris is essential for successful iris recognition, as even a small warp in the boundary can result in a warp of the normalized image along the radial direction. To address this, a weighted cross-entropy loss was also introduced that focuses on the zone at the boundary between classes, in order to encourage sharper boundaries. It is mathematically represented as:

<figure><img src="/files/nNRWHWDyydnP2KgrG7YG" alt=""><figcaption></figcaption></figure>

with the same notations as before,being the boundary weight, which represents how close the pixel (i,j) is to the boundary between class k and any other class. A Gaussian blur is then applied to the contour to prioritize the precision of the model on the exact boundary while keeping a lower degree of focus on the general area around it.

<figure><img src="/files/aht48j1AyEuPBhGZK8RC" alt=""><figcaption></figcaption></figure>

being the distance between the point,as the minimum of the euclidean distances between (i,j) and all points, is the boundary between class the Gaussian distribution centered at 0 with some finite variance.

Experiments were conducted with other loss functions (e.g. convex prior), architectures (e.g. single-headed model), and backbones (e.g. ResNet-101) and this setup was found to have the best performance in terms of accuracy and speed. The following graph shows the iris image overlayed by the segmentation maps as predicted by the model. In addition, landmarks are displayed calculated by a separate quality assessment AI model during the image capture phase. This model produces quality metrics to ensure that only high-quality images are used in the segmentation phase and that the iris code is generated accurately for verification of uniqueness: sharp image focused on the iris texture, well-opened eye gazing in the camera, etc.

<figure><img src="/files/UGmEroCD6v6yy8Tk3kjX" alt=""><figcaption><p>Segmentation of an iris image. The AI model detects the different regions of interest of the eye in order to isolate the relevant iris texture and assess the overall image quality. This exemplifies the outcome of an employee signup conducted in the lab of Tools for Humanity.</p></figcaption></figure>

[**Normalization**](https://whitepaper.worldcoin.org/#normalization)

The goal of this step is to separate meaningful iris texture from the rest of the image (skin, eyelashes, sclera, etc.). To achieve this, the iris texture is projected from its original cartesian coordinate system to a polar coordinate system, as illustrated in the following image. The iris orientation is defined as the vector pointing from one pupil center to the other pupil center of the opposite eye.

<figure><img src="/files/QxdFAq79F08K2vBSOyfc" alt=""><figcaption><p>Scheme of the normalization process.</p></figcaption></figure>

This process reduces variability in the image by canceling out variations such as the person’s distance from the camera, the pupil constriction or dilation due to the amount of light in the environment, and the rotation of the person’s head. The image below illustrates the normalized version of the iris above. The two arcs of circles visible in the image are the eyelids, which were distorted from their original shape during the normalization process.

<figure><img src="/files/FzWRTc7h4uHO5AXAFYhJ" alt=""><figcaption><p>Normalized iris texture. The texture is sharp and its patterns are clearly visible.</p></figcaption></figure>

[**Feature generation**](https://whitepaper.worldcoin.org/#feature-generation)

Now that a stable, normalized iris texture is produced, an [iris code](https://whitepaper.worldcoin.org/#the-iris-code) can be coded that can be matched at scale. In short, various Gabor filters stride across the image and threshold its complex-valued response to generate two bits representing the existence of a line (resp. edge) at every selected point of the image. This technique, pioneered by John Daugmann, and the subsequent iterations proposed by the iris recognition research community, remains state-of-the-art in the field.

<figure><img src="/files/WubGwsRMOiElOeBCQ4cI" alt=""><figcaption><p>Final iris code. This is the anonymized iris texture expressing one's uniqueness.</p></figcaption></figure>

[**Matching**](https://whitepaper.worldcoin.org/#matching)

Now that the iris texture is transformed into an iris code, it is ready to be matched against other iris codes. To do so, a masked fractional Hamming Distance (HD) was used: the proportion of non-masked iris code bits that have the same value in both iris codes.

Due to the parametrization of the Gabor wavelets, the value of each bit is equally likely to be 0 or 1. As the iris codes described above are made of more than 10,000 bits, two iris codes from different subjects will have an average Hamming distance of 0.5, with most (99.95%) iris codes deviating less than 0.05 HD away from this value (99.9994% deviating less than 0.07 HD). As several rotations of the iris code are compared to find the combination with highest matching probability, this average of 0.5 HD moves to 0.45 HD, with a $$1.6×10−7$$ probability of being lower than 0.38 HD.

It is therefore an extreme statistical anomaly to see two different eyes producing iris codes with a distance lower than 0.38 HD. On the contrary, two images captured of the same eye will produce iris codes with a distance generally below 0.3 HD. Applying a threshold in between allows the ability to reliably distinguish between identical and different identities.

To validate the quality of the algorithms at scale, their performance was evaluated by collecting 2.5 million pairs of high-resolution infrared iris images from 303 different subjects. These subjects represent diversity across a range of characteristics, including eye color, skin tone, ethnicity, age, presence of makeup and eye disease or defects. Note that this data was not collected during field operations but stems from contributors to the Worldcoin project and from paid participants in a dedicated session organized by a respected partner. Using these images and their corresponding ground truth identities, the false match rate (FMR) and false non match rate (FNMR) of the system was measured.

<figure><img src="/files/NvUUZrN4ZdkvgyRxhEAl" alt=""><figcaption><p>Match and Non-Match Distribution</p></figcaption></figure>

From 2.5 million image pairs, all were correctly classified as either a match or non-match. Additionally, the margin between the match and non-match distributions is wide, providing a comfortable margin of error to accommodate for potential outliers.

The match distribution presents two clear peaks, or maxima. The peak on the left (HD≈0.08) corresponds to the median Hamming distance for pairs of images taken from the same person during the same capture process. This means that they are extremely similar, as one would expect from two images of the same person. The peak on the right (HD≈0.2) represents the median Hamming distance for pairs of images taken from the same person but during different enrollment processes, often weeks apart. These are less similar, reflecting the naturally occurring variations in the same person's images taken at different times like pupil dilation, occlusion and eyelashes. Systems to narrow the matches distribution are continuously being iterated on: better auto-focus and AI-Hardware interactions, better real-time quality filters, Deep Learning feature generation, image noise reduction, etc.

As there were no misclassified iris pairs, FMR and FNMR cannot be calculated exactly. However, an upper bound for both rates can be estimated:

<figure><img src="/files/aAxcFElaDriAJ7d2wcJw" alt=""><figcaption></figcaption></figure>

With these numbers, uniqueness on a billion people scale can be verified with very high accuracy. However, also acknowledged is the fact that the dataset used for this evaluation could be enlarged and more effort is needed to build larger and even more diverse datasets to more accurately estimate the biometric performance.

[**Conclusions**](https://whitepaper.worldcoin.org/#conclusions)

In this section, the key components of Worldcoin’s uniqueness verification pipeline are presented. It illustrated how the use of a combination of deep learning models for image quality assessment and image understanding, in conjunction with traditional feature generation techniques, enables accurate verification of uniqueness on a global scale.

However, work in this area is ongoing. Currently, the team at TFH is researching an end-to-end Deep Learning model, which could yield faster and even more accurate uniqueness verification.

[**Iris Code Upgrades**](https://whitepaper.worldcoin.org/#iris-code-upgrades)

While the accuracy of the uniqueness verification algorithm of the Orb is already very high (with, specifically a false match rate of 1 in 40 trillion (1:1 match), an even higher accuracy would be [beneficial on a billion people scale](https://whitepaper.worldcoin.org/#deduplication-3). To this end, the biometric algorithm is continuously being developed and will be upgraded over time.

There are three key types of upgrades that can increase accuracy further:

[**Image preprocessing upgrades.**](https://whitepaper.worldcoin.org/#image-preprocessing-upgrades)

These upgrades, which are *backwards compatible*, modify everything except the final step in the process: iris code feature generation. Elements such as the segmentation network and image quality thresholds are typical areas of improvement. For an in-depth look at the preprocessing algorithms, please refer to the [image processing section](https://whitepaper.worldcoin.org/#iris-inference-system-2). These types of upgrades generally occur multiple times a year.

**Iris code generation upgrades for future verifications.** Also *backwards compatible*, these upgrades involve modifying the iris code feature generation algorithm without recomputing previous iris codes. Such an upgrade involves the introduction of v2 codes, which are not compatible with the older v1 codes. Both v1 and v2 codes would be compared against their respective sets. If both comparisons result in no collision, the v2 code is added to the set of v2 codes. This way, the set of v1 codes doesn’t grow any more, yet none of the individuals who are part of the v1 set can get a second World ID.

In the event of an upgrade to the feature generation algorithm, the corresponding false match rate evolve as follows:

<figure><img src="/files/gXvbCqp3eobVFxZ7tJih" alt=""><figcaption></figcaption></figure>

For an in-depth understanding of the error rates, please revisit the relevant information in the section [biometric performance on a billion people scale](https://whitepaper.worldcoin.org/#biometric-performance-at-a-billion-people-scale-2). From the above equation we can deduce that the likelihood of the i-th legitimate user experiencing a false match continues to increase with the expansion of the v2 set. However, provided that $$���2<���1$$, the rate of this growth is significantly reduced. For any individual included in the v1 set, the false non-match rate remains unaffected. For new enrollments, the false non-match rate of the v2 algorithm applies.

In principle, several such iris code versions can be stacked. This type of upgrade is expected to happen about once a year.

**Recomputing existing iris codes.** These upgrades *may or may not be backwards compatible*, depending on whether the original image is still available. These are expected to occur less frequently than iris code generation upgrades and to become less frequent over time.

To understand when a recomputation might be required, let us define the number of codes in the set of v1 codes as $$�1$$ and similarly for v2 codes. If the error rates of the v1 code are much worse than the ones of v2 codes and therefore have major influence on the false match rate even at $$�2≫�1$$, the set of v1 codes should eventually be recomputed. For this to be possible, the iris images need to be available. This can happen in several ways:

*Re-capture images*. Individuals could return to an Orb. Depending on the distance to an Orb and individual preferences this may or may not be a realistic option.

*Custodial image storage*. Upon request, the issuer can securely store the images and automatically recompute the iris code if necessary. Currently, this is an option for individuals, but it is likely to be discontinued with the introduction of self custodial image storage.

*Self custodial image storage*. Expected to be introduced in late 2023, this option allows individuals to store their signed and end-to-end encrypted images on their device. For recomputing iris codes, individuals can upload their images temporarily to a dedicated, audited cloud environment that deletes images upon recomputation, or perform the computation locally on their phones. To ensure integrity, the local computation requires the upgrade to happen within a zero-knowledge proof, necessitating the use of Zero-Knowledge Machine Learning (ZKML) on the individual’s phone. The feasibility of this approach depends on the computational capabilities of the individual’s phone and ongoing ZKML research.

If local computation or temporary upload isn't viable or preferable, individuals can always revisit an Orb where the iris code is computed locally.


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://worlddao.gitbook.io/worlddao-white-paper/technical-implementation/biometrics.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
