Introducing: Vaultree Encrypted Transformers
Text is one of the most challenging data modalities to process in machine learning: long sequences full of complex grammar, synonyms, negation, and slang. It is also one of the most sensitive, capturing our complex health needs, our confidential intellectual property, and our clandestine secrets.
Initial forays into representing text to an algorithm took a statistical approach. Methods such as bag-of-words and TF-IDF were commonly employed for tasks ranging from spam detection to sentiment analysis. However, these approaches produced rigid, isolated representations. The introduction of word embeddings ushered in a new era of context in Natural Language Processing (NLP); suddenly, tasks such as machine translation were within reach. The length of texts still proved a challenge, however: some simple concepts can take hundreds of words to explain. This challenge was addressed with the introduction of the Transformer: a deep learning architecture that features dizzying concepts such as multi-headed attention, shared weights, and deep, non-linear layers.
Here at Vaultree, we have developed the next evolution of NLP: Encrypted Transformers. Read on for a guide to how this works!
Fully Homomorphic Encryption
The key to the success of our encrypted transformers is the underlying technology of Fully Homomorphic Encryption (FHE). Our proprietary FHE scheme offers the ability to add and multiply encrypted values, evolving encryption from at-rest or in-transit to in-use. The scalability of our scheme means that we can build complex AI algorithms on top of it and, in turn, process sensitive data accurately and efficiently.
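Vaultree's scheme itself is proprietary, but a toy construction can show what computing on ciphertexts means in practice. The sketch below is a minimal, deliberately insecure DGHV-style symmetric scheme over single bits; the parameters and function names are ours, purely for illustration. Adding two ciphertexts adds the underlying bits, and multiplying them multiplies the underlying bits, all without ever decrypting.

```python
import secrets

# Toy DGHV-style symmetric scheme over single bits. Illustrative only:
# the parameters are far too small to be secure, and this is not
# Vaultree's proprietary scheme.

def keygen(bits=256):
    # Secret key: a large odd integer p.
    return (1 << (bits - 1)) | secrets.randbits(bits) | 1

def encrypt(p, m, noise_bits=16, mult_bits=384):
    # c = m + 2r + p*q: the bit m hides in the parity of (c mod p).
    assert m in (0, 1)
    return m + 2 * secrets.randbits(noise_bits) + p * secrets.randbits(mult_bits)

def decrypt(p, c):
    # Reduce mod p to strip p*q, then take parity to strip the noise 2r.
    return (c % p) % 2

p = keygen()
a, b = encrypt(p, 1), encrypt(p, 0)

# Homomorphic operations are plain integer arithmetic on ciphertexts.
assert decrypt(p, a + b) == (1 + 0) % 2  # encrypted addition (XOR)
assert decrypt(p, a * b) == 1 * 0        # encrypted multiplication (AND)
```

Each operation grows the hidden noise term, which is why practical FHE schemes add machinery such as bootstrapping to keep that noise in check.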
Encrypted Transformers in Action
One of the key challenges in healthcare is that critical information is often stored as unstructured text, such as medical charts. A further challenge is that, given the sensitivity of this data, the insights of healthcare providers are isolated from each other. Encrypted transformers, along with VEDS, address both of these gaps. The result is a network of healthcare providers safely pooling encrypted medical chart data and massively boosting the training of an encrypted transformer via federated learning, as sketched below.
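To make the federated step concrete, here is a minimal federated-averaging loop (in the spirit of the FedAvg algorithm) over a toy logistic-regression model. Everything here, from the synthetic data to the learning rate, is an illustrative assumption; in the real system the exchanged updates would themselves be encrypted, so the aggregator never sees plaintext weights.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One provider trains a toy logistic-regression model on its own charts."""
    w = weights.copy()
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-X @ w))  # sigmoid
        grad = X.T @ (preds - y) / len(y)     # gradient of the logistic loss
        w -= lr * grad
    return w

def fed_avg(updates, sizes):
    """Aggregate local models, weighting each provider by its dataset size."""
    total = sum(sizes)
    return sum(w * (n / total) for w, n in zip(updates, sizes))

rng = np.random.default_rng(0)
global_w = np.zeros(8)
# Three providers, each holding 50 synthetic "charts" with 8 features.
providers = [(rng.normal(size=(50, 8)), rng.integers(0, 2, 50).astype(float))
             for _ in range(3)]

for _ in range(10):  # federated rounds
    updates = [local_update(global_w, X, y) for X, y in providers]
    global_w = fed_avg(updates, [len(y) for _, y in providers])
```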
This is a critical solution for areas such as rare-disease classification. In the following example, we will take a look at the intake chart of a fictional patient with Wilson's disease (a genetic disorder that affects the brain and liver).
The first step is to tokenize this text, splitting it into tokens that are mapped to dense word embeddings. These embeddings are then encrypted. The encrypted embeddings are combined with positional encodings, which help the model keep track of where each encrypted word sits in the sequence, and are then fed into a multi-headed attention layer. An attention head does what it says: it pays attention to the tokens that are most important and scores them accordingly. Finally, a feedforward layer takes these representations and analyses them for classification. The result is an encrypted decision that can be decrypted by the appropriate party, who holds the private key.
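The sketch below traces this forward pass in plaintext NumPy, with toy shapes and random weights standing in for a trained model. In the encrypted version, each matrix multiply and addition below would run on ciphertexts, and non-polynomial steps such as softmax would typically be replaced by polynomial approximations that FHE can evaluate.

```python
import numpy as np

rng = np.random.default_rng(1)
vocab, d_model, n_classes, seq_len = 1000, 32, 4, 16

def positional_encoding(seq_len, d_model):
    # Standard sinusoidal encodings: even dimensions get sine, odd get cosine.
    pos = np.arange(seq_len)[:, None]
    dim = np.arange(d_model)[None, :]
    angles = pos / np.power(10000, (2 * (dim // 2)) / d_model)
    return np.where(dim % 2 == 0, np.sin(angles), np.cos(angles))

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Toy parameters; a trained model would supply these.
E = rng.normal(size=(vocab, d_model))          # embedding table
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
W_out = rng.normal(size=(d_model, n_classes))  # classification head

token_ids = rng.integers(0, vocab, seq_len)    # stand-in for a tokenized chart
x = E[token_ids] + positional_encoding(seq_len, d_model)

# A single attention head: score every token against every other token.
q, k, v = x @ Wq, x @ Wk, x @ Wv
attn = softmax(q @ k.T / np.sqrt(d_model)) @ v

# Feedforward classification over the pooled sequence representation.
logits = attn.mean(axis=0) @ W_out
prediction = int(np.argmax(logits))            # e.g. a diagnosis class
```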
In this scenario, we have a doctor who is unsure how to treat a patient with unusual symptoms. With Vaultree's technology, the doctor can take this unstructured text, encrypt it, and send it to a centralised encrypted transformer for classification. At no point in that journey outside of the healthcare provider is the sensitive data exposed. Once a classification is received, the doctor can safely decrypt the result and use the AI's classification as a guide to establishing an accurate diagnosis.
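Reusing the toy bit scheme from the earlier sketch, the following snippet marks out the trust boundary in that round trip. run_encrypted_model is a hypothetical stand-in for the centralised encrypted transformer, here simply XOR-ing the input bits via ciphertext addition; the point is that the secret key never leaves the provider.

```python
import secrets

def keygen(bits=256):
    return (1 << (bits - 1)) | secrets.randbits(bits) | 1

def encrypt(p, m):
    return m + 2 * secrets.randbits(16) + p * secrets.randbits(384)

def decrypt(p, c):
    return (c % p) % 2

def run_encrypted_model(enc_bits):
    # Stand-in "model": XOR of the inputs, computed blindly by summing
    # ciphertexts. The server never holds p or any plaintext.
    total = 0
    for c in enc_bits:
        total += c
    return total

p = keygen()                                   # stays inside the hospital
chart_bits = [1, 0, 1]                         # toy encoding of the chart
enc_chart = [encrypt(p, b) for b in chart_bits]

enc_decision = run_encrypted_model(enc_chart)  # runs outside the hospital
print(decrypt(p, enc_decision))                # 0; only the key holder learns it
```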
Applications of Encrypted Transformers
There is a wealth of use cases that can benefit from such state-of-the-art solutions. The text could be legal documents, or the internal strategies of an enterprise. Any scenario where the privacy of individuals must be protected while unstructured text is parsed is open to exploration!
The Future of Encrypted Transformers
The astute reader may recognise transformers as the building blocks of Large Language Models (LLMs) such as ChatGPT. The very same technology that is used to classify complex text can also generate it! Imagine a future where you can be assured of private conversations with an AI, with no risk of your Personally Identifiable Information or Intellectual Property being exposed. That future rests with Vaultree.
Ready to Transform Your Data Security?
Discover how VEDS can revolutionise your data sharing and collaboration. Contact us today to learn more or request a demo.