Technology

Efficient Artificial Intelligence Training

Artificial intelligence not only delivers outstanding performance but also demands a great deal of energy: the more demanding the tasks it is trained for, the more energy it consumes.

Víctor López-Pastor and Florian Marquardt of the Max Planck Institute for the Science of Light in Erlangen, Germany, describe a method that could train artificial intelligence far more efficiently. Their approach relies on physical processes rather than the digital artificial neural networks currently in use.

OpenAI, the company behind ChatGPT, has not revealed how much energy was required to train GPT-3, the model that makes ChatGPT a fluent and seemingly well-informed chatbot.

According to the German statistics company Statista, the training would have required 1,000 megawatt hours – roughly the amount consumed annually by 200 German households of three or more people. While this energy investment allowed GPT-3 to learn whether the term ‘deep’ is more likely to be followed by the word ‘sea’ or ‘learning’ in its data sets, by all accounts it has not grasped the underlying meaning of such phrases.
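As a rough plausibility check of this comparison, the reported training energy can simply be divided across the 200 households; the per-household consumption that results is an illustrative assumption, not a figure from the article. A minimal sketch:

```python
# Rough sanity check: 1,000 MWh of training energy spread over 200 households.
training_energy_mwh = 1_000          # training energy reported by Statista for GPT-3
households = 200

per_household_kwh = training_energy_mwh * 1_000 / households
print(f"{per_household_kwh:,.0f} kWh per household per year")
# -> 5,000 kWh, a plausible annual consumption for a German household of
#    three or more people (assumed here purely for illustration).
```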

We have developed the concept of a self-learning physical machine. The core idea is to carry out the training in the form of a physical process, in which the parameters of the machine are optimized by the process itself.

Florian Marquardt

Neural networks on neuromorphic computers

To reduce the energy consumption of computers, and of AI applications in particular, several research institutions have spent the last few years exploring an entirely new concept of how computers could process data in the future. This concept is known as neuromorphic computing. Although it sounds similar to artificial neural networks, it has little to do with them, because artificial neural networks run on conventional digital computers. There, the software, or more precisely the algorithm, is modelled on the way the brain works, but digital computers serve as the hardware. They perform the computation steps of the neural network sequentially, one after the other, keeping processor and memory strictly separate.
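What this sequential evaluation with separate processor and memory looks like in practice can be sketched in a few lines; the layer sizes and the NumPy implementation below are assumptions for illustration, not part of the research described here:

```python
import numpy as np

# Illustrative sketch: a conventional digital computer evaluates a neural
# network strictly layer by layer, and every weight matrix must first be
# fetched from memory into the processor before it can be used.
rng = np.random.default_rng(0)
layer_sizes = [784, 256, 64, 10]                      # assumed sizes, purely illustrative
weights = [rng.standard_normal((m, n)) * 0.01
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x):
    for w in weights:                                 # sequential: one layer after another
        # On real hardware this step means moving the whole matrix `w` from
        # memory to the processor -- the data transfer whose energy cost is
        # discussed below.
        x = np.tanh(x @ w)
    return x

print(forward(rng.standard_normal(784)).shape)        # (10,)
```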

“When a neural network trains hundreds of billions of parameters, i.e. synapses, with up to one terabyte of data, the data transfer between these two components alone consumes large amounts of energy,” says Florian Marquardt, director of the Max Planck Institute for the Science of Light and professor at the University of Erlangen. The human brain works entirely differently and would never have been evolutionarily competitive had it operated with an energy efficiency similar to that of computers with silicon transistors. It would most likely have failed due to overheating.

The brain is characterized by carrying out the numerous steps of a thought process in parallel rather than sequentially. The nerve cells, or more precisely the synapses, are both processor and memory in one. Various systems around the world are being investigated as possible neuromorphic counterparts to our nerve cells, including photonic circuits, which use light instead of electrons to perform calculations. Their components serve simultaneously as switches and memory cells.

Efficient training for artificial intelligence

A self-learning physical machine optimizes its synapses independently

Together with Víctor López-Pastor, a doctoral student at the Max Planck Institute for the Science of Light, Florian Marquardt has now devised an efficient training method for neuromorphic computers. “We have developed the concept of a self-learning physical machine,” explains Florian Marquardt. “The core idea is to carry out the training in the form of a physical process, in which the parameters of the machine are optimized by the process itself.”

Training conventional artificial neural networks requires external feedback to adjust the strengths of the many billions of synaptic connections. “Not requiring this feedback makes the training much more efficient,” Florian Marquardt argues. Implementing and training artificial intelligence on a self-learning physical machine would save not only energy but also computing time. “Our method works regardless of which physical process takes place in the self-learning machine, and we do not even need to know the exact process,” the physicist adds.
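To make concrete what this external feedback consists of, here is a minimal sketch of the explicit update loop used in conventional gradient-based training; it is the standard procedure being contrasted with, not the researchers’ method, and the toy model and numbers are assumptions:

```python
import numpy as np

# Conventional training loop: measure the error, compute a gradient, and
# update every parameter explicitly from the outside. This explicit feedback
# step is what a self-learning physical machine would dispense with.
rng = np.random.default_rng(1)
w = rng.standard_normal(3)                   # toy parameters ("synapses")
x, target = rng.standard_normal(3), 0.5      # one illustrative training example
learning_rate = 0.1

for step in range(100):
    prediction = w @ x
    error = prediction - target              # external feedback: compare output with target
    gradient = error * x                     # gradient of 0.5 * error**2 with respect to w
    w -= learning_rate * gradient            # explicit parameter update from outside

print(round(float(w @ x), 3))                # approaches the target of 0.5
```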

“However, the process must fulfill a few conditions,” explains Florian Marquardt. “Most importantly, it must be reversible, meaning it can run forwards or backwards with minimal loss of energy. In addition, the physical process must be non-linear, that is, sufficiently complex.” Only non-linear processes can accomplish the complicated transformations between input data and results. A pinball rolling across a plate without colliding with another ball is a linear process; if it is deflected by another ball, the situation becomes non-linear.
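Why non-linearity is essential can be seen in a small sketch that has nothing to do with the optical hardware discussed below: chaining linear operations always collapses into a single linear operation, whereas inserting a non-linearity does not:

```python
import numpy as np

# Composing linear maps only ever yields another linear map, so a purely
# linear process cannot realize complicated input-output transformations.
rng = np.random.default_rng(2)
A, B = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
x = rng.standard_normal(4)

two_linear_steps = B @ (A @ x)                       # two linear operations in sequence...
single_step = (B @ A) @ x                            # ...are equivalent to one matrix
print(np.allclose(two_linear_steps, single_step))    # True: still a linear map

nonlinear_steps = B @ np.tanh(A @ x)                 # with a non-linearity in between
print(np.allclose(nonlinear_steps, single_step))     # False: no single-matrix equivalent
```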

Practical test in an optical neuromorphic computer

Optics offers examples of reversible, non-linear processes. Indeed, Víctor López-Pastor and Florian Marquardt are already working with an experimental team on an optical neuromorphic computer. This machine processes data in the form of superimposed light waves, with suitable components controlling the type and strength of the interaction. The researchers’ aim is to put the concept of the self-learning physical machine into practice. “In three years,” says Florian Marquardt, “we hope to be able to present the first self-learning physical machine.” By then, there are likely to be neural networks with far more synapses, trained on much larger amounts of data than today’s.

As a result, the desire to realize neural networks outside conventional digital computers and to replace them with efficiently trained neuromorphic computers will most likely grow even stronger. “We are therefore confident that self-learning physical machines have a strong chance of being used in the further development of artificial intelligence,” the scientist says.