The challenges of modern computing and new opportunities for optics

 

Introduction

 

The extraordinary development of complementary metal-oxide-semiconductor (CMOS) technology has facilitated the unprecedented success of integrated circuits. As predicted by Gordon E. Moore in 1965, the number of transistors on a computing chip doubles every 18–24 months. Moreover, Dennard's scaling rules further explain the benefit of reducing a transistor's dimensions. Today, Moore's law has made central processing units (CPUs) 300 times faster than they were in 1990. However, such incredible development is unsustainable, as predicted by the International Technology Roadmap for Semiconductors (ITRS) in 2016. Beyond the 5 nm technology node, it is difficult for the semiconductor industry to move forward. In addition, the proliferation of artificial intelligence (AI) applications creates exponentially increasing amounts of data that can hardly be processed by conventional computing systems and architectures. This widening gap has spurred numerous investigations of novel approaches and alternative architectures for data processing.

Compared with electrical devices, optical devices can process information at the speed of light with negligible energy consumption and heat generation. Furthermore, optical devices offer much better parallelism in data processing by employing multiplexing schemes such as wavelength-division multiplexing (WDM) and mode-division multiplexing (MDM). By exploiting these properties of light, the architecture and layout of many complex computing systems can potentially be simplified through the introduction of optical computing units.

In general, optical computing can be classified into two categories: digital optical computing and analog optical computing. Digital optical computing, based on Boolean logic and using mechanisms similar to general-purpose transistor-based computing, has been developed for more than 30 years. However, owing to the low integration density of optical devices, it can hardly outperform conventional digital computing. In contrast, analog optical computing utilizes the physical characteristics of light, such as amplitude and phase, together with the interactions between light and optical devices, to achieve specific computing functions. It is a form of dedicated computing, because each analog optical computing system implements one particular mathematical description of a computational process. Compared with conventional digital computing, analog optical computing can better accelerate data processing in specific tasks, such as pattern recognition and numerical computation. Therefore, as one of the most promising computing technologies in the post-Moore era, analog optical computing systems have attracted a large amount of research.

In this paper, the challenges of modern computing and the potential opportunities for analog optical computing are discussed. The first chapter briefly explains the main factors impeding the sustainability of Moore's law, the growing demands of information processing, and the latest research in the semiconductor industry. The second chapter reviews the progress of analog optical computing over the last decade in three sections. The last chapter gives a systematic analysis of hybrid computing systems, followed by a discussion of the new challenges and potential opportunities of analog optical computing.

 

 

Moore’s law and the new challenges

The challenges of Moore’s law

Originally, Moore's law and Dennard's scaling rules showed that reducing a transistor's dimensions was a viable way to boost computational capability without increasing energy dissipation. However, the continuous development of CMOS technologies has led to the failure of Dennard's scaling rules, because shrunken transistors can no longer maintain a constant power density. Raising the clock frequency of CPUs would be another plausible way to further enhance computational capability. However, at high clock frequencies, thermal effects from power dissipation become a new bottleneck for CPU performance. Today, with clock speeds constrained to around 5 GHz, the computational capabilities of CPUs are instead improved through parallel architectures.

Apart from thermal effects caused by power dissipation, the limitations of the manufacturing process also challenge Moore's law. To extend the downscaling of transistors in CPUs, new top-down patterning methods must be introduced into current manufacturing lines. Extreme ultraviolet (EUV) lithography, at a wavelength of 13.5 nm, is the core technology for extending Moore's law, because the shorter wavelength allows higher resolution. For EUV interference lithography, the theoretical limit of the half-pitch is around 3.5 nm. Similarly, electron beam lithography (EBL), another fabrication technology, can also create extremely fine integrated-circuit patterns with high resolution. Although EBL provides ultra-high resolution approaching the atomic level and works with a variety of materials, its processing is much slower and more expensive than optical lithography.
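The ~3.5 nm half-pitch figure can be checked from the standard resolution limit of two-beam interference, half-pitch = λ/(4 sin θ), which approaches λ/4 in the best case (sin θ → 1). A minimal sketch of this back-of-the-envelope check, assuming the ideal two-beam geometry:

```python
import math

def half_pitch_limit(wavelength_nm: float, sin_theta: float = 1.0) -> float:
    """Smallest printable half-pitch for two-beam interference lithography:
    HP = wavelength / (4 * sin(theta)); sin(theta) -> 1 is the best case."""
    return wavelength_nm / (4.0 * sin_theta)

# EUV at 13.5 nm gives a limit just under the ~3.5 nm quoted above.
euv = half_pitch_limit(13.5)
print(f"EUV interference half-pitch limit: {euv:.2f} nm")  # prints 3.38 nm
```

With a realistic sin θ slightly below 1, the limit rises toward the ~3.5 nm value cited in the literature.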

These scaling-down methodologies for silicon-based CMOS circuits are classified as 'More Moore' technologies, which are used to sustain Moore's law. However, as better fabrication technologies shrink the transistor's gate channel, quantum effects such as quantum tunneling and quantum scattering introduce further unpredictable problems. For example, in the latest sub-5 nm gate-all-around (GAA) fin field-effect transistors (FinFETs), the threshold voltage increases as quantum effects reduce the effective fin width. Therefore, the enhancement of computational capability cannot be sustained by continuously shrinking transistor size.

The challenges of AI applications

On top of the challenges from the physical limitations of Moore's law discussed in the "The challenges of Moore's law" section, the computational capability of conventional digital systems is also challenged by thriving AI applications. The most popular AI implementations are deep neural networks (DNNs), of which the two most important types are convolutional neural networks (CNNs) and long short-term memory (LSTM) networks. A CNN consists of a series of convolution and sub-sampling layers followed by a fully connected layer and a normalizing layer. Convolution is the main computing task during inference, while back-propagation is used solely for training the CNN's parameters . An LSTM network consists of memory-cell blocks governed by input, forget, and output gates; the output of each LSTM block is calculated from its cell value . To achieve higher output accuracy, DNNs have grown to contain ever larger numbers of parameters. The first DNN model, LeNet [10], contains only about 60 K parameters in a handful of layers. In 2012, AlexNet  became the best-performing DNN model, with 60 M parameters. Nowadays, the Megatron model  contains 3.9 G parameters and takes several weeks and millions of US dollars to train.
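The gate structure described above can be made concrete with a minimal NumPy sketch of one LSTM step, in the standard formulation (the input, forget, and output gates scale the cell update, the retained cell state, and the exposed output, respectively). The weight shapes and toy dimensions here are illustrative assumptions, not taken from any specific model:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W: (4H, X) input weights, U: (4H, H) recurrent
    weights, b: (4H,) biases, stacked as [input|forget|output|candidate]."""
    z = W @ x + U @ h_prev + b      # stacked gate pre-activations
    H = h_prev.shape[0]
    i = sigmoid(z[0:H])             # input gate: admits the candidate
    f = sigmoid(z[H:2*H])           # forget gate: retains old cell state
    o = sigmoid(z[2*H:3*H])         # output gate: exposes the cell
    g = np.tanh(z[3*H:4*H])         # candidate cell update
    c = f * c_prev + i * g          # new cell state
    h = o * np.tanh(c)              # block output, calculated via the cell value
    return h, c

# Toy example: input size 3, hidden size 2 (arbitrary illustrative sizes).
rng = np.random.default_rng(0)
H, X = 2, 3
W = rng.standard_normal((4 * H, X))
U = rng.standard_normal((4 * H, H))
b = np.zeros(4 * H)
h, c = lstm_step(rng.standard_normal(X), np.zeros(H), np.zeros(H), W, U, b)
print(h.shape, c.shape)  # (2,) (2,)
```

Every multiplication above is exactly the kind of dense linear algebra that dominates DNN workloads and motivates the accelerators discussed later.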

All of the DNN processes mentioned above involve many complex computing tasks and consume large amounts of computing resources. A metric compiled by OpenAI shows that the prosperity of AI increased the demand for computational capability more than 300,000-fold from 2012 to 2018, while Moore's law would have yielded only a 7-fold enhancement . In short, AI applications have become ever more complex, precise, and resource-hungry, creating a great thirst for systems with higher computational capability.
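The scale of that gap can be sanity-checked by converting each total growth factor into an implied doubling time. A small sketch of the arithmetic, assuming a span of roughly 5.5 years between the measured endpoints (an assumption for illustration; the exact span depends on the models compared):

```python
import math

def implied_doubling_time_months(growth_factor: float, years: float) -> float:
    """Doubling time implied by a total growth factor over a given period."""
    return years * 12.0 / math.log2(growth_factor)

SPAN_YEARS = 5.5  # assumed span between the 2012 and 2018 endpoints

ai = implied_doubling_time_months(300_000, SPAN_YEARS)   # ~3.6 months
moore = implied_doubling_time_months(7, SPAN_YEARS)      # ~24 months
print(f"AI compute demand doubles every ~{ai:.1f} months")
print(f"Moore's-law pace doubles every ~{moore:.1f} months")
```

The 7-fold figure thus corresponds to the familiar roughly-two-year doubling of Moore's law, while the 300,000-fold figure implies doubling every few months.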

New attempts under the challenges

It is clear that extending Moore's law is one critical way to gain computational capability. To advance semiconductor technologies, there are two further technical paths besides 'More Moore': 'More than Moore' and 'Beyond CMOS' . 'More than Moore' encompasses the engineering of complex heterogeneous systems that meet the needs of advanced applications using various technologies (such as system-on-chip, system-in-package, and network-on-chip). 'Beyond CMOS' explores new materials, such as carbon nanotubes (CNTs) [13], to improve on the performance of the CMOS transistor. The motivation for introducing CNTs into computing systems is that CNT-based transistors have low operating voltages and exceptional performance, as their current-carrying channels are shorter than in current designs. Because CNTs can be either metallic or semiconducting, isolating purely semiconducting nanotubes is essential for making high-performance transistors. However, purifying and controllably positioning these 1-nm-diameter molecular cylinders remains a formidable challenge today .

Besides extending Moore's law, developing new system architectures can also increase the computational capability of conventional digital systems. In-memory computing architectures have been extensively explored in CMOS-based static random-access memory (SRAM) . However, CMOS memories are limited in density and scale slowly. Researchers are therefore motivated to explore in-memory computing architectures built on emerging non-volatile memory (NVM) technologies, such as phase-change material (PCM)  and resistive random-access memory (RRAM) . NVM devices are configured as two-dimensional crossbar arrays, which enable high-performance computing because NVM devices support multiple non-volatile states. NVM crossbars can perform multiplication operations in parallel, yielding higher energy efficiency and speed than conventional digital accelerators by eliminating data transfer . High-density NVM crossbars provide massively parallel multiplication operations and have led to the exploration of analog in-memory computing systems .
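The parallel multiplication performed by a crossbar is, in the abstract, a matrix-vector product: each cell's programmed conductance acts as a weight, Ohm's law multiplies it by the applied voltage, and Kirchhoff's current law sums the products along each row. A minimal NumPy sketch of this abstraction (the specific conductance and voltage values are illustrative only):

```python
import numpy as np

def crossbar_mvm(conductances: np.ndarray, voltages: np.ndarray) -> np.ndarray:
    """Idealized crossbar matrix-vector multiply: y = G @ v in one step.
    G[i][j] is a cell conductance (weight); v[j] is a column input voltage."""
    currents = conductances * voltages   # Ohm's law in every cell at once
    return currents.sum(axis=1)          # Kirchhoff's law: rows sum currents

G = np.array([[1.0, 2.0],
              [0.5, 0.0]])              # programmed weights (siemens)
v = np.array([0.2, 0.1])                # input voltages (volts)
print(crossbar_mvm(G, v))               # prints [0.4 0.1]
```

Because all cells conduct simultaneously, the whole product emerges in one analog step, which is the source of the energy and speed advantages noted above; real devices add noise, limited precision, and wire resistance that this idealized sketch omits.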

However, the approaches mentioned above still appear inadequate for applications with extreme computational complexity, such as large-scale optimization, simulation of large molecules, and factorization of large numbers. These applications require amounts of memory that even the most powerful supercomputers can hardly provide, and their processing would require runtimes on the order of tens of years or more. It is therefore essential to investigate new computing paradigms that differ from conventional computing systems based on Boolean logic and the von Neumann architecture. Currently, quantum computing, DNA computing, neuromorphic computing, optical computing, and the like, collectively referred to as physical computing paradigms, are attracting increasing attention from researchers. These physical computing paradigms, which provide more complex operators than Boolean logic at the device level, can be used to build exceptional accelerators. Compared with the low-temperature requirements of quantum computing and the dynamic instabilities of DNA and neuromorphic computing, optical computing has relaxed environmental requirements and a robust system composition. Therefore, optical computing is considered one of the most promising ways to tackle intractable problems.