1) B.Tech (Electronics and Communication Engineering), 2010, CGPA: 8.55, West Bengal University of Technology
2) M.E (Nuclear Engineering), 2014, CGPA: 8.66, Jadavpur University
3) PhD (Electronics and Telecommunication Engineering), 2020, Indian Institute of Engineering Science and Technology, Shibpur
RESEARCH INTERESTS
VLSI Architecture Design, Signal and Image Processing, Deep Learning in Multimedia Applications, Edge Computing
25 Scopus Publications
Error-Tolerant Medical Image Segmentation Using Resource-Efficient Approximate Booth Multipliers Binit Kumar Pandit, Anirban Chakraborty, Ayan Banerjee IEEE Embedded Systems Letters, 2025 CNN-based medical image segmentation models, such as U-Net, achieve high accuracy but demand significant computation and power, limiting real-time deployment in resource-constrained clinical settings. This letter presents AxC-UNet, a hardware-aware segmentation accelerator employing two Radix-4 approximate Booth encoders (AxBEM1 and AxBEM2) and three low-power approximate 4:2 compressors to optimize multiply-and-accumulate (MAC) operations. FPGA synthesis on a Xilinx ZCU104 shows up to 16.1% LUT reduction and 10% dynamic power savings compared to exact Booth multipliers, while maintaining segmentation accuracy within <6% Dice variation (comparable to natural inter-observer disagreement in BraTS2020), thereby ensuring clinical reliability. This co-design approach demonstrates a practical trade-off between clinical performance and hardware efficiency, enabling energy-efficient real-time medical image segmentation on edge platforms.
An Investigation of Magnitude Comparator Design using Intermediate Zero-Free Binary Signed-Digit Representation Madhu Sudan Chakraborty, Anirban Chakraborty, Ganti Sreelakshmi, Sayantan Dutta 6th IEEE International Conference on Recent Advances in Information Technology (RAIT 2025), 2025 Signed-digit number systems have been found to support higher speeds for addition/subtraction, multiplication, typical digital filter designs and some other arithmetic operations. Accordingly, signed-digit number systems appear well suited to digital signal processing, image processing and cryptographic architectures. Among the various classes of signed-digit number systems, the binary signed-digit number system has been studied most widely. However, some operations in binary signed-digit arithmetic are quite complex. In particular, the straightforward approaches to magnitude comparator design are too complex. As a remedy, a magnitude comparator design based on the canonical signed-digit number system has recently been proposed. Concurrently, a particular representation of the binary signed-digit number system, called the intermediate zero-free binary signed-digit representation, has been introduced, and it has been found promising for addressing some complex problems, including sign detection, reverse conversion and overflow handling, more efficiently. The arithmetic investigations carried out in this paper show that, based on a divide-and-conquer strategy, a novel magnitude comparator can be designed using the intermediate zero-free binary signed-digit representation. Compared to its known contender, the proposed comparator can process 50% larger inputs. In addition, the proposed comparator attains logarithmic time complexity, whereas the contender has linear time complexity.
A Switched Current Mirror based VLSI Architecture of 1-D DCT for Compressed ECG Signal Acquisition Anirban Ganguly, Debanjana Datta Mitra, Mousumi Bhanja, Anirban Chakraborty, Ayan Banerjee 2024 IEEE Calcutta Conference CALCON 2024 Proceedings, 2024 The discrete cosine transform (DCT) has been identified as a potential basis for compressed sensing (CS) due to its spectral compaction property. In real-time biomedical applications, a discrete-time analog architecture of the DCT processor has been proposed as a low-energy alternative to digital realization. Some compression applications in image and video processing have reported analog 2-D DCT architectures in either charge mode or current mode. Here, a switched current mode circuit realizes the 1-D DCT in a biomedical CS system, outperforming charge mode with a simpler design and no capacitor issues. Using a current mode matrix vector multiplier (MVM), power consumption was reduced to $9.2\ \mu\mathrm{W}$ with 41 dB PSNR accuracy in SPICE simulations (PTM 65nm CMOS). MATLAB-SPICE co-simulation validated performance with ECG signals at compression ratios (CR) of $0.75$ and $0.875$. Keywords: discrete cosine transform, compressed sensing, switched current mirror, matrix vector multiplier, co-simulation
Resource-efficient VLSI Architecture of Softmax Activation Function for Real-time Inference in Deep Learning Applications Akash Ther, Binit Kumar Pandit, Anirban Ganguly, Anirban Chakraborty, Ayan Banerjee 2023 International Symposium on Devices Circuits and Systems ISDCS 2023 Conference Proceedings, 2023 The Softmax activation function layer is the output layer in various Deep Learning (DL) models for multi-class classification applications. It computes a probability value for each input entry to the layer, representing the degree of confidence. The nonlinearity of the softmax function increases hardware complexity. The proposed VLSI architecture of the softmax activation function therefore aims to reduce hardware resources with minimal loss in accuracy. It uses an optimized ROM-based exponential unit for exponentiation and a division-by-reciprocation unit based on Newton-Raphson iterations for the division operation. The proposed architecture is evaluated on Xilinx's Kintex-7 KC705 FPGA Evaluation Platform, using 1.44× to 11.61× fewer LUTs than other state-of-the-art architectures on the MNIST dataset.
Deep Neural Network Based Multi-Object Detection for Real-time Aerial Surveillance Rebanta Dey, Binit Kumar Pandit, Anirban Ganguly, Anirban Chakraborty, Ayan Banerjee 2023 11th International Symposium on Electronic Systems Devices and Computing ESDC 2023, 2023 Aerial surveillance is one of the most widely used modern surveillance methodologies, with applications in many important fields, both military and civilian. This article presents a comprehensive study of Deep Neural Network (DNN) based solutions for real-time object tracking from Unmanned Aerial Vehicles (UAVs) using a modified version of the state-of-the-art object detection model YOLOv5. The modified YOLOv5 architecture is obtained by changing the activation function to Rectified Linear Unit (ReLU) and fine-tuning the network’s hyperparameters. A comparative analysis was then carried out on a subset of the AU-AIR dataset, comparing YOLOv5 models of different network depths to determine the improvements in training speed and accuracy. The modified network was also compared, in terms of mean average precision (mAP), with the results of the original paper; a performance gain of almost 2.9 times was achieved in the best case.
Low Power and High Precision Analog VLSI Design of 1-D DCT for Real-time Application Deepak Kumar, Anirban Ganguly, Puja Chakraborty, Anirban Chakraborty, Binit Kumar Pandit, Ayan Banerjee 2022 IEEE Region 10 Symposium TENSYMP 2022, 2022 The need for low power in many wireless and biomedical signal processing applications leaves considerable scope for analog discrete-time VLSI architectures over their digital counterparts. To investigate the performance of a discrete cosine transform (DCT) for low-energy compression applications, a subthreshold regime was selected, using a circuit built only from modified class-AB current mirrors. A radix-2 algorithm was chosen over complex matrix multiplication to increase computational efficiency. The accuracy of the proposed circuit for an 8-point 1-D DCT computation was observed to be around ±1%, with a power consumption of 1.8 mW in a $0.5\ \mu\mathrm{m}$ CMOS process.
CORDIC-Based High-Speed VLSI Architecture of Transform Model Estimation for Real-Time Imaging Anirban Chakraborty, Ayan Banerjee IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2021 Transform model estimation (TME) is a geometric operation widely used in real-time imaging systems. Given the massive computational load of matrix-algebra-based TME realizations, most imaging systems resort to highly parallel, software-platform-based TME execution, which is power-intensive and expensive. Owing to its low speed and high power consumption, existing TME hardware cannot meet the requirements of real-time systems. In this article, a hardware-realizable method of three-degree-of-freedom TME is formulated, encompassing both the conventional CORDIC and the proposed modified CORDIC. The novelties of the proposed TME method and the corresponding architecture are that its latency varies sublinearly with the precision and that the total computation time (CT) is almost independent of the input image size. The performance of a prototype 16-bit fixed-point TME architecture (realized using VHDL in Xilinx Vivado 18.2) is compared with its software counterpart. The proposed TME hardware is used along with other standard hardware modules to realize the image registration (IR) operation. The proposed IR architecture achieves, on average, a 60% reduction in total CT and a 1.61× increase in maximum operating frequency with comparable accuracy, at the cost of only a 23% increase in power consumption with respect to other existing IR hardware.
A memory and area-efficient distributed arithmetic based modular VLSI architecture of 1D/2D reconfigurable 9/7 and 5/3 DWT filters for real-time image decomposition Anirban Chakraborty, Ayan Banerjee Journal of Real-Time Image Processing, 2020 In this article, we propose the internal architecture of dedicated hardware for 1D/2D convolution-based 9/7 and 5/3 DWT filters, exploiting bit-parallel ‘distributed arithmetic’ (DA) to reduce the computation time of the proposed DWT design while keeping the area comparable to that of other recent designs. Despite using memory-intensive bit-parallel DA, we achieve a 90% reduction in memory size compared with other notable architectures. In the proposed architecture, both the 9/7 and 5/3 DWT filters can be realized with a selection input, mode. With the introduction of DA, we have incorporated pipelining and parallelism into the proposed convolution-based 1D/2D DWT architectures. We have reduced the area by 38.3% and the memory requirement by 90% compared with the latest notable designs. The critical-path delay of our design is almost 50% of that of the other latest designs. We have successfully applied our prototype 2D design to real-time image decomposition. The quality of the architecture for real-time image decomposition is measured by ‘peak signal-to-noise ratio’ and ‘computation time’, on which the proposed design outperforms other similar software- and hardware-based implementations.
A Memory Efficient, Multiplierless & Modular VLSI Architecture of 1D/2D Re-Configurable 9/7 & 5/3 DWT Filters Using Distributed Arithmetic Anirban Chakraborty, Ayan Banerjee Journal of Circuits Systems and Computers, 2020 Dedicated hardware for the “Discrete Wavelet Transform” (DWT) is in high demand for real-time imaging operations in standalone electronic devices, as the DWT is used extensively in most transform-domain imaging applications. Various DWT algorithms in the literature facilitate software implementations, but these are generally unsuitable for real-time imaging in standalone devices because of their high power consumption and long computation time. In this paper, a convolutional DWT-based pipelined and tunable VLSI architecture of the Daubechies 9/7 and 5/3 DWT filters is presented. The proposed architecture, which combines the advantages of convolutional and lifting DWT while discarding their notable disadvantages, is made area and memory efficient by exploiting “Distributed Arithmetic” (DA). Almost 90% reduction in memory size compared with other notable architectures is reported. In the proposed architecture, both the 9/7 and 5/3 DWT filters can be realized with a selection input, “mode”. With the introduction of DA, pipelining and parallelism are easily incorporated into the proposed 1D/2D DWT architectures. The area requirement and critical path delay are reduced to almost 38.3% and 50%, respectively, of those of the latest notable designs. The proposed VLSI architecture also performs well in real-time applications.
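Illustrative note on the Softmax entry above (ISDCS 2023): the abstract pairs a ROM-based exponential unit with a Newton-Raphson division-by-reciprocation unit. The following is only a minimal floating-point sketch of that general idea in Python, not the published architecture. The table step, table depth, iteration count, initial guess and all function names (lut_exp, nr_reciprocal, softmax_sketch) are assumptions made for illustration; a hardware design would use fixed-point arithmetic.

```python
import math

# Assumed table layout (not from the paper): exp(-k/16) for k = 0..128,
# i.e. inputs in [-8, 0] quantised in steps of 1/16.
STEP = 1.0 / 16.0
EXP_TABLE = [math.exp(-k * STEP) for k in range(129)]


def lut_exp(x):
    """Look up exp(x) for x <= 0 from the precomputed table (ROM stand-in)."""
    k = min(int(round(-x / STEP)), len(EXP_TABLE) - 1)
    return EXP_TABLE[k]


def nr_reciprocal(d, iterations=4):
    """Approximate 1/d for d > 0 using Newton-Raphson: x <- x * (2 - m * x)."""
    m, e = math.frexp(d)                    # d = m * 2**e, with 0.5 <= m < 1
    x = 48.0 / 17.0 - (32.0 / 17.0) * m     # common linear initial guess for 1/m
    for _ in range(iterations):
        x = x * (2.0 - m * x)               # error is squared on each pass
    return math.ldexp(x, -e)                # 1/d = (1/m) * 2**(-e)


def softmax_sketch(z):
    """Softmax using table-based exp and a multiply by the reciprocal of the sum."""
    z_max = max(z)                          # shift so every exponent input is <= 0
    exps = [lut_exp(v - z_max) for v in z]
    inv_sum = nr_reciprocal(sum(exps))      # sum >= 1, so the reciprocal is defined
    return [v * inv_sum for v in exps]


if __name__ == "__main__":
    print(softmax_sketch([1.0, 2.0, 3.0]))
```

Replacing the division with a reciprocal followed by multiplications means the costly step is performed once per vector rather than once per output, which is the usual motivation for division by reciprocation.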