Author: "Yazdanbakhsh, A." - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Yazdanbakhsh, A."' showing total 5,042 results

1. CodeRosetta: Pushing the Boundaries of Unsupervised Code Translation for Parallel Programming

Author: TehraniJamsaz, Ali, Bhattacharjee, Arijit, Chen, Le, Ahmed, Nesreen K., Yazdanbakhsh, Amir, and Jannesari, Ali
Subjects: Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Performance, Computer Science - Programming Languages, Computer Science - Software Engineering
Abstract: Recent advancements in Large Language Models (LLMs) have renewed interest in automatic programming language translation. Encoder-decoder transformer models, in particular, have shown promise in translating between different programming languages. However, translating between a language and its high-performance computing (HPC) extensions remains underexplored due to challenges such as complex parallel semantics. In this paper, we introduce CodeRosetta, an encoder-decoder transformer model designed specifically for translating between programming languages and their HPC extensions. CodeRosetta is evaluated on C++ to CUDA and Fortran to C++ translation tasks. It uses a customized learning framework with tailored pretraining and training objectives to effectively capture both code semantics and parallel structural nuances, enabling bidirectional translation. Our results show that CodeRosetta outperforms state-of-the-art baselines in C++ to CUDA translation by 2.9 BLEU and 1.72 CodeBLEU points while improving compilation accuracy by 6.05%. Compared to general closed-source LLMs, our method improves C++ to CUDA translation by 22.08 BLEU and 14.39 CodeBLEU, with 2.75% higher compilation accuracy. Finally, CodeRosetta exhibits proficiency in Fortran to parallel C++ translation, marking it, to our knowledge, as the first encoder-decoder model for this complex task, improving CodeBLEU by at least 4.63 points compared to closed-source and open-code LLMs.
Published: 2024

2. A New Perspective on Determining Disease Invasion and Population Persistence in Heterogeneous Environments

Author: Yazdanbakhsh, Poroshat, Anderson, Mark, and Shuai, Zhisheng
Subjects: Quantitative Biology - Populations and Evolution, Mathematics - Dynamical Systems
Abstract: We introduce a new quantity known as the network heterogeneity index, denoted by $\mathcal{H}$, which facilitates the investigation of disease propagation and population persistence in heterogeneous environments. Our mathematical analysis reveals that this index embodies the structure of such networks, the disease or population dynamics of patches, and the dispersal between patches. We present multiple representations of the network heterogeneity index and demonstrate that $\mathcal{H}\geq 0$. Moreover, we explore the applications of $\mathcal{H}$ in epidemiology and ecology across various heterogeneous environments, highlighting its effectiveness in determining disease invasibility and population persistence.
Published: 2024

3. When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models

Author: You, Haoran, Fu, Yichao, Wang, Zheng, Yazdanbakhsh, Amir, and Lin, Yingyan Celine
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Autoregressive Large Language Models (LLMs) have achieved impressive performance in language tasks but face two significant bottlenecks: (1) quadratic complexity in the attention module as the number of tokens increases, and (2) limited efficiency due to the sequential processing nature of autoregressive LLMs during generation. While linear attention and speculative decoding offer potential solutions, their applicability and synergistic potential for enhancing autoregressive LLMs remain uncertain. We conduct the first comprehensive study on the efficacy of existing linear attention methods for autoregressive LLMs, integrating them with speculative decoding. We introduce an augmentation technique for linear attention that ensures compatibility with speculative decoding, enabling more efficient training and serving of LLMs. Extensive experiments and ablation studies involving seven existing linear attention models and five encoder/decoder-based LLMs consistently validate the effectiveness of our augmented linearized LLMs. Notably, our approach achieves up to a 6.67 reduction in perplexity on the LLaMA model and up to a 2$\times$ speedup during generation compared to prior linear attention methods. Codes and models are available at https://github.com/GATECH-EIC/Linearized-LLM.
Comment: Accepted by ICML 2024; 17 pages; 10 figures; 16 tables
Published: 2024

4. ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization

Author: You, Haoran, Guo, Yipin, Fu, Yichao, Zhou, Wei, Shi, Huihong, Zhang, Xiaofan, Kundu, Souvik, Yazdanbakhsh, Amir, and Lin, Yingyan Celine
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Large language models (LLMs) have shown impressive performance on language tasks but face challenges when deployed on resource-constrained devices due to their extensive parameters and reliance on dense multiplications, resulting in high memory demands and latency bottlenecks. Shift-and-add reparameterization offers a promising solution by replacing costly multiplications with hardware-friendly primitives in both the attention and multi-layer perceptron (MLP) layers of an LLM. However, current reparameterization techniques require training from scratch or full parameter fine-tuning to restore accuracy, which is resource-intensive for LLMs. To address this, we propose accelerating pretrained LLMs through post-training shift-and-add reparameterization, creating efficient multiplication-free models, dubbed ShiftAddLLM. Specifically, we quantize each weight matrix into binary matrices paired with group-wise scaling factors. The associated multiplications are reparameterized into (1) shifts between activations and scaling factors and (2) queries and adds according to the binary matrices. To reduce accuracy loss, we present a multi-objective optimization method to minimize both weight and output activation reparameterization errors. Additionally, based on varying sensitivity across layers to reparameterization, we develop an automated bit allocation strategy to further reduce memory usage and latency. Experiments on five LLM families and eight tasks consistently validate the effectiveness of ShiftAddLLM, achieving average perplexity improvements of 5.6 and 22.7 points at comparable or lower latency compared to the most competitive quantized LLMs at 3 and 2 bits, respectively, and more than 80% memory and energy reductions over the original LLMs. Codes and models are available at https://github.com/GATECH-EIC/ShiftAddLLM.
Comment: Accepted by NeurIPS 2024
Published: 2024

5. Effective Interplay between Sparsity and Quantization: From Theory to Practice

Author: Harma, Simla Burcu, Chakraborty, Ayan, Kostenok, Elizaveta, Mishin, Danila, Ha, Dongho, Falsafi, Babak, Jaggi, Martin, Liu, Ming, Oh, Yunho, Subramanian, Suvinay, and Yazdanbakhsh, Amir
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: The increasing size of deep neural networks necessitates effective model compression to improve computational efficiency and reduce their memory footprint. Sparsity and quantization are two prominent compression methods that have individually demonstrated significant reduction in computational and memory footprints while preserving model accuracy. While effective, the interplay between these two methods remains an open question. In this paper, we investigate the interaction between these two methods and assess whether their combination impacts final model accuracy. We mathematically prove that applying sparsity before quantization is the optimal sequence for these operations, minimizing error in computation. Our empirical studies across a wide range of models, including OPT and Llama model families (125M-8B) and ViT corroborate these theoretical findings. In addition, through rigorous analysis, we demonstrate that sparsity and quantization are not orthogonal; their interaction can significantly harm model accuracy, with quantization error playing a dominant role in this degradation. Our findings extend to the efficient deployment of large models in resource-limited compute platforms and reduce serving cost, offering insights into best practices for applying these compression methods to maximize efficacy without compromising accuracy.
Published: 2024

6. SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs

Author: Mozaffari, Mohammad, Yazdanbakhsh, Amir, Zhang, Zhao, and Dehnavi, Maryam Mehri
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: We propose SLoPe, a Double-Pruned Sparse Plus Lazy Low-rank Adapter Pretraining method for LLMs that improves the accuracy of sparse LLMs while accelerating their pretraining and inference and reducing their memory footprint. Sparse pretraining of LLMs reduces the accuracy of the model; to overcome this, prior work uses dense models during fine-tuning. SLoPe improves the accuracy of sparsely pretrained models by adding low-rank adapters in the final 1% of pretraining iterations without adding significant overhead to model pretraining and inference. In addition, SLoPe uses a double-pruned backward pass formulation that prunes the transposed weight matrix using N:M sparsity structures to enable an accelerated sparse backward pass. SLoPe accelerates the training and inference of models with billions of parameters up to $1.14\times$ and $1.34\times$ respectively (OPT-33B and OPT-66B) while reducing their memory usage by up to $0.77\times$ and $0.51\times$ for training and inference respectively.
Published: 2024

7. Tao: Re-Thinking DL-based Microarchitecture Simulation

Author: Pandey, Santosh, Yazdanbakhsh, Amir, and Liu, Hang
Subjects: Computer Science - Hardware Architecture, Computer Science - Machine Learning
Abstract: Microarchitecture simulators are indispensable tools for microarchitecture designers to validate, estimate, and optimize new hardware that meets specific design requirements. While the quest for fast, accurate, and detailed microarchitecture simulation has been ongoing for decades, existing simulators excel and fall short in different aspects: (i) Although execution-driven simulation is accurate and detailed, it is extremely slow and requires expert-level experience to design. (ii) Trace-driven simulation reuses the execution traces in pursuit of fast simulation but faces accuracy concerns and fails to achieve significant speedup. (iii) Emerging deep learning (DL)-based simulations are remarkably fast and have acceptable accuracy but fail to provide adequate low-level microarchitectural performance metrics crucial for microarchitectural bottleneck analysis. Additionally, they introduce substantial overheads from trace regeneration and model re-training when simulating a new microarchitecture. Re-thinking the advantages and limitations of the aforementioned simulation paradigms, this paper introduces TAO, which redesigns DL-based simulation with three primary contributions: First, we propose a new training dataset design such that the subsequent simulation only needs functional traces as inputs, which can be rapidly generated and reused across microarchitectures. Second, we redesign the input features and the DL model using self-attention to support predicting various performance metrics. Third, we propose techniques to train a microarchitecture-agnostic embedding layer that enables fast transfer learning between different microarchitectural configurations and reduces the re-training overhead of conventional DL-based simulators. Our extensive evaluation shows TAO can reduce the overall training and simulation time by 18.06x over the state-of-the-art DL-based endeavors.
Comment: Published in POMACS and SIGMETRICS'24
Published: 2024

8. DaCapo: Accelerating Continuous Learning in Autonomous Systems for Video Analytics

Author: Kim, Yoonsung, Oh, Changhun, Hwang, Jinwoo, Kim, Wonung, Oh, Seongryong, Lee, Yubin, Sharma, Hardik, Yazdanbakhsh, Amir, and Park, Jongse
Subjects: Computer Science - Hardware Architecture, Computer Science - Machine Learning, Computer Science - Robotics
Abstract: Deep neural network (DNN) video analytics is crucial for autonomous systems such as self-driving vehicles, unmanned aerial vehicles (UAVs), and security robots. However, real-world deployment faces challenges due to their limited computational resources and battery power. To tackle these challenges, continuous learning exploits a lightweight "student" model at deployment (inference), leverages a larger "teacher" model for labeling sampled data (labeling), and continuously retrains the student model to adapt to changing scenarios (retraining). This paper highlights the limitations in state-of-the-art continuous learning systems: (1) they focus on computations for retraining, while overlooking the compute needs for inference and labeling, (2) they rely on power-hungry GPUs, unsuitable for battery-operated autonomous systems, and (3) they are located on a remote centralized server, intended for multi-tenant scenarios, again unsuitable for autonomous systems due to privacy, network availability, and latency concerns. We propose a hardware-algorithm co-designed solution for continuous learning, DaCapo, that enables autonomous systems to perform concurrent executions of inference, labeling, and training in a performant and energy-efficient manner. DaCapo comprises (1) a spatially-partitionable and precision-flexible accelerator enabling parallel execution of kernels on sub-accelerators at their respective precisions, and (2) a spatiotemporal resource allocation algorithm that strategically navigates the resource-accuracy tradeoff space, facilitating optimal decisions for resource allocation to achieve maximal accuracy. Our evaluation shows that DaCapo achieves 6.5% and 5.5% higher accuracy than the state-of-the-art GPU-based continuous learning systems Ekya and EOMU, respectively, while consuming 254x less power.
Published: 2024

9. Hemoglobin-oxygen affinity changes in neonatal blood transfusions: RBC selection insights

Author: Yazdanbakhsh, Mahsa, Eid, Haytham, Acker, Jason P., Bar Am, Neta, Cheung, Po-Yin, Dotchin, Stephanie A., and Rabi, Yacov
Published: 2024

10. Progressive Gradient Flow for Robust N:M Sparsity Training in Transformers

Author: Bambhaniya, Abhimanyu Rajeshkumar, Yazdanbakhsh, Amir, Subramanian, Suvinay, Kao, Sheng-Chun, Agrawal, Shivani, Evci, Utku, and Krishna, Tushar
Subjects: Computer Science - Machine Learning, Computer Science - Hardware Architecture
Abstract: N:M structured sparsity has garnered significant interest as a result of relatively modest overhead and improved efficiency. Additionally, this form of sparsity holds considerable appeal for reducing the memory footprint owing to its modest representation overhead. Although there have been efforts to develop training recipes for N:M structured sparsity, they primarily focus on low-sparsity regions ($\sim$50%). Nonetheless, performance of models trained using these approaches tends to decline when confronted with high-sparsity regions ($>$80%). In this work, we study the effectiveness of existing sparse training recipes at high-sparsity regions and argue that these methods fail to sustain the model quality on par with low-sparsity regions. We demonstrate that the significant factor contributing to this disparity is the presence of elevated levels of induced noise in the gradient magnitudes. To mitigate this undesirable effect, we employ decay mechanisms to progressively restrict the flow of gradients towards pruned elements. Our approach improves the model quality by up to 2% and 5% in vision and language models in the high-sparsity regime, respectively. We also evaluate the trade-off between model accuracy and training compute cost in terms of FLOPs. At iso-training FLOPs, our method yields better performance compared to conventional sparse training recipes, exhibiting an accuracy improvement of up to 2%. The source code is available at https://github.com/abhibambhaniya/progressive_gradient_flow_nm_sparsity.
Comment: 18 pages, 8 figures, 17 tables. Code is available at https://github.com/abhibambhaniya/progressive_gradient_flow_nm_sparsity
Published: 2024

11. USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models

Author: Ding, Shaojin, Qiu, David, Rim, David, He, Yanzhang, Rybakov, Oleg, Li, Bo, Prabhavalkar, Rohit, Wang, Weiran, Sainath, Tara N., Han, Zhonglin, Li, Jian, Yazdanbakhsh, Amir, and Agrawal, Shivani
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
Abstract: End-to-end automatic speech recognition (ASR) models have seen revolutionary quality gains with the recent development of large-scale universal speech models (USM). However, deploying these massive USMs is extremely expensive due to the enormous memory usage and computational cost. Therefore, model compression is an important research topic to fit USM-based ASR under budget in real-world scenarios. In this study, we propose a USM fine-tuning approach for ASR, with a low-bit quantization and N:M structured sparsity aware paradigm on the model weights, reducing the model complexity from parameter precision and matrix topology perspectives. We conducted extensive experiments with a 2-billion parameter USM on a large-scale voice search dataset to evaluate our proposed method. A series of ablation studies validate the effectiveness of up to int4 quantization and 2:4 sparsity. However, a single compression technique fails to recover the performance well under extreme setups including int2 quantization and 1:4 sparsity. By contrast, our proposed method can compress the model to have 9.4% of the size, at the cost of only 7.3% relative word error rate (WER) regressions. We also provided in-depth analyses on the results and discussions on the limitations and potential solutions, which would be valuable for future studies.
Comment: Accepted by ICASSP 2024. Preprint
Published: 2023

12. Performance evaluation of a combination Plasmodium dual-antigen CRP rapid diagnostic test in Lambaréné, Gabon

Author: Alabi, Ayodele, Musangomunei, Fungai P., Lotola-Mougeni, Fabrice, Bie-Ondo, Juste C., Murphy, Kristin, Essone, Paulin N., Kabwende, Anita L., Mahmoudou, Saidou, Macé, Aurélien, Harris, Victoria, Ramharter, Michael, Grobusch, Martin P., Yazdanbakhsh, Maria, Fernandez-Carballo, B. Leticia, Escadafal, Camille, Kremsner, Peter G., Dittrich, Sabine, and Agnandji, Selidji T.
Published: 2024

13. An Evolutionary Model for the Etiology of Obsessive-Compulsive Disorder: The Mediating Role of Emotional Awareness and Uncertainty Intolerance in the Relationship between Childhood Fears and Behavioral Brain Systems with Obsessive-Compulsive Disorder in Secondary School Students of Kouhdasht City

Author: Masoomeh Azadbakht, Khodamorad Momeni, and Kamran Yazdanbakhsh
Subjects: brain behavioral systems, childhood fears, emotional awareness, intolerance of uncertainty, obsessive-compulsive disorder, Psychology, BF1-990
Abstract: The purpose of this research is to investigate the mediating role of emotional awareness and intolerance of ambiguity in the relationship between childhood fears and brain-behavioral systems with signs and symptoms of obsessive-compulsive disorder in high school girls and boys in Kouhdasht city. The research is fundamental in purpose and, in terms of method, is a descriptive-correlational study based on structural equation modeling. The statistical population of this research comprises all pupils enrolled in the second secondary level of Kouhdasht city in Lorestan province during the academic year 2022-2023. According to the number of subscales (29 subscales), the sample size should be a minimum of 435. The analysis of 441 samples (273 girls and 168 boys) was conducted in this study. To collect data, the Fear Questionnaire for Children and Adolescents (FSSC-R), Behavioral Inhibition/Activation Systems Scale, Emotional Awareness Questionnaire (EAQ-30), Intolerance of Uncertainty Questionnaire (IUS), and Obsessive-Compulsive Inventory – Revised (OCI-R) were employed. The data classification, processing, and analysis were conducted using SPSS 22 and Lisrel 8.85 statistical software. The structural equation modeling method was employed to evaluate the fit of the hypothetical model. The data analysis results indicated that the proposed model for the etiology of OCD was a reasonable fit. The findings indicated that obsessive-compulsive disorder is directly influenced by childhood fears, behavioral brain systems, emotional awareness, and intolerance of ambiguity. Additionally, this disorder is indirectly influenced by childhood fears and behavioral brain systems through the mediation of intolerance of ambiguity and emotional awareness. Therefore, it is recommended that these variables be incorporated into future research and the development of preventive and treatment protocols for OCD.
Published: 2024

14. Simultaneous effects of temperature and backbone length on static and dynamic properties of high-density polyethylene-1-butene copolymer melt: Equilibrium molecular dynamics approach

Author: Yazdanbakhsh Amirhosein and Motlagh Ghodratollah Hashemi
Subjects: polyethylene, chain length, dynamic properties, entanglement effect, temperature effect, molecular dynamics simulation, trappe, Polymers and polymer manufacture, TP1080-1185
Abstract: Temperature and chain length play significant roles in determining the physical properties of polymer melts. In the current computational research, a molecular dynamics (MD) approach was implemented to describe the static and dynamic properties of (1) high-density polyethylene-1-butene with 120 beads in the backbone (PE120) and (2) entangled high-density polyethylene-1-butene with 600 beads in the backbone (PE600). The transferable potentials for phase equilibria (TraPPE) force fields were used for CH₂ beads in a defined initial condition. First, the equilibrium phase of the designed systems was reported with total energy and density convergence at various initial temperatures ($T_0$ = 450, 470, and 490 K). Also, gyration radius ($R_g$) and end-to-end distance ($R$) were calculated for the static behavior description of the two PEs. Zero-shear viscosity ($\eta_0$), mean square displacement, and diffusion coefficient ($D$) were estimated to define the dynamic behavior of PE120 and PE600 systems. MD outputs predicted that 10 ns is sufficient for equilibrium phase detection inside polymeric samples. After equilibrium phase detection, $R_g$ converged to 14.97 and 17.35 Å in PE120 and PE600, respectively ($T_0$ = 450 K). Furthermore, MD outputs show that temperature variation can considerably affect the time evolution of the system. Numerically, the $\eta_0$ of PE120 and PE600 converged to 49 and 168 cP at 450 K. These results for $\eta_0$ as a function of temperature are an important output of MD simulations. The results predicted that $\eta_0$ decreases to 24 and 44 cP for PE120 and PE600 samples with an increase in temperature from 450 to 490 K. With the creation of the entanglements network, $D$ reached the highest value of 2 × 10⁻⁹ m²·s⁻¹ among the designed polymeric systems. The results are in good agreement with experimental reports. It is expected that the results of this study can be used in designing improved polymeric systems for real applications.
Published: 2024

15. Correction: Performance evaluation of a combination Plasmodium dual-antigen CRP rapid diagnostic test in Lambaréné, Gabon

Author: Alabi, Ayodele, Musangomunei, Fungai P., Lotola-Mougeni, Fabrice, Bie-Ondo, Juste C., Murphy, Kristin, Essone, Paulin N., Kabwende, Anita L., Mahmoudou, Saidou, Macé, Aurélien, Harris, Victoria, Ramharter, Michael, Grobusch, Martin P., Yazdanbakhsh, Maria, Fernandez-Carballo, B. Leticia, Escadafal, Camille, Kremsner, Peter G., Dittrich, Sabine, and Agnandji, Selidji T.
Published: 2024

16. Controlled human hookworm infection remodels plasmacytoid dendritic cells and regulatory T cells towards profiles seen in natural infections in endemic areas

Author: Manurung, Mikhael D., Sonnet, Friederike, Hoogerwerf, Marie-Astrid, Janse, Jacqueline J., Kruize, Yvonne, Bes-Roeleveld, Laura de, König, Marion, Loukas, Alex, Dewals, Benjamin G., Supali, Taniawati, Jochems, Simon P., Roestenberg, Meta, Coppola, Mariateresa, and Yazdanbakhsh, Maria
Published: 2024

17. A neural modeling approach to study mechanisms underlying the heterogeneity of visual spatial frequency sensitivity in schizophrenia

Author: Dugan, Caroline, Zikopoulos, Basilis, and Yazdanbakhsh, Arash
Published: 2024

18. Machine learning-powered estimation of malachite green photocatalytic degradation with NML-BiFeO3 composites

Author: Salahshoori, Iman, Yazdanbakhsh, Amirhosein, and Baghban, Alireza
Published: 2024

19. Photoreduction of atrazine from aqueous solution using sulfite/iodide/UV process, degradation, kinetics and by-products pathway

Author: Vahidi-Kolur, Robabeh, Yazdanbakhsh, Ahmadreza, Hosseini, Seyed Arman, and Sheikhmohammadi, Amir
Published: 2024

20. Immunological factors linked to geographical variation in vaccine responses

Author: van Dorst, Marloes M. A. R., Pyuza, Jeremia J., Nkurunungi, Gyaviira, Kullaya, Vesla I., Smits, Hermelijn H., Hogendoorn, Pancras C. W., Wammes, Linda J., Everts, Bart, Elliott, Alison M., Jochems, Simon P., and Yazdanbakhsh, Maria
Published: 2024

21. Community water fluoride cessation and rate of caries-related pediatric dental treatments under general anesthesia in Alberta, Canada

Author: Yazdanbakhsh, Elnaz, Bohlouli, Babak, Patterson, Steven, and Amin, Maryam
Published: 2024

22. JaxPruner: A concise library for sparsity research

Author: Lee, Joo Hyung, Park, Wonpyo, Mitchell, Nicole, Pilault, Jonathan, Obando-Ceron, Johan, Kim, Han-Byul, Lee, Namhoon, Frantar, Elias, Long, Yun, Yazdanbakhsh, Amir, Agrawal, Shivani, Subramanian, Suvinay, Wang, Xin, Kao, Sheng-Chun, Zhang, Xingyao, Gale, Trevor, Bik, Aart, Han, Woohyun, Ferev, Milen, Han, Zhonglin, Kim, Hong-Seok, Dauphin, Yann, Dziugaite, Gintare Karolina, Castro, Pablo Samuel, and Evci, Utku
Subjects: Computer Science - Machine Learning, Computer Science - Software Engineering
Abstract: This paper introduces JaxPruner, an open-source JAX-based pruning and sparse training library for machine learning research. JaxPruner aims to accelerate research on sparse neural networks by providing concise implementations of popular pruning and sparse training algorithms with minimal memory and latency overhead. Algorithms implemented in JaxPruner use a common API and work seamlessly with the popular optimization library Optax, which, in turn, enables easy integration with existing JAX-based libraries. We demonstrate this ease of integration by providing examples in four different codebases (Scenic, t5x, Dopamine, and FedJAX) and provide baseline experiments on popular benchmarks.
Comment: JaxPruner is hosted at http://github.com/google-research/jaxpruner
Published: 2023

23. Self-Refine: Iterative Refinement with Self-Feedback

Author: Madaan, Aman, Tandon, Niket, Gupta, Prakhar, Hallinan, Skyler, Gao, Luyu, Wiegreffe, Sarah, Alon, Uri, Dziri, Nouha, Prabhumoye, Shrimai, Yang, Yiming, Gupta, Shashank, Majumder, Bodhisattwa Prasad, Hermann, Katherine, Welleck, Sean, Yazdanbakhsh, Amir, and Clark, Peter
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Like humans, large language models (LLMs) do not always generate the best output on their first try. Motivated by how humans refine their written text, we introduce Self-Refine, an approach for improving initial outputs from LLMs through iterative feedback and refinement. The main idea is to generate an initial output using an LLM; then, the same LLM provides feedback for its output and uses it to refine itself, iteratively. Self-Refine does not require any supervised training data, additional training, or reinforcement learning, and instead uses a single LLM as the generator, refiner, and feedback provider. We evaluate Self-Refine across 7 diverse tasks, ranging from dialog response generation to mathematical reasoning, using state-of-the-art (GPT-3.5, ChatGPT, and GPT-4) LLMs. Across all evaluated tasks, outputs generated with Self-Refine are preferred by humans and automatic metrics over those generated with the same LLM using conventional one-step generation, improving by ~20% absolute on average in task performance. Our work demonstrates that even state-of-the-art LLMs like GPT-4 can be further improved at test time using our simple, standalone approach.
Comment: Code, data, and demo at https://selfrefine.info/
Published: 2023

24. In-Storage Domain-Specific Acceleration for Serverless Computing

Author: Mahapatra, Rohan, Ghodrati, Soroush, Ahn, Byung Hoon, Kinzer, Sean, Wang, Shu-ting, Xu, Hanyang, Karthikeyan, Lavanya, Sharma, Hardik, Yazdanbakhsh, Amir, Alian, Mohammad, and Esmaeilzadeh, Hadi
Subjects: Computer Science - Hardware Architecture
Abstract: While (1) serverless computing is emerging as a popular form of cloud execution, datacenters are going through major changes: (2) storage disaggregation at the system infrastructure level and (3) integration of domain-specific accelerators at the hardware level. Each of these three trends individually provides significant benefits; however, when combined the benefits diminish. Specifically, the paper makes the key observation that for serverless functions, the overhead of accessing disaggregated persistent storage overshadows the gains from accelerators. Therefore, to benefit from all these trends in conjunction, we propose Domain-Specific Computational Storage for Serverless (DSCS-Serverless). This idea contributes a serverless model that leverages a programmable accelerator within computational storage to conjugate the benefits of acceleration and storage disaggregation simultaneously. Our results with eight applications show that integrating a comparatively small accelerator within the storage (DSCS-Serverless) that fits within its power constraints (15 Watts) significantly outperforms a traditional disaggregated system that utilizes the NVIDIA RTX 2080 Ti GPU (250 Watts). Further, the work highlights that disaggregation, the serverless model, and the limited power budget for computation in storage require a different design than the conventional practices of integrating microprocessors and FPGAs. This insight is in contrast with current practices of designing computational storage that are yet to address the challenges associated with the shifts in datacenters. In comparison with two such conventional designs that use either a quad-core ARM A57 or a Xilinx FPGA, DSCS-Serverless provides 3.7x and 1.7x end-to-end application speedup, 4.3x and 1.9x energy reduction, and 3.2x and 2.3x higher cost efficiency, respectively.
Published: 2023

25. Dynamics of antimicrobial resistance and susceptibility profile in full-scale hospital wastewater treatment plants

Author: Maedeh Esmaeili-khoshmardan, Hossein Dabiri, Mohammad Rafiee, Akbar Eslami, Ahmadreza Yazdanbakhsh, Fatemeh Amereh, Mahsa Jahangiri-rad, and Ali Hashemi
Subjects: antibiotic-resistant genes, antibiotic-resistant bacteria, effluent quality parameters, hospital wastewater treatment plant, resistance profile, Environmental technology. Sanitary engineering, TD1-1066
Abstract: Drug resistance has become a matter of great concern, with many bacteria now resisting multiple antibiotics. This study depicts the occurrence of antibiotic-resistant bacteria (ARB) and resistance patterns in five full-scale hospital wastewater treatment plants (WWTPs). Samples of raw influent wastewater, as well as pre- and post-disinfected effluents, were monitored for targeted ARB and resistance genes in September 2022 and February 2023. Shifts in resistance profiles of Escherichia coli, Pseudomonas aeruginosa, and Acinetobacter baumannii antimicrobial-resistant indicators in the treated effluent compared to that in the raw wastewater were also determined. Ceftazidime (6.78 × 10^5 CFU/mL) and cefotaxime (6.14 × 10^5 CFU/mL) resistant species showed the highest concentrations in raw influent, followed by ciprofloxacin (6.29 × 10^4 CFU/mL) and gentamicin (4.88 × 10^4 CFU/mL). WWTP-D employing a combination of biological treatment and coagulation/clarification for wastewater decontamination showed promising results for reducing ARB emissions from wastewater. Relationships between treated effluent quality parameters and ARB loadings showed that high BOD5 and nitrate levels were possibly contributing to the persistence and/or selection of ARBs in WWTPs. Furthermore, antimicrobial susceptibility tests of targeted species revealed dynamic shifts in resistance profiles through treatment processes, highlighting the potential for ARB and ARGs in hospital wastewater to persist or amplify during treatment. HIGHLIGHTS: Compelling evidence for the occurrence, burden and patterns of bacterial antimicrobial resistance in hospital WWTPs was presented. PACl coagulation technology followed by biological treatment showed a higher removal performance. Resistance patterns were significantly shifted following biological treatment and chlorine disinfection. Resistant burdens were associated with effluent TSS, BOD5, NO3- and free chlorine levels.
Published: 2024
Full Text: View/download PDF

26. A neural modeling approach to study mechanisms underlying the heterogeneity of visual spatial frequency sensitivity in schizophrenia

Author: Caroline Dugan, Basilis Zikopoulos, and Arash Yazdanbakhsh
Subjects: Psychiatry, RC435-571
Abstract: Patients with schizophrenia exhibit abnormalities in spatial frequency sensitivity, and it is believed that these abnormalities indicate more widespread dysfunction and dysregulation of bottom-up processing. The early visual system, including the first-order Lateral Geniculate Nucleus of the thalamus (LGN) and the primary visual cortex (V1), are key contributors to spatial frequency sensitivity. Medicated and unmedicated patients with schizophrenia exhibit contrasting changes in spatial frequency sensitivity, thus making it a useful probe for examining potential effects of the disorder and antipsychotic medications in neural processing. We constructed a parameterized, rate-based neural model of on-center/off-surround neurons in the early visual system to investigate the impacts of changes to the excitatory and inhibitory receptive field subfields. By incorporating changes in both the excitatory and inhibitory subfields that are associated with pathophysiological findings in schizophrenia, the model successfully replicated perceptual data from behavioral/functional studies involving medicated and unmedicated patients. Among several plausible mechanisms, our results highlight the dampening of excitation and/or increase in the spread and strength of the inhibitory subfield in medicated patients and the contrasting decreased spread and strength of inhibition in unmedicated patients. Given that the model was successful at replicating results from perceptual data under a variety of conditions, these elements of the receptive field may be useful markers for the imbalances seen in patients with schizophrenia.
Published: 2024
Full Text: View/download PDF

27. Controlled human hookworm infection remodels plasmacytoid dendritic cells and regulatory T cells towards profiles seen in natural infections in endemic areas

Author: Mikhael D. Manurung, Friederike Sonnet, Marie-Astrid Hoogerwerf, Jacqueline J. Janse, Yvonne Kruize, Laura de Bes-Roeleveld, Marion König, Alex Loukas, Benjamin G. Dewals, Taniawati Supali, Simon P. Jochems, Meta Roestenberg, Mariateresa Coppola, and Maria Yazdanbakhsh
Subjects: Science
Abstract: Hookworm infection remains a significant public health concern, particularly in low- and middle-income countries, where mass drug administration has not stopped reinfection. Developing a vaccine is crucial to complement current control measures, which necessitates a thorough understanding of host immune responses. By leveraging controlled human infection models and high-dimensional immunophenotyping, here we investigated the immune remodeling following infection with 50 Necator americanus L3 hookworm larvae in four naïve volunteers over two years of follow-up and compared the profiles with naturally infected populations in endemic areas. Increased plasmacytoid dendritic cell frequency and diminished responsiveness to Toll-like receptor 7/8 ligand were observed in both controlled and natural infection settings. Despite the increased CD45RA+ regulatory T cell (Treg) frequencies in both settings, markers of Treg function, including inducible T-cell costimulator (ICOS), tumor necrosis factor receptor 2 (TNFR2), and latency-associated peptide (LAP), as well as in vitro Treg suppressive capacity were higher in natural infections. Taken together, this study provides unique insights into the immunological trajectories following a first-in-life hookworm infection compared to natural infections.
Published: 2024
Full Text: View/download PDF

28. Learning Performance-Improving Code Edits

Author: Shypula, Alexander, Madaan, Aman, Zeng, Yimeng, Alon, Uri, Gardner, Jacob, Hashemi, Milad, Neubig, Graham, Ranganathan, Parthasarathy, Bastani, Osbert, and Yazdanbakhsh, Amir
Subjects: Computer Science - Software Engineering, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Performance
Abstract: With the decline of Moore's law, optimizing program performance has become a major focus of software research. However, high-level optimizations such as API and algorithm changes remain elusive due to the difficulty of understanding the semantics of code. Simultaneously, pretrained large language models (LLMs) have demonstrated strong capabilities at solving a wide range of programming tasks. To that end, we introduce a framework for adapting LLMs to high-level program optimization. First, we curate a dataset of performance-improving edits made by human programmers of over 77,000 competitive C++ programming submission pairs, accompanied by extensive unit tests. A major challenge is the significant variability of measuring performance on commodity hardware, which can lead to spurious "improvements." To isolate and reliably evaluate the impact of program optimizations, we design an environment based on the gem5 full system simulator, the de facto simulator used in academia and industry. Next, we propose a broad range of adaptation strategies for code optimization; for prompting, these include retrieval-based few-shot prompting and chain-of-thought, and for finetuning, these include performance-conditioned generation and synthetic data augmentation based on self-play. A combination of these techniques achieves a mean speedup of 6.86 with eight generations, higher than average optimizations from individual programmers (3.66). Using our model's fastest generations, we set a new upper limit on the fastest speedup possible for our dataset at 9.64 compared to using the fastest human submissions available (9.56). Comment: Published as a conference paper at ICLR 2024 (Spotlight). Project website: https://pie4perf.com/
Published: 2023

29. STEP: Learning N:M Structured Sparsity Masks from Scratch with Precondition

Author: Lu, Yucheng, Agrawal, Shivani, Subramanian, Suvinay, Rybakov, Oleg, De Sa, Christopher, and Yazdanbakhsh, Amir
Subjects: Computer Science - Machine Learning
Abstract: Recent innovations in hardware (e.g. Nvidia A100) have motivated learning N:M structured sparsity masks from scratch for fast model inference. However, state-of-the-art learning recipes in this regime (e.g. SR-STE) are proposed for non-adaptive optimizers like momentum SGD, while incurring a non-trivial accuracy drop for Adam-trained models like attention-based LLMs. In this paper, we first demonstrate that this gap originates from the poorly estimated second moment (i.e. variance) in Adam states given by the masked weights. We conjecture that learning N:M masks with Adam should take the critical regime of variance estimation into account. In light of this, we propose STEP, an Adam-aware recipe that learns N:M masks with two phases: first, STEP calculates a reliable variance estimate (precondition phase) and subsequently, the variance remains fixed and is used as a precondition to learn N:M masks (mask-learning phase). STEP automatically identifies the switching point of the two phases by dynamically sampling variance changes over the training trajectory and testing the sample concentration. Empirically, we evaluate STEP and other baselines such as ASP and SR-STE on multiple tasks including CIFAR classification, machine translation and LLM fine-tuning (BERT-Base, GPT-2). We show STEP mitigates the accuracy drop of baseline recipes and is robust to aggressive structured sparsity ratios.
Published: 2023

30. A Glance at Archives of Academic Emergency Medicine Journal in 2024

Author: Mehrnoosh Yazdanbakhsh and Somayeh Saghaei Dehkordi
Subjects: 2023, Editorial, Emergency Severity Index - Emergency Department, Overview, Medical emergencies. Critical care. Intensive care. First aid, RC86-88.9
Abstract: There were 70 articles published in the 2024 volume of Archives of Academic Emergency Medicine. Around 350 authors contributed to the published works; they were affiliated with centers located in countries such as USA, Canada, Germany, Finland, China, Poland, Italy, Australia, UAE, Malaysia, India, Egypt, Bangladesh, Turkey, Thailand, Nigeria, Jordan, Yemen, Saudi Arabia, Azerbaijan, Vietnam, and Pakistan. We would like to thank the authors for trusting us with their valuable works and publishing their articles with us.
Published: 2024
Full Text: View/download PDF

31. Lifestyle score is associated with cellular immune profiles in healthy Tanzanian adults

Author: Jeremia J. Pyuza, Marloes M.A.R. van Dorst, Koen Stam, Linda Wammes, Marion König, Vesla I. Kullaya, Yvonne Kruize, Wesley Huisman, Nikuntufya Andongolile, Anastazia Ngowi, Elichilia R. Shao, Alex Mremi, Pancras C.W. Hogendoorn, Sia E. Msuya, Simon P. Jochems, Wouter A.A. de Steenhuijsen Piters, and Maria Yazdanbakhsh
Subjects: Neurosciences. Biological psychiatry. Neuropsychiatry, RC321-571
Abstract: Immune system and vaccine responses vary across geographical locations worldwide, not only between high and low-middle income countries (LMICs), but also between rural and urban populations within the same country. Lifestyle factors such as housing conditions, exposure to microorganisms and parasites and diet are associated with rural and urban living. However, the relationships between these lifestyle factors and immune profiles have not been mapped in detail. Here, we profiled the immune system of 100 healthy Tanzanians living across four rural/urban areas using mass cytometry. We developed a lifestyle score based on an individual's household assets, housing condition and recent dietary history and studied the association with cellular immune profiles. Seventeen out of 80 immune cell clusters were associated with living location or lifestyle score, with eight identifiable only using lifestyle score. Individuals with low lifestyle score, most of whom live in rural settings, showed higher frequencies of NK cells, plasmablasts, atypical memory B cells, T helper 2 cells, regulatory T cells and activated CD4+ T effector memory cells expressing CD38, HLA-DR and CTLA-4. In contrast, those with high lifestyle score, most of whom live in urban areas, showed a less activated state of the immune system illustrated by higher frequencies of naïve CD8+ T cells. Using an elastic net machine learning model, we identified cellular immune signatures most associated with lifestyle score. Assuming a link between these immune profiles and vaccine responses, these signatures may inform us on the cellular mechanisms underlying poor responses to vaccines, but also reduced autoimmunity and allergies in low- and middle-income countries.
Published: 2024
Full Text: View/download PDF

32. DACAPO: Accelerating Continuous Learning in Autonomous Systems for Video Analytics.

Author: Yoonsung Kim, Changhun Oh, Jinwoo Hwang, Wonung Kim, Seongryong Oh, Yubin Lee, Hardik Sharma, Amir Yazdanbakhsh, and Jongse Park
Published: 2024
Full Text: View/download PDF

33. USM-Lite: Quantization and Sparsity Aware Fine-Tuning for Speech Recognition with Universal Speech Models.

Author: Shaojin Ding, David Qiu, David Rim, Yanzhang He, Oleg Rybakov, Bo Li, Rohit Prabhavalkar, Weiran Wang, Tara N. Sainath, Zhonglin Han, Jian Li, Amir Yazdanbakhsh, and Shivani Agrawal
Published: 2024
Full Text: View/download PDF

34. In-Storage Domain-Specific Acceleration for Serverless Computing.

Author: Rohan Mahapatra, Soroush Ghodrati, Byung Hoon Ahn, Sean Kinzer, Shu-Ting Wang, Hanyang Xu, Lavanya Karthikeyan, Hardik Sharma, Amir Yazdanbakhsh, Mohammad Alian, and Hadi Esmaeilzadeh
Published: 2024
Full Text: View/download PDF

35. Tandem Processor: Grappling with Emerging Operators in Neural Networks.

Author: Soroush Ghodrati, Sean Kinzer, Hanyang Xu, Rohan Mahapatra, Yoonsung Kim, Byung Hoon Ahn, Dong Kai Wang, Lavanya Karthikeyan, Amir Yazdanbakhsh, Jongse Park, Nam Sung Kim, and Hadi Esmaeilzadeh
Published: 2024
Full Text: View/download PDF

36. GRANITE: A Graph Neural Network Model for Basic Block Throughput Estimation

Author: Sykora, Ondrej, Phothilimthana, Phitchaya Mangpo, Mendis, Charith, and Yazdanbakhsh, Amir
Subjects: Computer Science - Machine Learning, Computer Science - Hardware Architecture, Computer Science - Performance
Abstract: Analytical hardware performance models yield swift estimation of desired hardware performance metrics. However, developing these analytical models for modern processors with sophisticated microarchitectures is an extremely laborious task and requires a firm understanding of the target microarchitecture's internal structure. In this paper, we introduce GRANITE, a new machine learning model that estimates the throughput of basic blocks across different microarchitectures. GRANITE uses a graph representation of basic blocks that captures both structural and data dependencies between instructions. This representation is processed using a graph neural network that takes advantage of the relational information captured in the graph and learns a rich neural representation of the basic block that allows more precise throughput estimation. Our results establish a new state-of-the-art for basic block performance estimation with an average test error of 6.9% across a wide range of basic blocks and microarchitectures for the x86-64 target. Compared to recent work, this reduces the error by 1.7% while improving training and inference throughput by approximately 3.0x. In addition, we propose the use of multi-task learning with independent multi-layer feed forward decoder networks. Our results show that this technique further improves precision of all learned models while significantly reducing per-microarchitecture training costs. We perform an extensive set of ablation studies and comparisons with prior work, concluding a set of methods to achieve high accuracy for basic block performance estimation. Comment: 13 pages; 5 figures; published at IISWC 2022; Included IEEE copyright
Published: 2022

37. Text and Patterns: For Effective Chain of Thought, It Takes Two to Tango

Author: Madaan, Aman and Yazdanbakhsh, Amir
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: The past decade has witnessed dramatic gains in natural language processing and an unprecedented scaling of large language models. These developments have been accelerated by the advent of few-shot techniques such as chain of thought (CoT) prompting. Specifically, CoT pushes the performance of large language models in a few-shot setup by augmenting the prompts with intermediate steps. Despite impressive results across various tasks, the reasons behind their success have not been explored. This work uses counterfactual prompting to develop a deeper understanding of CoT-based few-shot prompting mechanisms in large language models. We first systematically identify and define the key components of a prompt: symbols, patterns, and text. Then, we devise and conduct an exhaustive set of experiments across four different tasks, by querying the model with counterfactual prompts where only one of these components is altered. Our experiments across three models (PaLM, GPT-3, and CODEX) reveal several surprising findings and bring into question the conventional wisdom around few-shot prompting. First, the presence of factual patterns in a prompt is practically immaterial to the success of CoT. Second, our results conclude that the primary role of intermediate steps may not be to facilitate learning how to solve a task. The intermediate steps are rather a beacon for the model to realize what symbols to replicate in the output to form a factual answer. Further, text imbues patterns with commonsense knowledge and meaning. Our empirical and qualitative analysis reveals that a symbiotic relationship between text and patterns explains the success of few-shot prompting: text helps extract commonsense from the question to help patterns, and patterns enforce task understanding and direct text generation. Comment: Shortened version with additional results from CODEX and GPT-3. The authors contributed equally. Work done when Aman Madaan was a student researcher at Google Research, Brain Team
Published: 2022

38. Training Recipe for N:M Structured Sparsity with Decaying Pruning Mask

Author: Kao, Sheng-Chun, Yazdanbakhsh, Amir, Subramanian, Suvinay, Agrawal, Shivani, Evci, Utku, and Krishna, Tushar
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Hardware Architecture, Computer Science - Performance
Abstract: Sparsity has become one of the promising methods to compress and accelerate Deep Neural Networks (DNNs). Among different categories of sparsity, structured sparsity has gained more attention due to its efficient execution on modern accelerators. Particularly, N:M sparsity is attractive because there are already hardware accelerator architectures that can leverage certain forms of N:M structured sparsity to yield higher compute-efficiency. In this work, we focus on N:M sparsity and extensively study and evaluate various training recipes for N:M sparsity in terms of the trade-off between model accuracy and compute cost (FLOPs). Building upon this study, we propose two new decay-based pruning methods, namely "pruning mask decay" and "sparse structure decay". Our evaluations indicate that these proposed methods consistently deliver state-of-the-art (SOTA) model accuracy, comparable to unstructured sparsity, on a Transformer-based model for a translation task. The increase in the accuracy of the sparse model using the new training recipes comes at the cost of a marginal increase in the total training compute (FLOPs). Comment: 11 pages, 2 figures, and 9 tables. Published at the ICML Workshop on Sparsity in Neural Networks Advancing Understanding and Practice, 2022. First two authors contributed equally
Published: 2022
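
Note on entry 38: the abstract does not spell out what an N:M pattern looks like. The sketch below assumes the common convention in which at most N weights are kept nonzero in every group of M consecutive weights (for example 2:4). It builds a one-shot magnitude-based mask only; it is not the paper's "pruning mask decay" or "sparse structure decay" recipe, and the function name `nm_mask` is a hypothetical helper chosen for illustration.

```python
def nm_mask(weights, n=2, m=4):
    """Return a 0/1 mask keeping the n largest-magnitude weights in each group of m.

    Assumption: "N:M" means at most n nonzeros per m consecutive weights.
    """
    if len(weights) % m != 0:
        raise ValueError("number of weights must be a multiple of m")
    mask = []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        # Positions of the n largest |w| in this group (ties broken by position).
        keep = set(sorted(range(m), key=lambda i: -abs(group[i]))[:n])
        mask.extend(1 if i in keep else 0 for i in range(m))
    return mask


weights = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.0]
mask = nm_mask(weights, n=2, m=4)
sparse_weights = [w * k for w, k in zip(weights, mask)]
print(mask)
print(sparse_weights)
```

Every group of four weights ends up with exactly two surviving entries, which is the regular structure that sparse-aware hardware can exploit.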

39. Sparse Attention Acceleration with Synergistic In-Memory Pruning and On-Chip Recomputation

Author: Yazdanbakhsh, Amir, Moradifirouzabadi, Ashkan, Li, Zheng, and Kang, Mingu
Subjects: Computer Science - Machine Learning, Computer Science - Hardware Architecture
Abstract: As its core computation, a self-attention mechanism gauges pairwise correlations across the entire input sequence. Despite favorable performance, calculating pairwise correlations is prohibitively costly. While recent work has shown the benefits of runtime pruning of elements with low attention scores, the quadratic complexity of self-attention mechanisms and their on-chip memory capacity demands are overlooked. This work addresses these constraints by architecting an accelerator, called SPRINT, which leverages the inherent parallelism of ReRAM crossbar arrays to compute attention scores in an approximate manner. Our design prunes the low attention scores using a lightweight analog thresholding circuitry within ReRAM, enabling SPRINT to fetch only a small subset of relevant data to on-chip memory. To mitigate potential negative repercussions for model accuracy, SPRINT re-computes the attention scores for the few fetched data in digital. The combined in-memory pruning and on-chip recompute of the relevant attention scores enables SPRINT to transform quadratic complexity to a merely linear one. In addition, we identify and leverage a dynamic spatial locality between the adjacent attention operations even after pruning, which eliminates costly yet redundant data fetches. We evaluate our proposed technique on a wide range of state-of-the-art transformer models. On average, SPRINT yields 7.5x speedup and 19.6x energy reduction when a total of 16KB of on-chip memory is used, while remaining virtually on par with the accuracy of the baseline models (on average 0.36% degradation). Comment: 15 pages; 14 figures; published at MICRO 2022; First three authors contributed equally
Published: 2022

40. Removal of heavy metals from the aqueous solution by nanomaterials: a review with analysing and categorizing the studies

Author: Adabi, Shervin, Yazdanbakhsh, Ahmadreza, Shahsavani, Abbas, Sheikhmohammadi, Amir, and Hadi, Mahdi
Published: 2023
Full Text: View/download PDF

41. Machine learning-powered estimation of malachite green photocatalytic degradation with NML-BiFeO3 composites

Author: Iman Salahshoori, Amirhosein Yazdanbakhsh, and Alireza Baghban
Subjects: Dye removal, Kernel-based Gaussian process regression (GPR), Metal-incorporated bismuth ferrite (BiFeO3), Machine learning, Photocatalytic degradation, Wastewater treatment, Medicine, Science
Abstract: This study explores the potential of photocatalytic degradation using novel NML-BiFeO3 (noble metal-incorporated bismuth ferrite) compounds for eliminating malachite green (MG) dye from wastewater. The effectiveness of various Gaussian process regression (GPR) models in predicting MG degradation is investigated. Four GPR models (Matern, Exponential, Squared Exponential, and Rational Quadratic) were employed to analyze a dataset of 1200 observations encompassing various experimental conditions. The models have considered ten input variables, including catalyst properties, solution characteristics, and operational parameters. The Exponential kernel-based GPR model achieved the best performance, with a near-perfect R^2 value of 1.0, indicating exceptional accuracy in predicting MG degradation. Sensitivity analysis revealed process time as the most critical factor influencing MG degradation, followed by pore volume, catalyst loading, light intensity, catalyst type, pH, anion type, surface area, and humic acid concentration. This highlights the complex interplay between these factors in the degradation process. The reliability of the models was confirmed by outlier detection using William’s plot, demonstrating a minimal number of outliers (66–71 data points depending on the model). This indicates the robustness of the data utilized for model development. This study suggests that NML-BiFeO3 composites hold promise for wastewater treatment and that GPR models, particularly Matern-GPR, offer a powerful tool for predicting MG degradation. Identifying fundamental catalyst properties can expedite the application of NML-BiFeO3, leading to optimized wastewater treatment processes. Overall, this study provides valuable insights into using NML-BiFeO3 compounds and machine learning for efficient MG removal from wastewater.
Published: 2024
Full Text: View/download PDF

42. Photoreduction of atrazine from aqueous solution using sulfite/iodide/UV process, degradation, kinetics and by-products pathway

Author: Robabeh Vahidi-Kolur, Ahmadreza Yazdanbakhsh, Seyed Arman Hosseini, and Amir Sheikhmohammadi
Subjects: Medicine, Science
Abstract: Due to its widespread use in agriculture, atrazine has entered aquatic environments and thus poses potential risks to public health. Therefore, researchers have done many studies to remove it. Advanced reduction process (ARP) is an emerging technology for degrading organic contaminants from aqueous solutions. This study was aimed at evaluating the degradation of atrazine via the sulfite/iodide/UV process. The best performance (96% of atrazine degradation) was observed at neutral pH at 60 min of reaction time, with an atrazine concentration of 10 mg/L and a concentration of sulfite and iodide of 1 mM. The kinetic study revealed that the removal of atrazine matched the pseudo-first-order model. Results have shown that reduction induced by $e_{\text{aq}}^{-}$ and direct photolysis dominated the degradation of atrazine. The presence of anions ($\text{Cl}^{-}$, $\text{CO}_{3}^{2-}$ and $\text{SO}_{4}^{2-}$) did not have a significant effect on the degradation efficiency. In optimal conditions, COD and TOC removal efficiencies of 32% and 4%, respectively, were obtained. Atrazine degradation intermediates were generated by de-chlorination, hydroxylation, de-alkylation, and oxidation reactions. Overall, this research illustrated that the sulfite/iodide/UV process could be a promising approach for removing atrazine and similar contaminants from aqueous solutions.
Published: 2024
Full Text: View/download PDF

43. Object Detection, Recognition, Deep Learning, and the Universal Law of Generalization

Author: Rustom, Faris B., Öğmen, Haluk, and Yazdanbakhsh, Arash
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Neural and Evolutionary Computing
Abstract: Object detection and recognition are fundamental functions underlying the success of species. Because the appearance of an object exhibits a large variability, the brain has to group these different stimuli under the same object identity, a process of generalization. Does the process of generalization follow some general principles or is it an ad-hoc "bag-of-tricks"? The Universal Law of Generalization provided evidence that generalization follows similar properties across a variety of species and tasks. Here we test the hypothesis that the internal representations underlying generalization reflect the natural properties of object detection and recognition in our environment rather than the specifics of the system solving these problems. By training a deep-neural-network with images of "clear" and "camouflaged" animals, we found that with a proper choice of category prototypes, the generalization functions are monotone decreasing, similar to the generalization functions of biological systems. Our findings support the hypothesis of the study.
Published: 2022

44. S. mansoni -derived omega-1 prevents OVA-specific allergic airway inflammation via hampering of cDC2 migration.

Author: Thiago A Patente, Thomas A Gasan, Maaike Scheenstra, Arifa Ozir-Fazalalikhan, Katja Obieglo, Sjoerd Schetters, Stijn Verwaerde, Karl Vergote, Frank Otto, Ruud H P Wilbers, Eline van Bloois, Yolanda van Wijck, Christian Taube, Hamida Hammad, Arjen Schots, Bart Everts, Maria Yazdanbakhsh, Bruno Guigas, Cornelis H Hokke, and Hermelijn H Smits
Subjects: Immunologic diseases. Allergy, RC581-607, Biology (General), QH301-705.5
Abstract: Chronic infection with Schistosoma mansoni parasites is associated with reduced allergic sensitization in humans, while schistosome eggs protect against allergic airway inflammation (AAI) in mice. One of the main secretory/excretory molecules from schistosome eggs is the glycosylated T2-RNAse Omega-1 (ω1). We hypothesized that ω1 induces protection against AAI during infection. Peritoneal administration of ω1 prior to sensitization with Ovalbumin (OVA) reduced airway eosinophilia and pathology, and OVA-specific Th2 responses upon challenge, independent from changes in regulatory T cells. ω1 was taken up by monocyte-derived dendritic cells, mannose receptor (CD206)-positive conventional type 2 dendritic cells (CD206+ cDC2), and by recruited peritoneal macrophages. Additionally, ω1 impaired CCR7, F-actin, and costimulatory molecule expression on myeloid cells and cDC2 migration in and ex vivo, as evidenced by reduced OVA+ CD206+ cDC2 in the draining mediastinal lymph nodes (medLn) and retainment in the peritoneal cavity, while antigen processing and presentation in cDC2 were not affected by ω1 treatment. Importantly, RNAse mutant ω1 was unable to reduce AAI or affect DC migration, indicating that ω1 effects are dependent on its RNAse activity. Altogether, ω1 hampers migration of OVA+ cDC2 to the draining medLn in mice, elucidating how ω1 prevents allergic airway inflammation in the OVA/alum mouse model.
Published: 2024
Full Text: View/download PDF

45. Accelerating Attention through Gradient-Based Learned Runtime Pruning

Author: Li, Zheng, Ghodrati, Soroush, Yazdanbakhsh, Amir, Esmaeilzadeh, Hadi, and Kang, Mingu
Subjects: Computer Science - Computation and Language, Computer Science - Hardware Architecture, Computer Science - Machine Learning
Abstract: Self-attention is a key enabler of state-of-the-art accuracy for various transformer-based Natural Language Processing models. This attention mechanism calculates a correlation score for each word with respect to the other words in a sentence. Commonly, only a small subset of words highly correlates with the word under attention, which is only determined at runtime. As such, a significant amount of computation is inconsequential due to low attention scores and can potentially be pruned. The main challenge is finding the threshold for the scores below which subsequent computation will be inconsequential. Although such a threshold is discrete, this paper formulates its search through a soft differentiable regularizer integrated into the loss function of the training. This formulation piggybacks on the back-propagation training to analytically co-optimize the threshold and the weights simultaneously, striking a formally optimal balance between accuracy and computation pruning. To best utilize this mathematical innovation, we devise a bit-serial architecture, dubbed LeOPArd, for transformer language models with a bit-level early termination microarchitectural mechanism. We evaluate our design across 43 back-end tasks for MemN2N, BERT, ALBERT, GPT-2, and Vision transformer models. Post-layout results show that, on average, LeOPArd yields 1.9x and 3.9x speedup and energy reduction, respectively, while keeping the average accuracy virtually intact (<0.2% degradation). Comment: First three authors contributed equally; published at ISCA 2022
Published: 2022

46. Recent advances and applications of stimuli-responsive nanomaterials for water treatment: A comprehensive review

Author: Salahshoori, Iman, Yazdanbakhsh, Amirhosein, Namayandeh Jorabchi, Majid, Kazemabadi, Fatemeh Zare, Khonakdar, Hossein Ali, and Mohammadi, Amir H.
Published: 2024
Full Text: View/download PDF

47. Lifestyle score is associated with cellular immune profiles in healthy Tanzanian adults

Author: Pyuza, Jeremia J., van Dorst, Marloes M.A.R., Stam, Koen, Wammes, Linda, König, Marion, Kullaya, Vesla I., Kruize, Yvonne, Huisman, Wesley, Andongolile, Nikuntufya, Ngowi, Anastazia, Shao, Elichilia R., Mremi, Alex, Hogendoorn, Pancras C.W., Msuya, Sia E., Jochems, Simon P., de Steenhuijsen Piters, Wouter A.A., and Yazdanbakhsh, Maria
Published: 2024
Full Text: View/download PDF

48. Navigating the molecular landscape of environmental science and heavy metal removal: A simulation-based approach

Author: Salahshoori, Iman, Nobre, Marcos A.L., Yazdanbakhsh, Amirhosein, Eshaghi Malekshah, Rahime, Asghari, Morteza, Ali Khonakdar, Hossein, and Mohammadi, Amir H.
Published: 2024
Full Text: View/download PDF

49. Quinoline-piperazine derivatives as potential α-Glucosidase inhibitors: Synthesis, biological evaluation, and in silico studies

Author: Ghasemi, Mehran, Iraji, Aida, Dehghan, Maryam, Nosood, Yazdanbakhsh Lotfi, Ghanavieh, Negin Fattahi, Hashempur, Mohammad Hashem, Mojtabavi, Somayeh, Faramarzi, Mohammad Ali, Mahdavi, Mohammad, and Al-Harrasi, Ahmed
Published: 2025
Full Text: View/download PDF

50. Anaerobic co-digestion of landfill leachate and sewage sludge: Determination of the optimal ratio and improvement of digestion by pre-ozonation

Author: Zahra Jamshidinasirmahale, Ahmadreza Yazdanbakhsh, Mohamadreza Masoudinejad, and Nadali Alavi Bakhtiarvand
Subjects: sewage, solid waste, ozone, anaerobic, methane, Environmental sciences, GE1-350
Abstract: Background: Anaerobic co-digestion (AcoD) of various wastes is a suitable method for the removal of contaminants and biogas production. The first aim of this study was to determine the optimal ratio of landfill leachate (LL) and sewage sludge (SS) for AcoD, and the second was to evaluate the effect of pre-ozonation of the mixture on AcoD. Methods: The LL and SS samples were taken from landfill sites and municipal wastewater treatment plants (MWTPs), respectively. In the first step, five reactors were used and named R1 (100% SS), R2 (100% LL), R3 (15% LL/85% SS), R4 (25% LL/75% SS), and R5 (45% LL/55% SS). Mesophilic anaerobic digestion (AD) was performed on the reactors and the optimal ratio was determined. In the second stage, the optimal mixtures were subjected to an ozonation process before AcoD. Results: The results of the first stage showed that the highest removal efficiency of total solids (TS), volatile solids (VS), and chemical oxygen demand (COD), and the highest biogas production, belonged to the R3 digester, containing 15% LL and 85% SS. In the second stage, the results showed that the removal efficiencies of COD and VS in the ozonated sample at the dosage of 7.6 gO3/h were 29.8% and 36.6% higher than in the non-ozonated sample, respectively. Furthermore, in the ozonated sample, the biogas yield and the content of methane in the gas mixture were 27% and 9% higher, respectively, compared to the non-ozonated sample. Conclusion: According to the results, the appropriate ratio of LL to SS and pre-ozonation of the LL/SS mixture have a great impact on the performance of AcoD.
Published: 2024
Full Text: View/download PDF
