109,205 results on "Shah, P."
Search Results
2. Lift Every Voice in Tech: Co-Designed Recommendations to Support Black Workers and Learners Seeking to Enter and Advance in Technology Industry Career Pathways
- Author
- Digital Promise, Bria Carter, Britney Jacobs, Zohal Shah, and Chioma Aso-Hernandez
- Abstract
Research has shown that access to technology industry pathways and support for recruitment, retention, and advancement through technology careers remain inequitable for Black talent due to various systemic barriers. To help address this issue, Digital Promise conducted research that centers the voices and lived experiences of Black workers and learners seeking to enter and advance in the technology industry with the purpose of building awareness to the: (1) challenges and barriers they face navigating the U.S. technology learning and working ecosystem; (2) factors such as supports and services that have facilitated their technology career pathway entry, retention, and advancement; and (3) collaboratively designed recommendations for needed supports that they have identified that can better promote successful navigation and persistence within technology career pathways. This report further highlights actionable steps that various technology industry contributors can take to dismantle systemic barriers within the technology learning and workforce ecosystem and increase access to non-four-year-degree pathways to tech careers. [Funding for this project is provided by Walmart through the Walmart.org Center for Racial Equity.]
- Published
- 2024
3. Infectious Disease Forecasting in India using LLMs and Deep Learning
- Author
- Shah, Chaitya, Gandhi, Kashish, Shah, Javal, Shah, Kreena, Patil, Nilesh, and Bhowmick, Kiran
- Subjects
- Computer Science - Machine Learning, Computer Science - Logic in Computer Science
- Abstract
Many uncontrollable disease outbreaks of the past exposed several vulnerabilities in healthcare systems worldwide. While advancements in technology assisted in the rapid creation of vaccines, there needs to be a pressing focus on the prevention and prediction of such massive outbreaks. Early detection and intervention in an outbreak can drastically reduce its impact on public health while also making the healthcare system more resilient. The complexity of disease transmission dynamics, the influence of various directly and indirectly related factors, and the limitations of traditional approaches are the main bottlenecks in taking preventive action. Specifically, this paper implements deep learning algorithms and LLMs to predict the severity of infectious disease outbreaks. Utilizing the historical data of several diseases that have spread in India and the climatic data spanning the past decade, the insights from our research aim to assist in creating a robust predictive system for any outbreaks in the future., Comment: 16 pages, 4 figures
- Published
- 2024
4. VERITAS-NLI : Validation and Extraction of Reliable Information Through Automated Scraping and Natural Language Inference
- Author
- Shah, Arjun, Shah, Hetansh, Bafna, Vedica, Khandor, Charmi, and Nair, Sindhu
- Subjects
- Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, I.2.1, I.2.7
- Abstract
In today's day and age where information is rapidly spread through online platforms, the rise of fake news poses an alarming threat to the integrity of public discourse, societal trust, and reputed news sources. Classical machine learning and Transformer-based models have been extensively studied for the task of fake news detection, however they are hampered by their reliance on training data and are unable to generalize on unseen headlines. To address these challenges, we propose our novel solution, leveraging web-scraping techniques and Natural Language Inference (NLI) models to retrieve external knowledge necessary for verifying the accuracy of a headline. Our system is evaluated on a diverse self-curated evaluation dataset spanning over multiple news channels and broad domains. Our best performing pipeline achieves an accuracy of 84.3% surpassing the best classical Machine Learning model by 33.3% and Bidirectional Encoder Representations from Transformers (BERT) by 31.0% . This highlights the efficacy of combining dynamic web-scraping with Natural Language Inference to find support for a claimed headline in the corresponding externally retrieved knowledge for the task of fake news detection., Comment: Preprint, 15 pages, 7 figures
- Published
- 2024
5. Anti-CRT Attacks, School Choice, and the Privatization Endgame
- Author
- Sachin Maharaj, Stephanie Tuters, and Vidya Shah
- Abstract
Across Canada, school districts have been confronting a backlash to their equity and social justice initiatives. Critics of public education have been arguing that the solution to these controversies is to increase school choice. Using several examples from the United States, this paper argues that the endgame of these strategies is to undermine the legitimacy of public education and increase support for private alternatives. To protect its future viability, the paper also calls on public education advocates to grapple with ongoing marginalization within school systems which make private options increasingly attractive.
- Published
- 2024
6. Assessing the Validity and Reliability of the Inventory of Science Teacher Readiness in Implementing Classroom-Based Assessment (ISTRI-CBA)
- Author
- Mat Rasid Ishak, Hidayah Mohd Fadzil, and Harris Shah Abd Hamid
- Abstract
Teacher readiness is the willingness of a teacher to implement a planned program successfully. Readiness is crucial in ensuring a program can be implemented at the individual or organizational level. The readiness of science teachers to implement Classroom-Based Assessment (CBA) determines the direction and success of primary school assessments in Malaysia. This pilot study aims to create and evaluate the Inventory of Science Teachers' Readiness in Implementing Classroom-based Assessment (ISTRI-CBA). There have been many studies on CBA. However, there are limitations in measuring the readiness of science teachers to implement CBA. The evaluation will be ineffective without a suitable tool to measure teacher readiness in CBA. Therefore, to determine the level of readiness of science teachers, the researcher needs an instrument that is valid, reliable, and appropriate to the context of the Ministry of Education, culture, and the current situation in Malaysia. It is necessary to have a conceptually and psychometrically substantial inventory. This challenge was addressed by developing the ISTRI-CBA, comprising five measurement dimensions: knowledge about CBA, skills of CBA, resource support, attitudes, and professional values. The development of ISTRI-CBA requires validity and reliability for its use. This study aims to (1) test the quality characteristics of the ISTRI-CBA items and (2) determine the reliability of the ISTRI-CBA obtained using the Rasch Model measurement. The research involves the administration of a survey questionnaire to 44 science teachers in a district in Malaysia. The findings show that the ISTRI-CBA, comprising 157 items, has good psychometric characteristics and meets the measurement criteria of the Rasch measurement model. The item quality and reliability also prove the suitability of the whole dimension. The study's findings provide evidence of the empirical implications that consistently support ISTRI-CBA as a valid and reliable instrument to measure teachers' readiness to implement CBA in science subjects. The findings also reveal that the ISTRI-CBA can measure science teachers' readiness to implement CBA.
- Published
- 2024
7. Fully Dynamic Adversarially Robust Correlation Clustering in Polylogarithmic Update Time
- Author
- Braverman, Vladimir, Dharangutte, Prathamesh, Pai, Shreyas, Shah, Vihan, and Wang, Chen
- Subjects
- Computer Science - Data Structures and Algorithms, Computer Science - Machine Learning
- Abstract
We study the dynamic correlation clustering problem with $\textit{adaptive}$ edge label flips. In correlation clustering, we are given a $n$-vertex complete graph whose edges are labeled either $(+)$ or $(-)$, and the goal is to minimize the total number of $(+)$ edges between clusters and the number of $(-)$ edges within clusters. We consider the dynamic setting with adversarial robustness, in which the $\textit{adaptive}$ adversary could flip the label of an edge based on the current output of the algorithm. Our main result is a randomized algorithm that always maintains an $O(1)$-approximation to the optimal correlation clustering with $O(\log^{2}{n})$ amortized update time. Prior to our work, no algorithm with $O(1)$-approximation and $\text{polylog}{(n)}$ update time for the adversarially robust setting was known. We further validate our theoretical results with experiments on synthetic and real-world datasets with competitive empirical performances. Our main technical ingredient is an algorithm that maintains $\textit{sparse-dense decomposition}$ with $\text{polylog}{(n)}$ update time, which could be of independent interest.
- Published
- 2024
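The disagreement objective described in entry 7 is easy to state concretely. The short Python sketch below (our own illustration with invented names, not code from the paper) counts the (+) edges across clusters plus the (-) edges inside clusters for a given clustering.

```python
def correlation_disagreements(signed_edges, cluster_of):
    """Disagreement cost of a clustering on a signed complete graph.

    signed_edges: iterable of (u, v, sign) with sign '+' or '-'
    cluster_of:   dict mapping vertex -> cluster id
    A '+' edge between different clusters or a '-' edge inside a
    cluster each contributes one unit of cost.
    """
    cost = 0
    for u, v, sign in signed_edges:
        same_cluster = cluster_of[u] == cluster_of[v]
        if (sign == '+' and not same_cluster) or (sign == '-' and same_cluster):
            cost += 1
    return cost

# Toy check: a triangle with one '-' edge cannot be clustered with zero cost.
edges = [(1, 2, '+'), (2, 3, '+'), (1, 3, '-')]
print(correlation_disagreements(edges, {1: 0, 2: 0, 3: 0}))  # -> 1
```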
8. Time-to-Event Pretraining for 3D Medical Imaging
- Author
- Huo, Zepeng, Fries, Jason Alan, Lozano, Alejandro, Valanarasu, Jeya Maria Jose, Steinberg, Ethan, Blankemeier, Louis, Chaudhari, Akshay S., Langlotz, Curtis, and Shah, Nigam H.
- Subjects
- Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
- Abstract
With the rise of medical foundation models and the growing availability of imaging data, scalable pretraining techniques offer a promising way to identify imaging biomarkers predictive of future disease risk. While current self-supervised methods for 3D medical imaging models capture local structural features like organ morphology, they fail to link pixel biomarkers with long-term health outcomes due to a missing context problem. Current approaches lack the temporal context necessary to identify biomarkers correlated with disease progression, as they rely on supervision derived only from images and concurrent text descriptions. To address this, we introduce time-to-event pretraining, a pretraining framework for 3D medical imaging models that leverages large-scale temporal supervision from paired, longitudinal electronic health records (EHRs). Using a dataset of 18,945 CT scans (4.2 million 2D images) and time-to-event distributions across thousands of EHR-derived tasks, our method improves outcome prediction, achieving an average AUROC increase of 23.7% and a 29.4% gain in Harrell's C-index across 8 benchmark tasks. Importantly, these gains are achieved without sacrificing diagnostic classification performance. This study lays the foundation for integrating longitudinal EHR and 3D imaging data to advance clinical risk prediction., Comment: 34 pages, 19 figures
- Published
- 2024
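Entry 8 reports gains in Harrell's C-index. For reference, a naive O(n^2) sketch of that concordance computation is given below; the function name and the tie handling are our own simplification, not the authors' implementation.

```python
def harrell_c_index(times, events, risks):
    """Naive Harrell's concordance index.

    times:  observed follow-up times
    events: 1 if the event occurred, 0 if censored
    risks:  predicted risk scores (higher = event expected sooner)
    A pair (i, j) is comparable if the subject with the shorter time had
    an event; it is concordant if that subject also has the higher risk.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float('nan')

print(harrell_c_index([2, 5, 7], [1, 1, 0], [0.9, 0.4, 0.1]))  # -> 1.0
```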
9. OneNet: A Channel-Wise 1D Convolutional U-Net
- Author
- Byun, Sanghyun, Shah, Kayvan, Gang, Ayushi, Apton, Christopher, Song, Jacob, and Chung, Woo Seong
- Subjects
- Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
- Abstract
Many state-of-the-art computer vision architectures leverage U-Net for its adaptability and efficient feature extraction. However, the multi-resolution convolutional design often leads to significant computational demands, limiting deployment on edge devices. We present a streamlined alternative: a 1D convolutional encoder that retains accuracy while enhancing its suitability for edge applications. Our novel encoder architecture achieves semantic segmentation through channel-wise 1D convolutions combined with pixel-unshuffle operations. By incorporating PixelShuffle, known for improving accuracy in super-resolution tasks while reducing computational load, OneNet captures spatial relationships without requiring 2D convolutions, reducing parameters by up to 47%. Additionally, we explore a fully 1D encoder-decoder that achieves a 71% reduction in size, albeit with some accuracy loss. We benchmark our approach against U-Net variants across diverse mask-generation tasks, demonstrating that it preserves accuracy effectively. Although focused on image segmentation, this architecture is adaptable to other convolutional applications. Code for the project is available at https://github.com/shbyun080/OneNet .
- Published
- 2024
10. Towards Secure Intelligent O-RAN Architecture: Vulnerabilities, Threats and Promising Technical Solutions using LLMs
- Author
- Motalleb, Mojdeh Karbalaee, Benzaid, Chafika, Taleb, Tarik, Katz, Marcos, Shah-Mansouri, Vahid, and Song, JaeSeung
- Subjects
- Computer Science - Cryptography and Security, Computer Science - Machine Learning
- Abstract
The evolution of wireless communication systems will be fundamentally impacted by an open radio access network (O-RAN), a new concept defining an intelligent architecture with enhanced flexibility, openness, and the ability to slice services more efficiently. For all its promises, and like any technological advancement, O-RAN is not without risks that need to be carefully assessed and properly addressed to accelerate its wide adoption in future mobile networks. In this paper, we present an in-depth security analysis of the O-RAN architecture, discussing the potential threats that may arise in the different O-RAN architecture layers and their impact on the Confidentiality, Integrity, and Availability (CIA) triad. We also promote the potential of zero trust, Moving Target Defense (MTD), blockchain, and large language models(LLM) technologies in fortifying O-RAN's security posture. Furthermore, we numerically demonstrate the effectiveness of MTD in empowering robust deep reinforcement learning methods for dynamic network slice admission control in the O-RAN architecture. Moreover, we examine the effect of explainable AI (XAI) based on LLMs in securing the system., Comment: 10 pages
- Published
- 2024
11. The Fastest Path to Discovering the Second Electromagnetic Counterpart to a Gravitational Wave Event
- Author
- Shah, Ved G., Foley, Ryan J., and Narayan, Gautham
- Subjects
- Astrophysics - High Energy Astrophysical Phenomena
- Abstract
The discovery of a second electromagnetic counterpart to a gravitational wave event represents a critical goal in the field of multi-messenger astronomy. In order to determine the optimal strategy for achieving this goal, we perform comprehensive simulations comparing two potential paths forward: continuing the current LIGO-Virgo-KAGRA (LVK) observing run, O4, versus temporarily shutting down the detectors for upgrades before beginning the next observing run, O5. Our simulations incorporate current O4 instrument sensitivities and duty cycles, as well as projected configurations for O5, while accounting for variables such as binary neutron star merger rates, system properties, viewing angles, dust extinction, and kilonova (KN) observables. Our results indicate that a KN discovery would occur $125^{+253}_{-125}$~days (middle 50\% interval) sooner in O5 compared to O4, suggesting that extending O4 would lead to faster discovery if the shutdown period between runs is $>$4~months. Moreover, for 88\% of our simulations, continuing O4 results in earlier KN discovery when compared to the expected two-year shutdown between O4 and O5. Given these findings and the critical importance of avoiding a $>$10 year gap between first and second electromagnetic counterpart discoveries, we suggest LVK consider extending O4 operations for as long as feasible prior to shutting down for critical upgrades., Comment: 10 pages, 10 figures, 3 tables. Submitted to PASP
- Published
- 2024
12. A Social Outcomes and Priorities centered (SOP) Framework for AI policy
- Author
- Shah, Mohak
- Subjects
- Computer Science - Computers and Society, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, K.4, K.5
- Abstract
Rapid developments in AI and its adoption across various domains have necessitated a need to build robust guardrails and risk containment plans while ensuring equitable benefits for the betterment of society. The current technology-centered approach has resulted in a fragmented, reactive, and ineffective policy apparatus. This paper highlights the immediate and urgent need to pivot to a society-centered approach to develop comprehensive, coherent, forward-looking AI policy. To this end, we present a Social Outcomes and Priorities centered (SOP) framework for AI policy along with proposals on implementation of its various components. While the SOP framework is presented from a US-centric view, the takeaways are general and applicable globally.
- Published
- 2024
13. Incentive Design with Spillovers
- Author
- Dasaratha, Krishna, Golub, Benjamin, and Shah, Anant
- Subjects
- Economics - Theoretical Economics, Computer Science - Computer Science and Game Theory
- Abstract
A principal uses payments conditioned on stochastic outcomes of a team project to elicit costly effort from the team members. We develop a multi-agent generalization of a classic first-order approach to contract optimization by leveraging methods from network games. The main results characterize the optimal allocation of incentive pay across agents and outcomes. Incentive optimality requires equalizing, across agents, a product of (i) individual productivity, (ii) organizational centrality, and (iii) responsiveness to monetary incentives.
- Published
- 2024
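The optimality condition summarized in entry 13 can be written compactly. In the notation below, chosen here for illustration rather than taken from the paper, $\pi_i$ is agent $i$'s productivity, $c_i$ an organizational-centrality measure, and $r_i$ the responsiveness to monetary incentives:

```latex
% Hedged paraphrase of the equalization condition in entry 13;
% the symbols are chosen here for illustration only.
\[
  \pi_i \, c_i \, r_i \;=\; \pi_j \, c_j \, r_j
  \qquad \text{for all agents } i, j .
\]
```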
14. On the Convergence of Continual Federated Learning Using Incrementally Aggregated Gradients
- Author
- Keshri, Satish Kumar, Shah, Nazreen, and Prasad, Ranjitha
- Subjects
- Computer Science - Machine Learning, Computer Science - Distributed, Parallel, and Cluster Computing
- Abstract
The holy grail of machine learning is to enable Continual Federated Learning (CFL) to enhance the efficiency, privacy, and scalability of AI systems while learning from streaming data. The primary challenge of a CFL system is to overcome global catastrophic forgetting, wherein the accuracy of the global model trained on new tasks declines on the old tasks. In this work, we propose Continual Federated Learning with Aggregated Gradients (C-FLAG), a novel replay-memory based federated strategy consisting of edge-based gradient updates on memory and aggregated gradients on the current data. We provide convergence analysis of the C-FLAG approach which addresses forgetting and bias while converging at a rate of $O(1/\sqrt{T})$ over $T$ communication rounds. We formulate an optimization sub-problem that minimizes catastrophic forgetting, translating CFL into an iterative algorithm with adaptive learning rates that ensure seamless learning across tasks. We empirically show that C-FLAG outperforms several state-of-the-art baselines on both task and class-incremental settings with respect to metrics such as accuracy and forgetting.
- Published
- 2024
15. Double-Signed Fragmented DNSSEC for Countering Quantum Threat
- Author
- Shah, Syed W., Pan, Lei, Nguyen, Din Duc Nha, Doss, Robin, Armstrong, Warren, and Gauravaram, Praveen
- Subjects
- Computer Science - Cryptography and Security
- Abstract
DNSSEC, a DNS security extension, is essential to accurately translating domain names to IP addresses. Digital signatures provide the foundation for this reliable translation, however, the evolution of 'Quantum Computers' has made traditional digital signatures vulnerable. In light of this, NIST has recently selected potential post-quantum digital signatures that can operate on conventional computers and resist attacks made with Quantum Computers. Since these post-quantum digital signatures are still in their early stages of development, replacing pre-quantum digital signature schemes in DNSSEC with post-quantum candidates is risky until the post-quantum candidates have undergone a thorough security analysis. Given this, herein, we investigate the viability of employing 'Double-Signatures' in DNSSEC, combining a post-quantum digital signature and a classic one. The rationale is that double-signatures will offer protection against quantum threats on conventional signature schemes as well as unknown non-quantum attacks on post-quantum signature schemes, hence even if one fails the other provides security guarantees. However, the inclusion of two signatures in the DNSSEC response message doesn't bode well with the maximum allowed size of DNSSEC responses (i.e., 1232B, a limitation enforced by MTU of physical links). To counter this issue, we leverage a way to do application-layer fragmentation of DNSSEC responses with two signatures. We implement our solution on top of OQS-BIND and through experiments show that the addition of two signatures in DNSSEC and application-layer fragmentation of all relevant resource records and their reassembly does not have any substantial impact on the efficiency of the resolution process and thus is suitable for the interim period at least until the quantum computers are fully realized.
- Published
- 2024
16. DLCR: A Generative Data Expansion Framework via Diffusion for Clothes-Changing Person Re-ID
- Author
- Siddiqui, Nyle, Croitoru, Florinel Alin, Nayak, Gaurav Kumar, Ionescu, Radu Tudor, and Shah, Mubarak
- Subjects
- Computer Science - Computer Vision and Pattern Recognition
- Abstract
With the recently exhibited strength of generative diffusion models, an open research question is whether images generated by these models can be used to learn better visual representations. While this generative data expansion may suffice for easier visual tasks, we explore its efficacy on a more difficult discriminative task: clothes-changing person re-identification (CC-ReID). CC-ReID aims to match people appearing in non-overlapping cameras, even when they change their clothes across cameras. Not only are current CC-ReID models constrained by the limited diversity of clothing in current CC-ReID datasets, but generating additional data that retains important personal features for accurate identification is a current challenge. To address this issue, we propose DLCR, a novel data expansion framework that leverages pre-trained diffusion and large language models (LLMs) to accurately generate diverse images of individuals in varied attire. We generate additional data for five benchmark CC-ReID datasets (PRCC, CCVID, LaST, VC-Clothes, and LTCC) and increase their clothing diversity by 10x, totaling over 2.1M images generated. DLCR employs diffusion-based text-guided inpainting, conditioned on clothing prompts constructed using LLMs, to generate synthetic data that only modifies a subject's clothes while preserving their personally identifiable features. With this massive increase in data, we introduce two novel strategies - progressive learning and test-time prediction refinement - that respectively reduce training time and further boost CC-ReID performance. On the PRCC dataset, we obtain a large top-1 accuracy improvement of 11.3% by training CAL, a previous state-of-the-art (SOTA) method, with DLCR-generated data. We publicly release our code and generated data for each dataset here: https://github.com/CroitoruAlin/dlcr., Comment: Published in WACV 2025
- Published
- 2024
17. ZT-RIC:A Zero Trust RIC Framework for ensuring data Privacy and Confidentiality in Open RAN
- Author
- Lin, Diana, Bhargav, Samarth, Chiejina, Azuka, Ibrahem, Mohamed I., and Shah, Vijay K.
- Subjects
- Computer Science - Cryptography and Security
- Abstract
The advancement of 5G and NextG networks through Open Radio Access Network (O-RAN) architecture enables a shift toward virtualized, modular, and disaggregated configurations. A core component of O-RAN is the RAN Intelligent Controller (RIC), which manages RAN using machine learning-driven xApps that access sensitive data from RAN and User Equipment (UE), stored in the near Real-Time RIC (Near-RT RIC) database. This shared, open environment increases the risk of unauthorized data exposure. To address these concerns, this paper proposes a zero-trust RIC (ZT-RIC) framework that preserves data privacy across the RIC platform, including the RIC database, xApps, and E2 interface. ZT-RIC employs Inner Product Functional Encryption (IPFE) to encrypt RAN/UE data at the base station, preventing leaks through the E2 interface and shared database. Additionally, ZT-RIC enables xApps to perform inference on encrypted data without exposing sensitive information. For evaluation, a state-of-the-art InterClass xApp, which detects jamming signals using RAN key performance metrics (KPMs), is implemented. Testing on an LTE/5G O-RAN testbed shows that ZT-RIC preserves data confidentiality while achieving 97.9% accuracy in jamming detection and meeting sub-second latency requirements, with a round-trip time (RTT) of 0.527 seconds., Comment: This paper has been accepted to CCNC 2025
- Published
- 2024
18. UAV survey coverage path planning of complex regions containing exclusion zones
- Author
- Shahid, Shadman Tajwar, Siddique, Shah Md. Ahasan, and Alam, Md. Mahidul
- Subjects
- Computer Science - Robotics, Computer Science - Computational Geometry
- Abstract
This article addresses the challenge of UAV survey coverage path planning for areas that are complex concave polygons, containing exclusion zones or obstacles. While standard drone path planners typically generate coverage paths for simple convex polygons, this study proposes a method to manage more intricate regions, including boundary splits, merges, and interior holes. To achieve this, polygonal decomposition techniques are used to partition the target area into convex sub-regions. The sub-polygons are then merged using a depth-first search algorithm, followed by the generation of continuous Boustrophedon paths based on connected components. Polygonal offset by the straight skeleton method was used to ensure a constant safe distance from the exclusion zones. This approach allows UAV path planning in environments with complex geometric constraints.
- Published
- 2024
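Entry 18 generates continuous Boustrophedon (back-and-forth) paths over convex sub-regions. A minimal sketch of such a sweep over a single axis-aligned rectangle is shown below; it is our own simplification and omits the paper's polygon decomposition, merging, and straight-skeleton offsets.

```python
def boustrophedon_waypoints(x_min, x_max, y_min, y_max, spacing):
    """Back-and-forth sweep over a rectangle with a fixed lane spacing.
    Returns a list of (x, y) waypoints; alternate lanes reverse direction
    so the resulting path stays continuous."""
    waypoints = []
    y = y_min
    left_to_right = True
    while y <= y_max:
        if left_to_right:
            waypoints += [(x_min, y), (x_max, y)]
        else:
            waypoints += [(x_max, y), (x_min, y)]
        left_to_right = not left_to_right
        y += spacing
    return waypoints

print(boustrophedon_waypoints(0, 10, 0, 2, 1))
# [(0, 0), (10, 0), (10, 1), (0, 1), (0, 2), (10, 2)]
```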
19. Automatic Contact-Based 3D Scanning Using Articulated Robotic Arm
- Author
- Shahid, Shadman Tajwar, Siddique, Shah Md. Ahasan, and Bhuiyan, Md. Humayun Kabir
- Subjects
- Computer Science - Robotics, Physics - Instrumentation and Detectors
- Abstract
This paper presents an open-loop articulated 6-degree-of-freedom (DoF) robotic system for three-dimensional (3D) scanning of objects by contact-based method. A digitizer probe was used to detect contact with the object. Inverse kinematics (IK) was used to determine the joint angles of the robot corresponding to the probe position and orientation, and straight-line trajectory planning was implemented for motion. The system can take single-point measurements and 3D scans of freeform surfaces. Specifying the scanning area's size, position, and density, the system automatically scans the designated volume. The system produces 3D scans in Standard Triangle Language (STL) format, ensuring compatibility with commonly used 3D software. Tests based on ASME B89.4.22 standards were conducted to quantify accuracy and repeatability. The point cloud from the scans was compared to the original 3D model of the object.
- Published
- 2024
20. Solution of Einstein Field Equations for Anisotropic Matter with Vanishing Complexity: Spacetime Metric Satisfying Karmarkar Condition and Conformally Flat Geometry
- Author
- Ratanpal, B. S., Suthar, Bhavesh, and Shah, Vishant
- Subjects
- General Relativity and Quantum Cosmology
- Abstract
The solution of Einstein field equations for static spherically symmetric spacetime metric with anisotropic internal stresses has been obtained. The matter has vanishing complexity and a spacetime metric that satisfies the Karmarkar condition and is conformally flat. It has been noted that there is only one solution that meets these three conditions. This has been shown as a proof of the theorem.
- Published
- 2024
21. xNVMe: Unleashing Storage Hardware-Software Co-design
- Author
- Lund, Simon A. F. and Shah, Vivek
- Subjects
- Computer Science - Operating Systems, Computer Science - Databases, Computer Science - Distributed, Parallel, and Cluster Computing
- Abstract
NVMe SSD hardware has witnessed widespread deployment as commodity and enterprise hardware due to its high performance and rich feature set. Despite the open specifications of various NVMe protocols by the NVM Express group, there is no unified set of software abstractions to program the underlying hardware. The myriad storage I/O paths such as the POSIX storage API, ad-hoc OS mechanisms, and userspace I/O libraries have different syntax and semantics that complicate software development and stand in the way of mass adoption and evolution of the NVMe ecosystem. To unify the diverse I/O storage paths, we built xNVMe, which exposes a single message-passing API to support both asynchronous and synchronous communication with NVMe devices. xNVMe provides various command sets to support diverse storage I/O paths in different operating systems (e.g., Linux, FreeBSD, Windows, and MacOS) and userspace libraries (e.g., SPDK) with minimal overhead. xNVMe is an Open Source project and has gained traction amongst various industry stakeholders. In this paper, we elaborate on the lessons that we have learned in the project during its evolution. We also provide some ongoing and future work planned for the project. We hope the database and storage systems community can join in the effort to both extend xNVMe and leverage it as a building block for innovative co-design of storage systems on modern NVMe hardware.
- Published
- 2024
22. A Hybrid Approach for COVID-19 Detection: Combining Wasserstein GAN with Transfer Learning
- Author
- Rounaq, Sumera, Shah, Shahid Munir, Aljawarneh, Mahmoud, Khan, Sarah, and Muhammad, Ghulam
- Subjects
- Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
- Abstract
COVID-19 is extremely contagious, and its rapid growth has drawn attention towards early diagnosis. Early diagnosis of COVID-19 enables healthcare professionals and government authorities to break the chain of transmission and flatten the epidemic curve. With the number of cases accelerating across the developed world, COVID-19-induced Viral Pneumonia is a major challenge. The overlap of COVID-19 cases with Viral Pneumonia and other lung infections, combined with limited datasets and long training hours, is a serious problem to address. A limited amount of data often results in over-fitted models that do not generalize. To fill this gap, we propose a GAN-based approach to synthesize images that are later fed into deep learning models to classify images as COVID-19, Normal, or Viral Pneumonia. Specifically, a customized Wasserstein GAN is proposed to generate 19% more chest X-ray images compared to the real images. This expanded dataset is then used to train four proposed deep learning models: VGG-16, ResNet-50, GoogLeNet and MNAST. The results showed that the expanded dataset enabled the deep learning models to deliver high classification accuracies. In particular, VGG-16 achieved the highest accuracy of 99.17% among all four proposed schemes, while ResNet-50, GoogLeNet and MNAST delivered 93.9%, 94.49% and 97.75% testing accuracies, respectively. The efficiency of these models is then compared with state-of-the-art models on the basis of accuracy. Further, our proposed models can be applied to address the issue of scant datasets for any image analysis problem.
- Published
- 2024
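Entry 22 fine-tunes standard backbones such as VGG-16 on the GAN-expanded chest X-ray set. A minimal transfer-learning skeleton in Keras is sketched below; the input size, head layers, and optimizer settings are illustrative assumptions, not the authors' configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Frozen ImageNet VGG-16 backbone with a small classification head for
# three classes (COVID-19, Normal, Viral Pneumonia) -- illustrative only.
base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))
base.trainable = False  # train only the new head on the expanded dataset

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(...) would then be run on the WGAN-expanded training images.
```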
23. CityGuessr: City-Level Video Geo-Localization on a Global Scale
- Author
- Kulkarni, Parth Parag, Nayak, Gaurav Kumar, and Shah, Mubarak
- Subjects
- Computer Science - Computer Vision and Pattern Recognition
- Abstract
Video geolocalization is a crucial problem in current times. Given just a video, ascertaining where it was captured from can have a plethora of advantages. The problem of worldwide geolocalization has been tackled before, but only using the image modality. Its video counterpart remains relatively unexplored. Meanwhile, video geolocalization has also garnered some attention in the recent past, but the existing methods are all restricted to specific regions. This motivates us to explore the problem of video geolocalization at a global scale. Hence, we propose a novel problem of worldwide video geolocalization with the objective of hierarchically predicting the correct city, state/province, country, and continent, given a video. However, no large-scale video datasets with extensive worldwide coverage exist to train models for solving this problem. To this end, we introduce a new dataset, CityGuessr68k, comprising 68,269 videos from 166 cities all over the world. We also propose a novel baseline approach to this problem, by designing a transformer-based architecture comprising an elegant Self-Cross Attention module for incorporating scenes as well as a TextLabel Alignment strategy for distilling knowledge from textlabels in feature space. To further enhance our location prediction, we also utilize soft-scene labels. Finally, we demonstrate the performance of our method on our new dataset as well as Mapillary (MSLS). Our code and datasets are available at: https://github.com/ParthPK/CityGuessr, Comment: Accepted to the ECVA European Conference on Computer Vision (ECCV) 2024
- Published
- 2024
24. Multi-Document Financial Question Answering using LLMs
- Author
- Shah, Shalin, Ryali, Srikanth, and Venkatesh, Ramasubbu
- Subjects
- Computer Science - Information Retrieval, Computer Science - Computation and Language
- Abstract
We propose two new methods for multi-document financial question answering. The first, RAG_SEM, uses semantic tagging and then queries the index to retrieve the context. The second, KG_RAG, is a knowledge-graph-based method that uses semantic tagging and retrieves knowledge graph triples from a graph database as context. KG_RAG uses knowledge graphs constructed with a small model that is fine-tuned via knowledge distillation from a large teacher model. The data consists of 18 10-K reports of Apple, Microsoft, Alphabet, NVIDIA, Amazon and Tesla for the years 2021, 2022 and 2023. The questions in the data comprise 111 complex questions, including many esoteric questions whose answers are difficult and not completely obvious. As evaluation metrics, we use overall scores as well as segmented scores, including faithfulness, relevance, correctness, similarity, an LLM-based overall score, ROUGE scores, and embedding similarity. We find that both methods outperform plain RAG significantly, and KG_RAG outperforms RAG_SEM in four out of nine metrics.
- Published
- 2024
25. Energy Efficient Protein Language Models: Leveraging Small Language Models with LoRA for Controllable Protein Generation
- Author
- Shah, Aayush and Jayaratnam, Shankar
- Subjects
- Quantitative Biology - Biomolecules, Computer Science - Machine Learning
- Abstract
Large language models (LLMs) have demonstrated significant success in natural language processing (NLP) tasks and have shown promising results in other domains such as protein sequence generation. However, there remain salient differences between LLMs used for NLP, which effectively handle multiple tasks and are available in small sizes, and protein language models that are often specialized for specific tasks and only exist in larger sizes. In this work, we introduce two small protein language models, based on Llama-3-8B and Phi-3-mini, that are capable of both uncontrollable and controllable protein generation. For the uncontrollable generation task, our best model achieves an average pLDDT score of 69.75, demonstrating robust performance in generating viable protein structures. For the controllable generation task, in which the model generates proteins according to properties specified in the prompt, we achieve a remarkable average TM-Score of 0.84, indicating high structural similarity to target proteins. We chose 10 properties, including six classes of enzymes, to extend the capabilities of prior protein language models. Our approach utilizes the Low-Rank Adaptor (LoRA) technique, reducing trainable parameters to just 4% of the original model size, lowering computational requirements. By using a subset of the UniRef50 dataset and small models, we reduced the overall training time by 70% without compromising performance. Notably, Phi-3-mini reduced trainable parameters by 60%, decreasing training cost by 30% compared to Llama 3. Consequently, Phi-3 achieved a comparable TM-Score of 0.81, demonstrating that smaller models can match the performance of larger ones, like Llama 3. We also demonstrate the deployment of our models on the energy efficient ET-SoC-1 chip, significantly improving the TPS/W by a factor of 3.
- Published
- 2024
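Entry 25 relies on Low-Rank Adaptation (LoRA) to shrink the trainable parameter count to a few percent of the base model. A generic sketch using the Hugging Face peft library follows; the checkpoint name, rank, and target modules are placeholders chosen for illustration, not the authors' settings.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Wrap a base language model with low-rank adapters; only the adapter
# weights (a small fraction of all parameters) are trained.
base_model = AutoModelForCausalLM.from_pretrained("your-base-checkpoint")  # placeholder

lora_config = LoraConfig(
    r=8,                                   # adapter rank (illustrative)
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # assumed attention projection names
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # reports the reduced trainable share
```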
26. Bridging Nodes and Narrative Flows: Identifying Intervention Targets for Disinformation on Telegram
- Author
- Shah, Devang, Ranka, Hriday, NG, Lynnette Hui Xian, and Mehta, Swapneel
- Subjects
- Computer Science - Computers and Society, Computer Science - Social and Information Networks
- Abstract
In recent years, mass-broadcast messaging platforms like Telegram have gained prominence both for serving as a harbor for private communication and for enabling large-scale disinformation campaigns. The encrypted and networked nature of these platforms makes it challenging to identify intervention targets, since most channels that promote misleading information are not originators of the message. In this work, we examine the structural mechanisms that facilitate the propagation of debunked misinformation on Telegram, focusing on the role of cross-community hubs: nodes that bridge otherwise isolated groups and amplify misinformation. We introduce a multi-dimensional 'bridging' metric to quantify the influence of nodal Telegram channels, exploring their role in reshaping network topology during key geopolitical events. By analyzing over 1,740 Telegram channels and applying network analysis, we uncover a small subset of such nodes and identify patterns that are emblematic of information 'flows' on this platform. Our findings provide insights into the structural vulnerabilities of distributed platforms, offering practical suggestions for interventions to mitigate networked disinformation flows., Comment: *Both Authors contributed equally to this work. 22 pages, 11 figures, 3 tables
- Published
- 2024
27. Agricultural Landscape Understanding At Country-Scale
- Author
- Dua, Radhika, Saxena, Nikita, Agarwal, Aditi, Wilson, Alex, Singh, Gaurav, Tran, Hoang, Deshpande, Ishan, Kaur, Amandeep, Aggarwal, Gaurav, Nath, Chandan, Basu, Arnab, Batchu, Vishal, Holla, Sharath, Kurle, Bindiya, Missura, Olana, Aggarwal, Rahul, Garg, Shubhika, Shah, Nishi, Singh, Avneet, Tewari, Dinesh, Dondzik, Agata, Adsul, Bharat, Sohoni, Milind, Praveen, Asim Rama, Dangi, Aaryan, Kadivar, Lisan, Abhishek, E, Sudhansu, Niranjan, Hattekar, Kamlakar, Datar, Sameer, Chaithanya, Musty Krishna, Reddy, Anumas Ranjith, Kumar, Aashish, Tirumala, Betala Laxmi, and Talekar, Alok
- Subjects
- Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Computers and Society
- Abstract
Agricultural landscapes are quite complex, especially in the Global South where fields are smaller, and agricultural practices are more varied. In this paper we report on our progress in digitizing the agricultural landscape (natural and man-made) in our study region of India. We use high resolution imagery and a UNet style segmentation model to generate the first of its kind national-scale multi-class panoptic segmentation output. Through this work we have been able to identify individual fields across 151.7M hectares, and delineating key features such as water resources and vegetation. We share how this output was validated by our team and externally by downstream users, including some sample use cases that can lead to targeted data driven decision making. We believe this dataset will contribute towards digitizing agriculture by generating the foundational baselayer., Comment: 34 pages, 7 tables, 15 figs
- Published
- 2024
28. Exploring the Feasibility of Affordable Sonar Technology: Object Detection in Underwater Environments Using the Ping 360
- Author
- Hasan, Md Junayed, Kannan, Somasundar, Rohan, Ali, and Shah, Mohd Asif
- Subjects
- Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Emerging Technologies
- Abstract
This study explores the potential of the Ping 360 sonar device, primarily used for navigation, in detecting complex underwater obstacles. The key motivation behind this research is the device's affordability and open-source nature, offering a cost-effective alternative to more expensive imaging sonar systems. The investigation focuses on understanding the behaviour of the Ping 360 in controlled environments and assessing its suitability for object detection, particularly in scenarios where human operators are unavailable for inspecting offshore structures in shallow waters. Through a series of carefully designed experiments, we examined the effects of surface reflections and object shadows in shallow underwater environments. Additionally, we developed a manually annotated sonar image dataset to train a U-Net segmentation model. Our findings indicate that while the Ping 360 sonar demonstrates potential in simpler settings, its performance is limited in more cluttered or reflective environments unless extensive data pre-processing and annotation are applied. To our knowledge, this is the first study to evaluate the Ping 360's capabilities for complex object detection. By investigating the feasibility of low-cost sonar devices, this research provides valuable insights into their limitations and potential for future AI-based interpretation, marking a unique contribution to the field., Comment: This work is currently under review. This is a pre-print
- Published
- 2024
29. Alopex: A Computational Framework for Enabling On-Device Function Calls with LLMs
- Author
- Ran, Yide, Xu, Zhaozhuo, Yao, Yuhang, Hu, Zijian, Han, Shanshan, Jin, Han, Shah, Alay Dilipbhai, Zhang, Jipeng, Stripelis, Dimitris, Zhang, Tong, Avestimehr, Salman, and He, Chaoyang
- Subjects
- Computer Science - Artificial Intelligence, Computer Science - Computation and Language
- Abstract
The rapid advancement of Large Language Models (LLMs) has led to their increased integration into mobile devices for personalized assistance, which enables LLMs to call external API functions to enhance their performance. However, challenges such as data scarcity, ineffective question formatting, and catastrophic forgetting hinder the development of on-device LLM agents. To tackle these issues, we propose Alopex, a framework that enables precise on-device function calls using the Fox LLM. Alopex introduces a logic-based method for generating high-quality training data and a novel ``description-question-output'' format for fine-tuning, reducing risks of function information leakage. Additionally, a data mixing strategy is used to mitigate catastrophic forgetting, combining function call data with textbook datasets to enhance performance in various tasks. Experimental results show that Alopex improves function call accuracy and significantly reduces catastrophic forgetting, providing a robust solution for integrating function call capabilities into LLMs without manual intervention.
- Published
- 2024
30. Vision Language Models are In-Context Value Learners
- Author
- Ma, Yecheng Jason, Hejna, Joey, Wahid, Ayzaan, Fu, Chuyuan, Shah, Dhruv, Liang, Jacky, Xu, Zhuo, Kirmani, Sean, Xu, Peng, Driess, Danny, Xiao, Ted, Tompson, Jonathan, Bastani, Osbert, Jayaraman, Dinesh, Yu, Wenhao, Zhang, Tingnan, Sadigh, Dorsa, and Xia, Fei
- Subjects
- Computer Science - Robotics, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
- Abstract
Predicting temporal progress from visual trajectories is important for intelligent robots that can learn, adapt, and improve. However, learning such a progress estimator, or temporal value function, across different tasks and domains requires both a large amount of diverse data and methods which can scale and generalize. To address these challenges, we present Generative Value Learning (GVL), a universal value function estimator that leverages the world knowledge embedded in vision-language models (VLMs) to predict task progress. Naively asking a VLM to predict values for a video sequence performs poorly due to the strong temporal correlation between successive frames. Instead, GVL poses value estimation as a temporal ordering problem over shuffled video frames; this seemingly more challenging task encourages VLMs to more fully exploit their underlying semantic and temporal grounding capabilities to differentiate frames based on their perceived task progress, consequently producing significantly better value predictions. Without any robot or task specific training, GVL can in-context zero-shot and few-shot predict effective values for more than 300 distinct real-world tasks across diverse robot platforms, including challenging bimanual manipulation tasks. Furthermore, we demonstrate that GVL permits flexible multi-modal in-context learning via examples from heterogeneous tasks and embodiments, such as human videos. The generality of GVL enables various downstream applications pertinent to visuomotor policy learning, including dataset filtering, success detection, and advantage-weighted regression -- all without any model training or finetuning., Comment: Project website and demo: https://generative-value-learning.github.io/
- Published
- 2024
31. Analysis of Droughts and Their Intensities in California from 2000 to 2020
- Author
- Ujjwal, Patel, Shikha C., Shah, Bansari K., Ogbonna, Nicholas, and Ashqar, Huthaifa I.
- Subjects
- Computer Science - Computers and Society, Statistics - Applications
- Abstract
Drought has been perceived as a persistent threat globally, and the complex mechanism of various factors contributing to its emergence makes it more troublesome to understand. Droughts and their severity trends have been a point of concern in the USA as well, since the economic impact of droughts has been substantial, especially in parts that contribute majorly to US agriculture. California is the biggest agricultural contributor to the United States, with its share amounting to approximately 12% of all US agricultural produce. At the same time, according to a 20-year average, California ranks fifth on the list of the highest average percentage of drought-hit regions. Therefore, drought analysis and drought prediction are of crucial importance for California in order to mitigate the associated risks. However, the design of a consistent drought prediction model based on the dynamic relationship of the drought index remains a challenging task. In the present study, we trained a Voting Ensemble classifier utilizing a soft voting system and three different Random Forest models to predict the presence of drought and also its intensity. In this paper, we first discuss the trends of droughts and their intensities in various California counties, review the correlation of meteorological indicators with drought intensities, and then use these meteorological indicators for drought prediction so as to evaluate their effectiveness and significance.
- Published
- 2024
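Entry 31 combines three Random Forest models in a soft-voting ensemble to predict drought presence and intensity. A minimal scikit-learn version of that setup is sketched below; the hyperparameters and estimator names are our own placeholders.

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier

# Three differently configured forests combined by soft (probability) voting,
# mirroring the ensemble design described in entry 31.
rf_small = RandomForestClassifier(n_estimators=100, max_depth=8, random_state=0)
rf_deep = RandomForestClassifier(n_estimators=300, max_depth=None, random_state=1)
rf_wide = RandomForestClassifier(n_estimators=500, max_features="sqrt", random_state=2)

drought_ensemble = VotingClassifier(
    estimators=[("rf_small", rf_small), ("rf_deep", rf_deep), ("rf_wide", rf_wide)],
    voting="soft",
)
# drought_ensemble.fit(X_train, y_train)   # meteorological indicators -> intensity class
# drought_ensemble.predict(X_new)
```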
32. Potential Use of IoT Distance Measurement Tool in Boule Sports
- Author
- Shah, Wahidah Md, Adnan, M Azim., Hassan, Aslinda, Harum, Norharyati, and Hamid, Isredza Rahmi A.
- Subjects
- Computer Science - Networking and Internet Architecture, Computer Science - Computers and Society
- Abstract
In Petanque, each player aims to throw the boule closer to the jack. The boule closest to the jack among the players scores the point. Currently, the distance of the boule to the jack is still measured using manual measurement tools such as measuring tape, string, and calipers. The manual measurement method, as conducted by ordinary referees and players, is considered time-consuming and prone to inconsistent readings. A steady hand is required to hold the tape at two ends while squatting or kneeling. The technique of reading the measurement is also important to determine the accuracy of the length. This project aims to design and develop a prototype device that can measure the distance between jack and boule using a microcontroller and ultrasonic sensor technology. The device is expected to provide an instant measurement of the distance between the jack and the boule. The measurement data can be displayed on a mobile device to make it easy for the user to view the result. This prototype device also counts the score points and determines the winner., Comment: 10 pages
- Published
- 2024
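Entry 32 replaces tape measurement with an ultrasonic sensor read by a microcontroller, and the core conversion is just the echo time multiplied by the speed of sound and halved for the round trip. A small sketch follows; the speed-of-sound constant and function name are illustrative assumptions.

```python
SPEED_OF_SOUND_CM_PER_S = 34300  # in air at roughly 20 degrees C (assumed)

def echo_to_distance_cm(echo_time_s):
    """Convert an ultrasonic round-trip echo time (seconds) into the
    one-way distance in centimetres between the sensor and the boule/jack."""
    return echo_time_s * SPEED_OF_SOUND_CM_PER_S / 2

# A 1.5 ms echo corresponds to roughly 25.7 cm.
print(round(echo_to_distance_cm(0.0015), 1))  # -> 25.7
```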
33. Usefulness of LLMs as an Author Checklist Assistant for Scientific Papers: NeurIPS'24 Experiment
- Author
- Goldberg, Alexander, Ullah, Ihsan, Khuong, Thanh Gia Hieu, Rachmat, Benedictus Kent, Xu, Zhen, Guyon, Isabelle, and Shah, Nihar B.
- Subjects
- Computer Science - Computation and Language, Computer Science - Digital Libraries, Computer Science - Human-Computer Interaction
- Abstract
Large language models (LLMs) represent a promising, but controversial, tool in aiding scientific peer review. This study evaluates the usefulness of LLMs in a conference setting as a tool for vetting paper submissions against submission standards. We conduct an experiment at the 2024 Neural Information Processing Systems (NeurIPS) conference, where 234 papers were voluntarily submitted to an "LLM-based Checklist Assistant." This assistant validates whether papers adhere to the author checklist used by NeurIPS, which includes questions to ensure compliance with research and manuscript preparation standards. Evaluation of the assistant by NeurIPS paper authors suggests that the LLM-based assistant was generally helpful in verifying checklist completion. In post-usage surveys, over 70% of authors found the assistant useful, and 70% indicate that they would revise their papers or checklist responses based on its feedback. While causal attribution to the assistant is not definitive, qualitative evidence suggests that the LLM contributed to improving some submissions. Survey responses and analysis of re-submissions indicate that authors made substantive revisions to their submissions in response to specific feedback from the LLM. The experiment also highlights common issues with LLMs: inaccuracy (20/52) and excessive strictness (14/52) were the most frequent issues flagged by authors. We also conduct experiments to understand potential gaming of the system, which reveal that the assistant could be manipulated to enhance scores through fabricated justifications, highlighting potential vulnerabilities of automated review tools.
- Published
- 2024
34. STEER: Flexible Robotic Manipulation via Dense Language Grounding
- Author
- Smith, Laura, Irpan, Alex, Arenas, Montserrat Gonzalez, Kirmani, Sean, Kalashnikov, Dmitry, Shah, Dhruv, and Xiao, Ted
- Subjects
- Computer Science - Robotics, Computer Science - Artificial Intelligence
- Abstract
The complexity of the real world demands robotic systems that can intelligently adapt to unseen situations. We present STEER, a robot learning framework that bridges high-level, commonsense reasoning with precise, flexible low-level control. Our approach translates complex situational awareness into actionable low-level behavior through training language-grounded policies with dense annotation. By structuring policy training around fundamental, modular manipulation skills expressed in natural language, STEER exposes an expressive interface for humans or Vision-Language Models (VLMs) to intelligently orchestrate the robot's behavior by reasoning about the task and context. Our experiments demonstrate the skills learned via STEER can be combined to synthesize novel behaviors to adapt to new situations or perform completely new tasks without additional data collection or training., Comment: Project website: https://lauramsmith.github.io/steer/
- Published
- 2024
35. Newtonized Orthogonal Matching Pursuit for High-Resolution Target Detection in Sparse OFDM ISAC Systems
- Author
- Shah, Syed Najaf Haider, Semper, Sebastian, Khan, Aamir Ullah, Schneider, Christian, and Robert, Joerg
- Subjects
- Electrical Engineering and Systems Science - Signal Processing
- Abstract
Integrated Sensing and Communication (ISAC) is a technology paradigm that combines sensing capabilities with communication functionalities in a single device or system. In vehicle-to-everything (V2X) sidelink, ISAC can provide enhanced safety by allowing vehicles to not only communicate with one another but also sense the surrounding environment by using sidelink signals. In ISAC-capable V2X sidelink, the random resource allocation results in an unstructured and sparse distribution of time and frequency resources in the received orthogonal frequency division multiplexing (OFDM) grid, leading to degraded radar detection performance when processed using the conventional 2D-FFT method. To address this challenge, this paper proposes a high-resolution off-grid radar target detection algorithm irrespective of the OFDM grid structure. The proposed method utilizes the Newtonized orthogonal matching pursuit (NOMP) algorithm to effectively detect weak targets masked by the sidelobes of stronger ones and accurately estimates off-grid range and velocity parameters with minimal resources through Newton refinements. Simulation results demonstrate the superior performance of the proposed NOMP-based target detection algorithm compared to existing compressed sensing (CS) methods in terms of detection probability, resolution, and accuracy. Additionally, experimental validation is performed using a bi-static radar setup in a semi-anechoic chamber. The measurement results validate the simulation findings, showing that the proposed algorithm significantly enhances target detection and parameter estimation accuracy in realistic scenarios.
- Published
- 2024
36. 'It's a conversation, not a quiz': A Risk Taxonomy and Reflection Tool for LLM Adoption in Public Health
- Author
- Zhou, Jiawei, Chen, Amy Z., Shah, Darshi, Reese, Laura Schwab, and De Choudhury, Munmun
- Subjects
- Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
- Abstract
Recent breakthroughs in large language models (LLMs) have generated both interest and concern about their potential adoption as accessible information sources or communication tools across different domains. In public health -- where stakes are high and impacts extend across populations -- adopting LLMs poses unique challenges that require thorough evaluation. However, structured approaches for assessing potential risks in public health remain under-explored. To address this gap, we conducted focus groups with health professionals and health issue experiencers to unpack their concerns, situated across three distinct and critical public health issues that demand high-quality information: vaccines, opioid use disorder, and intimate partner violence. We synthesize participants' perspectives into a risk taxonomy, distinguishing and contextualizing the potential harms LLMs may introduce when positioned alongside traditional health communication. This taxonomy highlights four dimensions of risk in individual behaviors, human-centered care, information ecosystem, and technology accountability. For each dimension, we discuss specific risks and example reflection questions to help practitioners adopt a risk-reflexive approach. This work offers a shared vocabulary and reflection tool for experts in both computing and public health to collaboratively anticipate, evaluate, and mitigate risks in deciding when to employ LLM capabilities (or not) and how to mitigate harm when they are used.
- Published
- 2024
37. Two-Sided Learning in Decentralized Matching Markets
- Author
- Shah, Vade, Ferguson, Bryce L., and Marden, Jason R.
- Subjects
- Computer Science - Computer Science and Game Theory
- Abstract
Two-sided matching markets, environments in which two disjoint groups of agents seek to partner with one another, arise in many practical applications. In settings where the agents can assess the quality of their possible partners a priori, well-known centralized algorithms can be used to find desirable matchings between the two groups. However, when they do not know their own preferences, such algorithms are no longer applicable and agents must instead learn their preferences through repeated interactions with one another. In this work, we design completely uncoupled and uncoordinated policies that use an agent's limited historical observations to guide their behavior towards desirable matchings when they do not know their preferences. In our first main contribution, we demonstrate that when every agent follows a simple policy which we call trial-and-error learning, they will converge to a stable matching, the standard equilibrium configuration in matching markets. Then, we evaluate the strategyproofness of this policy and ask whether one group of agents can improve their performance by following a different policy. We constructively answer this question in the affirmative, demonstrating that if one group follows simple trial-and-error learning while the second group follows a more advanced policy, then they will converge to the most preferable stable matching for the second group. To the best of the authors' knowledge, these are the first completely uncoupled and uncoordinated policies that demonstrate any notion of convergence to stability in decentralized markets with two-sided uncertainty.
- Published
- 2024
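The trial-and-error policy described in the abstract above lends itself to a compact simulation. The sketch below is only illustrative: the payoff matrices, the acceptance rule, and the aspiration-update heuristic are assumptions, not the authors' exact policy, but it shows what "uncoupled and uncoordinated" means in practice, since each agent acts only on its own realized payoffs.

import random

# Illustrative, uncoupled trial-and-error matching (a sketch under assumed
# payoffs and an assumed aspiration heuristic, not the authors' exact policy).
def trial_and_error_matching(n=4, rounds=5000, seed=0):
    rng = random.Random(seed)
    payoff_w = [[rng.random() for _ in range(n)] for _ in range(n)]  # workers' payoffs
    payoff_f = [[rng.random() for _ in range(n)] for _ in range(n)]  # firms' payoffs
    aspiration_w, aspiration_f = [0.0] * n, [0.0] * n
    match = {}  # worker -> firm

    for _ in range(rounds):
        unmatched = [w for w in range(n) if w not in match]
        rng.shuffle(unmatched)
        for w in unmatched:
            f = rng.randrange(n)                       # uncoordinated random trial
            rival = next((w2 for w2, f2 in match.items() if f2 == f), None)
            # Both sides accept only if the trial beats their aspiration level.
            if payoff_w[w][f] >= aspiration_w[w] and payoff_f[f][w] >= aspiration_f[f]:
                if rival is not None:
                    del match[rival]                   # firm abandons its old partner
                match[w] = f
                aspiration_w[w], aspiration_f[f] = payoff_w[w][f], payoff_f[f][w]
            else:
                aspiration_w[w] *= 0.99                # slowly lower aspirations
    return match

if __name__ == "__main__":
    print(trial_and_error_matching())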
38. On contact cosmetic surgery
- Author
-
Etnyre, John B. and Shah, Tanushree
- Subjects
Mathematics - Geometric Topology ,Mathematics - Symplectic Geometry ,57K33 - Abstract
We demonstrate that the contact cosmetic surgery conjecture holds true for all non-trivial Legendrian knots, with the possible exception of Lagrangian slice knots. We also discuss the contact cosmetic surgeries on Legendrian unknots and make the surprising observation that there are some Legendrian unknots that have a contact surgery with no cosmetic pair, while all other contact surgeries are contactomorphic to infinitely many other contact surgeries on the knot., Comment: 26 pages, 7 figures
- Published
- 2024
39. Tailoring Charge Donor-Acceptor Interaction in CsPbBr3 Perovskite Nanocrystals through Ligand Exchange
- Author
-
Shah, Syed Abdul Basit, Ghimire, Sushant, Lesyuk, Rostyslav, Diamanti, Maria Vittoria, Lughi, Vanni, and Klinke, Christian
- Subjects
Condensed Matter - Materials Science - Abstract
The surface ligands in colloidal metal halide perovskites influence not only their intrinsic optoelectronic properties but also their interaction with other materials and molecules. We explore donor-acceptor interactions of CsPbBr3 perovskite nanocrystals with TiO2 nanoparticles and nanotubes by replacing long-chain oleylamine ligands with short-chain butylamines. Through post-synthesis ligand exchange, we functionalize the nanocrystals with butylamine ligands while maintaining their intrinsic properties. In solution, butylamine-capped nanocrystals exhibit reduced photoluminescence intensity with increasing TiO2 concentration but without any change in photoluminescence lifetime. Intriguingly, the Stern-Volmer plot depicts different slopes at low and high TiO2 concentrations, suggesting mixed static and sphere-of-action quenching interactions. Oleylamine-capped nanocrystals in solution, on the other hand, show no interaction with TiO2, as indicated by consistent photoluminescence intensities and lifetimes before and after TiO2 addition. In films, both types exhibit decreased photoluminescence lifetime with TiO2, indicating enhanced donor-acceptor interactions, which are discussed in terms of trap state modification and electron transfer. TiO2 nanotubes enhance nonradiative recombination more in butylamine-capped CsPbBr3 perovskite nanocrystals, emphasizing the role of ligand chain length., Comment: 30 pages, 7 figures
- Published
- 2024
- Full Text
- View/download PDF
40. V-CAS: A Realtime Vehicle Anti Collision System Using Vision Transformer on Multi-Camera Streams
- Author
-
Ashraf, Muhammad Waqas, Hassan, Ali, and Shah, Imad Ali
- Subjects
Computer Science - Robotics ,Computer Science - Artificial Intelligence - Abstract
This paper introduces a real-time Vehicle Collision Avoidance System (V-CAS) designed to enhance vehicle safety through adaptive braking based on environmental perception. V-CAS leverages the advanced vision-based transformer model RT-DETR, DeepSORT tracking, speed estimation, brake light detection, and an adaptive braking mechanism. It computes a composite collision risk score based on vehicles' relative accelerations, distances, and detected braking actions, using brake light signals and trajectory data from multiple camera streams to improve scene perception. Implemented on the Jetson Orin Nano, V-CAS enables real-time collision risk assessment and proactive mitigation through adaptive braking. A comprehensive training process was conducted on various datasets for comparative analysis, followed by fine-tuning the selected object detection model using transfer learning. The system's effectiveness was rigorously evaluated on the Car Crash Dataset (CCD) from YouTube and through real-time experiments, achieving over 98% accuracy with an average proactive alert time of 1.13 seconds. Results indicate significant improvements in object detection and tracking, enhancing collision avoidance compared to traditional single-camera methods. This research demonstrates the potential of low-cost, multi-camera embedded vision transformer systems to advance automotive safety through enhanced environmental perception and proactive collision avoidance mechanisms., Comment: Accepted at ICMLA 2024
- Published
- 2024
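As a rough illustration of the composite risk score described in the V-CAS abstract above, the sketch below combines normalized proximity, closing (relative) acceleration, and a detected brake light into a single score. The weights, thresholds, and field names are placeholder assumptions, not the values used in V-CAS.

from dataclasses import dataclass

@dataclass
class Track:
    distance_m: float          # distance to the tracked vehicle
    rel_accel_mps2: float      # relative acceleration (positive = closing faster)
    brake_light_on: bool       # brake light detected on the lead vehicle

def risk_score(t: Track, max_dist=50.0, max_accel=5.0, w=(0.5, 0.3, 0.2)) -> float:
    """Return a risk score in [0, 1]; higher means braking is more urgent."""
    proximity = max(0.0, 1.0 - t.distance_m / max_dist)
    closing = min(max(t.rel_accel_mps2, 0.0) / max_accel, 1.0)
    braking = 1.0 if t.brake_light_on else 0.0
    return w[0] * proximity + w[1] * closing + w[2] * braking

def should_brake(tracks, threshold=0.6) -> bool:
    # Trigger adaptive braking if any tracked vehicle exceeds the risk threshold.
    return any(risk_score(t) >= threshold for t in tracks)

if __name__ == "__main__":
    tracks = [Track(12.0, 2.5, True), Track(40.0, -1.0, False)]
    print([round(risk_score(t), 2) for t in tracks], should_brake(tracks))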
41. M-CELS: Counterfactual Explanation for Multivariate Time Series Data Guided by Learned Saliency Maps
- Author
-
Li, Peiyu, Bahri, Omar, Boubrahimi, Soukaina Filali, and Hamdi, Shah Muhammad
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Over the past decade, multivariate time series classification has received great attention. Machine learning (ML) models for multivariate time series classification have made significant strides and achieved impressive success in a wide range of applications and tasks. The challenge of many state-of-the-art ML models is a lack of transparency and interpretability. In this work, we introduce M-CELS, a counterfactual explanation model designed to enhance interpretability in multidimensional time series classification tasks. Our experimental validation involves comparing M-CELS with leading state-of-the-art baselines, utilizing seven real-world time-series datasets from the UEA repository. The results demonstrate the superior performance of M-CELS in terms of validity, proximity, and sparsity, reinforcing its effectiveness in providing transparent insights into the decisions of machine learning models applied to multivariate time series data., Comment: Accepted at ICMLA 2024. arXiv admin note: text overlap with arXiv:2410.20539
- Published
- 2024
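A minimal sketch of a saliency-guided counterfactual search for multivariate time series follows. The greedy substitution toward a target-class sample and all names here are illustrative assumptions; M-CELS itself learns the perturbation, but the sketch conveys how a saliency map can keep edits sparse and localized.

import numpy as np

def counterfactual(x, target_sample, saliency, predict, target_class, max_steps=None):
    """Replace the most salient entries of x with values from a target-class
    sample until the classifier predicts the target class."""
    x_cf = x.copy()
    order = np.argsort(saliency, axis=None)[::-1]     # most salient entries first
    steps = max_steps or order.size
    for flat_idx in order[:steps]:
        idx = np.unravel_index(flat_idx, x.shape)
        x_cf[idx] = target_sample[idx]                # sparse, localized edit
        if predict(x_cf) == target_class:
            break
    return x_cf

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(3, 50))                      # 3 channels, 50 timesteps
    target = rng.normal(loc=2.0, size=(3, 50))        # sample from the target class
    sal = rng.random(size=(3, 50))                    # stand-in saliency map
    clf = lambda z: int(z.mean() > 1.0)               # stand-in classifier
    print(clf(counterfactual(x, target, sal, clf, target_class=1)))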
42. Performance Evaluation of Deep Learning Models for Water Quality Index Prediction: A Comparative Study of LSTM, TCN, ANN, and MLP
- Author
-
Ismail, Muhammad, Abbas, Farkhanda, Shah, Shahid Munir, Aljawarneh, Mahmoud, Dhomeja, Lachhman Das, Abbas, Fazila, Shoaib, Muhammad, Alrefaei, Abdulwahed Fahad, and Albeshr, Mohammed Fahad
- Subjects
Computer Science - Machine Learning - Abstract
This study addresses environmental monitoring and predictive modeling of the Water Quality Index (WQI) from measured water quality parameters, comparing the predictive performance of LSTM, TCN, ANN, and MLP models., Comment: 18 pages
- Published
- 2024
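For orientation, a minimal sketch of one of the compared model families (an LSTM regressor) on synthetic water-quality windows is given below; the window length, feature count, and training data are placeholders, not the study's configuration.

import numpy as np
import tensorflow as tf

WINDOW, FEATURES = 12, 6          # e.g. 12 past samples of 6 water-quality parameters
X = np.random.rand(500, WINDOW, FEATURES).astype("float32")
y = np.random.rand(500, 1).astype("float32")          # WQI target (synthetic here)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(WINDOW, FEATURES)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),                          # predicted WQI
])
model.compile(optimizer="adam", loss="mse", metrics=["mae"])
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
print(model.evaluate(X, y, verbose=0))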
43. Unfiltered Conversations: A Dataset of 2024 U.S. Presidential Election Discourse on Truth Social
- Author
-
Shah, Kashish, Gerard, Patrick, Luceri, Luca, and Ferrara, Emilio
- Subjects
Computer Science - Social and Information Networks - Abstract
Truth Social, launched as a social media platform with a focus on free speech, has become a prominent space for political discourse, attracting a user base with diverse, yet often conservative, viewpoints. As an emerging platform with minimal content moderation, Truth Social has facilitated discussions around contentious social and political issues but has also seen the spread of conspiratorial and hyper-partisan narratives. In this paper, we introduce and release a comprehensive dataset capturing activity on Truth Social related to the upcoming 2024 U.S. Presidential Election, including posts, replies, user interactions, content, and media. This dataset comprises 1.5 million posts published between February 2024 and October 2024, and encompasses key user engagement features and post metadata. Data collection began in June 2024, though it includes posts published earlier, with the oldest post dating back to February 2022. This offers researchers a unique resource to study communication patterns, the formation of online communities, and the dissemination of information on Truth Social in the run-up to the election. By providing an in-depth view of Truth Social's user dynamics and content distribution, this dataset aims to support further research on political discourse within an alt-tech social media platform. The dataset is publicly available at https://github.com/kashish-s/TruthSocial_2024ElectionInitiative, Comment: HUMANS Lab -- Working Paper No. 2024.8 -- The 2024 Election Integrity Initiative -- University of Southern California
- Published
- 2024
44. Probing Broadband Spectral Energy Distribution and Variability of Mrk 501 in the low flux state
- Author
-
Tantry, Javaid, Shah, Zahir, Misra, Ranjeev, Iqbal, Naseer, and Akbar, Sikandar
- Subjects
Astrophysics - High Energy Astrophysical Phenomena - Abstract
We conducted a multi-wavelength analysis of the blazar Mrk 501, utilizing observations from AstroSat (SXT, LAXPC), Swift-UVOT, and Fermi-LAT during the period August 15, 2016 to March 27, 2022. The resulting multi-wavelength light curve revealed relatively low activity of the source across the electromagnetic spectrum. Notably, log-parabola and broken power-law models provided a better fit to the joint X-ray spectra from the AstroSat-SXT/LAXPC instruments compared to the power-law model. During the low activity state, the source showed the characteristic harder-when-brighter trend at X-ray energies. To gain insights into the underlying physical processes responsible for the broadband emission, we performed a detailed broadband spectral analysis using the convolved one-zone leptonic model with different forms of particle distributions, such as log-parabola (LP), broken power-law (BPL), power-law with maximum energy ($\xi_{max}$), and energy-dependent acceleration (EDA) models. Our analysis revealed similar reduced-$\chi^2$ values for the four particle distributions. The LP and EDA models exhibited the lowest jet powers. The correlation analyses conducted for the LP and BPL models revealed a positive correlation between jet power and bulk Lorentz factor. Specifically, in the LP model, jet power proved independent of $\gamma_{min}$, whereas in the broken power-law model, jet power decreased with an increase in $\gamma_{min}$. The jet power in the LP/EDA particle distribution is nearly 10 percent of the Eddington luminosity of a $10^7$ M$_\odot$ black hole. This result suggests that the jet could potentially be fueled by accretion processes., Comment: Accepted for publication in Journal of High Energy Astrophysics (JHEAP)
- Published
- 2024
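The spectral shapes referenced in the abstract above follow the standard X-ray/gamma-ray parameterizations, given here in their common textbook forms, which may differ in detail from the authors' implementation:

\[
\frac{dN}{dE} = K \left(\frac{E}{E_0}\right)^{-\left[\alpha + \beta \log\left(E/E_0\right)\right]} \qquad \text{(log-parabola)},
\]
\[
\frac{dN}{dE} = K \times
\begin{cases}
\left(E/E_{b}\right)^{-\Gamma_1}, & E < E_{b},\\
\left(E/E_{b}\right)^{-\Gamma_2}, & E \ge E_{b}
\end{cases}
\qquad \text{(broken power law)},
\]

where $E_0$ is a reference (pivot) energy, $E_b$ the break energy, $\alpha$ and $\Gamma_{1,2}$ spectral indices, and $\beta$ the curvature parameter.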
45. DARD: A Multi-Agent Approach for Task-Oriented Dialog Systems
- Author
-
Gupta, Aman, Ravichandran, Anirudh, Zhang, Ziji, Shah, Swair, Beniwal, Anurag, and Sadagopan, Narayanan
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Task-oriented dialogue systems are essential for applications ranging from customer service to personal assistants and are widely used across various industries. However, developing effective multi-domain systems remains a significant challenge due to the complexity of handling diverse user intents, entity types, and domain-specific knowledge across several domains. In this work, we propose DARD (Domain Assigned Response Delegation), a multi-agent conversational system capable of successfully handling multi-domain dialogs. DARD leverages domain-specific agents, orchestrated by a central dialog manager agent. Our extensive experiments compare and utilize various agent modeling approaches, combining the strengths of smaller fine-tuned models (Flan-T5-large & Mistral-7B) with their larger counterparts, Large Language Models (LLMs) (Claude Sonnet 3.0). We provide insights into the strengths and limitations of each approach, highlighting the benefits of our multi-agent framework in terms of flexibility and composability. We evaluate DARD using the well-established MultiWOZ benchmark, achieving state-of-the-art performance by improving the dialogue inform rate by 6.6% and the success rate by 4.1% over the best-performing existing approaches. Additionally, we discuss various annotator discrepancies and issues within the MultiWOZ dataset and its evaluation system.
- Published
- 2024
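The delegation pattern described in the DARD abstract above can be summarized in a few lines: a central dialog manager classifies the active domain of each user turn and routes it to a domain-specific agent. The domains, routing rule, and canned responses below are illustrative assumptions, not DARD's actual agents or prompts.

from typing import Callable, Dict

def hotel_agent(utterance: str) -> str:
    return "Searching hotels that match your request..."

def restaurant_agent(utterance: str) -> str:
    return "Here are some restaurants you might like..."

def fallback_agent(utterance: str) -> str:
    return "Could you tell me more about what you need?"

DOMAIN_AGENTS: Dict[str, Callable[[str], str]] = {
    "hotel": hotel_agent,
    "restaurant": restaurant_agent,
}

def detect_domain(utterance: str) -> str:
    # Stand-in for the dialog manager's domain classifier (a fine-tuned model
    # or an LLM in the paper's setting).
    for domain in DOMAIN_AGENTS:
        if domain in utterance.lower():
            return domain
    return "fallback"

def dialog_manager(utterance: str) -> str:
    agent = DOMAIN_AGENTS.get(detect_domain(utterance), fallback_agent)
    return agent(utterance)

if __name__ == "__main__":
    print(dialog_manager("I need a cheap hotel in the centre"))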
46. Hierarchical Preference Optimization: Learning to achieve goals via feasible subgoals prediction
- Author
-
Singh, Utsav, Chakraborty, Souradip, Suttle, Wesley A., Sadler, Brian M., Sahu, Anit Kumar, Shah, Mubarak, Namboodiri, Vinay P., and Bedi, Amrit Singh
- Subjects
Computer Science - Machine Learning - Abstract
This work introduces Hierarchical Preference Optimization (HPO), a novel approach to hierarchical reinforcement learning (HRL) that addresses non-stationarity and infeasible subgoal generation issues when solving complex robotic control tasks. HPO leverages maximum entropy reinforcement learning combined with token-level Direct Preference Optimization (DPO), eliminating the need for pre-trained reference policies that are typically unavailable in challenging robotic scenarios. Mathematically, we formulate HRL as a bi-level optimization problem and transform it into a primitive-regularized DPO formulation, ensuring feasible subgoal generation and avoiding degenerate solutions. Extensive experiments on challenging robotic navigation and manipulation tasks demonstrate impressive performance of HPO, where it shows an improvement of up to 35% over the baselines. Furthermore, ablation studies validate our design choices, and quantitative analyses confirm the ability of HPO to mitigate non-stationarity and infeasible subgoal generation issues in HRL.
- Published
- 2024
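For context, the standard (reference-based) DPO objective that the HPO formulation above builds on is, for a prompt $x$ with preferred and dispreferred responses $y_w$ and $y_l$:

\[
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}})
= -\,\mathbb{E}_{(x,\,y_w,\,y_l)}
\left[\log\sigma\!\left(
\beta\log\frac{\pi_\theta(y_w\mid x)}{\pi_{\mathrm{ref}}(y_w\mid x)}
-\beta\log\frac{\pi_\theta(y_l\mid x)}{\pi_{\mathrm{ref}}(y_l\mid x)}
\right)\right].
\]

HPO departs from this form: as the abstract notes, it uses a token-level, primitive-regularized variant that removes the dependence on the reference policy $\pi_{\mathrm{ref}}$; the equation above is only the standard starting point.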
47. Understanding Graphical Perception in Data Visualization through Zero-shot Prompting of Vision-Language Models
- Author
-
Guo, Grace, Kang, Jenna Jiayi, Shah, Raj Sanjay, Pfister, Hanspeter, and Varma, Sashank
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Computer Vision and Pattern Recognition - Abstract
Vision Language Models (VLMs) have been successful at many chart comprehension tasks that require attending to both the images of charts and their accompanying textual descriptions. However, it is not well established how VLM performance profiles map to human-like behaviors. If VLMs can be shown to have human-like chart comprehension abilities, they can then be applied to a broader range of tasks, such as designing and evaluating visualizations for human readers. This paper lays the foundations for such applications by evaluating the accuracy of zero-shot prompting of VLMs on graphical perception tasks with established human performance profiles. Our findings reveal that VLMs perform similarly to humans under specific task and style combinations, suggesting that they have the potential to be used for modeling human performance. Additionally, variations to the input stimuli show that VLM accuracy is sensitive to stylistic changes such as fill color and chart contiguity, even when the underlying data and data mappings are the same.
- Published
- 2024
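A bare-bones version of such a zero-shot probe is sketched below: a fixed graphical-perception question (here a Cleveland-McGill-style proportion judgment) is sent together with a chart image to a VLM. The prompt wording and the query_vlm stub are assumptions for illustration; wire the stub to whichever vision-language model API you use.

import base64

PROMPT = (
    "You are shown a bar chart. Two bars are marked with dots. "
    "What percentage is the smaller marked bar of the larger marked bar? "
    "Answer with a single number between 0 and 100."
)

def encode_image(path: str) -> str:
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

def query_vlm(prompt: str, image_b64: str) -> str:
    # Placeholder: replace with a real call to a vision-language model.
    # A canned answer is returned here so the sketch runs end to end.
    return "67"

def run_trial(image_path: str) -> float:
    answer = query_vlm(PROMPT, encode_image(image_path))
    return float(answer.strip().rstrip("%"))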
48. Constrained Fair and Efficient Allocations
- Author
-
Cookson, Benjamin, Ebadian, Soroush, and Shah, Nisarg
- Subjects
Computer Science - Computer Science and Game Theory - Abstract
Fairness and efficiency have become the pillars of modern fair division research, but prior work on achieving both simultaneously is largely limited to the unconstrained setting. We study fair and efficient allocations of indivisible goods under additive valuations and various types of allocation feasibility constraints, and demonstrate the unreasonable effectiveness of the maximum Nash welfare (MNW) solution in this previously uncharted territory. Our main result is that MNW allocations are 1/2-envy-free up to one good (EF1) and Pareto optimal under the broad family of (arbitrary) matroid constraints. We extend these guarantees to complete MNW allocations for base-orderable matroid constraints, and to a family of non-matroidal constraints (which includes balancedness) using a novel "alternate worlds" technique. We establish tightness of our results by providing counterexamples for the satisfiability of certain stronger desiderata, but show an improved result for the special case of goods with copies (Gafni et al. 2023). Finally, we also establish novel best-of-both-worlds guarantees for goods with copies and balancedness.
- Published
- 2024
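For readers less familiar with the fair-division notions in the abstract above, the standard definitions (stated here in their usual unconstrained form) are:

\[
\mathrm{NW}(A) = \prod_{i \in N} v_i(A_i) \qquad \text{(an MNW allocation maximizes this product)},
\]
\[
\alpha\text{-EF1:}\quad \forall\, i, j \text{ with } A_j \neq \emptyset \;\; \exists\, g \in A_j:\;
v_i(A_i) \;\ge\; \alpha \cdot v_i\!\left(A_j \setminus \{g\}\right),
\]

with $\alpha = 1$ giving ordinary EF1; the result above establishes $\alpha = 1/2$ together with Pareto optimality for MNW under arbitrary matroid constraints.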
49. Multi-modal Spatial Clustering for Spatial Transcriptomics Utilizing High-resolution Histology Images
- Author
-
Li, Bingjun, Karami, Mostafa, Junayed, Masum Shah, and Nabavi, Sheida
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing ,Computer Science - Computer Vision and Pattern Recognition ,J.3 ,I.2.1 - Abstract
Understanding the intricate cellular environment within biological tissues is crucial for uncovering insights into complex biological functions. While single-cell RNA sequencing has significantly enhanced our understanding of cellular states, it lacks the spatial context necessary to fully comprehend the cellular environment. Spatial transcriptomics (ST) addresses this limitation by enabling transcriptome-wide gene expression profiling while preserving spatial context. One of the principal challenges in ST data analysis is spatial clustering, which reveals spatial domains based on the spots within a tissue. Modern ST sequencing procedures typically include a high-resolution histology image, which has been shown in previous studies to be closely connected to gene expression profiles. However, current spatial clustering methods often fail to fully integrate high-resolution histology image features with gene expression data, limiting their ability to capture critical spatial and cellular interactions. In this study, we propose the spatial transcriptomics multi-modal clustering (stMMC) model, a novel contrastive learning-based deep learning approach that integrates gene expression data with histology image features through a multi-modal parallel graph autoencoder. We tested stMMC against four state-of-the-art baseline models: Leiden, GraphST, SpaGCN, and stLearn on two public ST datasets with 13 sample slices in total. The experiments demonstrated that stMMC outperforms all the baseline models in terms of ARI and NMI. An ablation study further validated the contributions of contrastive learning and the incorporation of histology image features., Comment: 9 pages
- Published
- 2024
50. Temporal Fair Division
- Author
-
Cookson, Benjamin, Ebadian, Soroush, and Shah, Nisarg
- Subjects
Computer Science - Computer Science and Game Theory - Abstract
We study temporal fair division, whereby a set of agents are allocated a (possibly different) set of goods on each day for a period of days. We study this setting, as well as a number of its special cases formed by the restrictions to two agents, same goods on each day, identical preferences, or combinations thereof, and chart out the landscape of achieving two types of fairness guarantees simultaneously: fairness on each day (per day) and fairness over time (up to each day, or the weaker version, overall). In the most general setting, we prove that there always exists an allocation that is stochastically-dominant envy-free up to one good (SD-EF1) per day and proportional up to one good (PROP1) overall, and when all the agents have identical preferences, we show that SD-EF1 per day and SD-EF1 overall can be guaranteed. For the case of two agents, we prove that SD-EF1 per day and EF1 up to each day can be guaranteed using an envy balancing technique. We provide counterexamples for other combinations that establish our results as among the best guarantees possible, but also leaving open some tantalizing questions.
- Published
- 2024
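The two per-good relaxations used in the abstract above have the following standard forms, for an agent set $N$ of size $n$, goods $M$, and allocation $A$; the per-day and up-to-each-day variants in the paper apply these conditions to each day's goods and to the cumulative allocation, respectively:

\[
\text{EF1:}\quad \forall\, i, j \text{ with } A_j \neq \emptyset \;\; \exists\, g \in A_j:\;
v_i(A_i) \ge v_i\!\left(A_j \setminus \{g\}\right),
\]
\[
\text{PROP1:}\quad \forall\, i \;\; \exists\, g \in M \setminus A_i:\;
v_i\!\left(A_i \cup \{g\}\right) \ge \tfrac{1}{n}\, v_i(M).
\]

SD-EF1 strengthens EF1 by requiring the comparison to hold under stochastic dominance, i.e., for every additive valuation consistent with the agent's ordinal ranking of the goods.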