66 results for "Jain, Siddharth"
Search Results
2. Large-Scale Purification and Characterization of Recombinant Receptor-Binding Domain (RBD) of SARS-CoV-2 Spike Protein Expressed in Yeast
- Author
-
Massachusetts Institute of Technology. Department of Biological Engineering, Koch Institute for Integrative Cancer Research at MIT, Massachusetts Institute of Technology. Department of Chemical Engineering, Nagar, Gaurav, Jain, Siddharth, Rajurkar, Meghraj, Lothe, Rakesh, Rao, Harish, Majumdar, Sourav, Gautam, Manish, Rodriguez-Aponte, Sergio A., Crowell, Laura E., Love, J. Christopher, Dandekar, Prajakta, Puranik, Amita, Gairola, Sunil, Shaligram, Umesh, and Jain, Ratnesh
- Abstract
SARS-CoV-2 spike protein is an essential component of numerous protein-based vaccines for COVID-19. The receptor-binding domain (RBD) of this spike protein is a promising antigen, with ease of expression in microbial hosts and scalability at comparatively low production costs. This study describes the production, purification, and characterization of the RBD of the SARS-CoV-2 spike protein, currently in clinical trials, from a commercialization perspective. The protein was expressed in Pichia pastoris in a large-scale bioreactor of 1200 L capacity. Protein capture and purification were conducted through mixed-mode chromatography followed by hydrophobic interaction chromatography. This two-step purification process produced RBD with an overall productivity of ~21 mg/L at >99% purity. The protein's primary, secondary, and tertiary structures were verified using LC-MS-based peptide mapping, circular dichroism, and fluorescence spectroscopy, respectively. The glycoprotein was further characterized for quality attributes such as glycosylation, molecular weight, purity, and disulfide bonding. Structural analysis confirmed that the product maintained consistent quality across batches during the large-scale production process. The binding capacity of the RBD was also assessed using the human angiotensin-converting enzyme 2 (ACE2) receptor. Low equilibrium dissociation constants (KD), ranging from 3.63 × 10−8 to 6.67 × 10−8, demonstrated a high affinity for the ACE2 receptor, marking this protein as a promising candidate for blocking the entry of the COVID-19 virus.
- Published
- 2023
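The reported KD range can be put in thermodynamic terms: the standard binding free energy follows ΔG° = RT ln KD. A minimal sketch, assuming the KD values above are in molar units (the abstract does not state the units explicitly):

```python
import math

def binding_free_energy(kd_molar: float, temp_k: float = 298.15) -> float:
    """Standard binding free energy (kJ/mol) from an equilibrium
    dissociation constant, via dG = R * T * ln(Kd)."""
    R = 8.314462618e-3  # gas constant, kJ/(mol*K)
    return R * temp_k * math.log(kd_molar)

# KD range reported for the RBD-ACE2 interaction (assumed molar)
dg_tight = binding_free_energy(3.63e-8)
dg_weak = binding_free_energy(6.67e-8)
print(f"{dg_tight:.1f} to {dg_weak:.1f} kJ/mol")  # roughly -42 to -41 kJ/mol
```

Values around -40 kJ/mol are typical of high-affinity protein-protein interactions, consistent with the abstract's conclusion.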
3. The Next 700 ML-Enabled Compiler Optimizations
- Author
-
VenkataKeerthy, S., Jain, Siddharth, Kalvakuntla, Umesh, Gorantla, Pranav Sai, Chitale, Rajiv Shailesh, Brevdo, Eugene, Cohen, Albert, Trofin, Mircea, and Upadrasta, Ramakrishna
- Abstract
There is a growing interest in enhancing compiler optimizations with ML models, yet interactions between compilers and ML frameworks remain challenging. Some optimizations require tightly coupled models and compiler internals, raising issues with modularity, performance, and framework independence. Practical deployment and transparency for the end user are also important concerns. We propose ML-Compiler-Bridge to enable ML model development within a traditional Python framework while making end-to-end integration with an optimizing compiler possible and efficient. We evaluate it on both research and production use cases, for training and inference, over several optimization problems, multiple compilers and their versions, and gym infrastructures.
- Published
- 2023
- Full Text
- View/download PDF
4. Omitted Labels in Causality: A Study of Paradoxes
- Author
-
Mazaheri, Bijan, Jain, Siddharth, Cook, Matthew, and Bruck, Jehoshua
- Abstract
We explore what we call ``omitted label contexts,'' in which training data is limited to a subset of the possible labels. This setting is common among specialized human experts and in focused studies. We lean on well-studied paradoxes (Simpson's and Condorcet's) to illustrate the more general difficulties of causal inference in omitted label contexts. Contrary to the fundamental principles on which much of causal inference is built, we show that ``correct'' adjustments sometimes require non-exchangeable treatment and control groups. These pitfalls lead us to study networks of conclusions drawn from different contexts and the structures they form, proving an interesting connection between these networks and social choice theory.
- Published
- 2023
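Simpson's paradox, which the abstract leans on, is easy to reproduce: a treatment can look better within every subgroup yet worse in aggregate. A minimal sketch with classic kidney-stone-style counts (illustrative numbers, not data from the paper):

```python
# (successes, trials) per subgroup for two treatments A and B
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(successes, trials):
    return successes / trials

def overall(treatment):
    s = sum(v[0] for v in data[treatment].values())
    n = sum(v[1] for v in data[treatment].values())
    return s / n

# A wins inside each subgroup...
for group in ("small", "large"):
    assert rate(*data["A"][group]) > rate(*data["B"][group])

# ...yet B wins on the pooled data: aggregation reverses the conclusion.
assert overall("B") > overall("A")
```

The reversal happens because the subgroup sizes are unbalanced across treatments, which is exactly the kind of hidden conditioning that omitted labels can introduce.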
5. Development of Surrogate Model for FEM Error Prediction using Deep Learning
- Author
-
Jain, Siddharth
- Abstract
This research is a proof-of-concept study to develop a surrogate model, using deep learning (DL), to predict solution error for a given model with a given mesh. We take von Mises stress contours and predict two types of error indicator contours: (i) the von Mises error indicator (MISESERI) and (ii) the energy density error indicator (ENDENERI). Error indicators are designed to identify the areas of the solution domain where the gradient has not been properly captured; they use the spatial gradient distribution of the existing solution on a given mesh to estimate the error. Because of poor meshing and the nature of the finite element method, these error indicators are leveraged to study and reduce errors in the finite element solution through an adaptive re-meshing scheme. Adaptive re-meshing is an iterative and computationally expensive process for reducing the error computed during the post-processing step. To overcome this limitation, we propose replacing it with data-driven techniques. We introduce an image-processing-based surrogate model that solves an image-to-image regression problem using convolutional neural networks (CNNs): it takes a 256 × 256 colored image of a von Mises stress contour and outputs the required error indicator. To train this model with good generalization performance, we developed four different geometries for each of three case studies: (a) a quarter plate with a hole, (b) a simply supported plate with multiple holes, and (c) a simply supported stiffened plate. The research is implemented in a three-phase approach: Phase I involves the design and development of a CNN trained on stress contour images with their corresponding von Mises stress values volume-averaged over the entire domain; Phase II involves developing a surrogate model to perform image-to-image regression; and Phase III extends the capabilities of Phase II, making the surrogate model
- Published
- 2022
7. RL4ReAl: Reinforcement Learning for Register Allocation
- Author
-
VenkataKeerthy, S., Jain, Siddharth, Kundu, Anilava, Aggarwal, Rohit, Cohen, Albert, and Upadrasta, Ramakrishna
- Abstract
We aim to automate decades of research and experience in register allocation by leveraging machine learning. We tackle this problem by embedding a multi-agent reinforcement learning algorithm within LLVM, training it with state-of-the-art techniques. We formalize the constraints that precisely define the problem for a given instruction-set architecture, while ensuring that the generated code preserves semantic correctness. We also develop a gRPC-based framework providing a modular and efficient compiler interface for training and inference. Our approach is architecture independent: we show experimental results targeting Intel x86 and ARM AArch64. Our results match or outperform the heavily tuned, production-grade register allocators of LLVM. (Published in CC'23.)
- Published
- 2022
- Full Text
- View/download PDF
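The classical baseline that such learned allocators compete with treats register allocation as interference-graph coloring. A toy greedy-coloring sketch (the variable names and interference graph are invented for illustration; the paper's actual formulation is a multi-agent RL problem inside LLVM):

```python
def greedy_allocate(interference, registers):
    """Assign each virtual register a physical register not used by any
    interfering neighbor; record a spill when no register is free."""
    assignment, spills = {}, []
    # Color highest-degree nodes first, a common coloring heuristic
    for v in sorted(interference, key=lambda n: -len(interference[n])):
        taken = {assignment[u] for u in interference[v] if u in assignment}
        free = [r for r in registers if r not in taken]
        if free:
            assignment[v] = free[0]
        else:
            spills.append(v)
    return assignment, spills

# Hypothetical interference graph: an edge means "live at the same time".
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
assignment, spills = greedy_allocate(graph, ["r0", "r1", "r2"])
print(assignment, spills)
```

With three registers this graph colors cleanly with no spills; shrinking the register list forces spill decisions, which is where learned policies have room to beat fixed heuristics.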
8. Decoding the Past
- Author
-
Jain, Siddharth
- Abstract
The human genome is continuously evolving, hence the sequenced genome is a snapshot in time of this evolving entity. Over time, the genome accumulates mutations that can be associated with different phenotypes, such as physical traits and diseases. Underlying mutation accumulation is an evolution channel (the term channel is motivated by the notion of a communication channel introduced by Shannon [1] in 1948, which started the area of Information Theory), which is controlled by hereditary, environmental, and stochastic factors. The premise of this thesis is to understand the human genome using an information-theoretic framework. In particular, it focuses on: (i) the analysis and characterization of the evolution channel using measures of capacity, expressiveness, evolution distance, and uniqueness of ancestry, and uses these insights for (ii) the design of error correcting codes for DNA storage, (iii) inversion symmetry in the genome, and (iv) cancer classification. The mutational events characterizing this evolution channel can be divided into two categories, namely point mutations and duplications. While evolution through point mutations is unconstrained, giving rise to combinatorially many possibilities of what could have happened in the past, evolution through duplications adds constraints limiting the number of those possibilities. Further, more than 50% of the genome has been observed to consist of repeated sequences. We focus on a highly constrained form of duplication, known as tandem duplication, in order to understand the limits of evolution by duplication. Our sequence evolution model consists of a starting sequence, called a seed, and a set of tandem duplication rules. We find limits on the diversity of sequences that can be generated by tandem duplications using measures of capacity and expressiveness. Additionally, we calculate bounds on the duplication distance which is used to measure the timing
- Published
- 2019
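The tandem-duplication model in the abstract can be made concrete: a duplication copies a substring and inserts the copy immediately after it. A small brute-force sketch that counts the distinct sequences reachable from a seed within a few steps (the seed, duplication length bound, and step count are chosen for illustration):

```python
def tandem_duplicates(s, max_len):
    """All sequences obtained from s by one tandem duplication of a
    substring of length <= max_len."""
    out = set()
    for i in range(len(s)):
        for k in range(1, max_len + 1):
            if i + k <= len(s):
                # copy s[i:i+k] and insert it right after itself
                out.add(s[:i + k] + s[i:i + k] + s[i + k:])
    return out

def reachable(seed, max_len, steps):
    """Seed plus every sequence reachable in at most `steps` duplications."""
    frontier, seen = {seed}, {seed}
    for _ in range(steps):
        frontier = {t for s in frontier
                    for t in tandem_duplicates(s, max_len)} - seen
        seen |= frontier
    return seen

print(len(reachable("TAG", max_len=2, steps=2)))
```

Counting how this set grows with the number of steps is a finite analogue of the capacity and expressiveness measures the thesis studies.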
11. Bio Cyber Physical Architecture: Use of Computational methods in Ecological buildings and Landscapes
- Author
-
Jain, Siddharth Popatlal (author)
- Abstract
The world around us is rapidly changing and evolving. According to a recent report by the World Green Building Council, the building and construction industry is responsible for 38.8% of all CO2 emissions globally, with operational emissions (from energy used to heat, cool, and light buildings) representing 28%. Our designed structures depend upon many resources during the construction and operational phases, so it is clear that the design decisions we make now for our built environment have a significant impact on the future. As the population increases, the need for freshwater, electricity, and other urban resources grows exponentially. This constantly increases pressure on the urban resources and infrastructure needed to run our cities smoothly. Self-sufficient buildings can be a crucial solution to urban problems we face today, like increasing energy demands and poor air quality. Such buildings are less dependent on active energy systems like mechanical ventilation and grid electricity, and rely more on passive energy systems. Integrating vegetation into the built environment has, in many instances, been shown to increase buildings' self-sufficiency and users' well-being (R. Hassell, 2017). Nature has always provided inspiration and ideas for the field of innovation, inspiring newer living and green solutions with environmental, economic, health, and community benefits. The integration of green vegetation into buildings has various psychological and physiological benefits for the users. To summarize, integrating green building strategies can increase the self-sufficiency index of our structures. The research has explored various computational methods used in similar contexts and related them to green building strategies in order to design a self-sufficient habitat.
The research question mainly revolves around developing a design process that explores computational methods in green buildings and,
Project page: http://cs.roboticbuilding.eu/index.php/project08:P5 | Presentation video links: https://youtube.com/playlist?list=PL9zFBsJmlp0nay8oyJ2VIQfGfc88--FgP | Architecture, Urbanism and Building Sciences | Explorelab
- Published
- 2021
12. Expert Graphs: Synthesizing New Expertise via Collaboration
- Author
-
Mazaheri, Bijan, Jain, Siddharth, and Bruck, Jehoshua
- Abstract
Consider multiple experts with overlapping expertise working on a classification problem under uncertain input. What constitutes a consistent set of opinions? How can we predict the opinions of experts on missing sub-domains? In this paper, we define a framework to analyze this problem, termed "expert graphs." In an expert graph, vertices represent classes and edges represent binary opinions on the topics of their vertices. We derive necessary conditions for expert graph validity and use them to create "synthetic experts" that describe opinions consistent with the observed opinions of other experts. We show this framework to be equivalent to the well-studied linear ordering polytope. We show our conditions are not sufficient for describing all expert graphs on cliques, but are sufficient for cycles.
- Published
- 2021
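A concrete instance of validity: in the linear ordering polytope the abstract refers to, pairwise "i over j" probabilities on any triangle must satisfy 0 ≤ p_ij + p_jk − p_ik ≤ 1 (with p_ji = 1 − p_ij). A sketch of a checker; the triangle condition is the standard one for this polytope, while the example numbers are invented:

```python
from itertools import permutations

def triangle_consistent(p):
    """Check 0 <= p[i][j] + p[j][k] - p[i][k] <= 1 for all distinct i, j, k,
    where p[i][j] is the opinion weight favoring class i over class j."""
    for i, j, k in permutations(p.keys(), 3):
        v = p[i][j] + p[j][k] - p[i][k]
        if not (0.0 <= v <= 1.0):
            return False
    return True

def full(pairs):
    """Expand one-directional edge weights into a complete table
    using p[j][i] = 1 - p[i][j]."""
    nodes = {x for edge in pairs for x in edge}
    p = {n: {} for n in nodes}
    for (i, j), v in pairs.items():
        p[i][j], p[j][i] = v, 1.0 - v
    return p

# A mildly cyclic (rock-paper-scissors-like) opinion set is still consistent
mild = full({("a", "b"): 0.6, ("b", "c"): 0.6, ("a", "c"): 0.4})
print(triangle_consistent(mild))     # True

# Extreme cyclic opinions violate the triangle condition
extreme = full({("a", "b"): 0.9, ("b", "c"): 0.9, ("a", "c"): 0.1})
print(triangle_consistent(extreme))  # False
```

The second example shows why these conditions are non-trivial: moderate Condorcet-style cycles are admissible, but sufficiently confident ones are not.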
13. Synthesizing New Expertise via Collaboration
- Author
-
Mazaheri, Bijan, Jain, Siddharth, and Bruck, Jehoshua
- Abstract
Consider a set of classes and an uncertain input. Suppose we do not have access to data and only have knowledge of perfect experts between a few classes in the set. What constitutes a consistent set of opinions? How can we use this to predict the opinions of experts on missing sub-domains? In this paper, we define a framework to analyze this problem. In particular, we define an expert graph whose vertices represent classes and whose edges represent binary experts on the topics of their vertices. We derive necessary conditions for an expert graph to be valid. Further, we show that these conditions are also sufficient if the graph is a cycle, which can yield unintuitive results. Using these conditions, we provide an algorithm to obtain upper and lower bounds on the weights of unknown edges in an expert graph.
- Published
- 2021
17. Advances of carbon capture and storage in coal-based power generating units in an Indian context
- Author
-
Kumar Shukla, Anoop, Ahmad, Zoheb, Sharma, Meeta, Dwivedi, Gaurav, Nath Verma, Tikendra, Jain, Siddharth, Verma, Puneet, and Zare, Ali
- Published
- 2020
18. Coding for Optimized Writing Rate in DNA Storage
- Author
-
Jain, Siddharth, Farnoud (Hassanzadeh), Farzad, Schwartz, Moshe, and Bruck, Jehoshua
- Abstract
A method for encoding information in DNA sequences is described. The method is based on the precision-resolution framework, and is aimed to work in conjunction with a recently suggested terminator-free template independent DNA synthesis method. The suggested method optimizes the amount of information bits per synthesis time unit, namely, the writing rate. Additionally, the encoding scheme studied here takes into account the existence of multiple copies of the DNA sequence, which are independently distorted. Finally, quantizers for various run-length distributions are designed.
- Published
- 2020
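The notion of "information bits per synthesis time unit" can be illustrated with a toy run-length code (this is not the paper's precision-resolution scheme, just a minimal analogue): encode each pair of bits as a run of length 1-4, cycling through the bases so adjacent runs stay separable, then measure bits per synthesized nucleotide.

```python
from itertools import groupby

BASES = "ACGT"

def encode(bits):
    """Map each 2-bit chunk to a run of length 1-4, cycling through the
    bases so that consecutive runs always use different letters."""
    assert len(bits) % 2 == 0
    runs = []
    for idx in range(0, len(bits), 2):
        run_len = int(bits[idx:idx + 2], 2) + 1   # '00' -> 1 ... '11' -> 4
        runs.append(BASES[(idx // 2) % 4] * run_len)
    return "".join(runs)

def decode(seq):
    """Recover the bits from the observed run lengths."""
    return "".join(format(len(list(g)) - 1, "02b") for _, g in groupby(seq))

msg = "0110110010"
dna = encode(msg)               # 'AACCCGGGGTAAA'
rate = len(msg) / len(dna)      # information bits per synthesized nucleotide
assert decode(dna) == msg
print(dna, round(rate, 2))
```

Because longer runs cost more synthesis time but carry no extra information here, the achievable rate depends on the run-length distribution, which is exactly what the paper's quantizers optimize.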
19. What is the Value of Data? on Mathematical Methods for Data Quality Estimation
- Author
-
Raviv, Netanel, Jain, Siddharth, and Bruck, Jehoshua
- Abstract
Data is one of the most important assets of the information age, and its societal impact is undisputed. Yet, rigorous methods of assessing the quality of data are lacking. In this paper, we propose a formal definition for the quality of a given dataset. We assess a dataset’s quality by a quantity we call the expected diameter, which measures the expected disagreement between two randomly chosen hypotheses that explain it, and has recently found applications in active learning. We focus on Boolean hyperplanes, and utilize a collection of Fourier analytic, algebraic, and probabilistic methods to come up with theoretical guarantees and practical solutions for the computation of the expected diameter. We also study the behaviour of the expected diameter on algebraically structured datasets, conduct experiments that validate this notion of quality, and demonstrate the feasibility of our techniques.
- Published
- 2020
20. CodNN – Robust Neural Networks From Coded Classification
- Author
-
Raviv, Netanel, Jain, Siddharth, Upadhyaya, Pulakesh, Bruck, Jehoshua, and Jiang, Anxiao (Andrew)
- Abstract
Deep Neural Networks (DNNs) are a revolutionary force in the ongoing information revolution, and yet their intrinsic properties remain a mystery. In particular, it is widely known that DNNs are highly sensitive to noise, whether adversarial or random. This poses a fundamental challenge for hardware implementations of DNNs, and for their deployment in critical applications such as autonomous driving. In this paper we construct robust DNNs via error correcting codes. By our approach, either the data or internal layers of the DNN are coded with error correcting codes, and successful computation under noise is guaranteed. Since DNNs can be seen as a layered concatenation of classification tasks, our research begins with the core task of classifying noisy coded inputs, and progresses towards robust DNNs. We focus on binary data and linear codes. Our main result is that the prevalent parity code can guarantee robustness for a large family of DNNs, which includes the recently popularized binarized neural networks. Further, we show that the coded classification problem has a deep connection to Fourier analysis of Boolean functions. In contrast to existing solutions in the literature, our results do not rely on altering the training process of the DNN, and provide mathematically rigorous guarantees rather than experimental evidence.
- Published
- 2020
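The parity-code idea can be sketched at its simplest: append a parity bit to a binary input so that a single flipped bit is detected before classification. This toy detector only illustrates the coding primitive, not the paper's actual construction for binarized networks:

```python
def with_parity(bits):
    """Append an even-parity bit: the coded word always has even weight."""
    return bits + [sum(bits) % 2]

def parity_ok(word):
    """True iff the word's weight is even, i.e. no single-bit error."""
    return sum(word) % 2 == 0

x = [1, 0, 1, 1]
coded = with_parity(x)          # [1, 0, 1, 1, 1]
assert parity_ok(coded)

noisy = coded.copy()
noisy[2] ^= 1                   # a single bit flip in "hardware"
assert not parity_ok(noisy)     # the flip is caught before inference
```

The paper's contribution is showing that a classifier can be made to compute correctly on such parity-coded inputs, so detection alone (as above) is only the starting point.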
22. Improve Robustness of Deep Neural Networks by Coding
- Author
-
Huang, Kunping, Raviv, Netanel, Jain, Siddharth, Upadhyaya, Pulakesh, Bruck, Jehoshua, Siegel, Paul H., and Jiang, Anxiao (Andrew)
- Abstract
Deep neural networks (DNNs) typically have many weights. When errors appear in their weights, which are usually stored in non-volatile memories, their performance can degrade significantly. We review two recently presented approaches that improve the robustness of DNNs in complementary ways. In the first approach, we use error-correcting codes as external redundancy to protect the weights from errors. A deep reinforcement learning algorithm is used to optimize the redundancy-performance tradeoff. In the second approach, internal redundancy is added to neurons via coding. It enables neurons to perform robust inference in noisy environments.
- Published
- 2020
23. CodNN - Robust Neural Networks From Coded Classification
- Author
-
Raviv, Netanel, Jain, Siddharth, Upadhyaya, Pulakesh, Bruck, Jehoshua, Jiang, Anxiao (Andrew), Raviv, Netanel, Jain, Siddharth, Upadhyaya, Pulakesh, Bruck, Jehoshua, and Jiang, Anxiao (Andrew)
- Abstract
Deep Neural Networks (DNNs) are a revolutionary force in the ongoing information revolution, and yet their intrinsic properties remain a mystery. In particular, it is widely known that DNNs are highly sensitive to noise, whether adversarial or random. This poses a fundamental challenge for hardware implementations of DNNs, and for their deployment in critical applications such as autonomous driving. In this paper we construct robust DNNs via error correcting codes. By our approach, either the data or internal layers of the DNN are coded with error correcting codes, and successful computation under noise is guaranteed. Since DNNs can be seen as a layered concatenation of classification tasks, our research begins with the core task of classifying noisy coded inputs, and progresses towards robust DNNs. We focus on binary data and linear codes. Our main result is that the prevalent parity code can guarantee robustness for a large family of DNNs, which includes the recently popularized binarized neural networks. Further, we show that the coded classification problem has a deep connection to Fourier analysis of Boolean functions. In contrast to existing solutions in the literature, our results do not rely on altering the training process of the DNN, and provide mathematically rigorous guarantees rather than experimental evidence.
- Published
- 2020
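The parity code at the heart of the result above is easy to sketch. This toy snippet shows only the code itself, that any single bit flip on binary data violates parity and is therefore detectable, not the paper's construction of robust DNN computation on coded inputs.

```python
def parity_encode(x):
    """Append a parity bit so the encoded word has an even number of 1s."""
    return x + [sum(x) % 2]

def parity_ok(c):
    """Check the parity constraint on a (possibly noisy) coded input."""
    return sum(c) % 2 == 0

x = [1, 0, 1, 1]
c = parity_encode(x)
assert parity_ok(c)

# Any single bit flip on binary data violates parity and is detected.
for i in range(len(c)):
    noisy = c.copy()
    noisy[i] ^= 1
    assert not parity_ok(noisy)
```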
24. Cancer Classification from Blood-Derived DNA
- Author
-
Jain, Siddharth, Mazaheri, Bijan, Raviv, Netanel, and Bruck, Jehoshua
- Abstract
The genome is traditionally viewed as a time-independent source of information; a paradigm that drives researchers to seek correlations between the presence of certain genes and a patient's risk of disease. This analysis neglects genomic temporal changes, which we believe to be a crucial signal for predicting an individual's susceptibility to cancer. We hypothesize that each individual's genome passes through an evolution channel (the term channel is motivated by the notion of a communication channel, introduced by Shannon in 1948, which founded the field of information theory) that is controlled by hereditary, environmental and stochastic factors. This channel differs among individuals, giving rise to varying predispositions to developing cancer. We introduce the concept of mutation profiles that are computed without any comparative analysis, but by analyzing the short tandem repeat regions in a single healthy genome and capturing information about the individual's evolution channel. Using machine learning on data from more than 5,000 TCGA cancer patients, we demonstrate that these mutation profiles can accurately distinguish between patients with various types of cancer. For example, the pairwise validation accuracy of the classifier between PAAD (pancreas) patients and GBM (brain) patients is 93%. Our results show that healthy unaffected cells still contain a cancer-specific signal, which opens the possibility of cancer prediction from a healthy genome.
- Published
- 2020
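A hypothetical, much-simplified version of the short-tandem-repeat scan underlying the mutation profiles described above. The motif length, repeat threshold, and profile keying are illustrative assumptions, not the paper's actual pipeline.

```python
import re
from collections import Counter

def str_profile(seq, motif_len=2, min_repeats=3):
    """Toy 'mutation profile': counts of short-tandem-repeat runs,
    keyed by (motif, number of tandem repeats)."""
    profile = Counter()
    # A motif of motif_len characters repeated at least min_repeats times.
    pattern = re.compile(r"(.{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    pos = 0
    while pos < len(seq):
        m = pattern.search(seq, pos)
        if not m:
            break
        motif = m.group(1)
        repeats = len(m.group(0)) // motif_len
        profile[(motif, repeats)] += 1
        pos = m.end()
    return profile

p = str_profile("AATGTGTGTGCCACACACACACTT")
```

Here `p` records one run of `TG` repeated 4 times and one run of `CA` repeated 5 times; the paper builds much richer profiles from such repeat statistics across the genome.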
25. Robust Correction of Sampling Bias Using Cumulative Distribution Functions
- Author
-
Mazaheri, Bijan, Jain, Siddharth, and Bruck, Jehoshua
- Abstract
Varying domains and biased datasets can lead to differences between the training and the target distributions, known as covariate shift. Current approaches for alleviating this often rely on estimating the ratio of the training and target probability density functions. These techniques require parameter tuning and can be unstable across different datasets. We present a new method for handling covariate shift that uses empirical cumulative distribution function estimates of the target distribution, via a rigorous generalization of a recent idea proposed by Vapnik and Izmailov. Further, we show experimentally that our method is more robust in its predictions, does not rely on parameter tuning, and matches the classification performance of current state-of-the-art techniques on synthetic and real datasets.
- Published
- 2020
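The building block of the method above is the empirical cumulative distribution function. A minimal sketch of just that building block (the paper's full reweighting scheme is not reproduced here):

```python
import bisect

def ecdf(sample):
    """Return the empirical CDF F(t) = (# points <= t) / n of a 1-D sample."""
    xs = sorted(sample)
    n = len(xs)
    def F(t):
        return bisect.bisect_right(xs, t) / n
    return F

F = ecdf([1.0, 2.0, 2.0, 5.0])
assert F(0.0) == 0.0
assert F(2.0) == 0.75
assert F(5.0) == 1.0
```

Unlike density-ratio estimates, the ECDF needs no bandwidth or other tuning parameter, which is the stability property the abstract emphasizes.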
26. Coding for Optimized Writing Rate in DNA Storage
- Author
-
Jain, Siddharth, Farnoud, Farzad, Schwartz, Moshe, and Bruck, Jehoshua
- Abstract
A method for encoding information in DNA sequences is described. The method is based on the precision-resolution framework and is designed to work in conjunction with a recently suggested terminator-free, template-independent DNA synthesis method. The suggested method optimizes the number of information bits per synthesis time unit, namely, the writing rate. Additionally, the encoding scheme studied here takes into account the existence of multiple copies of the DNA sequence, which are independently distorted. Finally, quantizers for various run-length distributions are designed., Comment: To appear in ISIT 2020
- Published
- 2020
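A toy encoder in the spirit of the run-length synthesis setting above: each synthesis cycle deposits a run of one nucleotide, so information rides on the choice of base (necessarily different from the previous run's base, since there is no terminator) and a quantized run length. The two-length alphabet and the pair encoding are illustrative assumptions, not the paper's construction.

```python
BASES = "ACGT"

def encode(digits, lengths=(1, 2)):
    """Each cycle appends a run: pick a base different from the previous
    run's base, and one of the quantized run lengths. After the first
    cycle, each step carries log2(3 * 2) ~ 2.58 bits per synthesis cycle,
    the toy analogue of the writing rate."""
    seq, last = [], None
    for b_choice, l_choice in digits:
        options = [b for b in BASES if b != last]
        base = options[b_choice]
        seq.append(base * lengths[l_choice])
        last = base
    return "".join(seq)

s = encode([(0, 0), (1, 1), (0, 0)])
assert s == "AGGA"   # runs: A, GG, A; adjacent runs always differ
```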
27. CodNN -- Robust Neural Networks From Coded Classification
- Author
-
Raviv, Netanel, Jain, Siddharth, Upadhyaya, Pulakesh, Bruck, Jehoshua, and Jiang, Anxiao
- Abstract
Deep Neural Networks (DNNs) are a revolutionary force in the ongoing information revolution, and yet their intrinsic properties remain a mystery. In particular, it is widely known that DNNs are highly sensitive to noise, whether adversarial or random. This poses a fundamental challenge for hardware implementations of DNNs, and for their deployment in critical applications such as autonomous driving. In this paper we construct robust DNNs via error correcting codes. By our approach, either the data or internal layers of the DNN are coded with error correcting codes, and successful computation under noise is guaranteed. Since DNNs can be seen as a layered concatenation of classification tasks, our research begins with the core task of classifying noisy coded inputs, and progresses towards robust DNNs. We focus on binary data and linear codes. Our main result is that the prevalent parity code can guarantee robustness for a large family of DNNs, which includes the recently popularized binarized neural networks. Further, we show that the coded classification problem has a deep connection to Fourier analysis of Boolean functions. In contrast to existing solutions in the literature, our results do not rely on altering the training process of the DNN, and provide mathematically rigorous guarantees rather than experimental evidence., Comment: To appear in ISIT '20
- Published
- 2020
28. What is the Value of Data? On Mathematical Methods for Data Quality Estimation
- Author
-
Raviv, Netanel, Jain, Siddharth, and Bruck, Jehoshua
- Abstract
Data is one of the most important assets of the information age, and its societal impact is undisputed. Yet, rigorous methods of assessing the quality of data are lacking. In this paper, we propose a formal definition for the quality of a given dataset. We assess a dataset's quality by a quantity we call the expected diameter, which measures the expected disagreement between two randomly chosen hypotheses that explain it, and has recently found applications in active learning. We focus on Boolean hyperplanes, and utilize a collection of Fourier analytic, algebraic, and probabilistic methods to come up with theoretical guarantees and practical solutions for the computation of the expected diameter. We also study the behaviour of the expected diameter on algebraically structured datasets, conduct experiments that validate this notion of quality, and demonstrate the feasibility of our techniques.
- Published
- 2020
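The expected diameter above can be computed by brute force on a toy example. Here the hypothesis class is all Boolean functions on a small domain, a simplified stand-in for the Boolean hyperplanes studied in the paper; the quantity is the expected disagreement between two hypotheses drawn uniformly from those consistent with the labeled data.

```python
from itertools import product

def expected_diameter(domain_size, labeled):
    """Expected fraction of the domain on which two uniformly random
    data-consistent hypotheses disagree. Hypothesis class: ALL Boolean
    functions on the domain (a toy stand-in for Boolean hyperplanes)."""
    free = [i for i in range(domain_size) if i not in labeled]
    hyps = []
    for bits in product([0, 1], repeat=len(free)):
        h = dict(labeled)
        h.update(zip(free, bits))
        hyps.append(h)
    total = sum(sum(h1[i] != h2[i] for i in range(domain_size))
                for h1 in hyps for h2 in hyps)
    return total / (len(hyps) ** 2 * domain_size)

# Two of four points labeled: each unlabeled point disagrees with
# probability 1/2, so the expected diameter is 2 * (1/2) / 4 = 0.25.
d = expected_diameter(4, {0: 1, 1: 0})
assert d == 0.25
```

A fully labeled dataset pins down a single hypothesis and has diameter 0, matching the intuition that higher-quality data leaves less disagreement.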
29. Robust Correction of Sampling Bias Using Cumulative Distribution Functions
- Author
-
Mazaheri, Bijan, Jain, Siddharth, and Bruck, Jehoshua
- Abstract
Varying domains and biased datasets can lead to differences between the training and the target distributions, known as covariate shift. Current approaches for alleviating this often rely on estimating the ratio of the training and target probability density functions. These techniques require parameter tuning and can be unstable across different datasets. We present a new method for handling covariate shift that uses empirical cumulative distribution function estimates of the target distribution, via a rigorous generalization of a recent idea proposed by Vapnik and Izmailov. Further, we show experimentally that our method is more robust in its predictions, does not rely on parameter tuning, and matches the classification performance of current state-of-the-art techniques on synthetic and real datasets., Comment: Accepted at NeurIPS 2020
- Published
- 2020
30. Short Tandem Repeats Information in TCGA is Statistically Biased by Amplification
- Author
-
Jain, Siddharth, Mazaheri, Bijan, Raviv, Netanel, and Bruck, Jehoshua
- Abstract
The current paradigm in data science is based on the belief that, given sufficient amounts of data, classifiers are likely to uncover the distinction between true and false hypotheses. In particular, the abundance of genomic data creates opportunities for discovering disease risk associations and helping in screening and treatment. However, working with large amounts of data is statistically beneficial only if the data is statistically unbiased. Here we demonstrate that amplification methods of DNA samples in TCGA have a substantial effect on short tandem repeat (STR) information. In particular, we design a classifier that uses the STR information and can distinguish between samples that have an analyte code D and an analyte code W. This artificial bias might be detrimental to data-driven approaches, and might undermine the conclusions of past and future genome-wide studies.
- Published
- 2019
31. Cancer Classification from Healthy DNA using Machine Learning
- Author
-
Jain, Siddharth, Mazaheri, Bijan, Raviv, Netanel, and Bruck, Jehoshua
- Abstract
The genome is traditionally viewed as a time-independent source of information; a paradigm that drives researchers to seek correlations between the presence of certain genes and a patient's risk of disease. This analysis neglects genomic temporal changes, which we believe to be a crucial signal for predicting an individual's susceptibility to cancer. We hypothesize that each individual's genome passes through an evolution channel (the term channel is motivated by the notion of a communication channel, introduced by Shannon in 1948, which founded the field of information theory) that is controlled by hereditary, environmental and stochastic factors. This channel differs among individuals, giving rise to varying predispositions to developing cancer. We introduce the concept of mutation profiles that are computed without any comparative analysis, but by analyzing the short tandem repeat regions in a single healthy genome and capturing information about the individual's evolution channel. Using machine learning on data from more than 5,000 TCGA cancer patients, we demonstrate that these mutation profiles can accurately distinguish between patients with various types of cancer. For example, the pairwise validation accuracy of the classifier between PAAD (pancreas) patients and GBM (brain) patients is 93%. Our results show that healthy unaffected cells still contain a cancer-specific signal, which opens the possibility of cancer prediction from a healthy genome.
- Published
- 2019
32. Attaining the 2nd Chargaff Rule by Tandem Duplications
- Author
-
Jain, Siddharth, Raviv, Netanel, and Bruck, Jehoshua
- Abstract
Erwin Chargaff in 1950 made an experimental observation that the count of A is equal to the count of T and the count of C is equal to the count of G in DNA. This observation played a crucial role in the discovery of the double stranded helix structure by Watson and Crick. However, this symmetry was also observed in single stranded DNA. This phenomenon was termed the 2nd Chargaff Rule. This symmetry has been verified experimentally in genomes of several different species not only for mononucleotides but also for reverse complement pairs of larger lengths, up to a small error. While the symmetry in double stranded DNA is related to base pairing and replication mechanisms, the symmetry in single stranded DNA is still a mystery in its function and source. In this work, we define a sequence generation model based on reverse complement tandem duplications. We show that this model generates sequences that satisfy the 2nd Chargaff Rule even when the duplication lengths are very small when compared to the length of sequences. We also provide estimates on the number of generations that are needed by this model to generate sequences that satisfy the 2nd Chargaff Rule. We provide theoretical bounds on the disruption in symmetry for different values of duplication lengths under this model. Moreover, we experimentally compare the disruption in the symmetry incurred by our model with what is observed in human genome data.
- Published
- 2018
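The generation model above is easy to simulate. This sketch applies random reverse-complement tandem duplications and exposes a helper for measuring the mononucleotide Chargaff violation; the duplication length, seed, and starting sequence are arbitrary choices for illustration.

```python
import random

COMP = str.maketrans("ACGT", "TGCA")

def rc(s):
    """Reverse complement of a DNA string."""
    return s.translate(COMP)[::-1]

def rc_tandem_duplicate(seq, start, length):
    """Insert the reverse complement of seq[start:start+length]
    immediately after it (a reverse-complement tandem duplication)."""
    end = start + length
    return seq[:end] + rc(seq[start:end]) + seq[end:]

def chargaff_error(seq):
    """Normalized mononucleotide violation of the 2nd Chargaff Rule."""
    return (abs(seq.count("A") - seq.count("T"))
            + abs(seq.count("C") - seq.count("G"))) / len(seq)

random.seed(1)
seq = "ACGTAACCGGTTAG"
for _ in range(200):                        # duplication length 4 stays
    start = random.randrange(len(seq) - 4)  # small vs. the final length
    seq = rc_tandem_duplicate(seq, start, 4)
assert len(seq) == 14 + 200 * 4
# chargaff_error(seq) is now typically a small fraction, in line with
# the model's convergence toward the 2nd Chargaff Rule.
```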
33. Attaining the 2nd Chargaff Rule by Tandem Duplications
- Author
-
Jain, Siddharth, Raviv, Netanel, and Bruck, Jehoshua
- Abstract
Erwin Chargaff in 1950 made an experimental observation that the count of A is equal to the count of T and the count of C is equal to the count of G in DNA. This observation played a crucial role in the discovery of the double stranded helix structure by Watson and Crick. However, this symmetry was also observed in single stranded DNA. This phenomenon was termed the 2nd Chargaff Rule. This symmetry has been verified experimentally in genomes of several different species not only for mononucleotides but also for reverse complement pairs of larger lengths, up to a small error. While the symmetry in double stranded DNA is related to base pairing and replication mechanisms, the symmetry in single stranded DNA is still a mystery in its function and source. In this work, we define a sequence generation model based on reverse complement tandem duplications. We show that this model generates sequences that satisfy the 2nd Chargaff Rule even when the duplication lengths are very small when compared to the length of sequences. We also provide estimates on the number of generations that are needed by this model to generate sequences that satisfy the 2nd Chargaff Rule. We provide theoretical bounds on the disruption in symmetry for different values of duplication lengths under this model. Moreover, we experimentally compare the disruption in the symmetry incurred by our model with what is observed in human genome data.
- Published
- 2018
34. Amino Acids Are an Ineffective Fertilizer for Dunaliella spp. Growth
- Author
-
Murphree, Colin A., Dums, Jacob T., Jain, Siddharth K., Zhao, Chengsong, Young, Danielle Y., Khoshnoodi, Nicole, Tikunov, Andrey, Macdonald, Jeffrey, Pilot, Guillaume, and Sederoff, Heike
- Abstract
Autotrophic microalgae are a promising bioproducts platform. However, the fundamental requirements these organisms have for nitrogen fertilizer severely limit the impact and scale of their cultivation. As an alternative to inorganic fertilizers, we investigated the possibility of using amino acids from deconstructed biomass as a nitrogen source in the genus Dunaliella. We found that only four amino acids (glutamine, histidine, cysteine, and tryptophan) rescue Dunaliella spp. growth in nitrogen depleted media, and that supplementation of these amino acids altered the metabolic profile of Dunaliella cells. Our investigations revealed that histidine is transported across the cell membrane, and that glutamine and cysteine are not transported. Rather, glutamine, cysteine, and tryptophan are degraded in solution by a set of oxidative chemical reactions, releasing ammonium that in turn supports growth. Utilization of biomass-derived amino acids is therefore not a suitable option unless additional amino acid nitrogen uptake is enabled through genetic modifications of these algae.
- Published
- 2017
- Full Text
- View/download PDF
35. Noise and uncertainty in string-duplication systems
- Author
-
Jain, Siddharth, Farnoud (Hassanzadeh), Farzad, Schwartz, Moshe, and Bruck, Jehoshua
- Abstract
Duplication mutations play a critical role in the generation of biological sequences. Simultaneously, they have a deleterious effect on data stored using in-vivo DNA data storage. While duplications have been studied both as a sequence-generation mechanism and in the context of error correction, for simplicity these studies have not taken into account the presence of other types of mutations. In this work, we consider the capacity of duplication mutations in the presence of point-mutation noise, and so quantify the generation power of these mutations. We show that if the number of point mutations is vanishingly small compared to the number of duplication mutations of a constant length, the generation capacity of these mutations is zero. However, if the number of point mutations increases to a constant fraction of the number of duplications, then the capacity is nonzero. Lower and upper bounds for this capacity are also presented. Another problem that we study is concerned with the mismatch between code design and channel in data storage in the DNA of living organisms with respect to duplication mutations. In this context, we consider the uncertainty of such a mismatched coding scheme measured as the maximum number of input codewords that can lead to the same output.
- Published
- 2017
36. Noise and Uncertainty in String-Duplication Systems
- Author
-
Jain, Siddharth, Farnoud (Hassanzadeh), Farzad, Schwartz, Moshe, and Bruck, Jehoshua
- Abstract
Duplication mutations play a critical role in the generation of biological sequences. Simultaneously, they have a deleterious effect on data stored using in-vivo DNA data storage. While duplications have been studied both as a sequence-generation mechanism and in the context of error correction, for simplicity these studies have not taken into account the presence of other types of mutations. In this work, we consider the capacity of duplication mutations in the presence of point-mutation noise, and so quantify the generation power of these mutations. We show that if the number of point mutations is vanishingly small compared to the number of duplication mutations of a constant length, the generation capacity of these mutations is zero. However, if the number of point mutations increases to a constant fraction of the number of duplications, then the capacity is nonzero. Lower and upper bounds for this capacity are also presented. Another problem that we study is concerned with the mismatch between code design and channel in data storage in the DNA of living organisms with respect to duplication mutations. In this context, we consider the uncertainty of such a mismatched coding scheme measured as the maximum number of input codewords that can lead to the same output.
- Published
- 2017
37. Amino Acids Are an Ineffective Fertilizer for Dunaliella spp. Growth
- Author
-
School of Plant and Environmental Sciences, Murphree, Colin A., Dums, Jacob T., Jain, Siddharth K., Zhao, Chengsong, Young, Danielle Y., Khoshnoodi, Nicole, Tikunov, Andrey, Macdonald, Jeffrey, Pilot, Guillaume, and Sederoff, Heike
- Abstract
Autotrophic microalgae are a promising bioproducts platform. However, the fundamental requirements these organisms have for nitrogen fertilizer severely limit the impact and scale of their cultivation. As an alternative to inorganic fertilizers, we investigated the possibility of using amino acids from deconstructed biomass as a nitrogen source in the genus Dunaliella. We found that only four amino acids (glutamine, histidine, cysteine, and tryptophan) rescue Dunaliella spp. growth in nitrogen depleted media, and that supplementation of these amino acids altered the metabolic profile of Dunaliella cells. Our investigations revealed that histidine is transported across the cell membrane, and that glutamine and cysteine are not transported. Rather, glutamine, cysteine, and tryptophan are degraded in solution by a set of oxidative chemical reactions, releasing ammonium that in turn supports growth. Utilization of biomass-derived amino acids is therefore not a suitable option unless additional amino acid nitrogen uptake is enabled through genetic modifications of these algae.
- Published
- 2017
38. Integrated hybrid silicon DFB laser-EAM array using quantum well intermixing
- Author
-
Jain, Siddharth, Sysak, Matthew, Kurczveil, Geza, and Bowers, J E
- Published
- 2011
40. Duplication Distance to the Root for Binary Sequences
- Author
-
Alon, Noga, Bruck, Jehoshua, Farnoud (Hassanzadeh), Farzad, and Jain, Siddharth
- Abstract
We study the tandem duplication distance between binary sequences and their roots. In other words, the quantity of interest is the number of tandem duplication operations of the form x = abc → y = abbc, where x and y are sequences and a, b, and c are their substrings, needed to generate a binary sequence of length n starting from a square-free sequence from the set {0, 1, 01, 10, 010, 101}. This problem is a restricted case of finding the duplication/deduplication distance between two sequences, defined as the minimum number of duplication and deduplication operations required to transform one sequence to the other. We consider both exact and approximate tandem duplications. For exact duplication, denoting the maximum distance to the root of a sequence of length n by f(n), we prove that f(n) = Θ(n). For the case of approximate duplication, where a β-fraction of symbols may be duplicated incorrectly, we show that the maximum distance has a sharp transition from linear in n to logarithmic at β = 1/2. We also study the duplication distance to the root for sequences with a given root and for special classes of sequences, namely, the de Bruijn sequences, the Thue-Morse sequence, and the Fibonacci words. The problem is motivated by genomic tandem duplication mutations and the smallest number of tandem duplication events required to generate a given biological sequence.
- Published
- 2016
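A greedy deduplication sketch for the exact-duplication case above: repeatedly find a square bb (the footprint of a tandem duplication abc → abbc) and delete one copy, counting steps. The greedy step count is only an upper bound on the duplication distance, since greedy choices need not be optimal.

```python
def deduplicate_once(s):
    """Undo one tandem duplication: find a square bb in s and delete one b."""
    n = len(s)
    for length in range(1, n // 2 + 1):
        for i in range(n - 2 * length + 1):
            if s[i:i + length] == s[i + length:i + 2 * length]:
                return s[:i + length] + s[i + 2 * length:], True
    return s, False

def root(s):
    """Deduplicate greedily until square-free; return (root, steps)."""
    steps, changed = 0, True
    while changed:
        s, changed = deduplicate_once(s)
        steps += changed
    return s, steps

r, k = root("0110110")
assert r == "010" and k == 3   # root lands in {0, 1, 01, 10, 010, 101}
```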
41. Duplication-Correcting Codes for Data Storage in the DNA of Living Organisms
- Author
-
Jain, Siddharth, Farnoud (Hassanzadeh), Farzad, Schwartz, Moshe, and Bruck, Jehoshua
- Abstract
The ability to store data in the DNA of a living organism has applications in a variety of areas including synthetic biology and watermarking of patented genetically-modified organisms. Data stored in this medium is subject to errors arising from various mutations, such as point mutations, indels, and tandem duplication, which need to be corrected to maintain data integrity. In this paper, we provide error-correcting codes for errors caused by tandem duplications, which create a copy of a block of the sequence and insert it in a tandem manner, i.e., next to the original. In particular, we present a family of codes for correcting errors due to tandem-duplications of a fixed length and any number of errors. We also study codes for correcting tandem duplications of length up to a given constant k, where we are primarily focused on the cases of k = 2, 3.
- Published
- 2016
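For tandem duplications of fixed length 1, the error model above admits a particularly simple illustration: store strings in which no two adjacent symbols are equal, and decode by collapsing runs of identical symbols, which undoes any number of length-1 duplications. This is only a sketch of the k = 1 intuition, not the paper's general construction for arbitrary fixed k.

```python
import random
from itertools import groupby

def tandem_duplicate(s, i, k):
    """Channel error: copy s[i:i+k] and insert it next to itself."""
    return s[:i + k] + s[i:i + k] + s[i + k:]

def collapse_runs(s):
    """Collapse each run of identical symbols to a single symbol,
    undoing every length-1 tandem duplication."""
    return "".join(ch for ch, _ in groupby(s))

msg = "ACGTGCA"          # codeword: adjacent symbols always differ
random.seed(7)
noisy = msg
for _ in range(10):      # any number of length-1 duplications
    i = random.randrange(len(noisy))
    noisy = tandem_duplicate(noisy, i, 1)
assert collapse_runs(noisy) == msg
```

Length-1 duplications only stretch existing runs, so the run structure, and hence the stored message, survives unchanged.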
42. Duplication-Correcting Codes for Data Storage in the DNA of Living Organisms
- Author
-
Jain, Siddharth, Farnoud (Hassanzadeh), Farzad, Schwartz, Moshe, and Bruck, Jehoshua
- Abstract
The ability to store data in the DNA of a living organism has applications in a variety of areas including synthetic biology and watermarking of patented genetically-modified organisms. Data stored in this medium is subject to errors arising from various mutations, such as point mutations, indels, and tandem duplication, which need to be corrected to maintain data integrity. In this paper, we provide error-correcting codes for errors caused by tandem duplications, which create a copy of a block of the sequence and insert it in a tandem manner, i.e., next to the original. In particular, we present two families of codes for correcting errors due to tandem-duplications of a fixed length; the first family can correct any number of errors while the second corrects a bounded number of errors. We also study codes for correcting tandem duplications of length up to a given constant k, where we are primarily focused on the cases of k = 2, 3.
- Published
- 2016
43. To compare the outcome of local infiltration of corticosteroid and percutaneous release of pulley in treatment of trigger finger
- Author
-
Turkar, Rajesh, and Jain, Siddharth
- Abstract
Background: Stenosing tenosynovitis of the fingers, also known as trigger finger, is one of the common tendinopathies seen in orthopaedic practice. A number of methods have been described for its treatment, ranging from conservative management to surgical procedures. Material & Methods: In this prospective study, all patients presenting with Grade 2 or 3 trigger finger were randomly allocated into two groups. One group received a local corticosteroid injection; in the other group, percutaneous release of the pulley was performed. Patients were then followed and assessed weekly over a period of two months and their progress noted. Results: We studied a total of 42 patients. The majority (71.4%) were females, and the commonest age group was 40-50 years (56.6%). The most common presenting symptom was pain with triggering (52.3%). There was significant improvement in pain in the first two weeks in both groups, with better pain relief in the corticosteroid group initially, especially after the first week. As for triggering, significant improvement was noted in the first week in the corticosteroid group, but there was no difference in the degree of improvement between the two groups after four weeks. The corticosteroid group had a complication rate of 10%, whereas the percutaneous release group had a complication rate of 18.1%. The recurrence rate was comparable in both groups. Conclusion: Trigger finger is a common condition in orthopaedic practice, most often affecting the centrally located fingers of the palm. Local corticosteroid infiltration and percutaneous release of the pulley give comparable results at long-term follow-up; however, corticosteroid injection gives better initial results with fewer complications.
- Published
- 2016
44. To compare the outcome of intramedullary nailing and locking compression plate fixation in treatment of proximal one third tibia fracture: A randomized control trial
- Author
-
Jain, Siddharth, Verma, Rahul, Gaur, Sanjeev, and Gohiya, Ashish
- Abstract
Background: Fractures of the tibial shaft are the most common of long bone fractures. Proximal tibia fractures account for approximately 5% to 11% of all tibial injuries and affect knee function and stability in most cases. Higher rates of malunion and an increased incidence of associated complications have made these fractures particularly problematic. In recent years, due to advancements in technique, proximal tibia plating and multidirectional locked intramedullary nailing have both become widely used treatment modalities for proximal tibial metaphyseal fractures. This study was performed to compare plating and nailing options in proximal tibia extra-articular fractures. Materials and methods: This randomized prospective clinical study was conducted on 62 skeletally mature patients with closed extra-articular fractures of the proximal tibia treated with proximal tibial locking compression plating (PTLCP) or intramedullary nailing (IMN) by expert surgeons at a tertiary trauma center. Results: Postoperative hospital stay (p = 0.043) and postoperative infection rate (p = 0.036) were significantly higher in the PTLCP group than in the IMN group, while the rates of malunion (p = 0.041) and nonunion (p = 0.037) were significantly higher in the IMN group than in the PTLCP group. However, there was no clear advantage of either technique in terms of functional recovery of the knee. Conclusion: The present comparison of IMN and PTLCP for the treatment of proximal one-third tibia fractures showed no clear advantage of either technique; both forms of treatment provide adequate fracture stability. Level of evidence: Level 2, randomized controlled trial.
- Published
- 2016
45. A prospective study on association of serum uric acid level with knee osteoarthritis
- Author
-
Jain, Siddharth, and Jain, Mudit
- Abstract
Background: A number of studies have described a relationship between serum uric acid level and generalized osteoarthritis in the recent past, but studies on the relationship between serum uric acid and knee joint osteoarthritis are limited. The present study is intended to validate the association between serum uric acid levels and osteoarthritis of the knee joint. Methods: This is a prospective study including three hundred forty patients (225 males, 115 females) with clinically diagnosed osteoarthritis of the knee. Radiographs of the affected knees and hands were obtained, along with serum uric acid level, rheumatoid factor and C-reactive protein level. The Kellgren-Lawrence osteoarthritis scale was used for roentgenographic grading of knee osteoarthritis. On the basis of serum uric acid levels, all patients were divided into three groups: Group 1, serum uric acid less than 5 mg/dl; Group 2, serum uric acid between 5.1-7 mg/dl; Group 3, serum uric acid greater than 7 mg/dl. Results: Of 340 patients, 238 (70%) were diagnosed with isolated knee joint osteoarthritis, generalized osteoarthritis was seen in 66 (19.6%) patients, and rheumatoid factor along with C-reactive protein was positive in 36 (10.5%) patients. The association of knee joint osteoarthritis and generalized osteoarthritis with the highest tertile of serum uric acid level was strongly positive (adjusted odds ratio 2.31 and 3.27, respectively). The association between increasing serum uric acid levels and progression of knee osteoarthritis was also positive. Conclusion: This study supports a possible correlation between hyperuricemia and osteoarthritis, and suggests a positive association of knee joint osteoarthritis and generalized osteoarthritis progression with increasing uric acid level.
- Published
- 2016
47. A prospective study on association of serum uric acid level with knee osteoarthritis
- Author
-
Jain, Siddharth and Jain, Mudit
- Abstract
Background: A number of studies have described a relationship between serum uric acid level and generalized osteoarthritis in the recent past, but studies on the relationship between serum uric acid and knee joint osteoarthritis are limited. The present study is intended to validate the association between serum uric acid levels and osteoarthritis of the knee joint. Methods: This is a prospective study including three hundred forty patients (225 males, 115 females) with clinically diagnosed osteoarthritis of the knee. Radiographs of the affected knees and hands were obtained along with each patient's serum uric acid level, rheumatoid factor, and C-reactive protein level. The Kellgren-Lawrence osteoarthritis scale was used for roentgenographic grading of knee osteoarthritis. All patients were divided into three groups according to serum uric acid level: Group 1, serum uric acid less than 5 mg/dl; Group 2, serum uric acid between 5.1-7 mg/dl; Group 3, serum uric acid greater than 7 mg/dl. Results: Of 340 patients, 238 (70%) were diagnosed with isolated knee joint osteoarthritis, generalized osteoarthritis was seen in 66 (19.6%) patients, and rheumatoid factor along with C-reactive protein was positive in 36 (10.5%) patients. The association of knee joint osteoarthritis and generalized osteoarthritis with the highest tertile of serum uric acid level was found to be strongly positive (adjusted odds ratios 2.31 and 3.27, respectively). The association between increasing serum uric acid levels and progression of knee osteoarthritis was also found to be positive. Conclusion: This study supports a possible correlation between hyperuricemia and osteoarthritis; a positive association of knee joint osteoarthritis and generalized osteoarthritis progression with increasing uric acid level is also suggested.
- Published
- 2016
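The tertile grouping described in the methods above can be written as a one-line rule. This is an illustrative sketch only; the abstract leaves the boundary between 5.0 and 5.1 mg/dl unspecified, so the handling of that interval here is an assumption.

```python
def uric_acid_group(level_mg_dl: float) -> int:
    """Assign a patient to a study group by serum uric acid level.
    Group 1: < 5 mg/dl, Group 2: 5.1-7 mg/dl, Group 3: > 7 mg/dl.
    (Levels between 5.0 and 5.1 are assumed to fall in Group 2.)"""
    if level_mg_dl < 5.0:
        return 1
    if level_mg_dl <= 7.0:
        return 2
    return 3
```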
48. To compare the outcome of local infiltration of corticosteroid and percutaneous release of pulley in treatment of trigger finger
- Author
-
Turkar, Rajesh and Jain, Siddharth
- Abstract
Background: Stenosing tenosynovitis of the fingers, also known as trigger finger, is one of the common tendinopathies encountered in orthopaedic practice. A number of methods have been described for its treatment, ranging from conservative management to surgical procedures. Material & Methods: In this prospective study, all patients presenting with grade 2 and 3 trigger finger were randomly allocated into two groups. One group received a local corticosteroid injection, and in the other group, percutaneous release of the pulley was performed as the treatment option. These patients were then followed and assessed weekly over a period of two months and their progress noted. Results: We studied a total of 42 patients. The majority (71.4%) were female. The commonest age group was 40-50 years (56.6%). The most common presenting symptom was pain with triggering (52.3%). There was significant improvement in pain in the first two weeks in both groups, but pain improved more in the corticosteroid group initially, especially after the first week. As for the triggering, there was significant improvement noted in the first week in the corticosteroid group, but there was no difference in the degree of improvement between the groups after four weeks. The corticosteroid group had a complication rate of 10%, whereas the percutaneous release group's complication rate was 18.1%. The recurrence rate was comparable in both groups. Conclusion: Trigger finger is a common condition in orthopaedic practice. The commonly affected fingers are those located centrally on the palm. Local corticosteroid infiltration and percutaneous release of the pulley give comparable results on long follow-up; however, corticosteroid injection gives better results initially with fewer complications.
- Published
- 2016
49. Duplication Distance to the Root for Binary Sequences
- Author
-
Alon, Noga, Bruck, Jehoshua, Farnoud, Farzad, and Jain, Siddharth
- Abstract
We study the tandem duplication distance between binary sequences and their roots. In other words, the quantity of interest is the number of tandem duplication operations of the form $\seq x = \seq a \seq b \seq c \to \seq y = \seq a \seq b \seq b \seq c$, where $\seq x$ and $\seq y$ are sequences and $\seq a$, $\seq b$, and $\seq c$ are their substrings, needed to generate a binary sequence of length $n$ starting from a square-free sequence from the set $\{0,1,01,10,010,101\}$. This problem is a restricted case of finding the duplication/deduplication distance between two sequences, defined as the minimum number of duplication and deduplication operations required to transform one sequence to the other. We consider both exact and approximate tandem duplications. For exact duplication, denoting the maximum distance to the root of a sequence of length $n$ by $f(n)$, we prove that $f(n)=\Theta(n)$. For the case of approximate duplication, where a $\beta$-fraction of symbols may be duplicated incorrectly, we show that the maximum distance has a sharp transition from linear in $n$ to logarithmic at $\beta=1/2$. We also study the duplication distance to the root for sequences with a given root and for special classes of sequences, namely, the de Bruijn sequences, the Thue-Morse sequence, and the Fibonacci words. The problem is motivated by genomic tandem duplication mutations and the smallest number of tandem duplication events required to generate a given biological sequence., Comment: submitted to IEEE Transactions on Information Theory
- Published
- 2016
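The deduplication operation studied in the abstract above (undoing a tandem duplication, abbc → abc, until a square-free root remains) can be illustrated with a short sketch. This is not the authors' algorithm: the greedy removal order below yields only an upper bound on the duplication distance to the root, not the exact minimum.

```python
def dedup_once(s):
    """Find a tandem repeat bb in s and delete one copy of b.
    Returns the shortened string, or None if s is square-free."""
    n = len(s)
    for length in range(1, n // 2 + 1):  # try the shortest repeats first
        for i in range(n - 2 * length + 1):
            if s[i:i + length] == s[i + length:i + 2 * length]:
                return s[:i + length] + s[i + 2 * length:]
    return None

def distance_to_root(s):
    """Greedily deduplicate s to a square-free root, counting steps.
    The count is an upper bound on the true duplication distance."""
    steps = 0
    while (t := dedup_once(s)) is not None:
        s, steps = t, steps + 1
    return s, steps
```

For a binary input the loop always terminates in one of the six square-free roots {0, 1, 01, 10, 010, 101}; for example, "0011" reduces to the root "01" in two steps.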
50. Duplication-Correcting Codes for Data Storage in the DNA of Living Organisms
- Author
-
Jain, Siddharth, Farnoud, Farzad, Schwartz, Moshe, and Bruck, Jehoshua
- Abstract
The ability to store data in the DNA of a living organism has applications in a variety of areas including synthetic biology and watermarking of patented genetically-modified organisms. Data stored in this medium is subject to errors arising from various mutations, such as point mutations, indels, and tandem duplication, which need to be corrected to maintain data integrity. In this paper, we provide error-correcting codes for errors caused by tandem duplications, which create a copy of a block of the sequence and insert it in a tandem manner, i.e., next to the original. In particular, we present two families of codes for correcting errors due to tandem duplications of a fixed length: the first family can correct any number of errors, while the second corrects a bounded number of errors. We also study codes for correcting tandem duplications of length up to a given constant $k$, where we are primarily focused on the cases of $k=2,3$. Finally, we provide a full classification of the sets of lengths allowed in tandem duplication that result in a unique root for all sequences., Comment: Submitted to IEEE Transactions on Information Theory
- Published
- 2016
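To make the error model in the abstract above concrete, the sketch below shows a fixed-length tandem duplication and its inverse. This is an illustration of the channel only, not the code constructions from the paper; removing the leftmost repeated block is a heuristic, whereas the paper's codes guarantee correct decoding.

```python
def tandem_duplicate(s, i, k):
    """Apply one tandem duplication: the length-k block starting at
    position i is copied and inserted immediately after itself."""
    assert 0 <= i and i + k <= len(s)
    return s[:i + k] + s[i:i + k] + s[i + k:]

def remove_duplication(s, k):
    """Undo one length-k tandem duplication, if present, by deleting
    the second copy of the leftmost repeated block (a heuristic)."""
    for i in range(len(s) - 2 * k + 1):
        if s[i:i + k] == s[i + k:i + 2 * k]:
            return s[:i + k] + s[i + 2 * k:]
    return s
```

For example, duplicating the block "11" at position 1 of "0110" yields "011110", and removing the leftmost length-2 repeat recovers "0110".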