45 results for "Jenia Jitsev"
Search Results
2. Experience-driven formation of parts-based representations in a model of layered visual memory
- Author
-
Jenia Jitsev and Christoph von der Malsburg
- Subjects
self-organization, cortical column, visual memory, competitive learning, unsupervised learning, activity homeostasis, Neurosciences. Biological psychiatry. Neuropsychiatry, RC321-571
- Abstract
Growing neuropsychological and neurophysiological evidence suggests that the visual cortex uses parts-based representations to encode, store and retrieve relevant objects. In such a scheme, objects are represented as a set of spatially distributed local features, or parts, arranged in stereotypical fashion. To encode the local appearance and to represent the relations between the constituent parts, there has to be an appropriate memory structure formed by previous experience with visual objects. Here, we propose a model of how a hierarchical memory structure supporting efficient storage and rapid recall of parts-based representations can be established by an experience-driven process of self-organization. The process is based on the collaboration of slow bidirectional synaptic plasticity and homeostatic unit activity regulation, both running on top of fast activity dynamics with winner-take-all character modulated by an oscillatory rhythm. These neural mechanisms lay down the basis for cooperation and competition between the distributed units and their synaptic connections. Choosing human face recognition as a test task, we show that, under the condition of open-ended, unsupervised incremental learning, the system is able to form memory traces for individual faces in a parts-based fashion. On a lower memory layer the synaptic structure is developed to represent local facial features and their interrelations, while the identities of different persons are captured explicitly on a higher layer. An additional property of the resulting representations is the sparseness of both the activity during the recall and the synaptic patterns comprising the memory traces.
- Published
- 2009
- Full Text
- View/download PDF
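To make the mechanisms in the abstract above concrete, here is a minimal sketch (not the paper's model): one oscillation-modulated cycle consisting of a hard winner-take-all pass over the unit drives, a slow Hebbian weight update for the winner, and homeostatic regulation pushing every unit towards a target activity. All sizes and rates below are illustrative assumptions.

```python
# Minimal sketch, NOT the paper's model: fast WTA dynamics with slow Hebbian
# plasticity and homeostatic excitability regulation. Sizes/rates are made up.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_units = 64, 16
W = rng.random((n_units, n_in)) * 0.1   # bottom-up synaptic weights
theta = np.zeros(n_units)               # per-unit excitability offsets
target = 1.0 / n_units                  # desired mean activation per cycle
eta_w, eta_h = 0.05, 0.01               # plasticity and homeostasis rates

def wta_cycle(x):
    """One fast rhythm cycle: the most strongly driven unit wins."""
    y = np.zeros(n_units)
    y[np.argmax(W @ x + theta)] = 1.0
    return y

for _ in range(1000):
    x = rng.random(n_in)                # stand-in for a local feature input
    y = wta_cycle(x)
    W += eta_w * np.outer(y, x)         # Hebbian update of the winner's row
    W /= np.linalg.norm(W, axis=1, keepdims=True)   # bound weight growth
    theta += eta_h * (target - y)       # over-active units lose excitability
```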
3. Reproducible Scaling Laws for Contrastive Language-Image Learning.
- Author
-
Mehdi Cherti, Romain Beaumont, Ross Wightman, Mitchell Wortsman, Gabriel Ilharco, Cade Gordon, Christoph Schuhmann, Ludwig Schmidt, and Jenia Jitsev
- Published
- 2023
- Full Text
- View/download PDF
4. Effect of pre-training scale on intra- and inter-domain, full and few-shot transfer learning for natural and X-Ray chest images.
- Author
-
Mehdi Cherti and Jenia Jitsev
- Published
- 2022
- Full Text
- View/download PDF
5. DataComp: In search of the next generation of multimodal datasets.
- Author
-
Samir Yitzhak Gadre, Gabriel Ilharco, Alex Fang, Jonathan Hayase, Georgios Smyrnis, Thao Nguyen, Ryan Marten, Mitchell Wortsman, Dhruba Ghosh, Jieyu Zhang, Eyal Orgad, Rahim Entezari, Giannis Daras, Sarah M. Pratt, Vivek Ramanujan, Yonatan Bitton, Kalyani Marathe, Stephen Mussmann, Richard Vencu, Mehdi Cherti, Ranjay Krishna, Pang Wei Koh, Olga Saukh, Alexander J. Ratner, Shuran Song, Hannaneh Hajishirzi, Ali Farhadi, Romain Beaumont, Sewoong Oh, Alex Dimakis, Jenia Jitsev, Yair Carmon, Vaishaal Shankar, and Ludwig Schmidt
- Published
- 2023
6. JUWELS Booster - A Supercomputer for Large-Scale AI Research.
- Author
-
Stefan Kesselheim, Andreas Herten, Kai Krajsek, Jan Ebert, Jenia Jitsev, Mehdi Cherti, Michael Langguth, Bing Gong, Scarlet Stadtler, Amirpasha Mozaffari, Gabriele Cavallaro, Rocco Sedona, Alexander Schug, Alexandre Strube, Roshni Kamath, Martin G. Schultz, Morris Riedel, and Thomas Lippert
- Published
- 2021
- Full Text
- View/download PDF
7. Obstacle Tower Without Human Demonstrations: How Far a Deep Feed-Forward Network Goes with Reinforcement Learning.
- Author
-
Marco Pleines, Jenia Jitsev, Mike Preuss, and Frank Zimmer
- Published
- 2020
- Full Text
- View/download PDF
8. Prediction of Acoustic Fields Using a Lattice-Boltzmann Method and Deep Learning.
- Author
-
Mario Rüttgers, Seong-Ryong Koh, Jenia Jitsev, Wolfgang Schröder, and Andreas Lintermann
- Published
- 2020
- Full Text
- View/download PDF
9. Super-Resolution of Large Volumes of Sentinel-2 Images with High Performance Distributed Deep Learning.
- Author
-
Run Zhang, Gabriele Cavallaro, and Jenia Jitsev
- Published
- 2020
- Full Text
- View/download PDF
10. Scaling Up a Multispectral Resnet-50 to 128 GPUs.
- Author
-
Rocco Sedona, Gabriele Cavallaro, Jenia Jitsev, Alexandre Strube, Morris Riedel, and Matthias Book
- Published
- 2020
- Full Text
- View/download PDF
11. LAION-5B: An open large-scale dataset for training next generation image-text models.
- Author
-
Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wightman, Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, Patrick Schramowski, Srivatsa Kundurthy, Katherine Crowson, Ludwig Schmidt, Robert Kaczmarczyk, and Jenia Jitsev
- Published
- 2022
12. Towards Prediction of Turbulent Flows at High Reynolds Numbers Using High Performance Computing Data and Deep Learning.
- Author
-
Mathis Bode, Michael Gauding, Jens Henrik Göbbert, Baohao Liao, Jenia Jitsev, and Heinz Pitsch
- Published
- 2018
- Full Text
- View/download PDF
13. Self-generated Off-line Memory Reprocessing Strongly Improves Generalization in a Hierarchical Recurrent Neural Network.
- Author
-
Jenia Jitsev
- Published
- 2014
- Full Text
- View/download PDF
14. Learning from Delayed Reward and Punishment in a Spiking Neural Network Model of Basal Ganglia with Opposing D1/D2 Plasticity.
- Author
-
Jenia Jitsev, Nobi Abraham, Abigail Morrison, and Marc Tittgemeyer
- Published
- 2012
- Full Text
- View/download PDF
15. Learning from positive and negative rewards in a spiking neural network model of basal ganglia.
- Author
-
Jenia Jitsev, Abigail Morrison, and Marc Tittgemeyer
- Published
- 2012
- Full Text
- View/download PDF
16. Information-Theoretic Connectivity-Based Cortex Parcellation.
- Author
-
Nico S. Gorbach, Silvan Siep, Jenia Jitsev, Corina Melzer, and Marc Tittgemeyer
- Published
- 2011
- Full Text
- View/download PDF
17. A Gabor Wavelet Pyramid-Based Object Detection Algorithm.
- Author
-
Yasuomi D. Sato, Jenia Jitsev, Jörg Bornschein, Daniela Pamplona, Christian Keck, and Christoph von der Malsburg
- Published
- 2011
- Full Text
- View/download PDF
18. Visual Object Detection by Specifying the Scale and Rotation Transformations.
- Author
-
Yasuomi D. Sato, Jenia Jitsev, and Christoph von der Malsburg
- Published
- 2010
- Full Text
- View/download PDF
19. Dynamic link models for global decision making with binding-by-synchrony.
- Author
-
Yasuomi D. Sato, Jenia Jitsev, Thomas Burwick, and Christoph von der Malsburg
- Published
- 2010
- Full Text
- View/download PDF
20. Off-line memory reprocessing following on-line unsupervised learning strongly improves recognition performance in a hierarchical visual memory.
- Author
-
Jenia Jitsev and Christoph von der Malsburg
- Published
- 2010
- Full Text
- View/download PDF
21. A Visual Object Recognition System Invariant to Scale and Rotation.
- Author
-
Yasuomi D. Sato, Jenia Jitsev, and Christoph von der Malsburg
- Published
- 2008
- Full Text
- View/download PDF
22. Deep Learning for the Automation of Particle Analysis in Catalyst Layers for Polymer Electrolyte Fuel Cells
- Author
-
Mohammad Javad Eslamibidgoli, Kourosh Malek, Mariah Batool, Jasna Jankovic, André Colliard-Granero, Jenia Jitsev, and Michael Eikerling
- Subjects
Computer science, business.industry, Deep learning, Medical image computing, Preprocessor, Particle, Segmentation, Image processing, Particle size, Artificial intelligence, business, Convolutional neural network, Computational science
- Abstract
The rapidly growing use of imaging infrastructure in the energy materials domain drives significant data accumulation in terms of both the amount and the complexity of the data. The applications of routine techniques for image processing in materials research are often ad hoc, indiscriminate, and empirical, which renders the crucial task of obtaining reliable metrics for quantifications obscure. Moreover, these techniques are expensive, slow, and often involve several preprocessing steps. This paper presents a novel deep learning-based approach for the high-throughput analysis of particle size distributions from transmission electron microscopy (TEM) images of carbon-supported catalysts for polymer electrolyte fuel cells. Our approach involves training an instance segmentation model, called StarDist [Schmidt et al. Medical Image Computing and Computer-Assisted Intervention – MICCAI 2018, Lecture Notes in Computer Science, vol 11071. Springer, Cham], which resolves the main challenge in the pixel-wise localization of nanoparticles in TEM images: the overlapping particles. The segmentation maps outperform models reported in the literature, and the results of the particle size analyses agree well with manual particle size measurements, albeit at a significantly lower cost.
- Published
- 2021
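As a pointer for readers who want to try the kind of workflow described in entry 22, the sketch below applies a StarDist 2D model to a TEM image and extracts a particle size distribution. The pretrained model name and the input path are placeholders; the paper trains its own StarDist model on annotated catalyst micrographs.

```python
# Hedged sketch of a StarDist-based particle sizing workflow (placeholders
# throughout; not the paper's trained model or data).
import numpy as np
from tifffile import imread
from csbdeep.utils import normalize
from stardist.models import StarDist2D
from skimage.measure import regionprops

img = imread("tem_catalyst.tif")                    # placeholder input image
model = StarDist2D.from_pretrained("2D_versatile_fluo")  # stand-in model
labels, _ = model.predict_instances(normalize(img))

# Equivalent circular diameter per segmented particle, in pixels
diameters = [r.equivalent_diameter for r in regionprops(labels)]
print(f"{labels.max()} particles, mean diameter {np.mean(diameters):.1f} px")
```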
23. Convolutional neural networks for high throughput screening of catalyst layer inks for polymer electrolyte fuel cells
- Author
-
Mohammad J. Eslamibidgoli, Fabian P. Tipp, Jenia Jitsev, Jasna Jankovic, Michael H. Eikerling, and Kourosh Malek
- Abstract
The performance of polymer electrolyte fuel cells decisively depends on the structure and processes in membrane electrode assemblies and their components, particularly the catalyst layers. The structural building blocks of catalyst layers are formed during the processing and application of catalyst inks. Accelerating the structural characterization at the ink stage is thus crucial to expedite further advances in catalyst layer design and fabrication. In this context, deep learning algorithms based on deep convolutional neural networks (ConvNets) can automate the processing of the complex and multi-scale structural features of ink imaging data. This article presents the first application of ConvNets for the high throughput screening of transmission electron microscopy images at the ink stage. Results indicate the importance of model pre-training and data augmentation that works on multiple scales in training robust and accurate classification pipelines.
- Published
- 2021
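A hedged sketch of the classification setup entry 23 outlines: a pretrained ConvNet with a replaced classification head, and augmentation that crops at multiple scales. The class count and transform parameters are assumptions, not the paper's configuration.

```python
# Hedged sketch: transfer learning with multi-scale augmentation for image
# classification. Class count and transform settings are assumptions.
import torch.nn as nn
from torchvision import models, transforms

n_classes = 4                                     # assumed number of ink classes
train_tf = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.2, 1.0)),  # multi-scale crops
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Pre-trained backbone (the importance of pre-training is a finding of the
# paper); only the head is replaced for the new task.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, n_classes)
```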
24. Using physics-informed enhanced super-resolution generative adversarial networks for subfilter modeling in turbulent reactive flows
- Author
-
Zeyu Lian, Mathis Bode, Michael Gauding, Dominik Denker, Marco Davidovic, Jenia Jitsev, Konstantin Kleinheinz, and Heinz Pitsch
- Subjects
Speedup, Artificial neural network, Generalization, business.industry, Mechanical Engineering, General Chemical Engineering, Deep learning, Context (language use), 02 engineering and technology, 01 natural sciences, 010305 fluids & plasmas, 020303 mechanical engineering & transports, 0203 mechanical engineering, Computer engineering, Robustness (computer science), 0103 physical sciences, ddc:660, A priori and a posteriori, Artificial intelligence, Physical and Theoretical Chemistry, Graphics, business
- Abstract
Proceedings of the Combustion Institute 38(2), 2617-2625 (2021). doi:10.1016/j.proci.2020.06.022. Published by Elsevier, Amsterdam [u.a.]
- Published
- 2021
- Full Text
- View/download PDF
25. Scaling Up a Multispectral Resnet-50 to 128 GPUs
- Author
-
Alexandre Strube, Morris Riedel, Gabriele Cavallaro, Jenia Jitsev, Matthias Book, and Rocco Sedona
- Subjects
Earth observation, Speedup, Artificial neural network, Computer science, business.industry, Deep learning, Multispectral image, Volume (computing), 02 engineering and technology, 010501 environmental sciences, 01 natural sciences, Convolutional neural network, Residual neural network, Data modeling, Computer engineering, 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering, ddc:610, Artificial intelligence, business, 0105 earth and related environmental sciences
- Abstract
As in other scientific domains, Deep Learning (DL) holds great promise for meeting the challenging needs of Remote Sensing (RS) applications. However, the increase in volume, variety and complexity of acquisitions that are carried out on a daily basis by Earth Observation (EO) missions generates new processing and storage challenges within operational processing pipelines. The aim of this work is to show that High-Performance Computing (HPC) systems can speed up the training time of Convolutional Neural Networks (CNNs). Particular attention is paid to the monitoring of the classification accuracy, which usually degrades when using large batch sizes. The experimental results of this work show that the training of the model scales up to a batch size of 8,000 while obtaining classification accuracy in line with that of smaller batch sizes.
- Published
- 2020
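The large-batch regime entry 25 investigates is commonly handled with the linear learning-rate scaling rule plus distributed gradient averaging. Below is a hedged sketch using Horovod as one possible backend; the base learning rate, per-GPU batch size, and class count are illustrative assumptions, and the paper's exact training stack is not reproduced here.

```python
# Hedged sketch (not the paper's code): linear LR scaling with the number of
# workers, plus Horovod gradient averaging. All hyperparameters are made up.
import horovod.torch as hvd
import torch
from torchvision import models

hvd.init()
torch.cuda.set_device(hvd.local_rank())

model = models.resnet50(num_classes=19).cuda()   # e.g. BigEarthNet-style labels
base_lr, per_gpu_batch = 0.1, 64
# Linear scaling rule: global batch = per_gpu_batch * hvd.size(), so scale LR
opt = torch.optim.SGD(model.parameters(), lr=base_lr * hvd.size(), momentum=0.9)
opt = hvd.DistributedOptimizer(opt, named_parameters=model.named_parameters())
hvd.broadcast_parameters(model.state_dict(), root_rank=0)
```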
26. ROS-MUSIC Toolchain for Spiking Neural Network Simulations in a Robotic Environment
- Author
-
Jenia Jitsev, Abigail Morrison, Philipp Weidel, Renato Duarte, and Karolína Korvasová
- Subjects
Spiking neural network, Artificial neural network, Quantitative Biology::Neurons and Cognition, business.industry, Computer science, General Neuroscience, Interface (computing), Toolchain, Computer Science::Robotics, Cellular and Molecular Neuroscience, Middleware, Poster Presentation, Robot, Reinforcement learning, Artificial intelligence, business, Nervous system network models
- Abstract
Studying a functional, biologically plausible neural network that performs a particular task is highly relevant for progress in both neuroscience and machine learning. Most tasks used to test the function of a simulated neural network are still very artificial and thus too narrow, providing only little insight into the true value of a particular neural network architecture under study. For example, many models of reinforcement learning in the brain rely on a discrete set of environmental states and actions [1]. In order to move closer towards more realistic models, modeling studies have to be conducted in more realistic environments that provide complex sensory input about the states. A way to achieve this is to provide an interface between a robotic and a neural network simulation, such that a neural network controller gains access to a realistic agent which is acting in a complex environment that can be flexibly designed by the experimentalist. To create such an interface, we present a toolchain, consisting of already existing and robust tools, which forms the missing link between robotics and neuroscience with the goal of connecting robotic simulators with neural simulators. This toolchain is a generic solution and is able to combine various robotic simulators with various neural simulators by connecting the Robot Operating System (ROS) [2] with the Multi-Simulation Coordinator (MUSIC) [3]. ROS is the most widely used middleware in the robotics community, with interfaces for robotic simulators like Gazebo, Morse, Webots, etc., and additionally allows users to specify their own robots and sensors in great detail with the Unified Robot Description Format (URDF). MUSIC is a communicator between the major, state-of-the-art neural simulators: NEST, Moose and NEURON. By implementing an interface between ROS and MUSIC, our toolchain combines two powerful middlewares and is therefore a multi-purpose generic solution. One main purpose is the translation of continuous sensory data, obtained from the sensors of a virtual robot, into spiking data which is passed to a neural simulator of choice. The translation from continuous data to spiking data is performed using the Neural Engineering Framework (NEF) proposed by Eliasmith & Anderson [4]. By sending motor commands from the neural simulator back to the robotic simulator, the interface forms a closed loop between the virtual robot and its spiking neural network controller. To demonstrate the functionality of the toolchain and the interplay between all its different components, we implemented one of the vehicles described by Braitenberg [5] using the robotic simulator Gazebo and the neural simulator NEST. In future work, we aim to create a testbench, consisting of various environments for reinforcement learning algorithms, to provide a validation tool for the functionality of biologically motivated models of learning.
- Published
- 2020
- Full Text
- View/download PDF
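The toolchain's central step, translating continuous sensor readings into spike trains for a neural simulator, can be caricatured as follows. The real toolchain uses the Neural Engineering Framework; this sketch substitutes a plain rate code with Poisson spike sampling, and all parameters are illustrative assumptions.

```python
# Hedged, simplified stand-in for NEF-style encoding: map a scalar sensor
# value onto a population of rate-coded neurons and sample Poisson spikes.
import numpy as np

rng = np.random.default_rng(1)
n_neurons, dt, max_rate = 50, 0.001, 100.0          # neurons, 1 ms step, Hz
encoders = rng.choice([-1.0, 1.0], size=n_neurons)  # preferred directions
offsets = rng.uniform(-1.0, 1.0, size=n_neurons)    # per-neuron thresholds

def encode(sensor_value, steps=100):
    """Map a scalar in [-1, 1] to a (steps, n_neurons) binary spike train."""
    drive = np.maximum(0.0, encoders * sensor_value - offsets)
    rates = max_rate * np.minimum(drive, 1.0)
    return rng.random((steps, n_neurons)) < rates * dt

spikes = encode(0.7)
print("mean population rate:", spikes.mean() / dt, "Hz")
```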
27. Prediction of Acoustic Fields Using a Lattice-Boltzmann Method and Deep Learning
- Author
-
Jenia Jitsev, Seong-Ryong Koh, Andreas Lintermann, Mario Rüttgers, and Wolfgang Schröder
- Subjects
Artificial neural network, business.industry, Computer science, Deep learning, Lattice Boltzmann methods, Function (mathematics), Computational fluid dynamics, 01 natural sciences, Article, Square (algebra), 010305 fluids & plasmas, 010101 applied mathematics, Lattice-boltzmann method, Deep convolutional neural networks, 0103 physical sciences, Aeroacoustics, Aeroacoustic predictions, Artificial intelligence, Boundary value problem, 0101 mathematics, Sound pressure, business, Algorithm
- Abstract
Using traditional computational fluid dynamics and aeroacoustics methods, the accurate simulation of aeroacoustic sources requires high compute resources to resolve all necessary physical phenomena. In contrast, once trained, artificial neural networks such as deep encoder-decoder convolutional networks allow aeroacoustic predictions at lower cost and, depending on the quality of the employed network, also at high accuracy. An architecture for such a neural network is developed to predict the sound pressure level in a 2D square domain. It is trained on numerical results from up to 20,000 GPU-based lattice-Boltzmann simulations that include randomly distributed rectangular and circular objects, and monopole sources. The types of boundary conditions, the monopole locations, and the cell distances for objects and monopoles serve as input to the network. Parameters are studied to tune the predictions and to increase their accuracy. The complexity of the setup is successively increased along three cases, and the impact of the number of feature maps, the type of loss function, and the amount of training data on the prediction accuracy is investigated. An optimal choice of the parameters leads to network-predicted results that are in good agreement with the simulated findings. This is corroborated by negligible differences in the sound pressure level between the simulated and the network-predicted results along characteristic lines, and by small mean errors.
- Published
- 2020
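A minimal sketch of the encoder-decoder architecture family entry 27 describes: a network mapping a multi-channel 2D encoding of objects, monopole locations, and boundary conditions to a predicted sound-pressure-level field. Channel counts and layer sizes are illustrative assumptions, not the paper's architecture.

```python
# Hedged sketch of an encoder-decoder ConvNet for field prediction.
# All channel counts and layer sizes are made-up placeholders.
import torch
import torch.nn as nn

class EncoderDecoder(nn.Module):
    def __init__(self, in_channels=3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),  # SPL map
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

net = EncoderDecoder()
spl = net(torch.randn(1, 3, 64, 64))     # -> (1, 1, 64, 64) predicted field
```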
28. ROS-MUSIC Toolchain for Spiking Neural Network Simulations in a Robotic Environment - Presented @ CNS2015
- Author
-
Philipp Weidel, Renato Duarte, Karolína Korvasová, Jenia Jitsev, and Abigail Morrison
- Subjects
Computer Science::Robotics, Quantitative Biology::Neurons and Cognition
- Abstract
Studying a functional, biologically plausible neural network that performs a particular task is highly relevant for progress in both neuroscience and machine learning. Most tasks used to test the function of a simulated neural network are still very artificial and thus too narrow, providing only little insight into the true value of a particular neural network architecture under study. For example, many models of reinforcement learning in the brain rely on a discrete set of environmental states and actions. In order to move closer towards more realistic models, modeling studies have to be conducted in more realistic environments that provide complex sensory input about the states. A way to achieve this is to provide an interface between a robotic and a neural network simulation, such that a neural network controller gains access to a realistic agent which is acting in a complex environment that can be flexibly designed by the experimentalist. To create such an interface, we present a toolchain, consisting of already existing and robust tools, which forms the missing link between robotics and neuroscience with the goal of connecting robotic simulators with neural simulators. This toolchain is a generic solution and is able to combine various robotic simulators with various neural simulators by connecting the Robot Operating System (ROS) with the Multi-Simulation Coordinator (MUSIC). ROS is the most widely used middleware in the robotics community, with interfaces for robotic simulators like Gazebo, Morse, Webots, etc., and additionally allows users to specify their own robots and sensors in great detail with the Unified Robot Description Format (URDF). MUSIC is a communicator between the major, state-of-the-art neural simulators: NEST, Moose and NEURON. By implementing an interface between ROS and MUSIC, our toolchain combines two powerful middlewares and is therefore a multi-purpose generic solution. One main purpose is the translation of continuous sensory data, obtained from the sensors of a virtual robot, into spiking data which is passed to a neural simulator of choice. The translation from continuous data to spiking data is performed using the Neural Engineering Framework (NEF) proposed by Eliasmith & Anderson. By sending motor commands from the neural simulator back to the robotic simulator, the interface forms a closed loop between the virtual robot and its spiking neural network controller. To demonstrate the functionality of the toolchain and the interplay between all its different components, we implemented one of the vehicles described by Braitenberg using the robotic simulator Gazebo and the neural simulator NEST. In future work, we aim to create a testbench, consisting of various environments for reinforcement learning algorithms, to provide a validation tool for the functionality of biologically motivated models of learning.
- Published
- 2020
- Full Text
- View/download PDF
29. Remote Sensing Big Data Classification with High Performance Distributed Deep Learning
- Author
-
Alexandre Strube, Jenia Jitsev, Jon Atli Benediktsson, Gabriele Cavallaro, Rocco Sedona, and Morris Riedel
- Subjects
Earth observation, Computer science, Remote sensing application, Big data, 0211 other engineering and technologies, convolutional neural network, 02 engineering and technology, Convolutional neural network, distributed deep learning, 0202 electrical engineering, electronic engineering, information engineering, Graphics, 021101 geological & geomatics engineering, Remote sensing, business.industry, Deep learning, sentinel-2, Supercomputer, high performance computing, Statistical classification, classification, General Earth and Planetary Sciences, 020201 artificial intelligence & image processing, Artificial intelligence, ddc:620, business, residual neural network
- Abstract
High-Performance Computing (HPC) has recently been attracting more attention in remote sensing applications due to the challenges posed by the increased amount of open data that are produced daily by Earth Observation (EO) programs. The unique parallel computing environments and programming techniques that are integrated in HPC systems are able to solve large-scale problems such as the training of classification algorithms with large amounts of Remote Sensing (RS) data. This paper shows that the training of state-of-the-art deep Convolutional Neural Networks (CNNs) can be efficiently performed in distributed fashion using parallel implementation techniques on HPC machines containing a large number of Graphics Processing Units (GPUs). The experimental results confirm that distributed training can drastically reduce the amount of time needed to perform full training, resulting in near linear scaling without loss of test accuracy.
- Published
- 2019
- Full Text
- View/download PDF
30. Towards Prediction of Turbulent Flows at High Reynolds Numbers Using High Performance Computing Data and Deep Learning
- Author
-
Michael Gauding, Jens Henrik Göbbert, Heinz Pitsch, Baohao Liao, Jenia Jitsev, and Mathis Bode
- Subjects
business.industry, Turbulence, Computer science, Deep learning, Direct numerical simulation, Reynolds number, Context (language use), Function (mathematics), Supercomputer, 01 natural sciences, 010305 fluids & plasmas, Physics::Fluid Dynamics, symbols.namesake, 0103 physical sciences, symbols, Statistical physics, Artificial intelligence, 010306 general physics, business
- Abstract
In this paper, deep learning (DL) methods are evaluated in the context of turbulent flows. Various generative adversarial networks (GANs) are discussed with respect to their suitability for understanding and modeling turbulence. Wasserstein GANs (WGANs) are then chosen to generate small-scale turbulence. Highly resolved direct numerical simulation (DNS) turbulent data is used for training the WGANs and the effect of network parameters, such as learning rate and loss function, is studied. Qualitatively good agreement between DNS input data and generated turbulent structures is shown. A quantitative statistical assessment of the predicted turbulent fields is performed.
- Published
- 2018
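The Wasserstein GAN objective chosen in entry 30 can be summarized in a few lines. The sketch below shows the critic update of the original WGAN with weight clipping; `critic`, `gen`, and the optimizer are stand-ins, and the paper's exact variant and data handling are not reproduced.

```python
# Hedged sketch of one WGAN critic step (original formulation with weight
# clipping). `critic`, `gen`, `real`, `z`, and `opt_c` are caller-supplied.
import torch

def critic_step(critic, gen, real, z, opt_c, clip=0.01):
    opt_c.zero_grad()
    # Critic maximizes E[f(real)] - E[f(fake)], so minimize the negative
    loss = critic(gen(z).detach()).mean() - critic(real).mean()
    loss.backward()
    opt_c.step()
    for p in critic.parameters():        # crude 1-Lipschitz enforcement
        p.data.clamp_(-clip, clip)
    return loss.item()
```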
31. Corticostriatal circuit mechanisms of value-based action selection: Implementation of reinforcement learning algorithms and beyond
- Author
-
Kenji Morita, Jenia Jitsev, and Abigail Morrison
- Subjects
0301 basic medicine, Computer science, Models, Neurological, Internal model, Motor Activity, Action selection, Choice Behavior, 03 medical and health sciences, Behavioral Neuroscience, 0302 clinical medicine, Lateral inhibition, Neural Pathways, Reinforcement learning, Animals, Humans, Selection (genetic algorithm), Cerebral Cortex, Probabilistic logic, Winner-take-all, Corpus Striatum, Complex dynamics, 030104 developmental biology, Algorithm, Reinforcement, Psychology, 030217 neurology & neurosurgery, Algorithms
- Abstract
Value-based action selection has been suggested to be realized in the corticostriatal local circuits through competition among neural populations. In this article, we review theoretical and experimental studies that have constructed and verified this notion, and provide new perspectives on how the local-circuit selection mechanisms implement reinforcement learning (RL) algorithms and computations beyond them. The striatal neurons are mostly inhibitory, and lateral inhibition among them has been classically proposed to realize "Winner-Take-All (WTA)" selection of the maximum-valued action (i.e., 'max' operation). Although this view has been challenged by the revealed weakness, sparseness, and asymmetry of lateral inhibition, which suggest more complex dynamics, WTA-like competition could still occur on short time scales. Unlike the striatal circuit, the cortical circuit contains recurrent excitation, which may enable retention or temporal integration of information and probabilistic "soft-max" selection. The striatal "max" circuit and the cortical "soft-max" circuit might co-implement an RL algorithm called Q-learning; the cortical circuit might also similarly serve for other algorithms such as SARSA. In these implementations, the cortical circuit presumably sustains activity representing the executed action, which negatively impacts dopamine neurons so that they can calculate reward-prediction-error. Regarding the suggested more complex dynamics of striatal, as well as cortical, circuits on long time scales, which could be viewed as a sequence of short WTA fragments, computational roles remain open: such a sequence might represent (1) sequential state-action-state transitions, constituting replay or simulation of the internal model, (2) a single state/action by the whole trajectory, or (3) probabilistic sampling of state/action.
- Published
- 2015
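Entry 31 maps striatal "max" circuits onto Q-learning and cortical "soft-max" circuits onto on-policy algorithms such as SARSA. The sketch below spells out the two tabular update rules and a soft-max policy; it illustrates the algorithms themselves, not the circuit model.

```python
# Standard tabular RL updates contrasted in the review: Q-learning bootstraps
# from the maximum-valued next action ('max'), SARSA from the action actually
# taken (on-policy). Hyperparameters are conventional defaults.
import numpy as np

rng = np.random.default_rng(0)

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    Q[s, a] += alpha * (r + gamma * Q[s_next, a_next] - Q[s, a])

def softmax_policy(Q, s, beta=2.0):
    """Probabilistic 'soft-max' action selection over the Q-values of s."""
    p = np.exp(beta * (Q[s] - Q[s].max()))   # subtract max for stability
    return rng.choice(len(p), p=p / p.sum())
```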
32. Fast rhythm cycles as atomic fragments of cortical processing and learning
- Author
-
Jenia Jitsev
- Subjects
education.field_of_study, Computer science, Brain activity and meditation, business.industry, General Neuroscience, Computation, Population, Information processing, Pattern recognition, Engram, Mixture model, Cellular and Molecular Neuroscience, Poster Presentation, Unsupervised learning, Artificial intelligence, education, business, Network model
- Abstract
Neuronal rhythms of different frequencies are ubiquitous in brain activity. These rhythms are thought to be not just a mere epiphenomenon of neural dynamics, but to play an important role in the information processing performed by brain networks. However, the character of their functional involvement still remains largely elusive. Fast brain rhythms in the gamma frequency range of 40-100 Hz, known to modulate both neuronal activity and synaptic plasticity, have often been proposed to provide a reference frame for operations performed by cortical microcircuits [1,2]. More precisely, it was hypothesized that a flexible winner-take-all (WTA) computation is performed in a cycle of gamma oscillation by local fine-scale subnetworks that contain tightly coupled excitatory pyramidal neurons residing in cortical layers II-III. Such an operation selects and amplifies a small population of pyramidal cells based on the incoming afferent input while suppressing the rest, rapidly generating a sparse code that represents the current stimulus in the course of a single gamma cycle. This hypothesis leaves open whether learning and memory trace formation may likewise rely on fast rhythm cycles as discrete atomic fragments of ongoing processing. Here we use a hierarchical recurrent network that employs the gamma cycle as an atomic fragment for unsupervised learning of object identity from natural image input [3]. Unsupervised learning runs on top of a fast winner-take-all (WTA)-like computation performed within a single cycle of the ongoing fast rhythm. Given natural face images, the network is able to create memory traces containing reusable facial visual elements that are linked in an associative, generative manner, via simultaneously established bottom-up, lateral and top-down connectivity, into a global person face identity. If a face image of a memorized person is presented, the network is able to rapidly recall its identity and gender in a single gamma cycle. The operation performed within a single cycle may be interpreted as probabilistic inference of the latent causes that create the input and estimation of the parameters of a mixture model with the latent causes as its components. This computation has the character of an expectation-maximization procedure, where the expectation part is carried out by the WTA-like computation and the maximization involves plasticity mechanisms that change synaptic strength and neural excitability over many repetitive cycles. Even if decoupled from external input, the network can self-generate activity in an off-line regime, replaying the memory content in a sequence of gamma cycles and improving its organization so as to generalize better, once back in the input-driven regime, over novel face images not presented before [4]. Thus, the presented network model provides an interpretation of the gamma cycle as an elementary fragment of ongoing processing and learning, where each cycle embeds a winner-take-all-like computation that supports memory trace formation and maintenance in the hierarchical recurrent network pathways of the cortex.
- Published
- 2014
33. Learning from positive and negative rewards in a spiking neural network model of basal ganglia
- Author
-
Abigail Morrison, Jenia Jitsev, and Marc Tittgemeyer
- Subjects
Spiking neural network, education.field_of_study, Artificial neural network, Computer science, business.industry, Population, Ventral striatum, Striatum, Neurophysiology, Machine learning, computer.software_genre, Medium spiny neuron, Synapse, Reward system, medicine.anatomical_structure, Synaptic plasticity, Basal ganglia, medicine, Reinforcement learning, Artificial intelligence, education, business, computer, Neuroscience
- Abstract
Despite the vast amount of experimental findings on the role of the basal ganglia in reinforcement learning, there is still a general lack of network models that use spiking neurons and plausible plasticity mechanisms to demonstrate network-level reward-based learning. In this work we extend a recent spiking actor-critic network model of the basal ganglia, aiming to create a minimal realistic model of learning from both positive and negative rewards. We hypothesize and implement in the model a segregation of not only the dorsal striatum but also the ventral striatum into populations of medium spiny neurons (MSNs) that carry either the D1 or the D2 dopamine (DA) receptor type. This segregation allows explicit representation of both positive and negative expected reward within the respective populations. In line with recent experiments, we further assume that the D1 and D2 MSN populations have distinct, opposing DA-modulated bidirectional synaptic plasticity. We implement the spiking network model in the simulator NEST and conduct experiments involving the application of delayed rewards in a grid world setting, where a moving agent has to reach a goal state while maximizing the total obtained reward. We demonstrate that the network can learn not only to approach positive rewards, but also to consistently avoid punishments, as opposed to the original model. The spiking network model thus highlights the functional role of D1-D2 MSN segregation within the striatum and explains the necessity for the reversed direction of DA-dependent plasticity found at synapses converging on different types of striatal MSNs.
- Published
- 2012
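The D1/D2 segregation entry 33 proposes can be abstracted as two value stores updated in opposite directions by a dopamine-like prediction error. The toy sketch below captures only that idea; the actual model is a spiking network in NEST, and all constants here are illustrative.

```python
# Toy abstraction, NOT the spiking NEST model: positive and negative outcome
# expectations in separate (D1-like / D2-like) stores, updated opponently by
# a dopamine-like TD error.
import numpy as np

n_states = 16
v_d1 = np.zeros(n_states)   # expected positive outcome (D1-like population)
v_d2 = np.zeros(n_states)   # expected negative outcome (D2-like population)
alpha, gamma = 0.1, 0.9

def td_step(s, r, s_next):
    value = lambda st: v_d1[st] - v_d2[st]          # net expectation
    delta = r + gamma * value(s_next) - value(s)    # dopamine-like TD error
    # High dopamine potentiates D1 and depresses D2; low dopamine the reverse.
    # Clamping at zero keeps each store a non-negative (rate-like) quantity.
    v_d1[s] = max(0.0, v_d1[s] + alpha * delta)
    v_d2[s] = max(0.0, v_d2[s] - alpha * delta)
    return delta
```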
34. Information-Theoretic Connectivity-Based Cortex Parcellation
- Author
-
Jenia Jitsev, Marc Tittgemeyer, Silvan Siep, Nico S. Gorbach, and Corina Melzer
- Subjects
medicine.anatomical_structure, Computer science, Cortex (anatomy), Fingerprint (computing), medicine, Inferior frontal gyrus, Precentral gyrus, Context (language use), Noise (video), Neuroscience, Cortex (botany)
- Abstract
One of the most promising avenues for compiling connectivity data originates from the notion that individual brain regions maintain individual connectivity profiles; the functional repertoire of a cortical area ("the functional fingerprint") is closely related to its anatomical connections ("the connectional fingerprint") and, hence, a segregated cortical area may be characterized by a highly coherent connectivity pattern. Existing clustering techniques in the context of connectivity-based cortex parcellation are usually exploratory. We therefore advocate an information-theoretic framework for connectivity-based cortex parcellation which avoids many assumptions imposed by previous methods. Clustering is based upon maximizing connectivity information while allowing noise in the data to vote for the optimal number of cortical subunits. The automatic parcellation of the inferior frontal gyrus together with the precentral gyrus reveals cortical subunits consistent with previous studies.
- Published
- 2012
35. Self-generated off-line memory reprocessing on different layers of a hierarchical recurrent neuronal network
- Author
-
Jenia Jitsev
- Subjects
Computer science, General Neuroscience, lcsh:QP351-495, Equalization (audio), lcsh:RC321-571, Cellular and Molecular Neuroscience, lcsh:Neurophysiology and neuropsychology, Poster Presentation, Biological neural network, Activity regulation, Network performance, Layer (object-oriented design), Hierarchical network model, Performance improvement, lcsh:Neurosciences. Biological psychiatry. Neuropsychiatry, Neuroscience, Off line
- Abstract
Memory traces in the cortex are embedded into a scaffold of feed-forward and recurrent connectivity of the hierarchically organized processing pathways. Strong evidence suggests that consolidation of the memory traces in such a memory network depends on an off-line reprocessing done in the sleep state or during restful waking. It remains largely unclear what plasticity mechanisms are involved in this consolidation process and what changes are induced at what sites in the network during memory reprocessing in the off-line regime. This study focuses on the functional consequences off-line reprocessing has in a hierarchical recurrent neuronal network that learns different person identities from natural face images in an unsupervised manner [1]. Due to the inherently self-exciting, but competitive winner-take-all-like unit dynamics, the two-layered network is able to self-generate sparse activity even in the absence of external input in an off-line regime. In this regime, the network reactivates the memory traces established during preceding on-line learning. Remarkably, this off-line memory replay turns out to be highly beneficial for the network recognition performance [2]. The benefit is articulated after the off-line regime as a strong boost of the identity recognition rate on alternative face views to which the network has not been exposed during learning. Performance of both network layers is affected by the boost. Surprisingly, the positive effect is independent of synapse-specific plasticity, relying completely on a synapse-unspecific mechanism of homeostatic activity regulation. This homeostatic mechanism tunes network unit excitabilities, equalizing the excitability levels within the network layers during the off-line reprocessing and causing the performance improvement when the network is back in the on-line regime. Performing excitability equalization for the lower and the higher network layers separately makes it possible to dissociate the contribution of both layers to the positive effect observed after the off-line reprocessing. Equalizing the excitability levels on only one of the two layers boosts the network recognition performance, independent of whether the equalization is made on the lower or on the higher layer. Excitability equalization on the higher layer has a slightly stronger effect on network performance. The full boost, however, is achieved only if both layers are simultaneously processed via excitability equalization. Interestingly, the full effect cannot be simply explained by adding up the separate contributions of each layer, indicating that there is a substantial synergetic interaction between the two layers in achieving the improvement after the off-line memory reprocessing. These findings suggest that all layers of the network hierarchy contribute their distinct part to the improvement of network recognition performance when affected by the off-line reprocessing, which provides interesting hints as to how off-line memory reprocessing may act on the hierarchically organized pathways in the brain during the states of sleep or restful waking.
- Published
- 2011
36. Off-line memory reprocessing in a recurrent neuronal network formed by unsupervised learning
- Author
-
Jenia Jitsev and Christoph von der Malsburg
- Subjects
Focus (computing), Computer science, Mechanism (biology), Speech recognition, Visual cortex, medicine.anatomical_structure, Face (geometry), Generalization (learning), medicine, Biological neural network, Unsupervised learning, General Materials Science, Off line, Neuroscience
- Abstract
In the visual cortex, memory traces for complex objects are embedded into a scaffold of feed-forward and recurrent connectivity of the hierarchically organized visual pathway. Strong evidence suggests that consolidation of the memory traces in such a memory network depends on an off-line reprocessing done in the sleep state or during restful waking. It remains largely unclear, what plasticity mechanisms are involved in this consolidation process and what changes are induced in the network during memory reprocessing in the off-line regime. Here we focus on the functional consequences off-line reprocessing has in a hierarchical recurrent neuronal network that learns different person identities from natural face images in an unsupervised manner. Due to the inherently self-exciting, but competitive unit dynamics, the two-layered network is able to self-generate sparse activity even in the absence of external input in an off-line regime. In this regime, the network replays the memory content established during preceding on-line learning. Remarkably, this off-line memory replay turns out to be highly beneficial for the network recognition performance. The benefit is articulated after the off-line regime in a strong boost of identity recognition rate on the alternative face views to which the network has not been exposed during learning. Performance of both network layers is affected by the boost. Surprisingly, the positive effect is independent of synapse-specific plasticity, relying completely on a synapse-unspecific mechanism of homeostatic activity regulation that tunes network unit excitability. Comparing further a purely feed-forward configuration of the network with its fully recurrent original version reveals a stronger boost in recognition performance for the latter after the off-line reprocessing. These findings suggest that the off-line memory reprocessing enhances generalization capability of the hierarchical recurrent network by improving communication of contextual cues mediated via recurrent lateral and top-down connectivity.
- Published
- 2011
37. Dynamic link models for global decision making with binding-by-synchrony
- Author
-
Thomas Burwick, Jenia Jitsev, Yasuomi D. Sato, and Christoph von der Malsburg
- Subjects
genetic structures, Process (engineering), business.industry, Mechanism (biology), Computer science, Encoding (memory), Data integrity, Dynamic link matching, Cognitive neuroscience of visual object recognition, Link (geometry), Artificial intelligence, Object (computer science), business
- Abstract
We address the problem of integrating information about multiple objects and their positions in a visual scene. The primate visual system has little difficulty in rapidly achieving this integration, even when presented with several objects. Here, we propose a neurally plausible mechanism for simultaneously coordinating the local decision-making processes for "what" and "where" information in the organization of global multi-object recognition. The mechanism is based on the paradigms of binding-by-synchrony and dynamic link matching in a network system of the macrocolumnar cortical model. These paradigms are responsible for encoding an individual object and its position through a synchronization-desynchronization process among the selected or unselected links of the objects.
- Published
- 2010
38. Off-line memory reprocessing following on-line unsupervised learning strongly improves recognition performance in a hierarchical visual memory
- Author
-
Christoph von der Malsburg and Jenia Jitsev
- Subjects
Relation (database), business.industry, Computer science, Speech recognition, Neurophysiology, Machine learning, computer.software_genre, Facial recognition system, Visualization, Visual memory, Face (geometry), Semantic memory, Unsupervised learning, Visual short-term memory, Artificial intelligence, business, computer
- Abstract
Recently, experience-driven unsupervised learning was shown to create combinatorial parts-based representations in a model of hierarchical visual memory. Examining the memory's ability to recognize persons from a database of natural face images, we show that an off-line, sleep-like operating regime of the memory domain results in a significant improvement of the system's ability to generalize over novel face views. Surprisingly, the positive effect turns out to be independent of synapse-specific plasticity, relying entirely on a homeostatic mechanism equalizing the intrinsic excitability levels of the units within the memory network. We show that this excitability equalization is the main cause for the improvement of memory function. A possible relation to cortical off-line memory reprocessing during certain sleep stages is discussed.
- Published
- 2010
39. Unsupervised learning of object identities and their parts in a hierarchical visual memory
- Author
-
Jenia Jitsev and Christoph von der Malsburg
- Subjects
Cellular and Molecular Neuroscience, Visual memory, Computer science, business.industry, Competitive learning, Neuroscience (miscellaneous), Unsupervised learning, Computer vision, Artificial intelligence, Object (computer science), business
- Published
- 2009
40. Functional role of opponent, dopamine modulated D1/D2 plasticity in reinforcement learning
- Author
-
Abigail Morrison, Jenia Jitsev, Marc Tittgemeyer, and Nobi Abraham
- Subjects
Spiking neural network, General Neuroscience, Ventral striatum, Striatum, Medium spiny neuron, Reward system, Cellular and Molecular Neuroscience, medicine.anatomical_structure, Basal ganglia, Poster Presentation, Biological neural network, medicine, Reinforcement learning, ddc:610, Psychology, Neuroscience, psychological phenomena and processes
- Abstract
The basal ganglia network is thought to be involved in the adaptation of an organism's behavior in the face of its positive and negative consequences, that is, in reinforcement learning. It has been hypothesized that dopamine (DA) modulated plasticity of synapses projecting from different cortical areas to the input nuclei of the basal ganglia, the striatum, plays a central role in this form of learning, being responsible for updating future outcome expectations and action preferences. In this scheme, DA transmission is considered to convey a prediction error signal that is generated if internal expectations do not match the outcomes observed after action execution. So far, there has been no satisfying model of what the neural circuits computing this signal within the basal ganglia may look like, how this computation is performed, and what the mechanistic role of DA release is in adapting the system towards optimal behavior in a given task. Aiming towards a model of a canonical circuit for learning task-conform behavior from both reward and punishment, we extended a previously introduced spiking actor-critic network model of the basal ganglia [1] to contain a segregation of both the dorsal (actor) and ventral (critic) striatum into populations of D1 and D2 medium spiny neurons (MSNs). This segregation allows explicit, separate representation of both positive and negative expected outcomes by the distinct populations in the ventral striatum. The positive and negative components of the expected outcome were fed to dopamine (DA) neurons in the SNc/VTA region, which compute and signal the reward prediction error by DA release. Based on recent experimental work [2], the DA level was assumed to modulate the plasticity of D1 and D2 synapses in an opposing way, inducing LTP at D1 and LTD at D2 synapses when high, and vice versa when low. Crucially, this form of opponent plasticity implements a temporal-difference (TD)-like update of both positive and negative outcome expectations separately and performs appropriate action selection adaptation. We implemented the network in the NEST simulator [3] using leaky integrate-and-fire spiking neurons and designed a battery of experiments involving the application of reward and punishment in various grid world tasks. In each task, an agent had to explore the states and learn to maximize the total reward obtained. The number of states and the magnitudes and delays of reward and punishment were manipulated across different tasks. We demonstrate that across the tasks the network can learn both to approach delayed rewards and to consistently avoid punishments, the latter posing severe difficulties for the previous model without D1/D2 segregation [1]. Thus, the spiking neural network model highlights the functional role of D1/D2 MSN segregation within the striatum in implementing appropriate TD-like learning from both reward and punishment, and explains the necessity for the opponent direction of DA-dependent plasticity found at synapses converging on distinct striatal MSN types. This modeling approach can be extended in future work to study how abnormal D1/D2 plasticity may lead to a reorganization of the basal ganglia network towards pathological, dysfunctional states, such as those observed in Parkinson's disease under conditions of progressive dopamine depletion.
- Published
- 2013
41. Impact of recurrent connectivity on off-line memory reprocessing in a hierarchical neural network formed by unsupervised learning
- Author
-
Jenia Jitsev
- Published
- 2012
- Full Text
- View/download PDF
42. JUWELS Booster – A Supercomputer for Large-Scale AI Research
- Author
-
Stefan Kesselheim, Andreas Herten, Kai Krajsek, Jan Ebert, Jenia Jitsev, Mehdi Cherti, Michael Langguth, Bing Gong, Scarlet Stadtler, Amirpasha Mozaffari, Gabriele Cavallaro, Rocco Sedona, Alexander Schug, Alexandre Strube, Roshni Kamath, Martin G. Schultz, Morris Riedel, and Thomas Lippert
- Subjects
FOS: Computer and information sciences, 0303 health sciences, 03 medical and health sciences, Computer Science - Machine Learning, Computer Science - Distributed, Parallel, and Cluster Computing, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, 02 engineering and technology, Distributed, Parallel, and Cluster Computing (cs.DC), Biologie, 030304 developmental biology, Machine Learning (cs.LG)
- Abstract
In this article, we present JUWELS Booster, a recently commissioned high-performance computing system at the Jülich Supercomputing Center. With its system architecture, most importantly its large number of powerful Graphics Processing Units (GPUs) and its fast interconnect via InfiniBand, it is an ideal machine for large-scale Artificial Intelligence (AI) research and applications. We detail its system architecture, parallel, distributed model training, and benchmarks indicating its outstanding performance. We exemplify its potential for research application by presenting large-scale AI research highlights from various scientific fields that require such a facility. (12 pages, 5 figures; accepted at ISC 2021, Workshop on Deep Learning on Supercomputers.)
- Full Text
- View/download PDF
43. Experience-driven formation of parts-based representations in a model of layered visual memory
- Author
-
Jenia Jitsev and Christoph von der Malsburg
- Subjects
FOS: Computer and information sciences, Computer science, Property (programming), Competitive learning, Neuroscience (miscellaneous), FOS: Physical sciences, cortical column, unsupervised learning, parts-based representation, Machine Learning (cs.LG), lcsh:RC321-571, Cellular and Molecular Neuroscience, Visual memory, Visual Objects, Set (psychology), lcsh:Neurosciences. Biological psychiatry. Neuropsychiatry, Original Research, computer.programming_language, Self-organization, bidirectional plasticity, Recall, business.industry, Pattern recognition, competitive learning, self-organization, Nonlinear Sciences - Adaptation and Self-Organizing Systems, Computer Science - Learning, activity homeostasis, Quantitative Biology - Neurons and Cognition, FOS: Biological sciences, Unsupervised learning, Neurons and Cognition (q-bio.NC), Artificial intelligence, business, visual memory, Adaptation and Self-Organizing Systems (nlin.AO), computer, Neuroscience
- Abstract
Growing neuropsychological and neurophysiological evidence suggests that the visual cortex uses parts-based representations to encode, store and retrieve relevant objects. In such a scheme, objects are represented as a set of spatially distributed local features, or parts, arranged in stereotypical fashion. To encode the local appearance and to represent the relations between the constituent parts, there has to be an appropriate memory structure formed by previous experience with visual objects. Here, we propose a model of how a hierarchical memory structure supporting efficient storage and rapid recall of parts-based representations can be established by an experience-driven process of self-organization. The process is based on the collaboration of slow bidirectional synaptic plasticity and homeostatic unit activity regulation, both running on top of fast activity dynamics with winner-take-all character modulated by an oscillatory rhythm. These neural mechanisms lay down the basis for cooperation and competition between the distributed units and their synaptic connections. Choosing human face recognition as a test task, we show that, under the condition of open-ended, unsupervised incremental learning, the system is able to form memory traces for individual faces in a parts-based fashion. On a lower memory layer the synaptic structure is developed to represent local facial features and their interrelations, while the identities of different persons are captured explicitly on a higher layer. An additional property of the resulting representations is the sparseness of both the activity during the recall and the synaptic patterns comprising the memory traces. (34 pages, 12 figures, 1 table; published in Frontiers in Computational Neuroscience, Special Issue on Complex Systems Science and Brain Dynamics: http://www.frontiersin.org/neuroscience/computationalneuroscience/paper/10.3389/neuro.10/015.2009/)
- Full Text
- View/download PDF
44. A global decision-making model via synchronization in macrocolumn units
- Author
-
Thomas Burwick, Jenia Jitsev, Yasuomi D. Sato, and Christoph von der Malsburg
- Subjects
Cognitive science, Cellular and Molecular Neuroscience, lcsh:Neurophysiology and neuropsychology, Human–computer interaction, Mechanism (biology), General Neuroscience, ddc:570, lcsh:QP351-495, Synchronization (computer science), Psychology, lcsh:Neurosciences. Biological psychiatry. Neuropsychiatry, Decision-making models, lcsh:RC321-571
- Abstract
Poster presentation: Introduction. We here address the problem of integrating information about multiple objects and their positions in the visual scene. The primate visual system has little difficulty in rapidly achieving this integration, given only a few objects. Unfortunately, computer vision still has great difficulty achieving comparable performance. It has been hypothesized that temporal binding or temporal separation could serve as a crucial mechanism for handling information about objects and their positions in parallel. Elaborating on this idea, we propose a neurally plausible mechanism for bringing local decision-making about "what" and "where" information to global multi-object recognition. ...
- Full Text
- View/download PDF
45. Activity-dependent bidirectional plasticity and homeostasis regulation governing structure formation in a model of layered visual memory
- Author
-
Christoph von der Malsburg and Jenia Jitsev
- Subjects
Structure (mathematical logic), Cellular and Molecular Neuroscience, Theoretical computer science, Basis (linear algebra), Visual memory, General Neuroscience, Process (computing), Engram, Object (computer science), Psychology, Facial recognition system, Neuroscience, Task (project management)
- Abstract
Our work deals with the self-organization [1] of a memory structure that includes multiple hierarchical levels with massive recurrent communication within and between them. Such a structure has to provide a representational basis for the relevant objects to be stored and recalled in a rapid and efficient way. Assuming that the object patterns consist of many spatially distributed local features, a problem of parts-based learning is posed. We speculate on the neural mechanisms governing the process of structure formation and demonstrate their functionality on the task of human face recognition.
- Full Text
- View/download PDF