59,237 results on '"Spector, A."'
Search Results
2. ThunderKittens: Simple, Fast, and Adorable AI Kernels
- Author
-
Spector, Benjamin F., Arora, Simran, Singhal, Aaryan, Fu, Daniel Y., and Ré, Christopher
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence
- Abstract
The challenge of mapping AI architectures to GPU hardware is creating a critical bottleneck in AI progress. Despite substantial efforts, hand-written custom kernels fail to meet their theoretical performance thresholds, even on well-established operations like linear attention. The diverse hardware capabilities of GPUs might suggest that we need a wide variety of techniques to achieve high performance. However, our work explores whether a small number of key abstractions can drastically simplify the process. We present ThunderKittens (TK), a framework for writing performant AI kernels while remaining easy to use and maintain. Our abstractions map to the three levels of the GPU hierarchy: (1) at the warp-level, we provide 16x16 matrix tiles as basic data structures and PyTorch-like parallel compute operations over tiles, (2) at the thread-block level, we provide a template for overlapping asynchronous operations across parallel warps, and (3) at the grid-level, we provide support to help hide the block launch and tear-down, and memory costs. We show the value of TK by providing kernels that match or outperform prior kernels for a range of AI operations. We match CuBLAS and FlashAttention-3 on GEMM and attention inference performance and outperform the strongest baselines by $10-40\%$ on attention backwards, $8\times$ on state space models, and $14\times$ on linear attention.
- Published
- 2024
3. LoLCATs: On Low-Rank Linearizing of Large Language Models
- Author
-
Zhang, Michael, Arora, Simran, Chalamala, Rahul, Wu, Alan, Spector, Benjamin, Singhal, Aaryan, Ramesh, Krithik, and Ré, Christopher
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Statistics - Machine Learning
- Abstract
Recent works show we can linearize large language models (LLMs) -- swapping the quadratic attentions of popular Transformer-based LLMs with subquadratic analogs, such as linear attention -- avoiding the expensive pretraining costs. However, linearizing LLMs often significantly degrades model quality, still requires training over billions of tokens, and remains limited to smaller 1.3B to 7B LLMs. We thus propose Low-rank Linear Conversion via Attention Transfer (LoLCATs), a simple two-step method that improves LLM linearizing quality with orders of magnitude less memory and compute. We base these steps on two findings. First, we can replace an LLM's softmax attentions with closely-approximating linear attentions, simply by training the linear attentions to match their softmax counterparts with an output MSE loss ("attention transfer"). Then, this enables adjusting for approximation errors and recovering LLM quality simply with low-rank adaptation (LoRA). LoLCATs significantly improves linearizing quality, training efficiency, and scalability. We significantly reduce the linearizing quality gap and produce state-of-the-art subquadratic LLMs from Llama 3 8B and Mistral 7B v0.1, leading to 20+ points of improvement on 5-shot MMLU. Furthermore, LoLCATs does so with only 0.2% of past methods' model parameters and 0.4% of their training tokens. Finally, we apply LoLCATs to create the first linearized 70B and 405B LLMs (50x larger than prior work). When compared with prior approaches under the same compute budgets, LoLCATs significantly improves linearizing quality, closing the gap between linearized and original Llama 3.1 70B and 405B LLMs by 77.8% and 78.1% on 5-shot MMLU. Comment: 47 pages, 20 figures, 18 tables, preprint
- Published
- 2024
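The "attention transfer" step described in the LoLCATs abstract (result 3) can be written out as a worked objective. The notation below is an assumption made for illustration: $q_n$, $k_i$, $v_i$ are the queries, keys, and values of one attention head with head dimension $d$, $N$ is the sequence length, and $\phi_q$, $\phi_k$ are hypothetical trainable feature maps. The abstract itself states only that linear attentions are trained to match their softmax counterparts under an output MSE loss.

```latex
\begin{align*}
y_n &= \sum_{i \le n} \frac{\exp(q_n^\top k_i / \sqrt{d})}{\sum_{j \le n} \exp(q_n^\top k_j / \sqrt{d})} \, v_i
  && \text{(causal softmax attention)} \\
\hat{y}_n &= \sum_{i \le n} \frac{\phi_q(q_n)^\top \phi_k(k_i)}{\sum_{j \le n} \phi_q(q_n)^\top \phi_k(k_j)} \, v_i
  && \text{(linear attention)} \\
\mathcal{L}_{\mathrm{MSE}} &= \frac{1}{N} \sum_{n=1}^{N} \lVert y_n - \hat{y}_n \rVert_2^2
  && \text{(attention-transfer loss)}
\end{align*}
```

Because $\hat{y}_n$ depends on past keys and values only through the running sums $\sum_{i \le n} \phi_k(k_i) v_i^\top$ and $\sum_{i \le n} \phi_k(k_i)$, it can be evaluated in time linear in $N$, which is the sense in which the replacement is subquadratic.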
4. Persistent homology classifies parameter dependence of patterns in Turing systems
- Author
-
Spector, Reemon, Harrington, Heather A., and Gaffney, Eamonn A.
- Subjects
Quantitative Biology - Quantitative Methods, Mathematics - Algebraic Topology, 92C15, 55N31
- Abstract
This paper illustrates a further application of topological data analysis to the study of self-organising models for chemical and biological systems. In particular, we investigate whether topological summaries can capture the parameter dependence of pattern topology in reaction diffusion systems, by examining the homology of sublevel sets of solutions to Turing reaction diffusion systems for a range of parameters. We demonstrate that a topological clustering algorithm can reveal how pattern topology depends on parameters, using the chlorite--iodide--malonic acid system, and the prototypical Schnakenberg system for illustration. In addition, we discuss the prospective application of such clustering, for instance in refining priors for detailed parameter estimation for self-organising systems., Comment: 24 pages, 9 figures
- Published
- 2024
5. A sparse resolution of the DiPerna-Majda gap problem for $2$D Euler equations
- Author
-
Domínguez, Oscar and Spector, Daniel
- Subjects
Mathematics - Analysis of PDEs, Mathematics - Functional Analysis
- Abstract
A central question which originates in the celebrated work in the 1980's of DiPerna and Majda asks what is the optimal decay $f > 0$ such that uniform rates $|\omega|(Q) \leq f(|Q|)$ of the vorticity maximal functions guarantee strong convergence without concentrations of approximate solutions to energy-conserving weak solutions of the $2$D Euler equations with vortex sheet initial data. A famous result of Majda (1993) shows $f(r) = [\log (1/r)]^{-1/2}$, $r<1/2$, as the optimal decay for \emph{distinguished} sign vortex sheets. In the general setting of \emph{mixed} sign vortex sheets, DiPerna and Majda (1987) established $f(r) = [\log (1/r)]^{-\alpha}$ with $\alpha > 1$ as a sufficient condition for the lack of concentrations, while the expected gap $\alpha \in (1/2, 1]$ remains as an open question. In this paper we resolve the DiPerna-Majda $2$D gap problem: In striking contrast to the well-known case of distinguished sign vortex sheets, we identify $f(r) = [\log (1/r)]^{-1}$ as the optimal regularity for mixed sign vortex sheets that rules out concentrations. For the proof, we propose a novel method to construct explicitly solutions with mixed sign to the $2$D Euler equations in such a way that wild behaviour creates within the relevant geometry of \emph{sparse} cubes (i.e., these cubes are not necessarily pairwise disjoint, but their possible overlappings can be controlled in a sharp fashion). Such a strategy is inspired by the recent work of the first author and Milman \cite{DM} where strong connections between energy conservation and sparseness are established., Comment: 24 pages
- Published
- 2024
6. Folate metabolism and risk of childhood acute lymphoblastic leukemia: a genetic pathway analysis from the Childhood Cancer and Leukemia International Consortium
- Author
-
Metayer, Catherine, Spector, Logan G, Scheurer, Michael E, Jeon, Soyoung, Scott, Rodney J, Takagi, Masatoshi, Clavel, Jacqueline, Manabe, Atsushi, Ma, Xiaomei, Hailu, Elleni M, Lupo, Philip J, Urayama, Kevin Y, Bonaventure, Audrey, Kato, Motohiro, Meirhaeghe, Aline, Chiang, Charleston WK, Morimoto, Libby M, and Wiemels, Joseph L
- Subjects
Epidemiology, Health Sciences, Hematology, Cancer, Human Genome, Genetics, Childhood Leukemia, Health Disparities, Minority Health, Rare Diseases, Pediatric, Pediatric Cancer, Clinical Research, 2.1 Biological and endogenous factors, Humans, Precursor Cell Lymphoblastic Leukemia-Lymphoma, Folic Acid, Polymorphism, Single Nucleotide, Child, Case-Control Studies, Female, Male, Genome-Wide Association Study, Risk Factors, Genetic Predisposition to Disease, Child, Preschool, Medical and Health Sciences, Biomedical and clinical sciences, Health sciences
- Abstract
Background: Prenatal folate supplementation has been consistently associated with a reduced risk of childhood acute lymphoblastic leukemia (ALL). Previous germline genetic studies examining the one carbon (folate) metabolism pathway were limited in sample size, scope, and population diversity and led to inconclusive results. Methods: We evaluated whether ∼2,900 single-nucleotide polymorphisms (SNP) within 46 candidate genes involved in the folate metabolism pathway influence the risk of childhood ALL, using genome-wide data from nine case-control studies in the Childhood Cancer and Leukemia International Consortium (n = 9,058 cases including 4,510 children of European ancestry, 3,018 Latinx, and 1,406 Asians, and 92,364 controls). Each study followed a standardized protocol for quality control and imputation of genome-wide data and summary statistics were meta-analyzed for all children combined and by major ancestry group using METAL software. Results: None of the selected SNPs reached statistical significance, overall and for major ancestry groups (using adjusted Bonferroni P-value of 5 × 10⁻⁶ and less-stringent P-value of 3.5 × 10⁻⁵ accounting for the number of "independent" SNPs). None of the 10 top (nonsignificant) SNPs and corresponding genes overlapped across ancestry groups. Conclusions: This large meta-analysis of original data does not reveal associations between many common genetic variants in the folate metabolism pathway and childhood ALL in various ancestry groups. Impact: Genetic variants in the folate pathway alone do not appear to substantially influence childhood acute lymphoblastic leukemia risk. Other mechanisms such as gene-folate interaction, DNA methylation, or maternal genetic effects may explain the observed associations with self-reported prenatal folate intake.
- Published
- 2024
7. Design and Performance of the ALPS II Regeneration Cavity
- Author
-
Kozlowski, Todd, Wei, Li-Wei, Spector, Aaron D., Hallal, Ayman, Fraedrich, Henry, Brotherton, Daniel C., Oceano, Isabella, Ejlli, Aldo, Grote, Hartmut, Hollis, Harold, Karan, Kanioar, Mueller, Guido, Tanner, D. B., Willke, Benno, and Lindner, Axel
- Subjects
Physics - Optics
- Abstract
The Regeneration Cavity (RC) is a critical component of the Any Light Particle Search II (ALPS II) experiment. It increases the signal from possible axions and axion-like particles in the experiment by nearly four orders of magnitude. The total round-trip optical losses of the power circulating in the cavity must be minimized in order to maximize the resonant enhancement of the cavity, which is an important figure of merit for ALPS II. Lower optical losses also increase the cavity storage time and with the 123 meter long ALPS II RC we have demonstrated the longest storage time of a two-mirror optical cavity. We measured a storage time of $7.17 \pm 0.01$ ms, equivalent to a linewidth of 44.4 Hz and a finesse of 27,500 at a wavelength of 1064 nm., Comment: 16 pages, 8 figures, 1 table
- Published
- 2024
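The storage time, linewidth, and finesse quoted in the ALPS II Regeneration Cavity abstract (result 7) can be cross-checked with a short calculation. The sketch below rests on two assumptions that the abstract does not state: the storage time is taken to be the 1/e decay time of the intracavity field, so the full-width linewidth is 1/(πτ), and the free spectral range of a two-mirror cavity of length L is taken as c/(2L). The function name `cavity_figures` is hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def cavity_figures(storage_time_s, length_m):
    """Return (linewidth, free spectral range, finesse) for a two-mirror cavity.

    Assumes storage_time_s is the field 1/e decay time (an assumption here).
    """
    linewidth_hz = 1.0 / (math.pi * storage_time_s)  # full-width linewidth
    fsr_hz = C / (2.0 * length_m)                    # free spectral range
    finesse = fsr_hz / linewidth_hz
    return linewidth_hz, fsr_hz, finesse


# Values quoted in the abstract: 7.17 ms storage time, 123 m cavity length.
linewidth, fsr, finesse = cavity_figures(7.17e-3, 123.0)
print(f"linewidth = {linewidth:.1f} Hz")
print(f"FSR       = {fsr / 1e6:.3f} MHz")
print(f"finesse   = {finesse:.0f}")
```

Under these assumptions the linewidth comes out at about 44.4 Hz and the finesse within half a percent of the rounded 27,500 given in the abstract, which suggests the quoted figures follow this convention.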
8. Conversational Prompt Engineering
- Author
-
Ein-Dor, Liat, Toledo-Ronen, Orith, Spector, Artem, Gretz, Shai, Dankin, Lena, Halfon, Alon, Katz, Yoav, and Slonim, Noam
- Subjects
Computer Science - Computation and Language
- Abstract
Prompts are how humans communicate with LLMs. Informative prompts are essential for guiding LLMs to produce the desired output. However, prompt engineering is often tedious and time-consuming, requiring significant expertise, limiting its widespread use. We propose Conversational Prompt Engineering (CPE), a user-friendly tool that helps users create personalized prompts for their specific tasks. CPE uses a chat model to briefly interact with users, helping them articulate their output preferences and integrating these into the prompt. The process includes two main stages: first, the model uses user-provided unlabeled data to generate data-driven questions and utilize user responses to shape the initial instruction. Then, the model shares the outputs generated by the instruction and uses user feedback to further refine the instruction and the outputs. The final result is a few-shot prompt, where the outputs approved by the user serve as few-shot examples. A user study on summarization tasks demonstrates the value of CPE in creating personalized, high-performing prompts. The results suggest that the zero-shot prompt obtained is comparable to its - much longer - few-shot counterpart, indicating significant savings in scenarios involving repetitive tasks with large text volumes.
- Published
- 2024
9. Generational Computation Reduction in Informal Counterexample-Driven Genetic Programming
- Author
-
Helmuth, Thomas, Pantridge, Edward, Frazier, James Gunder, and Spector, Lee
- Subjects
Computer Science - Neural and Evolutionary Computing, Computer Science - Artificial Intelligence, Computer Science - Software Engineering
- Abstract
Counterexample-driven genetic programming (CDGP) uses specifications provided as formal constraints to generate the training cases used to evaluate evolving programs. It has also been extended to combine formal constraints and user-provided training data to solve symbolic regression problems. Here we show how the ideas underlying CDGP can also be applied using only user-provided training data, without formal specifications. We demonstrate the application of this method, called "informal CDGP," to software synthesis problems. Our results show that informal CDGP finds solutions faster (i.e. with fewer program executions) than standard GP. Additionally, we propose two new variants to informal CDGP, and find that one produces significantly more successful runs on about half of the tested problems. Finally, we study whether the addition of counterexample training cases to the training set is useful by comparing informal CDGP to using a static subsample of the training set, and find that the addition of counterexamples significantly improves performance.
- Published
- 2024
10. Stay Tuned: An Empirical Study of the Impact of Hyperparameters on LLM Tuning in Real-World Applications
- Author
-
Halfon, Alon, Gretz, Shai, Arviv, Ofir, Spector, Artem, Toledo-Ronen, Orith, Katz, Yoav, Ein-Dor, Liat, Shmueli-Scheuer, Michal, and Slonim, Noam
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
- Abstract
Fine-tuning Large Language Models (LLMs) is an effective method to enhance their performance on downstream tasks. However, choosing the appropriate setting of tuning hyperparameters (HPs) is a labor-intensive and computationally expensive process. Here, we provide recommended HP configurations for practical use-cases that represent a better starting point for practitioners, when considering two SOTA LLMs and two commonly used tuning methods. We describe Coverage-based Search (CBS), a process for ranking HP configurations based on an offline extensive grid search, such that the top ranked configurations collectively provide a practical robust recommendation for a wide range of datasets and domains. We focus our experiments on Llama-3-8B and Mistral-7B, as well as full fine-tuning and LoRA, conducting a total of > 10,000 tuning experiments. Our results suggest that, in general, Llama-3-8B and LoRA should be preferred, when possible. Moreover, we show that for both models and tuning methods, exploring only a few HP configurations, as recommended by our analysis, can provide excellent results in practice, making this work a valuable resource for practitioners.
- Published
- 2024
11. $BMO$ and gradient estimates for solutions of critical elliptic equations
- Author
-
Chen, You-Wei Benson, Manfredi, Juan, and Spector, Daniel
- Subjects
Mathematics - Analysis of PDEs, Mathematics - Classical Analysis and ODEs, Mathematics - Functional Analysis
- Abstract
In this paper we explore several applications of the recently introduced spaces of functions of bounded $\beta$-dimensional mean oscillation for $\beta \in (0,n]$ to regularity theory of critical exponent elliptic equations. We first show that functions with gradient in weak-$L^n$ are in $BMO^\beta$ for any $\beta \in (0,n]$, improving the classical result $\nabla u\in L^n$ implies $u\in BMO$. We apply this result to the Poisson equation $-\Delta u = \operatorname*{div} F$ with zero boundary conditions in a bounded $C^1$ domain to show that $u\in BMO^{\beta}$ when $F$ is in weak-$L^n$. Next, we consider the $n$-Laplace equation \begin{align*} -\operatorname*{div}( |\nabla U|^{n-2} \nabla U) &= F \text{ in } \Omega, \\ U &=0 \text{ on }\partial \Omega. \end{align*} with $F\in L^1(\Omega)$ and show that the classical result $u\in BMO$ can be improved to $u\in BMO^\beta$. Finally, we consider the $n$-Laplace equation in the case when $F \in L^1$, $\operatorname*{div} F=0$ and prove that for smooth domains $\Omega$ we have the estimate \begin{align*} \|\nabla U \|_{L^n} \leq C \, \|F\|^{1/(n-1)}_{L^1}, \end{align*} where the constant $C$ is independent of $F$. Comment: 18 pages, 1 figure
- Published
- 2024
12. Just read twice: closing the recall gap for recurrent language models
- Author
-
Arora, Simran, Timalsina, Aman, Singhal, Aaryan, Spector, Benjamin, Eyuboglu, Sabri, Zhao, Xinyi, Rao, Ashish, Rudra, Atri, and Ré, Christopher
- Subjects
Computer Science - Computation and Language, Computer Science - Machine Learning
- Abstract
Recurrent large language models that compete with Transformers in language modeling perplexity are emerging at a rapid rate (e.g., Mamba, RWKV). Excitingly, these architectures use a constant amount of memory during inference. However, due to the limited memory, recurrent LMs cannot recall and use all the information in long contexts leading to brittle in-context learning (ICL) quality. A key challenge for efficient LMs is selecting what information to store versus discard. In this work, we observe the order in which information is shown to the LM impacts the selection difficulty. To formalize this, we show that the hardness of information recall reduces to the hardness of a problem called set disjointness (SD), a quintessential problem in communication complexity that requires a streaming algorithm (e.g., recurrent model) to decide whether inputted sets are disjoint. We empirically and theoretically show that the recurrent memory required to solve SD changes with set order, i.e., whether the smaller set appears first in-context. Our analysis suggests, to mitigate the reliance on data order, we can put information in the right order in-context or process prompts non-causally. Towards that end, we propose: (1) JRT-Prompt, where context gets repeated multiple times in the prompt, effectively showing the model all data orders. This gives $11.0 \pm 1.3$ points of improvement, averaged across $16$ recurrent LMs and the $6$ ICL tasks, with $11.9\times$ higher throughput than FlashAttention-2 for generation prefill (length $32$k, batch size $16$, NVidia H100). We then propose (2) JRT-RNN, which uses non-causal prefix-linear-attention to process prompts and provides $99\%$ of Transformer quality at $360$M params., $30$B tokens and $96\%$ at $1.3$B params., $50$B tokens on average across the tasks, with $19.2\times$ higher throughput for prefill than FA2.
- Published
- 2024
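The JRT-Prompt idea in the "Just read twice" abstract (result 12), repeating the context inside the prompt so the model sees it more than once, can be illustrated with a few lines of string handling. This is only a sketch under stated assumptions: the function name `jrt_prompt`, the blank-line separator, and the placement of the question after the last copy are all hypothetical, since the abstract does not give the exact prompt template.

```python
def jrt_prompt(context: str, question: str, repeats: int = 2) -> str:
    """Build a prompt in which the context appears `repeats` times.

    The layout (copies separated by blank lines, question last) is an
    assumed template, not the one used in the paper.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    blocks = [context.strip()] * repeats
    return "\n\n".join(blocks + [question.strip()])


prompt = jrt_prompt("Alice was born in 1990.", "When was Alice born?")
print(prompt)
```

With `repeats=2` a recurrent model reads the whole context once before it reads it again, so during the second pass it already knows what the first pass contained.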
13. Potential trace inequalities via a Calderón-type theorem
- Author
-
Mihula, Zdeněk, Pick, Luboš, and Spector, Daniel
- Subjects
Mathematics - Functional Analysis, Mathematics - Analysis of PDEs, Mathematics - Classical Analysis and ODEs
- Abstract
We establish an approach to trace inequalities for potential-type operators based on an appropriate modification of an interpolation theorem due to Calderón. We develop a general theoretical tool for establishing boundedness of notoriously difficult operators (such as potentials) on certain specific types of rearrangement-invariant function spaces from analogous properties of operators that are easier to handle (such as fractional maximal operators). The key ingredient for the development of the theory is the initial pair of endpoint estimates for the easier operator whose pivotal example is based on a two-weight inequality of Sawyer. Among various applications we obtain a generalization of the celebrated trace inequality involving the Riesz potential and the Hausdorff content by Korobkov and Kristensen. Comment: 30 pages
- Published
- 2024
14. Effective Adaptive Mutation Rates for Program Synthesis
- Author
-
Ni, Andrew and Spector, Lee
- Subjects
Computer Science - Neural and Evolutionary Computing
- Abstract
The problem-solving performance of many evolutionary algorithms, including genetic programming systems used for program synthesis, depends on the values of hyperparameters including mutation rates. The mutation method used to produce some of the best results to date on software synthesis benchmark problems, Uniform Mutation by Addition and Deletion (UMAD), adds new genes into a genome at a predetermined rate and then deletes genes at a rate that balances the addition rate, producing no size change on average. While UMAD with a predetermined addition rate outperforms many other mutation and crossover schemes, we do not expect a single rate to be optimal across all problems or all generations within one run of an evolutionary system. However, many current adaptive mutation schemes such as self-adaptive mutation rates suffer from pathologies like the vanishing mutation rate problem, in which the mutation rate quickly decays to zero. We propose an adaptive bandit-based scheme that addresses this problem and essentially removes the need to specify a mutation rate. Although the proposed scheme itself introduces hyperparameters, we either set these to good values or ensemble them in a reasonable range. Results on software synthesis and symbolic regression problems validate the effectiveness of our approach., Comment: 12 pages, 4 figures. Accepted at GECCO'24
- Published
- 2024
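The UMAD operator summarised in the "Effective Adaptive Mutation Rates" abstract (result 14) adds genes at a fixed rate and then deletes genes at a rate that balances the additions. A minimal sketch follows. It assumes a deletion rate of r / (1 + r) for addition rate r, chosen here because it makes the expected genome length unchanged; the abstract says only that the two rates balance. The names `umad` and `random_gene` are hypothetical, and the adaptive bandit scheme the paper proposes is not shown.

```python
import random


def umad(genome, addition_rate, random_gene, rng):
    """Uniform Mutation by Addition and Deletion (illustrative sketch)."""
    # Addition pass: next to each gene, maybe insert a new random gene.
    grown = []
    for gene in genome:
        if rng.random() < addition_rate:
            new_gene = random_gene(rng)
            if rng.random() < 0.5:
                grown.extend([new_gene, gene])  # insert before
            else:
                grown.extend([gene, new_gene])  # insert after
        else:
            grown.append(gene)
    # Deletion pass: assumed balancing rate r / (1 + r), so that
    # E[len] = n * (1 + r) * (1 - r / (1 + r)) = n.
    deletion_rate = addition_rate / (1.0 + addition_rate)
    return [g for g in grown if rng.random() >= deletion_rate]


rng = random.Random(0)
parent = list(range(100))
sizes = [
    len(umad(parent, 0.09, lambda r: r.randrange(1000), rng))
    for _ in range(2000)
]
mean_size = sum(sizes) / len(sizes)
print(f"mean child length over 2000 trials: {mean_size:.2f}")
```

Averaged over many children, the length stays close to the parent's 100 genes, which is the "no size change on average" property the abstract describes.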
15. Pareto-Optimal Learning from Preferences with Hidden Context
- Author
-
Boldi, Ryan, Ding, Li, Spector, Lee, and Niekum, Scott
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence
- Abstract
Ensuring AI models align with human values is essential for their safety and functionality. Reinforcement learning from human feedback (RLHF) uses human preferences to achieve this alignment. However, preferences sourced from diverse populations can result in point estimates of human values that may be sub-optimal or unfair to specific groups. We propose Pareto Optimal Preference Learning (POPL), which frames discrepant group preferences as objectives with potential trade-offs, aiming for policies that are Pareto-optimal on the preference dataset. POPL utilizes Lexicase selection, an iterative process to select diverse and Pareto-optimal solutions. Our empirical evaluations demonstrate that POPL surpasses baseline methods in learning sets of reward functions, effectively catering to distinct groups without access to group numbers or membership labels. Furthermore, we illustrate that POPL can serve as a foundation for techniques optimizing specific notions of group fairness, ensuring inclusive and equitable AI model alignment.
- Published
- 2024
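The POPL abstract (result 15) relies on lexicase selection to pick diverse, Pareto-optimal candidates. The sketch below shows the standard lexicase selection procedure from the genetic programming literature, not POPL's own pipeline. Treating each row of `errors` as one candidate and each column as one case (for instance, one preference) is an assumption made for illustration, and the name `lexicase_select` is hypothetical.

```python
import random


def lexicase_select(errors, rng):
    """Return the index of one candidate chosen by lexicase selection.

    errors[i][j] is the error of candidate i on case j (lower is better).
    """
    candidates = list(range(len(errors)))
    cases = list(range(len(errors[0])))
    rng.shuffle(cases)  # a fresh random case ordering for every selection
    for case in cases:
        best = min(errors[i][case] for i in candidates)
        # Keep only the candidates that are best on the current case.
        candidates = [i for i in candidates if errors[i][case] == best]
        if len(candidates) == 1:
            break
    return rng.choice(candidates)  # break any remaining ties at random


# Candidate 0 is best on case 0, candidate 1 is best on case 1,
# and candidate 2 is best on neither case.
errors = [[0, 1], [1, 0], [1, 1]]
picks = {lexicase_select(errors, random.Random(seed)) for seed in range(50)}
print(sorted(picks))
```

Because candidate 2 is never the best on whichever case comes first, it is filtered out in every ordering, while specialists such as candidates 0 and 1 survive depending on the order; a selection rule based on a single aggregate score would not distinguish them in this way.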
16. Investigating the Design, Participation and Experience of Teaching and Learning Facilitated by User-Generated Microgames on an Open Educational Platform
- Author
-
Imam Fitri Rahmadi, Zsolt Lavicza, Selay Arkün Kocadere, Tony Houghton, and Jonathan Michael Spector
- Abstract
Although user-generated microgames, defined as very simple games made by non-professionals on open platforms, are popular and appear to have considerable advantages in facilitating learning, further exploration is needed to establish their potential in instructional practices. The present study investigates the design, participation and experience of teaching and learning facilitated by user-generated microgames on an open educational platform. Through an exploratory experiment research method, four elementary school teachers designed and implemented microgame-based learning utilising these very small games on GeoGebra Classroom attended by 129 students. Data were gathered from lesson plans, classroom activity records and self-reflection questionnaires. This study revealed that teachers designed learning with various user-generated microgames and debriefing methods respecting learning content, but they shared comparatively similar scenarios by inserting microgame-based learning into the middle of the main session. The completion rate for the debriefing activity is minimum although the total joining times overshoot the number of students. Teachers found that user-generated microgames are acceptable to orchestrate short serious gaming sessions even though they are limited to one player with basic interfaces. Notwithstanding several disadvantages of these microgames recognised by students, such as missing learning instructions and inadequate interfaces, they so far enjoy learning by playing the games. The most critical implication of this study is to provide sufficient instructions and additional time for microgaming sessions in elementary schools to ensure sustainable completion of the briefing, playing and debriefing activities.
- Published
- 2024
17. Gender Differences in the Path to Medical School Deanship
- Author
-
Iyer, Maya S, Bradford, Carol, Gottlieb, Amy S, Kling, David B, Jagsi, Reshma, Mangurian, Christina, Marks, Lilly, Meltzer, Carolyn C, Overholser, Barbara, Silver, Julie K, Way, David P, and Spector, Nancy D
- Subjects
Health Services and Systems, Health Sciences, Clinical Research, Gender Equality, Humans, Female, Male, Schools, Medical, Leadership, United States, Faculty, Medical, Qualitative Research, Sex Factors, Adult, Middle Aged, Career Mobility, Biomedical and clinical sciences, Health sciences
- Abstract
Importance: Women account for only 28% of current US medical school deans. Studying the differences between women and men in their preparation to becoming deans might help to explain this discrepancy. Objective: To identify differences in the leadership development experiences between women and men in their ascent to the medical school deanship. Design, setting, and participants: In this qualitative study, volunteers from the roster of the Association of American Medical Colleges Council of Deans were solicited and interviewed from June 15 to November 9, 2023. Women deans were recruited first, then men who had been appointed to their deanships at a similar time to their women counterparts were recruited. Deans were interviewed on topics related to number of applications for deanships, prior leadership roles, leadership development, personal factors, and career trajectories. Interviews were coded, and themes were extracted through conventional content analysis. Main outcome and measures: Career and leadership development experiences were elicited using a semistructured interview guide. Results: We interviewed 17 women and 17 men deans, representing 25.8% (34 of 132) of the total population of US medical school deans. Most deans (23 [67.6%]) practiced a medicine-based specialty or subspecialty. No statistically significant differences were found between women and men with regard to years to attain deanship (mean [SD], 2.7 [3.4] vs 3.7 [3.7] years), years as a dean (mean [SD], 5.7 [5.2] vs 6.0 [5.0] years), highest salary during career (mean [SD], $525 769 [$199 936] vs $416 923 [$195 848]), or medical school rankings (mean [SD], 315.5 [394.5] vs 480.5 [448.9]). Their reports indicated substantive gender differences in their paths to becoming a dean. Compared with men, women deans reported having to work harder to advance, while receiving less support and fewer opportunities for leadership positions from their own institutions. Subsequently, women sought leadership development from external programs. Women deans also experienced gender bias when working with search firms. Conclusions and relevance: This qualitative study of US medical school deans found that compared with men, women needed to be more proactive, had to participate in external leadership development programs, and had to confront biases during the search process. For rising women leaders, this lack of support had consequences, such as burnout and attrition, potentially affecting the makeup of future generations of medical school deans. Institutional initiatives centering on leadership development of women are needed to mitigate the gender biases and barriers faced by aspiring women leaders.
- Published
- 2024
18. On dimension stable spaces of measures
- Author
-
Spector, Daniel and Stolyarov, Dmitriy
- Subjects
Mathematics - Functional Analysis, Mathematics - Analysis of PDEs, Mathematics - Classical Analysis and ODEs
- Abstract
In this paper, we define spaces of measures $DS_\beta(\mathbb{R}^d)$ with dimensional stability $\beta \in (0,d)$. These spaces bridge between $M_b(\mathbb{R}^d)$, the space of finite Radon measures, and $DS_d(\mathbb{R}^d)= \mathrm{H}^1(\mathbb{R}^d)$, the real Hardy space. We show the spaces $DS_\beta(\mathbb{R}^d)$ support Sobolev inequalities for $\beta \in (0,d]$, while for any $\beta \in [0,d]$ we show that the lower Hausdorff dimension of an element of $DS_\beta(\mathbb{R}^d)$ is at least $\beta$., Comment: 30 pages
- Published
- 2024
19. On the Trace of $\dot{W}_{a}^{m+1,1}(\mathbb{R}_{+}^{n+1})$
- Author
-
Leoni, Giovanni and Spector, Daniel
- Subjects
Mathematics - Analysis of PDEs, Mathematics - Functional Analysis, 46E35
- Abstract
In this paper we prove extension results for functions in Besov spaces. Our results are new in the homogeneous setting, while our technique applies equally in the inhomogeneous setting to obtain new proofs of classical results. While our results include $p>1$, of principal interest is the case $p=1$, where we show that \begin{equation*} \int_{\mathbb{R}_{+}^{n+1}}t^{a}|\nabla^{m+1}u(x,t)|\;dtdx\lesssim\left\vert f\right\vert _{B^{m-a,1}(\mathbb{R}^{n})} \end{equation*} for all $f \in \dot{B}^{m-a,1}(\mathbb{R}^{n})$ (the homogeneous Besov space) where $u$ is a suitably scaled heat extension of $f$. Comment: 37 pages
- Published
- 2024
20. Explaining vague language
- Author
-
Égré, Paul and Spector, Benjamin
- Subjects
Computer Science - Computation and Language, Computer Science - Computer Science and Game Theory, Computer Science - Information Theory, 91A86, I.2.7
- Abstract
Why is language vague? Vagueness may be explained and rationalized if it can be shown that vague language is more useful to speaker and hearer than precise language. In a well-known paper, Lipman proposes a game-theoretic account of vagueness in terms of mixed strategy that leads to a puzzle: vagueness cannot be strictly better than precision at equilibrium. More recently, Égré, Spector, Mortier and Verheyen have put forward a Bayesian account of vagueness establishing that using vague words can be strictly more informative than using precise words. This paper proposes to compare both results and to explain why they are not in contradiction. Lipman's definition of vagueness relies exclusively on a property of signaling strategies, without making any assumptions about the lexicon, whereas Égré et al.'s involves a layer of semantic content. We argue that the semantic account of vagueness is needed, and more adequate and explanatory of vagueness.
- Published
- 2024
21. The mosaic permutation test: an exact and nonparametric goodness-of-fit test for factor models
- Author
-
Spector, Asher, Barber, Rina Foygel, Hastie, Trevor, Kahn, Ronald N., and Candès, Emmanuel
- Subjects
Statistics - Methodology, 62H25 (Primary) 62G10, 62G09 (Secondary)
- Abstract
Financial firms often rely on fundamental factor models to explain correlations among asset returns and manage risk. Yet after major events, e.g., COVID-19, analysts may reassess whether existing risk models continue to fit well: specifically, after accounting for a set of known factor exposures, are the residuals of the asset returns independent? With this motivation, we introduce the mosaic permutation test, a nonparametric goodness-of-fit test for preexisting factor models. Our method can leverage modern machine learning techniques to detect model violations while provably controlling the false positive rate, i.e., the probability of rejecting a well-fitting model, without making asymptotic approximations or parametric assumptions. This property helps prevent analysts from unnecessarily rebuilding accurate models, which can waste resources and increase risk. To illustrate our methodology, we apply the mosaic permutation test to the BlackRock Fundamental Equity Risk (BFRE) model. Although the BFRE model generally explains the most significant correlations among assets, we find evidence of unexplained correlations among certain real estate stocks, and we show that adding new factors improves model fit. We implement our methods in the python package mosaicperm., Comment: 42 pages, 13 figures
- Published
- 2024
22. Leveraging Symbolic Regression for Heuristic Design in the Traveling Thief Problem
- Author
-
Ni, Andrew and Spector, Lee
- Subjects
Computer Science - Neural and Evolutionary Computing - Abstract
The Traveling Thief Problem is an NP-hard combination of the well-known traveling salesman and knapsack packing problems. In this paper, we use symbolic regression to learn useful features of near-optimal packing plans, which we then use to design efficient metaheuristic genetic algorithms for the traveling thief algorithm. By using symbolic regression again to initialize the metaheuristic GA with near-optimal individuals, we are able to design a fast, interpretable, and effective packing initialization scheme. Comparisons against previous initialization schemes validate our algorithm design., Comment: 23 pages
- Published
- 2024
23. Optical cavity characterization with a mode-matched heterodyne sensing scheme
- Author
-
Spector, Aaron D. and Kozlowski, Todd
- Subjects
Physics - Optics ,Physics - Instrumentation and Detectors - Abstract
We describe a technique for measuring the complex reflectivity of an optical cavity with a resonant local oscillator laser and an auxiliary probe laser, each coupled via opposite ends of the cavity. A heterodyne sensing scheme is then used to observe the phase and amplitude of the interference beat-note between the promptly reflected field and the cavity transmitted field injected through the far mirror. Since the local oscillator laser must pass through the cavity before interfering with the probe laser, these measurements are not only independent of the spatial coupling of either laser to the cavity, but also obtained at the in-situ position of the cavity eigenmode. This technique was demonstrated on a 19 m cavity to measure the individual transmissivities of each of the mirrors as well as the round trip optical losses to an accuracy of several parts per million., Comment: 16 pages, 6 figures, 1 table
- Published
- 2024
24. AIntibody: an experimentally validated in silico antibody discovery design challenge
- Author
-
Erasmus, M. Frank, Spector, Laura, Ferrara, Fortunato, DiNiro, Roberto, Pohl, Thomas J., Perea-Schmittle, Katheryn, Wang, Wei, Tessier, Peter M., Richardson, Crystal, Turner, Laure, Kumar, Sumit, Bedinger, Daniel, Sormanni, Pietro, Fernández-Quintero, Monica L., Ward, Andrew B., Loeffler, Johannes R., Swanson, Olivia M., Deane, Charlotte M., Raybould, Matthew I. J., Evers, Andreas, Sellmann, Carolin, Bachas, Sharrol, Ruffolo, Jeff, Nastri, Horacio G., Ramesh, Karthik, Sørensen, Jesper, Croasdale-Wood, Rebecca, Hijano, Oliver, Leal-Lopes, Camila, Shahsavarian, Melody, Qiu, Yu, Marcatili, Paolo, Vernet, Erik, Akbar, Rahmad, Friedensohn, Simon, Wagner, Rick, Kurella, Vinodh babu, Malhotra, Shipra, Kumar, Satyendra, Kidger, Patrick, Almagro, Juan C., Furfine, Eric, Stanton, Marty, Graff, Christilyn P., Villalba, Santiago David, Tomszak, Florian, Teixeira, Andre A. R., Hopkins, Elizabeth, Dovner, Molly, D’Angelo, Sara, and Bradbury, Andrew R. M.
- Published
- 2024
25. Artificial Intelligence to Promote Racial and Ethnic Cardiovascular Health Equity
- Author
-
Amponsah, Daniel, Thamman, Ritu, Brandt, Eric, James, Cornelius, Spector-Bagdady, Kayte, and Yong, Celina M.
- Published
- 2024
26. Common Sense Matters: Reply to Janzen, Sonu, and Myrebøe’s Reviews of In Search of Responsibility as Education
- Author
-
Spector, Hannah
- Published
- 2024
27. DALex: Lexicase-like Selection via Diverse Aggregation
- Author
-
Ni, Andrew, Ding, Li, and Spector, Lee
- Subjects
Computer Science - Neural and Evolutionary Computing ,Computer Science - Machine Learning - Abstract
Lexicase selection has been shown to provide advantages over other selection algorithms in several areas of evolutionary computation and machine learning. In its standard form, lexicase selection filters a population or other collection based on randomly ordered training cases that are considered one at a time. This iterated filtering process can be time-consuming, particularly in settings with large numbers of training cases. In this paper, we propose a new method that is nearly equivalent to lexicase selection in terms of the individuals that it selects, but which does so significantly more quickly. The new method, called DALex (for Diversely Aggregated Lexicase), selects the best individual with respect to a weighted sum of training case errors, where the weights are randomly sampled. This allows us to formulate the core computation required for selection as matrix multiplication instead of recursive loops of comparisons, which in turn allows us to take advantage of optimized and parallel algorithms designed for matrix multiplication for speedup. Furthermore, we show that we can interpolate between the behavior of lexicase selection and its "relaxed" variants, such as epsilon or batch lexicase selection, by adjusting a single hyperparameter, named "particularity pressure," which represents the importance granted to each individual training case. Results on program synthesis, deep learning, symbolic regression, and learning classifier systems demonstrate that DALex achieves significant speedups over lexicase selection and its relaxed variants while maintaining almost identical problem-solving performance. Under a fixed computational budget, these savings free up resources that can be directed towards increasing population size or the number of generations, enabling the potential for solving more difficult problems., Comment: 15 pages, 4 figures. Accepted at EuroGP'24
- Published
- 2024
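As a rough illustration of the selection step described in the DALex abstract above (entry 27), the following Python sketch picks the individual with the lowest randomly weighted sum of training-case errors. The abstract states only that the weights are randomly sampled; drawing per-case scores from a normal distribution whose standard deviation is the particularity pressure and passing them through a softmax is an assumption made here, as are the names `sample_weights` and `dalex_select`.

```python
import math
import random


def sample_weights(num_cases, particularity_pressure, rng):
    """Sample one random weight per training case (weights sum to 1).

    Assumption: scores ~ Normal(0, particularity_pressure), then softmax.
    The abstract does not specify the sampling distribution.
    """
    scores = [rng.gauss(0.0, particularity_pressure) for _ in range(num_cases)]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dalex_select(errors, particularity_pressure, rng):
    """Return the index of the individual with the lowest weighted error.

    errors[i][j] is the error of individual i on training case j.
    The weighted sums below amount to one matrix-vector product.
    """
    weights = sample_weights(len(errors[0]), particularity_pressure, rng)
    aggregated = [sum(w * e for w, e in zip(weights, row)) for row in errors]
    return min(range(len(errors)), key=aggregated.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical error matrix: 3 individuals, 4 training cases.
    errors = [
        [0.0, 5.0, 1.0, 2.0],
        [1.0, 1.0, 1.0, 1.0],
        [4.0, 0.0, 3.0, 0.0],
    ]
    picks = [dalex_select(errors, 3.0, rng) for _ in range(10)]
    print(picks)
```

Under this assumed sampling scheme, a large particularity pressure concentrates the weight on a few cases, which is intended to mimic the case-by-case filtering of lexicase selection, while a pressure near zero gives nearly uniform weights and so approaches selection on mean error.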
28. How to optimize neuroscience data utilization and experiment design for advancing primate visual and linguistic brain models?
- Author
-
Tuckute, Greta, Finzi, Dawn, Margalit, Eshed, Zylberberg, Joel, Chung, SueYeon, Fyshe, Alona, Fedorenko, Evelina, Kriegeskorte, Nikolaus, Yates, Jacob, Grill-Spector, Kalanit, and Kar, Kohitij
- Subjects
Quantitative Biology - Neurons and Cognition - Abstract
In recent years, neuroscience has made significant progress in building large-scale artificial neural network (ANN) models of brain activity and behavior. However, there is no consensus on the most efficient ways to collect data and design experiments to develop the next generation of models. This article explores the controversial opinions that have emerged on this topic in the domain of vision and language. Specifically, we address two critical points. First, we weigh the pros and cons of using qualitative insights from empirical results versus raw experimental data to train models. Second, we consider model-free (intuition-based) versus model-based approaches for data collection, specifically experimental design and stimulus selection, for optimal model development. Finally, we consider the challenges of developing a synergistic approach to experimental design and model building, including encouraging data and model sharing and the implications of iterative additions to existing models. The goal of the paper is to discuss decision points and propose directions for both experimenters and model developers in the quest to understand the brain.
- Published
- 2024
29. Single-cell genomics and regulatory networks for 388 human brains.
- Author
-
Emani, Prashant, Liu, Jason, Clarke, Declan, Jensen, Matthew, Warrell, Jonathan, Gupta, Chirag, Meng, Ran, Lee, Che Yu, Xu, Siwei, Dursun, Cagatay, Lou, Shaoke, Chen, Yuhang, Chu, Zhiyuan, Galeev, Timur, Hwang, Ahyeon, Li, Yunyang, Ni, Pengyu, Zhou, Xiao, Bakken, Trygve, Bendl, Jaroslav, Bicks, Lucy, Chatterjee, Tanima, Cheng, Lijun, Cheng, Yuyan, Dai, Yi, Duan, Ziheng, Flaherty, Mary, Fullard, John, Gancz, Michael, Garrido-Martín, Diego, Gaynor-Gillett, Sophia, Grundman, Jennifer, Hawken, Natalie, Henry, Ella, Hoffman, Gabriel, Huang, Ao, Jiang, Yunzhe, Jin, Ting, Jorstad, Nikolas, Kawaguchi, Riki, Khullar, Saniya, Liu, Jianyin, Liu, Junhao, Liu, Shuang, Ma, Shaojie, Margolis, Michael, Mazariegos, Samantha, Moore, Jill, Moran, Jennifer, Nguyen, Eric, Phalke, Nishigandha, Pjanic, Milos, Pratt, Henry, Quintero, Diana, Rajagopalan, Ananya, Riesenmy, Tiernon, Shedd, Nicole, Shi, Manman, Spector, Megan, Terwilliger, Rosemarie, Travaglini, Kyle, Wamsley, Brie, Wang, Gaoyuan, Xia, Yan, Xiao, Shaohua, Yang, Andrew, Zheng, Suchen, Gandal, Michael, Lee, Donghoon, Lein, Ed, Roussos, Panos, Sestan, Nenad, Weng, Zhiping, White, Kevin, Won, Hyejung, Girgenti, Matthew, Zhang, Jing, Wang, Daifeng, Geschwind, Daniel, and Gerstein, Mark
- Subjects
Humans ,Aging ,Brain ,Cell Communication ,Chromatin ,Gene Regulatory Networks ,Genomics ,Mental Disorders ,Prefrontal Cortex ,Quantitative Trait Loci ,Single-Cell Analysis - Abstract
Single-cell genomics is a powerful tool for studying heterogeneous tissues such as the brain. Yet little is understood about how genetic variants influence cell-level gene expression. Addressing this, we uniformly processed single-nuclei, multiomics datasets into a resource comprising >2.8 million nuclei from the prefrontal cortex across 388 individuals. For 28 cell types, we assessed population-level variation in expression and chromatin across gene families and drug targets. We identified >550,000 cell type-specific regulatory elements and >1.4 million single-cell expression quantitative trait loci, which we used to build cell-type regulatory and cell-to-cell communication networks. These networks manifest cellular changes in aging and neuropsychiatric disorders. We further constructed an integrative model accurately imputing single-cell expression and simulating perturbations; the model prioritized ~250 disease-risk genes and drug targets with associated cell types.
- Published
- 2024
30. High-resolution myelin-water fraction and quantitative relaxation mapping using 3D ViSTa-MR fingerprinting
- Author
-
Liao, Congyu, Cao, Xiaozhi, Iyer, Siddharth Srinivasan, Schauman, Sophie, Zhou, Zihan, Yan, Xiaoqian, Chen, Quan, Li, Zhitao, Wang, Nan, Gong, Ting, Wu, Zhe, He, Hongjian, Zhong, Jianhui, Yang, Yang, Kerr, Adam, Grill-Spector, Kalanit, and Setsompop, Kawin
- Subjects
Physics - Medical Physics ,Electrical Engineering and Systems Science - Image and Video Processing - Abstract
Purpose: This study aims to develop a high-resolution whole-brain multi-parametric quantitative MRI approach for simultaneous mapping of myelin-water fraction (MWF), T1, T2, and proton-density (PD), all within a clinically feasible scan time. Methods: We developed 3D ViSTa-MRF, which combined Visualization of Short Transverse relaxation time component (ViSTa) technique with MR Fingerprinting (MRF), to achieve high-fidelity whole-brain MWF and T1/T2/PD mapping on a clinical 3T scanner. To achieve fast acquisition and memory-efficient reconstruction, the ViSTa-MRF sequence leverages an optimized 3D tiny-golden-angle-shuffling spiral-projection acquisition and joint spatial-temporal subspace reconstruction with optimized preconditioning algorithm. With the proposed ViSTa-MRF approach, high-fidelity direct MWF mapping was achieved without a need for multi-compartment fitting that could introduce bias and/or noise from additional assumptions or priors. Results: The in-vivo results demonstrate the effectiveness of the proposed acquisition and reconstruction framework to provide fast multi-parametric mapping with high SNR and good quality. The in-vivo results of 1mm- and 0.66mm-iso datasets indicate that the MWF values measured by the proposed method are consistent with standard ViSTa results that are 30x slower with lower SNR. Furthermore, we applied the proposed method to enable 5-minute whole-brain 1mm-iso assessment of MWF and T1/T2/PD mappings for infant brain development and for post-mortem brain samples. Conclusions: In this work, we have developed a 3D ViSTa-MRF technique that enables the acquisition of whole-brain MWF, quantitative T1, T2, and PD maps at 1mm and 0.66mm isotropic resolution in 5 and 15 minutes, respectively. This advancement allows for quantitative investigations of myelination changes in the brain., Comment: 38 pages, 12 figures and 1 table
- Published
- 2023
31. Optimizing Neural Networks with Gradient Lexicase Selection
- Author
-
Ding, Li and Spector, Lee
- Subjects
Computer Science - Machine Learning ,Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Neural and Evolutionary Computing - Abstract
One potential drawback of using aggregated performance measurement in machine learning is that models may learn to accept higher errors on some training cases as compromises for lower errors on others, with the lower errors actually being instances of overfitting. This can lead to both stagnation at local optima and poor generalization. Lexicase selection is an uncompromising method developed in evolutionary computation, which selects models on the basis of sequences of individual training case errors instead of using aggregated metrics such as loss and accuracy. In this paper, we investigate how lexicase selection, in its general form, can be integrated into the context of deep learning to enhance generalization. We propose Gradient Lexicase Selection, an optimization framework that combines gradient descent and lexicase selection in an evolutionary fashion. Our experimental results demonstrate that the proposed method improves the generalization performance of various widely-used deep neural network architectures across three image classification benchmarks. Additionally, qualitative analysis suggests that our method assists networks in learning more diverse representations. Our source code is available on GitHub: https://github.com/ld-ing/gradient-lexicase., Comment: ICLR 2022
- Published
- 2023
32. Visual Literacy of Molecular Biology Revealed through a Card-Sorting Task
- Author
-
Newman, Dina L., Spector, Hannah, Neuenschwander, Anna, Miller, Anna J., Trumpore, Lauren, and Wright, L. Kate
- Abstract
Visual literacy, which is the ability to effectively identify, interpret, evaluate, use, and create images and visual media, is an important aspect of science literacy. As molecular processes are not directly observable, researchers and educators rely on visual representations (e.g., drawings) to communicate ideas in biology. How learners interpret and organize those numerous diagrams is related to their underlying knowledge about biology and their skills in visual literacy. Furthermore, it is not always obvious how and why learners interpret diagrams in the way they do (especially if their interpretations are unexpected), as it is not possible to "see" inside the minds of learners and directly observe the inner workings of their brains. Hence, tools that allow for the investigation of visual literacy are needed. Here, we present a novel card-sorting task based on visual literacy skills to investigate how learners interpret and think about DNA-based concepts. We quantified differences in performance between groups of varying expertise and in pre- and postcourse settings using percentages of expected card pairings and edit distance to a perfect sort. Overall, we found that biology experts organized the visual representations based on deep conceptual features, while biology learners (novices) more often organized based on surface features, such as color and style. We also found that students performed better on the task after a course in which molecular biology concepts were taught, suggesting the activity is a useful and valid tool for measuring knowledge. We have provided the cards to the community for use as a classroom activity, as an assessment instrument, and/or as a useful research tool to probe student ideas about molecular biology.
- Published
- 2023
33. The unique catalytic properties of PSAT1 mediate metabolic adaptation to glutamine blockade
- Author
-
Qiu, Yijian, Stamatatos, Olivia T., Hu, Qingting, Ruiter Swain, Jed, Russo, Suzanne, Sann, Ava, Costa, Ana S. H., Violante, Sara, Spector, David L., Cross, Justin R., and Lukey, Michael J.
- Published
- 2024
34. Interaction between MED12 and ΔNp63 activates basal identity in pancreatic ductal adenocarcinoma
- Author
-
Maia-Silva, Diogo, Cunniff, Patrick J., Schier, Allison C., Skopelitis, Damianos, Trousdell, Marygrace C., Moresco, Philip, Gao, Yuan, Kechejian, Vahag, He, Xue-Yan, Sahin, Yunus, Wan, Ledong, Alpsoy, Aktan, Liverpool, Jynelle, Krainer, Adrian R., Egeblad, Mikala, Spector, David L., Fearon, Douglas T., dos Santos, Camila O., Taatjes, Dylan J., and Vakoc, Christopher R.
- Published
- 2024
35. Understanding the genetic complexity of puberty timing across the allele frequency spectrum
- Author
-
Kentistou, Katherine A., Kaisinger, Lena R., Stankovic, Stasa, Vaudel, Marc, Mendes de Oliveira, Edson, Messina, Andrea, Walters, Robin G., Liu, Xiaoxi, Busch, Alexander S., Helgason, Hannes, Thompson, Deborah J., Santoni, Federico, Petricek, Konstantin M., Zouaghi, Yassine, Huang-Doran, Isabel, Gudbjartsson, Daniel F., Bratland, Eirik, Lin, Kuang, Gardner, Eugene J., Zhao, Yajie, Jia, Raina Y., Terao, Chikashi, Riggan, Marjorie J., Bolla, Manjeet K., Yazdanpanah, Mojgan, Yazdanpanah, Nahid, Bradfield, Jonathan P., Broer, Linda, Campbell, Archie, Chasman, Daniel I., Cousminer, Diana L., Franceschini, Nora, Franke, Lude H., Girotto, Giorgia, He, Chunyan, Järvelin, Marjo-Riitta, Joshi, Peter K., Kamatani, Yoichiro, Karlsson, Robert, Luan, Jian’an, Lunetta, Kathryn L., Mägi, Reedik, Mangino, Massimo, Medland, Sarah E., Meisinger, Christa, Noordam, Raymond, Nutile, Teresa, Concas, Maria Pina, Polašek, Ozren, Porcu, Eleonora, Ring, Susan M., Sala, Cinzia, Smith, Albert V., Tanaka, Toshiko, van der Most, Peter J., Vitart, Veronique, Wang, Carol A., Willemsen, Gonneke, Zygmunt, Marek, Ahearn, Thomas U., Andrulis, Irene L., Anton-Culver, Hoda, Antoniou, Antonis C., Auer, Paul L., Barnes, Catriona L. K., Beckmann, Matthias W., Berrington de Gonzalez, Amy, Bogdanova, Natalia V., Bojesen, Stig E., Brenner, Hermann, Buring, Julie E., Canzian, Federico, Chang-Claude, Jenny, Couch, Fergus J., Cox, Angela, Crisponi, Laura, Czene, Kamila, Daly, Mary B., Demerath, Ellen W., Dennis, Joe, Devilee, Peter, De Vivo, Immaculata, Dörk, Thilo, Dunning, Alison M., Dwek, Miriam, Eriksson, Johan G., Fasching, Peter A., Fernandez-Rhodes, Lindsay, Ferreli, Liana, Fletcher, Olivia, Gago-Dominguez, Manuela, García-Closas, Montserrat, García-Sáenz, José A., González-Neira, Anna, Grallert, Harald, Guénel, Pascal, Haiman, Christopher A., Hall, Per, Hamann, Ute, Hakonarson, Hakon, Hart, Roger J., Hickey, Martha, Hooning, Maartje J., Hoppe, Reiner, Hopper, John L., Hottenga, Jouke-Jan, Hu, Frank B., Huebner, Hanna, Hunter, David J., Jernström, Helena, John, Esther M., Karasik, David, Khusnutdinova, Elza K., Kristensen, Vessela N., Lacey, James V., Lambrechts, Diether, Launer, Lenore J., Lind, Penelope A., Lindblom, Annika, Magnusson, Patrik K. E., Mannermaa, Arto, McCarthy, Mark I., Meitinger, Thomas, Menni, Cristina, Michailidou, Kyriaki, Millwood, Iona Y., Milne, Roger L., Montgomery, Grant W., Nevanlinna, Heli, Nolte, Ilja M., Nyholt, Dale R., Obi, Nadia, O’Brien, Katie M., Offit, Kenneth, Oldehinkel, Albertine J., Ostrowski, Sisse R., Palotie, Aarno, Pedersen, Ole B., Peters, Annette, Pianigiani, Giulia, Plaseska-Karanfilska, Dijana, Pouta, Anneli, Pozarickij, Alfred, Radice, Paolo, Rennert, Gad, Rosendaal, Frits R., Ruggiero, Daniela, Saloustros, Emmanouil, Sandler, Dale P., Schipf, Sabine, Schmidt, Carsten O., Schmidt, Marjanka K., Small, Kerrin, Spedicati, Beatrice, Stampfer, Meir, Stone, Jennifer, Tamimi, Rulla M., Teras, Lauren R., Tikkanen, Emmi, Turman, Constance, Vachon, Celine M., Wang, Qin, Winqvist, Robert, Wolk, Alicja, Zemel, Babette S., Zheng, Wei, van Dijk, Ko W., Alizadeh, Behrooz Z., Bandinelli, Stefania, Boerwinkle, Eric, Boomsma, Dorret I., Ciullo, Marina, Chenevix-Trench, Georgia, Cucca, Francesco, Esko, Tõnu, Gieger, Christian, Grant, Struan F. A., Gudnason, Vilmundur, Hayward, Caroline, Kolčić, Ivana, Kraft, Peter, Lawlor, Deborah A., Martin, Nicholas G., Nøhr, Ellen A., Pedersen, Nancy L., Pennell, Craig E., Ridker, Paul M., Robino, Antonietta, Snieder, Harold, Sovio, Ulla, Spector, Tim D., Stöckl, Doris, Sudlow, Cathie, Timpson, Nic J., Toniolo, Daniela, Uitterlinden, André, Ulivi, Sheila, Völzke, Henry, Wareham, Nicholas J., Widen, Elisabeth, Wilson, James F., Pharoah, Paul D. P., Li, Liming, Easton, Douglas F., Njølstad, Pål R., Sulem, Patrick, Murabito, Joanne M., Murray, Anna, Manousaki, Despoina, Juul, Anders, Erikstrup, Christian, Stefansson, Kari, Horikoshi, Momoko, Chen, Zhengming, Farooqi, I. Sadaf, Pitteloud, Nelly, Johansson, Stefan, Day, Felix R., Perry, John R. B., and Ong, Ken K.
- Published
- 2024
36. Early Successful Experiences of Surgical Conversion of Endoscopic Gastric Plication to Roux-en-Y Gastric Bypass
- Author
-
Shin, Thomas H., Bi, Danse, Jirapinyo, Pichamol, Thompson, Christopher C., Spector, David, and Tavakkoli, Ali
- Published
- 2024
37. Investigating the design, participation and experience of teaching and learning facilitated by user-generated microgames on an open educational platform
- Author
-
Rahmadi, Imam Fitri, Lavicza, Zsolt, Arkün Kocadere, Selay, Houghton, Tony, and Spector, Jonathan Michael
- Published
- 2024
38. Cervantes y los servicios de inteligencia: Espías e informantes en “La gitanilla”
- Author
-
Spector, Matías A.
- Published
- 2024
39. Objectives Are All You Need: Solving Deceptive Problems Without Explicit Diversity Maintenance
- Author
-
Boldi, Ryan, Ding, Li, and Spector, Lee
- Subjects
Computer Science - Neural and Evolutionary Computing - Abstract
Navigating deceptive domains has often been a challenge in machine learning due to search algorithms getting stuck at sub-optimal local optima. Many algorithms have been proposed to navigate these domains by explicitly maintaining diversity or equivalently promoting exploration, such as Novelty Search or other so-called Quality Diversity algorithms. In this paper, we present an approach with promise to solve deceptive domains without explicit diversity maintenance by optimizing a potentially large set of defined objectives. These objectives can be extracted directly from the environment by sub-aggregating the raw performance of individuals in a variety of ways. We use lexicase selection to optimize for these objectives as it has been shown to implicitly maintain population diversity. We compare this technique with a varying number of objectives to a commonly used quality diversity algorithm, MAP-Elites, on a set of discrete optimization as well as reinforcement learning domains with varying degrees of deception. We find that decomposing objectives into many objectives and optimizing them outperforms MAP-Elites on the deceptive domains that we explore. Furthermore, we find that this technique results in competitive performance on the diversity-focused metrics of QD-Score and Coverage, without explicitly optimizing for these things. Our ablation study shows that this technique is robust to different subaggregation techniques. However, when it comes to non-deceptive, or "illumination" domains, quality diversity techniques generally outperform our objective-based framework with respect to exploration (but not exploitation), hinting at potential directions for future work., Comment: Published at the Workshop on Agent Learning in Open-Endedness (ALOE) at NeurIPS 2023
- Published
- 2023
40. Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture
- Author
-
Fu, Daniel Y., Arora, Simran, Grogan, Jessica, Johnson, Isys, Eyuboglu, Sabri, Thomas, Armin W., Spector, Benjamin, Poli, Michael, Rudra, Atri, and Ré, Christopher
- Subjects
Computer Science - Machine Learning - Abstract
Machine learning models are increasingly being scaled in both sequence length and model dimension to reach longer contexts and better performance. However, existing architectures such as Transformers scale quadratically along both these axes. We ask: are there performant architectures that can scale sub-quadratically along sequence length and model dimension? We introduce Monarch Mixer (M2), a new architecture that uses the same sub-quadratic primitive along both sequence length and model dimension: Monarch matrices, a simple class of expressive structured matrices that captures many linear transforms, achieves high hardware efficiency on GPUs, and scales sub-quadratically. As a proof of concept, we explore the performance of M2 in three domains: non-causal BERT-style language modeling, ViT-style image classification, and causal GPT-style language modeling. For non-causal BERT-style modeling, M2 matches BERT-base and BERT-large in downstream GLUE quality with up to 27% fewer parameters, and achieves up to 9.1$\times$ higher throughput at sequence length 4K. On ImageNet, M2 outperforms ViT-b by 1% in accuracy, with only half the parameters. Causal GPT-style models introduce a technical challenge: enforcing causality via masking introduces a quadratic bottleneck. To alleviate this bottleneck, we develop a novel theoretical view of Monarch matrices based on multivariate polynomial evaluation and interpolation, which lets us parameterize M2 to be causal while remaining sub-quadratic. Using this parameterization, M2 matches GPT-style Transformers at 360M parameters in pretraining perplexity on The PILE--showing for the first time that it may be possible to match Transformer quality without attention or MLPs., Comment: NeurIPS 2023 (Oral)
- Published
- 2023
41. Quality Diversity through Human Feedback: Towards Open-Ended Diversity-Driven Optimization
- Author
-
Ding, Li, Zhang, Jenny, Clune, Jeff, Spector, Lee, and Lehman, Joel
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Neural and Evolutionary Computing - Abstract
Reinforcement Learning from Human Feedback (RLHF) has shown potential in qualitative tasks where easily defined performance measures are lacking. However, there are drawbacks when RLHF is commonly used to optimize for average human preferences, especially in generative tasks that demand diverse model responses. Meanwhile, Quality Diversity (QD) algorithms excel at identifying diverse and high-quality solutions but often rely on manually crafted diversity metrics. This paper introduces Quality Diversity through Human Feedback (QDHF), a novel approach that progressively infers diversity metrics from human judgments of similarity among solutions, thereby enhancing the applicability and effectiveness of QD algorithms in complex and open-ended domains. Empirical studies show that QDHF significantly outperforms state-of-the-art methods in automatic diversity discovery and matches the efficacy of QD with manually crafted diversity metrics on standard benchmarks in robotics and reinforcement learning. Notably, in open-ended generative tasks, QDHF substantially enhances the diversity of text-to-image generation from a diffusion model and is more favorably received in user studies. We conclude by analyzing QDHF's scalability, robustness, and quality of derived diversity metrics, emphasizing its strength in open-ended optimization tasks. Code and tutorials are available at https://liding.info/qdhf., Comment: ICML 2024
- Published
- 2023
42. Model-Agnostic Covariate-Assisted Inference on Partially Identified Causal Effects
- Author
-
Ji, Wenlong, Lei, Lihua, and Spector, Asher
- Subjects
Economics - Econometrics ,Mathematics - Statistics Theory ,Statistics - Methodology ,Statistics - Machine Learning ,62G15 (Primary), 62G05 (Secondary) ,G.3 ,I.2.m - Abstract
Many causal estimands are only partially identifiable since they depend on the unobservable joint distribution between potential outcomes. Stratification on pretreatment covariates can yield sharper partial identification bounds; however, unless the covariates are discrete with relatively small support, this approach typically requires consistent estimation of the conditional distributions of the potential outcomes given the covariates. Thus, existing approaches may fail under model misspecification or if consistency assumptions are violated. In this study, we propose a unified and model-agnostic inferential approach for a wide class of partially identified estimands, based on duality theory for optimal transport problems. In randomized experiments, our approach can wrap around any estimates of the conditional distributions and provide uniformly valid inference, even if the initial estimates are arbitrarily inaccurate. Also, our approach is doubly robust in observational studies. Notably, this property allows analysts to use the multiplier bootstrap to select covariates and models without sacrificing validity even if the true model is not included. Furthermore, if the conditional distributions are estimated at semiparametric rates, our approach matches the performance of an oracle with perfect knowledge of the outcome model. Finally, we propose an efficient computational framework, enabling implementation on many practical problems in causal inference., Comment: 59 pages, 4 figures
- Published
- 2023
43. A Symmetry-based Framework for Model Selection of Coral Reef Population Growth Models
- Author
-
Spector, Reemon
- Subjects
Quantitative Biology - Quantitative Methods ,Mathematics - Representation Theory ,Physics - Biological Physics ,92D25 (Primary) 22E70 (Secondary) - Abstract
The problem of selecting a model given a set of candidates remains a challenging one that pervades many scientific fields. We employ techniques from the theory of Lie groups to analyse the symmetries in differential equation models of population growth, with the aim of informing the model selection problem. To illustrate the use of Lie symmetries in model selection, we apply them to simulated data and to coral reef data from the Great Barrier Reef, demonstrating that the trivial symmetries can distinguish between candidate models. A method for finding locally optimal parameters for multi-parameter symmetries is presented, and the paper concludes with related results, some open problems, and avenues of further research., Comment: 15 pages, 5 figures
- Published
- 2023
44. Constructor algorithms for building unconventional computers able to solve NP-complete problems
- Author
-
McCaffrey, Tony, Gorochowski, Thomas E., and Spector, Lee
- Subjects
Computer Science - Emerging Technologies ,F.1, J.3 - Abstract
Nature often builds physical structures tailored for specific information processing tasks with computations encoded using diverse phenomena. These can sometimes outperform typical general-purpose computers. However, describing the construction and function of these unconventional computers is often challenging. Here, we address this by introducing constructor algorithms in the context of a robotic wire machine that can be programmed to build networks of connected wires in response to a problem and then act upon these to efficiently carry out a desired computation. We show how this approach can be used to solve the NP-complete Subset Sum Problem (SSP) and provide information about the number of solutions through changes in the voltages and currents measured across these networks. This work provides a foundation for building unconventional computers that encode information purely in the lengths and connections of electrically conductive wires. It also demonstrates the power of computing paradigms beyond digital logic and opens avenues to more fully harness the inherent computational capabilities of diverse physical, chemical and biological substrates., Comment: 14 pages, 4 figures
- Published
- 2023
45. Nonchromosomal birth defects and risk of childhood acute leukemia: An assessment in 15 000 leukemia cases and 46 000 controls from the Childhood Cancer and Leukemia International Consortium
- Author
-
Lupo, Philip J, Chambers, Tiffany M, Mueller, Beth A, Clavel, Jacqueline, Dockerty, John D, Doody, David R, Erdmann, Friederike, Ezzat, Sameera, Filippini, Tommaso, Hansen, Johnni, Heck, Julia E, Infante‐Rivard, Claire, Kang, Alice Y, Magnani, Corrado, Malagoli, Carlotta, Marcotte, Erin L, Metayer, Catherine, Bailey, Helen D, Mora, Ana M, Ntzani, Evangelia, Petridou, Eleni Th, Pombo‐de‐Oliveira, Maria S, Rashed, Wafaa M, Roman, Eve, Schüz, Joachim, Wesseling, Catharina, Spector, Logan G, and Scheurer, Michael E
- Subjects
Biomedical and Clinical Sciences ,Oncology and Carcinogenesis ,Rare Diseases ,Clinical Research ,Cancer ,Pediatric ,Hematology ,Pediatric Cancer ,Childhood Leukemia ,Pediatric Research Initiative ,2.1 Biological and endogenous factors ,Aetiology ,Child ,Humans ,Infant ,Risk Factors ,Leukemia ,Myeloid ,Acute ,Birth Weight ,Logistic Models ,Case-Control Studies ,Surveys and Questionnaires ,acute lymphoblastic leukemia ,acute myeloid leukemia ,birth defects ,childhood leukemia ,epidemiology ,Oncology & Carcinogenesis ,Oncology and carcinogenesis - Abstract
Although recent studies have demonstrated associations between nonchromosomal birth defects and several pediatric cancers, less is known about their role on childhood leukemia susceptibility. Using data from the Childhood Cancer and Leukemia International Consortium, we evaluated associations between nonchromosomal birth defects and childhood leukemia. Pooling consortium data from 18 questionnaire-based and three registry-based case-control studies across 13 countries, we used multivariable logistic regression models to estimate odds ratios (ORs) and 95% confidence intervals (CIs) for the association between a spectrum of birth defects and leukemia. Our analyses included acute lymphoblastic leukemia (ALL, n = 13 115) and acute myeloid leukemia (AML, n = 2120) cases, along with 46 172 controls. We used the false discovery rate to account for multiple comparisons. In the questionnaire-based studies, the prevalence of birth defects was 5% among cases vs 4% in controls, whereas, in the registry-based studies, the prevalence was 11% among cases vs 7% in controls. In pooled adjusted analyses, there were several notable associations, including (1) digestive system defects and ALL (OR = 2.70, 95% CI: 1.46-4.98); (2) congenital anomalies of the heart and circulatory system and AML (OR = 2.86, 95% CI: 1.81-4.52) and (3) nervous system defects and AML (OR = 4.23, 95% CI: 1.50-11.89). Effect sizes were generally larger in registry-based studies. Overall, our results could point to novel genetic and environmental factors associated with birth defects that could also increase leukemia susceptibility. Additionally, differences between questionnaire- and registry-based studies point to the importance of complementary sources of birth defect phenotype data when exploring these associations.
- Published
- 2024
46. An experimental assessment of the nall lexical gap
- Author
-
Maldonado, Mora, Zhou, Ruizhe, and Spector, Benjamin
- Subjects
Semantics - Abstract
Universal constraints on word meaning apply to both lexical and logical words. Across languages, a well-known gap in the logical vocabulary is that 'not all' is never lexicalized. This gap extends beyond determiners to the modal and temporal domains; e.g. 'not must' and 'not always' are typically not lexicalized (Horn 1973). The challenge is to explain this gap. The non-lexicalization of 'not all' has been explained as resulting from a cognitive bias against intrinsically marked meanings (e.g., Katzir and Singh 2013). Recent alternative accounts, however, have explained this same gap relying on considerations of communicative efficiency rather than cognitive markedness (e.g., Enguehard and Spector 2021). In a series of word learning experiments, we disentangle these views by testing whether learners are more likely to infer that a novel word means 'some' rather than 'not all' and whether this varies depending on the communicative needs in the context.
- Published
- 2024
47. La libertad entendida como cooperación social: desarrollando un nuevo enfoque
- Author
-
Ezequiel Spector
- Subjects
libertad ,cooperación social ,liberalismo ,Law in general. Comparative and uniform law. Jurisprudence ,K1-7720 ,Jurisprudence. Philosophy and theory of law ,K201-487 - Abstract
En este trabajo, argumento que las principales concepciones de la libertad (las diferentes variantes de la concepción negativa, positiva y republicana) no le han prestado la debida atención a un componente esencial de la libertad: la cooperación social. En este sentido, desarrollo una concepción de la libertad que le da a la cooperación social un papel protagónico: «libertad entendida como cooperación social» (LCS). Argumento que la actividad de elaborar y realizar planes de vida no puede comprenderse en ausencia de un entorno social donde las personas colaboran entre sí. De acuerdo con este enfoque, la cooperación social es constitutiva de la libertad. Sostengo, por otra parte, que LCS reúne elementos atractivos de concepciones previas de la libertad, pero sin dejar de ser una concepción autónoma que descansa en la idea de cooperación social.
- Published
- 2024
48. Visual-Verbal Journals, Literature, and Literacies of Well-Becoming
- Author
-
Karen Spector, James S. Chisholm, Krista Griffin, Kathryn F. Whitmore, Al Cassada, Jennifer Orosco, Taylor Brow, and Andria Regan
- Abstract
Critically-oriented teacher education has been under assault in the United States (U.S.), England, and Australia through policies that have had a chilling effect on teaching critical race theory, gender, and sexuality. We are concerned that these reactionary movements will further distort the histories, lives, and humanity of minoritized groups while reinforcing the single storylines of dominant groups. Our post-qualitative inquiry demonstrates how four literacy teacher education instructors and four preservice literacy teachers from various regions of the U.S. used visual-verbal journals (VVJs) and quality literature in critically-oriented, artful pedagogy to disrupt normative forces in teacher education. Data analyses were informed by the philosophy of Deleuze and Guattari, particularly the concepts of "becoming" and "health," which have explanatory power over affective encounters across the four different sites. We focus on encounters that produced "literacies of well-becoming," which are reading, composing, and thinking processes that multiply the ways learners: 1) encounter self and other; 2) relate to histories and sociopolitical forces; and 3) circulate life-affirming practices. This article provides affirmative examples of how affects can produce health, which for Deleuze involves distributed capacities to break out of well-worn grooves of habit by connecting to the world in new ways.
- Published
- 2024
49. Help-Seeking among College Survivors of Dating and Sexual Violence: A Qualitative Exploration of Utilization of University-Based Victim Services
- Author
-
Julia Cusano, Leila Wood, Roxanna S. Ast, Sarah McMahon, Jordan J. Steiner, and Cassie Spector
- Abstract
Objective: Study uses qualitative data to examine help-seeking decisions as well as the drivers and barriers to utilization of university-based victim services through the accounts of survivors. Participants: The current study involves the analysis of 33 semi-structured interviews that were conducted with dating and sexual violence (DSV) survivors at a large, Mid-Atlantic University who both did and did not utilize university-based victim services. Methods: Data were analyzed using a thematic analysis approach. Results: Analysis shows that while survivors of DSV undergo a process of help-seeking that is similar to those described in previous help-seeking models, there are additional factors that contribute to a reluctancy to seek services at a university-based victim services center in particular that must be accounted for in the literature. Conclusions: The findings from the current study underscore the importance of understanding the specific drivers and barriers to utilization of university-based victim services.
- Published
- 2024
50. Accelerating LLM Inference with Staged Speculative Decoding
- Author
-
Spector, Benjamin and Re, Chris
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Computation and Language - Abstract
Recent advances with large language models (LLM) illustrate their diverse capabilities. We propose a novel algorithm, staged speculative decoding, to accelerate LLM inference in small-batch, on-device scenarios. We address the low arithmetic intensity of small-batch inference by improving upon previous work in speculative decoding. First, we restructure the speculative batch as a tree, which reduces generation costs and increases the expected tokens per batch. Second, we add a second stage of speculative decoding. Taken together, we reduce single-batch decoding latency by 3.16x with a 762M parameter GPT-2-L model while perfectly preserving output quality. Comment: Published at ES-FOMO at ICML 2023
- Published
- 2023