Search

Your search keyword '"Baldwin P"' showing total 117 results

Search Constraints

You searched for: Author "Baldwin P"; Topic: computer science - computation and language

Search Results

1. Arabic Dataset for LLM Safeguard Evaluation

2. ToolGen: Unified Tool Retrieval and Calling via Generation

3. Loki: An Open-Source Tool for Fact Verification

4. Unconditional Truthfulness: Learning Conditional Dependency for Uncertainty Quantification of Large Language Models

5. LLM Stability: A detailed analysis with some surprises

6. To Aggregate or Not to Aggregate. That is the Question: A Case Study on Annotation Subjectivity in Span Prediction

7. Inference-Time Selective Debiasing

8. Answering real-world clinical questions using large language model based systems

9. Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs

10. Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph

11. Evaluating Evidence Attribution in Generated Fact Checking Explanations

12. Self-Regulated Data-Free Knowledge Amalgamation for Text Classification

13. IndoCulture: Exploring Geographically-Influenced Cultural Commonsense Reasoning Across Eleven Indonesian Provinces

14. Revisiting subword tokenization: A case study on affixal negation in large language models

15. Against The Achilles' Heel: A Survey on Red Teaming for Generative Models

16. A Little Leak Will Sink a Great Ship: Survey of Transparency for Large Language Models from Start to Finish

17. Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification

18. MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT

19. PALO: A Polyglot Large Multimodal Model for 5B People

20. Eagle: Ethical Dataset Given from Real Interactions

21. ArabicMMLU: Assessing Massive Multitask Language Understanding in Arabic

22. BiMediX: Bilingual Medical Mixture of Experts LLM

23. Emergent Word Order Universals from Cognitively-Motivated Language Models

24. A Chinese Dataset for Evaluating the Safeguards in Large Language Models

25. Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language Models as Agents

26. Zero-shot Sentiment Analysis in Low-Resource Languages Using a Multilingual Sentiment Lexicon

27. Evaluating Gender Bias in Large Language Models via Chain-of-Thought Prompting

28. Location Aware Modular Biencoder for Tourism Question Answering

29. Demystifying Instruction Mixing for Fine-tuning Large Language Models

30. LLM360: Towards Fully Transparent Open-Source LLMs

31. Psychometric Predictive Power of Large Language Models

32. LM-Polygraph: Uncertainty Estimation for Language Models

33. Multi-EuP: The Multilingual European Parliament Dataset for Analysis of Bias in Information Retrieval

34. Robustness Tests for Automatic Machine Translation Metrics with Adversarial Attacks

35. Unsupervised Lexical Simplification with Context Augmentation

36. Factuality Challenges in the Era of Large Language Models

37. Large Language Models Only Pass Primary School Exams in Indonesia: A Comprehensive Test on IndoMMLU

38. Are Multilingual LLMs Culturally-Diverse Reasoners? An Investigation into Multicultural Proverbs and Sayings

39. Connecting the Dots in News Analysis: Bridging the Cross-Disciplinary Disparities in Media Bias and Framing

40. Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models

41. Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs

42. Collective Human Opinions in Semantic Textual Similarity

43. CMMLU: Measuring massive multitask language understanding in Chinese

44. Language models are not naysayers: An analysis of language models on negation benchmarks

45. Unsupervised Paraphrasing of Multiword Expressions

46. Bactrian-X: Multilingual Replicable Instruction-Following Models with Low-Rank Adaptation

47. Fair Enough: Standardizing Evaluation and Model Selection for Fairness Research in NLP

48. Does Vision Accelerate Hierarchical Generalization in Neural Language Learners?

49. NusaCrowd: Open Source Initiative for Indonesian NLP Resources

50. Systematic Evaluation of Predictive Fairness
