The amount of dynamic information in the world greatly exceeds the capacity to attend to and represent this information. However, due to efficient voluntary and/or involuntary allocation of resources, many tasks can be performed despite these cognitive limits. For example, when driving a car, the position of many other cars around the driver, the colour of the traffic lights, and the possible presence of pedestrians on the side of the road must be monitored, which requires the recruitment of visual working memory (VWM), a short-term store of visual information limited in capacity to about three to four units of information (Alvarez & Cavanagh, 2004; Luck & Vogel, 1997). Although there is far more visual information available than can be represented in VWM at any given time, some parts of the visual world are very likely to change (e.g., a traffic light is likely to change from red to green) whereas other parts are less likely to change (e.g., a stop sign is unlikely to change colour). Therefore, selectively attending to and storing objects or features that are likely to change is an efficient strategy to improve the ability to detect changes in the visual environment. This selection can be guided by explicit task instructions, or by probability information that has been learned incidentally (Beck, Angelone, & Levin, 2004; Beck, Angelone, Levin, Peterson, & Varakin, 2008; Droll, Gigone, & Hayhoe, 2007; van Lamsweerde & Beck, 2011). We examined how the ability to incidentally learn and use probability information to efficiently allocate VWM resources is affected by whether this information is likely to be initially maintained in VWM.

Grouping in Visual Working Memory

Although capacity in VWM is very limited, capacity can be maximized by combining multiple pieces of information into a single unit. For example, a red ball is composed of several spatially connected features from different dimensions: colour, shape, size, texture, and so forth.
One way of maximizing VWM capacity is to represent all of these connected features together in a single representation (e.g., a "red ball": Alvarez & Cavanagh, 2004; Luck & Vogel, 1997; Luria & Vogel, 2011; Vogel, Woodman, & Luck, 2001).

However, while the spatial connection of features is a useful perceptual cue for grouping together two features (Xu, 2006), gestalt grouping principles such as proximity (Woodman, Vecera, & Luck, 2003), similarity (Peterson & Berryhill, 2013), or closure (Anderson, Vogel, & Awh, 2013) may also be used to group together multiple features in VWM. For example, in a change detection task (in which participants determine whether a change occurs between a memory display of items and a subsequent test display), performance is higher when the memory display contains multiple squares of the same colour (similarity-grouping cue) than when all of the colours are unique. This suggests that identical colours are grouped together into a single unit in VWM (Peterson & Berryhill, 2013). Furthermore, Anderson et al. (2013) found that contralateral delay activity (an ERP marker of the number of items in VWM: Vogel & Machizawa, 2004) decreased when closure cues were present. These data suggest that features can be grouped together in VWM via gestalt grouping cues.

What about when two conflicting grouping cues are available? For example, a display with a red oval and a red triangle could potentially be maintained in two ways: as a "red oval" and a "red triangle" (using the spatial connection cue) or as a grouping of "red" information (using the feature similarity cue), resulting in separate or no representation for the shape information. If both the colour and shape are task-relevant, then the spatial connection cue would be the most efficient use of VWM capacity, because all of the features can be remembered in two "coloured shape" VWM representations (assuming a capacity of at least two units; Luck & Vogel, 1997). …