Abstract
When two identical visual discs move toward each other on a two-dimensional visual display, they can be perceived as either “streaming through” or “bouncing off” each other after their coincidence. Previous studies have observed a strong bias toward the streaming percept. Additionally, the incidence of the bouncing percept in this ambiguous display can be increased by various factors, such as a brief sound at the moment of coincidence or a momentary pause of the two discs. The streaming/bouncing bistable motion phenomenon has been studied intensively since its discovery. However, little is known regarding the neural basis underlying the perceptual ambiguity in the classic version of the streaming/bouncing motion display. The present study used event-related potential (ERP) recordings to investigate the neural basis of perceptual disambiguation during the processing of the streaming/bouncing bistable motion display. Surprisingly, the amplitude of the frontal-central P2 (220–260 ms after motion onset), which was elicited by the moving discs ~200 ms before their coincidence, was predictive of the subsequent streaming or bouncing percept. A larger P2 amplitude was observed for the streaming percept than for the bouncing percept. These findings suggest that the streaming/bouncing bistable perception may have been disambiguated unconsciously ~200 ms before the coincidence of the two discs.
Introduction
The human visual system often experiences unitary and stable perception by integrating relevant sensory information in the environment. However, on occasions when the available sensory information is fragmented or ambiguous, our perception may alternate between two or more mutually exclusive interpretations1. Well-known examples of this bistable perception include the famous face-vase drawing, Necker cube and binocular rivalry. Additionally, bistable perception occurs not only in the static figures mentioned above, but in moving objects as well. Consider two identical visual targets moving toward each other along the same horizontal line with equal and constant speed in a two-dimensional display: Two visual targets start their motion from opposite sides, coincide at the center of the screen, move apart, and stop at each other’s starting point. In this display, observers typically perceive the motion of the targets after coincidence as either “streaming through” or “bouncing off” each other2.
Despite the ambiguous nature of the streaming/bouncing motion display, subjects usually show a strong bias to report the streaming percept (~80% streaming response vs. ~20% bouncing response)3,4,5. Interestingly, a large number of behavioral studies have reported various factors that can reverse perceptual dominance from streaming toward bouncing, such as a momentary pause of the two discs3, 5, a brief sound6,7,8,9,10,11,12,13,14, a transient visual distractor4, 15, and even a transient tactile vibration16 at the moment of coincidence of the two discs, as well as post-coincidence trajectory duration17, 18, pre-coincidence trajectory switches (used to manipulate expectation)19, a partial overlap20, 21 of the two discs, and an orientation difference between the stripes on the discs and their path of motion15. Several neuroscience studies have explored the neural mechanisms of the effect of sound on the streaming/bouncing motion display (i.e., the audiovisual bounce-inducing effect, ABE). For example, an event-related functional magnetic resonance imaging (fMRI) study has shown that a transient sound induced higher activation in multimodal areas (e.g., the prefrontal and posterior parietal cortex) when subjects perceived bouncing than when they perceived streaming22. An electroencephalography (EEG) study found increased beta-rhythm synchronization across frontal, parietal, and occipital cortex and gamma-rhythm synchronization across central and temporal regions for bouncing trials compared to streaming trials when the coincident sound was introduced23. Furthermore, using transcranial magnetic stimulation (TMS) to temporarily deactivate the posterior parietal cortex attenuated the effect of the sound in promoting the bouncing percept24. Taken together, these neuroscience studies suggest that cross-modal integration between polysensory and unisensory cortices may underlie the effect of sound on reversing the perceptual dominance of the streaming/bouncing motion display.
Despite the numerous identified bounce-inducing factors, and especially the evidence on the neural mechanisms of the audiovisual bounce-inducing effect (ABE), the neural basis of perceptual disambiguation in the purely visual streaming/bouncing display remains unclear. In other words, it is not understood how one percept (streaming or bouncing) overwhelms the other and eventually becomes a stable percept under an ambiguous context in a visual streaming/bouncing bistable display. Sekuler and Sekuler as well as Watanabe and Shimojo have proposed that the human perceptual system might utilize experience of the three-dimensional world in a probabilistic way to derive a stable percept in a two-dimensional visual streaming/bouncing motion display5, 7, 17. For example, Watanabe and Shimojo argued that moving objects in a natural environment are seldom (but still possibly) aligned at the same depth plane7, 17. In other words, a bouncing event in a three-dimensional world would occur only when the two moving balls lie in the same depth plane (which is less likely but still occurs occasionally). Our perceptual system may compute this probability when facing a visual streaming/bouncing motion display, so that the streaming percept is dominant while the display remains potentially ambiguous. Although this probabilistic inference account is highly reasonable and is supported by several studies13, 19, 20, there is little experimental evidence to support it in the context of a purely visual streaming/bouncing motion display19, 25.
In previous bistable perception studies, a traditional view proposed that antagonistic activity within the visual system is a neural consequence of spontaneous perceptual reversals26, 27. Consistent with this viewpoint, many studies have found neural activity in the primary visual cortex28,29,30,31,32, lateral geniculate nucleus33, 34, and extrastriate visual cortex35,36,37,38,39 that correlates with perceptual outcomes or perceptual transitions in ambiguous displays. However, a growing number of fMRI studies have demonstrated that the frontal and parietal cortex play a causal role in initiating perceptual reversals in bistable displays37, 40,41,42,43, which indicates that high-level brain areas could reorganize ambiguous information within the visual cortex and eventually reverse our perceptual interpretations1, 44, 45. On the other hand, a series of event-related potential (ERP) studies have identified two successive ERP components (difference wave) related to perceptual reversals of bistable stimuli, which are the occipital-parietal “reversal negativity” (RN)46,47,48,49,50,51 and the central-parietal “late positive complex” (LPC)47,48,49,50, 52. These phenomena were discovered by comparing trials on which perceptual outcome was reversed to trials on which the perceptual outcome remained the same across successive trials. The RN (170–300 ms post-stimulus) is considered to be an early index of switches between neural representations that synthesize the current contents of conscious perception, while the LPC (400–600 ms) is thought to represent post-perceptual processing related to evaluating and reporting perceptual outcomes49, 53, 54. Therefore, the emerging brain model of bistable perception is a dynamic and highly interactive network between low-level (sensory) and high-level (frontal and parietal) brain regions.
Apart from the reversal-related brain activity associated with the perceptual reversals mentioned above, the percept-related brain activity associated with deriving the ongoing stable percept should also be highlighted in bistable perception, especially in the case of the visual streaming/bouncing display, because “streaming through” and “bouncing off” of the two visual discs after coincidence are completely opposite perceptual states. Therefore, exploring the neural basis of streaming and bouncing percepts in the visual streaming/bouncing display can contribute to a better understanding of how and when one perceptual state prevails over the other and eventually becomes a stable percept in an ambiguous situation, complementing the well-known effect of sound on reversing the perceptual dominance of the streaming/bouncing display (ABE). Using high-density ERP recordings, the present study investigated the brain dynamics of streaming/bouncing bistable motion processing. A trial-based analysis was performed to compare neural activity on trials on which subjects reported streaming percepts (streaming trials) with trials on which they reported bouncing percepts (bouncing trials). Surprisingly, the amplitude of the P2 component over the frontal-central scalp ~200 ms before the coincidence of the two discs was found to be larger for the streaming percept than for the bouncing percept. These results demonstrate that brain activity ~200 ms before the coincidence of the two discs was predictive of the subsequent perceptual outcome of the visual streaming/bouncing display.
Materials and Methods
Participants
A total of 23 healthy subjects participated in this study after giving informed consent as required and approved by the Human Research Protections Program of SooChow University. All methods were carried out in accordance with the relevant guidelines and regulations. All participants had normal or corrected-to-normal vision and were naive to the purpose of the experiment. Data from 5 subjects were eliminated due to an inadequate number of trials (less than 60 trials) for one of the perception outcomes (streaming or bouncing), leaving data of 18 subjects (11 female, mean age of 21.6 years) for further analysis.
Stimuli and procedure
The experiment was conducted in a dimly lit, sound-attenuated chamber. All stimuli were constructed and scripted using “Presentation” software (Neurobehavioral Systems, version 18.0) and were presented on a 27-inch LCD monitor (ASUS VG278HE, refresh rate 100 Hz, resolution 1920 × 1080). A small red cross (0.3° × 0.3° of visual angle) served as a fixation point and was presented at the center of the gray background (luminance of 10 cd/m²) throughout each block. In 66.7% of trials (response-required trials), two identical black discs (each 1.05° in diameter) were initially presented at opposite edges of the background. The discs were separated by 18.9° (visual angle) horizontally and placed 3.46° above the fixation cross on the first frame (the duration of each frame was 50 ms). From frame 2 through 9, the two discs moved toward each other along the same horizontal path. Each frame was presented immediately after the offset of the preceding frame (i.e., the frame-to-frame SOA was 50 ms). On frame 10 (450 ms after the onset of the first frame), the two discs visually coincided above the fixation cross. From frame 11 through 19, the discs moved apart from each other and stopped at the starting point of the opposite disc (Fig. 1). Given the initial distance of 18.9°, the duration of 50 ms for each frame, and the total of 19 frames, the movement of the two discs took 900 ms at a constant speed of 21°/s (i.e., 1.05° per frame). In the other 33.3% of trials (“catch” trials), the stimuli presented on frames 1 to 9 were exactly the same as on the response-required trials. However, no stimulus was presented from frame 10 through 19 (i.e., frames 10 to 19 were blank) on catch trials. In other words, the discs moved toward each other and then suddenly disappeared just before their coincidence, which produced neither a streaming nor a bouncing percept.
These catch trials were included in the experiment to ensure that subjects were responding veridically based on their perceptual outcomes after the coincidence event occurred and not simply based on guesswork before that event occurred.
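The frame timing and disc positions described above can be checked with a short calculation. The following Python sketch is an illustration only, not the Presentation script used in the study; the function name `disc_positions` and the constant names are hypothetical and simply restate the values given in the text (18.9° initial separation, 19 frames, 50 ms per frame).

```python
FRAME_MS = 50            # duration of each frame (from the text)
N_FRAMES = 19            # frames are numbered 1..19
SEPARATION_DEG = 18.9    # initial horizontal separation of the two discs
COINCIDENCE_FRAME = 10   # frame on which the discs overlap

def disc_positions(frame):
    """Return (x_a, x_b) in degrees from the screen centre for a 1-based frame.

    Disc A starts on the left and disc B on the right; both move at the
    same constant speed along the same horizontal line.
    """
    step = SEPARATION_DEG / (N_FRAMES - 1)      # displacement per frame
    x_a = -SEPARATION_DEG / 2 + step * (frame - 1)
    return x_a, -x_a

step_deg = SEPARATION_DEG / (N_FRAMES - 1)        # 1.05 deg per frame
speed_deg_per_s = step_deg / (FRAME_MS / 1000)    # 21 deg/s
coincidence_ms = (COINCIDENCE_FRAME - 1) * FRAME_MS   # 450 ms after onset
travel_ms = (N_FRAMES - 1) * FRAME_MS                 # 900 ms of motion
```

With 18 frame-to-frame steps, each disc covers the full 18.9° in 900 ms, and both discs reach the centre on frame 10, which matches the 450 ms coincidence latency stated in the text.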
Subjects observed the stimuli display from a distance of 85 cm while holding their gaze on the fixation cross during the experiment. Their assigned task was to report whether the two discs appeared to be “streaming through” or “bouncing off” each other after coincidence based on their intuition during response-required trials by pressing one of two buttons on a keyboard. The response buttons for “streaming” and “bouncing” percepts were counterbalanced across participants. No responses were required when the motion of the two discs disappeared before coincidence on catch trials. Both types of trials occurred on each block with a given probability (response-required trials: 66.7%; catch trials: 33.3%) in a randomized sequence and the intertrial intervals (ITI) varied from 1200 to 1600 ms (Fig. 1). The whole experiment consisted of 15 blocks of 60 trials each and subjects were allowed to take a short break after finishing each block.
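The block structure described above can be sketched as follows. This is a hedged illustration rather than the actual experiment script: it assumes a fixed 40/20 split of response-required and catch trials within each 60-trial block and a uniform distribution of inter-trial intervals, neither of which is stated explicitly in the text, and `make_block` is a hypothetical helper name.

```python
import random

def make_block(n_trials=60, p_response=2 / 3, iti_range_ms=(1200, 1600), rng=None):
    """Return a shuffled list of (trial_type, iti_ms) tuples for one block.

    Assumptions (not from the source): the proportion is enforced exactly
    within each block, and ITIs are drawn uniformly from the stated range.
    """
    if rng is None:
        rng = random.Random()
    n_response = round(n_trials * p_response)
    kinds = ["response"] * n_response + ["catch"] * (n_trials - n_response)
    rng.shuffle(kinds)
    return [(kind, rng.randint(*iti_range_ms)) for kind in kinds]

rng = random.Random(0)  # fixed seed so the example is reproducible
blocks = [make_block(rng=rng) for _ in range(15)]
n_response_total = sum(kind == "response" for b in blocks for kind, _ in b)
n_catch_total = sum(kind == "catch" for b in blocks for kind, _ in b)
```

Under these assumptions, 15 blocks of 60 trials give 900 trials in total, of which 600 are response-required and 300 are catch trials.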
Electrophysiological recordings and analysis
The electroencephalogram (EEG) was recorded continuously using a 64-channel tin-electrode cap (Quik-Cap, NeuroScan, Inc.) based on an extended 10–20 system montage. Standard 10–20 sites were FP1, FPz, FP2, F7, F3, Fz, F4, F8, T7, C3, Cz, C4, T8, P7, P3, Pz, P4, P8, O1, Oz and O255. Additional intermediate sites were AF3, AF4, F5, F1, F2, F6, FC7, FC5, FC3, FC1, FCz, FC2, FC4, FC6, FC8, C5, C1, C2, C6, TP7, CP5, CP3, CP1, CPz, CP2, CP4, CP6, TP8, P5, P1, P2, P6, PO7, PO5, PO3, POz, PO4, PO6, PO8, CB1 and CB2. Horizontal eye movements were monitored by two bipolar electrodes at the left and right external canthi (horizontal EOG). Vertical eye movements and blinks were monitored via two bipolar electrodes above and below the left eye (vertical EOG). The left mastoid electrode served as the reference and all electrode impedances were kept below 5 kΩ during data acquisition. The raw EEG and EOG signals were amplified with a gain of 10,000, filtered with an amplifier bandpass of 0.05–100 Hz, and were digitized with a sampling rate of 1000 Hz. The EEG signals on response-required trials were averaged in 1200 ms epochs time-locked to the onset of the two visual discs (i.e. time-locked to the onset of frame 1, see Fig. 1) with a 200-ms pre-stimulus baseline. Epochs contaminated by eye movements, eye blinks, muscle activity, or amplifier blocking were discarded before averaging, leaving a total of 512 ± 11 valid epochs (mean ± SE) on response-required trials for averaging. The resulting averaged ERP waveforms were then digitally low-pass filtered (3 dB cutoff at 30 Hz) to remove high-frequency noise produced by muscle movements and external electrical sources. After filtering, all averaged ERP waveforms were then re-referenced to the algebraic average of the left and right mastoid electrodes.
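The baseline correction and mastoid re-referencing described above can be illustrated with a small, self-contained Python sketch. It is not the authors' pipeline, which the text does not specify in detail; the function names are hypothetical and the data are toy values. The re-referencing step relies on simple algebra: because the online reference is the left mastoid, its own recorded signal is zero by construction, so re-referencing to the mean of the two mastoids amounts to subtracting half of the right-mastoid signal from each channel.

```python
from statistics import fmean

SFREQ_HZ = 1000     # sampling rate stated in the text
BASELINE_MS = 200   # pre-stimulus baseline stated in the text

def baseline_correct(epoch, baseline_ms=BASELINE_MS, sfreq_hz=SFREQ_HZ):
    """Subtract the mean of the pre-stimulus samples from every sample.

    `epoch` is a list of voltages whose first `baseline_ms` milliseconds
    precede stimulus onset.
    """
    n_base = baseline_ms * sfreq_hz // 1000
    offset = fmean(epoch[:n_base])
    return [v - offset for v in epoch]

def reref_to_mean_mastoids(channel, right_mastoid):
    """Re-reference a left-mastoid-referenced channel to the mastoid average.

    With the left mastoid as online reference (signal == 0), the new
    reference is (0 + M2) / 2, so each sample becomes v - M2 / 2.
    """
    return [v - m2 / 2 for v, m2 in zip(channel, right_mastoid)]

# Toy data: a flat 1 uV baseline followed by a 3 uV post-stimulus plateau.
toy_epoch = [1.0] * 200 + [3.0] * 10
corrected = baseline_correct(toy_epoch)               # baseline -> 0, plateau -> 2
rereferenced = reref_to_mean_mastoids([4.0, 6.0], [2.0, -2.0])  # [3.0, 7.0]
```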
To examine the neural activities of the streaming percept and bouncing percept, ERPs recorded on the response required trials were analyzed on the basis of the subjects’ perceptual responses. A trial-by-trial analysis was performed by separating the response required trials on which subjects reported streaming percept (streaming trials) from trials on which bouncing percept (bouncing trials) were reported. ERP waveforms were averaged separately for streaming trials and bouncing trials. On average, there were 315 ± 22 (mean ± SE) streaming trials and 197 ± 18 bouncing trials. The main ERP components in both streaming and bouncing trials were quantified as the mean amplitudes with respect to a 200-ms pre-stimulus baseline over the following time windows and with the following electrodes: (1) C1 component (75–95 ms, measured over POz, PO3, PO4, Oz, O1, O2); (2) P1 component (120–145 ms, measured over the same electrodes as the C1 component); (3) N1 component (170–200 ms, measured over the same electrodes as C1 component); (4) P2 component (220–260 ms, measured over FCz, FC1, FC2, Cz, C1, C2); (5) N400 component (350–440 ms, measured over the same electrodes as P2 component); (6) N600 component (560–660 ms, measured over the same electrodes as P2 component); and (7) P3 component (800–900 ms, measured over CPz, CP1, CP2, Pz, P1, P2). These time windows and their corresponding measurement electrodes were chosen because each ERP component had its maximal amplitude over its given time windows and electrodes. After quantifying these main ERP components, the mean amplitudes of each ERP component were then subject to a two-way repeated-measure ANOVA with factors of perceptual response (streaming/bouncing) and the electrode (channels listed above for each ERP component), respectively. When appropriate, ANOVA results were corrected using the Greenhouse-Geisser procedure. 
Post-hoc comparisons for the main effects of electrode were made to determine the significance of pair-wise contrasts when appropriate, using the Bonferroni adjustment for multiple comparisons.
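As an illustration of the quantification described above, the sketch below computes a mean amplitude over a latency window and the F statistic for a two-level within-subject factor. It is a simplified stand-in, not the analysis software used in the study: for a factor with only two levels, such as perceptual response, the repeated-measures main effect F(1, n − 1) equals the squared paired t statistic computed on each subject's mean across electrodes, which is what `paired_f` uses. The function names and the toy amplitudes are hypothetical.

```python
from math import sqrt
from statistics import fmean, stdev

def mean_amplitude(waveform_uv, window_ms, baseline_ms=200, sfreq_hz=1000):
    """Mean voltage in an inclusive post-stimulus window (start_ms, stop_ms).

    `waveform_uv` starts `baseline_ms` before stimulus onset.
    """
    start, stop = window_ms
    i0 = (start + baseline_ms) * sfreq_hz // 1000
    i1 = (stop + baseline_ms) * sfreq_hz // 1000
    return fmean(waveform_uv[i0:i1 + 1])

def paired_f(cond_a, cond_b):
    """F statistic and degrees of freedom for a two-level within-subject effect.

    Each list holds one value per subject (e.g., the mean amplitude averaged
    across electrodes); F is the square of the paired t statistic.
    """
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    t = fmean(diffs) / (stdev(diffs) / sqrt(n))
    return t * t, (1, n - 1)

# Toy waveform whose value at each sample equals its latency in ms.
ramp = [float(t) for t in range(-200, 1000)]
p2_mean = mean_amplitude(ramp, (220, 260))   # mean of 220..260 -> 240.0

# Hypothetical per-subject amplitudes (uV) for four subjects, NOT study data.
f_value, df = paired_f([3.0, 4.0, 5.0, 6.0], [2.0, 3.5, 4.0, 5.5])
```

The electrode main effect and the response × electrode interaction involve more than two levels and therefore require the full repeated-measures model with a sphericity correction, which this sketch does not implement.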
Results
Behavioral results
The percentages of streaming and bouncing responses, and the reaction times (measured relative to the moment of coincidence, i.e., the onset of frame 10) on streaming and bouncing trials, were each compared by ANOVA. Consistent with the findings of previous behavioral studies that the streaming percept dominates in the original visual streaming/bouncing motion display3,4,5, the percentage of streaming responses in our study was significantly higher than that of bouncing responses [streaming, 61.2 ± 3.6% (mean ± SE) of response-required trials; bouncing, 38.8 ± 3.6% of response-required trials; F(1, 17) = 9.58, p < 0.007, ηp² = 0.36]. Although there was a trend toward faster reaction times for streaming responses than for bouncing responses (streaming, 524 ± 20 ms; bouncing, 538 ± 21 ms), which appears to be consistent with the observation that the dominant percept (i.e., streaming) in the visual streaming/bouncing display leads to shorter reaction times10, this difference did not reach significance [F(1, 17) = 2.13, p = 0.163, ηp² = 0.11] in the present study.
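The partial eta squared values reported here follow from the F statistics and their degrees of freedom through the standard identity: partial η² = (F × df_effect) / (F × df_effect + df_error). A minimal Python sketch (the function name is hypothetical) reproduces the two behavioral effect sizes:

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its dfs."""
    return f_value * df_effect / (f_value * df_effect + df_error)

# Response percentage: F(1, 17) = 9.58
eta_percentage = round(partial_eta_squared(9.58, 1, 17), 2)   # 0.36
# Reaction time: F(1, 17) = 2.13
eta_rt = round(partial_eta_squared(2.13, 1, 17), 2)           # 0.11
```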
ERP results
Pre-coincidence P2 amplitudes predict streaming/bouncing percepts
Time-locking to the onset of the two discs (i.e., frame 1 onset; see Fig. 1) allowed us to examine the neural-activity patterns before the coincidence. Due to the two visual discs being placed 3.46° above the fixation cross (Fig. 1), these upper visual field stimuli elicited a typically negative C1 component peaking at 75–95 ms over the occipital scalp56 in both streaming and bouncing trials (Fig. 2A). The two-way ANOVA for this C1 component did not show a significant main effect of perceptual response [F(1, 17) = 2.32, p = 0.146, ηp² = 0.12; streaming, −0.71 ± 0.27 μV (mean ± SE); bouncing, −0.40 ± 0.19 μV; Fig. 2B, upper row], and the main effect of electrode [F(5, 85) = 2.98, p = 0.062, ηp² = 0.15] as well as the response × electrode interaction effect [F(5, 85) = 2.01, p = 0.155, ηp² = 0.11] were also not significant. After C1, the occipitally distributed P1 component (with the maximal amplitude at 120–145 ms latency) also did not show a significant main effect of perceptual response [F(1, 17) = 0.47, p = 0.501, ηp² = 0.03; streaming, −0.61 ± 0.44 μV; bouncing, −0.47 ± 0.54 μV], and neither the main effect of the electrode [F(5, 85) = 1.98, p = 0.161, ηp² = 0.10] nor the response × electrode interaction effect [F(5, 85) = 1.00, p = 0.377, ηp² = 0.06] was significant. Similarly, the subsequent N1 component peaking at 170–200 ms over occipital electrodes also did not reveal a significant modulation as a function of perceptual response [F(1, 17) = 2.13, p = 0.163, ηp² = 0.11; streaming, −3.09 ± 0.49 μV; bouncing, −2.83 ± 0.50 μV; Fig. 2B, bottom row], and neither the main effect of the electrode [F(5, 85) = 2.85, p = 0.067, ηp² = 0.14] nor the response × electrode interaction effect [F(5, 85) = 1.35, p = 0.271, ηp² = 0.07] was significant. These results suggest that early visual-evoked ERPs before the coincidence of the two discs were not associated with distinct perceptual states in the visual streaming/bouncing display.
As shown in Fig. 3, motion of the two discs before their coincidence elicited an apparent positivity over the fronto-central scalp during 220–260 ms in both streaming and bouncing trials. This positivity is most likely to be a P2 component due to its time course and scalp topography57, and it is obvious that the P2 amplitude is larger on streaming trials than on bouncing trials. Indeed, the two-way ANOVA showed a highly significant main effect of perceptual response [F(1, 17) = 10.19, p < 0.006, ηp² = 0.38], with greater P2 amplitude on streaming trials (streaming, 3.75 ± 0.70 μV; bouncing, 3.10 ± 0.74 μV). The main effect of the electrode was also significant [F(5, 85) = 3.92, p < 0.05, ηp² = 0.19]. Pair-wise comparisons for this main effect showed that P2 amplitude was larger on FCz than on FC2 (p < 0.03 after Bonferroni correction), and was larger on C1 and Cz than on C2 (ps < 0.05 after Bonferroni correction). The response × electrode interaction effect for P2 amplitude was not significant [F(5, 85) = 1.05, p = 0.379, ηp² = 0.06]. Following the positivity of the P2 component, there was also a negative deflection labeled as the N400 component (Fig. 3A) over the fronto-central scalp (Fig. 3B, middle row) just before the coincidence of the two discs, which was maximal at 350–440 ms. The two-way ANOVA for this negativity also revealed significant modulation as a function of perceptual response [F(1, 17) = 6.19, p < 0.03, ηp² = 0.27], with greater N400 amplitude for the bouncing trials (streaming, −2.58 ± 0.69 μV; bouncing, −2.98 ± 0.74 μV), and a significant main effect for electrodes [F(5, 85) = 5.55, p < 0.02, ηp² = 0.25], with the N400 amplitude being larger on Cz than on C1 and C2 (ps < 0.03 after Bonferroni correction), as well as a nonsignificant response × electrode interaction effect [F(5, 85) = 0.48, p = 0.691, ηp² = 0.03].
It is noteworthy that streaming and bouncing trials were separated on the basis of subjects’ perceptual responses, and that when the ERP waveforms started to differ significantly between streaming and bouncing trials in the P2 interval, the two visual discs were still moving toward each other (Fig. 3A). In other words, even before the streaming/bouncing event actually occurred, variations in neural activity (indexed by the fronto-central P2 component) had already influenced the perceptual decision that would be made after the event. The pre-coincidence difference in the P2 component thus predicts the perceptual outcome of the visual streaming/bouncing display.
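Because all ERPs were time-locked to the onset of frame 1 and the discs coincided 450 ms later, each measurement window can be re-expressed relative to the moment of coincidence. The short Python sketch below (names are hypothetical; the windows are those listed in the Methods) makes this conversion explicit.

```python
COINCIDENCE_MS = 450  # onset of frame 10, relative to the onset of frame 1

# Measurement windows in ms after the onset of frame 1 (from the Methods).
WINDOWS_MS = {
    "C1": (75, 95),
    "P1": (120, 145),
    "N1": (170, 200),
    "P2": (220, 260),
    "N400": (350, 440),
    "N600": (560, 660),
    "P3": (800, 900),
}

def relative_to_coincidence(window_ms):
    """Shift a (start, stop) window so that 0 ms is the moment of coincidence."""
    return tuple(t - COINCIDENCE_MS for t in window_ms)

relative = {name: relative_to_coincidence(w) for name, w in WINDOWS_MS.items()}
pre_coincidence = [name for name, (_, stop) in relative.items() if stop < 0]
```

On this time axis the P2 window spans −230 to −190 ms, i.e., roughly 200 ms before coincidence, and the N400 window (−100 to −10 ms) also ends before the discs meet, whereas the N600 and P3 windows fall entirely after coincidence.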
Post-coincidence P3 amplitudes dissociate streaming/bouncing responses
After the streaming/bouncing event occurred (i.e., the coincidence of the two discs), there was first a broad negativity with the maximal deflection during 560–660 ms (Fig. 3A) over the fronto-central scalp (Fig. 3B, bottom row) for both streaming and bouncing trials, which was labeled as the N600 component. The two-way ANOVA for the N600 amplitude revealed no significant difference between streaming and bouncing trials [F(1, 17) = 0.10, p = 0.753, ηp² = 0.01; streaming, −4.67 ± 0.73 μV; bouncing, −4.60 ± 0.65 μV], and neither the main effect of the electrode [F(5, 85) = 3.56, p = 0.053, ηp² = 0.17] nor the response × electrode interaction effect [F(5, 85) = 1.19, p = 0.312, ηp² = 0.07] was significant. Following the N600 component, a late slow positivity extending over 800–900 ms for both the streaming and bouncing trials (Fig. 4A) is most likely the P3/P300 component, given its broad amplitude distribution and parietal-maximal scalp topography (Fig. 4B); this component is associated with the detection of task-relevant events58. This slow positivity was quantified over 800–900 ms because both the maximal amplitude and the detectable amplitude difference between the streaming and bouncing trials fell within this time window. Indeed, this amplitude difference resulted in a significant main effect of perceptual response [F(1, 17) = 5.75, p < 0.03, ηp² = 0.26], with greater P3 amplitude on bouncing trials (4.10 ± 0.70 μV) compared to streaming trials (3.28 ± 0.71 μV). The main effect of the electrode was also significant [F(5, 85) = 6.91, p < 0.006, ηp² = 0.29], with the P3 amplitude being larger on Pz than on CP1 and CP2 (ps < 0.013 after Bonferroni correction). Finally, the response × electrode interaction effect for the P3 amplitude was not significant [F(5, 85) = 0.72, p = 0.485, ηp² = 0.04]. These results indicate that the post-coincidence P3 amplitudes could also distinguish different perceptual outcomes in the visual streaming/bouncing display.
Discussion
The present study investigated the neural basis of the processing of the classic streaming/bouncing motion display using high-density event-related potential (ERP) recordings. The behavioral results showed no significant difference in RTs between the streaming and bouncing percepts, although there was a tendency toward faster RTs for streaming responses (streaming, 524 ± 20 ms; bouncing, 538 ± 21 ms). This is inconsistent with the findings of Sanabria et al., who reported that the dominant percept (i.e., streaming) in the visual streaming/bouncing display led to faster RTs10. A possible reason for this disparity is that the presence, in their experimental design, of a streaming/bouncing display with a salient sound at the moment of coincidence of the two discs [visual-auditory (VA) condition] influenced the response bias when participants perceived the purely visual streaming/bouncing display [visual-only (V) condition]. When both VA and V conditions were present, the salient sound in the VA condition would induce more bouncing percepts, which might result in hesitation when subjects perceived streaming in the VA condition. Conversely, experience with the VA condition might also result in hesitation when subjects perceived bouncing in the V condition. This could produce the pattern observed by Sanabria et al.10: markedly faster RTs for bouncing percepts in the VA condition and faster RTs for streaming percepts in the V condition. The present study, however, did not introduce a transient sound at the moment of coincidence, so no such response bias would be expected, which may explain why there was no substantial difference in RTs between streaming and bouncing percepts.
The ERP results showed a larger positive deflection on the frontal central P2 (220–260 ms) amplitude and N400 (350–440 ms) amplitude (with relatively smaller negative deflection on N400) before the coincidence for the streaming trials than bouncing trials. It is noteworthy that brain activities before motion or stimuli onset could bias subsequent perceptual results in ambiguous displays38, 50, 59, 60, which suggests that ongoing brain activities before stimulus-driven processes might contribute to how perceptual conflict is resolved by the human brain. The pre-coincidence differences in ERPs in the present study seem to make sense if we assume the motion sequences start at the moment of coincidence. Specifically, if the movement of the two discs started at the exact frame of their coincidence without any motion trajectory before coincidence, it is obvious that subjects in the behavioral task would perceive neither streaming nor bouncing but only two discs moving apart from the coincident point. Thus, the movement of the two discs before their coincidence actually served as a prerequisite for subsequent perceptual outcomes. Indeed, Grove et al. found that manipulating pre-coincidence trajectory switches (used to manipulate expectation) in streaming/bouncing display could significantly bias subsequent perceptual inferences19, which supports this viewpoint. It is also worth mentioning that the fMRI study that focused on the audiovisual bounce-inducing effect (ABE) conducted by Bushara et al. also investigated the purely visual streaming/bouncing display, but their data was time-locked to the coincidence of the two discs and found no difference in brain activity between the streaming and bouncing percepts22. 
When the fMRI results and our results showing a pre-coincidence difference in the P2 and N400 components are combined, it can be inferred that how subjects processed the motion information of the two discs before their coincidence is an essential factor influencing subsequent perceptual outcomes. In other words, the origin of ambiguity in visual streaming/bouncing processing may lie in how subjects processed the motion information of the two discs before their coincidence. That is not to say that other factors occurring at or after the moment of coincidence are unimportant. In fact, a brief sound presented at the moment of coincidence can reverse the dominance of the streaming percept (i.e., the ABE)6, and the post-coincidence trajectory duration was also found to be sufficient to bias subsequent perceptual inferences17, 18. Therefore, based on the present results and the discussion above, the pre-coincidence ERP differences found in the present study may reflect an unconscious reduction of the ambiguity of the streaming/bouncing display.
It has yet to be determined, however, which aspects of perceptual processing are reflected by the pre-coincidence positive deflection (i.e., larger positivity for the streaming percept than for the bouncing percept) that started ~200 ms before the coincidence of the discs. Although the P2 component is typically thought of as part of a cognitive matching system that compares sensory inputs with expectations derived from memory57, 61, and is involved in working memory62 and semantic processing63, its neurophysiological role has not been well characterized, partly because it often overlaps with adjacent ERP components such as N1, N2 and P3 during recording64. However, a recent study by Shu et al. found that the P2 component is sensitive to depth perception, with larger P2 amplitude for three-dimensional (3D) than for two-dimensional (2D) images (differences in physical properties between 3D and 2D images were minimized)65. Similarly, Liu et al. found that size perception was modulated by depth cues, with larger P2 amplitude for a ball in the upper than in the lower visual field (the size of the ball was always the same) in the 3D background condition but no difference in the 2D background condition66, indicating that the P2 component is sensitive to depth information as well. If this is the case, the greater P2 amplitude on streaming than on bouncing trials in the present study (Fig. 3) might be attributed to the use of experience with a three-dimensional environment [objects in a natural environment are seldom (but still possibly) aligned at the same depth plane] when perceiving the visual streaming/bouncing display, as Sekuler and Sekuler as well as Watanabe and Shimojo have proposed5, 7, 17 (see the Introduction section for details).
In other words, when subjects considered the two moving discs as being at different depth planes (indexed by larger P2), they would perceive streaming after the coincidence, but when the two discs were thought to be at the same depth plane (indexed by smaller P2), the bouncing percept would be reported. Consistent with this account, Grove and Sakurai found that streaming percepts increased as the depth disparity between the two discs increased in both the visual-auditory and visual-only streaming/bouncing displays, although their focus was the audiovisual bounce-inducing effect (ABE)67. Similarly, Matsuno and Tomonaga investigated the visual streaming/bouncing display in chimpanzees and found that streaming percepts increased when more depth cues were introduced25. Taken together with the evidence and explanations above, the P2 amplitude difference between bouncing and streaming trials in the present study may reflect whether or not the subject perceived the two moving discs before their coincidence as being at the same depth plane. However, further research is still needed to examine whether the perceived depth between the two discs is the determining factor for the visual streaming/bouncing illusion, because the present study did not test it directly.
As reported above, the pre-coincidence positive deflection, which was larger for the streaming percept than for the bouncing percept, started ~200 ms before coincidence. However, the early visual-evoked ERPs (i.e., the C1, P1 and N1 components) were almost identical between the streaming and bouncing trials (Fig. 2). These results are easy to understand because the early visual-evoked ERPs are known to be very sensitive to changes in the physical properties of the stimuli64, whereas the stimuli (i.e., the motion of the two discs) inducing the streaming and bouncing percepts were always the same in our experiment. Thus, it is exactly what would be expected that identical stimulus sequences elicited similar early visual ERPs. In contrast, as previous bistable perception studies have demonstrated, because the physical stimulus input is identical, any change in the electrophysiological response between mutually exclusive percepts can be ascribed to higher-level perceptual or cognitive factors rather than to factors that rely on early sensory-input properties48, 49. This view fits the P2 results of the present study and the prominent hypothesis that the P2 component represents part of a cognitive matching system that compares sensory inputs with expectations derived from memory57.
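The logic of this comparison (the same early response for both percepts, but a different mean amplitude in the P2 window) can be illustrated with a minimal sketch. It is not the analysis pipeline used in the study: the helper `mean_amplitude`, the toy arrays, the 2 ms sampling step, the voltage values and the early window of 80–120 ms are all assumptions made for illustration. Only the 220–260 ms P2 window is taken from the paper.

```python
import numpy as np

def mean_amplitude(epochs, times_ms, window_ms):
    """Per-trial mean voltage inside an inclusive time window.

    epochs:   array of shape (n_trials, n_samples), one channel
    times_ms: array of shape (n_samples,), latency of each sample
    """
    start, end = window_ms
    mask = (times_ms >= start) & (times_ms <= end)
    return epochs[:, mask].mean(axis=1)

# Toy, noise-free data (NOT the study's recordings): 4 trials per percept,
# one sample every 2 ms from motion onset (0 ms) to 998 ms.
times_ms = np.arange(0, 1000, 2)
streaming = np.zeros((4, times_ms.size))
bouncing = np.zeros((4, times_ms.size))

EARLY_WINDOW = (80, 120)   # hypothetical early-component window
P2_WINDOW = (220, 260)     # P2 window reported in the paper

# Identical early response for both percepts; different P2 amplitude.
for data, p2_uv in ((streaming, 2.0), (bouncing, 1.0)):
    data[:, (times_ms >= 80) & (times_ms <= 120)] = 1.5
    data[:, (times_ms >= 220) & (times_ms <= 260)] = p2_uv

# Streaming-minus-bouncing difference of the trial-averaged mean amplitudes.
early_diff = (mean_amplitude(streaming, times_ms, EARLY_WINDOW).mean()
              - mean_amplitude(bouncing, times_ms, EARLY_WINDOW).mean())
p2_diff = (mean_amplitude(streaming, times_ms, P2_WINDOW).mean()
           - mean_amplitude(bouncing, times_ms, P2_WINDOW).mean())
print(early_diff, p2_diff)
```

In this toy case the early-window difference is zero while the P2-window difference is positive, which mirrors the pattern described above. With real recordings the same windowed means would be computed per subject and compared statistically.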
After the streaming/bouncing event occurred (i.e., the coincidence of the two discs), the P3 amplitudes also dissociated streaming/bouncing responses, with greater P3 amplitude on bouncing trials than on streaming trials (Fig. 4). Numerous previous studies have considered the P3 component to be an index of the post-perceptual updating of working memory required to perform the perceptual-reporting task68,69,70,71. In general, if no change in stimulus features is detected, the current mental model or “schema” of the stimulus representation is maintained. If a new stimulus is detected, attentional processes govern a change or “updating” of the current stimulus context, which is accompanied by a larger P3 amplitude64, 68, 72. Moreover, the P3 amplitude has been shown to increase for more deviant (less probable) task-defined stimuli73,74,75,76,77. In the present behavioral results as well as in previous studies of the visual streaming/bouncing display, the streaming percept was dominant and occurred more frequently, whereas bouncing was the non-dominant percept and occurred only occasionally3,4,5. It is therefore reasonable to infer that the streaming percept is subjectively the default state for subjects, that is, a highly probable perceptual outcome, whereas the bouncing percept is subjectively a novel, low-probability perceptual outcome. If that is the case, the larger P3 amplitude on bouncing trials in the present study may reflect a novel, low-probability bouncing percept that triggered contextual updating of the perceptual representation in subjects’ working memory. On the other hand, the P3 component in the present study appeared so late that a time window of 800–900 ms was used to measure its amplitude.
However, this latency makes sense because previous studies have shown that the latency of the P3/P300 depends on the time required for stimulus classification; that is, the P3 component should appear after the stimuli have been classified according to the task requirements78,79,80,81. Although the ERP difference that preceded the streaming/bouncing event had already influenced the subsequent perceptual outcomes in the present study, the distinct perceptual outcomes can only emerge after the streaming/bouncing event has been observed. Specifically, classification of the stimuli according to the task requirements (i.e., reporting either streaming or bouncing) in the present study actually started at the moment of the coincidence of the discs. Thus, if the ERP waveforms were time-locked to the moment of coincidence (450 ms), the time window of the P3 component would become 350–450 ms; together with its parietal-maximal scalp topography (Fig. 4B), these characteristics are consistent with those previously reported for the typical P3/P300 component58, 82.
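The re-referencing arithmetic in this paragraph can be written out as a short sketch. The function name `relock_window` is hypothetical. The 450 ms coincidence latency and the 800–900 ms P3 window come from the text, and the 220–260 ms P2 window from the paper's abstract.

```python
COINCIDENCE_MS = 450  # latency of disc coincidence relative to motion onset

def relock_window(window_ms, new_zero_ms=COINCIDENCE_MS):
    """Express an onset-locked (start, end) window relative to a new time zero."""
    start, end = window_ms
    return (start - new_zero_ms, end - new_zero_ms)

p2_onset_locked = (220, 260)  # P2 window, locked to motion onset
p3_onset_locked = (800, 900)  # P3 window, locked to motion onset

p2_coincidence_locked = relock_window(p2_onset_locked)  # (-230, -190)
p3_coincidence_locked = relock_window(p3_onset_locked)  # (350, 450)
print(p2_coincidence_locked, p3_coincidence_locked)
```

Negative values denote latencies before coincidence, so the P2 window falls roughly 200 ms before the discs meet, whereas the P3 window falls 350–450 ms after it.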
Conclusion
In summary, the present study investigated the neural-activity patterns associated with the streaming and bouncing percepts to explore the origin of the ambiguity and the neural basis of perceptual disambiguation during processing of the classic visual streaming/bouncing display. Interestingly, a frontal central positive deflection ~200 ms before the coincidence of the discs was found to be predictive of subsequent perceptual outcomes in the visual streaming/bouncing display. Moreover, P3 amplitudes ~400 ms after the coincidence of the two discs also dissociated streaming/bouncing responses. Based on previous studies of bistable perception that highlighted the role of high-level brain areas in perceptual interpretation1, 44, 45, recent findings that the P2 component is sensitive to depth information65, 66, and the existing hypothesis that experience in a three-dimensional environment is involved in disambiguating the visual streaming/bouncing display5, 7, 17, 19, we conclude that the pre-coincidence frontal central positive difference between streaming and bouncing percepts in the present study may reflect whether or not the subject perceived the two moving discs before their coincidence as being in the same depth plane. Further studies are still needed to examine directly whether the perceived depth relationship between the two discs is the determining factor for the visual streaming/bouncing display.
References
Sterzer, P., Kleinschmidt, A. & Rees, G. The neural bases of multistable perception. Trends Cogn. Sci. 13, 310–318 (2009).
Metzger, W. Beobachtungen über phänomenale identität. Psychol. Forsch. 19, 1–60 (1934).
Bertenthal, B. I., Banton, T. & Bradbury, A. Directional bias in the perception of translating patterns. Perception. 22, 193–207 (1993).
Watanabe, K. & Shimojo, S. Attentional modulation in perception of visual motion events. Perception. 27, 1041–1054 (1998).
Sekuler, A. B. & Sekuler, R. Collisions between moving visual targets: what controls alternative ways of seeing an ambiguous display? Perception. 28, 415–432 (1999).
Sekuler, R., Sekuler, A. B. & Lau, R. Sound alters visual motion perception. Nature. 385, 308 (1997).
Watanabe, K. & Shimojo, S. When sound affects vision: effects of auditory grouping on visual motion perception. Psychol. Sci. 12, 109–116 (2001).
Scheier, C., Lewkowicz, D. J. & Shimojo, S. Sound induces perceptual reorganization of an ambiguous motion display in human infants. Dev. Sci. 6, 233–241 (2003).
Fujisaki, W., Shimojo, S., Kashino, M. & Nishida, S. Recalibration of audiovisual simultaneity. Nat. Neurosci. 7, 773–778 (2004).
Sanabria, D., Correa, Á., Lupiáñez, J. & Spence, C. Bouncing or streaming? exploring the influence of auditory cues on the interpretation of ambiguous visual motion. Exp. Brain Res. 157, 537–541 (2004).
Remijn, G. B., Ito, H. & Nakajima, Y. Audiovisual integration: an investigation of the ‘streaming-bouncing’ phenomenon. J. Physiol. Anthropol. Appl. Human Sci. 23, 243–247 (2004).
Dufour, A., Touzalin, P., Moessinger, M., Brochard, R. & Després, O. Visual motion disambiguation by a subliminal sound. Conscious Cogn. 17, 790–797 (2008).
Grassi, M. & Casco, C. Audiovisual bounce-inducing effect: when sound congruence affects grouping in vision. Atten. Percept. Psychophys. 72, 378–386 (2010).
Grove, P. M., Ashton, J., Kawachi, Y. & Sakurai, K. Auditory transients do not affect visual sensitivity in discriminating between objective streaming and bouncing events. J. Vis. 12, 1–11 (2012).
Kawabe, T. & Miura, K. Effects of the orientation of moving objects on the perception of streaming/bouncing motion displays. Percept. Psychophys. 68, 750–758 (2006).
Shimojo, S. & Shams, L. Sensory modalities are not separate modalities: plasticity and interactions. Curr. Opin. Neurobiol. 11, 505–509 (2001).
Watanabe, K. & Shimojo, S. Postcoincidence trajectory duration affects motion event perception. Percept. Psychophys. 63, 16–28 (2001).
Kawachi, Y., Kawabe, T. & Gyoba, J. Stream/bounce event perception reveals a temporal limit of motion correspondence based on surface feature over space and time. i-Perception 2, 428–439 (2011).
Grove, P. M., Robertson, C. & Harris, L. R. Disambiguating the stream/bounce illusion with inference. Multisens. Res. 29, 453–464 (2016).
Grassi, M. & Casco, C. Audiovisual bounce-inducing effect: attention alone does not explain why the discs are bouncing. J. Exp. Psychol. Hum. Percept. Perform. 35, 235–243 (2009).
Grassi, M. & Casco, C. Revealing the origin of the audiovisual bounce-inducing effect. Seeing Perceiving 25, 223–233 (2012).
Bushara, K. O. et al. Neural correlates of cross-modal binding. Nat. Neurosci. 6, 190–195 (2003).
Hipp, J. F., Engel, A. K. & Siegel, M. Oscillatory synchronization in large-scale cortical networks predicts perception. Neuron. 69, 387–396 (2011).
Maniglia, M., Grassi, M., Casco, C. & Campana, G. The origin of the audiovisual bounce-inducing effect: a TMS study. Neuropsychologia. 50, 1478–1482 (2012).
Matsuno, T. & Tomonaga, M. Stream/bounce perception and the effect of depth cues in chimpanzees (Pan troglodytes). Atten. Percept. Psychophys. 73, 1532–1545 (2011).
Attneave, F. Multistability in perception. Sci. Am. 225, 63–71 (1972).
Blake, R. A neural theory of binocular rivalry. Psychol. Rev. 96, 145–167 (1989).
Polonsky, A., Blake, R., Braun, J. & Heeger, D. J. Neuronal activity in human primary visual cortex correlates with perception during binocular rivalry. Nat. Neurosci. 3, 1153–1159 (2000).
Tong, F. & Engel, S. Interocular rivalry revealed in the human cortical blind-spot representation. Nature. 411, 195–199 (2001).
Lee, S. H., Blake, R. & Heeger, D. J. Traveling waves of activity in primary visual cortex during binocular rivalry. Nat. Neurosci. 8, 22–23 (2005).
Haynes, J. D. & Rees, G. Predicting the stream of consciousness from activity in human visual cortex. Curr. Biol. 15, 1301–1307 (2005).
Parkkonen, L., Andersson, J., Hämäläinen, M. & Hari, R. Early visual brain areas reflect the percept of an ambiguous scene. Proc. Natl. Acad. Sci. USA 105, 20500–20504 (2009).
Haynes, J. D., Deichmann, R. & Rees, G. Eye-specific effects of binocular rivalry in the human lateral geniculate nucleus. Nature. 438, 496–499 (2005).
Wunderlich, K., Schneider, K. A. & Kastner, S. Neural correlates of binocular rivalry in the human lateral geniculate nucleus. Nat. Neurosci. 8, 1595–1602 (2005).
Hasson, U., Hendler, T., Bashat, D. B. & Malach, R. Vase or face? a neural correlate of shape-selective grouping processes in the human brain. J. Cogn. Neurosci. 13, 744–753 (2001).
Andrews, T. J., Schluppeck, D., Homfray, D., Matthews, P. & Blakemore, C. Activity in the fusiform gyrus predicts conscious perception of Rubin’s vase-face illusion. Neuroimage. 17, 890–901 (2002).
Sterzer, P. & Rees, G. A neural basis for percept stabilization in binocular rivalry. J. Cogn. Neurosci. 20, 389–399 (2008).
Hesselmann, G., Kell, C. A., Eger, E. & Kleinschmidt, A. Spontaneous local variations in ongoing neural activity bias perceptual decisions. Proc. Natl. Acad. Sci. USA 105, 10984–10989 (2008).
Fang, F., Kersten, D. & Murray, S. O. Perceptual grouping and inverse fMRI activity patterns in human visual cortex. J. Vis. 8, 1–9 (2008).
Lumer, E. D., Friston, K. J. & Rees, G. Neural correlates of perceptual rivalry in the human brain. Science. 280, 1930–1934 (1998).
Windmann, S., Wehrmann, M., Calabrese, P. & Onur, N. Role of the prefrontal cortex in attentional control over bistable vision. J. Cogn. Neurosci. 18, 456–471 (2006).
Sterzer, P. & Kleinschmidt, A. A neural basis for inference in perceptual ambiguity. Proc. Natl. Acad. Sci. USA 104, 323–328 (2007).
Raemaekers, M., van der Schaaf, M. E., van Ee, R. & van Wezel, R. J. A. Widespread fMRI activity differences between perceptual states in visual rivalry are correlated with differences in observer biases. Brain Res. 1252, 161–171 (2009).
Leopold, D. A. & Logothetis, N. K. Multistable phenomena: changing views in perception. Trends Cogn. Sci. 3, 254–264 (1999).
Long, G. M. & Toppino, T. C. Enduring interest in perceptual ambiguity: alternating views of reversible figures. Psychol. Bull. 130, 748–768 (2004).
Kornmeier, J. & Bach, M. Early neural activity in Necker-cube reversal: evidence for low-level processing of a gestalt phenomenon. Psychophysiology. 41, 1–8 (2004).
Kornmeier, J. & Bach, M. The Necker cube–an ambiguous figure disambiguated in early visual processing. Vis. Res. 45, 955–960 (2005).
Pitts, M. A., Nerger, J. L. & Davis, T. J. Electrophysiological correlates of perceptual reversals for three different types of multistable images. J. Vis. 7, 102–104 (2007).
Pitts, M. A., Gavin, W. J. & Nerger, J. L. Early top-down influences on bistable perception revealed by event-related potentials. Brain Cogn. 67, 11–24 (2008).
Britz, J., Landis, T. & Michel, C. M. Right parietal brain activity precedes perceptual alternation of bistable stimuli. Cereb. Cortex 19, 55–65 (2009).
Intaite, M., Koivisto, M., Ruksenas, O. & Revonsuo, A. Reversal negativity and bistable stimuli: attention, awareness, or something else? Brain Cogn. 74, 24–34 (2010).
Basar-Eroglu, C., Struber, D., Stadler, M., Kruse, P. & Basar, E. Multistable visual perception induces a slow positive EEG wave. Int. J. Neurosci. 73, 139–151 (1993).
Pitts, M. A. & Britz, J. Insights from intermittent binocular rivalry and EEG. Front. Hum. Neurosci. 5, 107, doi:10.3389/fnhum.2011.00107 (2011).
Davidson, G. D. & Pitts, M. A. Auditory event-related potentials associated with perceptual reversals of bistable pitch motion. Front. Hum. Neurosci. 8, 572, doi:10.3389/fnhum.2014.00572 (2014).
Jasper, H. H. The ten-twenty electrode system of the international federation. Electroencephalogr. Clin. Neurophysiol. 10, 371–375 (1958).
Clark, V. P., Fan, S. & Hillyard, S. A. Identification of early visual evoked potential generators by retinotopic and topographic analyses. Hum. Brain Mapp. 2, 170–187 (1994).
Luck, S. J. & Hillyard, S. A. Electrophysiological correlates of feature analysis during visual search. Psychophysiology. 31, 291–308 (1994).
Johnson, R. A triarchic model of P300 amplitude. Psychophysiology 23, 367–384 (1986).
Williams, Z. M., Elfar, J. C., Eskandar, E. N., Toth, L. J. & Assad, J. A. Parietal activity and the perceived direction of ambiguous apparent motion. Nat. Neurosci. 6, 616–623 (2003).
Hesselmann, G., Kell, C. A. & Kleinschmidt, A. Ongoing activity fluctuations in hMT+ bias the perception of coherent visual motion. J. Neurosci. 28, 14481–14485 (2008).
Freunberger, R., Klimesch, W., Doppelmayr, M. & Höller, Y. Visual P2 component is related to theta phase-locking. Neurosci. Lett. 426, 181–186 (2007).
Lefebvre, C. D., Marchand, Y., Eskes, G. A. & Connolly, J. F. Assessment of working memory abilities using an event-related brain potential (ERP): compatible digit span backward task. Clin. Neurophysiol. 116, 1665–1680 (2005).
Federmeier, K. D. & Kutas, M. Picture the difference: electrophysiological investigations of picture processing in the two cerebral hemispheres. Neuropsychologia. 40, 730–747 (2002).
Luck, S. J. An Introduction to the Event-Related Potential Technique (Cambridge, MA: MIT Press, 2005).
Shu, O. et al. P1 and P2 components of human visual evoked potentials are modulated by depth perception of 3-dimensional images. Clin. Neurophysiol. 121, 386–391 (2010).
Liu, Q. et al. Neural correlates of size illusions: an event-related potential study. Neuroreport. 20, 809–814 (2009).
Grove, P. M. & Sakurai, K. Auditory induced bounce perception persists as the probability of a motion reversal is reduced. Perception. 38, 951–965 (2009).
Donchin, E. Surprise!… surprise? Psychophysiology. 18, 493–513 (1981).
Donchin, E. & Coles, M. G. Is the P300 component a manifestation of context updating? Behav. Brain Sci. 11, 357–374 (1988).
McEvoy, L. K., Smith, M. E. & Gevins, A. Dynamic cortical networks of verbal and spatial working memory: effects of memory load and task practice. Cereb. Cortex. 8, 563–574 (1998).
Picton, T. W. The P300 wave of the human event-related potential. J. Clin. Neurophysiol. 9, 456–479 (1992).
Polich, J. Updating P300: an integrative theory of P3a and P3b. Clin. Neurophysiol. 118, 2128–2148 (2007).
Duncan-Johnson, C. C. & Donchin, E. On quantifying surprise: the variation of event-related potentials with subjective probability. Psychophysiology. 14, 456–467 (1977).
Duncan-Johnson, C. C. & Donchin, E. The P300 component of the event-related brain potential as an index of information processing. Biol. Psychol. 14, 1–52 (1982).
Courchesne, E., Hillyard, S. A. & Courchesne, R. Y. P3 waves to the discrimination of targets in homogeneous and heterogeneous stimulus sequences. Psychophysiology. 14, 590–597 (1977).
Dalbokova, D., Gille, H. G. & Ullsperger, P. Amplitude variations in P300 component due to unpredictable stepwise change of stimulus probability. Int. J. Psychophysiol. 10, 33–38 (1990).
Vogel, E. K., Luck, S. J. & Shapiro, K. L. Electrophysiological evidence for a postperceptual locus of suppression during the attentional blink. J. Exp. Psychol. Hum. Percept. Perform. 24, 1656–1674 (1998).
Kutas, M., McCarthy, G. & Donchin, E. Augmenting mental chronometry: the P300 as a measure of stimulus evaluation time. Science. 197, 792–795 (1977).
Magliero, A., Bashore, T. R., Coles, M. G. H. & Donchin, E. On the dependence of P300 latency on stimulus evaluation processes. Psychophysiology. 21, 171–186 (1984).
Verleger, R. On the utility of P3 latency as an index of mental chronometry. Psychophysiology. 34, 131–156 (1997).
Verleger, R., Jaskowski, P. & Wascher, E. Evidence for an integrative role of P3 in linking reaction to perception. J. Psychophysiol. 19, 165–181 (2005).
Polich, J. & Kok, A. Cognitive and biological determinants of P300: an integrative review. Biol. Psychol. 41, 103–146 (1995).
Acknowledgements
This research was supported by the Natural Science Foundation of China, grants 31400868 (W.F.F.) and 31400893 (Y.L.), and by the Natural Science Foundation of Jiangsu Province, China, grant BK20160171 (L.N.J.).
Contributions
S.Z. and W.F. designed research; S.Z., Y.W. and L.J. performed research; S.Z., L.J., Y.L. and W.F. analyzed data; S.Z., Y.W., C.F., Y.L. and W.F. wrote the paper.
Ethics declarations
Competing Interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zhao, S., Wang, Y., Jia, L. et al. Pre-coincidence brain activity predicts the perceptual outcome of streaming/bouncing motion display. Sci Rep 7, 8832 (2017). https://doi.org/10.1038/s41598-017-08801-5