1 Introduction

A fundamental question in systems neuroscience is how sensory stimuli are encoded by the activity of neurons. This question has been the subject of intense scrutiny across sensory systems, leading to fundamental discoveries describing the nature of neural representations across many levels of information processing and abstraction [1, 2]. The rapid pace of this progress has been matched with an improvement in tools used for recording neuronal activity at a large scale (e.g., two-photon (2P) calcium imaging), leading to the now-routine generation of datasets comprising hundreds to thousands of simultaneously recorded neurons in awake behaving animals [3, 4]. A consistent observation made from recordings taken across sensory systems is that neural responses to sensory stimuli include changes in both the rate and timing of action potentials within populations of neurons, as well as modulation with respect to ongoing, internally generated rhythms (sniffing, whisking, locomotion, etc.) [5,6,7,8,9,10]. Despite the ability to measure complex sensory responses, it remains unclear what role individual features of neural activity (e.g., rate and timing) play in perceptually guided behavior.

One way to pose this question is to ask what features of sensory-evoked activity are read by downstream brain areas to guide behavior. To answer this question, one must establish a causal link between features of neural activity (e.g., the firing rate and timing of specific neurons) and behavioral report (e.g., detecting a sensory stimulus). Perturbations of neural activity are therefore a key tool for connecting specific circuits and features of spiking activity to perception. Pioneering this conceptual framework, an influential study by Newsome and colleagues demonstrated that the perceptual effects of stimulation could be quantitatively estimated, employing experiments in which a monkey’s motion-perception-related response was biased by electrical stimulation of the middle temporal visual area (MT) [11]. In these experiments, the monkey could be biased toward reporting a certain direction of motion in an ambiguous stimulus if stimulation was applied to a columnar area of cells which were maximally responsive, or ‘tuned’, to the corresponding motion direction. One reason why the effect of this stimulation was both predictable and reliable was that the spatial scale of the manipulation (hundreds of micrometers current spread) was on the order of the neural representation (a cortical column).

The advent of optogenetics has permitted increasingly precise interventions across a broad range of neural circuits and behaviors, culminating in the development of two-photon (2P) optogenetics [12,13,14,15,16]. When combined with 2P imaging and holographic targeting, this technique permits the activation of arbitrary sets of pre-selected neurons with up to sub-millisecond precision in vivo [17,18,19,20,21,22]. Neural representations encoded at fine spatiotemporal scales can now be interrogated using “precision optogenetics,” increasing the potential for an even more nuanced understanding of the connection between neural codes and behavior [23,24,25,26,27,28,29]. A detailed description of optical methods for combining 2P photostimulation and imaging appears in other chapters of this book. In what follows, we specifically focus on methods developed for combining precise photostimulation with psychophysical measurements of evoked synthetic percepts and on complementary methods used to confirm a precise relationship between optical manipulations and the observed behavior. These methods underlie recent scientific studies on the coding logic of olfactory perception, whose results are described in detail in our recent publications [25, 30, 31]. While the particular features of interest may be determined by the sensory system under study, this chapter highlights general frameworks for connecting synthetically induced activity with perceptual and behavioral readout.

2 Paradigms for Psychophysical Measurement of Synthetic Perception

The rapid development of optogenetics within the past decade has introduced an extensive toolbox of genetically encoded, light-activated ion channels capable of temporally precise, bidirectional modulation of activity within predefined circuits and cell types [16, 18, 24, 32]. To link specific circuits and cell types with their role in generating a sensory-guided behavior, a common approach has been to manipulate neural activity while an animal is presented with a stimulus and test the consequence on behavior compared to the normal condition. In this way an experimenter can enhance, eliminate, or “tweak” neural activity to discover how circuits and cell types may be involved in making a particular perceptual judgement.

This approach has undoubtedly enhanced our understanding of which brain circuits and cell types are essential for certain perceptual tasks. For example, inhibiting the visual cortex impairs orientation discrimination [33], and broad, unstructured optogenetic activation of olfactory inputs interferes with concurrent odor recognition [34]. But these manipulations are relatively uninformative about precisely how sensory percepts are formed by those circuits and cell types, that is, how they are encoded by and decoded from their activity.

To address this question, we must manipulate sensory representations at the spatiotemporal scale with which they are encoded. Previous technologies making use of electrical or one-photon manipulations lacked the ability to perturb sets of individual, pre-selected cells, limiting their potential to directly connect specific neurons and spiking patterns with sensory-guided behavior. However, spatiotemporal excitation technologies from the emerging field of precision optogenetics are capable of addressing representations that are encoded at either the microcircuit or distributed scales. In what follows, we will describe the application of two key techniques.

The first, 2P holographic optogenetics, is currently the only method capable of selectively stimulating individual neurons which can be targeted based on their response properties, cell type, location, and projection targets, along with many other selection criteria [35]. The second, one-photon (1P) patterned optogenetics, is capable of stimulating groups of superficial neurons spread out over many millimeters in arbitrary patterns, which can precisely manipulate discrete representations organized over larger spatial scales, such as olfactory bulb glomeruli [30, 36]. Critically, precision optogenetics has the ability to bypass the sensory organ or peripheral circuits involved in sensory-guided behaviors and directly address the perceptual impact of specifically targeted neurons. In this way, an experimenter can create “synthetic” stimuli, whose cellular composition and timing are exactly known. This allows the experimenter to take a step beyond validating the participation of individual circuits for behavior and to determine precisely what features of the activity of neural circuits guide perception. This is accomplished by holding individual features constant (e.g., what neurons are stimulated), while parametrically varying other features (e.g., timing), and performing sensitive psychophysical measurements of the perceptual impact.

2.1 Using Detection to Test the Relevance of Neural Codes

The first experimental strategy is to assess the influence of specific features within neural activity patterns on the detectability of an artificial stimulus. Animal survival relies on the detection of brief, faint cues to signal the presence of food, mates, and predators. A multitude of studies across species have revealed the exquisite sensitivity of sensory systems to their preferred stimuli [37, 38]. Exploring how critical information is conveyed by sensory circuits at the perceptual limit provides a lens to examine the essential coding features connecting neural activity to behavior. However, even in this minimal regime, sensory-evoked responses can be complex, simultaneously encoding stimuli in the identity, rate, and timing of specific neurons. By replacing external stimuli with targeted activation of neurons in sensory areas, we can directly test which features of the observed activity affect the detectability of the artificial stimulus. We can then infer the features of activity essential for perception.

Early studies using either electrical or 1P optogenetic stimulation revealed that rodents can detect changes in the spike rate of single neurons [39, 40], or across populations of hundreds of neurons [41] in the somatosensory and visual cortices, using stimuli that lasted for hundreds of milliseconds. Additional studies have explored the role of relative spike timing [41, 42], and sensitivity to latency [43, 44] in populations of hundreds to thousands of optogenetically activated cortical neurons. These studies were successful in demonstrating that rodents can perceive minimal perturbations in sensory cortical neurons, and perceptibility may, or may not, vary along certain feature dimensions, like timing. However, the techniques used in these experiments lacked the ability to target specific sets of neurons according to their functional properties or tuning. By optogenetically labeling olfactory sensory neurons expressing a specific receptor (M72-ChR2), a key related study in the olfactory system revealed that mice could detect brief activation (10 ms) of a single glomerulus [45]. This well-defined input channel to the olfactory bulb is a site of convergence where ~5,000 olfactory sensory neurons provide input to ~25–30 mitral and tufted cells (projection neurons), demonstrating that the elementary features of olfactory perception operate at, or below, this spatiotemporal scale.

The recent maturation of 2P holographic optogenetics represents an important opportunity for probing the relevance of neural coding features for detection. This approach has recently been applied across a range of sensory systems (olfactory, visual, somatosensory), where 2P photostimulation of a predefined set of neurons was used as the stimulus to be detected. Because specific neurons can be targeted using this technique, a larger feature space can be explored. For example, a recent study observed that the functional connectivity and orientation tuning of a subset of photostimulated neurons were related to their detectability [23] (see also Chap. 11). Extended photostimulation durations (1 s) of a subset of visual cortical neurons “recalled” a larger ensemble response tuned to a particular visual stimulus, and as few as two of these “pattern completion” neurons were detectable. That is, strongly photostimulating a specific small group of neurons (as few as two neurons) evoked the activation of a larger, behaviorally detectable pattern. Another recent study found that mice could detect 250–500 ms photostimulation of ~14 somatosensory (barrel) cortical neurons and that this ability improved with experience but did not depend on the precise neurons targeted [26].

Both of these studies found evidence that mice can detect the photostimulation of a few neurons delivered over a relatively long timescale (250–1000 ms), potentially evoking hundreds of spikes. However, for a range of behaviors animals have been shown to make decisions about salient perceptual cues within very short temporal windows [46,47,48]. For example, rodents are capable of detecting odorants in less than a single sniff (<100 ms), at extremely low concentrations (as low as 10⁻¹² M) [34, 37, 49, 50]. In this case the very first volley of action potentials generated by inhaling an odor is sufficient to inform an animal’s response.

To understand what features of olfactory bulb activity guide rapid detection of faint stimuli, Gill, Lerman, et al. (2020) extended the use of 2P holographic optogenetics to probe the detectability of single spikes distributed across small ensembles of olfactory bulb neurons (Fig. 1) [25, 51]. In this study, head-fixed, water-restricted mice were trained to detect the synchronous activation of a group of olfactory bulb neurons composed of mitral cells (excitatory), or a mixture of mitral cells and granule cells (inhibitory). The animals’ respiration was monitored so that photostimulation delivery could be timed to a fixed delay after inhalation (20–40 ms), mimicking the sampling of an odor (Fig. 1a, b). A specific, predefined set of neurons was used as the stimulus across several sessions, allowing the experimenters to assess the effects of learning or plasticity on detectability (Fig. 1c). Mice were able to detect the synchronous activation of ~30 neurons at high performance within several sessions.

Fig. 1
Three schematics of photostimulation. A shows the apparatus, with sound cues, sniff signal, water rewards, and 2P imaging (920 nm) plus 2P stimulation (1028 nm). B shows 5 trial structures labeled tone, sniff, photostim, response interval, and lick. C shows 30 neurons and outcomes for go and no-go, with red and blank circles, respectively.

Testing detection of precision photostimulation. (a) Schematic of photostimulation detection experiment. A head-fixed mouse with a chronically implanted window above the olfactory bulb is positioned in front of a lickspout and pressure sensor to monitor respiration (sniff). (b) Trial structure for detection experiment. A tone signaled the start of trials and photostimulation was timed to a fixed delay relative to sniff. (c) Left, neurons in the mitral cell layer (MCs and GCs) co-expressing ChrimsonR-tdTomato (red) and GCaMP6s (green). Thirty neurons were targeted for simultaneous photostimulation (white circles). Scale bar – 40 μm. Right, outcomes for responses to the “go” and “no-go” trials. Red circles indicate stimulation of a particular cell, while empty circles indicate no stimulation. (Adapted from Ref. [25])

After mice reached a consistent level of performance, the stimulus was parametrically varied along three feature dimensions: number of neurons, relative timing (synchrony), and latency from inhalation (Fig. 2). In this way, the contribution of each of these features to detectability could be independently assessed. We found that mice were capable of detecting single spikes distributed across <20 neurons, with an average psychophysical threshold of 10–15 neurons. Synchrony was tested by staggering the timing of spikes when the full pattern of neurons was targeted, revealing a strong dependence on the relative timing of photostimulation across neurons. Finally, we specifically manipulated the latency from inhalation and found that detection did not depend on sniff phase.
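One simple way to read a threshold off such a parametric sweep is to interpolate the level at which performance crosses a criterion. The sketch below is illustrative only: the source does not state how the psychophysical threshold was defined or fitted, so the function name `interpolate_threshold`, the criterion value of 0.75, and the use of linear interpolation are all assumptions.

```python
def interpolate_threshold(levels, performance, criterion=0.75):
    """Estimate the stimulus level at which performance first reaches `criterion`.

    levels      -- stimulus levels in ascending order (e.g., number of neurons)
    performance -- fraction correct measured at each level
    criterion   -- hypothetical criterion; not taken from the source

    Returns the linearly interpolated level, or None if the criterion
    is never reached within the tested range.
    """
    if len(levels) != len(performance):
        raise ValueError("levels and performance must have the same length")
    if not levels:
        return None
    # Criterion already met at the lowest tested level.
    if performance[0] >= criterion:
        return levels[0]
    points = list(zip(levels, performance))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # First adjacent pair that brackets the criterion from below.
        if y0 < criterion <= y1:
            return x0 + (criterion - y0) * (x1 - x0) / (y1 - y0)
    return None
```

A fitted psychometric function would usually be preferable when enough trials are available; interpolation is shown here only because it makes the notion of a threshold on the neuron-number axis concrete.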

Fig. 2
A set of 3 schematics each under A, B, and C. A has 3 stimuli with decreasing numbers of neurons from top to bottom: N = 9, 5, and 2. B has 3 stimuli with decreasing synchrony from top to bottom: Δt = 10, 30, and 50 ms. C has 3 stimuli with increasing latency from top to bottom: 25, 45, and 60 ms.

Independent control of multiple activity features. (a) Schematic of stimuli which vary in the number of neurons targeted, but not the timing of activation. (b) Schematic of stimuli which vary in the synchrony across neurons, but not the number of neurons. Stimuli were presented with a mean latency of 45 ms across conditions. (c) Schematic of stimuli which vary in the latency of photostimulation relative to the onset of inhalation, but not synchrony, or number of neurons. (Adapted from Ref. [25])

These results demonstrated that the exquisite sensitivity of the olfactory system goes far beyond a single sniff or glomerulus, with mice capable of detecting just a few spikes, so long as they occurred within less than 30 ms of each other. The independent control of coding features using precision optogenetics was critical for discovering a previously unknown contribution of relative spike timing to olfactory perception. Future studies will surely extend this approach to test an even broader range of features across sensory systems using even more sensitive behavioral tasks to ultimately reveal the building blocks of perception.

2.2 Technical Implementation of Detection Experiments

Behavioral Task and Training

The previously described experiments all made use of a similar strategy for assessing the detectability of 2P photostimulation. They used a go/no-go behavioral paradigm, in which an animal makes a binary decision about the presence or absence of a stimulus. In this paradigm, an animal is trained to respond in some manner (e.g., licking, lever press, nose poke) for a reward, or withhold a response (e.g., do not lick, press, or poke), to signal whether a “go” stimulus cue is present or absent (“no-go”). Typically, the “go” stimulus is randomly, or pseudo-randomly presented on a fraction of trials (typically 0.5), where on the remaining fraction of trials an animal experiences either a “distractor” stimulus or nothing. Animal behavior is progressively shaped to associate the availability of reward with the presence or absence of the stimulus. Once the task is acquired, the experimenter can then change the parameters of the stimulus in either a trial or blockwise manner to determine the sensitivity of animals to each parameter based on the frequency of correct choices or error type.
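The pseudo-random trial ordering described above can be sketched as follows. This is a minimal illustration, not the procedure used in the cited studies: the function names, the cap on consecutive identical trials (`max_run`), and the restart strategy are assumptions introduced here.

```python
import random
from itertools import groupby


def longest_run(sequence):
    """Length of the longest stretch of identical consecutive labels."""
    return max(len(list(group)) for _, group in groupby(sequence))


def make_trial_sequence(n_trials, go_fraction=0.5, max_run=3, seed=None):
    """Build a 'go'/'no-go' sequence with an exact go fraction.

    max_run is a hypothetical constraint limiting how many identical
    trial types may occur in a row, so that the animal cannot exploit
    long streaks.
    """
    rng = random.Random(seed)
    n_go = round(n_trials * go_fraction)
    n_nogo = n_trials - n_go
    # The more frequent type must fit into the gaps around the rarer one.
    if max(n_go, n_nogo) > max_run * (min(n_go, n_nogo) + 1):
        raise ValueError("go_fraction is incompatible with max_run")
    while True:
        remaining = {"go": n_go, "no-go": n_nogo}
        sequence = []
        while len(sequence) < n_trials:
            run_full = (len(sequence) >= max_run
                        and len(set(sequence[-max_run:])) == 1)
            options = [label for label, count in remaining.items()
                       if count > 0
                       and not (run_full and label == sequence[-1])]
            if not options:
                break  # dead end near the end of the session; start over
            # Weight by remaining counts so the exact fraction is reached.
            weights = [remaining[label] for label in options]
            choice = rng.choices(options, weights=weights)[0]
            sequence.append(choice)
            remaining[choice] -= 1
        if len(sequence) == n_trials:
            return sequence
```

Fixing the seed makes a session's trial order reproducible, which can help when comparing behavior across animals that experienced the same sequence.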

In the previously described experiments, 2P photostimulation of a predefined set of neurons was used as the “go” stimulus. In all cases the behavioral experiments were conducted in head-fixed, water-restricted mice outfitted with a cranial window. Detection experiments begin by limiting the amount of water available to an animal to ~1 mL/day, depending on their body weight. After 3–5 days of water restriction, mice will consume their daily allotment, at which point they can begin training to receive water while head fixed. Habituation to head fixation and “lick training” can occur simultaneously, by gently head-fixing naïve mice, and making available a metal spout (lick tube) that will deliver a small droplet of water when contacted by the tongue. Mice can freely lick the tube to receive 2 μL droplets, until they have learned to lick enough times to receive the full 1 mL of water for the day. This has the advantage of forming an association between head fixation and the availability of reward which can decrease overt distress during behavioral training and imaging sessions. A useful tool for detecting licking is a capacitive touch sensor coupled to hypodermic tubing which can be used to trigger the release of water through a pinch valve or solenoid controlled by a microcontroller or computer.

After animals reliably lick for water, it is necessary to shape their behavior to acclimate them to the timing and conventions of the go/no-go task. This can be done by initially training mice to recognize very salient stimuli and thus learn the associations of the task before moving on to more difficult conditions. As pre-training for a 2P photostimulation detection task, this typically takes the form of training animals to detect a high intensity target stimulus for the sensory modality under study (e.g., a high concentration odorant, a high contrast oriented grating, a large amplitude whisker deflection), with another clearly separable stimulus used as the distractor, or “no-go” stimulus, or nothing at all. Alternatively, 1-photon light of the appropriate wavelength for the opsin expressed in the neurons of interest can be used to train animals to recognize artificial photostimulation during this training stage. This has the advantage of more closely mimicking the 2P photostimulation detection task, since both involve artificial activation of neurons in a given area, but has the disadvantage of not clearly mapping onto a specific perceptual stimulus, which makes it difficult to study generalization from real to artificial stimuli. Either way, performing a shaping procedure is essential so that learning trajectories during 2P photostimulation can reflect each animal learning to detect activity in a small ensemble of neurons, and not merely learning the basic contingencies of a go/no-go task.

Perceptual Testing

Once the animal’s behavior has been shaped, detection of the 2P photostimulation can be tested. If neurons are targeted across days, care must be taken to align each session’s field of view with a common template (described in Subheading 3.2). Responses of targeted neurons should be measured outside of the behavioral task for each session in order to determine whether changes in detectability are related to learning, or a change in the ability to photostimulate the targeted neurons.

After detection ability for a set of neurons has been established, the experimenter can vary features of the evoked activity to test their contribution to detection. For manipulations of neuron number, the stimulus is replaced with a hologram targeting all, or a subset of the neurons from the set. It is important to maintain the same power per neuron across conditions. For manipulations of timing, it is important to first measure the average latency and jitter of spikes evoked by 2P photostimulation (described in more detail in Subheading 3.1). To control the timing and power of 2P photostimulation, a Pockels cell can be used for rapid and precise control of light delivery. Alternatively, a shutter can be used; however, it is essential to confirm that the animal cannot hear the shutter if detection is being tested. For experiments testing relative timing of neuronal activation, or synchrony, it is important to use a spatial light modulator (SLM) capable of rapidly switching between holograms targeting subsets of the neurons (ideally <10 ms switching time).
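For synchrony manipulations, the stimulus can be thought of as a schedule of hologram onsets, each hologram targeting a subset of the neurons. The sketch below is one hypothetical way to build such a schedule: it spaces onsets evenly over a window centered on a fixed mean latency and rejects schedules faster than the SLM can switch. The function name, the parameter names, and the definition of the spread as the full onset window are assumptions made here, not details taken from the cited work.

```python
def staggered_onsets(neuron_ids, n_groups, spread_ms,
                     mean_latency_ms=45.0, min_switch_ms=10.0):
    """Split neurons into holograms and assign evenly spaced onsets.

    neuron_ids      -- identifiers of the targeted neurons
    n_groups        -- number of sequential holograms (1 = fully synchronous)
    spread_ms       -- time from first to last hologram onset (assumed definition)
    mean_latency_ms -- mean onset relative to inhalation, held constant
    min_switch_ms   -- assumed minimum interval the SLM needs between holograms

    Returns a list of (onset_ms, [neuron ids]) tuples in temporal order.
    """
    if n_groups < 1 or n_groups > len(neuron_ids):
        raise ValueError("n_groups must be between 1 and the number of neurons")
    # Interleaved assignment keeps group sizes as equal as possible.
    groups = [list(neuron_ids[i::n_groups]) for i in range(n_groups)]
    if n_groups == 1:
        return [(mean_latency_ms, groups[0])]
    step = spread_ms / (n_groups - 1)
    if step < min_switch_ms:
        raise ValueError("onset interval is shorter than the SLM switching time")
    first = mean_latency_ms - spread_ms / 2.0
    return [(first + i * step, group) for i, group in enumerate(groups)]
```

Because onsets are centered on the mean latency, the average onset time is the same for every spread, so synchrony can be varied without shifting overall latency. The average across neurons stays constant only when the groups are of equal size.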

If using a go/no-go paradigm, manipulations can be performed in a trial, or blockwise manner. While changing the condition randomly (e.g., number of targeted neurons) on each “go” trial may seem like an unbiased way to test the feature space, it is often best to test a single condition per block of trials. The reason for this is that performance errors come in two flavors, false alarms and misses. Mice tend to make the majority of their incorrect choices as false alarms, ensuring that they do not miss a potential “go” trial or opportunity to obtain a reward. By testing one condition per block of trials, usually composed of 50% go and 50% no-go (no stimulus), false alarms are readily interpretable, as the false alarm rate may increase as the stimulus condition becomes more difficult to detect (e.g., fewer neurons targeted), even if the number of “hits” does not decrease. If different conditions are randomly interleaved in a trial-by-trial manner, one relies only on miss rate to determine the differences in detection performance, since all conditions share the same false alarm rate, significantly reducing the sensitivity of this measure.
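The value of block-specific false alarm rates can be made concrete by computing hit and false alarm rates separately for each block and combining them into a single sensitivity index. The sketch below uses d′ from signal detection theory with a log-linear correction for rates of 0 or 1. The choice of d′ and of this correction is an assumption made for illustration, as the text above does not specify a sensitivity measure, and the function name is hypothetical.

```python
from statistics import NormalDist


def block_sensitivity(outcomes):
    """Summarize one block of go/no-go trials.

    outcomes -- list of (trial_type, licked) tuples, where trial_type is
                'go' or 'no-go' and licked is True if the animal responded.
                The block must contain at least one trial of each type.

    Returns raw hit and false alarm rates plus d' computed from
    log-linear corrected rates (0.5 added to counts, 1 to totals).
    """
    n_go = sum(1 for kind, _ in outcomes if kind == "go")
    n_nogo = sum(1 for kind, _ in outcomes if kind == "no-go")
    if n_go == 0 or n_nogo == 0:
        raise ValueError("block needs both go and no-go trials")
    hits = sum(1 for kind, licked in outcomes if kind == "go" and licked)
    false_alarms = sum(1 for kind, licked in outcomes
                       if kind == "no-go" and licked)
    # The correction keeps the z-transform finite when a rate is 0 or 1.
    z = NormalDist().inv_cdf
    d_prime = (z((hits + 0.5) / (n_go + 1))
               - z((false_alarms + 0.5) / (n_nogo + 1)))
    return {
        "hit_rate": hits / n_go,
        "fa_rate": false_alarms / n_nogo,
        "d_prime": d_prime,
    }
```

Applied block by block, two conditions with identical hit rates can still be separated by their false alarm rates, which is the information lost when conditions are interleaved and share a single pool of no-go trials.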

2.3 Measuring Perceptual Distance of Synthetic Percepts

The second experimental strategy developed recently to connect neural activity features to perception is to measure the perceptual distance of synthetic percepts evoked by optogenetic stimulation. Experiencing a familiar object, for example, a rose, evokes a complex spatiotemporal pattern of activity in sensory areas. What features of this activity determine the identity of the object as “rose” and not another object like “tulip” or “orange”? Are some features generally better at explaining the differences in how stimuli are perceived, regardless of the specific stimuli being compared? By determining these features, we can expose the computational strategies underlying perceptual identity.

Traditionally, it has been difficult to determine the perceptual relevance of different coding features using natural (non-optogenetic) stimuli. One reason for this difficulty is that different features of neural activity often covary with one another as stimulus identity is changed. For example, presentation of a rose, tulip, and orange may each evoke changes in both the firing rate and timing of overlapping sets of neurons. Which of these features (cell identity, rate, timing, etc.) is essential for the animal to discriminate “flower” from “fruit,” or to identify “rose” specifically? While expanding the stimulus set to include more diverse examples may help tease out responses unique to each class or exemplar, biophysical constraints often impose correlations between features, and it can be difficult or impossible to design stimulus sets that fully disentangle their contribution.

The use of natural stimuli to determine perceptually relevant coding features is further hampered by generating inferences purely through correlation. While some features of activity may appear to predict perceptual choices made by the animal, they may not have any actual influence on the behavior. For example, the spike rate of neurons in a particular brain region may be highly predictive of whether an animal will classify a stimulus as “flower” or “fruit,” but it need not be the case that any of these neurons is actually relevant for the discrimination. If the experimenter were to manipulate the spike rate of the neurons (activating or silencing them), they might find no effect on the choices of the animal, despite the strong correlation of the spike rate to the behavior in the normal condition. In this way, merely observing activity and relating features to behavior has limited power for testing causal models of perception.

Precision optogenetics therefore provides an opportunity to independently manipulate activity features and determine their differential impact on perception. Further, this technique provides the opportunity to test whether behavior is guided by combinations, or conjunctions of features (e.g., rate and tuning, or sequential order and phase). By creating fully synthetic stimuli using precision optogenetic stimulation, one can manipulate activity features while an animal performs a recognition, or discrimination task, in which they signal the perceived identity of a stimulus through their behavioral choice. By measuring the frequency with which animals categorize induced activity patterns to be the same, or different percepts, one can determine the relative perceptual distance between synthetic stimuli. Finally, this approach allows the experimenter to determine what individual or combinations of features define the axes of perceptual identity for a particular neural circuit.

This strategy has recently been used in our work to probe the perceptual relevance of different coding features in the mouse olfactory system. Chong, Moroni, et al. (2020) used a combination of genetic labeling (OMP-ChR2, expressing ChR2 only in olfactory sensory neurons [52]) with precise optical targeting using a high-resolution digital micromirror device to project light patterns onto the surface of the olfactory bulb, evoking activity patterns with single glomerulus resolution (Fig. 3a) [30]. In this way, we created synthetic odor stimuli by activating sets of glomeruli with high spatial and temporal precision. We trained water-deprived mice to recognize a single spatiotemporal pattern as a target “odor,” and to report activation of this pattern by licking a waterspout. The mice were trained to discriminate this stimulus from every other non-target pattern (any spatiotemporal pattern not identical to the target), and to report any non-target pattern by licking a different waterspout (Fig. 3b).

Fig. 3
Four illustrations of the perturbations of synthetic odors. A has a sketch of the mouse with a stimulation pattern, sniff signal, and water rewards near its head. B has 3 patterns labeled olfactory bulb, target, and non-target example. C has a pattern of perturbation. In pattern D, the target spots for 3 and 5 shift to right and left, respectively.

Spatial and temporal perturbations of synthetic odors. (a) Schematic of the experimental setup. Dorsal olfactory bulb was exposed by a chronically implanted 3 mm window. Spatiotemporal stimulation patterns, created by a digital micromirror device, were projected onto the olfactory bulb of a head-fixed OMP-ChR2 mouse in front of a pressure sensor for sniff monitoring, and lick spouts delivering water. (b) Schematics for pattern discrimination task. Animals were trained to recognize Target versus Non-target patterns defined on a stimulation grid. Target patterns comprised six spots, initialized randomly but fixed across subsequent sessions, activated in an ordered sequence defined in time where 0 marks inhalation onset. Non-target patterns were six off-Target spots, randomly chosen from trial to trial, with randomized timing within 300 ms from inhalation (~single sniff). (c) Illustration of spatial perturbations: One or multiple spots in Target patterns were randomly replaced with Non-target spots. (d) Illustration of temporal perturbations in which one or multiple spots in Target patterns were temporally shifted. (Adapted from Ref. [30])

After mice learned to perform this task with high accuracy, we tested how recognition changed as we systematically manipulated the activity patterns across several feature dimensions (Fig. 3c, d). Mice experienced a small proportion of “probe” trials (10% of total trials), in which the target stimulus was modified, either by replacing spots with non-target spots (Fig. 3c) or by shifting spots in time (Fig. 3d). We measured the fraction of trials in which mice reported the modified pattern as being like the “target” pattern. This proportion reflected the perceptual distance between the modified and original target pattern.
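The probe-trial readout described above reduces to a per-condition proportion. A minimal sketch follows, assuming each probe trial is logged as a condition label and a Boolean choice; the function name and the data layout are hypothetical.

```python
from collections import defaultdict


def target_report_fraction(probe_trials):
    """Fraction of probe trials reported as 'target', per probe condition.

    probe_trials -- iterable of (condition, chose_target) tuples, where
                    condition labels the perturbation (e.g., how many
                    spots were replaced) and chose_target is True if the
                    animal licked the Target spout.
    """
    counts = defaultdict(lambda: [0, 0])  # condition -> [target choices, trials]
    for condition, chose_target in probe_trials:
        counts[condition][0] += int(chose_target)
        counts[condition][1] += 1
    return {condition: n_target / n_total
            for condition, (n_target, n_total) in counts.items()}
```

A lower fraction for a given perturbation is read as a larger perceptual distance from the trained target pattern. Because probes make up only a small share of trials, each condition typically needs to be pooled over many sessions before the proportions are stable.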

To determine the relative influence of the spatial and temporal features on perception, the experiments of Chong et al. 2020 used precise parameterization of perturbations to the target stimulus. One key finding was a primacy effect in which perturbations to earlier activated glomeruli had larger effects on perceptual responses than later activated glomeruli. Despite the fact that animals could use any available activity features to solve the target vs. non-target categorization, such as glomerular identity or timing, this result suggests that these features do not carry equal weight for informing an olfactory percept, as all the tested mice were more strongly influenced by changes to the earliest spots in the pattern without being explicitly trained to use this feature. While both spatial and temporal perturbations were effective at increasing the perceptual distance from the target pattern, the joint assessment of these features demonstrated that odor identity representation is nuanced and determined along several dimensions simultaneously. This study highlights how a fully synthetic approach can be used to establish basic principles of the neural code informing perception. Models derived from these findings could then be tested and further refined by the use of naturalistic stimuli, ultimately closing the loop between causal manipulation and ethological observation.

2.4 Technical Implementation of Perceptual Distance Experiments

In the previously described experiments, mice performed a two-alternative forced-choice task in which licking either the left lick spout or the right lick spout signaled a choice of either Target or Non-target pattern perceived by the animal. The lick spout assigned to Target or Non-target should be randomly determined for each experimental animal to control for any systematic side bias. Trials consist of a stimulus period, a grace period in which mice can lick without reward or punishment, and a response period in which the first lick determines the animal’s choice. The purpose of a grace period is to reduce the influence of impulsive licking on the trial outcome and is typically ~0.5 s following the stimulus period. This period provides time for an animal that may have been licking in response to the trial initiation to change to an informed licking pattern after experiencing the stimulus. Initially, mice will often be biased toward licking one side over the other [53], so it is important during training to perform a de-biasing procedure which adaptively increases the incidence of trials on the biased-against side [54]. By doing so, side biases can be eliminated before initiating critical phases of the experiment.
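An adaptive de-biasing rule of the kind mentioned above can be sketched in a few lines. The exact rule used in the cited procedure is not given here, so the following is an assumed, simplified version: the probability that the next trial requires a left response is set to the recent fraction of right choices, bounded away from 0 and 1. The function name, window length, and bounds are hypothetical.

```python
def left_trial_probability(recent_choices, window=20, floor=0.2):
    """Probability that the next trial should be rewarded on the left.

    recent_choices -- list of past choices, each 'left' or 'right'
    window         -- number of most recent choices considered (assumed)
    floor          -- lower bound on the probability; the upper bound is
                      1 - floor, so neither side disappears entirely

    An animal that mostly licks right gets more left trials, and vice versa.
    """
    recent = recent_choices[-window:]
    if not recent:
        return 0.5  # no history yet: present both sides equally
    fraction_right = recent.count("right") / len(recent)
    return min(max(fraction_right, floor), 1.0 - floor)
```

Bounding the probability keeps the trial sequence from becoming fully predictable, which would otherwise let the animal solve the task from trial history rather than from the stimulus.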

The first lick during the response period is counted as the animal’s choice and the trial ends, leading to an inter-trial-interval before the next trial. Reward will be provided to the animal at different rates depending on the phase of the experiment. Initially, during the shaping period, animals are trained to discriminate between one Target and one Non-target pattern until animals exceed a threshold of performance (80%). During this phase, animals are rewarded for correct choices 100% of the time. After the initial shaping, Non-target patterns are randomly initialized on each trial, and due to the combinatorics, they never repeat. The animals are also shaped toward a 70% reinforcement rate. At this point, the animals experience test sessions in which probe trials make up 10% of the total trials while Target and Non-target trials comprise 45% each. These probe trials involve perturbations to the target pattern along feature dimensions under study (spot identity, relative timing, shift with respect to respiration, etc.), and are randomly interleaved with the Target and Non-target trial types. Critically, probe trials are never rewarded, allowing the animal’s choices to reflect their internal Target vs Non-target category boundaries.
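
The trial structure of a test session can be sketched as follows. This is a minimal illustration that assumes the 70% reinforcement rate means a correct Target or Non-target choice is rewarded with probability 0.7; the function name, the default session length, and the dictionary keys are hypothetical.

```python
import random


def make_test_session(n_trials=200, p_probe=0.10, p_reward=0.70, seed=0):
    """Build a randomly interleaved list of trials for one test session.

    Probe trials make up p_probe of the session and are never rewarded;
    the remainder is split evenly between Target and Non-target trials.
    """
    rng = random.Random(seed)
    n_probe = round(n_trials * p_probe)
    n_target = (n_trials - n_probe) // 2
    n_nontarget = n_trials - n_probe - n_target
    kinds = (["probe"] * n_probe
             + ["target"] * n_target
             + ["nontarget"] * n_nontarget)
    rng.shuffle(kinds)  # random interleaving of the three trial types

    trials = []
    for kind in kinds:
        # Assumption: non-probe trials are reward-eligible with probability p_reward.
        reward_if_correct = kind != "probe" and rng.random() < p_reward
        trials.append({"kind": kind, "reward_if_correct": reward_if_correct})
    return trials


session = make_test_session()
```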

3 Validating Precise Manipulation

3.1 Characterizing the Scale and Timing of Response to Stimulation

In order to interpret the results of an experiment combining precision optogenetics with behavior, it is critical that the characteristics and specificity of the neural response to the manipulation be known. For example, if one wants to study how the relative order of neurons activated by 2P photostimulation impacts the perceptual quality evoked by the manipulation, it is critical that the timing of cellular responses to stimulation be known, lest the neurons respond out of order. Similarly, if an experiment relies on comparing how two different groups of neurons are perceived when they are separately activated, it is critical to determine the specificity with which the groups can be targeted, as inadvertent activation of neurons in the wrong group could confound the inferences made from an animal’s behavior. We will cover three useful methods for assessing the scale and timing of responses to all-optical targeting and manipulation in vivo which may aid in the design and interpretation of behavioral experiments.

Targeted Electrophysiology

Prior to initiating a study that combines all-optical methods such as 2P photostimulation with behavior, it is strongly advisable to assess the responses of neurons to a range of photostimulation parameters (power, duration, frequency, etc.). In this way, the all-optical system can be properly “tuned” to provide an expected response for a given opsin, cell type, tissue depth, etc. (see Note 3). The efficacy and timing of responses evoked by “standard” approaches, such as a holographic patch or spiral scanning of a focused beam, have been characterized in the literature for only a very limited range of conditions, and rarely in vivo (though a number of recent studies have helped to fill this knowledge gap [18,19,20, 25, 28]). Currently, single-unit electrophysiology is the standard for measuring photocurrents and evoked spiking with high temporal precision. By combining high-impedance electrophysiological recordings performed with a glass pipette and 2P imaging, it is possible to target and record from specific predefined neurons (Fig. 4a).

Fig. 4

Targeted electrophysiology. (a) 2P-guided cell-attached electrophysiological recording in an awake C57/BL6 mouse mitral cell layer co-expressing ChrimsonR-tdTomato (red) and calcium indicator GCaMP6s (green). The neuron (white circle) was targeted by a light patch (scale bar, 20 μm). (b) Examples of electrophysiological recordings during 5, 10, and 20 ms photostimulation. (c) Example raster plot for 10 repetitions of 10 ms photostimulation with 30 mW average power. The response latency is defined as the time from photostimulation onset to the first spike, and jitter is defined as the standard deviation of the latency across photostimulation repetitions. (Adapted from Ref. [25])

To perform targeted recordings, one must begin with an appropriate amplifier, digitizer, and program for recording and aligning electrophysiological data to photostimulation delivery. As the main interface to the tissue, borosilicate glass pipettes pulled to 5–11 MΩ (~1–1.5 μm tip size) are appropriate when coupled to an electrode holder with an outlet for pressure regulation. For juxtacellular recordings (cell abutting, but not internal), electrodes can be filled with a modified current-clamp “external” solution (130 mM K-gluconate) containing fluorescent dye for visualization. In practice, neurons are often labeled with both a calcium indicator (GCaMP6, jRGECO1a, etc.) and a fluorophore indicating the expression of an opsin (ChrimsonR-tdTomato, Chronos-EYFP, etc., though see Note 1). This makes the thin tip of a glass pipette difficult to visualize, since both the green and red imaging channels already contain signal. One solution is to use a mixture of green and red dyes (Alexa 488/594 mixture, Millipore) when preparing the pipette solution, so that the pipette is visible simultaneously on both imaging channels and stands out to some degree from the surrounding tissue.

It is optimal to perform recordings in conditions as similar as possible to behavioral conditions, so it is recommended to perform recordings in awake, head-fixed animals. To achieve this, it is important to have a stable, chronically implanted preparation that permits repeated electrophysiological access to the targeted area, as well as optical clarity for 2P imaging. Both of these requirements can be met by implanting a glass cranial window (Warner) with pre-drilled holes filled with silicone elastomer (Quik-Sil). If appropriately mixed, the elastomer will remain transparent and create an air-tight seal in the cranial window, allowing it to be implanted chronically. Prior to the recording session, the silicone plug can be removed with a pair of fine forceps, exposing the tissue underneath. The brain should be bathed in sterile saline or artificial cerebrospinal fluid for the duration of the recordings, after which the cranial window can be re-sealed using the same technique. This method permits multiple recording sessions for each animal, and can drastically increase the yield of stable recordings, since the implanted cover glass also effectively minimizes brain movement.

For the purpose of measuring the temporal resolution of an all-optical manipulation, measuring spiking delay from the onset of photostimulation and the jitter of the timing of the evoked spiking are essential (Fig. 4b, c). To estimate the spatial resolution of the manipulation, moving the focus of the stimulation laterally (x,y), and in depth (z), should be performed by fixing all features of the photostimulation and moving the objective a fixed amount while recording the effect on spiking. From this, one can measure how far from the targeted neuron the photostimulation could evoke spiking and infer how near another neuron would need to be, to be inadvertently activated by the photostimulation. While this is a useful estimate, in practice “off-target” activation depends on many factors, including the geometry of the area under study and the sparseness of opsin expression, to name a few, meaning the measured lateral and axial resolution is not the same as the effective resolution of the system, which can be estimated using methods in the following sections.
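
Following the definitions given in the legend of Fig. 4, latency and jitter can be computed from spike times with a few lines of code. The sketch below is a minimal example; the function name, the 50 ms search window, and the use of the sample standard deviation are assumptions, not details of the analysis in Ref. [25].

```python
import numpy as np


def latency_and_jitter(spike_times_per_trial, stim_onsets, window=0.05):
    """First-spike latency, jitter, and response reliability.

    spike_times_per_trial: one sequence of spike times (s) per repetition.
    stim_onsets: photostimulation onset time (s) for each repetition.
    window: time after onset (s) in which a spike counts as evoked (assumed).

    Returns (mean latency, jitter, fraction of repetitions with a spike),
    where jitter is the standard deviation of the first-spike latency.
    """
    latencies = []
    for spikes, onset in zip(spike_times_per_trial, stim_onsets):
        spikes = np.asarray(spikes, dtype=float)
        evoked = spikes[(spikes >= onset) & (spikes < onset + window)]
        if evoked.size:
            latencies.append(evoked.min() - onset)
    latencies = np.asarray(latencies)
    if latencies.size == 0:
        return np.nan, np.nan, 0.0
    jitter = latencies.std(ddof=1) if latencies.size > 1 else np.nan
    return latencies.mean(), jitter, latencies.size / len(stim_onsets)


# Three repetitions; the third has no spike inside the window.
spikes = [[0.105, 0.2], [1.107], [2.5]]
onsets = [0.1, 1.1, 2.1]
mean_latency, jitter, reliability = latency_and_jitter(spikes, onsets)
```

Repeating this measurement while stepping the objective in x, y, and z gives spiking probability as a function of distance from the targeted cell.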

Network Responses to Photostimulation

It is highly desirable to demonstrate that the scale of the manipulation employed in an experiment combining all-optical manipulation with behavior is matched to that of the neural representation being probed. For example, if targeting individual neurons tuned to a particular orientation in visual cortex, and these neurons are interdigitated with neurons tuned to other orientations in a “salt and pepper” organization, it is sensible to demonstrate single-neuron resolution of photostimulation. A technique to estimate the effective scale of a manipulation is to target individual neurons composing some representation and observe via 2P calcium imaging the effect of stimulation on the non-targeted neurons in the field of view (FOV). For example, if an experiment involves simultaneously targeting 30 neurons that share the same tuning, it is useful to target each of these cells individually and measure the effective network response to each target in addition to targeting the full pattern. In this way, one can screen for undesirable off-target effects or reject neurons that do not respond on their own. Both of these factors could be harder to observe when targeting the full pattern, because a great degree of network modulation is expected as the number of targeted neurons increases, and targeted neurons that are modulated by the full pattern may not actually be responding to photostimulation directly, but just modulated along with the rest of the network.

To screen for off-target effects, or inadvertent photostimulation of neighboring neurons, it is useful to look at whether excitatory responses of untargeted neurons cluster around photostimulation sites (Fig. 5). A common way of representing the spatial extent of the manipulation is to bin the average response of all neurons in the imaging plane by their centroid distance to the photostimulation targets (Fig. 5a). Then one can plot the distribution of responses across a range of radial eccentricities for neurons that do and do not express the opsin (Fig. 5b, c). To visualize this distribution, it is useful to compute a spatial heatmap of the average response to photostimulation across neurons (Fig. 5b). To perform this analysis, for each single-cell stimulation target, compute the average response in a brief time window (e.g., 100 ms) following photostimulation for all the neurons in the field of view expressing the opsin. Using the spatial footprint of each cell body (outline), create an image where each cell body spatial footprint is labeled (colored) with the average response to a photostimulation target, and repeat the procedure to produce images for all the photostimulation targets. Finally, shift the x–y position of the images to align the centroids of the photostimulation targets and average across the images to produce a spatial heatmap of the response to photostimulation (excluding space between neurons from the average). From this visualization, one can determine whether there is a spatial bias of excitatory or inhibitory responses in the area surrounding the targeted neurons. While an idealized version of the result would be that excitatory responses disappear within one neuron’s radius away from the photostimulation targets, in practice this is unrealistic. For example, there may be a large degree of local excitatory connectivity which falls off with eccentricity, making a gradually decaying response profile fully expected.
How then can you disambiguate synaptically (network) driven responses in neighboring neurons from “off-target” photostimulation (Fig. 5c–e)? If opsin expression is sparser than expression of the functional indicator, this means a subset of the neurons in a FOV will be opsin-negative, providing an opportunity to compare their responses to single-cell photostimulation with cells that are opsin-positive (Fig. 5c). If the non-targeted opsin-positive neurons exceed the response of the opsin-negative neurons, this is evidence that the non-targeted opsin-positive neurons are directly activated by the photostimulation laser (Hypothesis 1: “Off-target stimulation,” Fig. 5d). If the non-targeted opsin-positive neurons do not respond in excess of the opsin-negative neurons, this is compelling evidence that network modulation can be explained by synaptic effects alone (Hypothesis 2: “Network effects,” Fig. 5e, and the experimental outcome in Fig. 5c). Confidence in this interpretation may depend on similar populations of cells being labeled with the opsin in a random and unbiased way or using a preparation which is engineered to be sparsely labeled. However, even in densely labeled samples this approach can be beneficial for identifying non-responsive targeted neurons to exclude from multi-neuron patterns and measuring the spatial scale of the influence of individual neurons in a larger, multiple-neuron manipulation.
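
The radial binning used for this comparison can be sketched as follows for a single photostimulation target. This is a minimal example under stated assumptions: the function name and argument layout are hypothetical, the 50 μm bins follow the example bin size in the legend of Fig. 5, and the targeted cell itself is assumed to have been removed from the inputs beforehand.

```python
import numpy as np


def radial_response_profile(centroids, responses, target_xy, is_opsin_pos,
                            bin_edges=(0, 50, 100, 150, 200, 250)):
    """Mean response per radial bin, separately for Opsin+ and Opsin- cells.

    centroids: (n_cells, 2) x-y positions in micrometers.
    responses: (n_cells,) mean dF/F in the window after photostimulation.
    target_xy: x-y position of the photostimulation target.
    is_opsin_pos: (n_cells,) boolean opsin-expression label.
    """
    centroids = np.asarray(centroids, dtype=float)
    responses = np.asarray(responses, dtype=float)
    is_opsin_pos = np.asarray(is_opsin_pos, dtype=bool)
    edges = np.asarray(bin_edges, dtype=float)

    dist = np.linalg.norm(centroids - np.asarray(target_xy, dtype=float), axis=1)
    # Bin i covers [edges[i], edges[i + 1]); cells beyond the last edge are dropped.
    which_bin = np.digitize(dist, edges) - 1
    n_bins = len(edges) - 1

    profile = {"opsin_pos": np.full(n_bins, np.nan),
               "opsin_neg": np.full(n_bins, np.nan)}
    for i in range(n_bins):
        in_bin = which_bin == i
        for key, mask in (("opsin_pos", is_opsin_pos), ("opsin_neg", ~is_opsin_pos)):
            selected = in_bin & mask
            if selected.any():
                profile[key][i] = responses[selected].mean()
    return profile


profile = radial_response_profile(
    centroids=[(10, 0), (60, 0), (70, 0), (300, 0)],
    responses=[0.5, 0.2, 0.1, 9.0],
    target_xy=(0, 0),
    is_opsin_pos=[True, True, False, True],
)
```

Averaging the per-target profiles across targets, and then comparing the Opsin+ and Opsin− curves bin by bin, corresponds to the comparison drawn between Hypothesis 1 and Hypothesis 2.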

Fig. 5

Assessing specificity of photostimulation-evoked activity. (a) Example FOV centered on the mitral cell layer of a Tbet-Cre mouse. One mitral cell is targeted for 2P photostimulation (small white circle, 15 μm diameter). Red labeling corresponds to FLEX-ChrimsonR-tdTomato expression limited to a subset of mitral cells, and green labeling corresponds to pan-neuronal GCaMP6s expression. Normalized fluorescence (ΔF/F) is averaged for every neuron in the 100 ms period following photostimulation and averaged across neurons occupying the same radial distance from the target (example bin size = 50 μm). (b) A spatial heatmap of average response vs. ROI position centered on 49 mitral cell targets (n = 2 Tbet-cre mice, 138 total neurons, 3631 photostimulations, 3 pulses per photostimulation: 10 ms on – 10 ms off, at 30 mW, or 0.19 mW/μm2). Only cells labeled with ChrimsonR-tdTomato were included. The “targeted” bin is outlined by a black circle. (c) The average radial decay of responses across ChrimsonR-tdTomato positive neurons (Opsin+, red) and ChrimsonR-tdTomato negative neurons (Opsin−, green). Cell responses radially binned and averaged for each targeted neuron and bin means were averaged across targets (mean ± s.e.m., 49 targeted neurons, n = 2 Tbet-cre mice). Asterisks indicate a significant difference between the average binned response of ChrimsonR+ and ChrimsonR− neurons (*p < 0.05, two sample t-test, Holm-Bonferroni corrected for multiple comparisons). (d) A cartoon of the hypothesis that photostimulation activates neighboring Opsin+ neurons due to off-target photostimulation: Opsin+ responses will exceed the Opsin− responses, reflecting inadvertent stimulation of nearby Opsin+ neurons. (e) A cartoon of the hypothesis that single-cell photostimulation leads to responses of nearby neurons due to network effects: Opsin+ responses do not exceed the responses of Opsin− neurons. (Adapted from Ref. [25])

Omit-One-Target

When targeting many neurons simultaneously, the probability of inadvertently activating untargeted neurons increases. While somatically targeted opsins and axially confined stimulation techniques like temporal focusing help to decrease the incidence of inadvertent activation, it is beneficial to directly test targeting specificity in this more complex regime. A useful approach is to perform an “omit-one-target” experiment and analysis (Fig. 6). After defining a set of neurons to target simultaneously, one can generate holograms that systematically omit each spot from the full pattern (Fig. 6a). These holograms can be used for photostimulation, with multiple repetitions randomly interleaved into blocks of trials. Then it is possible to compare the response in each neuron when it was targeted vs. when it was omitted (Fig. 6b). While ideally the response of every omitted neuron would fall to zero, this is unrealistic, especially if targeting groups of neurons that may compose an interconnected circuit. However, if target selection was biased toward neurons that respond strongly to photostimulation, a general reduction in response magnitude for the omitted neurons can serve as strong evidence for cellular specificity. Demonstrating that neurons within this population can be individually controlled in the presence of simultaneous photostimulation of a large number of targets validates that manipulations of neuron number, or specific neuron identity, can be properly interpreted from behavioral experiments.
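
The bookkeeping for an omit-one-target experiment can be sketched as follows. The function names, the pattern labels, and the nested-dictionary layout of the responses are assumptions made for illustration; hologram computation itself is hardware specific and is not shown.

```python
import random


def omit_one_patterns(targets):
    """Return the full pattern plus one pattern per omitted target."""
    targets = list(targets)
    patterns = {"all": targets}
    for t in targets:
        patterns[f"omit_{t}"] = [u for u in targets if u != t]
    return patterns


def interleaved_blocks(pattern_names, n_reps, seed=0):
    """Each block presents every pattern once, in a fresh random order."""
    rng = random.Random(seed)
    blocks = []
    for _ in range(n_reps):
        block = list(pattern_names)
        rng.shuffle(block)
        blocks.append(block)
    return blocks


def targeted_vs_omitted(mean_response, targets):
    """Pair each cell's response when targeted with its response when omitted.

    mean_response[pattern_name][cell] is the cell's mean dF/F under that pattern.
    """
    return {t: (mean_response["all"][t], mean_response[f"omit_{t}"][t])
            for t in targets}


patterns = omit_one_patterns([1, 2, 3])
blocks = interleaved_blocks(list(patterns), n_reps=5)
```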

Fig. 6

Characterizing response in omitted photostimulation targets. (a) Schematic of an omit-one-target experiment. Top. Pattern of spots targeting many neurons simultaneously. Bottom. Omit one target condition in which one of the spots is removed from the stimulation pattern. All spots are individually dropped, and the holograms are randomly presented to the mouse along with the “all targeted” hologram. (b) Average response to 10 ms photostimulation. The average response per cell when it was targeted vs. when it was omitted with all other cells targeted. (**p < 0.001, two-sample t-test, targeted: 0.08 ± 0.006, vs. omitted: 0.02 ± 0.007, mean ΔF/F ± s.e.m, 28 targeted cells (19 responsive), n = 2 WT mice, 30 photostimulations per datapoint, 10 ms duration, 20 mW/patch, 0.125 mW/μm2). (Adapted from Ref. [25])

3.2 Registration Between Photostimulation and Imaging, and In Situ Evaluation of Targeting

In addition to measuring the response characteristics of neurons under study in an all-optical behavioral experiment, it is also essential to characterize the physical position and scale of the stimulus being applied. While extensive characterizations of the point spread function (PSF) of the stimulus should be performed prior to engaging in in vivo experiments, there are several procedures that can be performed on a routine basis during behavioral experiments to significantly increase confidence in the experimental outcomes. These involve registering the photostimulation and imaging arms of the microscope into alignment and methods for online motion correction to correct for slow drift in the positions of targeted neurons. Additionally, we highlight a potentially transformative approach, holographic optogenetic confocally unraveled sculpting microscopy (HOCUS), for in situ evaluation of stimulus shape and position deep in the tissue of awake behaving animals [31].

Calibration

It is critical to start by ensuring that the optics of the 2P photostimulation and imaging arms of the microscope have been appropriately selected and aligned to permit imaging and photostimulation of the same FOV. Still, small drifts in the alignment can occur over time. This might be due to temperature fluctuations, mechanical stress, vibrations, changes in laser output angle, among many other factors. Luckily, small translations, rotations, and shearing of the photostimulation FOV relative to the imaging FOV can be compensated for with a simple calibration procedure. A calibration pattern can be burned onto a fluorescent plate by the photostimulation system. Then the plate can be imaged, and the calibration pattern can be used to register the two arms of the microscope by computing the disparity between the desired pattern and observed pattern.

Using a predefined calibration pattern made of several small (~1 μm) targets, deliver 1–2 ms of photostimulation while increasing the average power at the sample until small burn marks appear on the fluorescent plate. The goal is to burn all of the spots in the calibration pattern evenly at a small size. Then, capture an image of the burned pattern and compute the inverse rotation matrix and offset from the desired pattern. This rotation and offset can be applied when generating holograms to exactly compensate for the effects of small drifts in alignment between the photostimulation and imaging arms.
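
The registration step can be illustrated with a least-squares fit between the desired and observed spot positions. The sketch below fits a general affine transform (which covers the translations, rotations, and shearing mentioned above) instead of a pure rotation; this substitution, the function names, and the coordinate conventions are assumptions, and the spot centroids are assumed to have already been extracted from the image of the burned pattern.

```python
import numpy as np


def fit_affine(desired_xy, observed_xy):
    """Least-squares affine map: observed ~= desired @ A.T + b.

    Requires at least three non-collinear calibration spots.
    """
    desired = np.asarray(desired_xy, dtype=float)
    observed = np.asarray(observed_xy, dtype=float)
    X = np.hstack([desired, np.ones((len(desired), 1))])   # shape (n, 3)
    params, *_ = np.linalg.lstsq(X, observed, rcond=None)  # shape (3, 2)
    A = params[:2].T  # 2 x 2 linear part (rotation, scale, shear)
    b = params[2]     # offset
    return A, b


def precompensate(target_xy, A, b):
    """Coordinates to request so that the light lands at target_xy."""
    target = np.asarray(target_xy, dtype=float)
    return (target - b) @ np.linalg.inv(A).T


# Simulated calibration: a 2 degree rotation, slight shear, and an offset.
theta = np.deg2rad(2.0)
A_true = np.array([[np.cos(theta), -np.sin(theta) + 0.01],
                   [np.sin(theta), np.cos(theta)]])
b_true = np.array([1.5, -0.8])
desired = np.array([[0, 0], [100, 0], [0, 100], [100, 100], [50, 50]], float)
observed = desired @ A_true.T + b_true

A, b = fit_affine(desired, observed)
requested = precompensate(desired, A, b)
```

Applying the fitted transform to `requested` returns the desired positions, which is the sense in which the inverse transform compensates for the measured misalignment.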

Online Motion Correction

For behavioral experiments that require targeting the same neurons for photostimulation across days, it is useful to employ an online motion correction method (see Note 2). In the experiments described by Gill, Lerman et al., 2020, the FOV was first aligned to a reference image manually, then the position was fine-tuned automatically using a custom-designed closed-loop algorithm, implemented as a module within ScanImage software [25, 55] (Vidrio Technologies). This algorithm attempted to minimize the difference between the reference image and the FOV by iteratively moving the microscope stage (Sutter 285) to reduce the residual displacement computed using a rigid motion correction package (NoRMCorre, Flatiron Institute [56]). The optimization typically converged within 10–15 s, once the residual displacement vector was reduced to <0.5 μm in magnitude. In addition to aligning the FOV across days, we performed this routine between consecutive blocks (60 trials, 6–9 min) during each behavioral session to minimize the effect of slow xy drift due to brain and microscope motion, thereby ensuring that the photostimulation targets remained consistent throughout each session. We monitored for drift in the z dimension as well, which was manually corrected using the reference image between blocks if necessary, though displacement was typically small (~3 μm in a 1.5 h session).
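
The logic of such a closed-loop alignment can be sketched in a few lines. This is not the ScanImage module or NoRMCorre: the displacement estimate below uses plain integer-pixel phase correlation as a stand-in, and `acquire` and `move_stage` are hypothetical callbacks whose sign and axis conventions would need to be matched to the actual stage.

```python
import numpy as np


def estimate_shift(reference, image):
    """Integer-pixel (dy, dx) displacement of image relative to reference."""
    cross = np.fft.fft2(image) * np.conj(np.fft.fft2(reference))
    cross /= np.abs(cross) + 1e-12           # keep phase information only
    corr = np.fft.ifft2(cross).real
    peak = np.array(np.unravel_index(np.argmax(corr), corr.shape), dtype=float)
    shape = np.array(corr.shape, dtype=float)
    wrap = peak > shape / 2                  # map large indices to negative shifts
    peak[wrap] -= shape[wrap]
    return peak


def align_fov(reference, acquire, move_stage, um_per_px, tol_um=0.5, max_iter=20):
    """Move the stage until the residual displacement falls below tol_um.

    acquire(): returns the current FOV image (hypothetical callback).
    move_stage(delta_um): moves the FOV by (dy, dx) micrometers (hypothetical).
    Returns (converged, number of stage moves).
    """
    for n_moves in range(max_iter):
        shift_um = estimate_shift(reference, acquire()) * um_per_px
        if np.hypot(*shift_um) < tol_um:
            return True, n_moves
        move_stage(-shift_um)                # cancel the measured displacement
    return False, max_iter
```

Because this stand-in only resolves whole pixels, the 0.5 μm criterion is meaningful here only when the pixel size is at or below that scale; a sub-pixel registration method would be needed otherwise.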

In Situ Evaluation of Holographic Patterns

2-photon precision optogenetics experiments often rely on holographic wavefront shaping techniques to generate light patterns targeted to individual neurons deep in living tissue. Despite careful alignment between the stimulation and imaging fields of view, light propagating through the brain can undergo significant tissue-induced distortions, mainly through scattering, that lead to a discrepancy between the desired and actual light patterns reaching the neurons. These effects can be unpredictable, as distortions will vary as a function of specific tissue geometry and composition. Distortions are especially detrimental to experiments combining cellular photostimulation with behavior, as the shape and position of holographic patches may be far from desired values, limiting the interpretability of perceptual judgements.

In order to directly measure the effects of tissue-induced distortions on projected light patterns, Lerman et al. 2019 described a new method, holographic optogenetics confocally unraveled sculpting (HOCUS), for real-time, in situ evaluation of holographic light patterns (Fig. 7) [31]. This technique involves confocally descanning reflections from light patterns focused into the brain. Photons emitted from the sample, due to ballistic reflections of light from the photostimulation laser source, are imaged with a confocal detection system. This system can be added to most microscopes combining 2P imaging and photostimulation, making use of scanning optics typically used for 2P imaging in a reverse direction to descan reflections from a static light pattern projected into the tissue (Fig. 7a). Descanned light is focused by an electrically tunable lens to a pinhole to confocally reject out-of-focus light, permitting an evaluation of reflected light patterns at the focal plane (Fig. 7a, b). By combining images collected by 2P imaging and HOCUS (typically averaging frames collected over a second), targeted neurons and holographic patches as they appear in the brain can be viewed simultaneously for evaluation. This technique is capable of measuring tissue-induced distortions unique to a particular field of view and holographic pattern, and enables real-time correction of holographic spot position by adjusting the hologram to compensate for any difference from the desired location. This technique could also be used, in principle, to optimize the generated holograms to compensate for tissue-induced distortions in the shape of the light patches. Ultimately, directly viewing and correcting deviations between the desired light pattern and actual light pattern being used to photostimulate neurons could significantly improve the reproducibility of experiments combining 2P holographic stimulation with behavior.

Fig. 7

Visualizing light patterns in vivo with HOCUS. (a) The 2P imaging path is combined with the holographic 2P photostimulation path for in vivo experiments in head-fixed, behaving mice. The reflected stimulation light passes through the PBS, is descanned by the mirrors, and is then reflected by the dichroic mirror through the pinhole onto the detector. PBS polarizing beam splitter, PMT1, PMT2 photomultiplier tubes, SLM spatial light modulator, DET detector. (b) A merged image of GCaMP6s expression (green) and the HOCUS-imaged reflected stimulation light (red) showing a pattern of four light spots projected onto the brain, 120 μm deep in the olfactory bulb, and positioned on four respective neurons. Scale bar: 25 μm. (Adapted from Ref. [31])

3.3 Assessing the Reliability of Behavioral Readout

While initial measurements of the responses evoked by photostimulation as well as calibrations to ensure proper targeting are essential for experiments combining all-optical manipulations with behavior, they are not the only controls necessary to be confident in a behavioral result. As an additional precaution, it is extremely useful to ensure that the observed behavioral effects depend only on manipulation of neural activity and not other features of the stimulus. We provide two methods for handling this requirement in the 1P pattern stimulation and 2P photostimulation regimes.

Stimulation Masking

When designing a stimulus dependency control in the 1P regime, a primary factor to consider is that animals may be able to see the light being used for stimulation. Even if light is delivered through an insulated optic fiber, preventing the animal from detecting the light externally, it is possible that light delivered within the brain can still stimulate the retina. Humans and mice are capable of visual detection at or near the level of single photons, so even moderate light powers in highly scattering media run the risk of inadvertent visual detection through light traveling within the brain and ultimately to the retina [38, 57]. This can seriously affect the outcome of an experiment in which animals are asked to detect photostimulation of groups of neurons, or discriminate between patterns of activation, as an animal could potentially learn to solve the tasks partially, or entirely, using their vision. A common control for this is to perform the same behavioral experiments on animals that lack a functional opsin and demonstrate that they do not perform the task above chance level. This is a useful control for short-term experiments; however, this does not rule out the possibility of a shifting strategy in opsin-positive animals over a longer time course. All-optical experiments exploring the effects of a large photostimulation parameter space on detection or discrimination often take weeks or months to complete. While animals may initially base their choices on direct neural activation, if visual cues exist, it is possible for them to later switch strategies to rely, to some degree, on directly seeing the stimulus. A solution for this is to use a set of “blanking LEDs” matched to the wavelength of light used for stimulation. These can be positioned near the eyes and triggered during behavioral trials. 
If the blanking LEDs are significantly brighter than the stimulation, this removes the possibility of using the stimulation light as a cue, as the animals will see a strong light of the same wavelength regardless of the stimulation condition.

Sham-Photostimulation

Experiments involving 2P photostimulation provide much less opportunity for inadvertent visual stimulation, since the wavelengths involved are typically near-infrared and thus invisible to rodents and primates. Still, 2P photostimulation comes with its own caveats, as the light powers used tend to be a great deal higher than those used for 1P photostimulation. This introduces the possibility that animals could sense changes in light-induced heating within the brain through its effects on neural activity, or via tactile stimulation of surrounding tissue (see Note 3 and Ref. [58]). Demonstrating that opsin-negative animals cannot detect or discriminate photostimulation patterns, or that animals cannot do so when opsin-negative neurons are targeted, does not exclude the possibility that opsin-positive animals engaged in extensive behavioral training could learn to use heat as a cue.

An ideal control would be to provide a version of the stimulus that reproduces all the features of the original photostimulation, but does not evoke spiking in the targeted neurons. For this, we can leverage the non-linear nature of 2P excitation. Since 2P excitation relies on the peak power of laser pulses reaching the sample, increasing the duration of the laser pulses while fixing the average power can effectively reduce 2P excitation, and thus activation of the opsin (Fig. 8a). Many lasers used for 2P photostimulation have an internal or external compressor for dispersion compensation, or to maximize peak power of the output pulses. It is possible to change the pulse width at the sample from a typical value of ~200 fs to >15 ps by adjusting the laser’s compressor, provided it has the appropriate range, all while keeping the average power, and therefore the amount of light delivered to the tissue, constant. Therefore, lengthening the pulses provides the opportunity for a “sham” photostimulation control that can reproduce the heat and possible indirect sensory effects present during a behavioral task, but is capable of eliminating spiking induced by 2P photostimulation by reducing 2P excitation by several orders of magnitude (Fig. 8b, c). By interleaving blocks of trials identical to the typical behavioral experiment, but using the sham photostimulation control, it is possible to confirm to what degree the behavior relies on evoked spiking, and not on other factors (Fig. 8d). This method could even be used to calibrate the appropriate average power delivered during the behavior, as the power could be increased during the sham photostimulation condition until it is detectable, then reduced until it is well below the detectable level for the rest of the experiment.
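
The dependence of 2P excitation on pulse duration can be made concrete with an idealized calculation. The sketch below assumes rectangular pulses, for which the time-averaged squared power scales as P_avg² / (f · τ); the function name and the 1 MHz repetition rate are hypothetical, and real pulse shapes, dispersion, and opsin kinetics are ignored.

```python
import math


def two_photon_rate(avg_power_w, rep_rate_hz, pulse_width_s):
    """Time-averaged 2P excitation in arbitrary units for rectangular pulses.

    <P(t)^2> = duty_cycle * peak_power^2 = P_avg^2 / (f * tau).
    """
    duty_cycle = rep_rate_hz * pulse_width_s
    peak_power = avg_power_w / duty_cycle
    return duty_cycle * peak_power ** 2


avg_power = 0.030   # 30 mW, as in the example of Fig. 8c
rep_rate = 1e6      # hypothetical repetition rate; it cancels in the ratio

stim = two_photon_rate(avg_power, rep_rate, 200e-15)   # ~200 fs pulses
sham = two_photon_rate(avg_power, rep_rate, 15e-12)    # 15 ps pulses
reduction = stim / sham   # equals the ratio of pulse widths, 15 ps / 200 fs
```

Under this simple scaling, the time-averaged 2P excitation falls in proportion to the pulse width at fixed average power, so the nominal 200 fs to 15 ps change already gives a reduction of roughly 75-fold, and longer pulses reduce it further. The thresholded nature of spiking sketched in Fig. 8a may make the functional effect on evoked activity larger than this ratio alone suggests.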

Fig. 8

Sham-photostimulation control. (a) A schematic demonstrating the effect of tuning the laser pulse duration. Time dependence of laser power for pulse trains with the same pulse frequency (f) and average power, but different pulse durations: short pulse, τS (red), and long pulse, τL (gray). To photostimulate a cell, laser power must exceed a certain threshold, Pth. (b) Left, a table summarizing the differences in effects evoked by the short and long pulse duration stimuli. Right, a schematic of the behavioral setup for the sham photostimulation control experiment. (c) Representative example raster plots (top) and peristimulus time histograms (PSTHs) (bottom) for short pulse photostimulation (~200 fs, red) and long pulse sham photostimulation (control, ≥15 ps, gray) (20 trials per condition, 30 mW, 10 ms illumination, n = 1 cell in 1 WT mouse). (d) Detection accuracy as a function of photostimulation condition. During the sham-photostimulation control blocks, detection accuracy dropped to chance level (0.5 ± 0.003, mean ± s.e.m., p = 0.37, one-sample t-test, 0.06–0.125 mW/μm2, n = 5 mice, 2 WT (filled circles) and 3 Tbet-cre (empty circles)) and was significantly different from both pre- and post-control measurements (p < 0.001, Fisher’s exact test, 0.06–0.125 mW/μm2, n = 5 mice, 2 WT (filled circles) and 3 Tbet-cre (empty circles)). (Adapted from Ref. [25])

4 Notes

  1.

    Proper co-expression of the opsin and activity indicator is critical for any study seeking to use precision optogenetics to probe perception. In our experience, this is the primary factor limiting the success of experiments. Many combinations of opsin and calcium indicator simply do not co-express uniformly following viral injection without some adjustment of the viral titres, the ratio between viruses, and the total volume injected. Further, serotype can play a role, leading to unpredictable tropism or competition, where some cells express only the opsin and others only the indicator. In practice, this means it is advisable to spend considerable time carefully testing the co-expression and responsiveness of neurons following the injection of different constructs in a range of titres, ratios, and volumes prior to conducting behavioral experiments. Even when a suitable combination is found, there can still be dramatic variability in expression and in the proportion of responsive neurons between experimental animals. It is therefore useful to inject and implant more animals than are predicted to be necessary in order to choose a cohort suitable for the study. The availability of transgenic mice expressing the calcium indicator, the opsin, or both may greatly alleviate issues with expression in the future; however, it is important to note that transgenic expression is often substantially lower than virally mediated expression, leading to a significant increase in the laser power necessary for imaging and/or photostimulation. Constructs enabling bicistronic expression of both an opsin and a calcium indicator from a single viral vector may also improve outcomes and have already been used for probing perception [24], though it is important to note that a 1:1 ratio in expression may not always be ideal. For example, when an opsin is very sensitive and a calcium indicator is relatively dim, the high laser power necessary for imaging could introduce substantial crosstalk between the imaging laser and opsin.

  2.

    It is extremely important to compensate for the movement of the brain to ensure that targeted neurons are properly illuminated across each experimental session. The brain is non-rigid and can experience both fast movements (e.g., artifacts of licking or animal locomotion) and slow movements (swelling and contracting on the order of minutes to hours). While trials containing fast movements can be rejected after the experiment has been conducted, slow movements can lead to a substantial difference between the location of the holographic spots and the targeted neurons that cannot be corrected post-hoc. For example, one may generate a hologram at the start of a session to target a group of neurons, only to find at the end of the session that the holographic spots no longer align with the locations of the targets since the brain has shifted relative to the FOV of the microscope. This may lead to a corresponding drop in behavioral performance across the session, and significantly impact the interpretation of the results. Therefore, it is best to re-align the microscope position with a reference image every few minutes either manually or using an online motion correction procedure (e.g., the method described in Subheading 3.2).

  3.

    Increasing the light power delivered to targeted neurons does not always lead to an improvement in photostimulation efficacy and may be detrimental at high values. The spiking response of individual neurons will generally increase in rate and consistency across repetitions when the average photostimulation power is increased, until the response saturates at a range of values that is particular to each neuron. The point of saturation has been exceeded when increasing the average power does not activate significantly more opsin proteins, or when the maximum firing rate of the cell has been reached. Further increasing the photostimulation power beyond the minimal required intensity has several disadvantages including (1) increasing the likelihood of stimulating untargeted neurons, especially in the axial dimension, and (2) causing excess tissue heating which may lead to physiological and perceptual artifacts [58]. At the limit, high photostimulation power may lead to cellular ablation, and, in practice, the range of effective photostimulation powers is often near this threshold (though the precise power limit depends on many factors such as depth, vasculature, wavelength, etc.). For these reasons, prior to conducting a behavioral experiment, it is useful to measure the response of individual neurons to a range of light powers in order to estimate the minimal light power necessary for photostimulation.
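
The power titration suggested in Note 3 can be sketched as a simple selection rule. The example below is a minimal illustration, not the procedure used by the authors: it assumes that the response of a neuron (here, spike probability per trial) has been measured at several average powers, and it returns the lowest tested power at which the response reaches a chosen fraction of the largest response observed. The function name, the 90% criterion, and all numerical values are hypothetical.

```python
def minimal_effective_power(powers_mw, responses, fraction=0.9):
    """Return the lowest tested power whose response reaches `fraction`
    of the largest response observed across the tested powers.

    powers_mw -- average photostimulation powers that were tested
    responses -- response measured at each power (e.g., spike probability)
    Returns None if the neuron never responds.
    """
    if len(powers_mw) != len(responses) or len(powers_mw) == 0:
        raise ValueError("powers and responses must be non-empty and equal in length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Sort by power so that the first power meeting the criterion is the lowest.
    pairs = sorted(zip(powers_mw, responses))
    plateau = max(response for _, response in pairs)
    if plateau <= 0:
        return None

    for power, response in pairs:
        if response >= fraction * plateau:
            return power
    return None


if __name__ == "__main__":
    # Made-up titration data for a single neuron, for illustration only.
    powers_mw = [5, 10, 15, 20, 30, 40]
    spike_probability = [0.05, 0.30, 0.70, 0.92, 0.95, 0.96]
    print(minimal_effective_power(powers_mw, spike_probability))
```

Because saturation occurs at a range of powers particular to each neuron, a rule of this kind would need to be applied per neuron, and the chosen fraction is a judgment call that trades response reliability against off-target activation and heating.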

5 Outlook

The possibility of linking precise coding features to specific behaviors now permits questions previously limited to theory and speculation to be directly addressed. At the coarsest level, determining the essential building blocks for perception through studies of detection using both real and synthetic stimuli will define the boundaries within which to explore the perceptual quality imparted by specific neurons and patterns of activity. By manipulating sensory circuits at a behaviorally and physiologically relevant spatiotemporal scale, a relationship can be established between the perceptual space and the feature space of neural activity. However, the precision of the inferences that can be drawn about this relationship is jointly determined by the resolution of both the behavioral metrics and neural manipulations employed. Therefore, we can expect advances in both capacities as the field continues to mature.

It is inevitable that the opsins, optics, and technology supporting precision optogenetics will continue to improve, ushering in a host of possibilities for bidirectional modulation of increasingly large and diverse neural representations. Less apparent is how behavioral methodologies will adapt to meet the nuance and complexity permitted by these tools. As the dimensionality of the neural feature space explorable by optical manipulations increases, binary behavioral readouts (lick vs. no lick) and simple stimulus-reward associations (stim 1 = rewarded, stim 2 = unrewarded) may be insufficient or, at best, inefficient for mapping activity features to perception.

To meet these changing demands, our research groups, among others, are exploring new paradigms, as well as adapting existing methods traditionally overlooked in rodent behavior, that increase the information gained about perceptual quality from each judgement made by an animal. Two examples are continuous report, in which an animal can smoothly adjust the synthetic stimulus to match or deviate from an internal template (potentially derived from a real stimulus), and delayed-match-to-sample, in which two stimuli are directly compared within each trial to determine if they are perceptually identical or different. These methods have the advantage of allowing an animal to directly report the meaningful combination of features composing a percept (continuous report), and of allowing many synthetic and natural stimuli to be compared within one experimental session (delayed-match-to-sample). The challenge remains to optimize these methods for compatibility with head-fixed behavior and to accelerate training by exploring more intuitive readouts of choice.

A related and equally important problem involves determining which features to manipulate (which neurons, timing, number of spikes, etc.) to maximize the information gained from finite-length experiments. As the number of neurons and features that can be addressed in a single experiment increases, screening the perceptual impact of all combinations of features will become prohibitively time-consuming. Therefore, it is important for experiments to be guided by models implicating which neurons and combinations of features are likely to have the greatest perceptual impact. One recent method involves calculating the “intersection information” of neurons, using a statistical approach to identify activity features carrying both stimulus and choice information during a sensory-guided behavior, and laying out predictions for how modulation of features carrying intersection information will affect behavior [59]. Another approach is to infer the functional connectivity of a local circuit from the effects of focal stimulation of component neurons [28]. Using the inferred structure, one could predict how the effects of stimulation will propagate through the circuit, and the stimulation could be designed to activate specific modes of activity, testing their effect on perception. Ultimately, the emergence of newly informative behavioral paradigms along with novel conceptual frameworks will likely be some of the most exciting outcomes to come from the application of precision optogenetics and synthetic perception.