We set out to create a system that improves confocal microscopy data visualization and colocalization analysis. To achieve this goal we worked closely with domain experts to identify limitations in their current analysis capabilities and to verify that our prototypes were indeed functional. A main objective was to make the colocalization analysis process more intuitive. In this respect, virtual reality was a very attractive option, since it offers immersion into a three-dimensional environment that is already familiar to any user. This is an important advantage over currently prevalent methods, in which the sample visualization is displayed on a two-dimensional screen and relative spatial positions are difficult to judge due to the lack of depth.
The user can choose to visualize the microscopy sample using either the texture-based or the volume ray casting rendering method, depending on what they want to accomplish. To allow colocalization to be calculated interactively, a graphical user interface (GUI) was implemented that allows the user to manipulate the rendering parameters of the sample. These parameters include the noise filtering threshold, the global opacity and the opacity of the individual color channels. Further GUI panels allow more detailed colocalization analysis as well as region of interest selection. Examples of these GUI panels are shown in Figs. 1 and 2.
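As a minimal sketch of how such slider-driven parameters might be forwarded to the volume renderer in Unity, consider the following; the class name, slider fields and shader property names (e.g. "_NoiseThreshold") are illustrative assumptions, not the actual implementation:

```csharp
using UnityEngine;
using UnityEngine.UI;

// Illustrative sketch: forwards GUI slider values to a volume-rendering material.
// The shader property names and fields below are hypothetical.
public class RenderSettingsPanel : MonoBehaviour
{
    public Material volumeMaterial;                        // material used by the volume renderer
    public Slider noiseThreshold;                          // noise filtering threshold
    public Slider globalOpacity;                           // overall opacity
    public Slider redOpacity, greenOpacity, blueOpacity;   // per-channel opacity

    // Hooked to the sliders' OnValueChanged events so that the rendering
    // updates interactively while the user drags a slider.
    public void ApplySettings()
    {
        volumeMaterial.SetFloat("_NoiseThreshold", noiseThreshold.value);
        volumeMaterial.SetFloat("_GlobalOpacity", globalOpacity.value);
        volumeMaterial.SetFloat("_OpacityR", redOpacity.value);
        volumeMaterial.SetFloat("_OpacityG", greenOpacity.value);
        volumeMaterial.SetFloat("_OpacityB", blueOpacity.value);
    }
}
```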
Our system was implemented using the Unity engine [6]. Unity uses a left-handed coordinate system, with the x-axis pointing towards the right of the view, the y-axis pointing up and the z-axis pointing away from the user. This is the coordinate system used for scaling the ROI.
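Scaling the ROI along a single axis in this coordinate system amounts to modifying one component of its local scale. The sketch below is a simplified illustration under that assumption; the helper name and minimum-scale clamp are our own, not taken from the implementation:

```csharp
using UnityEngine;

// Simplified illustration of axis-aligned ROI scaling in Unity's left-handed
// coordinate system (x right, y up, z away from the user).
public static class RoiScaling
{
    // axis: 0 = x, 1 = y, 2 = z; amount > 0 enlarges, amount < 0 shrinks.
    public static void ScaleAlongAxis(Transform roi, int axis, float amount)
    {
        Vector3 s = roi.localScale;
        s[axis] = Mathf.Max(0.01f, s[axis] + amount);  // keep the scale positive
        roi.localScale = s;
    }
}
```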
Sample preparation
In order to test our visualization system, mouse embryonic fibroblast (MEF) cells were stained with anti-DNA, anti-tubulin and anti-actin-Alexa-633 antibodies, followed by incubation with conjugated Alexa-488 donkey-anti-rabbit and Alexa-568 donkey-anti-mouse secondary antibodies (Life Technologies). Nuclei were counterstained using the DNA-intercalating fluorochrome Hoechst 33342. Image acquisition was performed using a Zeiss LSM 780 confocal microscope equipped with a GaAsP detector, and images were acquired through z-stack acquisition with an increment of ±0.4 μm between image frames. An AxioCam MRm camera was used to capture images.
The sample can be seen in Fig. 3, which shows a mammalian cell with two nuclei (blue) surrounded by small mitochondrial DNA fragments (red), as well as a thin microtubule network (green) and a thick structural actin network (magenta). Note that, because the blue and red channels overlap, the middle of the cell also appears magenta. The tubulin and actin networks facilitate transport as well as cell stability, creating a cellular skeleton.
Input interface
When a user wears a VR headset they are visually cut off from the world. This makes traditional input such as a keyboard, mouse, pen or touch interface impractical. Two alternative methods for interacting with the system were therefore implemented. The first uses the Leap Motion hand tracking system [15], which allows the user’s hands to be visible inside the VR environment. The user can therefore use their hands to interact with the GUI and the sample itself in an already familiar and intuitive way. While this input method has the important advantage of being highly intuitive, users occasionally struggled with the interface due to inaccuracies in the hand tracking process. This was compounded by the lack of the haptic feedback that would be present when touching real objects.
As a second input method we combined the VR headset’s built-in head tracking with a traditional gamepad. Since a gamepad is a physical device that the user can hold, it allows for finer control over the visualization. Button presses are combined with the direction of the user’s gaze, as determined from the HMD’s head tracking system. Most gamepads also provide force feedback, which gives the user a greater sense of physical interaction. However, a gamepad is less intuitive and requires the user to learn which input interactions are mapped to which functions. Also, since it is not part of the rendered scene, it is less immersive.
Hand tracking
Hand tracking was achieved using the Leap Motion [15], which uses a stereoscopic infrared camera, along with a software API, to interpret the camera data. When attached to the front of the VR headset, the user’s hands can be tracked and rendered inside the virtual environment. Our system allows the user to translate, scale and rotate the sample using intuitive hand gestures, such as pulling and turning.
Since hand tracking is sometimes inaccurate, these gestures are designed to be simple and unambiguous. In designing suitable gestures, we were guided by other standard gesture-based VR applications designed around the Leap Motion, as well as by Leap Motion’s “VR Best Practices Guidelines” [20–22]. For example, we do not rely on small finger movements.
Since there is no physical feedback from the virtual content, visual feedback is used to inform the user when gestures or interactions with virtual elements are detected. We have accomplished this by rendering a bounding box around the volume sample and changing its color, based on the currently detected interaction. This was paired with optional text hints that were momentarily displayed over the sample.
In order to translate, scale or rotate the sample, the user must move both hands to be partially or totally inside the bounding box. Both hands must then perform the pinch gesture (index finger touches thumb). When this double pinch is detected, translation can be performed by moving both hands in the same direction simultaneously, while scaling can be performed by pulling the hands apart or pushing them closer together. The sample is rotated when the hands rotate relative to each other. Both rotation and translation can also be performed by pinching with only one hand, and moving the hand around for translation and turning the hand for rotation. We found that this allows for more precise rotations.
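The core of this two-handed manipulation can be expressed with a few lines of transform arithmetic. The following sketch assumes that pinch detection and world-space palm positions are already supplied by the hand tracking API; the class and method names are illustrative, not the actual implementation:

```csharp
using UnityEngine;

// Illustrative two-handed manipulation logic. Palm positions (world space) and
// the combined pinch state are assumed to come from the hand-tracking API.
public class TwoHandManipulator : MonoBehaviour
{
    public Transform sample;              // the volume sample being manipulated
    private Vector3 prevLeft, prevRight;  // palm positions from the previous frame
    private bool wasPinching;             // true if both hands pinched last frame

    public void UpdateManipulation(Vector3 left, Vector3 right, bool bothPinching)
    {
        if (bothPinching && wasPinching)
        {
            // Translation: move the sample by the average motion of both hands.
            Vector3 delta = ((left - prevLeft) + (right - prevRight)) * 0.5f;
            sample.position += delta;

            // Scaling: grow or shrink by the ratio of hand separations.
            float prevDist = Vector3.Distance(prevLeft, prevRight);
            float currDist = Vector3.Distance(left, right);
            if (prevDist > 1e-4f)
                sample.localScale *= currDist / prevDist;

            // Rotation: rotate by the change in the hand-to-hand direction.
            Quaternion twist = Quaternion.FromToRotation(prevRight - prevLeft,
                                                         right - left);
            sample.rotation = twist * sample.rotation;
        }
        prevLeft = left;
        prevRight = right;
        wasPinching = bothPinching;
    }
}
```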
The user can interact with the GUI by touching the GUI elements in 3D space, much as one would use a touch screen.
Inaccuracies of the hand tracking system
The hand tracking system provided by the Leap Motion and the Orion SDK is currently the best supported consumer tracking system that also integrates easily with a virtual reality headset. It also provides the most stable hand and finger tracking that we are aware of. There are, however, several technical limitations to the system that can cause some difficulty for novice users.
Firstly, due to the use of pinch gestures for interacting with the sample, precise finger tracking is necessary. However, when the user’s hands move too far away from the tracking device, finger tracking sometimes becomes unreliable. This also makes interaction with GUI elements using a single finger difficult. These problems can, however, be mitigated by requiring the user to move physically closer to the element in question.
Secondly, because the hand tracking system needs line of sight to accurately track each finger, it becomes inaccurate when the fingers are obstructed by, for example, the other hand [22].
Unfortunately, neither of these problem scenarios is instinctively clear to novice users. Even when they are cautioned not to perform such problematic hand movements, the immersive nature of virtual reality makes adherence to this advice difficult. The frequency with which these difficulties were experienced varied greatly from user to user, and we expect that they would be largely overcome as users become experienced with the interface.
Gamepad
We made use of a traditional gamepad with two analog sticks, four front-facing buttons, a directional pad and four trigger buttons. Translation, scaling and rotation were performed using the analog sticks in conjunction with button presses. We combined the gamepad input with the head tracking provided by the VR headset. Using the head tracking, a 3D cursor is rendered in the center of the display. When the 3D cursor hovers over a GUI element, that element is highlighted and raised slightly to indicate which element the user will interact with when a button is pressed on the gamepad.
Interaction with the GUI elements is accomplished by moving the 3D cursor over the element in question and pressing a button on the gamepad. When a slider is selected in the GUI, its value can be changed using the directional pad. This offers an advantage over hand tracking, since the user can change the rendering parameters without having to keep looking at the GUI.
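A gaze cursor of this kind can be realised with a simple forward ray cast from the headset camera. The sketch below illustrates the idea under the assumption that the 3D GUI elements carry colliders; the class name and fields are our own, not the actual implementation:

```csharp
using UnityEngine;

// Illustrative gaze-cursor logic: casts a ray from the centre of the HMD view
// and reports which collider-bearing GUI element (if any) is being looked at.
public class GazeCursor : MonoBehaviour
{
    public Camera hmdCamera;        // camera driven by the headset's head tracking
    public float maxDistance = 10f; // how far the gaze ray reaches

    public GameObject CurrentTarget()
    {
        Ray gaze = new Ray(hmdCamera.transform.position, hmdCamera.transform.forward);
        if (Physics.Raycast(gaze, out RaycastHit hit, maxDistance))
            return hit.collider.gameObject;   // element under the cursor, to be highlighted
        return null;                          // nothing in the user's line of sight
    }
}
```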
Colocalization visualization and analysis
The GUI layout for colocalization analysis, which shows several colocalization metrics as well as scatter plots, can be seen in Figs. 1 and 2. Figure 2 shows that the GUI panels are angled to face the user when looking in that direction, with the volume sample in the center. This setup increases the immersion in the virtual environment. Colocalization is visualized in real-time as the settings are changed.
Figure 1 shows how the GUI allows the user to select the two channels that should be considered for colocalization analysis. To assist with the analysis, the user can choose to overlay the colocalized voxels on the volume sample as white voxels, or to render only the colocalized voxels, either in their original colors or in white. These rendering options are illustrated in Fig. 3. Furthermore, the thresholds used in the colocalization metric calculations, as well as the rendering opacity of the colocalized voxels, can be adjusted. Once these parameters have been optimized interactively, the colocalization metrics and scatter plots can be calculated. All of these are calculated only within a pre-selected region of interest (ROI), which is discussed in the next section.
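Conceptually, a voxel is flagged as colocalized when both selected channels exceed their respective thresholds. A minimal sketch of this test is shown below; the flat array layout and names are illustrative assumptions:

```csharp
// Illustrative colocalization mask: a voxel is colocalized when both selected
// channel intensities exceed their thresholds. Both arrays are assumed to hold
// the same number of voxels in the same order.
public static class Colocalization
{
    public static bool[] ComputeMask(float[] channelA, float[] channelB,
                                     float thresholdA, float thresholdB)
    {
        bool[] mask = new bool[channelA.Length];
        for (int i = 0; i < channelA.Length; i++)
            mask[i] = channelA[i] > thresholdA && channelB[i] > thresholdB;
        return mask;
    }
}
```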
Selection of the region of interest (ROI)
In order to effectively investigate the colocalization between two color channels, a good region of interest (ROI) selection tool is required. Three different ROI selection tools were implemented, namely the box, cylinder and freehand tools. Example selections with these tools can be seen in Fig. 4. When the user is in the ROI selection mode, the same interactions used to manipulate the sample are used to manipulate the ROI. The user additionally has the ability to scale the selection along a specific axis.
When the box and cylinder tools are selected, they initially include the entire volume sample. They must subsequently be transformed to include only the section desired. When the freehand tool is selected, the user traces out the ROI using either head movement (with gamepad) or by pointing the index finger (with hand tracking). Since inaccurate freehand selections may result from the parallax effect, the volume sample is flattened and rotated to face the user before selection. Once the initial rough ROI has been selected, the user has the option to scale the ROI selection independently along any of the three axes. In the case of the freehand tool, the volume is first unflattened. By scaling along the z-axis, the user can accurately position the ROI within the volume.
After the region of interest has been selected, a ROI mask is generated in the form of a two-dimensional boolean array, together with front and back z-positions that indicate at which depths the ROI selection starts and ends.
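With this representation, testing whether a voxel belongs to the ROI reduces to one 2D mask lookup and a depth-range check, as the following sketch illustrates (the helper name and index conventions are assumptions):

```csharp
// Illustrative ROI membership test using a 2D boolean mask plus a z range.
public static class RoiMask
{
    public static bool Contains(bool[,] mask, int frontZ, int backZ,
                                int x, int y, int z)
    {
        // Inside the ROI only if the depth lies between the front and back
        // z-positions and the (x, y) position is selected in the 2D mask.
        return z >= frontZ && z <= backZ && mask[x, y];
    }
}
```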
User trials
Since the average user is generally unfamiliar with VR, we wanted to establish the degree of ease with which users could use our interface. More particularly, we wanted to establish how difficult it is to perform certain defined manipulations using the different VR interfaces. Accordingly, user trials were carried out with a diverse group of 29 computer users. Of these, 5 were biologists who regularly work with biological visualization tools, 15 were engineers (mostly electrical and electronic engineers), another 6 had tertiary education in other fields and 3 had no tertiary education. Subjects were between the ages of 21 and 60. Of the participants, 20 were male and 9 female. Many statistical tests are based on the assumption that the data is normally distributed. For this reason we computed the statistical power of the task results using Matlab, to determine whether the rejection of H₀ is valid. Furthermore, the power analysis showed that our sample size was adequate to ensure a statistical power of greater than 0.8 for all the tasks, except for the task to transform the sample (which was almost identical for the two interfaces) and the task to scale the ROI along the z-axis (which requires a sample size of 37).
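For reference, a standard normal-approximation (given here as a general illustration, not necessarily the exact calculation performed in Matlab) relates the required sample size n to the mean paired difference δ between interfaces, the standard deviation σ of those differences, the significance level α and the target power 1−β:

\[ n \approx \left( \frac{(z_{1-\alpha/2} + z_{1-\beta})\,\sigma}{\delta} \right)^{2} \]

With α = 0.05 and a target power of 1−β = 0.8, tasks with a smaller standardized difference δ/σ between the interfaces require a larger n, which is why the two tasks mentioned above do not reach the 0.8 power threshold at n = 29.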
Each participant in the study was asked to perform the same tasks, in the same order, using both the gamepad and the hand tracking interfaces. The time taken to perform each task was measured to allow a subsequent objective comparison of productivity. Lastly, the participants were asked to perform similar interactions with a traditional keyboard and mouse, without being timed, in order to gain familiarity with the current standard interfaces. After using each interface, the participants were asked to complete a subjective questionnaire describing their experience when performing the different tasks.
In order to ensure that the results from different participants were comparable, all the tasks were supervised by the same researcher and all participants followed the same procedure for corresponding samples. The order in which the interfaces were tested was the same for all participants: the gamepad first, then the hand tracking and finally the traditional input. The supervising researcher ensured that all sample transformations, GUI interactions and ROI selections were completed with the same accuracy. The selection accuracy required for acceptance is illustrated in Fig. 4. Furthermore, the users were required to make the selections with the same accuracy across interfaces.
Pre-test preparation
Tests were carried out in a quiet studio environment with only the participant and the researcher present. Each user was informed about the purpose of the test and what they would see and experience. Each participant was then asked to give a subjective rating between 1 and 5 indicating their general computer proficiency, and their experience with biological visualization and colocalization analysis. Each participant also provided self-assessments of their experience in using a gamepad, a hand tracking device and a VR headset.
Overall the participants indicated that they had medium to high computer proficiency (Fig. 5a). The gamepad experience among the participants was diverse (Fig. 5b) and was a factor that we considered in our later analysis. Most of the participants had very little or no prior exposure to VR or hand tracking.
Because most participants had no prior VR or hand tracking interface experience, we used two demonstration programs to familiarize them with movement and hand interaction in a VR environment. The standard Oculus desk scene and the Blocks demo [23] created by Leap Motion were used for this purpose. Most users are astonished when using VR for the first time, and these introductions helped to ensure that the subjective feedback was based on the effectiveness and productivity of the implementation rather than the initial enthusiasm provoked by VR.
After the participant was comfortable in the VR environment, they were given a brief demonstration of the tasks that they were expected to perform, using our software, as well as an explanation of how to use the given interface. They were then given approximately 10 minutes to familiarize themselves with the interface. Only after they felt comfortable with the interface were they asked to perform the defined tasks, which are described in the next section.
Objective evaluation
In order to ensure a fair comparison between the participants’ experience of and productivity with each VR interface, a series of tasks was devised to ensure that the different aspects of their interactions could be tested and timed. Tasks were chosen that would cover all the aspects of the system that a biological investigator would use to perform a basic sample visualization and colocalization analysis:
1. The participants were shown an image of a desired transformation on a volume sample. In order to match this image, the participant needed to perform translation, scaling and rotation of the sample.

2. The participants were asked to change several rendering parameters to prescribed values. This mostly involved changing slider values in the GUI.

3. The participants were asked to place a ROI selection box around a prominent colocalized feature in the sample. This was divided into two steps:

   (a) The ROI box needed to be scaled and translated to the correct position to surround the feature.

   (b) Subsequently the sample needed to be rotated by 90° and the box scaled along the z-axis to match the depth of the colocalized feature.

4. Finally the participants were asked to use the freehand ROI selection tool to accurately outline the colocalized feature and adjust the z-dimension of the ROI.
Each task was explained verbally to each participant immediately before it was performed. The actions of each participant were recorded using both a video camera and screen capture. This allowed the time taken for each task to be accurately and unobtrusively measured afterwards. This approach proved to be very effective in making the users feel relaxed while performing the tasks. The entire procedure took between 30 and 45 minutes for each participant. Times were measured to the nearest second.
Subjective evaluation
After completing each task, the participants gave a perceived ease-of-use rating for each interface on a 5-point scale. They were also asked to provide subjective ratings describing how often they forgot the button allocations or the required input gestures. Lastly, they were asked to rate their general sense of how productive they would be when using the interface for microscopy data visualization and colocalization analysis.
Since our system was conceptualized with VR at its core, no keyboard or mouse support was implemented. In order to gain insight into how the users perceived the VR interface when compared to conventional input methods, they were asked to perform similar interactions in traditional 3D software using a keyboard and mouse, without the VR headset. The interactions were designed to mimic those implemented in standard biological applications, such as the ZEN imaging software by ZEISS. Subsequently they were again asked to rate the ease of use.
Finally, the participants ranked the three interfaces according to their preference for performing colocalization analysis, and according to how difficult each interface was to learn to use.