After decades of research, there is still no comprehensive, validated model of program comprehension. Recently, researchers have been applying psycho-physiological measures to expand our understanding of program comprehension. In this position paper, we argue that measuring program comprehension simultaneously with functional magnetic resonance imaging (fMRI) and eye tracking is promising. However, because the two measures differ in response delay and temporal resolution, we need to develop suitable tools for analyzing them together. We describe the challenges of conjoint analysis of fMRI and eye-tracking data, and we outline possible solution strategies.
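To illustrate the mismatch in temporal resolution and response delay, the following is a minimal sketch of one possible alignment step: binning fixation time into fMRI volumes after shifting it by a fixed delay. It is not the tooling the position paper proposes. The function name, the `(onset_s, duration_s)` fixation format, the 2-second repetition time, and the 5-second delay are all assumptions chosen for illustration.

```python
from collections import defaultdict


def fixation_time_per_volume(fixations, tr_s=2.0, hrf_delay_s=5.0):
    """Sum fixation time per fMRI volume index.

    fixations:   iterable of (onset_s, duration_s) on the scanner clock
                 (hypothetical input format).
    tr_s:        assumed repetition time, i.e. seconds per fMRI volume.
    hrf_delay_s: assumed fixed delay of the hemodynamic response.
    """
    totals = defaultdict(float)
    for onset, duration in fixations:
        # Shift the fixation to where its effect is assumed to appear in fMRI.
        start = onset + hrf_delay_s
        end = start + duration
        vol = int(start // tr_s)
        # A fixation may span several volumes; split its time among them.
        while vol * tr_s < end:
            vol_start = vol * tr_s
            vol_end = vol_start + tr_s
            overlap = min(end, vol_end) - max(start, vol_start)
            if overlap > 0:
                totals[vol] += overlap
            vol += 1
    return dict(totals)


if __name__ == "__main__":
    # Two short fixations recorded at eye-tracker resolution.
    print(fixation_time_per_volume([(0.0, 0.5), (0.5, 1.0)]))
```

A fixed delay is a strong simplification. A fuller analysis would more likely model the delayed response with a hemodynamic response function than with a constant shift.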
Existing research in program comprehension has paid little attention to which programming concepts are covered by the source code used in studies. In this paper, we examine source code covering four introductory programming concepts: branching, loops and arrays, sorting, and tail recursion. The diverse types of code fragments give rise to eye movement patterns that are structured more closely along the control flow and data flow of the program. To facilitate analysis of this class of program comprehension strategies, we propose data flow-based metrics and describe how to compute them automatically. To evaluate the proposal, we conducted a pilot study with novice and intermediate programmers. Using recordings from 26 programmers, we computed basic fixation and saccade metrics along with a data flow-based metric.
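As a rough illustration of what "basic fixation and saccade metrics" can look like, here is a minimal sketch that computes the fixation count, mean fixation duration, and mean saccade amplitude from an ordered list of fixations. The `(x_px, y_px, duration_ms)` format and the function name are assumptions. The data flow-based metric is not sketched because the abstract does not define it.

```python
import math


def basic_gaze_metrics(fixations):
    """Compute simple fixation and saccade metrics.

    fixations: list of (x_px, y_px, duration_ms) tuples in temporal order
               (hypothetical input format).
    """
    if not fixations:
        return {"fixation_count": 0, "mean_fixation_ms": 0.0, "mean_saccade_px": 0.0}

    durations = [d for _, _, d in fixations]
    # Treat each jump between consecutive fixations as one saccade and use
    # the Euclidean distance in pixels as its amplitude.
    amplitudes = [
        math.dist(a[:2], b[:2]) for a, b in zip(fixations, fixations[1:])
    ]
    return {
        "fixation_count": len(fixations),
        "mean_fixation_ms": sum(durations) / len(durations),
        "mean_saccade_px": sum(amplitudes) / len(amplitudes) if amplitudes else 0.0,
    }


if __name__ == "__main__":
    print(basic_gaze_metrics([(0, 0, 100), (3, 4, 200), (3, 4, 300)]))
```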
Recently, eye-tracking analysis of cognitive load and stress during whiteboard problem solving in technical interviews has been gaining traction in the software engineering community. However, no empirical study has analyzed how much the characteristics of the interview setting affect eye-movement measurements. Without that knowledge, the results of research on stress detection from eye-movement measurements cannot be considered reliable. In this paper, we analyzed the eye movements of 11 participants in two interview settings, one on a whiteboard and the other on paper, to find out whether the characteristics of the interview setting affect the analysis of participants' stress. To this end, we applied 7 machine learning classification algorithms to three different labeling strategies of the data, in order to suggest to researchers in this domain a useful practice for checking the reliability of eye-movement measurements before reporting any results.
Researchers have been employing psycho-physiological measures to better understand program comprehension, for example simultaneous fMRI and eye tracking to validate top-down comprehension models. In this paper, we argue that there is additional value in eye-tracking data beyond eye gaze: pupil dilation and blink rates may offer insights into programmers' cognitive load. However, the fMRI environment may influence pupil dilation and blink rates, which would diminish their informative value. We conducted a preliminary analysis of pupil dilation and blink rates from an fMRI experiment with 22 student participants. We conclude from this preliminary analysis that correcting for our fMRI environment is challenging but possible, such that we can use pupil dilation and blink rates to observe program comprehension more reliably.
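The abstract does not describe how the correction for the fMRI environment is performed. The sketch below shows only how the two signals might be derived from raw samples, using simple baseline subtraction as a stand-in technique. It assumes a hypothetical recording format in which pupil diameter is sampled at a fixed rate and blinks appear as missing samples (`None`); both function names are illustrative.

```python
def blink_rate_per_min(pupil, sample_rate_hz):
    """Count runs of missing samples (None) as blinks, reported per minute."""
    blinks = 0
    in_blink = False
    for value in pupil:
        if value is None:
            if not in_blink:
                blinks += 1  # a new run of missing samples starts
            in_blink = True
        else:
            in_blink = False
    minutes = len(pupil) / sample_rate_hz / 60.0
    return blinks / minutes if minutes > 0 else 0.0


def baseline_corrected(pupil, baseline_samples):
    """Subtract the mean pupil diameter of an initial baseline window.

    Baseline subtraction is an assumption here, not the correction
    used in the study.
    """
    valid = [v for v in pupil[:baseline_samples] if v is not None]
    if not valid:
        raise ValueError("baseline window contains no valid samples")
    baseline = sum(valid) / len(valid)
    return [None if v is None else v - baseline for v in pupil]


if __name__ == "__main__":
    # 60 samples at 1 Hz: a baseline period, two blinks, and a dilating pupil.
    samples = [3.0] * 10 + [None] * 2 + [3.5] * 20 + [None] + [4.0] * 27
    print(blink_rate_per_min(samples, sample_rate_hz=1.0))
    print(baseline_corrected(samples, baseline_samples=10)[-1])
```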
In order to ensure sufficient quality, software engineers conduct code reviews to read over one another's code looking for errors that should be fixed before committing to their source code repositories. Many kinds of errors are spotted, from simple spelling mistakes and syntax errors, to architectural flaws that may span several files. However, we know little about how software developers read code when looking for defects. What kinds of code trigger engineers to check more deeply into suspected defects? How long do they take to verify whether a defect is really there? We conducted a study of 35 software engineers performing 40 code reviews while capturing their gaze with an eye tracker. We classified each code defect the developers found and captured the patterns of eye gazes used to deliberate about each one. We report how long it took to confirm defect suspicions for each type of defect and the fraction of time spent skimming the code vs. carefully reading it. This work provides a starting point for automating code reviews that could help engineers spend more time focusing on the difficult task of defect confirmation rather than the tedious task of defect discovery.
Previous work investigating the eye movements of computer programmers with dyslexia suggests that the gaze behaviour expected of dyslexic readers when processing natural text does not consistently manifest when programmers with dyslexia read program code. Instead, the observed eye movements of programmers with dyslexia appear to represent a complex hybrid of gaze behaviour both typical and atypical of dyslexic readers. Building on this work, this paper explores the possible impact of code style, layout and crowding on the reading behaviour of programmers with dyslexia. Related work on the phenomenon of crowding in the dyslexia literature is used to inform a possible experimental design to explore the effect of crowding in this context.