Comparing parallelisation strategies for a bioelectrical brain imaging algorithm

Dr. Zoltán Juhász <juhasz@virt.uni-pannon.hu>
University of Pannonia

Brain EEG (electroencephalography) is a routine clinical method for recording the electrical activity of the brain. During a measurement, typically 25 electrodes (channels) are placed on the scalp, and the electrical signals detected there are measured and displayed in waveform. The main advantage of EEG over other imaging methods (CT, MRI) is its high temporal resolution (millisecond range); its disadvantages are its low spatial resolution and the fact that it measures activity on the scalp, not within the brain. In our research programme running at the University of Pannonia, we are developing new methods and algorithms that make it possible to reconstruct the original activation sources of the brain for a given signal pattern and localise them with high spatial accuracy. Such a new imaging method could open up new opportunities in diagnostics.

One of the biggest obstacles to solving this problem is the very high computational intensity of the measurement evaluation algorithms. We currently use 64 channels, soon 128. We need to solve the so-called inverse problem: finding the electrical source that produced the detected signal pattern. In this paper, we present one potential algorithm for solving this problem, which localises a single source. The number of channels, the large number of measurement samples and the many potential source locations within the brain result in 30-60 second running times for a single measurement time point. For a typical measurement lasting several seconds, the evaluation time is therefore measured in hours, which is clearly unacceptable in clinical use. The only practical remedy is parallel computing.
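
To illustrate where the computational cost comes from, consider a common single-dipole formulation (a sketch only; the exact cost function of our algorithm may differ). For one time sample, let v denote the vector of M measured channel potentials, L(r) the M-by-3 lead-field matrix of a candidate source location r, and q the dipole moment. The best-fitting source is then

    \hat{r} = \arg\min_{r} \min_{q} \| v - L(r)\,q \|_2^2, \qquad \hat{q}(r) = L(r)^{+} v,

where L(r)^{+} is the Moore-Penrose pseudoinverse. The inner minimisation has a closed form, but the outer search must visit every candidate location of a dense brain grid for every time sample, which is what makes the running time grow so quickly.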

To find the best implementation, we implemented the algorithm in several languages and parallel programming dialects (Java, C/MPI, C/OpenMP), then executed these on different parallel architectures (PC clusters, multi-core processors, the NIIF supercomputer) to understand their behaviour and to analyse running times, speedup and scalability.
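
As an illustration of the shared-memory variant, the sketch below shows how the scan over candidate source locations, the natural outermost loop of the algorithm, can be parallelised with OpenMP in C. The function dipole_fit_error and the data layout are hypothetical stand-ins, not our actual implementation.

    #include <float.h>

    /* Hypothetical per-location error: residual of fitting a single
       dipole at grid location i to the measured potentials v. */
    double dipole_fit_error(int i, const double *v, int n_chan);

    /* Scan all candidate grid locations for one time sample and
       return the index of the best-fitting source. */
    int best_source(const double *v, int n_chan, int n_loc)
    {
        double best_err = DBL_MAX;
        int best_idx = -1;

        /* Candidate locations are independent, so the scan is
           embarrassingly parallel across threads. */
        #pragma omp parallel
        {
            double my_err = DBL_MAX;
            int my_idx = -1;

            #pragma omp for nowait
            for (int i = 0; i < n_loc; i++) {
                double e = dipole_fit_error(i, v, n_chan);
                if (e < my_err) { my_err = e; my_idx = i; }
            }

            /* Combine the per-thread minima into the global minimum. */
            #pragma omp critical
            if (my_err < best_err) { best_err = my_err; best_idx = my_idx; }
        }
        return best_idx;
    }

The same outer loop also decomposes naturally for MPI, with each process scanning a slice of the grid followed by a global minimum reduction, which makes the implementations directly comparable.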

The paper presents our results and the conclusions drawn from these experiments, and finally outlines a new possibility, the use of GPUs to speed up the calculations. We hope to be able to show results obtained on our 4-teraflop NVIDIA Tesla S1070 device.