Online Publication Bioinformatics Research Article http://werner.yellowcouch.org/Papers/maldiart/

Artefacts in the Mass Spectra Output from MALDI-TOF and MALDI-TOF/TOF Machines

Werner Van Belle1* - werner@yellowcouch.org, werner.van.belle@gmail.com
Olav Mjaavatten2

1- Bioinformatics Group; Norut IT; Research Park; 9294 Tromsø; Norway
2- Proteomic Unit (Probe); University of Bergen; Bergen; Norway
* Corresponding author

Abstract: MALDI-TOF mass spectrometry is a well-known and widely used technique to fingerprint and sequence proteins. A careful investigation of the mass spectra output by several machines (left unnamed here) shows a number of artefacts produced by the machines themselves. Because these artefacts complicate several downstream procedures, we present preliminary techniques we developed to remove most of them.

Keywords: matrix assisted laser desorption ionisation, MALDI, time of flight, TOF, artefacts, noise
Reference: Werner Van Belle, Olav Mjaavatten; Artefacts in the Mass Spectra Output from MALDI-TOF and MALDI-TOF/TOF Machines; Proceeding of the VIIth International Symposium of the Protein Society section proteomics, interactomics and protein networks; April 2005
Files: Maldi2004.pdf, Maldi2004.ps.gz

Introduction

In MALDI-TOF (matrix-assisted laser desorption/ionisation, time of flight) a sample is mixed with a matrix. When this mixture dries it forms crystals. When such a crystallised mixture is targeted with a high-energy laser beam of the correct wavelength, the matrix absorbs the incoming energy and heats up rapidly. This rapid heating causes sublimation of the matrix and subsequent expansion of the molecules co-crystallised within it. The ions are then accelerated by a strong electrical field and thus separated based on their $\frac{m}{z}$ ratio. The ions are either detected at the end of the flight tube, or reflected first and then detected. This (optional) reflection phase increases the accuracy of the technique substantially.
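As a reminder of why flight time separates ions by $\frac{m}{z}$, the idealised textbook relation for a linear flight tube can be written out as follows. The symbols $U$ (accelerating voltage), $L$ (drift length), $e$ (elementary charge), $v$ (ion velocity) and $t$ (flight time) do not appear in the original text and are introduced here purely for illustration; the relation ignores the initial velocity spread of the ions and the reflector.

$$ z e U = \tfrac{1}{2} m v^{2} \quad\Rightarrow\quad t = \frac{L}{v} = L \sqrt{\frac{m}{2 z e U}} \quad\Rightarrow\quad \frac{m}{z} = \frac{2 e U}{L^{2}}\, t^{2} $$

Under these assumptions $\frac{m}{z}$ grows with the square of the flight time, so a detector that samples at a fixed time interval produces samples that are not equally spaced along the $\frac{m}{z}$ axis.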

In a typical proteomics setup a mass spectrum is recorded, the peaks are selected and then used to fingerprint proteins. Some machines offer an advanced lift system, which makes it possible to measure the masses of the (poly)peptides within a larger fragment of a specific weight. This makes sequencing of proteins possible.

Artefacts

We performed a number of measurements on different mass spectrometers. Surprisingly, the output from these machines contains a number of artefacts. The same artefacts were present in spectra from machines at other sites, such as the Flemish Biotechnology Centre, and in spectra freely published online.

We believe that these artefacts complicate a number of possible uses of those machines:

  1. Some of the artefacts might actually shift peaks slightly backward or forward along the $\frac{m}{z}$ axis.
  2. The artefacts make it difficult to select smaller peaks automatically. For example, when the noise level is close to the peak level, a human expert can still select these peaks, but a computer cannot do so as long as the data is this noisy. This might be important for fingerprinting multi-protein complexes.
  3. Some of the artefacts have signal levels that might even exceed the actual signal.
Below we present the artefacts we found. The investigation of the spectral output of the machine is based on a sliding-window Fourier transform (SFFT). When a data series is converted to the frequency domain we can see which frequency is present at which time and at what strength. For example, the right side of figure [*] has two axes. The X-axis is the $\frac{m}{z}$ axis. The Y-axis is the frequency axis, with high frequencies at the top and low frequencies at the bottom. Every $(x,y)$ position in this diagram has a color. White means that frequency $y$ is not present at time $x$. Yellow means that some signal is present, and red to dark red indicates a very strong presence of the given frequency. Typically, a particle hitting a detector gives rise to a vertical line in the frequency diagram.
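The kind of diagram described above can be sketched in a few lines of Python with NumPy. This is only a minimal illustration, not the code used for the figures: the function name `sliding_fft_db`, the Hann taper, the hop size and the synthetic trace are all assumptions made for the example. A steady tone ends up in one frequency bin of every window (a horizontal line), whereas a single sharp impulse spreads over all bins of the few windows that contain it (a vertical line).

```python
import numpy as np

def sliding_fft_db(signal, window=2048, hop=512):
    """Return a (frames, bins) array of magnitudes in dB, 0 dB = strongest bin."""
    signal = np.asarray(signal, dtype=float)
    taper = np.hanning(window)                      # assumed taper, reduces leakage
    starts = range(0, len(signal) - window + 1, hop)
    frames = np.array([signal[s:s + window] * taper for s in starts])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    magnitude /= magnitude.max()                    # normalise to the global maximum
    return 20.0 * np.log10(magnitude + 1e-12)       # small offset avoids log(0)

# Synthetic stand-in for a spectrum: a steady tone plus one sharp "peak".
n = 16384
samples = np.arange(n)
tone_bin = 200                                      # tone frequency in cycles per window
trace = 0.2 * np.sin(2 * np.pi * tone_bin * samples / 2048)
trace[8000] += 50.0                                 # impulse standing in for an ion hit

spec = sliding_fft_db(trace)
print(spec.shape)                                   # (number of windows, frequency bins)
```

Plotting `spec.T` as an image, with the window index on the X-axis, gives the layout described in the text.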

Tones in Reflection Mode

Image GlobalNoiseSignalFixed

Figure: Artefacts in a typical mass spectrum acquired in reflection mode.

The first experiment concerns the typical fingerprinting of a protein. In this experiment the reflection mode was turned on. The mass spectrum output consists of 158548 samples between 100.003 and 4019.170 Da. The window size of the SFFT is 2048 samples, which forms a good compromise between frequency accuracy and position accuracy. In all the figures we present, both the m/z axis and the energy axis have been normalized. The frequency analysis has also been normalized and is shown in dB.

This experiment (figure [*]) clearly shows:

  1. 3 static tones superimposed over the signal (these are the three horizontal lines),
  2. 3 linear upward-sweeping tones (the three slightly upward-slanted lines), and
  3. a burst of noise shortly after the deflection mode of the machine.
The tones are very likely not created by ions hitting the detector, because this would mean that the ions are released at a steady frequency, independent of their size. Since the laser desorption results in a sublimation burst, such a steady periodic phenomenon is highly unexpected. The noise burst after the deflection phase, on the other hand, is what we would expect. Nevertheless, it still makes finding peaks more difficult.
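To make the distinction between a static tone and an upward-sweeping tone concrete, the following sketch tracks the strongest frequency bin per window for two synthetic signals. Everything here is hypothetical: the helper names `dominant_bins` and `track_slope`, the tone frequencies and the window parameters are chosen for illustration and are not taken from the measurements. A static tone yields a flat track, while a linear sweep yields a track with a positive slope.

```python
import numpy as np

def dominant_bins(signal, window=2048, hop=1024):
    """Strongest frequency bin in each Hann-tapered window."""
    taper = np.hanning(window)
    starts = range(0, len(signal) - window + 1, hop)
    return np.array([
        np.argmax(np.abs(np.fft.rfft(signal[s:s + window] * taper)))
        for s in starts
    ])

def track_slope(signal):
    """Least-squares slope of the dominant bin, in bins per window step."""
    bins = dominant_bins(signal)
    return np.polyfit(np.arange(len(bins)), bins, 1)[0]

n = 32768
t = np.arange(n, dtype=float)

# Static tone: constant 0.05 cycles per sample.
static_tone = np.sin(2 * np.pi * 0.05 * t)

# Linear sweep from 0.10 to 0.20 cycles per sample; the phase is the
# integral of the instantaneous frequency.
f0, f1 = 0.10, 0.20
sweep = np.sin(2 * np.pi * (f0 * t + 0.5 * (f1 - f0) * t**2 / n))

static_slope = track_slope(static_tone)
sweep_slope = track_slope(sweep)
print(static_slope, sweep_slope)
```

A slope near zero corresponds to a horizontal line in the frequency diagram, and a clearly positive slope to a slanted one.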

Decaying tones in Lift

Image LiftNoise

Figure: Noise of the Lift Spectrum

In a second experiment we measured the lift of a peak using a MALDI-TOF/TOF machine. The mixture contained a protein fragment which was to be sequenced. The output from the machine ranges from 20.067 to 1264.626, in 67873 samples. Again, the m/z, energy and frequency content are all normalized. The frequency analysis (figure [*]) shows:

  1. 2 or 3 static tones at a low frequency
  2. 2 or 3 decaying tones which start at a high frequency and decay exponentially.
These artefacts are clearly different from the ones previously encountered. Also, in this experiment the noise level is quite high relative to the signal, so much so that an expert is needed to select the correct peaks for further analysis.

Linear Mode


Image Noise10

Image Noise1

Figure: Noise in linear mode. Top: after 1 shot. Bottom: after 10 shots.

Image Noise1000

Image Noise100
Figure: Noise in linear mode after 100 (top) and 1000 (bottom) shots.

In a third experiment we measured the pure noise output of a MALDI machine in linear mode. The output shown in figures [*] and [*] covers 110296 samples between 40 kDa and 80 kDa. During the experiment the laser was switched off, so we measure only the noise generated by the machine. The artefacts we observed here were even more interesting than the previous ones:

  1. White noise, as could have been expected. Please note that this is with the laser turned off, so it does not say anything about the signal/noise ratio.
  2. A probabilistic distribution of pulses. To clarify: the upper part of figure [*] is the noise of 1 shot, and the lower part of the same figure is the noise after 10 shots. The upper part of figure [*] is the noise after 100 shots and its lower part is the noise after 1000 shots. A single shot gives rise to a strong pulse at a certain position. However, as the other measurements show, the location of this pulse depends on the actual shot. So, depending on the number of shots, we get different noise fingerprints.
Clearly this probabilistic pulse train forms a big problem because it is highly dependent on the number of shots performed. As such it can a) easily be misinterpreted as a valid peak if few shots are performed, or b) overrun the actual measurement when too many shots are performed.
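The dependence on the number of shots can be illustrated with a toy simulation. The model below is an assumption made for this sketch and is not derived from the measurements: every shot contributes unit-variance white noise plus one pulse of fixed height at a uniformly random position, and shots are simply summed. The function name `accumulate_shots` and all parameter values are hypothetical.

```python
import numpy as np

def accumulate_shots(n_shots, n_samples=4096, pulse_height=20.0, seed=0):
    """Sum n_shots noise traces, each carrying one pulse at a random position."""
    rng = np.random.default_rng(seed)
    total = np.zeros(n_samples)
    hit = np.zeros(n_samples, dtype=bool)            # samples that received a pulse
    for _ in range(n_shots):
        total += rng.normal(0.0, 1.0, n_samples)    # white noise of this shot
        pos = rng.integers(n_samples)                # pulse position differs per shot
        total[pos] += pulse_height
        hit[pos] = True
    return total, hit

for shots in (1, 10, 100, 1000):
    trace, hit = accumulate_shots(shots)
    # Columns: shots, samples touched by a pulse, tallest sample relative to
    # the accumulated noise level (which grows like sqrt(shots)).
    print(shots, int(hit.sum()), round(float(trace.max() / np.sqrt(shots)), 1))
```

In this toy model a single shot leaves one isolated pulse that looks much like a genuine peak, while a large number of shots scatters pulses over a sizeable fraction of the trace, so the noise fingerprint changes with the shot count.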

Removing Artefacts

To investigate the feasibility of extracting more data from the spectra, we created a number of denoising and enhancing techniques, which we briefly present below.

Baseline Extraction

Image BaselineRemovedResultat
Figure: Baseline removal

The first step is to remove the energy overhead in the measurements. This is done by removing the baseline of the spectrum using a specific filter technique. The result is shown in figure [*].
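The filter used for the baseline is not specified here, so the following is only a stand-in: a rolling minimum followed by a moving average, a common simple way to estimate a slowly varying baseline. The function name `estimate_baseline`, the window width and the synthetic spectrum are assumptions for the sketch, and the window must be wider than the widest peak for it to work.

```python
import numpy as np

def estimate_baseline(spectrum, width=201):
    """Rolling minimum followed by a moving average of the same (odd) width."""
    spectrum = np.asarray(spectrum, dtype=float)
    half = width // 2
    padded = np.pad(spectrum, half, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, width)
    rolling_min = windows.min(axis=1)                # follows the floor under the peaks
    kernel = np.ones(width) / width
    # Smooth the staircase left by the rolling minimum.
    return np.convolve(np.pad(rolling_min, half, mode="edge"), kernel, mode="valid")

# Synthetic spectrum: a decaying baseline with one narrow peak on top.
x = np.arange(5000, dtype=float)
baseline = 5.0 * np.exp(-x / 1500.0)
peak = 10.0 * np.exp(-0.5 * ((x - 2500.0) / 5.0) ** 2)
spectrum = baseline + peak

corrected = spectrum - estimate_baseline(spectrum)
print(int(np.argmax(corrected)), float(corrected[1000]))
```

Because the estimate is subtracted sample by sample, the peak stays at the same position; only the offset underneath it is removed.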

Mass Spectrum Denoising

Image 90PercentDenoisedGlobal
Figure: Denoised sample

In order to denoise the data we first tried a number of digital notch filters. Because we do not want to shift the peaks back or forth in time, such a filter was required to have a zero-phase response over its entire spectrum. The impulse response of the filter also needed to be as short as possible, because we did not want to broaden the peaks or introduce unwelcome echoes. A number of small experiments indicated that the results of such a filter would not be very good. It also became clear that the chirp could not easily be removed by such a time-independent filter. Therefore we created another technique, of which the result is shown in figure [*]. A local close-up of the denoised data (figure [*]) shows that the peaks remain at the same places but now allow for fully automatic detection (certainly when looking at the SFFT of the data), which makes the technique very attractive for high-throughput proteomics.
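The zero-phase requirement on the notch filter can be illustrated as follows. This is not the filter that was tried, nor the technique finally used (neither is described in detail); it is a minimal sketch in which the spectrum is multiplied by a real-valued mask, which by construction introduces no phase shift. The function name `zero_phase_notch`, the Gaussian-shaped dip, the assumption that the tone frequency is known, and the synthetic data are all hypothetical.

```python
import numpy as np

def zero_phase_notch(signal, freq, half_width=0.002):
    """Suppress a band around `freq` (cycles/sample) with a real-valued FFT mask."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal))
    # A smooth dip instead of a brick wall keeps the impulse response short.
    mask = 1.0 - np.exp(-0.5 * ((freqs - freq) / half_width) ** 2)
    return np.fft.irfft(spectrum * mask, n=len(signal))

n = 8192
t = np.arange(n)
tone_freq = 800 / n                                   # assumed known tone frequency
peak = np.exp(-0.5 * ((t - 3000) / 4.0) ** 2)         # narrow peak at sample 3000
noisy = peak + 0.5 * np.sin(2 * np.pi * tone_freq * t)

filtered = zero_phase_notch(noisy, tone_freq)
print(int(np.argmax(filtered)))                       # position of the peak after filtering
```

The trade-off mentioned in the text shows up in `half_width`: a narrower notch removes less of the peak's own spectrum but has a longer impulse response, and a fixed `freq` cannot follow a tone whose frequency changes along the spectrum.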

Image 90PercentDenoisedLocalView

Figure: Top: zoomed-in sample output from the machine. Bottom: the same data denoised.

The accuracy of the algorithm we created is extremely high. It retains position information exactly. However, the resolution of lower peaks will be slightly less than that of the higher peaks. This should not be a problem, because these peaks are still well differentiated. As can be seen in the previous pictures, accuracies far below 0.1 dalton can be achieved for smaller peaks.

Data Enhancing in High Noise Linear Mode

Another experiment we performed was data enhancement of a linear mode mass spectrum. The mass spectrum we present is the output from a sample containing the cell lysate of HeLa cells. Clearly it is a relatively bad sample to put into MALDI heavy mass linear mode: not only are these heavy masses difficult to get suspended, but the noise level might also suffocate what we actually want to measure. Figure [*] shows how data enhancing helps in filtering out the noise.

The result of the algorithm on a standard protein mixture is shown in figure [*]. Importantly, certain peaks now show up that would normally not be selected if we simply looked at the highest values. Whether some of these new peaks are important might be interesting to investigate.

Image M17T3-B1-NormalizedSfft

Image M17T3-B1-EnhancedSfft

Figure: Top: the SFFT of a measurement of a HeLa cell substrate. Bottom: the SFFT of the same substrate after data enhancing (but without removal of the pulse train).

Image EnhancedPepmix

Figure: Linear mode data enhancing of the output of a ProtMix II. Bottom is the actual output. Top is the enhanced output.

Automatic Detection of 'important' peaks

Image MassSpectrumWhatever-Autocor
Figure: Correlation Measure to further select peaks

Image M17T3-B1-EnhancedAutocor
Figure: Technique to detect important peaks in enhanced linear mode signal

A phenomenon often used to detect important peaks is the fact that isotopes differ in weight. For every ionised fragment of the same kind we will sometimes measure x dalton and sometimes x+1 dalton (if there is one neutron more), and so on. This knowledge can be used to automatically detect important peaks, as shown in figure [*]. The visualised graph is the autocorrelation graph, which mainly measures whether a peak has 'echoes'. If it has echoes, then it probably is a series of peaks of the same fragment.
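The 'echo' idea can be sketched on synthetic data. The example assumes, for simplicity, a spectrum resampled onto a uniform m/z grid with a spacing of 0.01 Da and Gaussian peak shapes; the helper names `autocorrelation` and `peak`, the masses and the peak heights are all invented for illustration. An isotope cluster with peaks 1 Da apart produces a clear autocorrelation maximum at a lag of 1 Da, whereas a lone peak does not.

```python
import numpy as np

def autocorrelation(x):
    """Normalised autocorrelation for non-negative lags, computed via the FFT."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    spectrum = np.fft.rfft(x, 2 * n)                 # zero-pad to avoid wrap-around
    acf = np.fft.irfft(np.abs(spectrum) ** 2)[:n]
    return acf / acf[0]

step = 0.01                                          # assumed grid spacing in Da
mz = np.arange(990.0, 1010.0, step)

def peak(centre, height, width=0.05):
    """Gaussian peak on the m/z grid (assumed peak shape)."""
    return height * np.exp(-0.5 * ((mz - centre) / width) ** 2)

cluster = peak(1000.0, 1.0) + peak(1001.0, 0.6) + peak(1002.0, 0.25)
lone = peak(1000.0, 1.0)

acf = autocorrelation(cluster)
lag_1da = int(round(1.0 / step))                     # number of samples in 1 Da
# Look for the strongest echo between 0.5 Da and 1.5 Da.
search = acf[lag_1da // 2: 3 * lag_1da // 2]
best_lag = lag_1da // 2 + int(np.argmax(search))
print(best_lag * step, float(acf[lag_1da]), float(autocorrelation(lone)[lag_1da]))
```

Computing this measure in a sliding window along the spectrum would give one autocorrelation curve per position, which is one way to arrive at a two-dimensional picture like the one described.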

In a similar way, if we measure the autocorrelation of the enhanced linear mode experiment, we clearly see vertical bands. Very likely the content of every band will allow us to detect which bands are important. However, this is merely an educated guess.

Summary

We have presented a number of artefacts we have encountered in MALDI-TOF and MALDI-TOF/TOF machines. These are:

  1. Static tones
  2. Upsweeping tones
  3. Decaying tones
  4. Probabilistic pulse trains
We also presented the output of some preliminary techniques we developed to show the feasibility of data denoising:

  1. Denoising of the static tones and upsweeping tones, without shifting the peaks back or forth.
  2. Enhancing of heavy mass linear mode spectra.
  3. Automatic importance assessment when looking at multiple peaks.