What is MassChroQ?

MassChroQ (Mass Chromatogram Quantification) is software that performs quantification on data obtained from Liquid Chromatography-Mass Spectrometry (LC-MS) techniques. In particular, it performs:

  • alignment of LC-MS/MS runs,
  • eXtracted Ion Chromatograms (XIC) extraction and various XIC filtering,
  • peak detection and peak matching,
  • quantification of several items of interest: all the identified peptides in the data, a given list of m/z values, a given list of m/z-rt values, etc.

What LC-MS/MS experiments is MassChroQ good for?

With MassChroQ you can quantify:

  • label-free as well as isotopically labeled data;
  • data from high- as well as low-resolution spectrometer systems.

Moreover, MassChroQ:

  • automatically takes into account the identified peptides/proteins obtained from any identification software, provided they are placed in spreadsheet files;
  • takes into account complex peptide/protein separations (SCX, SDS-PAGE, ...);
  • implements two different LC-MS run alignment methods: the OBI-Warp alignment method, based on MS level 1 data, and an MS level 2 based alignment method conceived in our team.

I know other software that does this, why is MassChroQ better?

On our proteomics platform, we need to analyse LC-MS data:

  • in a versatile way (we have different kinds of experimental scenarios to handle),
  • in a secure and traceable way (this is research data, so we have to be able to precisely follow, explain and reproduce every single step),
  • in a hardware/software independent way (we do not have big computing servers or unlimited money to buy Windows licences every year; we have simple personal computers, which are robust because most of the time they run under Linux),
  • if possible, automatically, quickly and with a light footprint, so that we can process large amounts of data.

However, we are not a software vendor, so we first tried every existing quantification tool. None of them met our expectations: they were limited in analysis types, often obscure in their analysis, and very often specific to Windows systems or graphical interfaces. That is when we began developing our own pipelines and tools, which we kept private and used for our own needs. Over time, our tools became more complete and less specific to our platform's needs. That is why, three years later, we decided to publicly release MassChroQ as free, open-source software.

MassChroQ is very flexible because it is fully configurable: users can adapt the various parameters to their data quality, their spectrometer type and their biological interests (quantification with or without labelling, complex peptide/protein separation (2D-LC), etc.).

MassChroQ produces precise traces of all the analysis steps it performs (the alignments, the XICs with and without signal filtering, the peaks detected on them, the matched peaks, etc.). These traces are very useful in practice: not only do they guide the user through the parameter adjustment process, but once the parameters are adjusted and the analysis is over, they can themselves be used for statistical analysis, plots, etc.

It can automatically handle heavy workloads of LC-MS data, quickly (a full analysis of 1 GB of profile mzXML data takes less than three minutes) and with a light footprint (it uses at most 450 MB of RAM, so your personal computer stays fully functional).

Last but not least, MassChroQ is free, open-source software that comes with a library which can easily be integrated into your own pipeline. It is available for Windows and Linux systems, and you are free to compile it on other systems. All the data that it uses or produces are in open standard formats (mzXML, mzML, gnumeric, tsv, ods).

How do I install and run MassChroQ?

Our Download page explains how to install and run MassChroQ on your platform.

What kind of isotopic labellings are handled in MassChroQ?

The quantification process in MassChroQ is based on MS level one continued data, so only isotopic labeling techniques that can be analysed at MS level one are handled (for example Nter, Cter, SILAC, ICAT, ...). This means that iTRAQ isotopic labeling types (based on MS level 2) are not handled in MassChroQ.

What isotopic labellings are not handled in MassChroQ?

iTRAQ isotopic labeling types, which are based on MS level 2 information, are not handled in MassChroQ.

What LC-MS/MS data file formats does MassChroQ accept?

MassChroQ accepts the mzXML and mzML data file formats. Because the mzML format is complicated and not yet secure, we strongly recommend using the mzXML format, which is simple, secure and complete.

Does MassChroQ perform protein identification?

No, MassChroQ is not identification software. You have to perform identification before quantifying your identified peptides with MassChroQ. There are two ways to make MassChroQ automatically take into account the identification information of your data:

  • If you use the X!Tandem software as we do, things are quite easy: you can freely install our X!Tandem pipeline which performs identification using X!Tandem, filtering and validation, and also offers a direct export of the results into a complete, ready-to-use, masschroqML input file to MassChroQ.
  • If you are using another identification engine (such as Mascot), you will have to export your results into spreadsheet files, then rearrange the columns in these files into the order MassChroQ expects. MassChroQ will automatically parse these files and extract the information they contain during quantification. For precise information on how to do this, see the dedicated chapter of the MassChroQ manual available on our Documentation page.
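The column rearrangement mentioned above can be scripted. The following is a minimal sketch in Python, assuming the identification results were exported as a tab-separated text file. The function name `reorder_columns` and the column names in the example are hypothetical; the actual column order that MassChroQ expects is described in the manual and is not reproduced here.

```python
import csv
import io


def reorder_columns(src, dst, target_order):
    """Copy a tab-separated table from src to dst, keeping only the
    columns named in target_order, in that order."""
    reader = csv.DictReader(src, delimiter="\t")
    available = reader.fieldnames or []
    missing = [name for name in target_order if name not in available]
    if missing:
        raise ValueError(f"columns missing from export: {missing}")
    # extrasaction="ignore" silently drops columns not listed in target_order
    writer = csv.DictWriter(dst, fieldnames=target_order, delimiter="\t",
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)


# Toy export with made-up column names (NOT the real MassChroQ order).
export = io.StringIO("score\tsequence\tscan\n0.01\tPEPTIDEK\t1042\n")
result = io.StringIO()
reorder_columns(export, result, ["scan", "sequence"])
print(result.getvalue())
```

With real files, `src` and `dst` would be file objects opened with `newline=""`, and `target_order` would be filled in from the manual.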

Does MassChroQ handle SRM/MRM experiments?

As of version 1.2 of MassChroQ, SRM/MRM experiments are not handled. This is a feature that we plan to implement in the next release.

What is the masschroqML format?

The masschroqML format is the format of the input file to MassChroQ. It is an XML format, whose schema can be found here. In this input file the user indicates all the data to analyse, the alignments to be performed on them and the different analysis parameters: everything that MassChroQ needs to know to analyse the data. On our Download page we provide different masschroqML example files, corresponding to different analysis cases. To run an analysis, the user should take one of these files and edit it to fit their data.
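After editing an example file, it can help to check it before launching an analysis. The sketch below is not part of MassChroQ: it uses only Python's standard XML parser to report XML syntax errors (with their line number) and data files that cannot be found. The `data_file` element name comes from the masschroqML description; the `path` attribute name, the resolution of relative paths against the input file's directory, and the toy file contents are assumptions made for illustration.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace such as '{uri}name' down to 'name'."""
    return tag.rsplit("}", 1)[-1]


def check_masschroqml(xml_path, path_attr="path"):
    """Return a list of human-readable problems found in the file."""
    try:
        tree = ET.parse(xml_path)
    except ET.ParseError as err:
        line, column = err.position
        return [f"XML syntax error at line {line}, column {column}: {err}"]
    problems = []
    # Assumption: relative paths are resolved against the XML file's folder.
    base_dir = os.path.dirname(os.path.abspath(xml_path))
    for elem in tree.iter():
        if local_name(elem.tag) != "data_file":
            continue
        ref = elem.get(path_attr)  # 'path' is an assumed attribute name
        if ref is None:
            problems.append(f"data_file without a '{path_attr}' attribute")
        elif not os.path.exists(os.path.join(base_dir, ref)):
            problems.append(f"data file not found: {ref}")
    return problems


# Toy files for demonstration only; real masschroqML files are much richer.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.xml")
    with open(good, "w") as fh:
        fh.write('<masschroq><data_file path="missing.mzXML"/></masschroq>')
    broken = os.path.join(tmp, "broken.xml")
    with open(broken, "w") as fh:
        fh.write("<masschroq><groups></masschroq>")  # <groups> never closed
    good_report = check_masschroqml(good)
    broken_report = check_masschroqml(broken)

print(good_report)
print(broken_report)
```

This only checks well-formedness and file existence; it does not validate the file against the masschroqML schema.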

I have LC-MS/MS raw files coming from my spectrometer that I want to process with MassChroQ. How do I proceed?

The classical workflow of LC-MS/MS data processing can be summarized in the following steps:

  1. Conversion of data from RAW format to mzXML or mzML file format.

    The RAW format is a proprietary, closed binary format which cannot be decoded without the manufacturer's libraries. By contrast, mzXML and mzML are open standard proteomics formats, and the information they contain is fully transparent. MassChroQ accepts only mzXML and mzML data formats. To convert your raw data to one of these formats, several free and open-source tools exist. The Seattle Proteome Center lists all possible alternatives for specific spectrometer types on this webpage.

  2. Protein identification and validation

    MassChroQ does not perform protein identification and validation, but it facilitates the automatic export of identification data into your masschroqML input file. See the dedicated FAQ question for details.

  3. Preparing the masschroqML input file to MassChroQ

    If you are not using the X!Tandem pipeline that exports your identification results in a ready-to-use masschroqML file, you should take one of the example masschroqML input files from our Documentation page. You should edit this file and modify the following lines:

    • the data_file lines should point to your mzXML/mzML data to analyse;
    • you should check and modify the groups block in order to form the groups that best fit your analysis (see dedicated question on groups);
    • the peptide_files lines should point to your peptide identification result files;
    • you should adapt the parameters of the alignment_method and quantification_method lines;
    • check that the align lines reference your groups, the desired reference sample in each group and the desired previously defined alignment_method;
    • check that the quantify lines reference your groups, the desired quantification mode and the desired previously defined quantification_method;
    • choose the desired formats and file names for the results in the quantification_results block;
    • choose the desired type of traces and output directory for the XIC traces in the quantification_traces block.
  4. Launch MassChroQ, check execution and results

    Launch the masschroq command on the previously edited masschroqML file and check the execution messages on the command line.

    If an error occurs ("Oops, an error occurred in MassChroQ" message), most of the time it is because there is an error in the masschroqML file. The error message contains the line where the error is located; please check this line carefully and look for possible causes (for example, a reference to a bad sample or group name). Most masschroqML errors are due to bad file names (files that do not exist on the computer), bad references (to sample or group names that do not exist), or tag mismatches. A tag mismatch is an XML syntax error: you have probably forgotten to close an XML block or an attribute. In such cases, take a closer look at the masschroqML syntax in the examples or in the masschroqML schema we provide.

    If no errors are encountered, when MassChroQ finishes you will find the following spreadsheet files on your system:

    • the .time and .trace files, which are produced by the alignment;
    • the quantitative results, ready for statistical analysis;
    • the XIC traces: one trace file per XIC, containing the XIC before and after filtering, and the detected and matched peaks.

    To process these data and check your results, you can run your own scripts or pipelines on them, since they are spreadsheet files. We also provide some utility R and Perl scripts that can help you with this.
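As a starting point for your own scripts, the sketch below reads a tab-separated result table with Python's standard library and reports its header and number of rows. It assumes the results were written in tsv format; the function name `summarize_table` and the column names in the sample are placeholders, not the actual MassChroQ output columns.

```python
import csv
import io


def summarize_table(handle):
    """Return the header row and the number of non-empty data rows
    of a tab-separated table."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, [])
    n_rows = sum(1 for row in reader if row)
    return header, n_rows


# Placeholder content; a real result file would be opened with
# open("results.tsv", newline="") instead (file name is hypothetical).
sample = io.StringIO("col_a\tcol_b\n1\t2\n3\t4\n")
header, n_rows = summarize_table(sample)
print(header, n_rows)
```

The same function can be applied to each result file in turn to confirm that it is non-empty before running a statistical analysis on it.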

MassChroQ's brief history.

MassChroQ was started in 2007 by Benoît Valot in the PAPPSO team, who was dealing with a great quantity of LC-MS data coming from different spectrometers. It first consisted of some Perl scripts that analysed these data. But as the LC-MS data sets grew bigger and bigger, PAPPSO's bioinformatician Olivier Langella began writing, in 2008, a C++ program called QuantiMScpp that already performed basic MS1 alignment, XIC extraction, peak detection and quantification.

QuantiMScpp was used internally by the team members. In June 2010, PAPPSO's computer engineer Edlira Nano joined the QuantiMScpp development adventure to work on it full time. Development continued under the new name MassChroQ and, once the software was mature enough, its first release was published in January 2011 together with an academic article.

MassChroQ is still under active development: new features and optimisations are continually being developed and released. Feel free to contribute to its development, or simply to suggest future features.