QGene data format

The standard format for importing marker, map, and trait data into QGene is as a .qdf file, which is structured as follows:

[Header]
Study name MY1 location means 10.8.07
Mating string r
Genotype symbols ABHxx-
Parent1 RT0034
Parent2 Cypress

[Locus]
AP2882 1 0 AAAAA-BB-BBBAABAABAAAAAAAAAABABBABBBBBBBAAAAAAABAAABBAAAAABABBBAAABAAABBBBBBBBBBBAAAABAAABBBBBBBAAABBBAABAHAAAAABBABAABBAAAAAABAA
RM10149 1 14.9 AAAAAAAABBBBABBAB-AAAAAAAAAAAABABBBBB-BBA-AAAAABAAHBBAABAABABBBABHBA-AABABBABBBABBAAABBABBBAAAABAAABBBAAAABAABAABBABAABBBAAAAABAB

[Trait]
AMY_AR M 21.8 21.85 22.7 22.65 22.15 23.3 23.4 23.9 22.95 23.7 22.8 21.5 23.6 20 . . 22.2 21.8 23.4 23.95 24.2 23.3 22.9 24.4 21.45 23.15 23.9 24.55 23.7 23.8 21.1 22.7 23.5 22.9 24.1 23.6 22.8 . 23.5 . 21.25 22 20.9 24.95 24.05 23.8 24.2 22.95 . 24.3 23.85 21.3 21.6 22.7 24.6 23.55 22.8 20.75 23.6 23.95 22.9 23.8 24.1 22 24.85 22.55 23.55 23.95 22.5 24.1 24.6 21.45 24.3 . 22.9 24 23.8 24.4 21.7 24.3 23.7 22.7 23.4 23.4 24.55 24.95 21.6 24.6 20.1 23.95 24.25 23.15 24.2 23.1 22.3 24.05 23.9 21.9 23.2 22.1 21.85 . 23.85 23.65 24 21.9 . 21.75 22.9 22.4 22.85 20.85 22.9 23.95 22 23.9 23.5 23.6 21.15 22.45 23.8 22.4 24.1 24.35 24.1 23.65 23.6 23.95 .
BLANKS_AR M 0 35 46.65 20 31.7 26.7 21.7 0 31.65 0 38.3 6.65 31.7 0 80 70 33.5 33.3 30 21.65 30 21.65 16.7 16.65 29.2 14.15 25.85 13.35 31.65 33.35 0 0 31.65 36.7 23.3 27.5 23.3 0 21.7 . 5 36.65 40 5.85 47.5 0 27.5 25 . 26.65 30 38.35 41.5 30 20 19.15 10 31.65 15.8 13.35 20.8 62.5 31.65 23.3 7.5 0 0 0 0 55 15 15 22.5 74.15 25 16.65 45 18.35 0 30 36.65 30 38.35 0 31.65 18.3 35.85 26.7 0 35 0 0 11.65 25.8 33.3 15.85 33.3 61.65 16.7 66.65 19.15 0 22.5 35 25 43.15 40 0 0 20 65 0 0 0 11.7 36.7 0 11.65 16.65 19.15 0 16.65 68.35 10 36.7 31.65 0 30 0

Explanation of .qdf format

The three keywords in square brackets tell QGene what kind of data follows them, up to the next keyword.
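
As an illustration of this layout, here is a minimal Python sketch that groups the lines of a .qdf file under their section keywords. It is not QGene's own reader. The function name split_sections is hypothetical, and the sketch assumes that each keyword sits alone on its line and is spelled exactly as in the example.

```python
import re

# Assumption: keywords are case-sensitive and appear alone on a line.
SECTION_RE = re.compile(r"^\[(Header|Locus|Trait)\]$")

def split_sections(text):
    """Group the non-blank lines of a .qdf file under their section keyword."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines only separate sections
        match = SECTION_RE.match(line)
        if match:
            current = match.group(1)
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
        # Lines before the first keyword are ignored in this sketch.
    return sections

sample = """[Header]
Study name demo
Mating string r

[Locus]
M1 1 0 AABB-

[Trait]
T1 M 1.5 2 . 3 4
"""

sections = split_sections(sample)
print(list(sections))  # ['Header', 'Locus', 'Trait']
```

Each value in the returned dictionary is the list of data lines belonging to that section, in file order.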

[Header] section

The Study name will be used to identify the data set in the Data manager, so use an informative (though not too long) name. The Mating string describes the genetic model to be used. At present QGene understands the following mating designs:

  • r refers to a standard recombinant-inbred progeny, created by multiple-generation selfing but possibly retaining some heterozygosity.
  • Any sequence of b, d, s, and i or their upper-case counterparts may be provided as a mating string. These letters stand for backcrossing, doubled-haploid creation, selfing, and random intercrossing, respectively, and the string is assumed to start with operations applied to the F1 generation. To specify a BC1F1 design, for example, write only b. An F2 is s, an F3 is ss, and a series of three backcrosses followed by one selfing is bbbs.
  • QGene does not understand outcross designs at present.
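
The mating-string rules above can be sketched as a small Python helper that expands a string into the operations applied to the F1. This is only an illustration and not QGene's implementation. The name describe_mating and the operation labels are hypothetical, and treating only a lower-case r as the recombinant-inbred design is an assumption.

```python
# Operation letters as described in the list above.
OPERATIONS = {
    "b": "backcross",
    "d": "doubled haploid",
    "s": "self",
    "i": "random intercross",
}

def describe_mating(mating):
    """Expand a mating string into the operations applied to the F1."""
    if mating == "r":  # assumption: only lower-case r means recombinant inbred
        return ["recombinant inbred"]
    if not mating:
        raise ValueError("empty mating string")
    steps = []
    for letter in mating:
        # Upper-case letters are accepted as equivalents of lower-case ones.
        operation = OPERATIONS.get(letter.lower())
        if operation is None:
            raise ValueError(f"unknown mating operation: {letter!r}")
        steps.append(operation)
    return steps

print(describe_mating("bbbs"))
# ['backcross', 'backcross', 'backcross', 'self']
```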

The Genotype symbols are the characters you are using to represent, in this order: the parent 1 homozygote AA, the parent 2 homozygote aa, the heterozygote Aa, the dominant marker phenotypes a_ and A_, and missing data. The x characters in the example denote symbols that do not appear in your data. You may use any alphanumeric symbols you want, but your file will be clearest to you and others if you stick to the ABHCD- convention introduced by the Mapmaker program and widely followed since. In all backcross designs, the first symbol (here A) is assumed to represent the recurrent parent.
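
To make the positional meaning of the Genotype symbols entry concrete, the following Python sketch builds a lookup table from symbol to genotype class. It is an illustration under stated assumptions, not QGene code. The function symbol_table and its unused parameter are hypothetical, and the sketch assumes that x is the placeholder for classes absent from the data, as in the example above.

```python
# Class order as listed in the text: AA, aa, Aa, a_, A_, missing.
CLASSES = ("AA", "aa", "Aa", "a_", "A_", "missing")

def symbol_table(symbols, unused="x"):
    """Map each genotype symbol to its class, skipping the placeholder."""
    if len(symbols) != len(CLASSES):
        raise ValueError(f"expected {len(CLASSES)} symbols, got {len(symbols)}")
    table = {}
    for symbol, genotype_class in zip(symbols, CLASSES):
        if symbol == unused:
            continue  # this class does not occur in the data
        if symbol in table:
            raise ValueError(f"symbol {symbol!r} is used for two classes")
        table[symbol] = genotype_class
    return table

table = symbol_table("ABHxx-")
print(table)
# {'A': 'AA', 'B': 'aa', 'H': 'Aa', '-': 'missing'}
```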

The Parent1 and Parent2 entries give QGene labels for QTL effect plots, where analysts wish to determine the parental origin of a superior QTL allele. The labels may also be used in other plots, for example marker-segregation plots or histograms showing parental means. If you don't provide names for the parents, QGene will default to A and B.

[Locus] section

  • Here the map and marker genotype (technically, marker phenotype) data are given. The first word in each row is the marker name, followed by the chromosome and cM position of the marker, and finally the marker data.
  • If you don't know the true chromosome or map position, you must still enter some value for each of these two fields.
  • The marker data values need not be separated by whitespace, but will still be read properly if they are. QGene will verify that each marker is accompanied by the same number of genotype values, representing the individuals in the population.
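
The [Locus] line layout and the count check described above can be sketched in Python as follows. This is a hedged illustration rather than QGene's parser. The names parse_locus and check_counts are hypothetical, and the sketch assumes one character per genotype value, as in the example file.

```python
def parse_locus(line):
    """Split a [Locus] line into name, chromosome, cM position, genotypes."""
    name, chromosome, position, *genotypes = line.split()
    # Genotype values may or may not be separated by whitespace, so join
    # whatever pieces remain into a single string of symbols.
    return name, chromosome, float(position), "".join(genotypes)

def check_counts(loci):
    """Return the common number of individuals, or raise if markers differ."""
    counts = {len(genotypes) for _, _, _, genotypes in loci}
    if len(counts) != 1:
        raise ValueError(f"inconsistent genotype counts: {sorted(counts)}")
    return counts.pop()

lines = [
    "M1 1 0 AAB-H",        # values written without separators
    "M2 1 14.9 AA B- H",   # same number of values, with whitespace
]
loci = [parse_locus(line) for line in lines]
print(check_counts(loci))  # 5
```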

[Trait] section

  • The first word on each line is the trait name; the second must be one of N, O, or M, indicating that the trait is nominal, ordinal, or metric; the remaining entries are either numbers (for ordinal and metric traits) or alphanumeric strings (for nominal traits). At present (2.08) QGene doesn't provide any analysis for categorical traits, so don't include them in your data set.
  • Missing trait data should be represented by periods (.).
  • As with the marker data, QGene will verify that the numbers of trait values are consistent among traits and also with those of marker data.
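
A [Trait] line can be read along the same lines. The Python sketch below is an illustration only. The name parse_trait is hypothetical, and representing missing values as None is a choice made for this example, not something QGene prescribes.

```python
def parse_trait(line):
    """Split a [Trait] line into name, type code, and a list of values."""
    name, kind, *values = line.split()
    if kind not in ("N", "O", "M"):
        raise ValueError(f"trait type must be N, O, or M, got {kind!r}")
    if kind == "N":
        # Nominal traits keep their alphanumeric labels.
        parsed = [None if value == "." else value for value in values]
    else:
        # Ordinal and metric traits are numeric; a period marks missing data.
        parsed = [None if value == "." else float(value) for value in values]
    return name, kind, parsed

name, kind, values = parse_trait("AMY_AR M 21.8 21.85 . 20")
print(values)  # [21.8, 21.85, None, 20.0]
```

Comparing len(values) across traits, and against the number of genotype values per marker, reproduces the consistency check described above.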

Notes of caution

  • While you may load marker data with no trait data or vice versa, if your data set contains both kinds of data the marker data must come first.
  • While it's convenient to prepare your .qdf data file in Microsoft Excel, don't save it as an .xls file, which QGene won't be able to read! Save it as tab-separated text, .txt.

Other permitted formats

QGene 3.0 (Macintosh)

You can load files in the old QGene format. In the Load data file dialog, select both the .data and the .map files after choosing the corresponding file type from the Files of type: dropdown menu. To make this multiple selection you'll need to hold down your Control (Windows, Unix) or Command (Macintosh) key.

QTL Cartographer

This option should work for saving, but not loading, .mcd files.