|
Comparing two datasets
eBURSTv3 provides the capability to compare and differentially highlight two
datasets. An initial dataset (REFERENCE) is first loaded and then a second dataset
(QUERY) can be loaded and compared to the reference dataset. This is a particularly
useful enhancement as it allows comparison of a user dataset (QUERY) with the
whole MLST database (REFERENCE) for that species, differentially highlighting
those STs that are unique to the user data and those that are also already present
in the MLST database. Alternatively, two user datasets can be compared, highlighting
STs unique to either dataset and those common to both datasets.
Datasets can be uploaded here or directly
into eBURSTv3 through the File menu.
Species-specific www.mlst.net websites provide
a facility to upload query datasets for
comparison against an entire database,
allowing you to explore the predicted
ancestry of your isolates prior to
submission to curators.
eBURST can also be run on Oxford MLST
databases, and datasets can be compared to
those in Oxford databases. However, unlike
the datasets provided from mlst.net databases,
these contain only one example of each ST.
After loading both datasets,
eBURST first compares the profiles
within both to check that there are no
identical allelic profiles assigned as
different STs or, conversely, no differing
allelic profiles assigned as the same
ST. Should discrepancies be found, descriptions
of the particular profiles are returned
in the Profiles Window and you are asked
to correct the data before reloading.
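The consistency check described above can be illustrated with a minimal Python sketch. This is not eBURST's own code: the function name `find_discrepancies` and the input layout (each dataset as a list of `(ST, allelic profile)` pairs) are assumptions made purely for illustration.

```python
from collections import defaultdict


def find_discrepancies(reference, query):
    """Check two datasets for inconsistent ST assignments.

    Hypothetical helper, not part of eBURST. Each dataset is assumed to be
    an iterable of (st, profile) pairs, where profile is a sequence of
    allele numbers.

    Returns two dicts:
      - identical profiles that have been assigned more than one ST
      - STs that have been assigned more than one profile
    """
    sts_by_profile = defaultdict(set)
    profiles_by_st = defaultdict(set)

    # Pool both datasets so that conflicts within and between them are found.
    for st, profile in list(reference) + list(query):
        profile = tuple(profile)
        sts_by_profile[profile].add(st)
        profiles_by_st[st].add(profile)

    same_profile_different_st = {
        profile: sorted(sts)
        for profile, sts in sts_by_profile.items()
        if len(sts) > 1
    }
    same_st_different_profile = {
        st: sorted(profiles)
        for st, profiles in profiles_by_st.items()
        if len(profiles) > 1
    }
    return same_profile_different_st, same_st_different_profile
```

Both returned dictionaries are empty when the two datasets are consistent with each other; any entries correspond to the kind of discrepancy that would need correcting before reloading.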
Once consistent data
are loaded, both datasets are displayed
in the Profiles Window, allowing analysis
to begin as with a single dataset. STs
in the Profiles Window are coloured differently
depending on their membership of the two
datasets:
Black - STs found only in the Reference dataset
Green - STs found only in the Query dataset
Cyan - STs found in both the Reference and Query datasets
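The colour scheme above amounts to a set-membership test. A minimal Python sketch follows, assuming each dataset is reduced to a collection of ST numbers; the function name `colour_sts` is hypothetical and is not part of eBURST.

```python
def colour_sts(reference_sts, query_sts):
    """Map each ST to the colour described above.

    Hypothetical helper for illustration only:
      black - ST found only in the Reference dataset
      green - ST found only in the Query dataset
      cyan  - ST found in both datasets
    """
    reference_sts = set(reference_sts)
    query_sts = set(query_sts)

    colours = {}
    for st in reference_sts | query_sts:
        if st in reference_sts and st in query_sts:
            colours[st] = "cyan"
        elif st in query_sts:
            colours[st] = "green"
        else:
            colours[st] = "black"
    return colours
```

For example, `colour_sts([1, 2, 3], [3, 4])` marks STs 1 and 2 as black, ST 3 as cyan and ST 4 as green.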
When viewing the eBURST
diagram of individual groups, or a population
snapshot of the two combined datasets,
the ST labels are also coloured in this
way.
To visualise
the differences and similarities between
the two datasets more clearly, the ST
labels can be turned off (from the ‘Diagram’ menu,
uncheck ‘Show ST labels’)
and a ‘halo’ is drawn around
each ST circle, coloured as follows:
Green - STs found only in the Query dataset
Cyan - STs found in both the Reference and Query datasets
STs found only in the
Reference dataset are shown as normal
(black), without a halo.
The border thickness
of the halos can be changed through the
Diagram menu so that the differential
colouring of STs is clearly visible when
the diagram is saved for a publication or
PowerPoint slide.
For further clarity, the colouring of the
primary founders and subgroup founders can
be removed (by selecting ‘black’ in
the ‘colour options’ menu from
the ‘Diagram’ menu).
Figure 6 depicts two
datasets, one (REFERENCE) collected at
one timepoint and another (QUERY)
collected at a later timepoint. It can
be seen that there are a number of minor
clonal complexes where the SLVs
of the predicted founder are present only
in the query dataset, collected after
the reference dataset.
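The pattern described for Figure 6 can be expressed as a short Python sketch. It assumes each dataset is a dictionary mapping ST to allelic profile, and takes an SLV (single-locus variant) to be a profile that differs from the founder at exactly one locus. The function names `is_slv` and `query_only_slvs` are hypothetical and are not part of eBURST.

```python
def is_slv(profile_a, profile_b):
    """Return True if two allelic profiles differ at exactly one locus."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same number of loci")
    return sum(a != b for a, b in zip(profile_a, profile_b)) == 1


def query_only_slvs(founder_profile, reference, query):
    """List STs that are SLVs of the founder and occur only in the query.

    Hypothetical helper for illustration only. `reference` and `query`
    are assumed to be dicts mapping ST -> allelic profile.
    """
    return sorted(
        st
        for st, profile in query.items()
        if st not in reference and is_slv(founder_profile, profile)
    )
```

Applied to a reference dataset from an earlier timepoint and a query dataset from a later one, a non-empty result for a predicted founder corresponds to the situation shown in the figure.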

Figure 6
|