Building up the eBURST diagram
The original BURST algorithm identifies the primary founder as the ST with the
largest number of SLVs. Those STs that are SLVs of the predicted primary founder
are assigned to the primary founder and the new ST that has the greatest number
of previously unassigned SLVs is identified. The iterative process of identifying
the new ST with the greatest number of previously unassigned SLVs continues until
all of the STs that have multiple unassigned SLVs (subgroup founders) have been
identified. DLVs of the primary founder and of the subgroup founders are then
identified in a similar iterative manner. The original BURST displayed the primary
founder and the subgroup founders, and their SLVs and DLVs, but did not attempt
to link all of these clusters of STs together.
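The greedy founder selection described above can be sketched in Python. This is a minimal illustration, not the BURST source code. It assumes that each ST is an allelic profile (a tuple of allele numbers), that an SLV is a profile differing at exactly one locus, and that ties are broken by ST name. The function names slv_graph and burst_founders and the toy profiles are hypothetical, and the DLV assignment step is omitted.

```python
from itertools import combinations


def slv_graph(profiles):
    """Map each ST to the set of STs differing from it at exactly one locus."""
    graph = {st: set() for st in profiles}
    for a, b in combinations(profiles, 2):
        differences = sum(x != y for x, y in zip(profiles[a], profiles[b]))
        if differences == 1:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def burst_founders(graph):
    """Greedily pick founders by their number of still-unassigned SLVs."""
    unassigned = set(graph)
    groups = {}  # founder -> sorted list of SLVs assigned to it
    while True:
        candidates = [st for st in sorted(graph) if st not in groups]
        if not candidates:
            break
        # Ties go to the first candidate in sorted order (an assumption).
        founder = max(candidates, key=lambda st: len(graph[st] & unassigned))
        members = graph[founder] & unassigned
        # The primary founder needs at least one SLV; subgroup founders
        # need multiple (two or more) unassigned SLVs.
        if len(members) < (2 if groups else 1):
            break
        groups[founder] = sorted(members)
        unassigned -= members | {founder}
    return groups


# Toy three-locus profiles (hypothetical data).
profiles = {
    "A": (1, 1, 1), "B": (2, 1, 1), "C": (1, 2, 1),
    "D": (1, 1, 2), "E": (1, 3, 2), "F": (3, 1, 2),
}
groups = burst_founders(slv_graph(profiles))
print(groups)
```

In this toy data set, A is chosen first as the primary founder, and D, itself an SLV of A, is then chosen as a subgroup founder because it has two SLVs that were not yet assigned.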
In eBURST, the default group definition results in all STs being connected as a single
clonal complex and eBURST makes these links. However, if the primary founder and
subgroup founders, and their SLVs and DLVs, are assigned as described above for the
original BURST algorithm, there can be problems in achieving a fully linked diagram
without introducing ad hoc linking rules. In order to circumvent this problem, eBURST
produces an initial approximation of the above arrangement of STs, which ensures that
all STs in the group are linked, and then optimises the arrangement of STs to produce
the final eBURST diagram.
The procedure is as follows.
The ST that has the greatest number of SLVs is assigned as the primary founder and
is positioned centrally with radial links to all of its SLVs. Having assigned all of the
SLVs of the primary founder, the SLVs of each of these SLVs are identified (ignoring
any STs that have already been assigned to the primary founder) and linked, and this
iterative procedure of linking previously unassigned SLVs carries on outwards from
the SLVs, to the DLVs and then to the TLVs, until all SLV links have been made.
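The outward linking step behaves like a breadth-first traversal of the SLV graph starting at the primary founder. The sketch below illustrates that reading and is not the eBURST implementation: the function name initial_links, the tie-breaking by lowest ST number, and the small graph are all assumptions made for the example.

```python
from collections import deque


def initial_links(graph):
    """Link every ST to the first, most central ST that reaches it."""
    # Primary founder: the ST with the most SLVs (lowest ST number on ties).
    founder = max(sorted(graph), key=lambda st: len(graph[st]))
    parent = {founder: None}  # ST -> the ST it is linked to
    queue = deque([founder])
    while queue:
        st = queue.popleft()
        for slv in sorted(graph[st]):
            if slv not in parent:  # ignore STs that are already assigned
                parent[slv] = st
                queue.append(slv)
    return founder, parent


# Hypothetical SLV graph: ST -> set of its SLVs.
graph = {
    1: {4, 5, 6, 7, 8, 9},
    4: {1}, 5: {1}, 6: {1}, 8: {1}, 9: {1},
    7: {1, 2},
    2: {3, 7, 10, 11},
    3: {2, 10, 13, 14, 15},
    10: {2, 3},
    11: {2},
    13: {3}, 14: {3}, 15: {3},
}
founder, parent = initial_links(graph)
print(founder, parent)
```

Because the traversal works outwards level by level, ST10, which is an SLV of both ST2 and ST3 in this graph, is linked to ST2, the ST closer to the founder.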
Optimisation of the initial arrangement of STs is then carried out. The optimisation
method looks at each ST (excepting the primary founder and its SLVs, which
are unambiguously assigned) and searches for a better positioning of STs that
maximises the numbers of SLVs associated with subgroup founders. Optimisation
takes account of the simple model of clonal expansion that underpins BURST
where some STs within a clonal complex may have increased in frequency and
diversified to produce subgroups. It attempts to identify the most likely pattern
of subgroups by searching for those subgroup founders that have the greatest
numbers of linked SLVs.
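One way to picture the optimisation is as a re-linking pass over the initial arrangement. The sketch below uses a simple heuristic chosen for illustration, not the rule eBURST actually applies: each ST outside the primary founder and its SLVs is re-linked to whichever of its SLVs has the most SLVs in total, provided that ST is not one of its own descendants (which would create a loop). The names optimise, in_subtree and linked_slvs, the graph and the starting links are all hypothetical.

```python
def in_subtree(parent, node, root):
    """Return True if `node` is `root` or lies below `root` in the links."""
    while node is not None:
        if node == root:
            return True
        node = parent[node]
    return False


def optimise(graph, parent, founder):
    """Re-link STs to the SLV with the most SLVs (illustrative heuristic)."""
    parent = dict(parent)  # work on a copy
    fixed = {founder} | graph[founder]  # unambiguously assigned STs
    for st in sorted(graph):
        if st in fixed:
            continue
        best = parent[st]
        for candidate in sorted(graph[st]):
            if in_subtree(parent, candidate, st):
                continue  # linking to a descendant would create a loop
            if len(graph[candidate]) > len(graph[best]):
                best = candidate
        parent[st] = best
    return parent


def linked_slvs(parent, st):
    """List the STs currently linked outwards from `st`."""
    return sorted(child for child, p in parent.items() if p == st)


# Hypothetical SLV graph and initial links (founder outwards).
graph = {
    1: {4, 5, 6, 7, 8, 9},
    4: {1}, 5: {1}, 6: {1}, 8: {1}, 9: {1},
    7: {1, 2},
    2: {3, 7, 10, 11},
    3: {2, 10, 13, 14, 15},
    10: {2, 3},
    11: {2},
    13: {3}, 14: {3}, 15: {3},
}
initial = {1: None, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1,
           2: 7, 3: 2, 10: 2, 11: 2, 13: 3, 14: 3, 15: 3}
optimised = optimise(graph, initial, founder=1)
print(linked_slvs(optimised, 2), linked_slvs(optimised, 3))
```

Here ST10 starts out linked to ST2 but is moved to ST3, which has more SLVs in total, so ST3 gains a linked SLV and ST2 loses one.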

Figure 2
An illustrative example is shown in Figure 2. The initial procedure identifies
ST1 as the ST with the greatest number of SLVs (the primary founder) and
links ST1 to its seven SLVs. It then assigns the SLVs of each of these seven
SLVs, and identifies ST2 as an SLV of ST17 and links it. Progressing further
outwards, the four descendant SLVs of ST2 are identified and linked, and
the process continues outwards and links the four descendant SLVs of ST3.
This initial assignment of SLVs from the primary founder outwards results
in STs that are SLVs of more than one ST being preferentially assigned to
the more centrally positioned ST (ST2 in Figure 2).
In the example shown in Figure 2, optimisation identifies ST10 and ST12 as SLVs of
ST3 as well as of ST2. The optimisation procedure re-assigns STs to maximise the
numbers of SLVs associated with ST3 as this subgroup founder has more SLVs than ST2
and thus is a more likely subgroup founder. ST2 and ST3 each start with four SLVs and
after optimisation ST3 ends up with six linked SLVs (STs 10, 12, 13, 14, 15 and 16)
and ST2 with two linked SLVs (STs 3 and 11). Optimisation re-organises the
arrangement of STs to maximise the numbers of SLVs associated with subgroup
founders, closely approximating the subgroups produced by the original BURST
algorithm, but providing complete linkage between all of the STs in the group.
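The tallies in this example can be checked with simple set arithmetic. The sets below are read off the text: the four initial SLVs of each subgroup founder are inferred from the final lists given above, and the variable names are illustrative only.

```python
# Links after the initial, founder-outwards assignment.
st2_initial = {3, 10, 11, 12}
st3_initial = {13, 14, 15, 16}

# STs that are SLVs of both ST2 and ST3 and are re-assigned to ST3.
shared = {10, 12}

st3_final = st3_initial | shared
st2_final = st2_initial - shared

print(sorted(st3_final), sorted(st2_final))
```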