Ancestry tables¶
Contents of this section:
- What are ancestry tables?
- API
What are ancestry tables?¶
Intuitively, local ancestry might be shown with a matrix that looks something like this:
[make a picture and put it here]
An ancestry table shows, for each segment of a node's genome, the most recent ancestral population from which that node has inherited.
Each row of the ancestry table records who has inherited from whom on a particular segment of the genome. Specifically, each row (L, R, A, P, S) specifies that sample S has inherited from ancestor A in population P over the genomic interval with coordinates (L, R).
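As a rough illustration of how a row can be read, the sketch below models rows as plain Python tuples and looks up the population a sample inherited from at a given position. The `Row` type, the `population_at` helper, and all values are hypothetical and not part of ancestrytools; treating each interval as half-open, with L included and R excluded, is also an assumption.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    """One ancestry-table row (L, R, A, P, S); a hypothetical stand-in."""
    left: float      # L: left coordinate of the segment
    right: float     # R: right coordinate of the segment
    ancestor: int    # A: ID of the ancestral node
    population: int  # P: population of the ancestral node
    sample: int      # S: ID of the sample node


# Made-up rows: sample 0 inherits from population 1 on the first half of a
# 100-unit genome and from population 2 on the second half; sample 1
# inherits from population 1 along the whole genome.
rows = [
    Row(0.0, 50.0, 7, 1, 0),
    Row(50.0, 100.0, 9, 2, 0),
    Row(0.0, 100.0, 7, 1, 1),
]


def population_at(rows, sample: int, position: float) -> Optional[int]:
    """Return the ancestral population of `sample` at `position`, if any.

    Assumes half-open intervals: left <= position < right.
    """
    for row in rows:
        if row.sample == sample and row.left <= position < row.right:
            return row.population
    return None


print(population_at(rows, sample=0, position=25.0))  # 1
print(population_at(rows, sample=0, position=75.0))  # 2
```

A position with no matching row returns `None`, which in this sketch means the sample has no recorded ancestry from the populations of interest at that position.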
Different ancestry tables can be produced from the same dataset depending on the set of ancestors that are of interest. We focus on applications of these tables to studies of local ancestry and IBD.
An example¶
Ancestry table with ancestors from ancestral populations:
API¶
class ancestrytools.AncestryTable
A table showing all genomic segments of the specified samples whose ancestry traces to one of the specified populations. Each row (L, R, A, P, S) indicates that over the genomic interval with coordinates (L, R), the sample node with ID S has inherited from the ancestral node with ID A in population P.
Variables: - left (numpy.ndarray, dtype=np.float64) – The array of left coordinates.
- right (numpy.ndarray, dtype=np.float64) – The array of right coordinates.
- ancestor (numpy.ndarray, dtype=np.int32) – The array of ancestral nodes.
- population (numpy.ndarray, dtype=np.int32) – The array of population labels.
add_row(left, right, population, child, ancestor=-1)
Adds a single row with the specified values to the bottom of the table.
Parameters: - left (float) – The left coordinate of the segment.
- right (float) – The right coordinate of the segment.
- population (int) – The population of the ancestral node.
- child (int) – The ID of the child node.
- ancestor (int) – The ID of the ancestral node. Defaults to -1.
asdict()
Returns a dictionary of table values. The keys are the column names, and the values are the numpy arrays holding the column values.
num_rows()
Returns the number of rows in the table.
set_columns(left, right, population, child, ancestor=None)
Sets the values in each column of the table, so that many rows can be added at once.
Parameters: - left (list, dtype=np.float64) – The list of left coordinates.
- right (list, dtype=np.float64) – The list of right coordinates.
- ancestor (list, dtype=np.int32) – The list of ancestral nodes.
- population (list, dtype=np.int32) – The list of population labels.
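To make the method signatures above concrete, here is a minimal stand-in class that mirrors them. It is a sketch, not the ancestrytools implementation: the name `SketchAncestryTable` is hypothetical, plain Python lists replace the numpy arrays, the dictionary keys are assumed to match the argument names, and filling a missing `ancestor` column with -1 in `set_columns` is an assumption based on the default in `add_row`.

```python
class SketchAncestryTable:
    """Column-oriented stand-in mirroring the documented signatures.

    Hypothetical: plain lists are used where the real table holds numpy arrays.
    """

    def __init__(self):
        self.left = []
        self.right = []
        self.ancestor = []
        self.population = []
        self.child = []

    def add_row(self, left, right, population, child, ancestor=-1):
        """Append a single row to the bottom of the table."""
        self.left.append(float(left))
        self.right.append(float(right))
        self.population.append(int(population))
        self.child.append(int(child))
        self.ancestor.append(int(ancestor))

    def set_columns(self, left, right, population, child, ancestor=None):
        """Replace every column at once."""
        n = len(left)
        if ancestor is None:
            # Assumption: unknown ancestors are stored as -1.
            ancestor = [-1] * n
        columns = (left, right, population, child, ancestor)
        if any(len(col) != n for col in columns):
            raise ValueError("all columns must have the same length")
        self.left = [float(x) for x in left]
        self.right = [float(x) for x in right]
        self.population = [int(x) for x in population]
        self.child = [int(x) for x in child]
        self.ancestor = [int(x) for x in ancestor]

    def num_rows(self):
        """Return the number of rows in the table."""
        return len(self.left)

    def asdict(self):
        """Return a dictionary mapping column names to copies of the columns."""
        return {
            "left": list(self.left),
            "right": list(self.right),
            "ancestor": list(self.ancestor),
            "population": list(self.population),
            "child": list(self.child),
        }


# Row-by-row construction.
table = SketchAncestryTable()
table.add_row(0.0, 50.0, population=1, child=0, ancestor=7)
table.add_row(50.0, 100.0, population=2, child=0)
print(table.num_rows())            # 2
print(table.asdict()["ancestor"])  # [7, -1]

# Bulk construction of the same segments, without ancestor IDs.
bulk = SketchAncestryTable()
bulk.set_columns(
    left=[0.0, 50.0], right=[50.0, 100.0], population=[1, 2], child=[0, 0]
)
print(bulk.asdict()["population"])  # [1, 2]
```

Storing one sequence per column, rather than a list of row objects, follows the layout the attribute list describes and keeps bulk operations such as `set_columns` simple.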
ancestrytools.get_ancestry_table(ts, populations, samples=None, keep_ancestors=False)
Returns an AncestryTable showing local ancestry information for the specified set of samples.
Parameters: - ts (tskit.TreeSequence) – The tree sequence containing the dataset.
- populations (list, dtype=int) – A list of ancestral population IDs of interest.
- samples (list, dtype=int) – A list of sample node IDs of interest. If None, all samples in the input tree sequence are used.
- keep_ancestors (bool) – If True, ancestral node IDs are retained in the output.
Returns: The ancestry table listing the local ancestry of the genomic segments corresponding to the child nodes.
Return type: ancestrytools.AncestryTable
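One way such a table might be used in a local-ancestry study is to summarise how much of each sample's genome derives from each population. The sketch below does not call `get_ancestry_table`; it assumes the table has already been converted to a dictionary of columns, as `asdict()` is described as returning. The key names, the `ancestry_proportions` helper, and the values are hypothetical, and the calculation assumes that a sample's segments do not overlap.

```python
from collections import defaultdict


def ancestry_proportions(columns, sequence_length):
    """Return {sample: {population: fraction of the genome}} from column data."""
    spans = defaultdict(lambda: defaultdict(float))
    for left, right, pop, child in zip(
        columns["left"], columns["right"], columns["population"], columns["child"]
    ):
        # Accumulate the length of each segment under its sample and population.
        spans[child][pop] += right - left
    return {
        child: {pop: span / sequence_length for pop, span in by_pop.items()}
        for child, by_pop in spans.items()
    }


# Made-up column data for two samples on a genome of length 100.
columns = {
    "left": [0.0, 50.0, 0.0],
    "right": [50.0, 100.0, 100.0],
    "population": [1, 2, 1],
    "child": [0, 0, 1],
}

proportions = ancestry_proportions(columns, sequence_length=100.0)
print(proportions[0])  # {1: 0.5, 2: 0.5}
print(proportions[1])  # {1: 1.0}
```

With a real tree sequence, the sequence length would presumably come from `ts.sequence_length`, which tskit provides on `TreeSequence` objects.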