@@ -35,27 +35,6 @@ Note that this strategy is currently only available for `hg19`, `hg38`, `mm9` an

### Introduction of CALDER analysis for other genomes

Although CALDER was mainly tested on human and mouse datasets, it can be applied to datasets from other genomes. One additional piece of information is required in that case: a `feature_track` presumably positively correlated with the compartment score (thus with higher values in A than in B compartments). This information is used to correctly determine the `A/B` direction. Suggested tracks include gene density and H3K27ac, H3K4me1, H3K4me2, H3K4me3 or H3K36me3 signals (or a negative transform of the H3K9me3 signal). Note that this information does not alter the hierarchical compartment/TAD structure, and it can come from any external study with a matched genome.
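As a rough illustration of the "negative transform" mentioned above, the following sketch builds a `feature_track` from an H3K9me3-like signal by flipping the sign of the score, so that higher values are expected in A than in B. The table `h3k9me3` and all of its values are made up for illustration, and the checks at the end are only a suggestion, not something CALDER is known to require in this exact form.

```r
# Hypothetical H3K9me3 signal in a BED-like layout; all values are invented.
h3k9me3 = data.frame(
  chr   = c("chr1", "chr1", "chr1", "chr2"),
  start = c(1, 10001, 20001, 1),
  end   = c(10000, 20000, 30000, 10000),
  score = c(0.5, 3.2, 0.0, 1.1)
)

# H3K9me3 is assumed to be higher in B than in A, so negate the score
# to obtain a track that is positively correlated with the compartment score.
feature_track = h3k9me3
feature_track$score = -feature_track$score

# Sanity checks on the expected four-column layout (chr, start, end, score).
stopifnot(
  identical(colnames(feature_track), c("chr", "start", "end", "score")),
  is.numeric(feature_track$score),
  all(feature_track$start <= feature_track$end)
)
```

A track that is already positively correlated with the compartment score, such as H3K27ac or gene density, would be used without the sign flip.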
-<br>
-<br>
-`feature_track` should be a data.frame or data.table of 4 columns (chr, start, end, score), and can be generated directly from conventional format such as bed or wig, see the following example:
-
-```
-library(rtracklayer)
-feature_track = import('ENCFF934YOE.bigWig') ## from ENCODE https://www.encodeproject.org/files/ENCFF934YOE/@@download/ENCFF934YOE.bigWig
-feature_track = data.table::as.data.table(feature_track)[, c(1:3, 6)]
-```
- > feature_track
- chr start end score
- chr1 534179 534353 2.80512
- chr1 534354 572399 0
- chr1 572400 572574 2.80512
- chr1 572575 628400 0
- ... ... ... ...
- chrY 59031457 59032403 0
- chrY 59032404 59032413 0.92023
- chrY 59032414 59032415 0.96625
- chrY 59032416 59032456 0.92023
- chrY 59032457 59032578 0.78875

# Installation
@@ -99,6 +78,7 @@ remotes::install_github("CSOgroup/CALDER")

# Usage

+
The input data of CALDER is a three-column text file storing the contact table of a full chromosome (a compressed file is acceptable, as long as it can be read by `data.table::fread`). Each row represents a contact record `pos_x, pos_y, contact_value`, the same format as that generated by the `dump` command of juicer (https://github.com/aidenlab/juicer/wiki/Data-Extraction):

16050000 16050000 10106.306
@@ -114,6 +94,26 @@ The input data of CALDER is a three-column text file storing the contact table o

A demo dataset is included in the repository at `CALDER/inst/extdata/mat_chr22_10kb_ob.txt.gz` and can be accessed with `system.file("extdata", "mat_chr22_10kb_ob.txt.gz", package='CALDER')` once CALDER is installed. This dataset contains contact values of GM12878 on chr22 binned at 10kb (Rao et al. 2014).
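To make the three-column layout concrete, here is a minimal sketch that writes a few contact records to a temporary file and reads them back. Only the first record comes from the example above; the other two rows, the column names and the bin-size heuristic are assumptions for illustration. Base R `read.table` is used solely to keep the sketch free of dependencies; it is not a statement about how CALDER reads its input.

```r
# Contact records in dump-like format: pos_x, pos_y, contact_value.
# Only the first row is taken from the example above; the rest are invented.
dump_lines = c(
  "16050000\t16050000\t10106.306",
  "16050000\t16060000\t2259.247",
  "16060000\t16060000\t7773.604"
)

# Write to a temporary file to mimic a contact table on disk.
path = tempfile(fileext = ".txt")
writeLines(dump_lines, path)

# Read the three columns; the column names are illustrative only.
contacts = read.table(path, header = FALSE,
                      col.names = c("pos_x", "pos_y", "contact_value"))

# Heuristic (an assumption): guess the bin size as the smallest gap
# between distinct bin positions.
positions = sort(unique(c(contacts$pos_x, contacts$pos_y)))
bin_size = min(diff(positions))
```

The same kind of check can be run on the demo file after installation by replacing `path` with the `system.file(...)` call shown above, provided the reader function used can open gzip-compressed files.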

+`feature_track` should be a data.frame or data.table of 4 columns (chr, start, end, score), and can be generated directly from conventional format such as bed or wig, see the following example:
+
+```
+library(rtracklayer)
+feature_track = import('ENCFF934YOE.bigWig') ## from ENCODE https://www.encodeproject.org/files/ENCFF934YOE/@@download/ENCFF934YOE.bigWig
+feature_track = data.table::as.data.table(feature_track)[, c(1:3, 6)]
+```
+ > feature_track
+ chr start end score
+ chr1 534179 534353 2.80512
+ chr1 534354 572399 0
+ chr1 572400 572574 2.80512
+ chr1 572575 628400 0
+ ... ... ... ...
+ chrY 59031457 59032403 0
+ chrY 59032404 59032413 0.92023
+ chrY 59032414 59032415 0.96625
+ chrY 59032416 59032456 0.92023
+ chrY 59032457 59032578 0.78875
+
CALDER contains three modules: (1) compute chromatin domains; (2) derive their hierarchical organization and obtain sub-compartments; (3) compute nested sub-domains within each compartment domain.

### Example one: use contact matrix file in dump format as input