Yuanlong LIU 2 years ago
parent
commit
9ca1306cbe
1 changed file with 6 additions and 3 deletions

+ 6 - 3
README.md

@@ -18,11 +18,10 @@ CALDER is a Hi-C analysis tool that allows: (1) compute chromatin domains from w
 Due to reasons such as low data quality or large-scale structural variation, compartments can be unreliably called at one `bin_size` (equivalent to `resolution` in the literature) but might be captured at another `bin_size`. We added an optimized `bin_size` selection strategy to call reliable compartments. It is based on the observation from our large-scale compartment analysis (https://www.nature.com/articles/s41467-021-22666-3) that, although compartments can change between different conditions, their overall correlation `cor(compartment_rank_1, compartment_rank_2)` is high (> 0.4).
 <br>
 <br>
-Given a `bin_size` specified by the user, we call compartment with extended `bin_sizes` and choose the smallest `bin_size` such that no bigger `bin_size` can increase the correlation with a reference compartment by more than 0.05. For example, if the correlation for `bin_size=10000` is 0.2 while for `bin_size=50000` it is 0.6, we are more confident that the latter is reliable; if the correlation for `bin_size=10000` is 0.5 while for `bin_size=50000` it is 0.52, we would choose the former as it has higher resolution.
+Given a `bin_size` specified by the user, we call compartments with extended `bin_sizes` and choose the smallest `bin_size` such that no bigger `bin_size` can increase the correlation with a reference compartment by more than 0.05. For example, if the correlation for `bin_size=10000` is 0.2 while for `bin_size=50000` it is 0.6, we are more confident that the latter is reliable; if the correlation for `bin_size=10000` is 0.5 while for `bin_size=50000` it is 0.52, we would choose the former as it has higher resolution.
 <br>
 <br>
-High-quality compartment calls were generated for `hg19` (Hi-C data from GSE63525), `hg38` (Hi-C data from https://data.4dnucleome.org/files-processed/4DNFI1UEG1HD/), `mm9` (Hi-C data from GSM3959427), and `mm10` (Hi-C data from http://hicfiles.s3.amazonaws.com/external/bonev/CN_mapq30.hic).
-
+`bin_size` is extended in the following way, so that larger `bin_sizes` can be aggregated directly from the input contact matrix:
 ```
 if(bin_size==5E3) bin_sizes = c(5E3, 10E3, 50E3, 100E3)
 if(bin_size==10E3) bin_sizes = c(10E3, 50E3, 100E3)
@@ -31,6 +30,10 @@ if(bin_size==25E3) bin_sizes = c(25E3, 50E3, 100E3)
 if(bin_size==40E3) bin_sizes = c(40E3, 80E3)
 if(bin_size==50E3) bin_sizes = c(50E3, 100E3)
 ```
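
The selection rule described above can be sketched in R as follows. This is a minimal illustration, not CALDER's actual implementation: the function name `choose_bin_size`, its arguments, and the `tol` parameter are hypothetical, and the correlations with the reference compartments are assumed to have been computed already for each candidate `bin_size`.

```r
# Hypothetical helper (not part of CALDER): pick the smallest bin_size such
# that no larger bin_size improves the correlation with the reference
# compartments by more than `tol`.
choose_bin_size <- function(bin_sizes, cors, tol = 0.05) {
  stopifnot(length(bin_sizes) == length(cors), length(bin_sizes) > 0)

  # Sort candidates from highest resolution (smallest bin) to lowest
  ord <- order(bin_sizes)
  bin_sizes <- bin_sizes[ord]
  cors <- cors[ord]

  for (i in seq_along(bin_sizes)) {
    # Correlations obtained at all strictly larger bin sizes
    larger <- cors[seq_along(cors) > i]
    # Accept this bin_size if nothing larger gains more than `tol`;
    # the largest bin_size is always accepted as a fallback
    if (length(larger) == 0 || max(larger) - cors[i] <= tol) {
      return(bin_sizes[i])
    }
  }
}

# The two examples from the text
choose_bin_size(c(10E3, 50E3), c(0.2, 0.6))   # returns 50000
choose_bin_size(c(10E3, 50E3), c(0.5, 0.52))  # returns 10000
```

In the first call the larger bin raises the correlation by 0.4, so the smaller bin is rejected; in the second the gain is only 0.02, so the higher-resolution bin is kept.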
+<br>
+<br>
+High-quality reference compartments were generated for `hg19` (Hi-C data from GSE63525), `hg38` (Hi-C data from https://data.4dnucleome.org/files-processed/4DNFI1UEG1HD/), `mm9` (Hi-C data from GSM3959427), and `mm10` (Hi-C data from http://hicfiles.s3.amazonaws.com/external/bonev/CN_mapq30.hic).
+
 
 # Installation