|
@@ -9,11 +9,16 @@ CALDER is a Hi-C analysis tool that allows: (1) compute chromatin domains from w
|
|
|
|
|
|
* Support for hg19, hg38, mm9, mm10 and other genomes
|
|
|
* Support input in .hic format generated by Juicer tools (https://github.com/aidenlab/juicer)
|
|
|
-* Opimized bin_size selection
|
|
|
+* Optimized bin_size selection
|
|
|
* Added output in tabular .txt format for downstream analysis
|
|
|
* Aggregated all chromosome output into a single file
|
|
|
|
|
|
-## Introduction of opimized bin_size selection
|
|
|
+## Introduction of optimized bin_size selection
|
|
|
+('bin_size' is equivalent to 'resolution')
|
|
|
+
|
|
|
+We developed the optimized bin_size selection method to call reliable compartments at high resolution.
|
|
|
+
|
|
|
+High-quality compartment calls were generated for 'hg19' (Hi-C data from GSE63525), 'hg38' (Hi-C data from https://data.4dnucleome.org/files-processed/4DNFI1UEG1HD/), 'mm9' (Hi-C data from GSM3959427), and 'mm10' (Hi-C data from http://hicfiles.s3.amazonaws.com/external/bonev/CN_mapq30.hic).
|
|
|
|
|
|
|
|
|
# Installation
|
|
@@ -120,7 +125,7 @@ CALDER_sub_domains(intermediate_data_file,
|
|
|
| **contact_file_dump** | A list of contact files in dump format, named by `chrs`. Each contact file stores the contact information of the corresponding `chr`. Only one of `contact_file_dump`, `contact_tab_dump`, `contact_file_hic` should be provided
|
|
|
| **contact_tab_dump** | A list of contact tables in dump format, named by `chrs`, stored as an R object. Only one of `contact_file_dump`, `contact_tab_dump`, `contact_file_hic` should be provided
|
|
|
| **contact_file_hic** | A hic file generated by Juicer tools. It should contain all chromosomes in `chrs`. Only one of `contact_file_dump`, `contact_tab_dump`, `contact_file_hic` should be provided
|
|
|
-| **ref_genome** | One of 'hg19', 'hg38', 'mm9', 'mm10', 'others' (default). High quality compartment calls were generated for 'hg19' (hic data from GSE63525), 'hg38' (hic data from https://data.4dnucleome.org/files-processed/4DNFI1UEG1HD/), 'mm9' (hic data from GSM3959427), 'mm10' (hic data from http://hicfiles.s3.amazonaws.com/external/bonev/CN_mapq30.hic). These compartments will be used as reference compartments for optimized bin_size selection. If `ref_genome = others`, an `annotation_track` should be provided (see below) and no optimized bin_size selection will be performed
|
|
|
+| **ref_genome** | One of 'hg19', 'hg38', 'mm9', 'mm10', 'others' (default). For the four named genomes, the pre-generated high-quality compartment calls are used as reference compartments for optimized bin_size selection. If `ref_genome = others`, an `annotation_track` should be provided (see below) and no optimized bin_size selection will be performed
|
|
|
| **annotation_track** | A genomic annotation track in `data.frame` or `data.table` format. This track is used to determine the A/B compartment direction when `ref_genome=others`, and should have higher values in A than in B compartments. Suggested tracks include gene density, H3K27ac, H3K4me1, H3K4me2, H3K4me3, H3K36me3 (or a negative transform of H3K9me3 signals)
|
|
|
| **bin_size** | The bin_size (resolution) at which to run CALDER. `bin_size` should be consistent with the data resolution in `contact_file_dump` or `contact_tab_dump` if these are provided as input; otherwise `bin_size` should exist in the `contact_file_hic` file. The recommended `bin_size` is between 10000 and 50000
|
|
|
| **save_dir** | The directory in which to save outputs
|