diff --git a/README.md b/README.md
index 3bf1bfc..e11bf55 100755
--- a/README.md
+++ b/README.md
@@ -32,7 +32,7 @@ If using more than one reference genome (i.e., analyzing more than one species p
 These data are the metagenomic assemblies (or assembled genomes, if those are studied) that would be compared.
 These data should be organized in per-sample assembly files - i.e., all the contigs assembled from sample X would be kept in a single file.
 If genomes are to be compared, each genome will be stored in a single fasta file. All files should be stored in the same directory (refered below as the "target directory").
-**The directory `Sample_input/Target_genomes/` contains a collection of target genomes, for the purpuse of self testing the instalation.**
+**The directory `Sample_input/Target_genomes/` contains a collection of target genomes, for the purpuse of self testing the installation.**
 
 #### c. Metadata file (optional):
 The metadata file contains information regarding the genomes/assemblies to be compared.
@@ -87,4 +87,4 @@ Changes to simplify and solve file naming procedure, and solve BLAST error resul
 a. Added script "old_to_new_names.py": changes the file names in the target folder to "Sample.xxx", fasta headers to "contig.yyy", writes the changes to a table, and assigns all sequences to a single file. It is called from find_overlapping_regions.sh.
 b. Modified "find_overlapping_regions.sh", to use "old_to_new_names.py" instead of sed command in step 1.
 c. Modified "SynTracker.R" to read the table generated by "old_to_new_names.py" and change temporarily assigned sample names back to the original sample names (i.e., file names).
-d. Changed "SynTracker_functions.R" to handle the new naming format.
\ No newline at end of file
+d. Changed "SynTracker_functions.R" to handle the new naming format.