Bioinformatics Online
 CHAPTER 12 PROBLEMS

QUESTION ON SCRIPT EXAMPLE 1—BIOPERL: READING SEQUENCES FROM A GENBANK FORMAT FILE AND WRITING THEM TO A FASTA FORMAT FILE (p. 579)
  1. Change the script to output a different format, such as EMBL, GCG, or raw.




QUESTIONS ON SCRIPT EXAMPLE 2—BIOPERL: USING THE GLOB FUNCTION TO OBTAIN A LIST OF FILES NAMED WITH A SPECIFIED EXTENSION (p. 580)
  1. What EMBOSS program would you use to rearrange a DNA sequence while retaining its initial composition? What would be a reason for doing so?
  2. What are some other EMBOSS programs that would be interesting to run on a mutated sequence?
  3. Modify the check_cpg subroutine so that it also outputs the identity and length in base pairs of the sequences. Test this by downloading a few sequences into one file instead of individual files and running the script on a file containing multiple sequences.
  4. Modify the script so that it cleans up after itself by removing the ".cpg" and ".mut" files it creates.
  5. Modify the check_cpg subroutine so that it returns the length of the longest CpG island, and add code to the main script that computes the overall maximum CpG island length for all of the sequences and prints this information.
  6. Write a BioPerl script that uses SeqIO to read a FASTA file of sequences and Bio::Factory::EMBOSS to find the longest ORF in each sequence.




QUESTIONS ON SCRIPT EXAMPLE 4—PERL: DBI DATABASE INTERFACE SCRIPT (p. 592)
  1. The script does not report a problem if neither the "load" nor "query" options are specified. Add code to detect this condition and print out a usage message.
  2. Change the script so that if no -qid option is given but all other query options are present, the database query is run for all BLAST query sequences.
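
The condition described in question 1 can be sketched with the core Getopt::Long module. This is only an illustration, not the book's script: the helper names choose_mode and usage_message are hypothetical, only the "load" and "query" flags named above are modeled, and the script's other options (such as -qid) are left out.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: return 'load' or 'query' depending on which flag
# was given, or undef when neither flag is present (or parsing fails).
sub choose_mode {
    my @args = @_;
    my ($load, $query);
    GetOptionsFromArray(\@args, 'load' => \$load, 'query' => \$query)
        or return undef;
    return 'load'  if $load;
    return 'query' if $query;
    return undef;
}

# Hypothetical helper: build the usage text, naming the script via $0.
sub usage_message {
    return "Usage: $0 -load | -query [other options]\n";
}

my $mode = choose_mode(@ARGV);
if (defined $mode) {
    print "mode: $mode\n";
}
else {
    # A real script would typically exit with a non-zero status here.
    print STDERR usage_message();
}
```

Keeping the check in a small subroutine makes it easy to try different argument lists without running the database code.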




QUESTION ON SCRIPT EXAMPLE 5—PERL: READING FILES AND LOOKING FOR TEXT PATTERNS (p. 595)
  1. To recognize both upper- and lowercase sequences, the pattern match included letters in upper- and lowercase: /([^ACGTacgt])/. Change the pattern specification to /([^ACGT])/i to get a case-insensitive match and test the script on a file containing mixed cases. Which version do you think is preferable in terms of the script's output and why?
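
The two pattern forms mentioned above can be compared side by side on short test strings before changing the script. The following is a minimal sketch; the subroutine names and the sample sequences are made up for illustration and are not taken from the script on p. 595.

```perl
use strict;
use warnings;

# Hypothetical helper: first character that is not A, C, G, or T,
# using a character class that lists both cases explicitly.
sub first_bad_explicit {
    my ($seq) = @_;
    return $seq =~ /([^ACGTacgt])/ ? $1 : undef;
}

# Hypothetical helper: the same check written with the /i modifier.
sub first_bad_insensitive {
    my ($seq) = @_;
    return $seq =~ /([^ACGT])/i ? $1 : undef;
}

# Made-up sample sequences in upper, lower, and mixed case.
for my $seq ('ACGTacgt', 'ACGTnACGT', 'acgtNacgt') {
    my $x = first_bad_explicit($seq);
    my $y = first_bad_insensitive($seq);
    printf "%-10s explicit=%s  /i=%s\n", $seq, $x // 'none', $y // 'none';
}
```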




QUESTIONS ON SCRIPT EXAMPLE 6—PERL: COMMAND LINE ARGUMENTS AND TEXT MANIPULATION (p. 598)
  1. Add a second command line argument that specifies the E-value above which to skip lines, and replace the hard-coded E-value exponent -50 in the script with a variable set to the specified E-value exponent.
  2. Change the script so that it opens an output file for writing and prints the desired lines from the BLAST output to this file.




QUESTIONS ON SCRIPT EXAMPLE 7—PERL: PATTERN SUBSTITUTION AND INCREMENTAL DEVELOPMENT OF SCRIPTS (p. 600)
  1. In the "Usage" error message, why is it better to use $0 to identify the script than putting the actual script file name in the message?
  2. Huntington's disease is characterized by the presence of 39 or more tandem CAG repeats in a gene on chromosome 4. Does this script provide a valid method of screening for such occurrences? Why or why not?
  3. Replace the three lines inside the "if(!defined $infil)" block with a single line that accomplishes the same thing. Change the pattern-matching and substitution expressions so that the call to uc is no longer needed. Add code to look for a second command line argument, "-s", for silent mode, and, if present, change the output of the definition line to omit the doctor's name. Hint: Use split, then use the string operator "." to concatenate together only the pieces you want to output.
  4. Replace the call to chomp with a pattern substitution to remove newlines from the sequence.
  5. Rewrite the script to use the Bio::SeqIO module. What advantages are provided by Bio::SeqIO?
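
For question 2, it may help to see what counting tandem repeats looks like in isolation. The sketch below is not the script from p. 600. It assumes the sequence is already held in a single string with newlines removed, and the subroutine name and test sequence are invented for illustration. The threshold of 39 is taken from the problem statement.

```perl
use strict;
use warnings;

# Hypothetical helper: number of repeat units in the longest uninterrupted
# run of CAG in a sequence, ignoring case. Newlines inside the sequence
# would break a run, so they must be stripped before calling this.
sub longest_cag_run {
    my ($seq) = @_;
    my $longest = 0;
    while ($seq =~ /((?:CAG)+)/gi) {
        my $units = length($1) / 3;
        $longest = $units if $units > $longest;
    }
    return $longest;
}

# Made-up test sequence: 40 repeats, an interrupting CAA, then 5 more.
my $seq       = 'TT' . ('CAG' x 40) . 'CAA' . ('CAG' x 5) . 'GG';
my $threshold = 39;    # from the problem statement
my $run       = longest_cag_run($seq);

print "longest CAG run: $run units\n";
print $run >= $threshold ? "at or above threshold\n" : "below threshold\n";
```

Note that the interrupted run is counted as two separate runs here, which is one of the design choices worth weighing when judging any screening method.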




 

© 2004 by Cold Spring Harbor Laboratory Press. All rights reserved.
No part of these pages, either text or image, may be used for any purpose other than personal use. Therefore, reproduction, modification, storage in a retrieval system, or retransmission, in any form or by any means, electronic, mechanical, or otherwise, for reasons other than personal use, is strictly prohibited without prior written permission.