Problem A - DNA random walks ---------------------------- The sequencing of the complete genomes of several organisms over the last decade has produced a large amount of data that needs to be analyzed. Genomic (DNA) data are large strings over 4 characters -- a,c,t,g. Analyzing whole genomes is challenging for many reasons. (a) The datasets are very large, up to hundreds of millions. VERY efficient algorithms are required -- a quadratic algorithm is rarely feasible. (b) For many machines, a dataset is too large to fit in the main memory, thus requiring sophisticated external-memory algorithms that read in and act on portions of the dataset at any time. (c) Since the nucleotides are long strings over an alphabet of size 4, there are many patterns that occur randomly. (d) Biologists do not know many things about the organisms which prevents computer scientists and mathematicians from building clean mathematical models. In the absence of good models, researchers have been forced to try different techniques that have proved to be useful in other fields of computational science. DNA random walks is one such technique. A random walk is a sequence of state values traversed by some entity. It is so named to indicate that a step is taken randomly, independent of the past steps. Random walks are characterized by the dimension of the state space. In this problem we will consider a 1-dimensional random walk -- i.e. the state is a single integer variable. DNA random walks are not really random in that a step depends on the next nucleotide read. The walk starts at 0 and steps are defined as follows: if the next nucleotide is a "a" or a "g" take one step down or decrement the position by 1; if the next nucleotide is a "t" or a "c" take one step up or increment the position by 1. Mathematicians have studied random walks for many years and quite a lot is known about the properties of random walks. DNA random walks reveal complicated dependences on history and use sophisticated mathematics. In this problem we are interested in two very simple properties of walks -- the maximum and the minimum. You program will read in a real genome dataset, taken from NCBI (a repository of such data). Input Format ------------ The input will contain a single instance, which is a genome data set from NCBI. There are several lines containing annotations and other information about the genome. Then there is a line containing only the word "ORIGIN". This is followed by many lines containing nucleotides; these lines have line numbers as well as blanks in addition to the nucleotides (see sample inputs). The last line only contains the string "//". Output Format ------------- A single line containing two integers -- the minimum and the maximum of the random walk. Sample Input ------------ LOCUS NC_XXXXXX 10000 bp DNA linear INV 16-OCT-2003 DEFINITION X...................................., complete sequence. ACCESSION XXXXXXXXX REGION: 1..10000 VERSION .......... ORIGIN 1 cctaagccta agcctaagcc taagcctaag cctaagccta agcctaagcc taagcctaag // Sample Output ------------- 0 3