I am currently developing a WPF application to extract a sequence from a file loaded into a rich text box. So far I have been able to load the file (a PDB file, which is basically a text file and can be opened in WordPad) into the rich text box on my form. What I am having difficulty with is extracting the information I want from this rich text box.
I have enclosed my application and the raw data above (2B0F), and would be very grateful if you could review it and help me get this particular aspect working. A PDB file is a Protein Data Bank file, which describes a protein structure, and it is big because the data it contains is extensive.
Basically, what I intend is to load the .pdb file into the rich text box, and then extract the sequence data. This data looks like:
MODEL 1
ATOM 1 N GLY A 1 296.995 21.768 -6.913 1.00 0.00 N
ATOM 2 CA GLY A 1 296.814 21.433 -8.356 1.00 0.00 C
ATOM 3 C GLY B 1 295.982 20.178 -8.579 1.00 0.00 C
ATOM 4 O GLY B 1 295.381 19.654 -7.643 1.00 0.00 O
ATOM 5 HA2 GLY B 1 296.321 22.263 -8.844 1.00 0.00 H
ATOM 6 HA3 GLY B 1 297.784 21.294 -8.810 1.00 0.00 H
ATOM 7 H1 GLY A 1 297.491 22.677 -6.814 1.00 0.00 H
and so on, with a whole lot of data above it at the top of the file that I do not need at the moment (you will need to scroll down in my application to see the data I mean once you have loaded the PDB file). All I need from each line is the number that follows the chain letter (the residue number, not the atom serial number) and the three-letter chemical code from the centre, divided up by chain: chain A contains all the lines marked A, and chain B contains all the lines marked B. I would like to extract the sequence from the data above and put it into a rich text box, so that it looks like this (this is the full sequence from the file I have enclosed):
>Chain_A
1 - GLY
2 - PRO
3 - ASN
4 - THR
5 - GLU
6 - PHE
7 - ALA
8 - LEU
9 - SER
10 - LEU
11 - LEU
12 - ARG
13 - LYS
14 - ASN
15 - ILE
16 - MET
17 - THR
18 - ILE
19 - THR
20 - THR
21 - SER
22 - LYS
23 - GLY
24 - GLU
25 - PHE
26 - THR
27 - GLY
28 - LEU
29 - GLY
30 - ILE
31 - HIS
32 - ASP
33 - ARG
34 - VAL
35 - CYS
36 - VAL
37 - ILE
38 - PRO
39 - THR
40 - HIS
41 - ALA
42 - GLN
43 - PRO
44 - GLY
45 - ASP
46 - ASP
47 - VAL
48 - LEU
49 - VAL
50 - ASN
51 - GLY
52 - GLN
53 - LYS
54 - ILE
55 - ARG
56 - VAL
57 - LYS
58 - ASP
59 - LYS
60 - TYR
61 - LYS
62 - LEU
63 - VAL
64 - ASP
65 - PRO
66 - GLU
67 - ASN
68 - ILE
69 - ASN
70 - LEU
71 - GLU
72 - LEU
73 - THR
74 - VAL
75 - LEU
76 - THR
77 - LEU
78 - ASP
79 - ARG
80 - ASN
81 - GLU
82 - LYS
83 - PHE
84 - ARG
85 - ASP
86 - ILE
87 - ARG
88 - GLY
89 - PHE
90 - ILE
91 - SER
92 - GLU
93 - ASP
94 - LEU
95 - GLU
96 - GLY
97 - VAL
98 - ASP
99 - ALA
100 - THR
101 - LEU
102 - VAL
103 - VAL
104 - HIS
105 - SER
106 - ASN
107 - ASN
108 - PHE
109 - THR
110 - ASN
111 - THR
112 - ILE
113 - LEU
114 - GLU
115 - VAL
116 - GLY
117 - PRO
118 - VAL
119 - THR
120 - MET
121 - ALA
122 - GLY
123 - LEU
124 - ILE
125 - ASN
126 - LEU
127 - SER
128 - SER
129 - THR
130 - PRO
131 - THR
132 - ASN
133 - ARG
134 - MET
135 - ILE
136 - ARG
137 - TYR
138 - ASP
139 - TYR
140 - ALA
141 - THR
142 - LYS
143 - THR
144 - GLY
145 - GLN
146 - CYS
147 - GLY
148 - GLY
149 - VAL
150 - LEU
151 - CYS
152 - ALA
153 - THR
154 - GLY
155 - LYS
156 - ILE
157 - PHE
158 - GLY
159 - ILE
160 - HIS
161 - VAL
162 - GLY
163 - GLY
164 - ASN
165 - GLY
166 - ARG
167 - GLN
168 - GLY
169 - PHE
170 - SER
171 - ALA
172 - GLN
173 - LEU
174 - LYS
175 - LYS
176 - GLN
177 - TYR
178 - PHE
179 - VAL
180 - GLU
181 - LYS
182 - GLN
>Chain_B
1 - LEU
2 - GLU
3 - ALA
4 - LEU
5 - PHE
6 - GLN
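To make the intended logic concrete, here is a minimal sketch of the extraction, written in Python purely for illustration rather than in WPF/C#. The function names `extract_sequences` and `format_sequences` are hypothetical. It assumes the ATOM lines are whitespace-separated exactly as in the sample above; real PDB files use fixed columns (residue name in columns 18-20, chain in column 22, residue number in columns 23-26), so slicing by column position would be more robust on the actual file.

```python
def extract_sequences(pdb_text):
    """Collect (residue_number, residue_name) pairs per chain from ATOM lines."""
    chains = {}   # chain letter -> list of (residue_number, residue_name)
    seen = set()  # (chain, residue_number) pairs already recorded
    for line in pdb_text.splitlines():
        fields = line.split()
        # Assumed layout, as in the sample:
        # ATOM serial atom_name residue_name chain residue_number x y z ...
        if len(fields) < 6 or fields[0] != "ATOM":
            continue  # skip headers, MODEL lines, and anything else
        res_name, chain_id, res_number = fields[3], fields[4], fields[5]
        key = (chain_id, res_number)
        if key in seen:
            continue  # another atom of a residue we already have
        seen.add(key)
        chains.setdefault(chain_id, []).append((res_number, res_name))
    return chains


def format_sequences(chains):
    """Render the chains in the '>Chain_X' / 'N - RES' layout shown above."""
    lines = []
    for chain_id, residues in chains.items():
        lines.append(">Chain_" + chain_id)
        for res_number, res_name in residues:
            lines.append(f"{res_number} - {res_name}")
    return "\n".join(lines)
```

Because each (chain, residue number) pair is recorded only once, the many atoms of one residue collapse to a single line, and repeated models in the same file would not duplicate the sequence. In the WPF application the input text would come from the rich text box and the formatted result would go into the second one.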
I would be very grateful if you were able to help me, as I am somewhat new to WPF programming.
Thank you once again,
Timothy Johansen.
Web Designer / Programmer