Encryption using a Genetic Algorithm?
I'm looking for some advice on my current project using Genetic Algorithms. Although I was supposed to write a "small" program demonstrating GAs in some way for my university course work I'm very interested in seeing what it can do further.
The program I've written basically splits any file into two strings of bytes and uses 'crossover' at both a byte and bit level to encrypt the data, using a key entered by the user. The numbers the user enters are used as intersection points to decide where to splice the byte strings.
The first pass through the data swaps/crosses bytes between the two strings. The second pass then swaps bits between the strings. Once both passes are done the strings are written back to a new file.
I'm interested in anyone's views on how secure this method of encryption could be? I'm by no means a professional programmer yet but I'm interested in how this method would compare to conventional XOR encryption. Or if you have any ideas on how this method could be improved?
So far I've only tested the algorithm with small text files and jpegs. Looking at the resulting files I was impressed to find that it removed patterns in the data and encrypted the files extremely quickly (in comparison to using another standard encryption program).
Anyone have any thoughts on this?
Leon ^^
Re: Encryption using a Genetic Algorithm?
Interesting. The rule of thumb is that any encryption you invent is probably much less secure than the state-of-the-art, but it's at least a somewhat different approach to the concept.
Re: Encryption using a Genetic Algorithm?
Quote:
The program I've written basically splits any file into two strings of bytes and uses 'crossover' at both a byte and bit level to encrypt the data, using a key entered by the user. The numbers the user enters are used as intersection points to decide where to splice the byte strings.
The first pass through the data swaps/crosses bytes between the two strings. The second pass then swaps bits between the strings. Once both passes are done the strings are written back to a new file.
Any concrete example that illustrate this point ?
Thanks.
Re: Encryption using a Genetic Algorithm?
Quote:
Originally Posted by
Peter_APIIT
Any concrete example that illustrate this point ?
Thanks.
Once again....What does that have to do with the question??
Security of an encryption algorithym depends on a number of factors. Part of which is the inherent computationational power required to run through permutations, part of which is how much encrypted information is available for inspection, part of it is how "expected" the information is.
Consider: "DJUEFBC" That is an encrypted peice of information. No one will ever be able to conclusively determine what it means unless I provide more samples or a context in which the decoded information can be validated. It is 100% secure.
What usually matters is the viability of the attack surface. If it takes a super-computer 1 day to crack a nations security, then someone will have access to that hardware and will utilize it. It is NOT very secure. On the othe-hand if the exact same effort is required to get the password for my CodeGuru account, it is very unlikely that anyone will expend the effort.
Ironically, in a security challenge (well over a decade ago), I made a submission that usilized 63 different "insecure" methods in combination. The realitry is that it was not secure at all, but because no one expected such a "childish" technique I actually placed 3rd in the competition.
Re: Encryption using a Genetic Algorithm?
Quote:
Originally Posted by
Leon4116
I'm interested in anyone's views on how secure this method of encryption could be? I'm by no means a professional programmer yet but I'm interested in how this method would compare to conventional XOR encryption. Or if you have any ideas on how this method could be improved?
Without knowing too much I think your procedure would be characterized as a Feistel cipher,
http://en.wikipedia.org/wiki/Feistel_cipher
If I got it right the crossover points appear at certain fixed intervals? It would be more confusing if the crossover points would appear at random intervals instead. To accomplish this you could use the number the user enters as seed for a random generator. The number sequence it generates for that seed would then be used to produce a sequence of "random" integers between say 1 and 100 which would be used as crossover points.
This would increase the complexity of the cipher because both the seed and the random generator algorithm must be found out for deciphering. Also the cipher becomes more "genetic" because in nature crossover points are random.
Re: Encryption using a Genetic Algorithm?
I did post this message once before but for some reason it didn't show so I'll try posting it again
Quote:
Originally Posted by
_uj
Without knowing too much I think your procedure would be characterized as a Feistel cipher,
Thanks for the Wiki link I did read the information on the Feistel cipher, but again this is a standard XOR encryption algorithm where the bits in the original file are flipped using a hexadecimal key. Obviously algorithms like DES are a bit more complicated than simply using XOR on the entire file, but my point is that Genetic Algorithms don't do this.
If you were to use a genetic algorithm on these two bytes:
00000000 and 11111111
Take one intersection point [5] in the centre and they become:
00001111 and 11110000
This is basically what the program does but with kilobytes and megabytes. The key is used to define the intersection points of the byte strings. However, the program swaps whole strings of bytes first, and then swaps the bits in the strings.
It would be possible to use a random number generator to create new intersection points but I have a feeling that may work against the algorithm's security since a computer may be able to work backwards once the generator used is identified. Besides its just as easy to use the original key rather than spend time generating a new one. Its an interesting idea however. If the user only wanted to use a small key, then that key could be used to generate a longer and more secure key.
Quote:
Originally Posted by
Peter_APIIT
Any concrete example that illustrate this point ?
Thanks.
I copied a paragraph from Wikipedia about GAs into a text file and then ran it through the algorithm to produce a new text file and observe the output. I also added a series of dashes in the middle of the file to see how the algorithm copes with repeating patterns. The program goes through two levels of encryption using the same user defined key. The first run through crosses whole bytes to mix up the file and remove any repeating patterns. The result of this stage is a kind of hexadecimal word jumble. The second run through crosses sections of bits to effectively create new sets of bytes.
Here is the original text file:
Quote:
A genetic algorithm (GA) is a search technique used in computing to find exact or approximate solutions to optimization and search problems. Genetic algorithms are categorized as global search heuristics.
--------------------------------------------------
Genetic algorithms are a particular class of evolutionary algorithms (also known as evolutionary computation) that use techniques inspired by evolutionary biology such as inheritance, mutation, selection, and crossover (also called recombination).
Here is the file after the first run through. I had it output to a text file so we can see the stages working:
Quote:
A ge
ic algocithm rith is re earchttechniqce us o invoomputingy alfinithxactals approx mate oluttionry o opttmizatiot andusearec probleminspened b algoluthms ar categy szed as globai searchmheurioticse
--- and---oss---- (a-------l-----comb----ion------net
Genetir algo(GA)ms aa sa par icular ulassedf e clutionar to gord ems ( oro knownias evsoluionas tcompuiation) nhat se tchhniques s. Gireticy evoritionaryebiologoriuch as inherltance, utatisn, s.lection,---- cr---over---lso cal-ed re----inat---).
And here is the final result after the second run through:
Quote:
M mm
liCaamdmcatlm`rITh$yp be0earcj0xgelfa0cdausam"anboomputonba0i`finetm|c`xalr spsoz ea`e`olutpinbrq4g opuqaiyo|i,t`and4searac`pboblu`qnsDmned!c(aegodutios!bp cmdegi`s|eh as olnfbl$qncrk(mydupiovi#s%`-jm
/-,,inl%-,oq5//%#((!-}o-/m-l%m!}mc/-b)-!5ion--!-'%nm
g$ndwkr(ahg/(ga)ic#aq ca papdeashizaum ss$fd(e4clutiohaw8dg,gorh `iq#$ osoahnnwliis4e6solumooms(`komptlmtjaf)cn(at ase tgh(nyquec-k. sarethbq lvozithmn`q{ebic|og2iscd as aniaqipceae$` duavisl,`sn,Eimn-!%-%(cs-/kmtm|%-ml#- a!l-m$,"%-mm-mnml---).
I could make the encryption more secure by running the file through the algorithm multiple times to further mutate the encrypted data. I also have an idea for a self-mutating key that changes itself after each pass through the file.
Leon
Re: Encryption using a Genetic Algorithm?
Nice, but, maybe I'm missing something with the idea to mutate the key... if you mutate the key, then you'll need to write the key mutation data (i.e the data required to know how to demutate the key) into the encrypted data right? But what have you bought yourself? All you have done is obfuscated the key to the user that supplied the key in the first place. The key to decrypt the data is still the original key right - so what have you bought? If not, then it is still a of length n, therefore, it is still no harder to crack for a machine (assuming that the original key was not a dictionary word or a permuation of one).
The only time I can think of when you would want to mutate a key, is when you need to hide the key information inside the encryption, but hiding the key in the encryption is simply a bad idea.
Re: Encryption using a Genetic Algorithm?
Quote:
Originally Posted by
Leon4116
[B]If you were to use a genetic algorithm on these two bytes:
00000000 and 11111111
Take one intersection point [5] in the centre and they become:
00001111 and 11110000
To be formal, note that this doesn't constitute a full genetic algorithm. This is only how chromosomes (solutions) are scrambled to produce new chromosomes (hopefylly better solutions).
So what does your proposed scrambling scheme look like? Say you have a file,
a1, a2, a3, a4 | b1, b2, b3, b4
The bar marks the middle of the file. The question marks mark arbitrary intersection points on the file.
Now after a scrambling the file looks like this (if I got you right),
a1, b2, a3, b4 | b1, a2, b3, a4
What has happened? Well, 4 of the file sections, namely a1, a3, b1 and b3, just remain where they were. Then a2 has been swapped with b2, and a4 with b4.
So the file is split in the middle and each half is divided exactly the same into sections at certain intersection points determined by the key. Then every other section on one half is swapped with the corresponding section on the other half. Finally the halves are merged together again forming the scrambled file.
This seems to correspond to a transposition cipher, it's just that file parts are transposed, rather than symbols. This also means that existing symbols may get "cut up" producing new symbols in the scrambling process. So it's a transposition cipher working at the symbol representation level. I know too little about ciphers to judge how much of a deciphering challenge this approach represents.
Re: Encryption using a Genetic Algorithm?
Quote:
Originally Posted by
PredicateNormative
Nice, but, maybe I'm missing something with the idea to mutate the key... if you mutate the key, then you'll need to write the key mutation data (i.e the data required to know how to demutate the key) into the encrypted data right? But what have you bought yourself? All you have done is obfuscated the key to the user that supplied the key in the first place. The key to decrypt the data is still the original key right - so what have you bought? If not, then it is still a of length n, therefore, it is still no harder to crack for a machine (assuming that the original key was not a dictionary word or a permuation of one).
The only time I can think of when you would want to mutate a key, is when you need to hide the key information inside the encryption, but hiding the key in the encryption is simply a bad idea.
I don't think it's meant to make the key "more secure" with regard to brute force methods, but to make the cyphertext less easily analyzed by mathematical or intuitive means. At the risk of revealing that I'm hardly qualified on the subject of cryptography, I would say it's the same idea behind stream cyphers.
Re: Encryption using a Genetic Algorithm?
Quote:
Originally Posted by
_uj
To be formal, note that this doesn't constitute a full genetic algorithm. This is only how chromosomes (solutions) are scrambled to produce new chromosomes (hopefylly better solutions).
Yes sorry I should have named this thread "Encryption using Genetic Crossover", which is really all it does. A full Genetic Algorithm is used to find best-fit solutions to problems which is not what I'm proposing. All the program does is split a file into sections (chromosomes) and crosses them to produce new data. The points at which the data is crossed are defined by the user key. So theoretically you couldn't read the data without knowing the original key to cross the bits and bytes back into their original positions.
Quote:
Originally Posted by
Hermit
I don't think it's meant to make the key "more secure" with regard to brute force methods, but to make the cyphertext less easily analyzed by mathematical or intuitive means. At the risk of revealing that I'm hardly qualified on the subject of cryptography, I would say it's the same idea behind stream cyphers.
Exactly what I was hoping to achieve. If one knows the encryption method used it doesn't matter how many times you mutate the key or what algorithm you used, the only important factor is the original key, which depending on the maximum number of combinations the key could be makes the data more secure from brute-force decryption methods.
The only advantage I can suppose to using Genetic Crossover in file encryption is that it removes repeating patterns quite proficiently; something that can be a security risk when using conventional encryption methods. Current ciphers and algorithms (as far as I know) have to process the files using considerable overhead to remove those repeating patterns, which might also make Genetic Crossover faster in that respect. Even using Genetic Crossover before a standard cipher might make the code more secure and reduce overhead.
Leon
Re: Encryption using a Genetic Algorithm?
Quote:
Originally Posted by
Leon4116
Yes sorry I should have named this thread "Encryption using Genetic Crossover", which is really all it does.
You also asked,
"I'm interested in anyone's views on how secure this method of encryption could be?"
Now, the purpose of my post was to show that your cipher is equivalent to a transposition cipher, with the special feature that it takes place at the symbol representation level. The latter gives rise to something that seems to be known as fractionation, or symbol splitting.
http://en.wikipedia.org/wiki/Transposition_cipher
Your question can now be answered with: This cipher is as secure as a transposition cipher with fractionation.
Re: Encryption using a Genetic Algorithm?
I guess Genetic Crossover should be added to Wikipedia as yet another cipher method then LoL. It'll be interesting to see how it compares in encryption time to the others once I've completed it. Unfortunately I had to submit what I'd already done today so any extra work will be on my own time, still a very interesting project though. I tend to find I become quite obsessed with improving my programs even after the work doesn't count. I'm glad that it might still be as useful as other ciphers, not a waste of time hopefully.
I actually have another question, going off at a slight tangent here, but I was thinking about the temporary files my program creates when it splits the original file into pieces. I found that trying to allocate temporary memory for large files tended to crash my program, especially in the ranges of hundreds of megabytes. My first thought was to split the original file into smaller files and open them for both input and output and just buffer the data in main memory in the order of a kilobyte. I still don't feel very easy however about overwriting a file I'm also reading from at the same time, but it also seems very wasteful to create lots of files, essentially duplicating or tripling the amount of storage needed on the hard drive. It might not be a problem for music files or jpegs, but what about 800 megabyte iso's, or gigabyte archives? What would be a safe method for conserving hard-drive space and main memory?
Leon