About Compressing using GZipStream


gsrk
February 14th, 2007, 08:33 AM
Hi guys, I have the following code from the book MCTS 70-536 (Self-Paced Training Kit by Microsoft Press), Microsoft .NET 2.0 Application Development Foundation.

It is from Lesson 3, "Compressing Streams", of Chapter 2, "Input/Output":


using System;
using System.Collections.Generic;
using System.Text;
using System.IO;
using System.IO.Compression;

namespace CompressDe
{
    class Program
    {
        static void Main(string[] args)
        {
            FileStream sourceFile = File.OpenRead(@"C:\anyFile.txt");
            FileStream destFile = File.Create(@"C:\anyFile.txt.gz");

            GZipStream compStream = new GZipStream(destFile, CompressionMode.Compress);

            int theByte = sourceFile.ReadByte();
            while (theByte != -1)
            {
                compStream.WriteByte((byte)theByte);
                theByte = sourceFile.ReadByte();
            }
        }
    }
}


The problem with the code above is that the compressed file comes out larger than the original file.
So what is the benefit of compressing the file?

TheCPUWizard
February 14th, 2007, 08:39 AM
Depends on the content of the original file. Two common causes:

1) Very small input
2) Previously Compressed Input.
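
Both causes can be checked with a small in-memory experiment. The following is only a sketch and is not from the book or the posts: the class and method names (GzipSizeSketch, CompressedSize, Demo) are made up for illustration, and random bytes are used as a stand-in for already-compressed input. A gzip stream always carries a header and trailer of at least 18 bytes, so a five-byte input cannot come out smaller than it went in.

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class GzipSizeSketch
{
    // Hypothetical helper: gzip-compresses the input in memory
    // and returns the size of the compressed output in bytes.
    public static long CompressedSize(byte[] input)
    {
        using (MemoryStream output = new MemoryStream())
        {
            // The third argument (leaveOpen = true) keeps the MemoryStream
            // usable after the GZipStream has been disposed.
            using (GZipStream gzip = new GZipStream(output, CompressionMode.Compress, true))
            {
                gzip.Write(input, 0, input.Length);
            } // disposing writes any remaining data and the gzip trailer

            return output.Length;
        }
    }

    // Prints input size and compressed size for three kinds of input.
    public static void Demo()
    {
        byte[] tiny = new byte[] { 1, 2, 3, 4, 5 };   // cause 1: very small input
        byte[] zeros = new byte[100000];              // highly repetitive input
        byte[] random = new byte[100000];             // cause 2: stand-in for compressed input
        new Random(42).NextBytes(random);

        Console.WriteLine("tiny:   {0} -> {1}", tiny.Length, CompressedSize(tiny));
        Console.WriteLine("zeros:  {0} -> {1}", zeros.Length, CompressedSize(zeros));
        Console.WriteLine("random: {0} -> {1}", random.Length, CompressedSize(random));
    }
}
```

Calling GzipSizeSketch.Demo() should show the tiny input growing and the all-zero input shrinking considerably. The random input will typically come out about the same size or slightly larger, though the exact numbers depend on the .NET version.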

gsrk
February 14th, 2007, 08:53 AM
I have used large files as input, and I have tried many different types of files,

but the results always come out bigger. Is this happening only to me, or to somebody else as well?

For example, I tried to compress a 50 MB file and the result came out at 80 MB. If the file were simply not compressible, the result should not have been larger than about 51 MB.

gsrk
February 14th, 2007, 09:24 AM
I have more questions now.

When I use the method below, which I found somewhere else, a 337 KB file is compressed to 169 KB, whereas with the book's method shown above the result grows to 493 KB.

static void Main(string[] args)
{
    // Get bytes from input stream
    FileStream inFileStream = new FileStream(Path.Combine(Environment.CurrentDirectory, "C# Language Specification 2.0.doc"), FileMode.Open);
    byte[] buffer = new byte[inFileStream.Length];
    inFileStream.Read(buffer, 0, buffer.Length);
    inFileStream.Close();

    // Create GZip file stream and compress input bytes
    FileStream outFileStream = new FileStream(Path.Combine(Environment.CurrentDirectory, "C# Language Specification 2.0.doc.gzip"), FileMode.Create);
    GZipStream compressedStream = new GZipStream(outFileStream, CompressionMode.Compress);
    compressedStream.Write(buffer, 0, buffer.Length);
    compressedStream.Close();
    outFileStream.Close();
}

torrud
February 14th, 2007, 09:51 AM
Well, I believe the problem is the call to the WriteByte() method. Keep in mind that GZip uses the Deflate algorithm, which combines LZ77-style matching of repeated byte sequences with a final entropy (Huffman) coding stage. The encoding is lossless, but when the compressor is fed one byte at a time it apparently has very little to work with on each call, so you get more overhead and the output grows.

If you use a bigger input, as in your last method with the Write() method, the matching stage works much more efficiently.

That is all. :D


If you need more detailed information look at Wikipedia or into books about encoding.
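
As a rough sketch of the block-wise approach described above (not code from the book or from the posts), the file can be copied through the GZipStream in chunks rather than byte by byte. The names GzipCopySketch, CompressFile and bufferSize are hypothetical, and the paths are whatever the caller passes in. Unlike the method that reads the whole file into one array, this version keeps only one buffer in memory at a time.

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class GzipCopySketch
{
    // Hypothetical helper: compresses sourcePath into destPath,
    // handing the compressor blocks of up to bufferSize bytes per Write() call.
    public static void CompressFile(string sourcePath, string destPath, int bufferSize)
    {
        using (FileStream source = File.OpenRead(sourcePath))
        using (FileStream dest = File.Create(destPath))
        using (GZipStream gzip = new GZipStream(dest, CompressionMode.Compress))
        {
            byte[] buffer = new byte[bufferSize];
            int read;

            // Read() may return fewer bytes than requested, so write only
            // the bytes that were actually read; 0 means end of file.
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                gzip.Write(buffer, 0, read);
            }
        } // the using blocks close all three streams, which also writes the gzip trailer
    }
}
```

A call such as GzipCopySketch.CompressFile(@"C:\anyFile.txt", @"C:\anyFile.txt.gz", 64 * 1024) would then replace the WriteByte() loop. The 64 KB buffer size is an arbitrary choice for illustration, not a recommendation from the thread.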