Hi, I have a byte array that I compress with GZipStream. Say the array is always 10000 elements long. I compress it, and once file.gz is on disk I open it again to decompress it and get the data back. Then I want to compress another byte array into the same file. The length is still the same (10000 elements) but the values are different. Is there going to be residual data from the previous array inside the file?
How exactly do you compress another byte array “in the same file”? GZip is a compression format with distinct header, data and footer sections, including a checksum that marks the end of the stream. If you just “append” another stream at the end of the first one, it would need to be read separately.
We don’t know your code for writing and reading it back, so we can’t really help. When you say the values are different, is it possible that you actually overwrite the old file? Though even if you append the second stream at the end of the first, you cannot read it as one stream.
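To make the header and footer concrete, here is a minimal sketch that compresses a buffer in memory and inspects the framing bytes. The class and method names (GzipFraming, Compress, HasGzipHeader, ReadFooterLength) are made up for illustration. It relies on the gzip format (RFC 1952), where a stream starts with the bytes 0x1F 0x8B and ends with a 4-byte little-endian length of the uncompressed data.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper class, for illustration only.
public static class GzipFraming
{
    // Compresses data into an in-memory gzip stream and returns the raw bytes.
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            } // the footer is written when the GZipStream is disposed
            // ToArray still works after the MemoryStream has been closed.
            return output.ToArray();
        }
    }

    // True when the buffer starts with the gzip magic bytes 0x1F 0x8B.
    public static bool HasGzipHeader(byte[] gz)
    {
        return gz.Length >= 2 && gz[0] == 0x1F && gz[1] == 0x8B;
    }

    // Reads the last 4 bytes of the stream: the uncompressed length
    // (modulo 2^32), stored little-endian.
    public static uint ReadFooterLength(byte[] gz)
    {
        int n = gz.Length;
        return (uint)(gz[n - 4] | (gz[n - 3] << 8) | (gz[n - 2] << 16) | (gz[n - 1] << 24));
    }
}
```

If leftover bytes from an older, longer stream sit after the new one, the last 4 bytes of the file are no longer this footer, which is one way to spot residual data.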
Hi Bunny83
When you say the values are different, is it possible that you actually overwrite the old file?
This is what I want, I think. I don’t want to append at a position; I want to replace the whole file while keeping the same file name in the same folder.
So it would be analogous to:
- have a file.gz in a folder
- delete it
- then create another in the same folder with the same name, but the elements stored in the byte array that the new one compresses are different while the length of the arrays compressed in the first and second files are the same.
Here is the code:
arrayOfBytes = new byte[10000];
for (int i = 0; i < 10000; i++)
{
arrayOfBytes[i] = 0; // just fill the array with 0s
}
filePath = Path.Combine(currentFolderPath, "file.gz");
using (FileStream fileStream = new FileStream(filePath, FileMode.Create))
using (GZipStream compressionStream = new GZipStream(fileStream, CompressionMode.Compress))
{
compressionStream.Write(arrayOfBytes, 0, arrayOfBytes.Length);
}
using (FileStream fileStream = new FileStream(filePath, FileMode.Open))
using (GZipStream decompressionStream = new GZipStream(fileStream, CompressionMode.Decompress))
{
using (MemoryStream memoryStream = new MemoryStream())
{
decompressionStream.CopyTo(memoryStream);
decompressedData = memoryStream.ToArray(); // this holds the decompressed bytes
}
}
newArrayOfBytes = new byte[10000];
for (int i = 0; i < 10000; i++)
{
newArrayOfBytes[i] = (byte)(i % 255); // So now this array has different values than "arrayOfBytes"
}
filePath = Path.Combine(currentFolderPath, "file.gz"); // Just for clarity: the filePath is the same as the previous one, but now I use FileMode.Open instead of FileMode.Create
using (FileStream fileStream = new FileStream(filePath, FileMode.Open))
using (GZipStream compressionStream = new GZipStream(fileStream, CompressionMode.Compress))
{
compressionStream.Write(newArrayOfBytes, 0, newArrayOfBytes.Length);
}
So my question was, is the file.gz going to have residual data from the previously created file.gz?
Yes, potentially, because the second time you use FileMode.Open, which opens the existing file and places the cursor / position at the start. Writing to the file then overwrites the existing data from the beginning. However, the size of the compressed stream can vary. If the second set of data compresses better, the new stream ends up shorter than the old one, and the tail of the old stream is left over after the end of the new one. This should not really matter as the GZipStream would stop once it reaches the footer / the end of the new stream. Though of course it is not something you want.
The solution is to use the correct FileMode which is Create again. FileMode.Create will create a new file if it doesn’t exist yet or replace it if it exists already. This makes sure the file starts empty.
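As a minimal sketch of that approach, the two helpers below write and read the file. The names GzipFile, Write and Read are hypothetical and not from the code above. Calling Write twice on the same path replaces the file each time, so nothing from the first stream survives.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper class, for illustration only.
public static class GzipFile
{
    // Replaces the file at path with a gzip stream containing data.
    public static void Write(string path, byte[] data)
    {
        // FileMode.Create creates the file, or empties it first if it already exists.
        using (var file = new FileStream(path, FileMode.Create))
        using (var gzip = new GZipStream(file, CompressionMode.Compress))
        {
            gzip.Write(data, 0, data.Length);
        }
    }

    // Decompresses the whole file and returns the original bytes.
    public static byte[] Read(string path)
    {
        using (var file = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (var gzip = new GZipStream(file, CompressionMode.Decompress))
        using (var buffer = new MemoryStream())
        {
            gzip.CopyTo(buffer);
            return buffer.ToArray();
        }
    }
}
```

With these, writing the zero-filled array, then the second array, and then calling Read should give back exactly the second array.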
You could also use Open, write the new stream over the old one, make sure you flush the stream so everything is actually written out, and then call SetLength on the file stream with the current position. This will truncate the file at the current position.
using (FileStream fileStream = new FileStream(filePath, FileMode.Open))
using (GZipStream compressionStream = new GZipStream(fileStream, CompressionMode.Compress))
{
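compressionStream.Write(newArrayOfBytes, 0, newArrayOfBytes.Length);
compressionStream.Flush();
// Truncate at the current position. Anything GZipStream still writes when it is
// disposed (such as the footer) lands after this point and extends the file again.
fileStream.SetLength(fileStream.Position);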
}
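A variant of the same idea, sketched under the assumption that you want the complete stream (including the footer) on disk before truncating: dispose the GZipStream first while keeping the file open, then cut the file at the final position. The class and method names (GzipOverwrite, OverwriteInPlace) are hypothetical. The leaveOpen constructor argument of GZipStream is what keeps the FileStream usable after the compressor is disposed.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper class, for illustration only.
public static class GzipOverwrite
{
    // Overwrites an existing file in place and removes any leftover tail.
    public static void OverwriteInPlace(string path, byte[] data)
    {
        // FileMode.Open keeps the old contents and starts writing at position 0.
        using (var file = new FileStream(path, FileMode.Open, FileAccess.Write))
        {
            using (var gzip = new GZipStream(file, CompressionMode.Compress, leaveOpen: true))
            {
                gzip.Write(data, 0, data.Length);
            } // disposing writes the remaining compressed data and the footer

            // Everything after the end of the new stream is old data: cut it off.
            file.SetLength(file.Position);
        }
    }
}
```

In practice FileMode.Create does the same job with less code, so this is mainly useful when the file has to stay open for other reasons.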
This should not really matter as the GZipStream would stop once it reaches the footer / the end of the new stream.
Ohh, I see. So when I decompress the file a second time I should always get the correct byte array back even if there’s residual data from the previous array, because the start and end that the decompressor uses are stored somewhere in the file when compressing it(?). I don’t mind if that’s the case, as long as I can always get the correct byte array back and the residual data doesn’t accumulate beyond some maximum size, which I assume would be the worst case for the compression. Although that would be dumb and inefficient, I guess, so I’ll probably go with the Create one.
Anyways, I’ll try your suggestion, thanks!