PhillipSusi
I see several problems with this scheme:
1) xdeltas on gzipped data tend to be very inefficient. A small change in the original data set tends to make the gzip stream radically different.
2) It requires that the user still have the old package on hand to patch.
I have a different proposal:
Have the delta-deb contain the full control info, plus xdeltas for all of the non-config files in the data.tar section. That way the user does not need the original .deb: if they have the package installed, they only need to download the xdeltas for the installed files and patch them in place.
The package system already knows which files are config files and which are not, and it knows the md5 sums for those files so it can verify that they are correct before patching them with the xdelta.
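The check described above can be sketched in a few lines. This is only an illustration under stated assumptions: the names `md5_of` and `safe_to_patch` are hypothetical, and the recorded checksum and the config-file flag are passed in by the caller rather than read from any real package database.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def safe_to_patch(path: Path, recorded_md5: str, is_conffile: bool) -> bool:
    """Decide whether an installed file may be patched in place.

    Hypothetical policy for illustration: config files are never patched,
    and any other file is patched only if it exists and still matches the
    checksum recorded when the package was installed.
    """
    if is_conffile:
        return False
    return path.is_file() and md5_of(path) == recorded_md5
```

A file that fails this check has been modified or is missing, so a delta could not be applied to it reliably and the client would have to fall back to the full package.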
JohnMccabeDansted
I have noticed that bsdiff always produced smaller diffs than xdelta, usually more than 10% smaller, sometimes over 80% smaller. Perhaps bsdiff should be used instead? See my [http://www.livejournal.com/users/flyingreptile/101020.html blog entry] for raw data.
It has been suggested that [http://zsync.moria.org.uk/ zsync] be used instead. This is similar to the rsync algorithm, but has the advantage of working well with gzipped files created without the --rsyncable option, and does not need any special software running on the server. Also, although zsync requires .zsync files to be put up on the web server, these are small control files, only about 1% of the size of the files to be downloaded.
Compared with the bsdiff/xdelta proposal, this has the advantage that we can put up a single .zsync file for each deb, and users can use zsync to reduce the bandwidth required for the download regardless of which package they have. In fact, zsync will work with the output of dpkg-repack, so you can use zsync to upgrade installed packages for which we no longer have the original deb file.
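To make the idea of a single control file per deb concrete, here is a heavily simplified sketch of rsync-style block reuse. It is not zsync's actual algorithm or file format: real tools use a cheap rolling checksum plus a strong checksum and, in zsync's case, special handling for gzip streams. The block size and the names `block_hashes` and `plan_download` are assumptions made for this illustration.

```python
import hashlib
from typing import List, Optional, Tuple

BLOCK = 4  # tiny block size, chosen only to keep the demo readable


def block_hashes(data: bytes, block: int = BLOCK) -> List[str]:
    """Server side: one hash per fixed-size block of the new file.

    This list plays the role of the small control file.
    """
    return [
        hashlib.sha1(data[i:i + block]).hexdigest()
        for i in range(0, len(data), block)
    ]


def plan_download(
    old: bytes, control: List[str], block: int = BLOCK
) -> List[Tuple[int, Optional[int]]]:
    """Client side: for each new block, find it in the old data if possible.

    Returns (new_block_index, offset_in_old) pairs; the offset is None when
    the block is not present locally and must be fetched from the server.
    Every byte offset of the old data is hashed here, which is slow; a
    rolling checksum is what makes this step cheap in real tools.
    """
    have = {}
    for offset in range(0, max(len(old) - block + 1, 0)):
        key = hashlib.sha1(old[offset:offset + block]).hexdigest()
        have.setdefault(key, offset)
    return [(index, have.get(digest)) for index, digest in enumerate(control)]


if __name__ == "__main__":
    old_file = b"AAAABBBBCCCC"
    new_file = b"AAAAXXXXCCCC"
    for index, offset in plan_download(old_file, block_hashes(new_file)):
        print(index, "reuse from local offset" if offset is not None else "download", offset)
```

Because the client only needs the list of block hashes, the same control file serves every user, whatever old data they happen to hold locally.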
However, zsync does not currently support the "ar" format used by deb, although it should be "easy" to modify to support this format. Even if it were modified to support ar files, it could not be used for debs packed using bzip2 rather than gzip. Furthermore, while zsync reduces the download by 70%, bsdiff is able to reduce the download by 90-95%. This means that bsdiff would make updating over a 56K modem feasible.
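The difference between a 70% and a 90-95% reduction is easier to judge as download time. The sketch below does that arithmetic; the 100 MB update size is a hypothetical figure chosen for illustration, the function name `download_seconds` is made up, and 56,000 bit/s is the nominal modem rate, which real connections rarely reach.

```python
def download_seconds(
    size_bytes: float, reduction: float, bits_per_second: float = 56_000
) -> float:
    """Time to fetch what is left of a download after a fractional reduction.

    reduction is the fraction saved, e.g. 0.70 for a 70% smaller download.
    """
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    remaining_bytes = size_bytes * (1.0 - reduction)
    return remaining_bytes * 8 / bits_per_second


if __name__ == "__main__":
    update_size = 100_000_000  # hypothetical: 100 MB of updated packages
    cases = [
        ("full download", 0.0),
        ("zsync, 70% saved", 0.70),
        ("bsdiff, 90% saved", 0.90),
        ("bsdiff, 95% saved", 0.95),
    ]
    for label, reduction in cases:
        minutes = download_seconds(update_size, reduction) / 60
        print(f"{label}: about {minutes:.0f} minutes")
```

With the hypothetical 100 MB figure, the full download takes about four hours at the nominal rate, a 70% reduction leaves roughly 71 minutes, and a 90-95% reduction leaves roughly 12 to 24 minutes.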
For this reason I propose that we have bsdiff-based debdiffs against all the files on the official CD(s). This means that the user can immediately upgrade their Ubuntu install without worrying about bandwidth. Since they have just installed Ubuntu, the CD will be in the drive with all the deb files needing to be patched. Because the bsdiffs are only against the files on the CD, this will need only about 100MB of additional space on the servers.
We may also put up debdiffs against files that have been updated in the last 10 days. This means that people can regularly and efficiently keep their machines up to date without needing more than a few extra MB on the servers.
* New: Unfortunately users may not have access to the original deb files on their install cd, as the live cd and install cd are to be merged.
* Also, I know that many people believe that zsync could not possibly work effectively on gzipped files if --rsyncable is not used. However, please read how zsync achieves this in [http://zsync.moria.org.uk/paper/ this paper].
* It is possible that the Ubuntu debs will be switched from gzip/bzip2 to the 7z format. Although it looks possible to modify zsync to support 7z, this would probably not be an easy task.
Maxim
1) Using xdelta is stupid; it has a lot of problems. bsdiff outperforms it in all tasks.
2) I think it would be wise to provide bsdiff deltas against the packages that were shipped on the CD, and zsync files for all packages. Adding --rsyncable support for 7z archives is a very easy task; it does not require modifying the lzma code at all.
APTPackageDeltas/talk (last edited 2008-08-06 16:15:51 by localhost)