UnitsPolicy
This policy is currently only a draft!
Rationale
There are two ways to represent large numbers: in multiples of 1000 = 10^3 (base 10) or in multiples of 1024 = 2^10 (base 2). If you divide by 1000, you use the SI prefix names; if you divide by 1024, you should use the IEC prefix names. The problem starts with dividing by 1024: many applications use the SI prefix names for it and only some use the IEC prefix names. The current situation is a mess. If you see SI prefix names, you do not know whether the number was divided by 1000 or by 1024. There is already a Brainstorm idea for it: Fix file size confusion.
Policy
Applications must use IEC standard for base-2 units:
- 1 KiB = 1,024 bytes (Note: big k)
- 1 MiB = 1,024 KiB = 1,048,576 bytes
- 1 GiB = 1,024 MiB = 1,048,576 KiB = 1,073,741,824 bytes
- 1 TiB = 1,024 GiB = 1,048,576 MiB = 1,073,741,824 KiB = 1,099,511,627,776 bytes
Applications must use SI standard for base-10 units:
- 1 kB = 1,000 bytes (Note: small k)
- 1 MB = 1,000 kB = 1,000,000 bytes
- 1 GB = 1,000 MB = 1,000,000 kB = 1,000,000,000 bytes
- 1 TB = 1,000 GB = 1,000,000 MB = 1,000,000,000 kB = 1,000,000,000,000 bytes
It is not allowed to use the SI standard for base-2 units:
- 1 kB != 1,024 bytes
- KB (with a big k) does not exist
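The unit tables above can be sketched as a small formatting helper. This is only an illustration of the policy, not code from any Ubuntu package: the function name `format_size`, the `binary` parameter, and the one-decimal output format are assumptions made for this example.

```python
# Prefix tables taken from the policy above.
SI_PREFIXES = ["k", "M", "G", "T"]       # base-10: kB, MB, GB, TB
IEC_PREFIXES = ["Ki", "Mi", "Gi", "Ti"]  # base-2: KiB, MiB, GiB, TiB


def format_size(num_bytes: int, binary: bool = False) -> str:
    """Format a byte count with SI (base-10) or IEC (base-2) prefixes.

    Hypothetical helper: the name and signature are assumptions.
    """
    step = 1024 if binary else 1000
    prefixes = IEC_PREFIXES if binary else SI_PREFIXES
    if num_bytes < step:
        return f"{num_bytes} bytes"
    value = float(num_bytes)
    prefix = ""
    for prefix in prefixes:
        value /= step
        if value < step:
            break
    return f"{value:.1f} {prefix}B"


print(format_size(1_000_000))               # 1.0 MB
print(format_size(1_048_576, binary=True))  # 1.0 MiB
```

The divisor and the prefix table are always chosen together, so an SI prefix can never end up on a number that was divided by 1,024.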
Implementation
There are two ways to fix the abuse of the SI standard for base-2:
- Correct the application to divide by 1,000 and keep on using SI prefixes.
- Correct the application to keep on dividing by 1,024 but use the IEC prefixes.
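As a rough illustration of the two fixes, consider a hypothetical function that shows a size in megabytes. All three function names below are invented for this sketch and do not come from any real application.

```python
def legacy_label(num_bytes: int) -> str:
    # Not allowed by the policy: divides by 1,024 but prints an SI prefix.
    return f"{num_bytes / 1024**2:.2f} MB"


def fixed_base10(num_bytes: int) -> str:
    # Fix 1: divide by 1,000 and keep on using the SI prefix.
    return f"{num_bytes / 1000**2:.2f} MB"


def fixed_base2(num_bytes: int) -> str:
    # Fix 2: keep on dividing by 1,024 but use the IEC prefix.
    return f"{num_bytes / 1024**2:.2f} MiB"


size = 5_000_000
print(legacy_label(size))  # 4.77 MB  (misleading)
print(fixed_base10(size))  # 5.00 MB
print(fixed_base2(size))   # 4.77 MiB
```

Either fix makes the label match the divisor; which one is appropriate depends on what is being measured.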
Correct basis
Use base-10 for:
- network bandwidth (for example, 6 Mbit/s or 50 kB/s)
- disk sizes (for example, 500 GB hard drive or 4.7 GB DVD)
Use base-2 for:
- RAM sizes (for example, 2 GiB RAM)
For file sizes there are two possibilities:
- Show both, base-10 and base-2 (in this order). An example would be the Linux kernel: "2930277168 512-byte hardware sectors: (1.50 TB/1.36 TiB)"
- Only show base-10, or give the user the opportunity to decide between base-10 and base-2 (the default must be base-10).
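The first possibility can be sketched as follows, reusing the sector count from the kernel message quoted above. The helper name `both_sizes` and the two-decimal rounding are assumptions for this example; it is not the kernel's own formatting code.

```python
def both_sizes(num_bytes: int) -> str:
    """Show a size in base-10 first, then base-2 (hypothetical helper)."""
    tb = num_bytes / 1000**4   # terabytes, SI
    tib = num_bytes / 1024**4  # tebibytes, IEC
    return f"({tb:.2f} TB/{tib:.2f} TiB)"


sectors = 2930277168  # 512-byte hardware sectors, from the example above
print(both_sizes(sectors * 512))  # (1.50 TB/1.36 TiB)
```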
Exception
An application can keep its previous behavior for backwards compatibility, and may add an option to display the sizes in base-10 too, if the following points apply:
- it is a command-line tool
- its output is often parsed by machine (for example, the output is used in scripts)
- only the prefix is displayed and not the unit (for example, M instead of MB)
Some applications which fall under this rule are:
- df
- du
- ls
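A minimal sketch of such a tool, assuming a hypothetical program called `sizetool` with an optional `--si` switch (both names are invented for this example). By default it keeps the legacy prefix-only, base-2 output that existing scripts may depend on. Base-10 is opt-in.

```python
import argparse


def human(num_bytes: int, si: bool = False) -> str:
    """Prefix-only size: base-2 by default (legacy), base-10 when si=True."""
    step = 1000 if si else 1024
    value = float(num_bytes)
    prefix = ""
    for prefix in ["", "K", "M", "G", "T"]:
        # Stop at the first prefix that fits, or at the largest one.
        if value < step or prefix == "T":
            break
        value /= step
    return f"{value:.1f}{prefix}"


def build_parser() -> argparse.ArgumentParser:
    # "sizetool" and "--si" are assumptions made for this sketch.
    parser = argparse.ArgumentParser(prog="sizetool")
    parser.add_argument("--si", action="store_true",
                        help="use powers of 1000 instead of 1024")
    parser.add_argument("num_bytes", type=int)
    return parser


args = build_parser().parse_args(["--si", "5000000"])
print(human(args.num_bytes, args.si))  # 5.0M
print(human(5_000_000))                # 4.8M (unchanged legacy output)
```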
Use cases
- Alice does not know much about computers. She is familiar with the SI prefix system (1 kg = 1000 g, 1 km = 1000 m). She quickly understands that 1 kB is 1000 bytes.
- Bob uses Ubuntu and Windows. He wants to see the same numbers for file sizes in Nautilus as in Windows Explorer so that he can simply compare them. Therefore, Nautilus needs to display the file sizes in base-2.
Additional notes
There is no third "standard" in the form of the O'Reilly Style Guide. It only specified abbreviations for 1,024 (K) and 1,000 (k), but not for 1,048,576 and 1,000,000 and so on.
References
General
Brainstorm ideas and blueprints
Other operating systems
How Mac OS X reports drive capacity - "In Mac OS X v10.6 Snow Leopard, storage capacity is displayed as per product specifications (base 10). A 200 GB drive shows 200 GB capacity ..."
Comments and suggestions
We should not use the names "SI prefixes" and "IEC prefixes". The more accurate terminology would be "decimal prefixes" and "binary prefixes". This is because IEC is a standards body which happens to use both types of prefixes, and SI is a system of units of measurement. ISO and IEC have jointly adopted both kinds of prefixes in what is now ISO/IEC 80000. See: http://en.wikipedia.org/wiki/ISO_80000
The default should be base 2 for file sizes. I don't think that being scientifically accurate matters, as KiB and so on are rarely used. People who are not scientifically trained but understand what a kilobyte is assume it to be 1024 bytes; most people haven't heard of a kibibyte and don't want to either. It's not just Windows PCs: on every home computer for the past 25 years, 'k', 'KB' or 'kb' has meant 1024 bytes, and people who don't know this and do care are a tiny minority. Base 2 applies to websites and file sizes on mobile phones. Go with the flow. It shouldn't be Ubuntu's job to educate the public on SI standards. Screw SI standards, this is Linux for human beings, not scientists! It should be base 2 and in upper case unless talking about networks. Disks should display both, as the manufacturers talk in base 10 but base 2 is of interest to the users. --Gaz Davidson
- The default should be base 10 for file sizes. Base 2 should only be used where the thing being measured naturally comes in multiples of a power of 2. File and disk sizes do not; they can be any arbitrary number. People who are not trained in computer science assume that a kilobyte means 1000 bytes, since a kilometer is 1000 meters and a kilogram is 1000 grams. Most people have never heard of dividing by 1024, and don't want to either. It serves no purpose in this context. The reason the base 2 convention exists is to make memory calculations simpler. 16 + 16 = 32, rather than 16.4 + 16.4 = 32.8 or 16.8 + 16.8 = 33.6. But this simplification doesn't apply to file sizes or disk sizes, so using 1024 for file sizes when the user expects 1000 just makes things much more complicated than they need to be. The Linux kernel measures disk sizes in base 10, Mac OS X measures disk and file sizes in base 10, manufacturers measure in base 10, and users are familiar with base 10. They are not familiar with base 2.
UnitsPolicy (last edited 2010-03-28 04:38:53 by jeff-storago)