I spend a lot of time writing and debugging command-line scripts and programs. As much as I like looking at large numbers (millions, billions, trillions, etc.), it can be difficult to read a big number and quickly parse how large it is, i.e. “is that 12 megabytes or 1.2 gigabytes?”
A long time ago I wrote a small function that pretty-prints a number of bytes. It handles everything from bytes to exabytes, and prints a whole number of a unit as an integer instead of a float. It should be easy enough to adjust to your specific needs or style desires.
#include <math.h>    // floor
#include <stdint.h>  // uint64_t
#include <stdio.h>   // sprintf

// Prints to the provided buffer a nice number of bytes (KB, MB, GB, etc)
// buf must be large enough for the result; 16 bytes is plenty.
void pretty_bytes(char* buf, uint64_t bytes)
{
    const char* suffixes[7] = { "B", "KB", "MB", "GB", "TB", "PB", "EB" };
    unsigned s = 0; // which suffix to use
    double count = (double)bytes;
    // Stop at the last suffix (index 6) so suffixes[s] never overruns.
    while (count >= 1024 && s < 6)
    {
        s++;
        count /= 1024;
    }
    if (count - floor(count) == 0.0)
        sprintf(buf, "%d %s", (int)count, suffixes[s]);
    else
        sprintf(buf, "%.1f %s", count, suffixes[s]);
}
Convenient! Two minor comments:
1) while (count >= 1024 && s < 6) // prevent overrunning suffixes[], whose last valid index is 6
2) write into a temporary buffer, then only update buf if there is enough room in buf.
My inner nerd couldn't refuse to comment on these, sorry! It's just a helper function and it's one of those "customize for your own use" things too, so they're not major issues.
1) Thanks, good point!
2) How do I know if there is enough room in buf? I never defined the calling convention except for the implied “buf must be large enough”.
You can use strlen() to figure out how much free space you have in the buffer. strlen() assumes terminated strings though.
Side Note: strlen() is basically
len = 0;
while (*buff) {
    buff++;
    len++;
}
return len;
Pardon my C code, it’s been a while since I’ve needed to use pointers.
Yeah, I’m not sure if that would work, since it assumes terminated strings. I might just add another parameter for buf_len and use snprintf for safety.
I also considered doing the malloc inside the function and forcing the caller to free it, or using a static buffer internal to the function, but that wouldn’t be thread-safe I suppose.
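For what it's worth, here is a minimal sketch of the buf_len plus snprintf idea. The name pretty_bytes_n, the size_t buf_len parameter, the int return value, and the static const suffix table are all assumptions made for illustration; this is not the function from the post.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bounded variant of pretty_bytes().
   buf_len is the total size of buf in bytes, terminator included.
   Returns whatever snprintf returns: the length the full string
   would have had, not counting the terminator. */
int pretty_bytes_n(char *buf, size_t buf_len, uint64_t bytes)
{
    static const char *const suffixes[] = { "B", "KB", "MB", "GB", "TB", "PB", "EB" };
    const size_t last = sizeof suffixes / sizeof suffixes[0] - 1;

    size_t s = 0;                 /* which suffix to use */
    double count = (double)bytes;
    while (count >= 1024 && s < last) {
        s++;
        count /= 1024;
    }

    /* After the loop count is small (well under 2048), so the int cast
       is safe and doubles as a "is this a whole number?" check. */
    if (count == (double)(int)count)
        return snprintf(buf, buf_len, "%d %s", (int)count, suffixes[s]);
    return snprintf(buf, buf_len, "%.1f %s", count, suffixes[s]);
}
```

Since snprintf reports the length it wanted to write, a return value greater than or equal to buf_len tells the caller the output was truncated. The buffer is still null terminated as long as buf_len is at least 1.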
How do you know where the end of the string is if it’s not null terminated? Or is that just something you don’t worry about? If not, the strlen(buff) definitely won’t work.
I’m always hesitant to allow the function to malloc and rely on the user to free, we’ve had some pretty bad experiences of “forgetting” to free memory. Writing to a user defined buffer with the available length is probably the safest way to go.
RE temp variable: I also keep forgetting that you’re working on highly embedded systems. Allocating a temp buffer on the inside of that function probably isn’t a great use of resources. At the end of the day, it’s a debug/helper function and you generally know *exactly* what you’re passing into it.
PS: Any way you can define suffixes[] outside of that function as static or const?
PPS: It’s amazing how much we can analyze super trivial code to death, but when it comes to the heart of the program, we kinda just glaze over it and assume it works.
PPPS: Did you consider accessing suffixes[min(bytes/1024,6)]; ?
Ok. I’m really done now.
(I guess this blog has a comment nesting depth of 4, so I am replying here)
I just assume that buf is large enough, no requirements on null termination or anything like that.
This is actually for use in a GPU programming situation, but run on the CPU. There’s (only) 4GB of RAM in my GPU, so I need to keep track of how much I’m allocating there, so this is a nice function to print the memory allocation log.
I figured I could/should put the suffixes outside the function, but it works this way and I think I’ll leave it here :- )
A two-argument min(), eh? I’m not sure I’ve seen that before. Another solution I found for this problem was to use log() to find which suffix to use, but I didn’t want to mess with that. Plus this function isn’t called inside any loops (like 6 calls total for a program that can take multiple hours depending on data set) so I’m not at all worried about performance.
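For reference, the log() idea boils down to index = floor(log(bytes) / log(1024)), clamped to the last suffix. Below is a rough sketch that gets the same index with integer shifts instead of a floating-point log() call. That is a swapped-in technique, chosen here to sidestep rounding trouble at exact powers of 1024. The name suffix_index is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: index into { "B", "KB", "MB", "GB", "TB", "PB", "EB" }
   for a given byte count. Equivalent to floor(log(bytes) / log(1024)),
   clamped to 6, with 0 bytes mapped to index 0. */
unsigned suffix_index(uint64_t bytes)
{
    unsigned bits = 0;        /* position of the highest set bit, i.e. floor(log2(bytes)) */
    while (bytes >>= 1)
        bits++;

    unsigned s = bits / 10;   /* 1024 == 2^10, so every 10 bits is one suffix step */
    return s < 6 ? s : 6;     /* clamp to the last suffix, "EB" */
}
```

With a 64-bit input the clamp never actually triggers (63 / 10 is 6), but it keeps the lookup safe if the suffix table or the input type ever changes.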