
The solution in Rust is to have separate String and &str types. A &str is a pointer to somewhere within a String plus the length of the referred-to region, and it borrows from the String it refers to.

Any function that does not need to modify a String takes a &str. Any function that does modify a String typically takes a String, which means it consumes its input. (Because of UTF-8, in-place modification is generally a pipe dream.)
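To make that convention concrete, here is a minimal sketch. The function names `count_words` and `shout` are made up for illustration; they are not from any library.

```rust
// Read-only access: borrow a slice. No allocation, no ownership transfer.
fn count_words(text: &str) -> usize {
    text.split_whitespace().count()
}

// Modification: take ownership of the String and hand the result back.
fn shout(mut text: String) -> String {
    text.push('!');
    text
}

fn main() {
    let greeting = String::from("hello wide world");
    // &String coerces to &str, so the caller keeps ownership here.
    println!("{}", count_words(&greeting));
    // `greeting` is moved into `shout` and cannot be used afterwards.
    let loud = shout(greeting);
    println!("{loud}");
}
```

The caller decides whether to give up its String or clone it first, which keeps the cost of a copy visible at the call site.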

Also, the headers are typically allocated on the stack. Rust is a lot less shy about types that are larger than a pointer living inline wherever they are used, and this seems to work a lot better than the alternative.
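A quick way to see how big these inline headers are is to measure them in machine words. This is only a sketch; `words` is a hypothetical helper, and the comments describe the usual pointer/capacity/length layout of Vec that String wraps.

```rust
use std::mem::size_of;

// Size of a type measured in usize-sized machine words.
fn words<T>() -> usize {
    size_of::<T>() / size_of::<usize>()
}

fn main() {
    // &str is a fat pointer: data pointer + length.
    println!("&str:   {} words", words::<&str>());
    // String wraps Vec<u8>: pointer + capacity + length.
    println!("String: {} words", words::<String>());

    // Headers live inline, so a Vec<String> stores them back to back.
    let lines = vec![String::from("a"), String::from("b")];
    let first = &lines[0] as *const String as usize;
    let second = &lines[1] as *const String as usize;
    assert_eq!(second - first, size_of::<String>());
}
```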



Allocating headers and strings separately blows your CPU cache. Hardly a performant way of doing hot loops.


Compared to calling strlen a bunch, which I’m sure is significantly more performant.


You never need to call strlen unless you are getting your inputs from a place that doesn’t give you a string length (such as stdin).


So which is it, then? Does keeping size separate "blows your CPU cache"¹ or not? You can't argue it does in one case (Rust) but not in your case…

(And note that the representation you're responding to is not really a "header", in the same sense that the trailing null is a "footer". The representation does not require the length be contiguous with the data, but that's what upthread was trying to say in the first place.)

¹(it doesn't…)


So now you are arguing that by default your strings should come with a length? Great!

If you want that, you might as well bake that length into the string type by default (and use a specialised type, perhaps a naked raw pointer into the string) for when you don't want to pass the length.


That's most interfaces…?


Not argv[].


You still need to call strlen on each element?


To get a correct understanding, if you aren't a Rust person, Rust's String is (literally, though this is opaque) Vec<u8> with checks to ensure it's actually not just arbitrary bytes you're storing but UTF-8 text.
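A small sketch of that relationship, using only standard library conversions. `bytes_to_text` is a hypothetical wrapper added for illustration.

```rust
// Hypothetical helper: accept raw bytes only if they are valid UTF-8.
fn bytes_to_text(bytes: Vec<u8>) -> Option<String> {
    String::from_utf8(bytes).ok()
}

fn main() {
    let text = String::from("héllo");
    // into_bytes hands back the underlying Vec<u8> without copying.
    let bytes: Vec<u8> = text.into_bytes();
    // 'é' takes two bytes in UTF-8, so this is 6 bytes for 5 characters.
    println!("{} bytes", bytes.len());

    // Going back requires validation, which succeeds here.
    println!("{:?}", bytes_to_text(bytes));
    // 0xFF never appears in valid UTF-8, so this input is rejected.
    println!("{:?}", bytes_to_text(vec![0xFF, 0xFE]));
}
```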

Vec<u8>, unlike String, has a direct equivalent in C++: std::vector<std::byte>. (The Rust design is slightly better thanks to decades of extra experience, but it has the same rough shape.)

The C++ std::string is a very weird thing that results from standardizing std::string in C++98 and then changing course after implementation experience. So while it's serviceable, it's pretty poor, and nobody should model what they're doing on it. There have been HN links to articles by people like Raymond Chen on what this type looks like now.


In order to access the string contents in the first place you need the pointer. The length is stored right next to it. So they're both going to be in the same cache line, assuming proper alignment. In the rare case in which they straddle a cache line, you just have to load once and then the length remains in cache for the remainder of the loop. (This is true regardless of where the length lives, in fact; as far as CPU cache is concerned it really makes little difference either way.)

(This is assuming SROA didn't break apart the string and put the length in a register, which it often does, making this entire question moot.)


Huh? The headers are either in registers or on the stack. The top of the stack is always in L1. There is no way in which this is inferior to handing over a pointer to a string and a length separately, other than requiring two additional words of storage in registers or on the stack.


How is that? Say you are reading 1000 lines of stdin at once to process them. Which registers are your string and substring headers stored in?


If you are reading 1000 lines from stdin at once to separate Strings, you are already going to be accessing memory in 1000 places at the same time, and making it 1001 isn't meaningfully worse for your cache. (Implementation would be Vec<String>, which would lay out the 1000 headers contiguously.)

But I genuinely have a hard time understanding what kind of workload I would ever do that for. If you want to read 1000 lines of stdin, and cannot use an iterator and must work on them at the same time, I would much rather read them into a single String and then split that into 1000 &str slices using the .lines() iterator.
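Roughly what I mean, as a sketch. A string literal stands in for the contents of stdin here, and `split_lines` and `points_into` are names I made up; the second one only exists to show that the slices borrow from the one buffer instead of owning their own allocations.

```rust
// Split one buffer into per-line slices. Only the Vec of headers is allocated.
fn split_lines(input: &str) -> Vec<&str> {
    input.lines().collect()
}

// Check that `slice` lies entirely inside the memory of `buffer`.
fn points_into(buffer: &str, slice: &str) -> bool {
    let start = buffer.as_ptr() as usize;
    let end = start + buffer.len();
    let p = slice.as_ptr() as usize;
    p >= start && p + slice.len() <= end
}

fn main() {
    // Stand-in for the whole of stdin read into a single String.
    let buffer = String::from("one\ntwo\nthree\n");
    let lines = split_lines(&buffer);
    for line in &lines {
        println!("{line}: borrowed from buffer = {}", points_into(&buffer, line));
    }
}
```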


I was miffed at: 1000 lines from stdin. It’s the same problem 1000 times, not 1000 problems at once.


Presumably the idea is, for example, sorting? In which case you do have to read the entire input before you can do anything. But the way I'd do that is to read the entire stdin into a single String, then work with &str slices into it.
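For the sorting case, a minimal sketch might look like this. `sorted_lines` is a hypothetical name, the input is a literal standing in for stdin, and the ordering is plain byte-wise comparison, which may not be what you want for non-ASCII text.

```rust
// Sort the lines of one buffer without copying any line contents.
// Only the (pointer, length) headers are moved around by the sort.
fn sorted_lines(input: &str) -> Vec<&str> {
    let mut lines: Vec<&str> = input.lines().collect();
    lines.sort_unstable();
    lines
}

fn main() {
    // Stand-in for the whole of stdin read into a single String.
    let buffer = String::from("pear\napple\nfig\n");
    for line in sorted_lines(&buffer) {
        println!("{line}");
    }
}
```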


If you really care about performance, you should not allocate within hot loops.
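One common way to do that in Rust is to reuse a single buffer across iterations. This is a sketch under the assumption that lines are processed one at a time; `total_line_bytes` is a made-up example function, and a byte slice stands in for stdin.

```rust
use std::io::{self, BufRead};

// Sum the byte lengths of all lines, excluding trailing whitespace
// such as the newline, while reusing one String for every line.
fn total_line_bytes<R: BufRead>(mut reader: R) -> io::Result<usize> {
    let mut line = String::new(); // allocated once, outside the loop
    let mut total = 0;
    loop {
        line.clear(); // drops the contents but keeps the capacity
        if reader.read_line(&mut line)? == 0 {
            break; // end of input
        }
        total += line.trim_end().len();
    }
    Ok(total)
}

fn main() -> io::Result<()> {
    // &[u8] implements BufRead, so it can stand in for a locked stdin.
    let input = "alpha\nbeta\ngamma\n";
    let total = total_line_bytes(input.as_bytes())?;
    println!("{total}");
    Ok(())
}
```

The buffer may still grow if a later line is longer than any earlier one, but after that the loop runs without touching the allocator.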



