
> So the usual linked list that gets appended a few times is close to optimal; iteration has the same cost as in lower-level languages with pointer chasing (usually a bit more, because objects usually have headers that take up quite some space, so fewer of them fit in cache). It even has some advantage in that a modern compacting GC may move the objects close to each other.

I wonder why array-backed linked lists aren't more of a thing. I've used them before in C in hobby projects. You allocate an extendable array of nodes, and the nodes' next and/or prev pointers become indices into the array (although pointers would still work). Added nodes go on the end; deleted nodes stay where they are and just aren't referenced. If there is high churn, you keep track of inactive slots to fill next. You're still jumping around an area of memory when iterating the list, but it's at least confined to one guaranteed contiguous block, and you're only allocating to extend the array rather than doing a lot of allocating and freeing per node. You still have the memory overhead of the next/previous indices, though, so fewer nodes fit in cache than otherwise would.



Agreed, especially for high-level languages and application-level linked list usage. Worth mentioning, I think, that good linked list libraries usually do this under the hood, and kernel- and allocator-level linked lists are always doing this. It's when people implement simple linked lists themselves, and/or copy-paste the kind of code you get by searching for simple examples (like the link in my sibling comment here), that they end up with very poorly performing linked lists.

A closely related concept to what you're describing is internal storage linked lists [1], meaning the class/node contains the link(s) and is specialized for one particular kind of data, which often makes it more compact, and easier to allocate a block of nodes in a single call.

Anyway, I'd guess a big reason it's harder to find array-backed lists when searching for linked list implementations is just that they're more advanced, and they introduce memory management topics into articles and blog posts that the author doesn't want to cover. But I agree: once you've used array-backed lists, the common examples of linked lists you find online, where new is used twice per node, once for a generic node and once for the payload, seem ridiculously wasteful. This is the kind of thing I meant earlier: best practice is to avoid a naive, bad linked list even in code that isn't a hot spot. Time doesn't have to be spent optimizing it, and it doesn't need to show up as a hotspot, to know it should be avoided.

[1] https://en.wikipedia.org/wiki/Linked_list#Internal_and_exter...



