I got fed up with Docker having a "Virtual machine service" using like 4GB of memory (while running no containers) on my 8GB Mac so I investigated what I thought would be "native" (arm64) Lima.
The disconnect to me is weird. I'm pretty sure with qemu-system-aarch64, Debian/Alpine don't use more than 100-200mb of RAM sitting idle. Where does 6GB of RAM usage come from?
Not that I think it's a great use of time to optimize for this whatsoever. I just thought it'd be a fun exercise.
It's page cache[0]. A similar thing happens with WSL2 on Windows[1]. The issue is that the Linux kernel running in the Lima VM thinks all the RAM you give the VM is free to cache files in; it doesn't know there is a host OS that could also make use of that RAM. So after loading up a ton of containers, each with its own OS dependencies, that ends up being a lot of files the kernel keeps cached so long as nothing else needs the memory. There isn't really a fix besides right-sizing the amount of RAM you give the VM. Or I guess you could disable file caching, but I wouldn't recommend it[2].
To add to this, the hypervisor can do more to integrate with the guest so that this issue is mitigated somewhat. For example, with KVM and a Linux guest there are a lot of optimizations to share memory pages between the two.
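To see how much of a guest's "used" RAM is really page cache, you can split the numbers in /proc/meminfo. This is only a rough sketch: the sample figures below are made up for illustration, and the function names are hypothetical. Inside a Linux VM you would pass in the contents of /proc/meminfo instead of the sample.

```python
# Sample /proc/meminfo excerpt. The numbers are invented for illustration.
SAMPLE_MEMINFO = """\
MemTotal:        8000000 kB
MemFree:          500000 kB
MemAvailable:    6500000 kB
Buffers:          200000 kB
Cached:          5800000 kB
"""

def parse_meminfo(text):
    """Return a dict mapping each field name to its numeric value (kB)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if rest.strip():
            fields[name.strip()] = int(rest.split()[0])
    return fields

def summarize(fields):
    """Split the VM's RAM into page cache, free, and everything else."""
    cache = fields["Buffers"] + fields["Cached"]
    free = fields["MemFree"]
    other = fields["MemTotal"] - free - cache
    return {"cache_kb": cache, "free_kb": free, "other_kb": other}

# With the sample above, most of the RAM the host sees as "used" is cache.
print(summarize(parse_meminfo(SAMPLE_MEMINFO)))
```

In this made-up sample, 6 GB of the 8 GB is cache that the guest would give up under memory pressure, but the host has no way of knowing that.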
For reference: https://access.redhat.com/documentation/en-us/red_hat_enterp....
I remember VMWare solved this with a balloon driver, that basically asks the guest for a lot of memory at a very low priority, so it has precedence over the page cache, but under any program that actually needs it.
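A toy model of that priority ordering, as a sketch only: this is not how any real balloon driver is written, and the function name and numbers are made up. The balloon can take free memory and push out page cache, but never memory that programs are using.

```python
def inflate_balloon(total_mb, app_mb, cache_mb, request_mb):
    """Toy model of one balloon inflation request inside a guest.

    The balloon may take free memory and evict page cache, but it never
    takes memory that applications are using.
    Returns (granted_mb, cache_left_mb).
    """
    free_mb = total_mb - app_mb - cache_mb
    reclaimable_mb = free_mb + cache_mb      # everything except app memory
    granted_mb = min(request_mb, reclaimable_mb)
    # Free memory is used first; the remainder comes out of the page cache.
    from_cache_mb = max(0, granted_mb - free_mb)
    return granted_mb, cache_mb - from_cache_mb

# An 8 GB guest: 1 GB used by programs, 6 GB of page cache, 1 GB free.
print(inflate_balloon(8192, 1024, 6144, 5120))   # (5120, 2048)
# Asking for more than is reclaimable stops short of app memory.
print(inflate_balloon(8192, 1024, 6144, 10000))  # (7168, 0)
```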
The mechanism is called ballooning and technically doesn't need cooperation on the guest kernel side (but it makes everything much smoother). No idea about Hyper-V, but it is implemented e.g. in virtio https://pmhahn.github.io/virtio-balloon/ . You can google some failed experiments to add it to QEMU.
Honestly, Apple either needs to support native containerization in the macOS kernel (allowing people to create macOS-native images) or help bring Linux to the MBP.
Honestly I hate using my Mac for container based development. I know it is "hypervisors so faster than VMs", but the workflow is so slow that when I do personal Dev work on my older £400 Lenovo with Linux it runs rings around my MacBook when dealing with containers.
Agreed, I too have been moving off MacOS for my workflow. Wouldn't it be great if we could take full advantage of the incredible battery life and single threaded performance of our MBPs?
I think Linux running flawlessly, with full hardware acceleration, all the drivers and equal or close battery life on a MBP would make it the best laptop money could buy. Until that day, it's a Ferrari with square wheels.
Because it's hard to talk about some of these things without getting someone trying to "correct" me on terminology.
Hypervisors are one method of running virtual machines; you also have classical virtualisation methods. When I bring up virtual machines being slow, I always get someone who pushes up their glasses with a "well, actually it's a hypervisor running it, so it's essentially native speed".
I find Linux is a much nicer development environment all around though.
No, it’s not being pedantic when you are using terminology in a confusing or incorrect way that leads to it being unclear exactly what you are complaining about.
What do you mean by “classical virtualisation methods”, actual CPU emulation so you can run e.g. PowerPC on x86?
There is nothing unclear about what I said. I understand that hypervisors are supposed to be fast ("almost native speed!"). I also know that running Docker/Podman in Linux on macOS isn't fast for many reasons.
I'm also confused as to what you mean. The hypervisor is what's running the VMs, no? So I don't understand your original comment where you say "hypervisors so faster than VMs" either.
I don't think misnome is trying to be pedantic either, I think we're both unclear as to what you're trying to say.
Edit: I'm also dubious that hypervisors always offer "native" speed. To me this seems like it depends a lot on the hypervisor, the workload, the guest OS, etc.
But... VirtualBox is a hypervisor, too. It says so in their documentation. I'm sure using the built-in hypervisor framework in macOS brings performance improvements, but they're changing one hypervisor for the other.
Anyway, taking a guess at what you're saying: do you mean full software virtualization vs. hardware-assisted virtualization? If so, I would agree with you that even best-case performance of HA VMs is not quite bare-metal; sometimes it's close enough to not matter much.
Hyperkit is a hypervisor. Virtualbox is a hypervisor. Hyper-V is a hypervisor. VMWare products are hypervisors. They are all hypervisors, a hypervisor is just the platform that manages the virtualisation (which is a CPU feature). What changes is what parts of the host system are emulated and the manner in which system peripherals are passed through. The only alternative to a hypervisor that you could mean is full-system emulation, e.g. some modes of QEMU where the CPU itself is emulated on top of another system. This is obviously slow.
Virtualised OSes are obviously "slightly" slower because they now don't have 100% of a system to use, and usually require a dedicated chunk of the system RAM, but what people usually mean by saying they are slower is that they often don't have direct hardware peripheral access (e.g. their own dedicated network card and disk drives) and the hypervisor emulates some amount of the system. This is the case with the most accessible Linux VMs that run Docker, because simulating the block devices on top of the macOS filesystem is slow.
Docker itself is none of these; it's a fancy chroot enabled by Linux kernel features, so it _requires_ running on top of Linux or a sufficiently Linux-like compatibility layer.
So you see why it's somewhat incoherent and makes it sound like you are talking out of your ass when you complain that Hyperkit is a hypervisor but "Virtual machines aren't", but all these horrible people keep disagreeing with you.
In my day job we are investigating dropping containers for the cases where it's used for simply having some setup that can be shared between developers and have the build work.
Currently trialing nix-shell (https://nixos.org/manual/nix/stable/command-ref/nix-shell.ht...) in the iOS team (this team doesn't need to deploy software to servers, so it's a small trial for now to make sure certain build tools are usable and on the right version for our build), but see no reason why it couldn't extend to other similar workflows in other teams.
I'm already using NixOS on my private laptop (sticking with macOS for work), but Nix on other OSes keeps tempting me. We're working on an application that needs to pull a whole bunch of "weird" dependencies, from all over the place: a specific version of Python, PyPI, apt/homebrew, binary-only proprietary software, etc. So far the answer has always been Docker and a specific release of Debian/Ubuntu, but keeping the code running on the host OS is always desirable, because there are occasionally features that need to work with host hardware, like audio.
My problem with Nix is, just like Rust, while it provides a very high ROI, the barrier to entry seems very steep; and unlike Rust, the documentation still leaves a lot to wish for. I'd go all-in on it right now, but I'm (rightfully, I guess?) scared of Nix-unique problems that will take 10x the effort to address than the mess that we're currently dealing with.
I won't be the person to deny it can be tough to deal with issues. The documentation is vast but very shallow too, so you end up having to figure stuff out yourself.
Same thing here. We're trying to cut down on the amount of OS-level dependencies for each service, which makes them feasible to run locally with minimal fuss.
This doesn't make much sense. You're probably running your software on Linux servers, so having native containers in macOS doesn't really do much (also... there is sandboxing in macOS/Darwin already). And bringing Linux to the MBP, if you want containers, is going to require a Linux kernel, so in the end you're still going to be running Linux VMs, no matter what. This is exactly what WSL2 is doing.
Yeah I have a recurring issue where docker desktop on Mac will use >100% cpu with no containers running (verified with docker ps and docker stats commands). I’ve tried all of the troubleshooting they have posted online. Currently got it down to ~40% cpu usage when idle, but no guarantee that it stays there.
Tried switching to podman with docker-compose and podman-compose, but can’t get the auth to work for private repos on dockerhub. So basically just accepting my fate of crazy idle memory and cpu usage.
Are you using the “new” file synchronization? With the combination of that, the new virtualization framework that stabilized last year, and upgrading to a MacBook with an M1 Pro, the relatively large codebase I run daily zips along very quickly. It’s made up of:
- a PHP-fpm container running a 5m LoC codebase
- a nodeJS container running a webpack dev server instance on a 1m LoC frontend codebase
- 2 elasticsearch containers (one for logs and one for app search)
- 2 Kibana containers
- 1 logstash container
- A rabbitmq container
- a memcached container
- a redis container
- a Percona container
- 3-4 microservices, each in their own container
- an imgproxy container
All of this runs quite snappily in ~6 GB of RAM and I’ve never noticed any slowness while also running Edge, slack, VSCode, PHPStorm, zoom, and a bunch of terminal windows, and I could probably get that down a bit by consolidating the elasticsearch and kibana instances.
I spent some time digging into this for the fast, light, and easy-to-use replacement for Docker Desktop and (co)lima that I've been working on.
The root of the problem is that VM memory is allocated on demand, but never freed after it's used for the first time. In other words, once used, it can never be released. Since my app has a lighter userspace, it starts out using less memory than other VMs, but eventually reaches the memory limit given enough usage and time. (My optimized memory management setup means the VM works well with a lower memory limit than others, but it doesn't solve the fundamental problem.)
Linux has a feature called "page reporting" to report chunks of memory that are no longer used to the hypervisor, which can then drop them to reduce usage on the host side. WSL 2 actually uses this feature, but I suspect it becomes less effective with longer VM uptime because memory becomes more fragmented over time. Since Hyper-V has been limited to dropping contiguous 2 MiB chunks of memory until recently [1], fragmentation is likely the reason many users report high memory usage. Page cache is definitely a contributor as well, but a much easier one to fix. It looks like Microsoft is working on the problem with page reporting.
Although Apple's Virtualization.framework doesn't support page reporting, I was able to implement it with some workarounds and confirmed that this works with QEMU on Linux. Unfortunately, while free memory is correctly reported to macOS, nothing actually seems to get freed. I'm planning to report this to Apple because it seems like memory ballooning (essentially a more limited and primitive version of page reporting) doesn't work as documented, whether it's Virtualization.framework or another VMM like QEMU. If/when Apple fixes this, it'll be possible to reduce memory usage significantly. Details from my investigation into what's going on with memory management on the XNU side: https://twitter.com/kdrag0n/status/1612309883411640321
The good news: From my testing, the issue isn't as bad as it appears. The "free" memory tends to compress quite well, so XNU's memory compression does a good job at taking care of it when you're actually running low on memory.
(Shameless plug on this topic: The app I'm working on already has quite a few improvements over others: fast networking (30 Gbps), VirtioFS and bidirectional filesystem sharing, Rosetta for fast x86, full Linux (not only Docker), lower CPU usage, native Swift UI, and other tweaks. Email in bio for waitlist. Details to avoid spamming this thread: https://news.ycombinator.com/item?id=34374176)
> This feature is powered by a Linux kernel patch that allows small contiguous blocks of memory to be returned to the host machine when they are no longer needed in the Linux guest. We updated the Linux kernel in WSL2 to include this patch, and modified Hyper-V to support this page reporting feature. In order to return as much memory to the host as possible, we periodically compact memory to ensure free memory is available in contiguous blocks. This only runs when your CPU is idle. You can see when this happens by looking for the ‘WSL2: Performing memory compaction’ message inside of the output of the dmesg command.
I didn't realize they already had triggers for compaction, thanks for sharing! I suspect fragmentation is still a major issue (along with page cache management, which DAX should help with), so it'll be interesting to see if/how Microsoft improves memory management.
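To make the fragmentation point concrete, here is a toy simulation, assuming 4 KiB pages and the 2 MiB reporting granularity mentioned above for Hyper-V. It is a sketch of the idea, not of the kernel's actual free page reporting or compaction code, and all names are hypothetical. One used page per chunk is enough to make nothing reportable, even though almost all memory is free; an idealized compaction pass fixes that.

```python
PAGE_KIB = 4
PAGES_PER_CHUNK = 2 * 1024 // PAGE_KIB  # 512 pages of 4 KiB in one 2 MiB chunk

def reportable_chunks(free):
    """Count aligned 2 MiB chunks in which every 4 KiB page is free.

    `free` is a list of booleans, one per guest page (True means free).
    """
    count = 0
    for start in range(0, len(free) - PAGES_PER_CHUNK + 1, PAGES_PER_CHUNK):
        if all(free[start:start + PAGES_PER_CHUNK]):
            count += 1
    return count

def compact(free):
    """Idealized compaction: pack all used pages at the front."""
    used = free.count(False)
    return [False] * used + [True] * (len(free) - used)

# 16 chunks' worth of pages, with the first page of each chunk in use.
fragmented = [i % PAGES_PER_CHUNK != 0 for i in range(16 * PAGES_PER_CHUNK)]
print(reportable_chunks(fragmented))           # 0
print(reportable_chunks(compact(fragmented)))  # 15
```

Real compaction can't move every page, so the actual gain would be somewhere between those two numbers.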
I tried this k8s https://github.com/lima-vm/lima/blob/f7e7addab557da560da7146... example thinking it'd be a thin wrapper around QEMU.
6GB+ RAM usage with nothing deployed lol