> Create an operating system specifically for the database and make it so you boot the database.
(Others downthread have pointed out unikernels and I agree with the criticisms)
This proposal is an excellent PhD project for someone like me :-)
It ticks all the boxes for the things I like to work on the most[1]:
Will involve writing low-level OS code
Get to hyper-focus on performance
Writing a language parser and executor
Implement scheduler, threads, processes, etc.
Implement the listening protocol in the kernel.
I have to say, though, it might be easier to start off with a rump kernel (NetBSD), then add raw disk access that bypasses the OS (no, or fewer, syscalls to use it), and create a kernel module that accepts a limited set of task types and executes each task in-kernel (avoiding a context-switch on every syscall)[2].
Programs in userspace would have to run at the lowest priority (with starvation-prevention mechanisms to ensure that user input still eventually gets processed).
I'd expect a not-insignificant speedup from doing all the work in the kernel.
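To make the "limited type of task" idea concrete, here is a minimal userspace model of what such an interface might look like. Every name in it (`task_kind`, `task_buf`, `dbos_submit`) is hypothetical, and the in-memory `store` array is only a stand-in for the raw disk; in the real design `dbos_submit` would be the single crossing into the kernel module.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: the small, closed set of tasks the module would accept. */
enum task_kind { TASK_GET = 1, TASK_PUT = 2 };

/* Hypothetical: the buffer the caller shares with the module. */
struct task_buf {
    uint32_t kind;       /* one of enum task_kind */
    uint32_t key;        /* slot to read or write */
    char     value[56];  /* input for PUT, output for GET */
    int32_t  status;     /* filled in when the task completes */
};

#define SLOTS 16
static char store[SLOTS][56];   /* stand-in for the on-disk table */

/* Stand-in for the single entry point: runs the whole task and only
 * returns once it is complete. Returns 0 on success, -1 if rejected. */
int dbos_submit(struct task_buf *t)
{
    assert(t != NULL);          /* passing no buffer is a caller bug */
    if (t->key >= SLOTS) {
        t->status = -1;
        return -1;
    }
    switch (t->kind) {
    case TASK_PUT:
        t->value[sizeof t->value - 1] = '\0';   /* keep stored values terminated */
        memcpy(store[t->key], t->value, sizeof t->value);
        t->status = 0;
        return 0;
    case TASK_GET:
        memcpy(t->value, store[t->key], sizeof t->value);
        t->status = 0;
        return 0;
    default:
        t->status = -1;         /* anything outside the closed set is refused */
        return -1;
    }
}
```

The point of the closed `switch` is that the module never runs arbitrary caller code, only the handful of operations it already knows about.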
The way it is now,
userspace requests read() on a socket (context-switch to kernel),
gets data (context-switch to userspace),
parses a query,
requests read on disk (multiple context-switches to kernel for open, stat, etc., and a switch back to userspace after each call completes; this latency is probably fairly well mitigated with mmap, though),
logs diagnostics (multiple context-switches to and from kernel),
requests write on client socket (context-switches back and forth to the kernel until all data is written).
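The steps above, sketched as code with a counter bumped at every syscall. This is an illustration, not any real database's request loop: `serve_one` and `demo_crossings` are made-up names, the "table" is just a file, and the query is read but not actually parsed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: serve one request the conventional way. Returns the
 * number of syscalls issued (each one is a trip into the kernel and back),
 * or -1 on error. */
int serve_one(int client_fd, int log_fd, const char *table_path)
{
    char query[256], row[256];
    struct stat st;
    int crossings = 0;

    assert(table_path != NULL);

    ssize_t n = read(client_fd, query, sizeof query - 1);   /* get the query */
    crossings++;
    if (n <= 0) return -1;
    query[n] = '\0';   /* parsing would happen here, purely in userspace */

    int fd = open(table_path, O_RDONLY);
    crossings++;
    if (fd < 0) return -1;
    int rc = fstat(fd, &st);
    crossings++;
    if (rc < 0) { close(fd); return -1; }
    size_t want = (size_t)st.st_size < sizeof row ? (size_t)st.st_size : sizeof row;
    ssize_t got = pread(fd, row, want, 0);
    crossings++;
    close(fd);
    crossings++;
    if (got < 0) return -1;

    static const char msg[] = "served one query\n";
    ssize_t lw = write(log_fd, msg, sizeof msg - 1);         /* diagnostic log */
    crossings++;
    if (lw < 0) return -1;

    /* write() may be short, so loop until the whole reply is out. */
    for (ssize_t sent = 0; sent < got; ) {
        ssize_t w = write(client_fd, row + sent, (size_t)(got - sent));
        crossings++;
        if (w < 0) return -1;
        sent += w;
    }
    return crossings;
}

/* Usage: a socketpair plays the client connection, a temp file the table.
 * Returns the crossing count, or -1 if setup or the reply check fails. */
int demo_crossings(void)
{
    int sv[2];
    char path[] = "/tmp/dbos_table_XXXXXX";
    char reply[16];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) return -1;
    int tfd = mkstemp(path);
    if (tfd < 0) return -1;
    if (write(tfd, "row-1\n", 6) != 6) return -1;
    close(tfd);
    if (write(sv[1], "SELECT 1", 8) != 8) return -1;   /* client sends a query */
    int log_fd = open("/dev/null", O_WRONLY);
    if (log_fd < 0) return -1;

    int n = serve_one(sv[0], log_fd, path);
    ssize_t r = read(sv[1], reply, sizeof reply);
    int ok = (r == 6 && memcmp(reply, "row-1\n", 6) == 0);

    close(log_fd); close(sv[0]); close(sv[1]); unlink(path);
    return ok ? n : -1;
}
```

Even this trivial request costs at least seven crossings, and a real server adds more for every extra stat, log line, or short write.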
The goal of the DBOS would be to remove almost all the context-switching between userspace and kernel.
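For comparison, the mmap mitigation mentioned in the disk-read step already removes the per-read crossings while staying in userspace. A minimal sketch, assuming a read-only table file; `table_map`, `table_open`, `table_row`, `table_close` and `demo_mmap` are names made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical handle for a table mapped into the process. */
struct table_map { const char *base; size_t len; };

/* Map the file once. open, fstat, mmap and close are paid a single time
 * instead of on every query. Returns 0 on success, -1 on failure. */
int table_open(struct table_map *t, const char *path)
{
    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    if (fstat(fd, &st) < 0 || st.st_size == 0) { close(fd); return -1; }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                       /* the mapping outlives the descriptor */
    if (p == MAP_FAILED) return -1;
    t->base = p;
    t->len = (size_t)st.st_size;
    return 0;
}

/* A "read" is now a plain memory access with no syscall. The first touch
 * of a page can still fault into the kernel, so the cost is reduced
 * rather than eliminated. */
const char *table_row(const struct table_map *t, size_t offset)
{
    assert(t != NULL);
    return offset < t->len ? t->base + offset : NULL;
}

void table_close(struct table_map *t)
{
    munmap((void *)t->base, t->len);
    t->base = NULL;
    t->len = 0;
}

/* Usage: write a small temp file, map it, read from it. Returns 0 if the
 * mapped bytes match what was written. */
int demo_mmap(void)
{
    char path[] = "/tmp/dbos_mmap_XXXXXX";
    struct table_map t;
    int tfd = mkstemp(path);
    if (tfd < 0) return -1;
    if (write(tfd, "row-1\nrow-2\n", 12) != 12) return -1;
    close(tfd);
    if (table_open(&t, path) < 0) { unlink(path); return -1; }

    const char *second = table_row(&t, 6);
    int ok = second != NULL && memcmp(second, "row-2\n", 6) == 0
             && table_row(&t, 12) == NULL;   /* past the end is rejected */

    table_close(&t);
    unlink(path);
    return ok ? 0 : -1;
}
```

This only helps the disk side; the socket reads, socket writes, and logging in the list above still cross into the kernel every time.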
[1] My side projects include a bootable (but unfinished) x86 OS, various programming languages, and performant (or otherwise) C libraries.
[2] Similar to the way RealTime Linux calls work (the caller shares a memory buffer with the RT kernel module, populates the buffer and issues a call, and the kernel only returns when that task is complete). BPF is similar in spirit, in that the work runs inside the kernel instead of crossing the boundary for every step. As far as I can tell, it's the only way to get latency close to the physical minimum.