Re-arming existing callouts slightly optimized
After general testing, I narrowed down most of the missed-callout problems to my modification of callout_reset_on() in which I remove and reinsert (the old way) callouts that are pending and still on the callout queue.
EDIT: Well, it turns out some interrupts are being missed, not callouts. :-\ Will have to find better ways to test.
Instead of removing and re-inserting the callouts, I’m now using the MINHEAP_KEY_CHANGE() macro to change the key of a pending callout and rearrange the callout queue in place. In the limited time I’ve been testing it, everything seems to be working much better. I still suspect that some callouts are being delivered later than they should be, though ktrdump(8) doesn’t show anything obvious. No callouts are being missed, though, and the wifi LED and the CPU frequency scaling are no longer getting stuck.
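To illustrate the idea of changing a key in place, here is a minimal sketch of a key-change operation on an array-backed binary min-heap. It is not the MINHEAP_KEY_CHANGE() macro itself; struct entry, struct heap and heap_key_change() are hypothetical names made up for this example, and expiry times are plain ints.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a pending callout; only the expiry time matters. */
struct entry {
    int expiry;   /* key: tick at which the entry should fire */
    int id;       /* payload, for illustration only */
};

struct heap {
    struct entry *slots;  /* slots[0] always holds the earliest expiry */
    size_t        len;
};

static void swap_entries(struct entry *a, struct entry *b)
{
    struct entry tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Move the entry at index i towards the root while it is smaller than its parent. */
static void sift_up(struct heap *h, size_t i)
{
    while (i > 0) {
        size_t parent = (i - 1) / 2;
        if (h->slots[parent].expiry <= h->slots[i].expiry)
            break;
        swap_entries(&h->slots[parent], &h->slots[i]);
        i = parent;
    }
}

/* Move the entry at index i towards the leaves while a child is smaller. */
static void sift_down(struct heap *h, size_t i)
{
    for (;;) {
        size_t left = 2 * i + 1, right = left + 1, min = i;
        if (left < h->len && h->slots[left].expiry < h->slots[min].expiry)
            min = left;
        if (right < h->len && h->slots[right].expiry < h->slots[min].expiry)
            min = right;
        if (min == i)
            break;
        swap_entries(&h->slots[i], &h->slots[min]);
        i = min;
    }
}

/* Change the key of the entry at index i and restore heap order in place. */
static void heap_key_change(struct heap *h, size_t i, int new_expiry)
{
    assert(i < h->len);
    int old_expiry = h->slots[i].expiry;
    h->slots[i].expiry = new_expiry;
    if (new_expiry < old_expiry)
        sift_up(h, i);       /* fires earlier: may need to move up */
    else if (new_expiry > old_expiry)
        sift_down(h, i);     /* fires later: may need to move down */
}
```

A key change only moves the entry along one root-to-leaf path, so it costs a single sift instead of the two that a remove followed by an insert would need. The sketch assumes the caller already knows the entry's index in the array; a real queue would presumably need to track that index somewhere, for example in the entry itself.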
From the kernel trace dumps I also see that on a newly-booted system, the callout queue is pretty small: only about 65 callouts are pending at any given time. After starting X, a window manager, a browser and a few terminals, the queue grows to around 85. With heavy network transfers going on, it can go up to 120 pending callouts. I’ve not been able to create a situation where the number of pending callouts is higher. It seems like such a waste to pre-allocate about 18000 (!! that is what my kern.ncallout sysctl says at the moment) when only a couple hundred are going to be used at any given time.
I also re-tested the binary heap implementation with over 1200 callouts in the queue, and it passes all tests, so I’m reasonably confident there are no remaining bugs in it.
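For readers wondering what such a test can look like, here is a rough sketch of a heap stress check. It is not the test suite used for this project; struct int_heap, int_heap_stress() and the fixed capacity are assumptions made for the example, and the keys are plain ints produced by a small pseudo-random generator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HEAP_CAP 2048   /* arbitrary capacity chosen for this sketch */

struct int_heap {
    int    keys[HEAP_CAP];
    size_t len;
};

static void int_swap(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Append a key and sift it up to its place. */
static void int_heap_push(struct int_heap *h, int key)
{
    assert(h->len < HEAP_CAP);
    size_t i = h->len++;
    h->keys[i] = key;
    while (i > 0 && h->keys[(i - 1) / 2] > h->keys[i]) {
        int_swap(&h->keys[(i - 1) / 2], &h->keys[i]);
        i = (i - 1) / 2;
    }
}

/* Remove and return the smallest key, then sift the moved key down. */
static int int_heap_pop(struct int_heap *h)
{
    assert(h->len > 0);
    int top = h->keys[0];
    h->keys[0] = h->keys[--h->len];
    size_t i = 0;
    for (;;) {
        size_t left = 2 * i + 1, right = left + 1, min = i;
        if (left < h->len && h->keys[left] < h->keys[min])
            min = left;
        if (right < h->len && h->keys[right] < h->keys[min])
            min = right;
        if (min == i)
            break;
        int_swap(&h->keys[i], &h->keys[min]);
        i = min;
    }
    return top;
}

/* Heap property: no key is smaller than its parent. */
static bool int_heap_is_valid(const struct int_heap *h)
{
    for (size_t i = 1; i < h->len; i++)
        if (h->keys[(i - 1) / 2] > h->keys[i])
            return false;
    return true;
}

/* Push n pseudo-random keys, then pop them all, checking consistency throughout. */
static bool int_heap_stress(size_t n)
{
    static struct int_heap h;
    unsigned int seed = 12345u;
    int prev = -1;                 /* all generated keys are >= 0 */

    if (n > HEAP_CAP)
        return false;
    h.len = 0;
    for (size_t i = 0; i < n; i++) {
        seed = seed * 1103515245u + 12345u;   /* simple LCG step */
        int_heap_push(&h, (int)((seed >> 16) & 0x7fff));
        if (!int_heap_is_valid(&h))
            return false;
    }
    while (h.len > 0) {
        int key = int_heap_pop(&h);
        if (key < prev || !int_heap_is_valid(&h))
            return false;          /* keys must come out in non-decreasing order */
        prev = key;
    }
    return true;
}
```

Calling int_heap_stress(1200) exercises a queue of roughly the size mentioned above. Checking the heap property after every single operation is slow, but it catches the state where the queue is briefly inconsistent, which is the kind of bug that would show up as late or missed callouts.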
So barring any significant discoveries or malfunctions, I consider this phase of the project completed. Next week I’ll be adding new callout API functions as well as (still) modifying kern_timeout.c to have my own locking scheme (having a queue implementation on which any operation leaves the queue in a consistent state means we can get away with much finer-grained locking). During this phase I’m hoping to implement some of the suggestions that rwatson has: more info in this mailing list post.
To wrap up, here is another example of why I dislike Perforce:
/usr/home/pvaibhav/p4/calloutapi/src/sys$ p4 open kern/kern_timeout.c
Path '/usr/home/pvaibhav/p4/calloutapi/src/sys/kern/kern_timeout.c' is not under client view '/home/pvaibhav/p4'
/usr/home/pvaibhav/p4/calloutapi/src/sys$ cd ~/p4/calloutapi/src/sys/
~/p4/calloutapi/src/sys$ p4 open kern/kern_timeout.c
//depot/projects/soc2009/calloutapi/src/sys/kern/kern_timeout.c#4 - opened for edit
Lame.