Issues with hardware: LAPIC et al

I did some more “naive” testing: using the system for general and not-so-general tasks, trying to identify any ill effects. The first problem I encountered (which I had been simply enduring for 2+ months) was sluggish system response. Turns out it was because my powerd configuration was set to adaptive instead of hiadaptive. I had always thought the sluggish response was due to bad Intel xorg drivers. Anyway, with that out of the way, and a relatively fast system (graphics as bad as always, however), I resumed testing other parts.

First off I noticed that the periodic blinks of the wireless card LED were sometimes taking longer than usual, and sometimes they were totally stuck in the “on” state. I had never paid attention to this earlier, but with the new kernel and new callout system, I was trying to find every single clue that might point at a problem.

I found something very frustrating: the local APIC, the hardware that delivers the preferred timer interrupts, apparently loses ticks when the processor is in a deep sleep state (C-state >= C2). As if a variable Time Stamp Counter (TSC) losing ticks in slower P-states was not enough trouble on a Pentium M, the LAPIC had to lose ticks too, this time in C-states.

After setting the lowest C-state to only C1 (hw.acpi.cpu.cx_lowest=C1), and the corresponding increase in CPU temperature in this scorching 43 degree C Indian summer, the wireless LED started behaving.

I also made a few minor changes in softclock():

  1. Reverted to using a cached value of cc_monoticks (our per-CPU monotonically increasing tick counter; I could just use the global variable ticks, but I decided to keep the timers per-CPU. This is kinda irrelevant now as this whole infrastructure will be replaced soon). Earlier I had switched to reading cc_monoticks directly, despite the fact that it could be increased by hardclock when it calls callout_tick(). The reasoning was that softclock() expires present and past callouts, so even if cc_monoticks had been increased while we were still processing the callouts, any newly-“missed” callouts would get expired anyway, and this would save us having to reschedule softclock() again on the next HW tick. However, I went back to using the cached value and allowing reschedules, because otherwise softclock() could potentially run for a long time: if it falls behind hardclock, it will keep “chasing” newly-missed callouts. Since cc_lock is held during this process, things get really nasty. So: idea dropped.

  2. I put back the idea of counting how many callouts softclock() has examined during one tick, and temporarily unlocking and relocking cc_lock to “give interrupts a chance.” This is made easier for us because, in general, the callout queue cannot be changed in a way that makes it impossible to resume normal operation after unlocking/relocking. In other words, even if some other thread inserts or removes a callout while we have dropped cc_lock, the callout queue will still be in a consistent state (because each insertion/removal preserves the heap property). There are no links and pointers to manage: internally the heap uses an array and each callout knows its own “selfindex.” So no harm done. However, the important part is that we must extract the head only after we have reacquired the callout queue lock, which brings me to the last change:

  3. I made a mistake when adding the previous feature: I made softclock() extract the head first, and then drop the callout lock temporarily. This was the wrong order, since during the time the lock was dropped, someone could have inserted a callout which ended up as the head, so the head we had already extracted was no longer the “real” head. So I changed it to do this dropping/relocking business first, and only extract the head after we have reacquired the lock. In other words, only operate on a callout while holding the callout queue lock.
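To make the three changes above concrete, here is a minimal user-space sketch in C. It is not the kernel code: struct callout_cpu is heavily simplified, the lock is just a flag, and softclock_sketch(), BUDGET, the pending field and sim_other_thread() are hypothetical names invented for this illustration. It shows a cached tick value, a budget of examined callouts after which the lock is dropped and retaken, and the head being extracted only once the lock is held again.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CALLOUTS 64
#define BUDGET 2                 /* callouts examined before dropping the lock (made-up value) */

struct callout {
    int expire;                  /* tick at which the callout is due */
    int selfindex;               /* position in the heap array, -1 when not queued */
    int fired_seq;               /* 0 = not fired, otherwise the order in which it fired */
};

struct callout_cpu {             /* simplified stand-in, not the kernel structure */
    struct callout *heap[MAX_CALLOUTS];
    int n;
    int monoticks;               /* advanced by hardclock in the real system */
    int locked;                  /* stand-in for cc_lock */
    struct callout *pending;     /* hypothetical: what "another thread" inserts in the window */
};

static void cc_lock(struct callout_cpu *cc)   { assert(!cc->locked); cc->locked = 1; }
static void cc_unlock(struct callout_cpu *cc) { assert(cc->locked);  cc->locked = 0; }

static void heap_swap(struct callout_cpu *cc, int a, int b) {
    struct callout *t = cc->heap[a];
    cc->heap[a] = cc->heap[b];
    cc->heap[b] = t;
    cc->heap[a]->selfindex = a;  /* each callout tracks its own slot */
    cc->heap[b]->selfindex = b;
}

static void heap_insert(struct callout_cpu *cc, struct callout *c) {
    assert(cc->locked && cc->n < MAX_CALLOUTS);
    int i = cc->n++;
    cc->heap[i] = c;
    c->selfindex = i;
    while (i > 0 && cc->heap[(i - 1) / 2]->expire > cc->heap[i]->expire) {
        heap_swap(cc, i, (i - 1) / 2);       /* sift up */
        i = (i - 1) / 2;
    }
}

static struct callout *heap_extract_head(struct callout_cpu *cc) {
    assert(cc->locked && cc->n > 0);         /* only touch the queue under the lock */
    struct callout *head = cc->heap[0];
    if (--cc->n > 0) {
        cc->heap[0] = cc->heap[cc->n];
        cc->heap[0]->selfindex = 0;
        for (int i = 0;;) {                  /* sift down */
            int l = 2 * i + 1, r = l + 1, m = i;
            if (l < cc->n && cc->heap[l]->expire < cc->heap[m]->expire) m = l;
            if (r < cc->n && cc->heap[r]->expire < cc->heap[m]->expire) m = r;
            if (m == i) break;
            heap_swap(cc, i, m);
            i = m;
        }
    }
    head->selfindex = -1;
    return head;
}

/* Simulates another thread inserting a callout while the lock is dropped. */
static void sim_other_thread(struct callout_cpu *cc) {
    if (cc->pending == NULL) return;
    cc_lock(cc);
    heap_insert(cc, cc->pending);
    cc_unlock(cc);
    cc->pending = NULL;
}

/* Expires every callout due at the cached tick; returns how many fired. */
static int softclock_sketch(struct callout_cpu *cc) {
    int expired = 0, examined = 0;
    cc_lock(cc);
    int now = cc->monoticks;     /* change 1: cached value, later ticks wait for the next run */
    for (;;) {
        if (examined >= BUDGET) {            /* change 2: give others a chance */
            cc_unlock(cc);
            sim_other_thread(cc);
            cc_lock(cc);
            examined = 0;
        }
        /* change 3: look at the head only after the lock has been retaken */
        if (cc->n == 0 || cc->heap[0]->expire > now) break;
        struct callout *c = heap_extract_head(cc);
        c->fired_seq = ++expired;
        examined++;
    }
    cc_unlock(cc);
    return expired;
}
```

If a callout with an earlier expiry shows up while the lock is dropped, it becomes the head and is the next one processed, which is exactly the case the extract-first ordering got wrong.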

I’ll continue doing more casual tests, although eventually I have to devise a proper plan to make sure things are working before I can move on and start adding new timer hardware interfaces. I’ve just read up on DTrace and am thinking of ways to use it for testing the callout system (which already implements a couple of DTrace probes). But I’m also planning to continue using ktr(9). As the old saying goes:

Real programmers don’t use DTrace, they use printf(..)

Hehe.
