Xin Li pointed out a race condition in FreeBSD that exists in DragonFly too. Matt Dillon, on further inspection, found a deeper problem. I’m pasting his description here because he’s speaking another language – I don’t even recognize the acronyms he’s using.
Alan Cox responded with a link to the CMU Mach algorithm on this page as an example of a correct implementation.
“Alan, when I looked into this a bit more deeply I think there is an
even more serious problem which still needs to be fixed. The problem
is that the PTE entry on the foreign cpu may be loaded into the
foreign CPU’s TLB. When pmap_remove_all() recurses through removing
the pte from the various page tables it calls loadandclear(pte), but
this is not sufficient to synchronize TLB on the target cpu. The
TLB invalidate done later is far too late (and, in fact, there is no
way an asynch TLB invalidate could ever be used to solve this problem).

I think there is a window of opportunity either (A) where a TLB entry
on the target cpu is dirtied and written back to memory (the page table)
*AFTER* the current cpu has cleared the entry, or (B) where the TLB
entry on the target cpu is valid and the target cpu modifies the page,
but races with our loadandclear() and gets entirely lost, resulting in
a dirty page winding up in the cache queue without either the software
or the hardware realizing that it’s dirty.

(A) is the more serious scenario because PG_ZERO is set on page table
pages which are freed (after having been cleared by the pmap subsystem),
and if this race occurs the page in question will not in fact be
completely zero’d.

I don’t think the TLB snoops the bus. If it doesn’t, then scenario (A)
can occur. If the TLB performs a RMW to update the dirty bit the page
table entry still winds up not being zero’d. If it performs a
writeback then the table entry we just zero’d could be updated by
a foreign cpu issuing a TLB writeback when going from clean->dirty.

The problem appears to exist not only for shared page tables, but also
for private page tables which share the page in question. I don’t see
an easy solution… not even making the page read-only is sufficient
to prevent the race because we still have the TLB writeback problem
while trying to make the page read-only.”
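For anyone else trying to follow along, here is a rough, single-threaded C sketch of scenario (A). It is a toy model, not code from either kernel: the `struct cpu` type, the bit values, and every function except the name `loadandclear` are made up for illustration, and it bakes in the assumption from the quote that the foreign cpu writes its cached entry back without rechecking memory. The second function shows the general idea of a synchronous approach, where the foreign cpu is held still while the entry is cleared. It is only loosely in the spirit of a shootdown and is not meant to reproduce the CMU Mach algorithm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define PG_V 0x01u /* entry is valid */
#define PG_M 0x40u /* page has been modified (dirty) */

typedef uint32_t pte_t;

/* Toy model of a foreign cpu that caches a single translation. */
struct cpu {
    bool  tlb_valid; /* does the TLB hold a copy of the pte? */
    pte_t tlb_entry; /* the cached copy */
    bool  stalled;   /* parked and unable to touch memory */
};

/* Foreign cpu caches the pte if the in-memory entry is valid. */
static void tlb_fill(struct cpu *c, const pte_t *pte)
{
    assert(c != NULL && pte != NULL);
    if (*pte & PG_V) {
        c->tlb_entry = *pte;
        c->tlb_valid = true;
    }
}

/* Current cpu reads the pte and zeroes it (atomic in a real kernel). */
static pte_t loadandclear(pte_t *pte)
{
    pte_t old = *pte;
    *pte = 0;
    return old;
}

/*
 * Foreign cpu writes to the page. On a clean->dirty transition it writes
 * its cached entry back to the page table WITHOUT looking at memory first
 * (the "TLB does not snoop" assumption). Returns false if the write could
 * not happen (cpu stalled, or no valid mapping, i.e. a fault).
 */
static bool cpu_write(struct cpu *c, pte_t *pte)
{
    if (c->stalled)
        return false;
    if (!c->tlb_valid) {
        tlb_fill(c, pte);
        if (!c->tlb_valid)
            return false;
    }
    if (!(c->tlb_entry & PG_M)) {
        c->tlb_entry |= PG_M;
        *pte = c->tlb_entry; /* writeback lands on the cleared entry */
    }
    return true;
}

/* Scenario (A): clear first, invalidate the foreign TLB afterwards. */
static pte_t remove_with_late_invalidate(void)
{
    pte_t pte = PG_V;
    struct cpu foreign = { false, 0, false };

    tlb_fill(&foreign, &pte);   /* foreign cpu caches a clean entry */
    loadandclear(&pte);         /* current cpu zeroes the pte */
    cpu_write(&foreign, &pte);  /* foreign write slips into the window */
    foreign.tlb_valid = false;  /* the invalidate arrives too late */
    return pte;
}

/* Sketch of a synchronous alternative: stall, clear, invalidate, resume. */
static pte_t remove_with_stall(void)
{
    pte_t pte = PG_V;
    struct cpu foreign = { false, 0, false };

    tlb_fill(&foreign, &pte);
    foreign.stalled = true;     /* foreign cpu is held still */
    loadandclear(&pte);
    cpu_write(&foreign, &pte);  /* no effect while stalled */
    foreign.tlb_valid = false;  /* invalidate before resuming */
    foreign.stalled = false;
    cpu_write(&foreign, &pte);  /* TLB miss on a zero pte: faults */
    return pte;
}
```

In this model `remove_with_late_invalidate()` leaves the page table entry non-zero after it was supposedly cleared, which is the "page will not in fact be completely zero’d" problem. `remove_with_stall()` leaves it at zero, at the cost of stopping the other cpu for the duration.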
The link supplied by Alan Cox was copied incorrectly. It has a trailing period that’s breaking the link.
The correct URL of the CMU Mach algorithm is:
http://www.cs.rochester.edu/u/www/courses/456/spring99/lecture/lecture9.html
instead of
http://www.cs.rochester.edu/u/www/courses/456/spring99/lecture/lecture9.html.
Fixed.