[Prism54-devel] [Bug 60] New: 2.6 modprobe oops: request_irq/interrupt race
bugzilla-daemon@mcgrof.com
Wed, 25 Feb 2004 08:45:48 +0000 (UTC)
http://prism54.org/cgi-bin/bugzilla/show_bug.cgi?id=60
Summary: 2.6 modprobe oops: request_irq/interrupt race
Product: prism54
Version: 1.0.2.2
Platform: ia32
OS/Version: Other
Status: NEW
Severity: normal
Priority: P2
Component: Kernel patches
AssignedTo: prism54-devel@prism54.org
ReportedBy: vda@port.imtp.ilyichevsk.odessa.ua
I've reported this on the mailing list before.
Since it happened to me only after I flashed a 'newer' BIOS,
I simply reverted the BIOS then.
Now I am able to trigger it on another box
under specific conditions: VESA VGA framebuffer console,
800x600, 256 colors. In a text mode console, or even a
VESA framebuffer of higher resolution, it does not happen.
But I am pretty confident it's a driver bug, because
all stack traces show basically the same thing happening:
an interrupt from the card striking us right inside
rvalue = request_irq(pdev->irq, &islpci_interrupt,
SA_SHIRQ, ndev->name, priv);
(in islpci_hotplug.c)
I did not write down all stack traces I saw,
but all of them contained this call sequence:
prism54_probe+n
request_irq+n
islpci_interrupt+0 (a parameter on stack)
setup_irq+n
common_interrupt+n
do_IRQ+n
handle_IRQ_event+33/60
islpci_interrupt+293/510
I thought about testing 2.6.3 with a newer snapshot
(I have it already compiled). Since this is a race,
I might stop seeing this not because it's fixed,
but only because the timing has subtly changed. :(
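To illustrate the ordering problem the traces point at, here is a minimal userspace sketch. It is not the driver code: model_request_irq(), model_interrupt(), struct model_priv and both probe functions are hypothetical names made up for this example. It also assumes that the handler depends on state which probe finishes setting up only after request_irq() returns, which the traces suggest but do not prove.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model. None of these names are real kernel or
 * prism54 symbols; they only mimic the ordering seen in the traces. */
typedef void (*model_handler_t)(void *dev_id);

struct model_priv {
        int irq_pending; /* device already asserts its IRQ line */
        int ready;       /* set once probe has finished its setup */
        int handled;     /* interrupts serviced after setup */
        int early;       /* interrupts that arrived before setup */
};

/* Handler that tolerates being called before setup has finished. */
void model_interrupt(void *dev_id)
{
        struct model_priv *priv = dev_id;

        if (!priv->ready) {
                priv->early++;
                return;
        }
        priv->handled++;
}

/* Stand-in for request_irq(): if an interrupt is already pending, the
 * handler runs before this function returns to its caller. */
int model_request_irq(model_handler_t handler, struct model_priv *priv)
{
        assert(handler != NULL);
        if (priv->irq_pending) {
                priv->irq_pending = 0;
                handler(priv);
        }
        return 0;
}

/* Handler installed before setup completes (the assumed racy order). */
void probe_irq_first(struct model_priv *priv)
{
        model_request_irq(model_interrupt, priv);
        priv->ready = 1;
}

/* Alternative order: finish setup, then install the handler. */
void probe_setup_first(struct model_priv *priv)
{
        priv->ready = 1;
        model_request_irq(model_interrupt, priv);
}
```

With a pending interrupt, probe_irq_first() ends with early == 1 and handled == 0, while probe_setup_first() ends with early == 0 and handled == 1. In the model the handler merely counts the early interrupt; a real handler that touches half-initialised state at that point could misbehave in the way the traces suggest.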
For completeness, here is an excerpt from my first mail
with trace copied from screen by hand:
driver_attach+n
bus_add_driver+n
driver_attach+n
bus_match+n
pci_device_probe+n
__pci_device_probe+n
pci_device_probe_static+n
islpci_interrupt+0 (+0: seems like a parameter on stack, not a ret addr)
prism54_probe+n
request_irq+n
islpci_interrupt+0
setup_irq+n
common_interrupt+n
do_IRQ+n (an interrupt struck us?)
handle_IRQ_event+33/60
islpci_interrupt+293/510
islpci_eth_receive+2f6/4b0
netif_rx+a4/190
netif_rx+a4/190 (second one is due to printk("(from %p)", NET_CALLER(skb)).
see below)
Code: 0f 0b ..... (that's a BUG)
Here's how it happened:
islpci_eth.c
============
int
islpci_eth_receive(islpci_private *priv)
{
        ...
        /* the device has written an Ethernet frame in the data area
         * of the sk_buff without updating the structure, do it now */
        index = priv->free_data_rx % ISL38XX_CB_RX_QSIZE;
        size = le16_to_cpu(control_block->rx_data_low[index].size);
        skb = priv->data_low_rx[index];
        ...
        if (discard)
                dev_kfree_skb(skb);
        else
                netif_rx(skb); <=====================
dev.c
=====
int netif_rx(struct sk_buff *skb)
{
        ....
drop:
        __get_cpu_var(netdev_rx_stat).dropped++;
        local_irq_restore(flags);
        kfree_skb(skb); <=======================
        return NET_RX_DROP;
}
skbuff.c
========
void __kfree_skb(struct sk_buff *skb)
{
        if (skb->list) {
                printk(KERN_WARNING "Warning: kfree_skb passed an skb still "
                       "on a list (from %p).\n", NET_CALLER(skb));
                BUG(); <===============
        }
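The check that fires in the skbuff.c excerpt can be modelled in userspace as follows. This is only a sketch of the invariant (a buffer must be unlinked from its list before it is freed). struct model_skb, struct model_list and the model_* functions are hypothetical stand-ins, not the kernel's sk_buff API, and the -1 return merely marks the spot where the excerpt would hit BUG().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a queue and a buffer, not kernel types. */
struct model_list {
        int qlen;                /* number of buffers currently queued */
};

struct model_skb {
        struct model_list *list; /* non-NULL while the buffer is queued */
        int freed;               /* set once the buffer has been released */
};

/* Put the buffer on a queue and remember which queue owns it. */
void model_enqueue(struct model_list *q, struct model_skb *skb)
{
        skb->list = q;
        q->qlen++;
}

/* Take the buffer off its queue, if it is on one. */
void model_unlink(struct model_skb *skb)
{
        if (skb->list != NULL) {
                skb->list->qlen--;
                skb->list = NULL;
        }
}

/* Mirrors the check in the excerpt: refuse to free a buffer that is
 * still on a list. Returns -1 where the kernel code would call BUG(),
 * and 0 after a normal release. */
int model_kfree_skb(struct model_skb *skb)
{
        assert(skb != NULL);
        if (skb->list != NULL)
                return -1;
        skb->freed = 1;
        return 0;
}
```

Calling model_kfree_skb() on a buffer that was enqueued and never unlinked returns -1 and leaves it unfreed; after model_unlink() the same call returns 0. How the skb in the trace ended up with skb->list still set is not something this model explains.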