
Was the NE2000 Really That Bad?

 3 years ago
source link: http://www.os2museum.com/wp/was-the-ne2000-really-that-bad/

Over the last few months I have been on and off digging into the history of early PC networking products, especially Ethernet-based ones. In that context, it is impossible to miss the classic NE2000 adapter with all its offshoots and clones. Especially in the Linux community, the NE2000 seems to have had a rather bad reputation that was in part understandable but in part based on claims that simply make no sense upon closer examination.

A genuine Novell NE2000 card (1992) with DP83901

First let’s recap a bit. In late 1986, National Semiconductor introduced the DP8390/91/92 chip set including a complete Ethernet controller, encoder/decoder, and a transceiver. The DP8390 NIC was a relatively simple design, not as advanced as the Intel 82586 or AMD LANCE, but significantly more capable and cheaper than the low-end offering of the era, the 3Com 3C501 EtherLink.

National Semiconductor (NS) published a reference design labeled DP839EB (EB for Evaluation Board); Application Note AN-479 described the board (see page 134 in the 1988 databook PDF).

The DP839EB was a short 8-bit ISA card with 8Kx8 SRAM, AUI and BNC (aka Cheapernet) connectors, as well as an RJ-45 connector that required an optional StarLAN daughterboard to work (StarLAN can be thought of as a somewhat different and proprietary forerunner of Twisted Pair Ethernet). NS and Novell worked together to support the DP8390 in NetWare, and NS encouraged OEMs to build NetWare-compatible cards.

The DP839EB reference design could use PC 8237-style DMA to transfer data to and from the NIC, but could also use programmed I/O (PIO) instead if DMA was unavailable or undesirable.

In early 1987, two products based on the DP8390 appeared on the market: Western Digital EtherCard/StarCard Plus (WD8003E/WD8003S) and Novell NE1000. Both were similar to the DP839EB reference design in that there was not a lot of additional logic surrounding the DP8390 chip, but neither design was exactly the DP839EB. In late 1987 or in 1988, the two boards were joined by a more complex design based on the DP8390, the 3Com EtherLink II (3C503).

A 1989 3Com EtherLink II (3C503) with DP8390C

The WD8003E used strictly shared memory to move data to and from the card. The entire 8K SRAM was mapped in the host’s address space, and additionally required 32 bytes of I/O port space and an IRQ. The EtherCard Plus list price was initially $399.

In contrast, the NE1000 did not map its onboard SRAM into the host’s address space, but likewise removed the DMA support, and only supported PIO transfers that utilized the DP8390’s Remote DMA feature. That meant the NE1000 only required 32 bytes of I/O port space and an IRQ; no other resources were needed. The NE1000 was priced at $495 at introduction (3Com’s EtherLink 3C501 cost $595 at the time) but almost immediately dropped to $395 (just below the WD8003).
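To make the PIO-through-Remote-DMA idea concrete, here is a minimal sketch of how a host reads a block out of the card's SRAM without ever mapping that SRAM. The register offsets and command bits follow the commonly documented DP8390/NE1000 layout but should be treated as assumptions here, and `toy_card`, `toy_outb`, `toy_inb` and `pio_read` are hypothetical stand-ins that simulate the card in memory instead of touching real I/O ports.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets from the card's I/O base and command bits. These follow the
 * commonly documented DP8390/NE1000 layout; treat them as assumptions. */
enum {
    REG_CMD       = 0x00,  /* command register */
    REG_RSAR_LO   = 0x08,  /* remote start address, low byte */
    REG_RSAR_HI   = 0x09,  /* remote start address, high byte */
    REG_RBCR_LO   = 0x0A,  /* remote byte count, low byte */
    REG_RBCR_HI   = 0x0B,  /* remote byte count, high byte */
    REG_DATA      = 0x10,  /* data port */
    CMD_START     = 0x02,
    CMD_RREAD     = 0x08,  /* start a remote read */
    TOY_SRAM_SIZE = 8192   /* 8K of on-board SRAM, as on the NE1000 */
};

/* Toy stand-in for the card: on-board SRAM plus remote DMA state. */
struct toy_card {
    uint8_t  sram[TOY_SRAM_SIZE];
    uint16_t remote_addr;
    uint16_t remote_count;
};

/* Hypothetical port-write hook; a real driver would use outb(). */
static void toy_outb(struct toy_card *c, uint8_t val, unsigned reg)
{
    switch (reg) {
    case REG_RSAR_LO:
        c->remote_addr = (uint16_t)((c->remote_addr & 0xFF00u) | val);
        break;
    case REG_RSAR_HI:
        c->remote_addr = (uint16_t)((c->remote_addr & 0x00FFu) | ((unsigned)val << 8));
        break;
    case REG_RBCR_LO:
        c->remote_count = (uint16_t)((c->remote_count & 0xFF00u) | val);
        break;
    case REG_RBCR_HI:
        c->remote_count = (uint16_t)((c->remote_count & 0x00FFu) | ((unsigned)val << 8));
        break;
    default:
        break;  /* command writes need no modelling in this toy */
    }
}

/* Hypothetical port-read hook: each data port read returns the next
 * SRAM byte and advances the card's remote DMA pointer. */
static uint8_t toy_inb(struct toy_card *c, unsigned reg)
{
    assert(reg == REG_DATA && c->remote_count > 0);
    c->remote_count--;
    return c->sram[c->remote_addr++ % TOY_SRAM_SIZE];
}

/* Copy len bytes from card address src into host memory using PIO only. */
static void pio_read(struct toy_card *c, uint16_t src, uint8_t *dst, uint16_t len)
{
    toy_outb(c, (uint8_t)(len & 0xFF), REG_RBCR_LO);
    toy_outb(c, (uint8_t)(len >> 8),   REG_RBCR_HI);
    toy_outb(c, (uint8_t)(src & 0xFF), REG_RSAR_LO);
    toy_outb(c, (uint8_t)(src >> 8),   REG_RSAR_HI);
    toy_outb(c, CMD_RREAD | CMD_START, REG_CMD);
    for (uint16_t i = 0; i < len; i++)
        dst[i] = toy_inb(c, REG_DATA);
}
```

The point of the sketch is that the host only ever touches a handful of I/O ports: it tells the chip where to start and how much to move, then pulls the bytes through the data port one at a time. No host memory window and no 8237 DMA channel are involved.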

The comparison between the WD8003 and NE1000 is interesting. WD opted to use shared RAM which is faster but significantly more problematic to configure, and that was especially the case with the coming wave of 386 memory managers. Novell went in the opposite direction, choosing somewhat slower PIO but completely avoiding any configuration issues with shared memory.

Both Novell and WD decided to drop DMA support, probably because it avoided yet another source of configuration conflicts and because especially on PC/AT class systems, DMA was slower than either PIO or memory anyway.

It is important to keep in mind that the existing competition for the WD8003E and NE1000 weren’t fancy adapters like the 3Com 3C505 EtherLink Plus or the Exos 205T but, first and foremost, the cheapest Ethernet option available, the 3Com 3C501 EtherLink. And both the NE1000 and the WD8003E beat the 3C501 hands down, because they had a much bigger packet buffer (8K vs. 2K) and did not have the awful 3C501 limitation of having to switch between mutually exclusive transmit, receive, and host access modes.

In 1988, the NE2000 appeared. It was essentially a 16-bit version of the NE1000 with support for 16-bit AT bus (but still capable of working in 8-bit slots) and two SRAMs in an 8Kx16 configuration. This doubled the onboard memory capacity and enabled both the DP8390’s internal bus and the NE2000’s external ISA bus connection to use 16-bit transfers, significantly improving the speed at which the host could communicate with the adapter.

Western Digital released a similarly upgraded WD8013E (EtherCard Plus 16) with a 16-bit ISA interface and 16KB of onboard RAM.

On the part of Novell, the motivation was clearly not to make money on hardware but sell more software. In 1991, Novell let Anthem/Eagle take over the network card manufacturing and distribution; after all, the hardware business was something Novell wanted to get out of, not into. The main purpose of the NE1000 and NE2000 was to drive the prices of networking hardware down, and it did just that.

For anyone building a LAN in the late 1980s, putting expensive $800 “intelligent” adapters into client machines made zero sense. And choosing between a $500 3C501 EtherLink and a $400 NE1000 really was not much of a choice, and it’s no coincidence that the 3C501 vanished from the market pretty quickly, with the DP8390-based 3C503 taking its place.

A 1988 Taiwanese NE1000 clone (CNET LKT-N100E)

As a side effect of its low price and reasonable performance, the NE2000 became the mainstay of PC LANs in the late 1980s and early 1990s and was used in many NetWare servers and countless client machines (as evidenced by numerous contemporary software reviews). It was supported by just about every PC networking package, and that in turn encouraged a lot of cloning. I believe that was the real cause of NE2000 dislike among Linux developers and users.

Curious Claims

Let’s see if we can break down the mixture of unsourced claims and outright nonsense that made it to everyone’s most reliable source of facts, Wikipedia: “In order to create these [NE1000 cards] at minimal R&D, engineering and production costs, Novell simply implemented, almost verbatim, a prototype design created by National Semiconductor using the 8390 Ethernet chip. National Semiconductor, for its part, had no qualms about the use of the design; the use of National Semiconductor chips made the proposal almost pure profit. However, since the design had been intended only as a proof-of-concept prototype, it implemented bare-minimum functionality: PIO was used instead of DMA, no buffering was provided and no provision was made for the use of a transceiver.”

The first sentence is more or less accurate, but makes it sound like Novell found a half-baked prototype design lying around, stripped it of anything useful, and started shoving it down the throats of unsuspecting users. The reality is that NS and Novell clearly worked together on software support before even the NE1000 was released, and that DP839EB “prototype design” was closer to what’s nowadays called a “reference design” where OEMs are free to make modifications but there’s nothing fundamentally wrong with the reference design as it is.

Now let’s take a look at the ostensibly technical part of the claims: “[S]ince the design had been intended only as a proof-of-concept prototype, it implemented bare-minimum functionality: PIO was used instead of DMA, no buffering was provided and no provision was made for the use of a transceiver.”

The suggestion is clearly that the DP839EB design was so lame that it didn’t even support DMA, except that’s not even true: Anyone can look at NS’s Application Note AN-479, DP839EB Network Evaluation Board, and see that it did support DMA. In fact even the initial revision of the NE1000 supported DMA, or at least had jumpers to configure it.

And yet both Novell and Western Digital took DMA out because it had questionable benefits and made configuration more difficult. Indeed PIO was used instead of DMA… because no one wanted to deal with DMA.

As for the claim that “no buffering was provided”, it is quite mystifying. The NE1000 had 8K of onboard SRAM and the DP8390 had additional on-chip FIFO. If that does not count as buffering, what does? It’s difficult to not classify that claim as pure nonsense.

The last bit, “no provision was made for the use of a transceiver”, is similarly suspect. Novell’s 1989 NetWare Installation Supplement describes the settings of two NE1000 models, Assy. #950-054401 and a newer Assy. #810-160-001, but they both (as well as the NE2000 described in the same manual) have BNC and AUI connectors, which means there’s one onboard transceiver for BNC and a way to connect an external AUI transceiver. The DP839EB likewise had both BNC and AUI connectors. It is theoretically conceivable that there was some completely unknown early AUI-less variant of the NE1000, but it is vastly more likely that the claim is just plain wrong like the other nonsense in the Wikipedia article.

What it’s Really About

A modified version of the “NE2000 is horrible” claims can be found e.g. here. Much like the talk about the 3C501 being awful makes sense as soon as one starts pretending that the 3C501 is a design from 1992 and not 1982, the criticisms of the NE2000 make much more sense if one pretends that Novell tried to sell it as a high-performance Ethernet adapter in 1995.

Again there are highly dubious claims such as that the NE1000/NE2000 had “no method for selecting a transceiver”, which is only true if the jumper block on the card (the standard method at the time) does not count.

The screed also requires one to believe that there were so many Taiwanese NE2000 clones because the NE2000 was the worst design ever and… that’s why everyone wanted one. I suppose that logic makes some sense in a world where everyone not running Linux is by definition an idiot, because what other reason could there possibly be for not running Linux?

But then we get to the real gist of the hate for the NE2000: “Proprietary-OS users didn’t care about those incompatibilities, since they used custom driver preloads in their hard drives as delivered by the OEM, or used custom driver diskettes. Linux/BSD users, by contrast, tended to have a rough time since they tended to (rather naively) assume that an NE2000 clone should routinely work with the standard ne.c + 8390.c driver.”

That actually makes a lot of sense. There were definitely many NE2000 more-or-less compatibles, and many of them used clones of the DP8390 chip rather than the original. And many of those clone chips were different enough that code written for the DP8390 might break. The first of those clones was probably Western Digital’s WD83C690 and it already introduced several incompatibilities that happened to not matter to WD’s own drivers.

Even if a clone card used a genuine DP8390 chip, its PROM or I/O port behavior could be just different enough that a driver written for the NE2000 might not work.

Curiously, even among the Linux folk, there was disagreement on the merit of PCI-based NE2000 clones. While some said that “PCI NE2000 clones are a bad idea”, others considered them “good news”. The first point of view was based on the fact that there were many designs much better suited to PCI than the DP8390, the second argued that unlike ISA-based NE2000 clones, the PCI ones at least were likely to work with Linux. There’s a lot of truth to both of those viewpoints.

There are other gems out there, like this page which claims that “Ne2000 is not technically a card, it is a standard that several implementors follow”. If you look at it like that, then the Sound Blaster wasn’t a card either, and the IBM PC was just a standard that several implementors including IBM happened to follow by sheer coincidence. Which only makes any sense if one decides to completely ignore what was the cause and what was the effect.

There are also other views, such as this one here: “Like many NatSemi DP8390 based NICs (WD8XXX and many others) [the NE2000] performed decently well with FreeBSD for those times, and it was widely available and quite cheap. Basically, Novell kick-started the PC networking era by throwing that thing out to the masses, essentially at cost. The 8390 and its clones were the Realteks of the ISA era, and did a way better job in that role than Realtek did ever since.” Even though I’m not entirely sure if it’s meant to be praise or criticism, it’s not wrong.

Literary Criticism

Mostly out of curiosity I started reading the source code of the Linux NE2000 driver. The heart of it is really a driver for the National Semiconductor DP8390 chip which is shared by drivers for more or less all cards based on the DP8390 and its clones: Novell NE2000, Western Digital WD8003, 3Com 3C503, and numerous others.

What I read in the source code was… not terribly confidence-inspiring. I found several problems that are fairly obvious if one looks for them but are also easy enough to overlook. The identified issues all relate to the receive path of the DP8390 driver, which—given the chip architecture—is significantly more tricky to implement than the transmit path.

For whatever reason, the Linux driver does not do things by the book, and there is a book. The Linux driver acknowledges receive interrupts as the last thing it does, which is simply backwards. It is important to keep in mind that the driver code which reads received packets from the DP8390’s ring buffer inevitably races the chip which may be receiving new packets at the same time. For that reason, it is necessary to acknowledge (clear) interrupts first, and then remove all received packets. The hardware works in the opposite order and first updates all of its state and writes to memory, and then raises an interrupt. That way, there could be a spurious interrupt for an already-processed packet but nothing will be missed.

The Linux driver does things in the wrong order and risks that the hardware receives a new packet and sets the interrupt status register in the window between the Linux driver removing previously received packets and clearing the interrupt status register. If that happens, the receive interrupt will be lost and Linux will not be aware that another packet was received. In many cases, another packet will arrive very soon and “fix” things by triggering another interrupt. But if the overlooked packet happens to be the last in a sequence, it will be stuck in limbo until something like a retransmission causes the driver to notice it. Depending on the upper layer protocols, that might only cause a slight delay or cause significant confusion. TCP/IP is quite sensitive to lost or duplicated packets.
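The window is easy to show with a toy model. The sketch below is not the Linux code and not real 8390 register access; `toy_nic` and the two handlers are hypothetical names, and the `late_arrivals` parameter simply forces packets to arrive between the handler's two steps so the race becomes deterministic.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a NIC: a count of packets waiting in the receive ring
 * and a latched "packet received" bit in the interrupt status register. */
struct toy_nic {
    int  ring_packets;  /* packets received but not yet read by the host */
    bool isr_rx;        /* latched receive interrupt status */
};

/* Hardware side: store the packet first, then raise the interrupt. */
static void nic_receive(struct toy_nic *nic)
{
    nic->ring_packets++;
    nic->isr_rx = true;
}

/* Driver side: read every packet currently in the ring. */
static void drain_ring(struct toy_nic *nic)
{
    nic->ring_packets = 0;
}

/* By-the-book order: acknowledge first, then drain.
 * late_arrivals packets show up between the two steps. */
static void handler_ack_first(struct toy_nic *nic, int late_arrivals)
{
    assert(late_arrivals >= 0);
    nic->isr_rx = false;                 /* acknowledge */
    for (int i = 0; i < late_arrivals; i++)
        nic_receive(nic);                /* packet lands in the window */
    drain_ring(nic);                     /* late packet is still picked up */
}

/* Backwards order: drain first, acknowledge last. */
static void handler_ack_last(struct toy_nic *nic, int late_arrivals)
{
    assert(late_arrivals >= 0);
    drain_ring(nic);
    for (int i = 0; i < late_arrivals; i++)
        nic_receive(nic);                /* packet lands in the window */
    nic->isr_rx = false;                 /* wipes out its notification */
}

/* A packet is stranded when it sits in the ring with no interrupt pending. */
static bool packet_stranded(const struct toy_nic *nic)
{
    return nic->ring_packets > 0 && !nic->isr_rx;
}
```

With one packet arriving in the window, the acknowledge-first handler ends with an empty ring and the status bit still set, which is the harmless spurious interrupt. The acknowledge-last handler ends with one packet in the ring and the status bit clear, so nothing will prompt the driver to look again.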

There is a related bug in the Linux driver in that it only receives at most 9 packets per interrupt and then simply declares that it’s done receiving and clears the receive interrupt status. If more than 9 packets happen to have been queued up and no further packets arrive, the driver might again fail to notice one or more already received packets. It is not at all clear to me what this strange limit on the number of packets processed at a time was meant to solve, and the source code offers no hint either.
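The effect of such a cap can be sketched the same way. Again this is a toy model with hypothetical names (`toy_ring`, `capped_receive`), not the driver's actual loop; the cap is passed in as a parameter so the behavior is easy to check.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy receive ring: packets queued by the chip plus the latched
 * receive bit of the interrupt status register. */
struct toy_ring {
    int  queued;
    bool isr_rx;
};

/* Read at most max_per_irq packets, then acknowledge the receive
 * interrupt whether or not the ring is empty. Returns packets read. */
static int capped_receive(struct toy_ring *r, int max_per_irq)
{
    int done = 0;

    assert(max_per_irq > 0);
    while (r->queued > 0 && done < max_per_irq) {
        r->queued--;
        done++;
    }
    r->isr_rx = false;  /* acknowledged even if packets remain */
    return done;
}
```

Starting with 12 queued packets and a cap of 9, the function reads 9, leaves 3 in the ring, and clears the status bit. Unless more traffic arrives, those 3 packets have nothing left to announce them.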

There was yet another somewhat related bug that did actually get fixed. In Linux 1.2, one could find the following code at the end of ei_receive():

    /* Bug alert!  Reset ENISR_OVER to avoid spurious overruns! */
    outb_p(ENISR_RX+ENISR_RX_ERR+ENISR_OVER, e8390_base+EN0_ISR);

That was changed in Linux 1.3.47 to the following (excerpted from Linux 5.11):

    /* We used to also ack ENISR_OVER here, but that would sometimes
       mask a real overrun, leaving the 8390 in a stopped state
       with rec'vr off. */
    ei_outb_p(ENISR_RX+ENISR_RX_ERR, e8390_base+EN0_ISR);

Well duh—if you acknowledge interrupts without handling them, bad things are bound to happen. If the overflow warning interrupt is acknowledged for no good reason, it is guaranteed that under some conditions the real interrupt will be missed (and since it will prevent further receive interrupts from happening, the receive logic will be stuck). It is also fascinating how the comment went from “Reset ENISR_OVER to avoid spurious overruns” to more or less the exact opposite.

The issue with clearing interrupts only after working through the receive ring (instead of before) was never fixed and survives in Linux 5.11. Again, the correct recipe was given in the Writing Drivers for the DP8390 NIC Family of Ethernet Controllers Application Note, but Linux chose to ignore that.

In contrast, the NetBSD 0.9 if_ed driver (1993), for example, does not have these problems.

It is difficult to judge how often the deficiencies in the Linux driver described above caused user-visible problems. I am certain they did cause trouble sometimes, but perhaps rarely enough to not make a real difference. And they can’t have caused major problems if they went unfixed for so long.

But it does make one wonder if perhaps the bad name the NE2000 got among Linux users was caused in part by sub-optimal Linux drivers for it.

