Friday Fun Thread for November 11, 2022

Be advised: this thread is not for serious in-depth discussion of weighty topics (we have a link for that), and it is not for anything Culture War related. This thread is for Fun. You got jokes? Share 'em. You got silly questions? Ask 'em.

Aaaaaaaaaa! I fucking did it!

Last week I mentioned there was this really pernicious issue on the NuXT I ordered. If the VGA card boots in monochrome mode, the system appears to fail to boot. You'll see a cursor zipping across the screen, but no text. Then the XT-IDE BIOS prints its messages, and then seemingly nothing. Except it's not nothing. Below all the empty space, the cursor sits in the 5th position on the bottom line of the screen, blinking. If you type, you can see the blinking cursor move, but no characters appear.

A lot of forum posts describe this behavior as failing to boot. But I wagered the system had booted and was responsive, and that something was corrupted in how it output to the display. My first question was, why do the XT-IDE BIOS messages show up, and nothing else? I looked through the source code for that project, which was quite labyrinthine, let me tell you. Layers upon layers upon layers of defines obfuscating what the code actually does. Eventually I found that it's not using any BIOS functions to print to the screen. It's reading the location it should print to from the BIOS Data Area, and just directly printing characters there. So I know that works.

The 8088 BIOS by Sergey Kiselev is also open source. Looking through its printing functions, it mostly uses Int 0x10, Function 0x0e, teletype output. So why might it not be working? I wrote a utility to dump the BIOS Data Area, ran it in a good state and in a bad state, and compared the results. Unfortunately, a serious case of survivorship bias confounded these efforts, more on that later.
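For anyone wanting to repeat the comparison step, here is a minimal Python sketch of diffing two BIOS Data Area dumps. It is not the utility I actually wrote. The function name `diff_dumps` and the sample bytes are made up for illustration, and the only real-world detail it leans on is that the BDA starts at linear address 0x400 and keeps the current video mode at 0040:0049.

```python
def diff_dumps(good: bytes, bad: bytes, base: int = 0x0400):
    """Return (linear_address, good_byte, bad_byte) wherever two dumps differ.

    `base` is the linear address of the first dumped byte; 0x400 is the
    start of the BIOS Data Area (segment 0040h).
    """
    if len(good) != len(bad):
        raise ValueError("dumps must be the same length")
    return [
        (base + offset, g, b)
        for offset, (g, b) in enumerate(zip(good, bad))
        if g != b
    ]


# Hypothetical 256-byte dumps, NOT real captures. The only difference is the
# current video mode byte at 0040:0049 (mode 3 versus mode 7).
good_dump = bytearray(256)
bad_dump = bytearray(256)
good_dump[0x49] = 0x03
bad_dump[0x49] = 0x07

for address, g, b in diff_dumps(bytes(good_dump), bytes(bad_dump)):
    print(f"{address:05X}: good={g:02X} bad={b:02X}")
```

A diff like this only shows where the two states disagree. It says nothing about bytes that are wrong in both dumps, which is exactly the trap I fell into.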

I found some utilities that could kick the system back into a good state. But looking over their source code, nothing obviously seemed to directly address the problem. The values they changed in the BIOS Data Area did not look misconfigured in the good state versus the bad state. And yet they still worked. So another dead end. Something in the VGA BIOS functions they called was rescuing the system from the bad state. Which meant venturing into the VGA BIOS.

But first, an aside about the good state versus the bad state. The bad state was the VGA card detecting a monochrome monitor and initializing the system in video mode 7, 80x25 monochrome. The good state was the VGA card detecting a color monitor and initializing the system in video mode 3, 80x25 16 color. So we're dealing with different detected capabilities and different video modes. I probably should have started off comparing the BDA in Mode 7 working versus Mode 7 not working, but alas, it took me longer to get there. On to the BIOS.

I wrote a utility to dump the VGA ROM, as well as reinitialize the card by doing a far call to c000:0003, which is apparently the entry point for initializing a VGA card using its ROM. That worked, and I could re-initialize the card as much as I wanted, and doing this also resulted in a functional state. Even in monochrome mode, which it would randomly boot into. The random monochrome boot is a hardware issue, as near as I can tell from my research. The Trident 9000i chip, and many old VGA cards, expect a certain pin on the VGA cable to either be grounded or open for a color monitor versus a monochrome monitor. Newer monitors use that pin as a data pin. It just compounds the bug in the 8088 BIOS where it's borked with a VGA card in monochrome mode.

At this point, I felt like I had little choice but to begin stepping through the disassembled VGA BIOS I had dumped. I found an online disassembler which worked rather well. After several days, I eventually discovered it's setting, and then looking at, the equipment list in the BIOS Data Area, 0040:0010. It sets bits 5 and 6 depending on the display properties it detects. In monochrome mode, the byte gets ORed with 0x30. It then tests that byte against 0x30 every time it interacts with video memory to determine if it should be writing to segment 0xb800 or 0xb000. 0xb000 is only used in monochrome mode 7. Furthermore, zeroing in on the suspicious bits, I can observe them being set to 0b10 or 0b11 when I change to video mode 3 or 7 using utilities, assisting the VGA BIOS in determining its active segment of memory.
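The check I'm describing boils down to something like the Python sketch below. This is a paraphrase of what I read in the disassembly, not the Trident BIOS code itself, and the names `EQUIP_VIDEO_MASK` and `text_segment` are mine.

```python
EQUIP_VIDEO_MASK = 0x30  # the two video bits in the equipment byte at 0040:0010


def text_segment(equipment_byte: int) -> int:
    """Pick the text buffer segment from the equipment byte.

    Mirrors the test described above: both video bits set means
    monochrome (segment B000h), anything else falls through to color
    (segment B800h).
    """
    if (equipment_byte & EQUIP_VIDEO_MASK) == EQUIP_VIDEO_MASK:
        return 0xB000
    return 0xB800


# Print the chosen segment for each value of the two video bits.
for bits in (0b00, 0b01, 0b10, 0b11):
    print(f"video bits {bits:02b} -> segment {text_segment(bits << 4):04X}")
```

Note that only the all-ones case is special. Every other value, including cleared bits, is treated as color.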

Sure enough, knowing this I double-checked the dumps of the BIOS Data Area in the good state versus the bad state, and realized the equipment byte was misconfigured. Bits 5 and 6 are cleared in both, when they should be set to 0b11 in monochrome mode. I came to find out that even in the functional color mode it's misconfigured, and should be 0b10. It's only by good fortune that things worked at all with that particular VGA BIOS, since it was only checking for 0b11 (at least that I saw), and then resorting to default behaviors for color mode outside of monochrome mode. If it were ever explicitly checking for 0b10 (80x25 color) or 0b01 (40x25 color), I expect more display functions would be going off the rails. At long last I found my culprit.
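To make the bit values concrete, here is a small Python sketch that decodes the two video bits out of the equipment byte. The table uses the meanings given in this post, with 0b00 being the "EGA or later" value that came up in the argument. The helper name `initial_video_mode` and the example byte values are hypothetical, not taken from my dumps.

```python
INITIAL_VIDEO_MODES = {
    0b00: "EGA or later (no initial mode recorded)",
    0b01: "40x25 color",
    0b10: "80x25 color",
    0b11: "80x25 monochrome",
}


def initial_video_mode(equipment_byte: int) -> str:
    """Decode the two video bits (mask 0x30) of the byte at 0040:0010."""
    return INITIAL_VIDEO_MODES[(equipment_byte >> 4) & 0b11]


# Hypothetical equipment bytes; only the 0x30 bits matter for this decode.
for value in (0x0D, 0x2D, 0x3D):
    print(f"{value:02X}: {initial_video_mode(value)}")
```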

Back to the 8088 BIOS. After initializing the VGA card, and the card detecting a monochrome monitor, it then wipes those bits, believing the VGA BIOS does not need them. The author asserts only cards without a BIOS, CGA or MDA, should be utilizing them. Then it immediately attempts and fails to print the copyright message to the screen. Which helps a lot in really conclusively isolating the problem, since there is literally only that single instruction clearing bits in the BDA between initializing the VGA card and attempting to print to the screen.
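Here is a toy Python model of that boot sequence, just to show why clearing the bits only bites in monochrome mode. Everything in it is a simplification under my own assumptions: the function names are invented, the VGA init is reduced to setting the two video bits the way I observed, and "visible" just means the BIOS writes to the same segment the card is displaying from.

```python
VIDEO_BITS = 0x30  # video bits in the equipment byte at 0040:0010


def vga_init(equip: int, monochrome: bool) -> int:
    """Set the video bits as observed: 0b11 for mono, 0b10 for 80x25 color."""
    return (equip & ~VIDEO_BITS & 0xFF) | (0x30 if monochrome else 0x20)


def system_bios_clear(equip: int) -> int:
    """The single instruction in question: wipe the video bits."""
    return equip & ~VIDEO_BITS & 0xFF


def write_segment(equip: int) -> int:
    """Segment the VGA BIOS writes text to, per its 0x30 test."""
    return 0xB000 if (equip & VIDEO_BITS) == VIDEO_BITS else 0xB800


def text_is_visible(monochrome: bool, clear_bits: bool) -> bool:
    equip = vga_init(0x00, monochrome)
    if clear_bits:
        equip = system_bios_clear(equip)
    displayed = 0xB000 if monochrome else 0xB800  # mode 7 versus mode 3
    return write_segment(equip) == displayed


for mono in (False, True):
    for clear in (False, True):
        print(f"mono={mono} clear_bits={clear} visible={text_is_visible(mono, clear)}")
```

In this model the color case survives the clearing only because the fallback segment happens to be the right one, which matches the "good fortune" described above.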

After reading the innards of the VGA ROM, the problem here was plain as day. A single semicolon commenting out the bit-clearing instruction fixes the problem. I flashed the BIOS on my system, and the problems completely vanished with no discernible side effects. Then began my trials in getting people to believe me.

I haven't received a response on GitHub, but on Vogons the reaction has been hostile. People "explaining" to me that everyone knows old VGA cards randomly boot into monochrome mode. Yeah, I gathered that, but that doesn't explain why the 8088 BIOS fails to print to the screen when monochrome mode happens! One guy went hard that the relevant equipment bits must be set to 00, the "Pink Shirt Book" says so. I went and read the relevant sections, and it doesn't say so. He went on to insist that the initial video mode bits in the equipment list should never change like I observe, the "Pink Shirt Book" says so. The book actually says software can and will change it all the time. Still, books aside, what do other systems do?

I tested 3 other systems: a Pentium 233 with a Riva 128, a K6-2 with a GeForce 2 MX, and a Pentium III with a GeForce 2 GTS. All of them initialized the video mode bits to 0b10 for 80x25 Color Mode, not 0b00 for EGA+. And all of them changed those bits to 0b11 for 80x25 Monochrome mode when I switched to Mode 7. I also perused the source code for another open source IBM PC/XT BIOS, GLaBIOS, and while I haven't flashed a system with it to test empirically, the code appears to initialize the VGA ROM and then leave the initial video mode bits completely alone. In fact, it even tests if they are 0b00, and considers that an error!

I guess it's entirely possible I'm still wrong somehow. But I increasingly doubt it.

Congrats, man! Sounds like some solid troubleshooting work, I'm glad you were able to get it!