Monday, September 14, 2020

33209: Rocks -- Moving Ahead

Chapter 14.1 Rocks -- Bit Multiply

Chapter 14.2: Rocks -- Moving Ahead

[Please pardon the layout change. Google is being the 800 pound prima donna and making all blogspot users use a buggy blog editor now.]

"Are you sure that there's nothing in what you and Julia just went over that you won't be claiming as IP?" Bill wrinkled his forehead.

I shrugged. "Trying to claim IP on this kind of thing is only one step beyond trying to claim IP on binary addition. No real circuitry to base claims on, nothing but ideas and math."

(We won't mention a very famous, wealthy corporation that did, in fact, attempt patent claims on binary addition in an ALU, buried in its claims concerning a programming language and programming environment they developed and sold. We also won't dwell here on the fact that, once upon a time, ideas, math, and algorithms were considered well outside the domain of patents in the USA.)

"Would it be possible," I asked, pressing my own agenda, "to reduce the cycle count to one on reads and two on writes in the direct page RAM, without blowing the transistor budget on a 6805 or 6801? That alone could better than double the speed of software multiply and divide."

There was a bit of uncomfortable chuckling and clearing of throats.

"No?"

Several engineers looked at Pete. He shrugged.

Tobias tilted his head apologetically. "We'd have to fix the prefetch/decode circuit so it's a real pipeline of depth one."

"It's not a real pipeline?"

"The eight-bit designs don't have a place to keep the instruction in its partial and fully-decoded states, so we go back and redo the prefetch if we don't use it immediately."

"Oh."

"And then we'd have to test it. Testing is what we get stuck on budgeting time for. You should talk with your brother about that."

"Denny's not in charge of test, is he?"

"No, but he could tell you something about the backlog."

Our Bob spoke up, "Could interns help with the grunt work?"

Motorola's Bob exchanged glances with Bill, then turned to Jesse. "Should we look at that?"

"Maybe we should," Jesse frowned. "I'll discuss it with my group on Monday, see if we can separate something out that a non-engineering tech could handle."

"Remember that these guys seem to have a bit more of a handle on the tech than our usual crop of interns."

"We'll take that into consideration."

I ventured a bit further. "I'm not just thinking of fast direct-page RAM, though. The 6809 and the 68000 have enough index registers to support separating the parameter stack from the return pointer stack, and that means one might profitably attach a hysteric cache to both pointers, with the appropriate control signals."

That got me looks of confusion and amusement.

"I mean a cache that tracks the stack pointer with hysteresis." I borrowed Julia's notepad again and sketched out something like this:

Ms. Philips reached over and lifted the notepad and waved it at me. "I am sure this is IP."

"It's just a lousy diagram of spill-fill cache tied to a stack pointer. Calling it hysteric is a bit of a pun, is all. Not even a really good pun, at that."

Jesse started chuckling. "If it works," he commented, "it'd be more appropriate to call it an anti-histrionic stack cache." 

A number of other engineers echoed his chuckles of appreciation.

Ms. Philips and Ms. Steward put their heads together and started working on something. Bill and Motorola's Bob refrained from comment, keeping an unobtrusive eye on what they were working out.

I added, "If such a cache could also be accessed in single-cycle reads and two-cycle writes, local variables would be almost as good as registers."
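
(For readers who want something more concrete than a pun, here is a rough sketch of the idea in Python. It is a toy model, not the diagram on the notepad: the class name, the cache size of eight cells, and the batch size of four are all made up for illustration. The hysteresis is the gap between the two thresholds. Cells spill to slow memory only when the cache is completely full, and fill back only when it is completely empty, so a stack pointer that wobbles around one depth does not keep hitting slow memory.)

```python
class SpillFillStack:
    """Toy model of a stack whose top cells live in a small fast cache.

    All sizes are hypothetical; a real design would pick them from
    measurements of how deep typical call chains wobble.
    """

    def __init__(self, cache_size=8, batch=4):
        self.cache = []          # fast storage, top of stack at the end
        self.memory = []         # slow backing store, deeper cells
        self.cache_size = cache_size
        self.batch = batch       # cells moved per spill or fill
        self.slow_transfers = 0  # how often slow memory was touched

    def push(self, value):
        if len(self.cache) == self.cache_size:
            # Spill the oldest batch, leaving slack so the next few
            # pushes and pops stay entirely inside the cache.
            self.memory.extend(self.cache[:self.batch])
            del self.cache[:self.batch]
            self.slow_transfers += 1
        self.cache.append(value)

    def pop(self):
        if not self.cache:
            if not self.memory:
                raise IndexError("pop from empty stack")
            # Fill a batch back in, preserving order (top at the end).
            self.cache = self.memory[-self.batch:]
            del self.memory[-self.batch:]
            self.slow_transfers += 1
        return self.cache.pop()


stack = SpillFillStack()
for i in range(10):
    stack.push(i)               # the ninth push forces one spill
for _ in range(20):
    stack.push(stack.pop())     # wobbling at one depth costs nothing
print("slow transfers so far:", stack.slow_transfers)
```

In this model the twenty push/pop pairs after the first spill never reach slow memory, which is the whole point of the slack.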

Bill leaned forward. "We've taken a lot of your time on this blue-sky brainstorming, but Bob and I wanted to get your opinion on something."

I let the amusing, but perhaps meaningful mixed metaphor pass and nodded.

"If you were designing a mass-market personal computer using an existing CPU, would you use Intel's 8086 or 8088?"

It was my turn to be confused. "Maybe I should give it more careful consideration, but my impression is that the instruction set is an improvement over the 8080, but not much. And it has those sloppy segments. No. I'd use the 6809 for its instruction set, addressing modes, and register set before I'd use the 8088, even though the 6809 is a bit slower on multiplies and a lot slower on divides, and, for a PC, would require bank-switching or the 6829 MMU.

(PC? I had become accustomed to the abbreviation in Japan while I was there as a missionary. How quickly people forgot, in our real world, that there was more than half a decade of PCs before the IBM PC.)

"And I'd use the 68000 over the 8086 even though the 68000 costs a bit more, because the 8086 just doesn't make sense as a design. It requires 16 bit wide memory, but it still gives only 16 bit addresses unless you play bad programming practices games with your code. Sloppy segments are a bug generator and a security booby-trap."

Bob nodded. "Are you sure your antipathies are not colored by family loyalties? The tech industry doesn't forgive misplaced family loyalties."

"Family loyalties may induce some of the heat, but, really, if they want to map 16-bit logical addresses into a 20-bit physical address space, they should make the segments fully 20 bits wide. 24 or 32 bits wide would make even more sense, even if the top four or twelve bits aren't brought out of the package or don't even physically exist. And the segments should have limit registers, as well, if they're going to mean anything besides crude bank-switching with the improvement of being able to tie specific banks of memory to specific index registers, including the instruction pointer. Half-baked MMU."
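
(A sketch of the arithmetic I was complaining about, in Python, for anyone who has not had the pleasure. The function name is my own invention. The shift-by-four-and-add rule and the wrap at one megabyte are how the 8086 forms its addresses.)

```python
def physical_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address."""
    segment &= 0xFFFF
    offset &= 0xFFFF
    # The segment is shifted left four bits, so segments start every
    # 16 bytes and overlap one another almost completely.
    return ((segment << 4) + offset) & 0xFFFFF  # wraps at one megabyte


# Two different segment:offset pairs that name the same byte.
a = physical_address(0x1234, 0x0010)
b = physical_address(0x1235, 0x0000)
print(hex(a), hex(b))  # 0x12350 0x12350

# There is no limit register, so any offset up to 0xFFFF is accepted.
# A stray pointer can land almost 64 KB past the segment base unchallenged.
print(hex(physical_address(0x1234, 0xFFFF)))  # 0x2233f
```

Nothing in that calculation knows how big the object at the segment base is supposed to be, which is what a limit register would add.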

"But potentially useful, no?"

"With extreme caution. Too much caution, really."

"How about segment registers for the 6809 or 68000?"

"You can use the 68000's address registers for segmentation if you want, although the segment limit problem remains, and there is a memory cycle penalty if you don't handle the segments well."

I stopped to think my next words through.

"If I were adding segmentation to the 6809, I'd want full 32-bit segment registers. The limit registers would be as wide as the index registers, so if you had a derivative with only 16-bit wide index registers, the limit registers would also be 16-bit. Instead of a segment override prefix like the 8086, I'd just have the register-to-register transfer instructions move the segment and limit registers, as well."

Bill and Bob were both nodding. Bill asked, "You've taken a look at the 68008, haven't you?"

"Yeah. But I'm letting Mike be the one to have fun with it."

Mike snickered.

"If it were available in, say, three months, in small lots, would you use it?"

"There are a lot of things that a 4 megahertz 68000 is going to be no faster doing than a 1 megahertz 6809, because of the memory cycle speed, the extra width of instructions, and other things. Many of those things are precisely what a personal computer is going to be used for, at least for the next several years. A 4 megahertz 68008 is going to be about half to two thirds of the speed of the 68000, I think. The only advantage is the megabyte address space, which really won't be quite enough in the near future."

Bill and Bob both frowned.

I continued, "Now, if we had a further evolution of the 6801 with an additional 8 bits attached to the top of the index register and program counter, a long jump, and either a long load of X or a transfer A to XHi or some such, at a price not too much higher than the 6801, that would make a good cheap personal computer. Or my pet imaginary evolved version of the 6809 with PC, X, Y, U, and S extended by 16 bits and new indexing modes to make the long addresses accessible, at a price not too much higher than the 6809, that would be ideal for the current market."

"One megabyte is too tight?" Bob asked.

"64 kilobytes is too tight?" Bill asked.

"Look at the 6847. Julia and I and my sister write reports using that because we are patient with the narrow window on the text, and we like the ability to type, think, erase, and type again. But my mom just gets frustrated, and my dad barely avoids going to sleep using it. People with no reason to be patient won't get it, and they are the ones who will be buying most of the personal computers sold. A personal computer has to be able to show the equivalent of a typewritten page on its screen, at minimum, or at least have a clear upgrade path to get there. That's what's stalling Radio Shack's Color Computer in the market right now. Besides lack of MMU."

Pete said, "But a typewritten page of text would only need a 2 kilobyte screen buffer. I've seen the Japanese personal computers, and they're pretty functional with only 16 bits of address."

"How functional?"

"All the useful characters."

"Not by a long shot. Less than two thousand. The real count for a good newspaper is estimated at over 3,000 characters, but they aren't taking into account that what will be included in that 3,000 will vary from month to month. And even newspapers will use really oddball characters regularly, when they need something more precise in meaning, and if you include the ability to display all the oddball characters, you're well into 9,000 characters or more. Add historical characters and you easily triple that count. Chinese is on the order of a hundred thousand characters. Sixteen bits doesn't cut it, except for very limited purposes like cash register receipts and utility bills."

"You can't be serious."

"I've lived over there. I know the hype they give the current crop of PCs and the sell-job they give the new student of the Japanese language, and I know the reality when you start reading serious literature. The standard character set is just enough to get started."

"How does anyone remember them all?"

"They don't, but that's going to be one of the things a real personal computer will be good for, helping them find and use the ones that they have trouble remembering. The personal computers they have now are very limited in scope relative to what they need, and what they will have in the future. They sell because they don't have anything better."

I continued after a moment's thought, "If the characters are to have decently defined glyphs, you want bit-mapped characters that are 32 by 32 pixels, not 16 by 16. 10,000 characters at 128 bytes per glyph is going to eat up a megabyte of address space pretty quickly." (Vector glyphs were still a bit exotic for a conversation like this until a couple of years later.)
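
(The arithmetic behind that claim, as a short Python sketch. The function names are mine, and the glyphs are assumed to be plain one-bit-per-pixel bitmaps with no compression.)

```python
def glyph_bytes(width_px: int, height_px: int) -> int:
    """Bytes for one monochrome bitmap glyph, one bit per pixel."""
    return (width_px * height_px + 7) // 8


def font_bytes(character_count: int, width_px: int, height_px: int) -> int:
    """Bytes for a whole font of same-sized bitmap glyphs."""
    return character_count * glyph_bytes(width_px, height_px)


ONE_MEGABYTE = 1 << 20  # 1,048,576 bytes, the 8088 and 68008 address space

print(glyph_bytes(16, 16))          # 32
print(glyph_bytes(32, 32))          # 128
print(font_bytes(10_000, 16, 16))   # 320000
print(font_bytes(10_000, 32, 32))   # 1280000
print(font_bytes(10_000, 32, 32) > ONE_MEGABYTE)  # True
```

So ten thousand 32 by 32 glyphs overflow a megabyte before a single byte of program or text is loaded.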

"And graphics." I pointed at the TV. "How many kilobytes is the graphics mode screen buffer on the 6847, for just fuzzy monochrome on a color TV?"

"Six."

"How big would the buffer be for the same resolution graphics in four colors, if the 6847 supported it, or if you modified the output and added the RAM?"

"An extra bit per pixel, so twelve."

"That takes 12K out of the program space on the 6801 or 6809, just for four colors, and everyone will want a much bigger gamut of color. And resolution at least double what the 6847 offers. 64K was tight to start with, and a megabyte will soon be tight for color graphics. One advantage, I guess, to the 68008 is the implicit upgrade path to the 68000, but 24 bits of address will shortly be too few, also."
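
(The screen buffer numbers, sketched in Python. The function name is mine. The 256 by 192 figure is the 6847's highest-resolution graphics mode; the doubled-resolution, sixteen-color case is a hypothetical to show where the trend goes, not a mode any chip on the table offered.)

```python
def framebuffer_bytes(width_px: int, height_px: int, bits_per_pixel: int) -> int:
    """Bytes needed for a packed-pixel screen buffer."""
    return width_px * height_px * bits_per_pixel // 8


mono = framebuffer_bytes(256, 192, 1)        # two colors
four_color = framebuffer_bytes(256, 192, 2)  # four colors, same resolution
print(mono, mono // 1024)                    # 6144 6
print(four_color, four_color // 1024)        # 12288 12

# Hypothetical: double the resolution each way and go to sixteen colors.
bigger = framebuffer_bytes(512, 384, 4)
print(bigger, bigger // 1024)                # 98304 96
print(bigger > 64 * 1024)                    # True: too big for 16 bits of address
```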

"16 megabytes too tight? RAM is expensive," Sharon pointed out.

"If you don't want to be a foundry for other companies' designs, you have to have a base technology where you develop your testing and manufacturing techniques. That's RAM. It pays for itself without even being on the market by helping you get your other products right, faster."

"That kind of thinking'll push the price of RAM right through the floor," Motorola's Bob said with a frown.

"Exactly. But you won't care, because RAM pays for itself in shortening your development cycles for your profitability products. RAM should be like candy, anyway."

"RAM should be like candy." Bill harrumphed. "I think you've said that before." He reached into his briefcase and pulled out an advanced information datasheet and handed it to me. "Has Denny shown you this?"

The datasheet described the planned 68010 and 68012. I scanned it quickly. "No. Can Julia and Mike also take a look at this?"

"Sure. And anyone else in this room, really."

I showed Julia the changes in the addressing mode, allowing 32 bit constant offsets, and the short loop cache mode. 

She tilted her head and grinned apologetically. "I guess it's an improvement?"

"Definitely. And the exception frame looks more manageable."

I passed it to Mike, and Bob and Jennifer looked over his shoulder. 

After a quick scan, he looked up. "Why isn't the 68008 based on this? The short loop execution mode would be especially useful when memory's only eight bits wide."

"Timing. Market and management." Bob shrugged.

"If I were you guys, I'd hold the 68008 off until I could make it an 8-bit version of the 68010. In spite of the fact that I personally really want to get my hands on one."

I nodded my agreement with Mike. "Or, if you just have to have an eight-bit 68000 now and this allows testing to complete more quickly, plan and advertise a 68018 that will be an 8-bit 68010."

"What if we have plans for adding more addressing modes and wider math, and dropping the loop mode for a small general cache, in another CPU in the early planning stages?" Bill's face was unreadable. "Not saying we do, but what if?"

I took a deep breath. "You know, in the 6809, extended mode was added to the index post-byte for doing memory indirect on absolute addresses. I'm wondering how much more it would have cost to include direct page in the index post-byte, as well. That would allow using the load effective address instruction to get the address of a direct page variable without using the accumulator, which would make the direct page much more useful for statically allocated local variables. But adding many more addressing modes would quickly get into negative trade-offs."

"But that's talking about the 6809."

"Yeah. The 6502 needs two kinds of memory indirect because it's so register poor. And those two kinds were a very strategic choice. The 68000 already effectively has both kinds, because it has lots of indexable registers. It doesn't need more addressing modes, not considering how much it will cost to test and get right. Except for the 32-bit constant offsets, those will be worthwhile. And it especially doesn't need addressing modes that can be as quickly executed using existing instructions and a register or two. I'd have liked it to have memory indirection, but that's just an address register load, so maybe not really worth it. Sure, eight address registers is a shade tight for some uses, but you don't want to clutter the upgrade path to a 64-bit CPU with a bunch of new, untested addressing modes."

There was a chorus of cleared throats and exchanged glances.

"Would it cost too much to somehow allow engineers to experiment with variations of your primary designs, to push the envelope with real hardware, even if it's not tested?"

"What do you mean?" asked Bill.

"Like a skunkworks, but officially supported."

Motorola's Bob leaned forward. "Assuming we dare put our fab facilities at risk, where are we going to get the manpower?"

"Just let your engineers take up to eight hours a week on blue-sky projects on company time, no questions asked."

Sharon shook her head. "We're already short of time."

"Blue-sky projects give you a chance to figure out better ways to do things. You'll end up being more efficient and closer to on-schedule."

"Hard to believe," Pete complained.

I shrugged. "Well, you guys have the experience, not me. I've said my opinion."

"Okay, we have another addendum." Ms. Philips and Ms. Steward looked up from their writing and interrupted, and Ms. Philips showed Bill what they had. He passed the addendum to Bob, and Bob looked it over and passed it to me.

It consisted of mutual permission to use ideas and concepts we had talked about over the course of a couple of hours that night with a promise of best effort to offer each other consideration. The five of us figured that was more than agreeable, and added it to our agreement contracts.

As we wrapped up, Jesse asked me, "Could you put a Forth interpreter on a 6805?"

"Self-hosted?"

"Of course."

Julia looked up from the notes she and Ms. Steward were arranging to make copies of. 

"Self-hosted?" she asked. "That's where the language runs on the same processor that compiles the code, kind of the opposite of the cross-assembler that runs on the 6800 but produces code for the 6805?"

I nodded. "Yeah. Maybe self-hosted could be done, if you have enough ROM and RAM. The virtual instruction pointer needs more than 8 bits, but self-modifying code might work -- using an extended mode jump where the code writes over the jump address before executing the jump. Cheating, but it might work."

Jesse smirked and I chuckled.

Julia asked, "Can you show me an example?"

She handed me her pad again, and I wrote out some code:

NEXTIP
   LDA IP+1
   STA SELFMO+2 ; direct-threaded
   LDA IP
   STA SELFMO+1
SELFMO
   JMP $EEEE ; provisional target address
* The 16 bit address $EEEE just got overwritten by the target address. 

She looked at it with a frown. "What's the purpose in this?"

"It's the part of the virtual machine emulator where the CPU calls the code to emulate each virtual instruction. And each emulation routine ends in a jump back to NEXT."

She tilted her head. "Sorry. I'm totally lost."

"For example, the routine to add two numbers on the stack would look something like this:

PLUS
   LDX USP ; parameter stack
   LDA 3,X ; low bytes
   ADD 1,X
   STA 3,X
   LDA 2,X ; high bytes
   ADC ,X
   STA 2,X
   INX ; drop argument
   INX
   STX USP ; update the stack pointer
   JMP NEXTIP ; back to NEXT

"The routine for a jump would look something like this:"

BRANCH
   LDX IP ; IP is pointing at the in-line offset.
   LDA IP+1
   ADD #2 ; bump past offset
   BCC BRANC0
   INC IP
BRANC0
   ADD 1,X ; add the low byte of the offset
   STA IP+1
   LDA IP
   ADC ,X ; and the high byte
   STA IP
   JMP NEXTIP ; back to NEXT

"And the routine for nesting calls would look something like this:"

CALL
   LDX RSP ; return address stack
   DEX ; room for old IP
   DEX
   STX RSP
   LDA IP+1
   ADD #2 ; bump past call address
   BCC CALL0
   INC IP
CALL0
   STA 1,X ; tuck the address to return to away
   LDA IP
   STA ,X

And then I was stuck. "Wait. This isn't going to work."

Jesse chuckled again.

I went back to the NEXT routine. "Yep. I'm forgetting to actually get the jump address in the NEXT routine, and maybe a bit more."

Jesse agreed with a grunt. 

I shook my head and laughed. After staring at the code for NEXTIP for a minute or two while Jesse smirked and Julia looked puzzled, I shook my head again. "Not having a sixteen-bit pointer is a real pain."

Julia met my eyes and sighed. "Don't worry about it. I don't think the eight kilobyte maximum address space is going to leave much room for a program to run in, anyway."

"Yeah, but they're going to eventually make a chip with a full sixteen-bit wide CPU. I want to convince myself of this."

Her forehead creased.

"We need to grab two bytes pointed at by the sixteen bit IP in the direct page."

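NEXTIP
   CLR NXADD2+1 ; start with no carry into the high byte
   LDA IP+1
   STA NXADD1+2 ; low byte of the first fetch address
   INCA
   STA NXADD2+2 ; low byte of the second fetch address
   BNE NEXT00
   INC NXADD2+1 ; low byte wrapped, so carry one
NEXT00
   LDA IP
   STA NXADD1+1 ; high byte of the first fetch address
   ADD NXADD2+1 ; high byte plus carry
   STA NXADD2+1
NXADD1
   LDA $EEEE ; extended mode, address overwritten above
   STA NXJMP+1
NXADD2
   LDA $EEEE ; extended mode, address overwritten above
   STA NXJMP+2
NXJMP
   JMP $EEEE ; provisional target address
* Had to overwrite lots of addresses.

I sighed. "And all of that in the small RAM is going to run us out of RAM."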

Jesse let out a horse laugh.

"I guess this needs to be done a bit more simply."

"No. I think you nailed it. But put the code from NEXTIP to NXADD1 in ROM, followed by a jump to NXADD1 in RAM." He continued to chuckle.
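
(For anyone as lost as Julia was, here is a rough Python sketch of what the NEXT routine is for. It is a model of the idea, not a translation of the listings above. Every name in it is made up for illustration, and where real threaded code ends each routine with a jump back to NEXT, this sketch just calls next() in a loop.)

```python
class VM:
    """Toy threaded-code machine: the program is a list of routines
    with their in-line operands mixed in."""

    def __init__(self, thread):
        self.thread = thread  # the "compiled" program
        self.ip = 0           # virtual instruction pointer
        self.stack = []       # parameter stack
        self.running = True

    def next(self):
        """Inner interpreter: fetch what IP points at, bump IP, jump."""
        routine = self.thread[self.ip]
        self.ip += 1
        routine(self)

    def run(self):
        while self.running:
            self.next()


def lit(vm):
    """Push the in-line operand that follows in the thread."""
    vm.stack.append(vm.thread[vm.ip])
    vm.ip += 1


def plus(vm):
    """Add the top two stack cells, wrapping at sixteen bits."""
    b = vm.stack.pop()
    a = vm.stack.pop()
    vm.stack.append((a + b) & 0xFFFF)


def branch(vm):
    """Jump by the in-line offset, counted from the cell after it."""
    offset = vm.thread[vm.ip]
    vm.ip += 1 + offset


def halt(vm):
    vm.running = False


# 2 3 + and then a branch that skips over "lit 99".
vm = VM([lit, 2, lit, 3, plus, branch, 2, lit, 99, halt])
vm.run()
print(vm.stack)  # [5]
```

Python hands you the pointer for free. On the 6805 the hard part is exactly the fetch in next(): following a sixteen-bit IP with only an eight-bit index register.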

Julia said, "It's okay. I don't care. We're all tired. Let's go home, or, well, back to your brother's place."

"But I want to work the rest of this out. Borrow from ..." 

She took the Forth listing I had picked back up and her pencil and the sheet of paper I was trying to work on out of my hands while Jesse laughed. 

"You got a real jewel there, Joe," he said. "You'd better listen to her. And don't worry about the Forth on the 6805. I think that's about as good as it gets, and as Julia says, it's not much use until we have a 6805 MPU with fourteen bits or more of address. And I look forward to working with you as an intern, and having you join us when you graduate. I like the way you think. I think we all do." He looked around at the engineers and his managers, and everyone nodded in agreement.

I suddenly turned Japanese and ducked my head. "Sorry. I mean, thanks."



[Backed up at https://joel-rees-economics.blogspot.com/2020/09/bk-33209-rocks-moving-ahead.html.]
