Difference between revisions of "First-Hand:Cryo CMOS and 40+ layer PC Boards - How Crazy is this?"

'''Contributed by:''' Tony Vacca
 
== How it started  ==
 
<p>It was the early 1980s. Control Data (CDC) had just launched the CYBER-205 with modest success, and the team was now focused on the next generation machine, the 2XX as I recall. Speed, cost and meeting the schedule were all key objectives. Speed, because Cray Research under the guidance of Seymour Cray was setting milestones for Supercomputers with the Cray 1 and then the Cray 2. Cost, since Supercomputers were extremely expensive. Schedule, since the CYBER-205 had established patience records as a machine that might never get out the door, and this must not be repeated. </p>
 
 
<p>A conventional evolutionary approach for Integrated Circuit (IC) logic was initially selected. Motorola, with some prodding, agreed to launch an 8,000 gate equivalent ECL gate array (emitter-coupled logic, the circuitry of choice for high performance processing units), provided that Control Data did the actual circuit development. There were too few customers for Motorola to commit its own resources to this lofty development. Motorola did, however, commit its advanced ECL processes to CDC, and a joint team was formed by the two companies. </p>

<p>Logic designers at the CDC Advanced Design Laboratory were given preliminary design rules based on computer device models and estimates of gates-per-chip densities. There was a natural follow-up of grumbling by the logic design team, led by very experienced and innovative folks (Ray Kort, Maurice Hudson and Dave Hill, to name three), but circuit designers had learned to accept this, since logic designers always found the circuits too slow and the quantity of gates and pins (I/O ports) per die insufficient. There was a lot of cooperation too. Basic building blocks were defined by the logic designers: gate functionality, register functionality, etc. From this set of preliminary rules, function blocks were defined and the capacity of reasonably sized Printed Circuit (PC) boards was determined. The initial design, using the CYBER-205 based architecture, was launched. </p>
  
<p>In parallel with this effort, and in the same design group (circuit, packaging, PC board and the newly formed CAD group, which built tools for layout and design of chips and boards), the chief chip design engineer, Randy Bach, was assigned to develop an advanced CMOS chip for the Canadian Computer Development organization. At this time, the early 80's, CMOS was in its infancy, being used for memory devices, low performance peripherals and low performance microprocessors (5 to 10 MHz clock speeds). The design contained 5,000 gates plus appropriate input and output communication devices. Gate arrays for CMOS were also nearly non-existent, so Randy and his small team of two assistants developed a cell library and worked closely with the Canadian Development team to meet their objectives as well. </p>
  
<p>This effort was completely separate from the ECL based gate array to be used for the next generation Supercomputer. The product was developed for a low cost application. </p>
  
<p>It was customary for Neil Lincoln (chief architect), Dale Handy (manufacturing manager) and me to go off to lunch every 8 to 10 days to discuss status at either Arthur Treacher's Fish &amp; Chips or Zantigo's (high class, NOT) fast food restaurants. As a side note, both of these fast food places disappeared during ETA Systems' brief existence. Zantigo's has returned (I think because they know it is safe now that the three of us cannot visit together any longer; Neil unfortunately passed on a few years ago). </p>
  
<p>At one of these meetings, Neil had "news" for me. Simply stated, the gate array in active co-development with Motorola had unacceptable goals. The chip had too few I/O pins, consumed too much power and had too few gates. In addition, he had completed a cost model which indicated an unacceptable cost figure for the CPU. He had also determined that the CPU (some 3 million gates) had to be assembled on a single board. "It was time for this goal to be reached". Finally, he had concluded that a proper logic design required at least 15,000 gates per chip to meet these goals. </p>
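The arithmetic behind Neil's numbers can be sketched in a few lines. This is only an illustration: the function name is hypothetical, and the sketch assumes every gate on every chip is usable, which real gate arrays never achieve, so the true chip counts would have been higher.

```python
import math


def chips_needed(cpu_gates: int, gates_per_chip: int) -> int:
    """Smallest chip count that holds cpu_gates, assuming 100% gate utilization."""
    return math.ceil(cpu_gates / gates_per_chip)


CPU_GATES = 3_000_000  # "some 3 million gates", per the text

# Compare the 8,000 gate ECL array with the 15,000 gate target.
for gates_per_chip in (8_000, 15_000):
    count = chips_needed(CPU_GATES, gates_per_chip)
    print(f"{gates_per_chip:>6} gates/chip -> {count} chips")
```

Under that idealized assumption, the 15,000 gate target roughly halves the number of chips that must fit on the single board.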
  
<p>The logic designers had gotten to him, I surmised. Schedules, Neil reminded us, could not be altered, and that was that. To soften the blow he bought lunch that day: three Cokes and three orders of fish and chips. Neil's was a large order. </p>
  
<p>The trip back to the lab was pretty quiet, and fortunately short, since our eating places were all very close to the lab. </p>
  
<p>That afternoon, I assembled the key folks. I might miss one or two, but Randy Bach, Doug Carlson, Dave Resnick and John Ketzler are four that I recall now. Doug was a mechanical engineer to whom I had assigned the Motorola project because of his management skills (something he probably never forgave me for), John was the key circuit engineer on the Motorola project, and Dave was and still is a very versatile and perceptive engineer. </p>

<p>Doug and I would inform Motorola of the decision not to continue. The team would package up what had been accomplished and turn it over to Motorola to carry the ball forward if they wished. </p>

<p>As a side note, Motorola and Cray did continue the design. It was the circuit design used in the Cray C90, a very successful Cray Research Supercomputer. </p>

<p>The meeting turned to the next steps. </p>

<p>The key challenges that emerged were: </p>
  
 
*IC Technology that could meet the new lofty goals
*Testing of complex IC technology and complex PCB technology
  
== Results  ==
 
 
 
 
 
==== <span style="font-size: 21.0pt;font-family:Helvetica">'''ETA Systems Hardware Technology 1980 – 1989'''</span> ====

==== '''<span style="font-size:19.0pt; font-family:Helvetica">Preface:</span>''' ====
 
 
 
<span style="font-size:13.0pt; font-family:Helvetica">To restate the challenge: ETA Systems Inc. was spun off as the Supercomputer subsidiary of a struggling Control Data Corporation (CDC). The objective was to develop and manufacture High Performance Computers, commonly called simply Supercomputers in the 80’s and 90’s. Cray Research Inc. dominated this market during this time frame, and CDC held a minor market position, having introduced the Star-100 followed by the CYBER-203 and CYBER-205 systems. Novel architecture (fast scalar performance and the efficient use of vectors), innovative software, the highest performance integrated circuits (resulting in the fastest clock period) and innovative packaging (to optimize device spacing and thermal management) differentiated Supercomputers from conventional computer systems during this period. It must be stated, to be "fair and balanced", that Supercomputers also had the highest price tag and demanded the largest memories and the highest performance peripherals and system bandwidths. Systems dominating the market during the 80’s were the Cray-1, Cray XMP and CYBER-205. NEC, Fujitsu and Hitachi also developed systems in this market. The word Supercomputer was applied to other products as well; leaving them out here is not intended to dismiss them.</span>
 
 
 
<span style="font-size:13.0pt; font-family:Helvetica">The following overview will not go into the decision to separate ETA Systems from Control Data Corporation organizationally, although that topic is interesting as well. Nor will it discuss software innovations at ETA Systems, and there were many.</span>
 
 
 
<span style="font-size:13.0pt; font-family:Helvetica">Architecture had a role in dictating the technology in terms of the number of logic circuits that were in series per clock cycle. Architecture also demanded that large, high performance registers (temporary storage devices) be included, which also dictated the performance (clock cycle) of the system. Other architectural features (instructions) dictated the number of functions that constituted a processor (gates per CPU), which in turn determined technology selection in terms of preferred gates per chip and ports per chip. Proximity of chips to each other was crucial for processor design during this time period, since a CPU could not reside within the boundary of a single chip as it easily does today. Bandwidth, i.e., the number of bytes per unit of time that could be moved between functions within the CPU and between the CPU and its associated memory, is key, and it places demands on pins and logic paths between functions that usually require compromise in each and every design.</span>
 
 
 
<span style="font-size:13.0pt; font-family:Helvetica">Those reading this now will find this humorous, I am sure, with multiprocessing units (multiple CPUs) now residing within the boundaries of a single chip or IC die. In the 80’s and well into the mid 90’s, however, partitioning a CPU's necessary logic or Boolean functions across multiple integrated circuit chips (usually multiple hundreds of chips) and multiple complex printed circuit boards (2 to 8) was an integral part of determining the overall performance, power consumption, cost and reliability of the system.</span>
 
 
 
 
 
 
== <span style="font-size:19.0pt; font-family:Helvetica">'''Introduction'''</span> ==
 
 
 
<span style="font-size:13.0pt; font-family:Helvetica">ETA Systems technology was selected in 1980 (at the inception, the organization was the Advanced Design Laboratory of Control Data Corporation) with the following objectives:</span>
 
 
 
*The highest performance Supercomputer at the time of product delivery
*The most cost effective technology available
*The lowest possible power consumption while meeting other objectives
*The largest product diversity with a single design
*The highest possible reliability. Pins and interconnects usually dictated the reliability, since by that time Integrated Circuit technology had reached a very high reliability for both logic and storage devices
*Leverage as much of the technology as possible to the follow-on computer generation. This usually fell by the wayside until ECAD and MCAD technologies were introduced into the design
*Utilize only standard IC technology processes being developed for other markets
*Demonstrate the prototype of the product in less than four years
 
 
 
 
 
=== <span style="font-size:18.0pt; font-family:Helvetica">'''Digging deeper into objectives:'''</span> ===
 
 
 
==== <span style="font-size:14.0pt;font-family:Helvetica">The highest performance Supercomputer at the time of product delivery</span> ====

<br> <span style="font-size:13.0pt; font-family:Helvetica">Simply stated, the highest performance processor solved the largest problems most effectively. Performance was usually measured in clock cycle, which was unfortunate since a variable amount of calculation could be done per clock cycle. This single parameter carried the bragging rights, although later Gigaflops became the stated parameter, and that also did not necessarily reflect the true performance of a supercomputer. The “king of the hill” in any given cycle (usually 2 to 4 years) held the largest market share.</span>
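To illustrate why clock period alone was a misleading yardstick, here is a minimal sketch. The two machines, their clock periods and their results-per-cycle figures are hypothetical and chosen only for illustration; they do not describe any real system.

```python
def peak_results_per_second(clock_period_ns: float, results_per_cycle: float) -> float:
    """Peak throughput = results completed per cycle / clock period in seconds."""
    return results_per_cycle / (clock_period_ns * 1e-9)


# Hypothetical machine A: faster clock, one result per cycle.
machine_a = peak_results_per_second(clock_period_ns=5.0, results_per_cycle=1)
# Hypothetical machine B: slower clock, four results per cycle (e.g. more pipes).
machine_b = peak_results_per_second(clock_period_ns=10.0, results_per_cycle=4)

print(f"A (5 ns clock):  {machine_a / 1e6:.0f} million results/s")
print(f"B (10 ns clock): {machine_b / 1e6:.0f} million results/s")
```

In this sketch machine A wins the clock period bragging rights, yet machine B delivers twice the peak throughput. Even the peak figure says nothing about how much of it a real customer problem can sustain.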
 
 
 
==== <span style="font-size:14.0pt;font-family:Helvetica">The most cost effective technology available</span> ====

<br> <span style="font-size:13.0pt;font-family:Helvetica">The most “bang” for the buck applied to Supercomputers as well as other markets. The customer was willing to make the highest performance the higher priority when solving his particular challenges, provided that the performance clearly exceeded lower cost alternatives. A legend of the Supercomputer industry, Jim Thornton, once described the requirement as getting through an intersection without having an accident. Since a Supercomputer required so many components and interconnects, it was bound to fail more often than a small computer. So, Jim surmised, the faster the computer, the more problems could be solved before something went wrong. Go through an intersection as fast as possible, not at a slow rate, and you have a better chance of getting through safely.</span>
 
 
 
==== <span style="font-size:14.0pt;font-family:Helvetica">The lowest possible power consumption while meeting other objectives</span> ====

<br> <span style="font-size:13.0pt;font-family:Helvetica">Each follow-on generation of Supercomputer products witnessed increased power per processor, which was justified by the resultant performance realized. The lower the RC (resistance-capacitance) time constant, the faster the computer clock cycle, and since the lower the R, the higher the power, this was a trend. As the number of processors per system increased, power consumption became a major issue: for the largest users because of site related “wall plug” power capacity limits, and for the smaller users because of basic life-of-system cost concerns (mainly power consumption, cooling and system reliability).</span>
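The trade between speed and power can be sketched with a deliberately simplified model. The assumptions here are not from the original text: a single node of fixed capacitance C is charged through a resistance R, and a fixed voltage swing V is held across that resistance, as in a statically biased bipolar stage. The component values are hypothetical.

```python
def rc_time_constant(r_ohms: float, c_farads: float) -> float:
    """Time constant tau = R * C, in seconds."""
    return r_ohms * c_farads


def resistive_power(v_volts: float, r_ohms: float) -> float:
    """Static power dissipated by a voltage V across resistance R: P = V^2 / R."""
    return v_volts ** 2 / r_ohms


V_SWING = 0.8    # volts, hypothetical logic swing
C_NODE = 2e-12   # farads, hypothetical node capacitance

for r in (400.0, 200.0, 100.0):
    tau = rc_time_constant(r, C_NODE)
    power = resistive_power(V_SWING, r)
    print(f"R = {r:5.0f} ohm  tau = {tau * 1e12:6.1f} ps  P = {power * 1e3:5.2f} mW")
```

In this model the product of delay and power is the constant C·V², so each halving of R halves the time constant and doubles the power, which is the trend the paragraph describes.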
 
 
 
==== <span style="font-size:14.0pt;font-family:Helvetica">The largest product diversity with a single design</span> ====

<span style="font-size:13.0pt; font-family:Helvetica">Design cycles were three to five years in duration, and large teams were assembled to complete a single design. Development costs per product were significant. Product cost ($2M to $40M per system) and performance ranges (greater than 20:1) for each generation of Supercomputers were increasing. Since the optimized cost point for each product was technology dependent, this required multiple design teams, each design utilizing a unique technology. A single total design (packaging, IC selection, manufacturing tooling and associated boards, connectors, etc.) was therefore desirable for a myriad of obvious reasons. In fact, most companies were focused on only a portion of the computer market and dedicated to only a small portion of the product "bandwidth". Cray Research focused on the high end; IBM, CDC, Unisys and others did an admirable job in the middle; and companies like DEC and HP were at the lower end. There were others across the world too; these companies are only examples.</span>
 
 
 
==== <span style="font-size:14.0pt;font-family:Helvetica">The highest possible reliability</span> ====

<span style="font-size:13.0pt; font-family:Helvetica">Supercomputers required significant bandwidth to pass data between processing units and between processors and memory. Bandwidth is a key differentiator that separates true Supercomputers from conventional computer systems. Interconnects were, and still are, the dominant reliability concern in large systems. Thermal management is also significant. Large systems, by the nature of the design, require a large quantity of components to be simultaneously functional. Each interconnect and each active logic device (integrated circuit) requires the highest reliability to permit large user problems to be solved using Supercomputers. Simply stated, Supercomputer operational time between failures had to exceed the run time of the largest customer problem.</span>
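Why a large parts count hurts reliability can be sketched with the standard series-system model. The assumptions are mine, not the author's: every part fails independently at a constant rate, and the system is down as soon as any one part fails. The part counts and failure rates below are hypothetical placeholders, not ETA Systems data.

```python
def system_mtbf_hours(parts: list[tuple[int, float]]) -> float:
    """Mean time between failures of a series system.

    parts is a list of (count, failures_per_hour) pairs. With independent,
    constant failure rates the system rate is the sum of all part rates.
    """
    total_rate = sum(count * rate for count, rate in parts)
    return 1.0 / total_rate


# Hypothetical bill of materials for one processor.
parts = [
    (250, 1e-7),      # logic chips
    (50_000, 1e-9),   # pins and interconnects
]

mtbf = system_mtbf_hours(parts)
longest_job_hours = 100.0  # hypothetical longest customer run

print(f"System MTBF: {mtbf:,.0f} hours")
print("Longest job fits within MTBF:", mtbf > longest_job_hours)
```

Because the rates add, doubling the number of parts halves the system MTBF in this model, which is why interconnect count weighed so heavily on the design.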
 
 
 
==== <span style="font-size:14.0pt;font-family:Helvetica">Leverage as much of the technology as possible to the follow-on generation</span> ====

<span style="font-size:13.0pt; font-family:Helvetica">The significant cost of developing a given technology “kit” had to be leveraged for as long a period as possible. It was a “given” that most integrated circuits would be developed anew for each generation of computer. What about interconnects (connectors)? What about printed circuit boards? What about support technologies like simulation tools, assembly tooling and basic packaging? Can any of these technologies extend to the next generation? And are there any “mid life kickers” that could be inserted into a successful product to extend its market life? IBM did exceptional work in taking hardware across product boundaries and generations of new products. The initial tooling for packaging was expensive, but the results in later products appeared to pay dividends. Cray Research Inc. and Control Data, by contrast, generated new packaging and connector technology with each new generation of product.</span>
 
 
 
==== <span style="font-size:14.0pt;font-family:Helvetica">Utilize only standard IC technology processes being developed for other markets</span> ====

<span style="font-size:13.0pt; font-family:Helvetica">This addresses two major issues: cost and access to technology. Cost, since dedicated IC processing lines with unique processes for low volume products, even if they could be realized, would not allow effective amortization of the costs of process development and manufacturing. Access to technology means adapting advanced, popular (high volume) processes to accommodate unique system designs. Innovation in the IC industry was and is applied to the markets with the highest return on investment. The “trick” was to apply this “standard” and most innovative technology to low volume Supercomputer applications.</span>
 
 
 
==== <span style="font-size:14.0pt;font-family:Helvetica">Demonstrate operating product prototypes in less than four years</span> ====

<br> <span style="font-size:13.0pt; font-family:Helvetica">This requires discipline as well as good management and leadership. Due to the complexity of Supercomputers, we felt that tools had to be upgraded significantly and checkout and diagnostics improved as well. At any given time, improvements are being made in Supercomputer technologies. Allowing each incremental development to be incorporated extends the product development cycle significantly. Selecting only known (proven) technologies at the time of product development initiation results in a non-competitive product. Risk must be taken. Where to take risk (on technologies that are not available at product initiation but look the most promising), as well as the return on that risk, must be carefully evaluated, with the factors clearly understood. Where to invest in new technology development must be understood, as must the return on investment and the leverage available where other markets are interested in common technologies. Back-up alternatives should be identified. The CEO of Cray Research, John Rollwagen, defined the challenge as "How many 3-point shots should each project take?" Missing market cycles is costly. Under-reaching (being conservative) and over-reaching (bad choices in “betting on the come”) were also costly and prohibitive. All of these factors were carefully evaluated. It might be added, using the same basketball comparison, that a project cannot have too many “''lay ups''” (sure things already developed) either!</span>
 
 
 
 
 
 
== <span style="font-size:16.0pt; font-family:Helvetica">'''Results'''</span> ==

<span style="font-size:13.0pt; font-family:Helvetica">Before getting into the details of how decisions were made and how the ETA Systems technology “kit” was selected and developed, here is a list of noteworthy accomplishments:</span>
 
 
 
==== <span style="font-size:14.0pt;font-family:Helvetica">'''First Industry competitive'''</span> ====

<br> <span style="font-size:13.0pt; font-family:Helvetica">Since 1995 and up to the present (beginning 12 years after the technology selection by ETA Systems, I might add), ALL HPC (High Performance Computers) have been developed and manufactured using CMOS IC technology. Until as late as 2000, bipolar technology (higher power, more costly to manufacture and lower gate count per chip) dominated high performance computers throughout the world.</span>
 
 
 
==== <span style="font-size:14.0pt;font-family:Helvetica">'''First Industry Single Board'''</span> ====

<br> <span style="font-size:13.0pt; font-family:Helvetica">The chip density (gates per chip) allowed by advanced CMOS, the use of computer aided design tools for optimum layout and simulation, the successful design of a 45 layer advanced Printed Circuit board (''you read it right, 45 layers'') and innovative chip attachment and cooling permitted a single processor containing nearly 3 million gates to be packaged on a single board.</span>
 
 
 
==== <span style="font-size:14.0pt;font-family:Helvetica">'''First Industry system to be'''</span> ====

<br> <span style="font-size:13.0pt; font-family:Helvetica">CPU processing units (''≈3 million gates each'') were validated for functionality and performance in less than 4 hours. Any interconnect errors were recorded, which allowed chip-to-chip replacement to occur in minimal time. Other CPUs of this same period required weeks to months to check out and validate a processing unit. Incoming testing of the logic IC chip (''function and performance'') also used the same self-test innovations.</span>
 
 
 
==== <span style="font-size:14.0pt;font-family:Helvetica">'''First Industry production'''</span> ====

<br> <span style="font-size:13.0pt; font-family:Helvetica">The ETA Systems CPU was immersed in liquid nitrogen (77 degrees Kelvin) to improve performance to more than two times that of the same CMOS technology operated at room temperature (300 degrees Kelvin).</span>
 
 
 
==== <span style="font-size:14.0pt;font-family:Helvetica">'''First system at CDC to fully utilize Computer Design Software to design chips and boards, validate logic design and auto-diagnostic test the system with synergistic tools'''</span> ====

<span style="font-size:13.0pt; font-family:Helvetica">This permitted checkout of a CPU to be completed in less than 4 hours. Manufacturing costs were greatly reduced. The technique was also used at the IC supplier, where it greatly reduced the probe test hardware and software required.</span>
 
 
 
==== <span style="font-size:14.0pt;font-family:Helvetica">'''First Industry system to have'''</span> ====

<br> <span style="font-size:13.0pt; font-family:Helvetica">The performance range of the ETA Systems products was greater than 24:1 (from an 8 processor system operating at a 7 nanosecond clock period down to a single processor system operating at 24 nanoseconds). Processors were manufactured, tested and validated on a single manufacturing line using identical components (''IC chips were performance sorted using auto self test''). Product differences began at the system packaging level.</span>
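The "greater than 24:1" figure can be checked with a short calculation. The processor counts and clock periods come from the text; the assumption that throughput scales linearly with processor count and inversely with clock period is an idealization added here, and the function name is hypothetical.

```python
def relative_throughput(processors: int, clock_period_ns: float) -> float:
    """Idealized throughput: proportional to processors / clock period."""
    return processors / clock_period_ns


top_of_range = relative_throughput(processors=8, clock_period_ns=7.0)
bottom_of_range = relative_throughput(processors=1, clock_period_ns=24.0)

performance_range = top_of_range / bottom_of_range
print(f"Performance range is about {performance_range:.1f}:1")
```

Under this idealization the range works out to 8 × 24 / 7, a little over 27:1, which is consistent with the stated "greater than 24:1".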
 
 
 
 
 
 
== <span class="Apple-style-span" style="font-family: Helvetica; font-size: 17px;">
 
 
 
 
 
 
<span style="font-size:20.0pt;font-family:Helvetica;color:red">Boring into the details</span>
 
 
 
 
 
 
<span style="font-size:13.0pt;
 
font-family:Helvetica">Any Technology kit must be driven by a customer
 
need.&nbsp; In the case of Supercomputers the craving for increased computer
 
performance at a lower cost (overall cost) was the deciding factor.&nbsp; In
 
any Supercomputer company a combination of marketing requirements, architecture
 
innovations and logic design demands dictate the initial objectives of the
 
hardware circuit and packaging organization.&nbsp; I state “initial” since once
 
the objectives are digested and key technologies are evaluated for the time
 
frame addressed, compromises are the norm. In the case of ETA Systems
 
technology selections in the early 1980’s, this was the strategy implemented.<o:p></o:p></span>
 
 
 
<span style="font-size:13.0pt;
 
font-family:Helvetica">The following paragraphs sequence the thought process and
 
the technology selection strategy utilized.<o:p></o:p></span>
 
 
 
<span style="font-size:13.0pt;font-family:Helvetica">'''Integrated Circuit selection:'''</span>
 
 
 
<span style="font-size:13.0pt;
 
font-family:Helvetica">The objectives, listed in earlier paragraphs were first
 
integrated into the architecture and logic design requirements.&nbsp; A market
 
survey of key integrated circuit suppliers was conducted with emphasis on what
 
was in development and planned for product introduction – not what was
 
available at the time of the survey.&nbsp; A risk assessment was made.&nbsp;
 
Primary focus was on the most dynamic technology, the IC Logic technology.&nbsp;
 
All decisions as to volume requirements, pins, packaging, etc. resulted from
 
what was determined by this survey and risk analysis.&nbsp; Merging the logic
 
design objectives (gates, bandwidth and performance of key functions) was
 
next.&nbsp;&nbsp;<o:p></o:p></span>
 
 
 
<span style="font-size:13.0pt;font-family:Helvetica">An ECL (emitter-coupled logic) high-performance bipolar gate array using Motorola's advanced IC technology was selected.&nbsp; Since Motorola was not fully staffed to begin the actual product (application) development but did have the process development underway, a cooperative development agreement was struck between the two companies (Motorola and Control Data, since ETA Systems had not yet been formed).&nbsp; The design called for basic logic cells to be incorporated into a larger version of Motorola's existing gate array, advancing the process for increased performance and the chip size for increased gate capacity.&nbsp; The existing gate (or function) array provided approximately 2,500 gates and was the primary gate array for Cray Research's very popular Y-MP Supercomputer; the planned gate array would contain in excess of 8,000 equivalent gates.</span>
 
 
 
<span style="font-size:13.0pt;font-family:Helvetica">Logic cell libraries were agreed to (acceptable both to Motorola for the general market and to CDC for the logic designs).&nbsp; Pin counts (for power, ground and input/output logic communications) were established and power consumption estimates were made.&nbsp; Once these parameters were established, board size, power systems and thermal control were evaluated in a give-and-take trade-off.&nbsp; Features of printed circuit boards (line widths, spacing, interconnect vias and number of layers) were compared to board size capacities, laminating press capabilities, drill designs and PC board processing limits.&nbsp; IC packaging limits (minimum package size, pin spacing, heat removal, etc.) were evaluated in parallel with the PC board limits.</span>
 
 
 
<span style="font-size:13.0pt;font-family:Helvetica">Chip design, cell library development and packaging design began once all parameters (pins, power consumption and die size objectives) were agreed to.&nbsp; Printed circuit board experiments also began.&nbsp; Once feasibility and practical limits were established (the original goals for physical design and performance could be met, based on IC modeling and extrapolation from previously established functional systems), a preliminary specification was presented to the architects and logic designers for review.</span>
 
 
 
<span style="font-size:13.0pt;font-family:Helvetica">From the initial design data, logic design based on the parameters provided established a physical size for the central processing unit (CPU), the heart of the system.&nbsp; A multiple-board processor was required. This placed additional constraints on packaging, since within a single processor all distances between circuits are crucial.&nbsp; Three-dimensional packaging concepts were considered. Three-dimensional packaging effectively meant a “sandwich” of multiple boards with board-to-board interconnects distributed throughout the board area, not confined to the periphery, so that the distances between chips on different boards would be minimized.&nbsp; In addition, power consumption estimates were made, and heat removal paths and techniques were considered.&nbsp; A cost model was generated as well. All of these factors resulted in a preliminary estimate of the CPU volume.&nbsp; As you already know from the introduction, this design was rejected; more to follow, for sure.</span>
 
 
 
<span style="font-size:13.0pt;
 
font-family:Helvetica">In parallel with these efforts, memory design was
 
underway.&nbsp; Less freedom was available to memory since the basic
 
semiconductor device could not be altered to accommodate specific users.&nbsp;
 
There were a few packaging alternatives, very few, and device configurations
 
(Word – Bit architecture, pin numbering, power considerations, etc.) were
 
dictated by the industry.&nbsp; Since memory design has its own objectives for
 
cost, reliability and performance, this effort could continue quite independently
 
with one exception, the packaging of the total system must be synergistic and
 
compatible.&nbsp; A crucial parameter of this is the interconnect mechanism
 
between processors and memory.<o:p></o:p></span>
 
 
 
<span style="font-size:13.0pt;
 
font-family:Helvetica">A hardware system cost model was established – not only
 
for current cost considerations but also estimates on volume costs based on
 
learning-curve estimates as well for the life of the system.<o:p></o:p></span>
 
 
 
<span style="font-size:13.0pt;font-family:Helvetica">The chief architect, after careful review, rejected the design; this was covered in the introduction. Three key reasons were cited. First, performance would be impacted by the 8,000-gate limit (worst-case logic paths could not reside in a single chip, and multiple-chip distances would increase the clock period). Second, power consumption per CPU, although lower on a performance-ratio basis than previous generations, was too high when the total system size (including the multiprocessor objectives) was considered. Third, system cost appeared prohibitive, always a subjective issue but nevertheless a key component of the design.&nbsp; Reliability concerns were also stated, since the pin count per CPU, although much reduced from previous designs, was of concern.&nbsp; The architecture was committed to four CPUs (max) per system, so the interconnect "bar" was raised.</span>
 
 
 
<span style="font-size:19.0pt;font-family:Helvetica">Back to the drawing board</span>
 
 
 
<span style="font-size:13.0pt;font-family:Helvetica">Mini tutorial</span>
 
 
 
<span style="font-size:13.0pt;font-family:Helvetica">Bipolar technology refers to conventional NPN and PNP transistors, here operating in a non-saturating mode (collector-base).&nbsp; By not saturating the operating transistors (not allowing the base voltage to rise above the collector voltage), the switching characteristics were improved and balanced (transitions to the off and on logic levels had identical delays).&nbsp; In addition, the non-saturating circuitry, called ECL for emitter-coupled logic, provided both the TRUE and COMPLEMENT outputs of each logic function (e.g., AND and NAND, OR and NOR).&nbsp; This gave logicians advantages in designing complex Boolean functions (ADD units, MULTIPLY units, DIVIDE units, etc.). Under the category of “no free ride”, ECL circuitry consumed more power than the more popular but much slower saturating logic circuitry (TTL, transistor-transistor logic).&nbsp; Other performance improvements for integrated versions of ECL circuitry included replacing conventional junction isolation between circuit devices on a single die with oxide isolation (lower capacitance per circuit, so less charging and discharging when logic levels switched).</span>
 
 
 
<span style="font-size:13.0pt;font-family:Helvetica">''CMOS (Complementary Metal Oxide Semiconductor) circuitry, especially at the time of ETA Systems, was a simpler and more efficient logic circuit, and it had a simpler process.&nbsp; Stacking P-channel and N-channel transistors in series between the voltage rails defines a single complementary gate. The functionality of the logic devices is much more forgiving of process variations because of the larger voltage swing and because only active transistors define the circuitry (no resistors, diodes, etc.).&nbsp; The physical size of a logic function is significantly smaller than its bipolar equivalent, resulting in more circuitry per equivalent die (chip) size. CMOS technology also consumed power ONLY when the circuit was switching (changing states), so power consumption was directly proportional to the operating frequency (P = CV<sup>2</sup>f).&nbsp; ECL circuitry, by contrast, consumed approximately the same power whether switching or quiescent.&nbsp; (Later forms of CMOS, especially those designed in the early 2000s and beyond, had increased power consumption, caused primarily by increased bulk leakage currents in processes with lithography features smaller than 90 nanometers.)''&nbsp; Technology at the time of the development of the ETA Supercomputers had minimum features of 1,200 nanometers. (In 2009, by contrast, the production capability is 45 nanometers.)</span>
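The relation P = CV<sup>2</sup>f quoted above can be illustrated numerically. This is a minimal sketch: the capacitance, voltage and frequency values are hypothetical placeholders chosen only to show the scaling, not ETA Systems design data.

```python
def dynamic_power_watts(capacitance_farads: float,
                        supply_volts: float,
                        frequency_hz: float) -> float:
    """Switching (dynamic) power of CMOS logic: P = C * V^2 * f."""
    return capacitance_farads * supply_volts ** 2 * frequency_hz


# Hypothetical operating point: 1 pF switched load, 5 V supply, 100 MHz.
base = dynamic_power_watts(1e-12, 5.0, 100e6)

# Doubling the frequency doubles the power; doubling the voltage quadruples it.
double_frequency = dynamic_power_watts(1e-12, 5.0, 200e6)
double_voltage = dynamic_power_watts(1e-12, 10.0, 100e6)

print(f"ratio at 2x frequency: {double_frequency / base:.1f}")
print(f"ratio at 2x voltage:   {double_voltage / base:.1f}")
```

The linear dependence on frequency is what the text means by power being consumed only when the circuit switches.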
 
 
 
<span style="font-size:13.0pt;
 
font-family:Helvetica">''Advantages of CMOS were obvious; more circuits per
 
given chip area, lower power consumption and higher functional yield.&nbsp; It
 
is important to stress “functional yield”. The CMOS devices functioned over a
 
much larger range of processing variations (&gt; 50% Vs. &lt; 15% to 25% for
 
ECL).&nbsp;''</span><span style="font-size:13.0pt;font-family:Helvetica">
 
Performance variations for a given process were approximately 2 to 3 times for
 
CMOS and 20% to 30% for ECL.&nbsp; For this reason CMOS devices were sold at a
 
much lower performance than any bipolar counterpart.&nbsp; That is, if the product was specified to accommodate the entire functional lot (wafers processed at the same time), more IC devices yielded.&nbsp; There is one other key difference in
 
defining performance differences between Bipolar and CMOS devices.&nbsp; For
 
ECL (or any other bipolar device) the maximum operating frequency is defined,
 
in part, by the base width – the physical distance between the emitter and
 
collector of the transistor.&nbsp; This is determined by the spacing based on
 
diffusion or implant of the emitter and is controlled in the vertical direction
 
and limited by process control that is quite precise.&nbsp; This parameter is
 
very thin, and the maximum frequency is inversely proportional to the base
 
width.&nbsp; For CMOS the gate length defines the critical performance
 
parameter.&nbsp; Gate length is defined by mask optic limitations for any
 
generation of processes.&nbsp; Bipolar devices in the 1980’s and well into the
 
latter half of the 1990s, therefore, had higher maximum operating frequencies than their CMOS counterparts.&nbsp; As capital equipment (primarily the optics used to generate masks, along with etching capabilities) defined smaller and smaller geometries, CMOS technology improved dramatically in performance.&nbsp; This
 
was a result of smaller gate lengths but also each generation had smaller
 
devices resulting in lower capacitance loading and lower time constants to
 
charge and discharge.&nbsp; During the time of the ETA Systems Supercomputer
 
development, CMOS technology had not seen the advantages that bipolar devices
 
could realize – but the potential for future improvements was obvious and
 
projections clearly indicated that by the second half of the 1990’s (nearly 10
 
years after the first ETA Systems Supercomputer would be available), CMOS would
 
overtake Bipolar in the last and most important parameter: performance.&nbsp; To restate this: the IC industry was transitioning to CMOS technology, and more funding at the device and equipment level was being spent on new markets focused on the potential of CMOS than on bipolar devices.</span>
 
 
 
<span style="font-size:13.0pt;
 
font-family:Helvetica">Bipolar technology was stretched to a practical limit
 
for the time frame in question.<o:p></o:p></span>
 
 
 
<span style="font-size:13.0pt;font-family:Helvetica">The IC industry, therefore, had only one other technology candidate: CMOS, which in 1983 was used exclusively for lower-cost, considerably lower-performance applications and for memory devices, where more bits per die could be fabricated at the expense of lower performance than the bipolar counterpart(s).&nbsp; The impressive characteristics of CMOS technology at this time were lower power consumption per function, smaller size per logic function and lower cost per die. The lower cost came from two key factors: smaller physical size per function meant more logic functions per unit of area, and chip yield (functional chips per wafer manufactured) was higher because fewer processing steps were needed to generate CMOS devices.&nbsp; That was the good news.&nbsp; The concern was system performance.&nbsp; While bipolar technology had set the standard of 10-nanosecond clock periods for Supercomputer architectures such as the ETA Systems projection, CMOS was at least 5 times slower, and in most cases 10 to 20 times slower, for equivalent architectures. Based on this parameter alone, CMOS was not a candidate for Supercomputers in the 1989-1990 time frame (the time frame in which the ETA Systems Supercomputer would be in high-volume production).</span>
 
 
 
<span style="font-size:13.0pt;
 
font-family:Helvetica">The next steps for CDC (recall that at this time CDC
 
still had a Supercomputer Division) were dramatic and at times emotional.&nbsp;
 
First, the team had to discard the ECL design and terminate the effort with
 
Motorola.&nbsp; This was very difficult since both companies depended on each
 
other and secondly, all objectives of the ECL product were being met within the
 
specifications established.&nbsp; CDC (team which later became ETA Systems)
 
provided Motorola with all of the design details to date.&nbsp; Considerable
 
effort was made to ensure that the program was successful at Motorola.</span>
 
 
 
<span style="font-size:13.0pt;
 
font-family:Helvetica">A sidelight to this discussion – Motorola completed this
 
product as an industry product.&nbsp; Cray Research Inc. (the key competition
 
and leader of the Supercomputer market) engaged with Motorola to successfully
 
complete this complex IC development for a product announced in the late
 
1980’s. The product (Cray C-90) under the leadership of Les Davis, Steve Nelson
 
and other notable scientists (a key circuit designer was Mark Birrittella),
 
became another very successful supercomputer product developed and
 
manufactured by Cray Research Inc.<o:p></o:p></span>
 
 
 
<span style="font-size:13.0pt;
 
font-family:Helvetica">Next, a full effort evaluation of all technology
 
candidates occurred.&nbsp; CMOS futures were explored in depth.&nbsp; GaAs
 
technology was also evaluated.&nbsp; Alternative ECL (bipolar) candidates were
 
also considered.&nbsp; CMOS was viewed as the technology of the future but the
 
future was beyond the time frame necessary for product introduction.<o:p></o:p></span>
 
 
 
<span style="font-size:13.0pt;
 
font-family:Helvetica">The following paragraphs summarize key events that led
 
to the decision to use CMOS technology.</span>
 
 
 
<span style="font-size:13.0pt;
 
font-family:Wingdings">Ø&nbsp;&nbsp; </span><span style="font-size:13.0pt;
 
font-family:Helvetica">Moore’s law (formulated by the great innovator and co-founder of Intel, Gordon Moore) held that IC (CMOS) technology would double in performance and density every 18 months to two years.&nbsp; The actual Moore’s law may have been stated somewhat differently, but this captured all the project cared about.&nbsp; To achieve this predicted growth, several things had to happen:</span>
 
 
 
<span style="font-size:13.0pt;
 
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">The
 
die size would increase (more gates per manufactured chip).<o:p></o:p></span>
 
 
 
<span style="font-size:13.0pt;
 
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Features on the chip (the metal widths and spaces that interconnect devices, and the actual device parameters) would be reduced every 18 months to 2 years. Reducing feature sizes has two benefits for the goals of ETA Systems: increased performance and more gates per die.</span>
 
 
 
<span style="font-size:13.0pt;
 
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">The
 
technology would gain popularity – this would mean that capital equipment would
 
keep pace with the “law”, applications would increase thus increasing volume,
 
thus lowering cost and increasing performance and more applications and
 
industries would drive CMOS technology – the Supercomputer industry could not
 
drive such a large industry.<o:p></o:p></span>
 
 
 
 
 
 
<span style="font-size:13.0pt;font-family:Helvetica">Key industry activities also
 
emerged at this time:<o:p></o:p></span>
 
 
 
<span style="font-size:13.0pt;
 
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">CDC
 
validated operational performance gains operating CMOS technology in a
 
cryogenic environment.&nbsp; Several ring-counter configurations built with the 5,000-gate chip discussed earlier were dipped in a thermos jug of liquid nitrogen. We expected to witness the silicon shatter and the solder joints to the oscilloscope leads detach; instead, the frequency of the ring oscillators doubled and the system operated for weeks until we turned off the experiment.&nbsp; Analysis applied to the silicon design validated the research done previously by others.</span>
 
 
 
<span style="font-size:13.0pt;
 
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Key
 
US Government agencies began a technology acceleration program based on CMOS
 
technology – the Very High Speed Integrated Circuits (VHSIC) program under
 
direction of the Army, Navy and Air Force certainly captured our attention.<o:p></o:p></span>
 
 
 
<span style="font-size:13.0pt;
 
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Honeywell, one of the participants in the VHSIC program, held a technology luncheon at an IEEE symposium at which it presented an 11,000-gate CMOS development effort.&nbsp; Attendees from CDC (especially the key designer, Randy Bach) were impressed with the effort.&nbsp; The chip was certainly larger than any that had been developed to date, and the performance was accelerated beyond what the conventional IC industry predicted for the 1988 time frame (the introduction date set for the ETA Systems Supercomputer, then the next-generation CDC Supercomputer).&nbsp; Honeywell was a recipient of one of the VHSIC contracts.</span>
 
 
 
<span style="font-size:13.0pt;
 
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Logicians
 
and architects back at CDC - led by Neil Lincoln (chief architect), Ray Kort, Maurice Hudson, Dave Hill and others - determined that a minimum gate
 
density of 15,000 gates per die would allow them to achieve a key objective;
 
having a worst case Register to Register clock path residing within a single
 
chip.&nbsp; Now additional explanation is required here.&nbsp; There were
 
technical reasons that the logicians wanted more beyond the knee jerk reaction
 
that asking for 50% more than offered was a standard mode of operation for
 
these guys. Each architecture configuration has a method of achieving its goals
 
of applying computational instructions to problems.&nbsp; The number of gates
 
that are connected in serial fashion between the input and output registers
 
(and this is truly simplifying the problem) determines the clock period that is
 
allowed.&nbsp; For the ETA Systems Supercomputer, therefore, it was determined
 
that a functional unit clock period could reside within the boundary of the
 
chip if the chip could provide 15,000 gates of logic to the designer.<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>Before getting to the details of how decisions were made and how the ETA Systems technology “kit” was selected and developed, here is a list of noteworthy accomplishments: </p>
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Research
 
into technology experiments uncovered significant performance features of CMOS
 
technology.&nbsp; First of all, the technology was functional across a wide
 
range of voltages and temperatures but performance was significantly
 
altered.&nbsp; The higher the operating voltage (within semiconductor
 
constraints, of course), the higher the performance.&nbsp;
 
Unfortunately the Power consumption, although significantly lower than any
 
alternative technology, increased as the Square of the operating voltage.&nbsp;
 
The lower the operating temperature of CMOS the higher performance as well.
 
This factor was studied by others and carefully documented from 400 degrees
 
Kelvin (100 degrees above room temperature) to 77 degrees Kelvin.&nbsp; (77
 
Degrees Kelvin is the boiling point temperature of liquid Nitrogen.)&nbsp; <o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
*First Industry competitive CMOS CPU
font-family:Helvetica">So, let’s summarize what was learned with this
 
evaluation:<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>Since 1995 – to the present (beginning 12 years after the technology selection by ETA Systems I might add) ALL HPC (High Performance Computers) are developed and manufactured using CMOS IC technology. Until as late as 2000, bipolar technology (higher power, more costly to manufacture and lower gate count per chip) dominated high performance computers throughout the world. </p>
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">IC
 
chips currently (four years before the need for an ETA Systems product) had a
 
capacity of 11,000 gates.<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
*First Industry Single Board CPU
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">These gates, when operated at liquid nitrogen temperatures, would perform at least two times faster than at room temperature (not yet validated at CDC).</span>
 
  
<span style="font-size:13.0pt;
+
<p>The chip density (gates per chip) allowed by advanced CMOS, the use of computer-aided design tools for optimum layout and simulation, the successful design of a 45-layer advanced printed circuit board (you read it right, 45 layers) and innovative chip attachment and cooling permitted a single processor containing nearly 3 million gates to be packaged on a single board. </p>
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">15,000
 
useable gates were required per chip to meet logic designer chip boundary
 
requirements.<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
*First Industry system to be designed with self-test
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">If
 
Moore’s law was applied to these parameters, within the time frame required, it
 
was possible to achieve both gates per chip densities and performance goals (if
 
the system operated in a liquid Nitrogen environment).&nbsp;&nbsp;<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>CPU Processing units (≈3 million gates each) were validated for functionality and performance in less than 4 hours. Any interconnect errors were recorded, which allowed chip-to-chip replacement to occur in minimal time. Other CPUs of this same period required weeks to months to check out and validate. Incoming testing of the logic IC chip (function and performance) also used the same self-test innovations. </p>
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">There
 
were at least two IC Suppliers (those having contracts with the US government)
 
that were pursuing CMOS as a performance and high gate/chip density technology
 
(the other known corporation was TRW).<o:p></o:p></span>
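The reasoning in this summary can be sketched as a quick projection. The code assumes the simplified form of Moore's law stated earlier (density doubling every 18 months to two years) and starts from the 11,000-gate chip four years ahead of need; the function name is hypothetical and the result is an idealized projection, not a measured figure.

```python
def projected_gates(start_gates: float, years: float,
                    doubling_period_months: float) -> float:
    """Project gate count forward, assuming one doubling per period."""
    doublings = (years * 12) / doubling_period_months
    return start_gates * 2 ** doublings


START_GATES = 11_000      # chip capacity quoted in the text
YEARS_AVAILABLE = 4       # lead time before the product was needed
REQUIRED_USABLE = 15_000  # usable gates the logic designers asked for

slow_case = projected_gates(START_GATES, YEARS_AVAILABLE, 24)
fast_case = projected_gates(START_GATES, YEARS_AVAILABLE, 18)

print(f"doubling every 24 months: {slow_case:,.0f} gates")
print(f"doubling every 18 months: {fast_case:,.0f} gates")
print("meets 15,000-gate requirement:", slow_case >= REQUIRED_USABLE)
```

Even the slower doubling rate leaves margin over the 15,000-gate requirement on paper, which is the sense in which the goal looked achievable within the time frame.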
 
  
<span style="font-size:13.0pt;
+
*First Industry production Liquid Nitrogen CPU
font-family:Helvetica">Computer Aided Design (CAD) tools were, during the
 
period of the 80’s, in the infancy stage if one was to compare them to today’s
 
capabilities.&nbsp; To design, place cells within the matrix of the gates
 
provided on the IC Chip, and route the interconnections of these cells
 
accurately to the logic or Boolean design required by the logicians and to
 
clock period constraints was a challenge.&nbsp; This challenge applied to board
 
layout designs as well.&nbsp; Control Data Corporation (CDC) recognized the
 
challenges and established a small but efficient and dedicated organization to
 
address these challenges.&nbsp; The industry had established a metric that to
 
use CAD tools for gate or cell arrays, an additional 20% to 30% gates were
 
required.&nbsp; This meant if the ETA Supercomputer required at least 15,000
 
useable gates to accomplish necessary designs based on its architecture, an 18,000
 
to ≈20,000-gate capacity was required.&nbsp; The technology organization set as its objective a design of 20,000 gates plus the circuitry necessary to self-test each gate or cell array.&nbsp; This was nearly 2 times the capacity of the gate array in development at Honeywell (11,000 total gates vs. 20,000 total gates plus circuitry for self-test).</span>
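The gate budget in this paragraph is straightforward arithmetic, sketched below. The only inputs are figures from the text (15,000 usable gates, a 20% to 30% CAD overhead, the 20,000-gate objective and Honeywell's 11,000-gate array); the variable and function names are hypothetical.

```python
USABLE_GATES = 15_000     # gates the logic designers needed per chip
DESIGN_TARGET = 20_000    # objective set by the technology organization
HONEYWELL_ARRAY = 11_000  # gate array then in development at Honeywell


def total_gates_needed(usable: int, overhead_percent: int) -> int:
    """Total array size when CAD placement and routing wastes a share of gates."""
    return usable * (100 + overhead_percent) // 100


low_estimate = total_gates_needed(USABLE_GATES, 20)
high_estimate = total_gates_needed(USABLE_GATES, 30)
capacity_ratio = DESIGN_TARGET / HONEYWELL_ARRAY

print(f"required capacity: {low_estimate:,} to {high_estimate:,} gates")
print(f"target vs. Honeywell array: {capacity_ratio:.2f}x")
```

The 20,000-gate objective sits just above the upper estimate, and the ratio to the Honeywell array matches the "nearly 2 times" comparison in the text.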
 
  
<span style="font-size:13.0pt;
+
<p>The ETA Systems CPU was immersed in Liquid Nitrogen – 77 degrees Kelvin – to improve performance greater than two times that CMOS technology operated at room temperature – 300 degrees Kelvin. </p>
font-family:Helvetica">The task was to convince Honeywell to project the next
 
generation size and layout rules and to accept an R&amp;D effort that would allow CDC / ETA Systems to achieve its objectives. Honeywell, an innovative organization, took on the task after considerable discussion, with these key requirements:</span>
 
  
<span style="font-size:13.0pt;
+
*First system at CDC to fully utilize Computer Design Software to design Chips, boards, validate Logic design and Auto Diagnostic test the system with Synergistic tools
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">ETA Systems (we were ETA Systems by the time these discussions reached negotiations) would accept costs based on wafers processed, not functional chips.&nbsp; Honeywell would provide the processing data necessary to show that wafers were processed within process parameter specifications.</span>
 
  
<span style="font-size:13.0pt;
+
<p>Permitted checkout of a CPU to be completed in less than 4 hours. Manufacturing costs were greatly reduced. This technique was also used at the IC Supplier and greatly reduced any probe test hardware and software. </p>
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">ETA Systems would provide test equipment for wafer testing and test parameters for chip acceptance prior to packaging.</span>
 
  
<span style="font-size:13.0pt;
+
*First Industry system to have multiple cost designs from single design effort
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Both companies would share facilities and key resources and work as a single team, as “open kimono” a relationship as one could ever imagine during this dynamic period of complex process developments within the IC industry. David Frankel was assigned as the ETA Systems interface and energetically took on the challenging task.</span>
 
  
<span style="font-size:13.0pt;
+
<p>Performance range of the ETA System products was greater than 24:1 (8 processor system operating at 7 nanoseconds Clock period and a single processor system operating at 24 nanoseconds.). Processors were manufactured, tested and validated from a single manufacturing line using identical components. (IC Chips were performance sorted using auto self test). Product differences began at the system packaging level. </p>
font-family:Helvetica">&nbsp;<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
== Boring into details  ==
font-family:Helvetica">Self-test circuitry was designed into the basic cell
 
array periphery.&nbsp; The area consumed by this custom set of pseudo-random
 
generated logic and registers was less than 15% of the total chip area.&nbsp;
 
(David Resnick, resident do-it-all reduced concepts explored by ex CDC
 
scientist Nick Van Brunt who left the company a year previous to the formation
 
of ETA Systems.)<span style="mso-spacerun: yes">&nbsp; </span>This was one of many extraordinary contributions David made to ETA Systems.<span style="mso-spacerun: yes">&nbsp; </span>In addition to providing self-test capability to accept or reject the circuitry (both functionality and performance sorting), the circuitry included in each 20,000 gate array was able to test the interconnect between circuits on the final PC board as well as the circuit to I/O connections.<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>Any Technology kit must be driven by a customer need. In the case of Supercomputers the craving for increased computer performance at a lower cost (overall cost) was the deciding factor. In any Supercomputer company a combination of marketing requirements, architecture innovations and logic design demands dictate the initial objectives of the hardware circuit and packaging organization. I state “initial” since once the objectives are digested and key technologies are evaluated for the time frame addressed, compromises are the norm. In the case of ETA Systems technology selections in the early 1980’s, this was the strategy implemented. </p>
font-family:Helvetica">When the logic design team first heard of this area
 
“waste” of test circuits that could be used for logic design, they lobbied for
 
it to be removed in favor of more logic gates for function designs.&nbsp; Fortunately
 
this request was not honored.&nbsp; IC validation at both the supplier in wafer
 
form and at ETA Systems in packaged chip configuration coupled with the use of
 
the same circuitry in manufacturing checkout to detect board opens and shorts
 
between circuits assembled both in room temperature and cryogenic temperature
 
environments proved to be well worth this “waste” of circuitry area. Small,
 
relatively inexpensive testing systems were designed by ETA Systems and
 
provided to the supplier.&nbsp; The operands for initialization of the
 
pseudo-random logic were also supplied for each design (chip type).<o:p></o:p></span>
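The self-test idea described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the text does not give the actual ETA circuit, so the 16-bit linear feedback shift register, its tap positions, the rotate-and-XOR signature and the `logic_under_test` function are all hypothetical stand-ins. The seed plays the role of the initialization operand supplied for each chip type. The sketch covers only the functional pass/fail check, not performance sorting.

```python
def lfsr_patterns(seed, count, width=16, taps=(16, 14, 13, 11)):
    """Yield `count` pseudo-random patterns from a Fibonacci LFSR (assumed taps)."""
    mask = (1 << width) - 1
    state = seed & mask
    if state == 0:
        raise ValueError("seed must be non-zero")
    for _ in range(count):
        yield state
        bit = 0
        for t in taps:
            # XOR together the tapped bits to form the feedback bit
            bit ^= (state >> (width - t)) & 1
        state = (state >> 1) | (bit << (width - 1))


def signature(responses, width=16):
    """Compact a stream of responses into one word (rotate left, then XOR)."""
    mask = (1 << width) - 1
    sig = 0
    for r in responses:
        sig = ((sig << 1) | (sig >> (width - 1))) & mask
        sig ^= r & mask
    return sig


def logic_under_test(pattern):
    """Hypothetical stand-in for the logic on one chip type."""
    return (pattern ^ (pattern >> 3)) & 0xFFFF


def self_test(seed, expected, count=256):
    """Return True when the observed signature matches the expected one."""
    got = signature(logic_under_test(p) for p in lfsr_patterns(seed, count))
    return got == expected


# The "known good" signature would come from simulation of the design.
SEED = 0xACE1
golden = signature(logic_under_test(p) for p in lfsr_patterns(SEED, 256))
print(self_test(SEED, golden))
```

Because only a seed and an expected signature are needed per chip type, a tester built around this scheme can stay small, which is consistent with the inexpensive testing systems mentioned above.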
 
  
<span style="font-size:13.0pt;
+
<p>The following paragraphs sequence the thought process and the technology selection strategy utilized. </p>
font-family:Helvetica">Chip types (array design options) were carefully managed
 
as to not proliferate the chip types in the system.&nbsp; This was a new
 
constraint placed on logic designers and was dealt with most professionally and
 
responsibly by all participants once understood.&nbsp; The resultant chip total
 
for the CPU (processing unit) was fewer than 150 while the chip types including
 
clock chips and all logic design chips was fewer than 20 as best recalled.<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
==== Integrated Circuit selection  ====
font-family:Helvetica">During the development cycle of the ETA System
 
Supercomputer, Honeywell moved the manufacturing capability from a local
 
Minneapolis facility to a state-of-the-art manufacturing facility in Colorado
 
Springs, CO.&nbsp; The transition was very transparent to ETA Systems (with the
 
exception of the traveling budget, of course). To accomplish this team
 
membership from both companies acted as one in all decisions addressing
 
scheduling and timing of needs of various chips, testing, packaging, etc. The
 
open book relationship was very beneficial to both companies. On one milestone
 
occasion – where Honeywell successfully completed an initial order – Dave
 
Frankel and I visited Honeywell, some 30 miles from the ETA Systems facility,
 
and served cake and coffee to all designers and operators – it was below zero
 
when this milestone was reached and no one cared.<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>The objectives, listed in earlier paragraphs, were first integrated into the architecture and logic design requirements. A market survey of key integrated circuit suppliers was conducted with emphasis on what was in development and planned for product introduction – not what was available at the time of the survey. A risk assessment was made. The primary focus was on the most dynamic technology, the IC logic technology. All decisions as to volume requirements, pins, packaging, etc. resulted from what was determined by this survey and risk analysis. Merging in the logic design objectives (gates, bandwidth and performance of key functions) was next. </p>
font-family:Helvetica">One design that was incorporated into the chip was to
 
allow for next generation critical processing parameters to be added to the
 
existing design (present chip layout).&nbsp; Although this would not optimize
 
the features of new process features (all parameters were not considered), key
 
performance enhancements could be and were added to the present design.&nbsp; A
 
key feature was gate length and this was added transparently to the physical
 
chip and offered appreciable performance enhancements to the design.<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>An ECL (emitter coupled logic) high performance bipolar gate array using Motorola advanced IC technology was selected. Since Motorola was not fully staffed to begin the actual product development (application) but did have the process development underway, a cooperative development agreement was struck between the two companies (this occurred between Motorola and Control Data since ETA Systems had not yet been formed). The design called for basic logic cells to be incorporated into a larger version of their existing gate array, advancing the process for increased performance and the chip size for increased gate capacity. The existing gate or function array utilized approximately 2,500 gates (it was used as the primary gate array for the very popular Cray Research Y-MP Supercomputer) and the planned gate array would contain in excess of 8,000 equivalent gates. </p>
font-family:Helvetica"><u>Chip design summary</u>: The decision to utilize CMOS
 
technology for the ETA Systems Supercomputer in the 1985 – 1988 time frame
 
(prematurely by all industry metrics) resulted in the following additional
 
“technology kit” decisions:<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>Logic cell libraries were agreed to (acceptable both to Motorola for the general market and to CDC for the logic designs). Pin counts (for power, ground and input/output logic communications) were established and power consumption estimates were made. Once these parameters were established, board size, power systems and thermal control were evaluated in a give-and-take trade-off. Features of printed circuit boards (line widths, spacing, interconnect vias and number of layers) were compared to the board size capacities, laminating press capabilities, drill designs and printed circuit board processing limits. IC packaging limits (minimum size of package, pin spacing, thermal removal, etc.) were evaluated in parallel with the PC board limits. </p>
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Addition
 
of chip self-test. Feature established functionality at wafer test and
 
functionality and performance sorting at ETA Systems<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>The chip design, the cell library and the packaging work began once all parameters (pins, power consumption and die size objectives) were agreed to. Printed circuit board experiments also began. Once feasibility and practical limits were established (the original goals could be met as to physical design and performance, based on IC modeling and extrapolation from previously established functional systems), a preliminary specification was presented to the architects and logic designers for review. </p>
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Computer
 
Layout tools that validated logic prior to chip release for fabrication<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>From the initial design data, logic design based on the parameters provided established a physical size for the Central Processing Unit or CPU, the heart of the system. A multiple board processor was required. This placed additional constraints on packaging since within a single processor all distances between circuits are crucial. Three-dimensional packaging concepts were considered. Three-dimensional packaging effectively meant a “sandwich” of multiple boards, with board-to-board interconnects placed throughout the board area rather than only at the periphery, so that distances between chips on different boards would be minimized. In addition, power consumption estimates were made, and thermal removal paths and techniques were considered. A cost model was generated as well. All of these factors resulted in a preliminary estimate of the CPU volume. As you already know from the introduction portion of the document, this was rejected - more to follow for sure. </p>
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Requirement
 
to operate the chip at 77 degrees Kelvin or in liquid Nitrogen<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>In parallel with these efforts, memory design was underway. Less freedom was available to memory since the basic semiconductor device could not be altered to accommodate specific users. There were a few packaging alternatives, very few, and device configurations (Word – Bit architecture, pin numbering, power considerations, etc.) were dictated by the industry. Since memory design has its own objectives for cost, reliability and performance, this effort could continue quite independently with one exception, the packaging of the total system must be synergistic and compatible. A crucial parameter of this is the interconnect mechanism between processors and memory. </p>
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Packaging,
 
interconnect &amp; assembly decisions based on liquid Nitrogen operation
 
challenges<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>A hardware system cost model was established – not only for current cost considerations but also estimates on volume costs based on learning-curve estimates as well for the life of the system. </p>
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Remote
 
testing of the CPU because of liquid Nitrogen operation challenges<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>The chief architect, after careful review, rejected the design; this was covered in the introduction. Three key reasons were cited. First, performance would be impacted by the 8,000 gate limit (worst case logic paths could not reside in a single chip, and multiple chip distances would increase the clock period). Second, power consumption per CPU, although lower on a performance ratio basis than in previous generations, was too high when the total system size (including the multiprocessor objectives) was considered. Third, system cost appeared prohibitive – always a subjective issue but nevertheless a key component of the design. Reliability concerns were also stated since the pin count per CPU, although considerably reduced from previous designs, was of concern. The architecture was committed to four CPUs (max) per system so the interconnect "bar" was raised. </p>
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Logic
 
design partitioning challenges to design within 15,000-gate per chip boundaries
 
and a minimum of IC chip types<o:p></o:p></span>
 
  
<span style="font-size:14.0pt;
+
== Back to the drawing board  ==
font-family:Helvetica">Printed Circuit Board Design Selection</span><span style="font-size:13.0pt;font-family:Helvetica">:<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>Bipolar technology refers to conventional NPN and PNP transistors operating in a non-saturating mode (collector-base). By not saturating the operating transistors (not allowing the base voltage to be higher than the collector voltage) the switching characteristics were improved and balanced (off logic level and on logic level had identical delays). In addition, the non-saturating circuitry – titled ECL for emitter coupled logic – provided the TRUE and COMPLEMENT outputs for each logic function (i.e.; AND &amp; NAND, OR &amp; NOR, etc.). This gave logicians advantages in designing complex Boolean functions (ADD units, MULTIPLY units, DIVIDE units, etc.). Under the category of “no free ride”, ECL circuitry consumed more power than the more popular but much slower saturating logic circuitry (TTL – transistor-transistor logic). Other improvements in performance for integrated versions of ECL logic circuitry included replacing conventional junction isolation between circuit devices on a single die with oxide isolation between circuits (lower capacitance per circuit, so less charging and discharging when logic levels switched). </p>
font-family:Helvetica">In the period of the 1980s, the time frame of the ETA
 
Systems Supercomputer development, Printed circuit boards had maximum
 
dimensions of approximately a square foot and the number of total layers fewer
 
than 20.&nbsp; (Layers provide power and ground stability, interconnect
 
capability for the circuits attached to the board as well as inputs and outputs
 
to and from the board.)&nbsp; If these total layers are allocated properly,
 
approximately 50% are used for interconnect and the remaining for power and
 
ground.&nbsp; Positioning of power and ground layers also serve to provide
 
interconnect layers that have transmission line capabilities to insure signal
 
integrity throughout the board. During this period, a state-of-the-art printed
 
circuit board was approximately one square foot of active circuitry and as
 
stated earlier, 20 layers or fewer usually restricted to a total thickness of
 
0.063 inches.<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>CMOS (Complementary Metal Oxide Semiconductor) circuitry, especially at the time of ETA Systems, was a simpler and more efficient logic circuit. This form of logic also had a simpler process. Stacking P channel and N channel transistors in series between the voltage bus rails defines a single complementary gate. Functionality of the logic devices is much more forgiving of process variations because of the larger voltage swing and because only active transistors are used to define the circuitry (no resistors, diodes, etc.). The physical size of a logic function is significantly smaller than that of a bipolar equivalent, resulting in more circuitry per equivalent die (chip) size. CMOS technology also consumed power ONLY when the circuit was switching (changing states), so power consumption was directly proportional to the frequency at which it was operating (P = CV<sup>2</sup>f). ECL circuitry, by contrast, consumed approximately the same power whether switching or in a quiescent state. (Later forms of CMOS, especially those designed in the early 2000s and beyond, had increased power consumption, primarily caused by increased bulk leakage currents resulting from processes developed for lithography with features smaller than 90 nanometers.) Technology at the time of the development of the ETA Supercomputers had minimum features of 1,200 nanometers. (In 2009, by contrast, the production capability is 45 nanometers.) </p>
font-family:Helvetica">It was determined that a maximum of 150 chips would be
 
required to design the ETA Systems Supercomputer CPU.&nbsp; Packaging of the IC
 
and interconnecting the chip to a PC board with minimum spacing between chips
 
(some spacing was required to allow interconnects to all of the necessary
 
layers) resulted in a 1.2x1.2 sq. inch “footprint”.&nbsp; Doing the simple math
 
results in a pc board of a minimum of 220 sq. inches.&nbsp; The number of total
 
layers required to interconnect the 150 chips and the necessary Input and
 
Output at the board periphery was determined to be 45.&nbsp; Looking at design
 
parameters of the board layers in more depth and insuring transmission line
 
features to insure signal integrity defined the board thickness at slightly
 
greater than 0.25 inches.&nbsp; This thickness was approximately three times
 
greater than high-end printed circuit boards produced in this time frame.&nbsp;
 
With a board having an area of greater than 1.5 times the size of what was able
 
to be produced, a thickness of 300% of what was produced and a number of
 
layers 2.5 times of what was produced in this time frame it was clear that the
 
printed circuit board industry was not ready for the ETA Systems design! The
 
design has other limitations.&nbsp; A key factor when designing pc boards is to
 
insure proper connecting of the layers, i.e.; connecting the chip pins to the
 
board and the proper layer of interconnect in the board and back to the proper
 
receiving chip.&nbsp; Drilling holes in the layers and plating the wall of the
 
holes with copper for conduction make these connections.&nbsp; These are called
 
plated thru holes or PTH. A key parameter to insure that plating occurs in
 
these holes is the hole diameter to depth ratio.&nbsp; The industry at this
 
period (not much better today) is 6:1, i.e.; the thickness of the board must be
 
no more than 6 times the diameter of the hole.&nbsp; This ratio would dominate
 
the size of the board. If this ratio is used to design the board the board size
 
would be increased in area by greater than 9 times.&nbsp; Talk about piling
 
on!&nbsp; Since it was deemed not feasible, issues like cost and time to
 
fabricate the board were not even addressed.<o:p></o:p></span>
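The "simple math" in the paragraph above can be reproduced in a few lines. This is a minimal sketch using only the figures quoted in the text (150 chip sites, a 1.2 inch square footprint, a board roughly 0.25 inches thick and a 6:1 plating ratio). The function and variable names are illustrative, and the hole-diameter calculation assumes the ratio is read as board thickness divided by hole diameter.

```python
# Figures quoted in the text; names are illustrative only.
CHIP_SITES = 150            # maximum chips on the CPU board
FOOTPRINT_IN = 1.2          # side of the square footprint per chip, inches
BOARD_THICKNESS_IN = 0.25   # approximate thickness of the 45-layer board, inches
INDUSTRY_RATIO = 6          # board thickness : hole diameter limit of the period


def min_board_area(sites, footprint_side):
    """Area taken by the chip footprints alone, in square inches."""
    return sites * footprint_side ** 2


def min_hole_diameter(thickness, ratio):
    """Smallest hole that can be plated for a given thickness-to-diameter ratio."""
    return thickness / ratio


area = min_board_area(CHIP_SITES, FOOTPRINT_IN)
hole = min_hole_diameter(BOARD_THICKNESS_IN, INDUSTRY_RATIO)
print(f"footprint area: {area:.0f} sq. in.")
print(f"smallest platable hole at 6:1: {hole:.3f} in.")
```

The footprint area comes out just under the 220 square inches quoted, and the 6:1 limit forces holes of roughly 0.04 inches through a quarter-inch board, which is why the ratio dominated the board size.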
 
  
<span style="font-size:13.0pt;
+
<p>Advantages of CMOS were obvious: more circuits per given chip area, lower power consumption and higher functional yield. It is important to stress “functional yield”. The CMOS devices functioned over a much larger range of processing variations (&gt; 50% vs. &lt; 15% to 25% for ECL). Performance variations for a given process were approximately 2 to 3 times for CMOS and 20% to 30% for ECL. For this reason CMOS devices were sold at a much lower performance than any bipolar counterpart (i.e., if the product was specified to accommodate the entire functional lot of wafers processed at the same time, more IC devices yielded). There is one other key difference in defining performance differences between bipolar and CMOS devices. For ECL (or any other bipolar device) the maximum operating frequency is defined, in part, by the base width – the physical distance between the emitter and collector of the transistor. This is determined by the spacing based on diffusion or implant of the emitter; it is controlled in the vertical direction and limited by process control that is quite precise. This dimension is very thin and the frequency is inversely proportional to the base width. For CMOS the gate length defines the critical performance parameter. Gate length is defined by mask optic limitations for any generation of processes. Bipolar devices in the 1980’s and well into the latter half of the 1990’s, therefore, had higher maximum operating frequencies than their CMOS counterparts. As capital equipment – primarily the optics used to generate masks, and etching capabilities – defined smaller and smaller geometries, CMOS technology improved dramatically in performance. This was a result of smaller gate lengths, but each generation also had smaller devices, resulting in lower capacitance loading and lower time constants to charge and discharge. 
During the time of the ETA Systems Supercomputer development, CMOS technology had not yet reached the performance that bipolar devices could realize – but the potential for future improvements was obvious and projections clearly indicated that by the second half of the 1990’s (nearly 10 years after the first ETA Systems Supercomputer would be available), CMOS would overtake bipolar in the last and most important parameter – performance. To restate this: the IC industry was transitioning to CMOS technology, and more funding at the device and equipment level was being expended to accommodate new markets focused on the potential of CMOS than was being expended for bipolar devices. </p>
font-family:Helvetica">Nestled into the design laboratory of Control Data
 
Corporation was a small but very innovative printed circuit board prototype
 
facility. The leader of this group, LeRoy Beckman, never said “no” to
 
challenges.<span style="mso-spacerun: yes">&nbsp; </span>He just bit his pipe a little harder and tried not to snicker out loud.<span style="mso-spacerun:
 
yes">&nbsp; </span>LeRoy kept his eyes and ears out for innovative alternatives to conventional board fabrication techniques and had previously displayed innovation (evolutionary in nature) in previous generations.&nbsp; Embedded termination resistors in layers was one invention he brought to CDC when resistor termination took up too much board area; finer features than the industry was producing another, and higher plated through hole (pth) ratios than the industry a third.<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>Bipolar technology was stretched to a practical limit for the time frame in question. </p>
font-family:Helvetica">New technologies in the printed circuit board were few
 
and far between.&nbsp; The industry was set in its ways of subtractive etching

of circuit layers (removing unwanted copper from a pre-copper clad layer),

conventional wet etch processes and relatively simple assembly, i.e.; lamination
 
of layers with pressure.&nbsp; One inventor, Mr. Peter P. Pellegrino, arrived
 
on the scene to discuss innovative, revolutionary and proven pc board processing.&nbsp;
 
At first the claims appeared to be too good to be true.&nbsp; Board size
 
relatively independent, aspect ratios exceeding 20:1 for PTH, an additive
 
process that permitted finer lines to be fabricated on individual layers.&nbsp;
 
The layers were also embedded into the laminate so the opportunity for higher
 
yield with reduced features.&nbsp; An additional benefit of additive plating is
 
reduction in waste and water usage.&nbsp; <o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>The IC industry, therefore, had only one other technology candidate, CMOS, which in 1983 was used exclusively for lower cost and considerably lower performance applications, and for memory devices, where more bits per die could be fabricated at the expense of lower performance than the bipolar counterpart(s). The impressive characteristics of CMOS technology at this time were lower power consumption per function, smaller size per logic function and lower cost per die due to two key factors: smaller physical size per function meant more logical functions per unit of area, and chip yield (chip functionality per wafer manufactured) was higher due to the reduced number of processing steps needed to generate CMOS devices. That was the good news. The concern was system performance. While bipolar technology had set the standard for clock periods of 10 nsec for Supercomputer architectures such as the ETA Systems projection, CMOS was at least 5 times slower – in most cases 10 to 20 times slower for equivalent architectures. Based on this parameter alone, CMOS was not a candidate for Supercomputers in the 1989-1990 time frame (the time frame in which the ETA Systems Supercomputer would be in high volume production). </p>
font-family:Helvetica">A special plating cell was also introduced that
 
permitted uniform deep hole plating by forcing plating fluid into each of the
 
thousands of PTH.&nbsp; The process titled “Push-Pull<sup>TM</sup>” also
 
accelerated the plating manufacturing cycle by over an order of magnitude,
 
reducing cost.<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>The next steps for CDC (recall that at this time CDC still had a Supercomputer Division) were dramatic and at times emotional. First, the team had to discard the ECL design and terminate the effort with Motorola. This was very difficult since both companies depended on each other and secondly, all objectives of the ECL product were being met within the specifications established. CDC (team which later became ETA Systems) provided Motorola with all of the design details to date. Considerable effort was made to insure that the program was successful at Motorola. </p>
font-family:Helvetica">A small plating cell was incorporated into the prototype
 
facility at CDC and a controlled set of experiments conducted.&nbsp;
 
Experiments were thorough and challenging since no one in the industry could
 
approach the lofty objectives of the ETA Systems Supercomputer CPU board nor
 
the lofty claims of the inventor. The results were simply outstanding.&nbsp;
 
From the results and a commitment to fabricate a larger manufacturing line of
 
plating insert cells, the 45 layer 15” x 24” CPU board became a realistic
 
finalized goal of ETA Systems.&nbsp;Anyone told of this goal openly scoffed at
 
this as too risky and unrealistic.<span style="mso-spacerun: yes">&nbsp;
 
</span>This included some in the company as well.<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>A sidelight to this discussion – Motorola completed this product as an industry product. Cray Research Inc. (the key competition and leader of the Supercomputer market) engaged with Motorola to successfully complete this complex IC development for a product announced in the late 1980’s. The product (Cray C-90), developed under the leadership of Les Davis, Steve Nelson and other notable scientists (a key circuit designer was Mark Birrittella), became another very successful supercomputer product developed and manufactured by Cray Research Inc. </p>
font-family:Helvetica">Later, when manufacturing of the systems was viable, a
 
production capacity was developed for manufacturing.&nbsp; It is noted that
 
hundreds of these boards were fabricated from a period of 1987 through early
 
1989.&nbsp; The yield of final boards was nearly perfect – only one finished
 
board was scrapped. <o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>Next, a full effort evaluation of all technology candidates occurred. CMOS futures were explored in depth. GaAs technology was also evaluated. Alternative ECL (bipolar) candidates were also considered. CMOS was viewed as the technology of the future but the future was beyond the time frame necessary for product introduction. </p>
font-family:Helvetica">To this day (2009) few realize what a monumental
 
accomplishment this was and still is. This is a tribute to LeRoy Beckman, Peter
 
Pellegrino, the manufacturing facility at ETA Systems (now a banking building
 
in St. Paul) and those who trusted that the lofty objectives could be realized.<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
==== Key events that led to the decision to use CMOS technology. ====
font-family:Helvetica">To accommodate routing and designing for minimum
 
distance between IC chips, CAD tools were developed and the first use of diagonal
 
routed layers were introduced.&nbsp; Prior to this only x–y layers were
 
permitted with manual and/or auto tools (CAD).&nbsp; This enhancement permitted
 
timing constraints to be realized between chips.<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>Moore’s law (formulated by the great innovator and co-founder of Intel, Gordon Moore) stated that IC technology (CMOS) would double in performance and density every 18 months to two years. The actual Moore’s law may have been stated somewhat differently but this captured all the project cared about. To achieve this predicted growth, several things had to occur: </p>
font-family:Helvetica">The final board had the following noteworthy
 
characteristics:<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
*The die size would increase (more gates per manufactured chip).  
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
*Features on the chip (metal widths and spaces to interconnect devices and actual device parameters) would be reduced every 16 months to 2 years. Reducing feature sizes has two positive results for the goals of ETA Systems: increased performance and more gates per die.
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Board
+
*The technology would gain broad industry popularity – this would mean that capital equipment would keep pace with the “law”, applications would increase thus increasing volume, thus lowering cost and increasing performance and more applications and industries would drive CMOS technology – the Supercomputer industry could not drive such a large industry.
size: 15 inches by 22 inches by 0.26 inches<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
==== Key industry activities also emerged at this time: ====
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">Pth
 
hole ratio ≈ 20:1 – plating time – less than 20 minutes<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
*CDC validated the operational performance gains of operating CMOS technology in a cryogenic environment. Several ring counter configurations built with the 5,000 gate chip discussed earlier were dipped in a liquid Nitrogen thermos jug. We expected to witness the shattering of the silicon and the detachment of the solder joints attached to the oscilloscope, only to find that the frequency of the ring oscillators doubled and the system operated for weeks until we turned off the experiment. Analysis applied to the silicon design validated the research done previously by others.
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
*Key US Government agencies began a technology acceleration program based on CMOS technology – the Very High Speed Integrated Circuits (VHSIC) program under direction of the Army, Navy and Air Force certainly captured our attention.  
</span></span><span style="font-size:13.0pt;font-family:Helvetica">45
+
*Honeywell, one of the participants in the VHSIC program, held a technology luncheon IEEE symposium in which they presented an 11,000-gate CMOS development effort. Attendees from CDC (especially the key designer, Randy Bach) were impressed with the efforts. The chip was certainly larger than any that had been developed to date and the performance was accelerated beyond what was predicted by the conventional IC industry for the 1988 time frame (the introduction date set for the ETA Systems Supercomputer – then the next generation CDC Supercomputer). Honeywell was a recipient of one of the VHSIC contracts.
total layers per CPU panel<o:p></o:p></span>
+
*Logicians and architects back at CDC - led by Neil Lincoln (chief architect), Ray Kort, Maurice Hudson, Dave Hill and others - determined that a minimum gate density of 15,000 gates per die would allow them to achieve a key objective: having a worst case register-to-register clock path reside within a single chip. Some additional explanation is required here. There were technical reasons that the logicians wanted more, beyond the knee-jerk reaction that asking for 50% more than offered was a standard mode of operation for these guys. Each architecture configuration has a method of achieving its goals of applying computational instructions to problems. The number of gates that are connected in serial fashion between the input and output registers (and this is truly simplifying the problem) determines the clock period that is allowed. For the ETA Systems Supercomputer, therefore, it was determined that a functional unit clock path could reside within the boundary of the chip if the chip could provide 15,000 gates of logic to the designer.
 +
*Research into technology experiments uncovered significant performance features of CMOS technology. First of all, the technology was functional across a wide range of voltages and temperatures, but performance varied significantly across that range. The higher the operating voltage (within semiconductor constraints, of course), the higher the resulting performance. Unfortunately the power consumption, although significantly lower than that of any alternative technology, increased as the square of the operating voltage. The lower the operating temperature of CMOS, the higher the performance as well. This factor was studied by others and carefully documented from 400 degrees Kelvin (100 degrees above room temperature) to 77 degrees Kelvin. (77 degrees Kelvin is the boiling point of liquid Nitrogen.)
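The voltage trade-off described in the last bullet can be made concrete with a small sketch. It assumes only the stated relationship (power rising as the square of operating voltage); the function name and the example voltages are illustrative and do not come from the ETA design.

```python
def relative_power(v_new: float, v_ref: float) -> float:
    """Ratio of power at v_new to power at v_ref, assuming P is proportional to V**2.

    Everything else (clock rate, load, temperature) is assumed unchanged.
    """
    if v_ref <= 0 or v_new <= 0:
        raise ValueError("voltages must be positive")
    return (v_new / v_ref) ** 2


# Illustrative values only: doubling the supply voltage quadruples the power.
print(relative_power(5.0, 2.5))
```

Under this assumption any speed gained by raising the voltage is paid for quadratically in power, which is one reason lowering the temperature was the more attractive lever.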
  
<span style="font-size:13.0pt;
+
==== Summary of what was learned with this evaluation  ====
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
 
</span></span><span style="font-size:13.0pt;font-family:Helvetica">150
 
IC chip locations (fewer were used in final design)<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
*IC chips currently (four years before the need for an ETA Systems product) had a capacity of 11,000 gates.
font-family:Symbol">¨<span style="font:7.0pt &quot;Times New Roman"">&nbsp;&nbsp;&nbsp;
+
*These gates, when operated at liquid Nitrogen temperatures, would perform at least two times faster than at room temperature – not yet validated at CDC.
</span></span><span style="font-size:13.0pt;font-family:Helvetica">More
+
*15,000 useable gates were required per chip to meet logic designer chip boundary requirements.
*If Moore’s law was applied to these parameters, within the time frame required, it was possible to achieve both the gates-per-chip density and the performance goals (if the system operated in a liquid Nitrogen environment).
than 30,000 board plated thru holes (pth) were used for interconnect<o:p></o:p></span>
+
*There were at least two IC Suppliers (those having contracts with the US government) that were pursuing CMOS as a performance and high gate/chip density technology (the other known corporation was TRW).  
 +
*Computer Aided Design (CAD) tools were, during the 80’s, in their infancy compared to today’s capabilities. To design, place cells within the matrix of gates provided on the IC chip, and route the interconnections of these cells accurately to the logic or Boolean design required by the logicians, and within clock period constraints, was a challenge. This challenge applied to board layout designs as well. Control Data Corporation (CDC) recognized the challenges and established a small but efficient and dedicated organization to address them. The industry had established a metric that using CAD tools for gate or cell arrays required an additional 20% to 30% of gates. This meant that if the ETA Supercomputer required at least 15,000 useable gates to accomplish the necessary designs based on its architecture, an 18,000 to ≈20,000-gate capacity was required. The technology organization set as its objective a design of 20,000 gates plus the circuitry necessary to self-test each gate or cell array. Compared to the gate array in development at Honeywell, this was nearly 2 times the capacity (11,000 total gates vs. 20,000 total gates plus circuitry for self-test).
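The gate budget in the last bullet is simple arithmetic, sketched below using only figures from the text (15,000 useable gates, 20% to 30% CAD overhead, the 20,000-gate target and Honeywell's 11,000-gate array). The function and constant names are illustrative.

```python
USEABLE_GATES = 15_000      # gates the logic designers needed per chip
TARGET_GATES = 20_000       # capacity the technology organization aimed for
HONEYWELL_GATES = 11_000    # array Honeywell had in development


def raw_capacity(useable: int, overhead_percent: int) -> int:
    """Total gates needed when CAD place-and-route costs extra gates.

    Integer arithmetic keeps the result exact.
    """
    return useable * (100 + overhead_percent) // 100


low = raw_capacity(USEABLE_GATES, 20)    # lower bound of the industry metric
high = raw_capacity(USEABLE_GATES, 30)   # upper bound of the industry metric
growth = round(TARGET_GATES / HONEYWELL_GATES, 2)

print(low, high, growth)
```

The 20,000-gate target sits just above the upper bound of that range, which matches the "nearly 2 times" comparison with the Honeywell array.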
  
<span style="font-size:13.0pt;
+
<p>The task was to convince Honeywell to project the next generation size and layout rules and to accept an R&amp;D effort that would allow CDC / ETA Systems to achieve its objectives. Honeywell, an innovative organization, took on the task after considerable discussion, with these key requirements: </p>
font-family:Helvetica">In 2009 this board development and manufacturing stands
 
out as one of the major technology developments by ETA Systems<o:p></o:p></span>
 
  
<span style="font-size:14.0pt;
+
*ETA Systems (we were now ETA Systems by the time these discussions reached negotiations) would accept costs based on wafers processed, not functional chips. Honeywell would provide the processing data necessary to show that wafers were processed within process parameter specifications.
font-family:Helvetica">Packaging <o:p></o:p></span>
+
*ETA Systems provide test equipment for wafer testing and test parameters for chip acceptance prior to packaging.
 +
*Both companies would share facilities and key resources and work as a single team – as “open Kimono” a relationship as one could ever imagine during this dynamic period of complex process developments within the IC Industry. David Frankel was assigned as the ETA Systems interface and energetically took on the challenging task.
 +
*Self-test circuitry was designed into the basic cell array periphery. The area consumed by this custom set of pseudo-random generated logic and registers was less than 15% of the total chip area. (David Resnick, resident do-it-all, reduced to practice concepts explored by ex-CDC scientist Nick Van Brunt, who left the company a year before the formation of ETA Systems.) This was one of many extraordinary contributions David made to ETA Systems. In addition to providing self-test capability to accept or reject the circuitry – both functionality and performance sorting – the circuitry included in each 20,000 gate array could test the interconnect between circuits on the final PC Board as well as circuit to I/O connections.
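The text describes the self-test logic only as pseudo-random logic and registers, initialized with operands supplied per chip type. The sketch below shows one common way such a scheme works and is not the ETA circuit: a linear feedback shift register (LFSR) generates patterns from a seed, the responses are compacted into a signature, and the signature is compared with a known-good value. The polynomial, the rotate-and-XOR compactor, and the `good_chip` and `faulty_chip` functions are all hypothetical stand-ins.

```python
def lfsr_patterns(seed: int, count: int):
    """Yield `count` 16-bit pseudo-random patterns from a Fibonacci LFSR.

    Assumed polynomial: x^16 + x^14 + x^13 + x^11 + 1 (a textbook choice).
    """
    if seed & 0xFFFF == 0:
        raise ValueError("seed must be non-zero")
    state = seed & 0xFFFF
    for _ in range(count):
        yield state
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)


def signature(responses) -> int:
    """Compact a response stream into 16 bits (rotate left, then XOR).

    A simplified stand-in for a hardware signature register.
    """
    sig = 0
    for r in responses:
        sig = (((sig << 1) | (sig >> 15)) & 0xFFFF) ^ (r & 0xFFFF)
    return sig


def good_chip(pattern: int) -> int:
    """Hypothetical fault-free logic under test."""
    return (pattern ^ (pattern >> 3)) & 0xFFFF


def faulty_chip(pattern: int) -> int:
    """Same logic with output bit 4 stuck at 0."""
    return good_chip(pattern) & ~(1 << 4) & 0xFFFF


def self_test(chip, seed: int, golden: int, count: int = 256) -> bool:
    """Pass when the chip's signature matches the known-good signature."""
    return signature(chip(p) for p in lfsr_patterns(seed, count)) == golden


SEED = 0xACE1  # plays the role of the supplied initialization operand
GOLDEN = signature(good_chip(p) for p in lfsr_patterns(SEED, 256))
print(self_test(good_chip, SEED, GOLDEN), self_test(faulty_chip, SEED, GOLDEN))
```

Because the tester only needs a seed and an expected signature per chip type, the test equipment can stay small. That is consistent with the inexpensive testers the text says ETA Systems supplied to Honeywell.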
  
<span style="font-size:13.0pt;
+
<p>When the logic design team first heard of this area “waste” on test circuits that could be used for logic design, they lobbied for it to be removed in favor of more logic gates for function designs. Fortunately this request was not honored. IC validation both at the supplier in wafer form and at ETA Systems in packaged chip configuration, coupled with the use of the same circuitry in manufacturing checkout to detect board opens and shorts between circuits assembled in both room temperature and cryogenic environments, proved to be well worth this “waste” of circuitry area. Small, relatively inexpensive testing systems were designed by ETA Systems and provided to the supplier. The operands for initialization of the pseudo-random logic were also supplied for each design (chip type). </p>
font-family:Helvetica">The key challenge for packaging the ETA Supercomputer
 
processing unit was the cryogenic chamber for the processor.&nbsp; The Cryostat
 
to contain the processor (two processor units) had a conventional (and quite
 
heavy) circular cryostat containing a vacuum chamber between the outside
 
environment and the inner environment.&nbsp; Input of liquid Nitrogen was at the
 
bottom of the chamber and the escaping of the gaseous Nitrogen was provided for
 
near the top of the unit.&nbsp; The piping containing the Nitrogen to and from
 
the regeneration unit was also temperature protected with vacuum lines.&nbsp;
 
Dan Sullivan and his design team led this admirable effort.<span style="mso-spacerun: yes">&nbsp; </span>(Unfortunately, Dan passed on a few years ago). It was felt that a less heavy and equally efficient chamber (proposed by Carl Breske – a very innovative scientist) could be designed if time permitted but the selection of the vacuum based design was conservative to accommodate schedule and also to familiarize the team with the challenges of Cryogenics.&nbsp; The compressor unit was a conventional Liquid Nitrogen system (very large and bulky) used for generation of Liquid Nitrogen for the commercial market.&nbsp; The system was not pretty.<span style="mso-spacerun: yes">&nbsp;
 
</span>Marketing, led by Bobby Robertson (also now deceased), prohibited the engineers to show this to perspective customers fearful that this would scare them away.<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>Chip types (array design options) were carefully managed so as not to proliferate the chip types in the system. This was a new constraint placed on logic designers and was dealt with most professionally and responsibly by all participants once understood. The resultant chip total for the CPU (processing unit) was fewer than 150, while the chip types, including clock chips and all logic design chips, numbered fewer than 20, as best recalled. </p>
font-family:Helvetica">Thought was given to actually eliminate the need to
 
regenerate the system in a closed system but rather purchase Liquid Nitrogen –
 
readily available in tanks - and have them periodically refilled as is done in
 
the IC and other industries using Liquid Nitrogen.&nbsp; This was discarded for
 
the initial design since several customer sites did not easily accommodate the
 
external access to Liquid Nitrogen tanks.&nbsp; It was to be an option for
 
future systems and those customers that easily accommodated such an option.<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>During the development cycle of the ETA Systems Supercomputer, Honeywell moved the manufacturing capability from a local Minneapolis facility to a state-of-the-art manufacturing facility in Colorado Springs, CO. The transition was very transparent to ETA Systems (with the exception of the traveling budget, of course). To accomplish this, team members from both companies acted as one in all decisions addressing scheduling and timing of needs for various chips, testing, packaging, etc. The open book relationship was very beneficial to both companies. On one milestone occasion – where Honeywell successfully completed an initial order – Dave Frankel and I visited Honeywell, some 30 miles from the ETA Systems facility, and served cake and coffee to all designers and operators. It was below zero when this milestone was reached and no one cared. </p>
font-family:Helvetica">The final design was then a closed recycled Liquid Nitrogen
 
system with the compressor located remote, much like Freon compressors, which
 
many Supercomputer customers were already accommodating.<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>One design provision incorporated into the chip was to allow next generation critical processing parameters to be added to the existing design (present chip layout). Although this would not take full advantage of a new process (not all parameters were considered), key performance enhancements could be and were added to the present design. A key feature was gate length; this was added transparently to the physical chip and offered appreciable performance enhancements to the design. </p>
font-family:Helvetica">The design challenge was at the surface (looked much
 
like a two slice toaster) where the processing boards were inserted. This seal
 
had to accommodate the connecting transmission to the external and room
 
temperature memory and I/O subsystems. A printed circuit board was designed to
 
connect the processor to the outside world.&nbsp; Heaters were applied to the
 
surface to prevent icing at the cryostat surface.<span style="mso-spacerun:
 
yes">&nbsp; </span>The separation, only a few short inches had memory operating at 300 degrees Kelvin and the CPU operating at 77 Degrees Kelvin.<span style="mso-spacerun: yes">&nbsp; </span>There were a few “frosty” events in this development cycle!<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
==== Chip design summary: ====
font-family:Helvetica">&nbsp;<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>The decision to utilize CMOS technology for the ETA Systems Supercomputer in the 1985 – 1988 time frame (prematurely by all industry metrics) resulted in the following additional “technology kit” decisions: </p>
font-family:Helvetica">The third challenge was to provide reliable soldering of
 
the circuitry to the board amidst the severe temperature difference that the
 
solder joints would be subjected to (greater than 250 degrees) during the cool
 
down and warm up cycles.&nbsp; Studies at the National Bureau of Standards provided
 
input that the temperature cycle should be profiled in a precise sequence as
 
the board was cooled and heated.&nbsp; In addition, care as not to remove the
 
board and to care for condensation that would occur if the board had not been
 
heated to room temperature was considered.&nbsp; The result was a 20-minute
 
cycle to remove or insert the board was designed with a specifically prescribed
 
sequence of temperature lowering and rising for both cycles.<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
*Addition of chip self-test. This feature established functionality at wafer test, and functionality and performance sorting at ETA Systems
font-family:Helvetica">At the time of the unfortunate termination of ETA
 
Systems, a more refined, lower cost and lower weight design as stated earlier
 
was on the drawing boards.&nbsp; Although the cryostat and associated cooling
 
was costly, an analysis clearly showed that for the performance resulting from
 
the design, the cost was less than any Bipolar IC system designed at the
 
time.<span style="mso-spacerun: yes">&nbsp; </span>Once the connector was finalized and the process and assembly designed, the system operated flawlessly.<span style="mso-spacerun: yes">&nbsp; </span><o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
*Computer Layout tools that validated logic prior to chip release for fabrication
font-family:Helvetica">Checkout on the manufacturing floor of the system
+
*Requirement to operate the chip at 77 degrees Kelvin or in liquid Nitrogen
utilized the “Self-Test” capability exhaustively so specific interconnect flaws
+
*Packaging, interconnect &amp; assembly decisions based on liquid Nitrogen operation challenges
*Remote testing of the CPU because of liquid Nitrogen operation challenges
were clearly understood prior to removing a CPU from the cryostat, thus
+
*Logic design partitioning challenges to design within 15,000-gate per chip boundaries and a minimum of IC chip types
reducing checkout time considerably as well.<span style="mso-spacerun:
 
yes">&nbsp; </span>These designs were well done, significant and challenged laws of thermodynamics and physics to new limits.<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
== Printed Circuit Board Design Selection: ==
font-family:Helvetica">'''AIR COOLED SYSTEM '''</span><span style="font-size:
 
13.0pt;font-family:Helvetica"><o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>In the 1980s, the time frame of the ETA Systems Supercomputer development, printed circuit boards had maximum dimensions of approximately a square foot and fewer than 20 total layers. (Layers provide power and ground stability, interconnect capability for the circuits attached to the board, and inputs and outputs to and from the board.) If these layers are allocated properly, approximately 50% are used for interconnect and the remainder for power and ground. Positioning of power and ground layers also serves to provide interconnect layers that have transmission line capabilities to ensure signal integrity throughout the board. During this period, a state-of-the-art printed circuit board was approximately one square foot of active circuitry and, as stated earlier, 20 layers or fewer, usually restricted to a total thickness of 0.063 inches. </p>
font-family:Helvetica">As stated earlier in the document, an Air-cooled
 
processor would operate considerably slower (2x slower) when operated in normal
 
or “room temperature” environments.&nbsp; ETA Systems by sorting the devices
 
for performance at incoming inspection, allowed for a three times performance
 
differential to be realized.&nbsp; Only the highest performance devices were
 
reserved for the Cryogenic cooled system.&nbsp; The remaining parts were then
 
re-sorted into two categories for room temperature; the differential would be a
 
4-nanosecond clock period between the two room temperature systems and 17
 
nanoseconds (24 to 7) for the total system product set.<span style="mso-spacerun: yes">&nbsp; </span>The sorting and using the entire distribution of Integrated Circuits had a significant cost reduction factor for the entire product line.<span style="mso-spacerun: yes">&nbsp; </span>Bipolar devices, by contrast had lower functional yield to begin with coupled with additional loss of product due to performance yield.<span style="mso-spacerun:
 
yes">&nbsp; </span>This was a definite cost reduction asset to the ETA System.<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>It was determined that a maximum of 150 chips would be required to design the ETA Systems Supercomputer CPU. Packaging of the IC and interconnecting the chip to a PC board with minimum spacing between chips (some spacing was required to allow interconnects to all of the necessary layers) resulted in a 1.2 x 1.2 inch “footprint” per chip. Doing the simple math results in a pc board of a minimum of roughly 220 sq. inches. The number of total layers required to interconnect the 150 chips and the necessary Input and Output at the board periphery was determined to be 45. Looking at design parameters of the board layers in more depth, and ensuring transmission line features for signal integrity, defined the board thickness at slightly greater than 0.25 inches. This thickness was approximately three times greater than high-end printed circuit boards produced in this time frame. With a board having an area greater than 1.5 times the size of what was able to be produced, a thickness of 300% of what was produced and a number of layers 2.5 times what was produced in this time frame, it was clear that the printed circuit board industry was not ready for the ETA Systems design! The design had other limitations. A key factor when designing pc boards is to ensure proper connecting of the layers, i.e., connecting the chip pins to the board and the proper layer of interconnect in the board and back to the proper receiving chip. Drilling holes in the layers and plating the wall of the holes with copper for conduction make these connections. These are called plated thru holes or PTH. A key parameter to ensure that plating occurs in these holes is the ratio of hole depth to hole diameter. The industry limit in this period (not much better today) was 6:1, i.e., the thickness of the board must be no more than 6 times the diameter of the hole. This ratio would dominate the size of the board. If this ratio were used to design the board, the board size would increase in area by greater than 9 times.
Talk about piling on! Since it was deemed not feasible, issues like cost and time to fabricate the board were not even addressed. </p>
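The board arithmetic above can be checked with a short sketch. The inputs (150 chip sites, a 1.2 inch footprint, a 0.25 inch thick board, the 6:1 industry ratio) come from this paragraph, and the 20:1 ratio is the figure quoted elsewhere in this account for the final board. The function names are illustrative, and the "greater than 9 times" area estimate is not derived here because the text does not give its basis.

```python
CHIP_SITES = 150
FOOTPRINT_SIDE_IN = 1.2      # inches per side, including spacing
BOARD_THICKNESS_IN = 0.25    # inches


def min_board_area(sites: int, side_in: float) -> float:
    """Minimum board area in square inches, ignoring periphery I/O."""
    return sites * side_in * side_in


def min_hole_diameter(thickness_in: float, aspect_ratio: float) -> float:
    """Smallest platable hole for a given depth-to-diameter ratio."""
    return thickness_in / aspect_ratio


area = round(min_board_area(CHIP_SITES, FOOTPRINT_SIDE_IN), 1)
hole_6_to_1 = round(min_hole_diameter(BOARD_THICKNESS_IN, 6), 4)
hole_20_to_1 = round(min_hole_diameter(BOARD_THICKNESS_IN, 20), 4)

print(area, hole_6_to_1, hole_20_to_1)
```

At 6:1 every hole in a quarter-inch board must be more than three times wider than at 20:1, which suggests why the conventional ratio "would dominate the size of the board".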
font-family:Helvetica">To cool the CPU air was forced on to the processor chips
 
by using a plenum that was designed to cover each chip.&nbsp; Holes were
 
designed in the plenum such that equal operating temperature would result for
 
each operating chip.&nbsp; Since the power consumption variation significant
 
for several part types, designing the appropriate number of holes above each
 
chip location could provide custom cooling.&nbsp; The plenum could then be
 
molded for mass production of the processing unit.&nbsp; Large volume cooling
 
fans were designed for the system as well.&nbsp; Cost was the focus for the
 
air-cooled systems since the price tag was below $1M. Recall, that the
 
air-cooled design was identical in parts at the CPU and storage level.<span style="mso-spacerun: yes">&nbsp; </span>A single development was achieved for a wide range of products with one design team.<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>Nestled into the design laboratory of Control Data Corporation was a small but very innovative printed circuit board prototype facility. The leader of this group, LeRoy Beckman, never said “no” to challenges. He just bit his pipe a little harder and tried not to snicker out loud. LeRoy kept his eyes and ears out for innovative alternatives to conventional board fabrication techniques and had displayed innovation (evolutionary in nature) in previous generations. Embedded termination resistors in layers was one invention he brought to CDC when resistor termination took up too much board area; finer features than the industry was producing was another, and higher plated through hole (pth) ratios than the industry a third. </p>
<p>New technologies in the printed circuit board industry were few and far between. The industry was set in its ways of subtractive etching of circuit layers (removing unwanted copper from a pre-copper clad layer), conventional wet etch processes and relatively simple assembly, i.e., lamination of layers with pressure. One inventor, Mr. Peter P. Pellegrino, arrived on the scene to discuss innovative, revolutionary and proven pc board processing. At first the claims appeared to be too good to be true: board size relatively independent, aspect ratios exceeding 20:1 for PTH, and an additive process that permitted finer lines to be fabricated on individual layers. The layers were also embedded into the laminate, giving the opportunity for higher yield with reduced features. An additional benefit of additive plating is a reduction in waste and water usage. </p>
<p>A special plating cell was also introduced that permitted uniform deep hole plating by forcing plating fluid into each of the thousands of PTH. The process, titled “Push-Pull™”, also accelerated the plating manufacturing cycle by over an order of magnitude, reducing cost. </p>
<p>A small plating cell was incorporated into the prototype facility at CDC and a controlled set of experiments conducted. Experiments were thorough and challenging, since no one in the industry could approach the lofty objectives of the ETA Systems Supercomputer CPU board or the lofty claims of the inventor. The results were simply outstanding. From the results and a commitment to fabricate a larger manufacturing line of plating insert cells, the 45 layer 15” x 24” CPU board became a realistic finalized goal of ETA Systems. </p>
<p>Anyone told of this goal openly scoffed at it as too risky and unrealistic. This included some in the company as well. </p>
<p>Later, when manufacturing of the systems was viable, a production capacity was developed for manufacturing. It is noted that hundreds of these boards were fabricated from 1987 through early 1989. The yield of final boards was nearly perfect – only one finished board was scrapped. </p>
font-family:Helvetica">Storage<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>To this day (2009) few realize what a monumental accomplishment this was and still is. This is a tribute to LeRoy Beckman, Peter Pellegrino, the manufacturing facility at ETA Systems (now a banking building in St. Paul) and those who trusted that the lofty objectives could be realized. </p>
font-family:Helvetica">Stacks using three-dimensional characteristics were
 
designed under the leadership of Brent Doyle for the memory – both static (high
 
performance) and dynamic (high density and lower performance) memories of the
 
ETA Systems Supercomputer.&nbsp; These unique designs provided for highest
 
density and optimum performance of the standard memory devices used.&nbsp;
 
Ability to upgrade to future generations of memory (more storage capacity
 
Integrated Circuits) was built into the design as well.&nbsp; The design worked
 
well and stacking became commonplace in the computer industry for future
 
designs – eventually eliminating the chip package entirely.<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>To accommodate routing and designing for minimum distance between IC chips, CAD tools were developed and diagonally routed layers were introduced for the first time. Prior to this only x–y layers were permitted with manual and/or auto tools (CAD). This enhancement permitted timing constraints to be realized between chips. </p>
font-family:Helvetica">&nbsp;<o:p></o:p></span>
 
<span style="font-size:13.0pt;font-family:Helvetica;mso-ansi-language:EN-US;
 
mso-fareast-language:EN-US">
 
  
</span>
+
<p>The final board had the following noteworthy characteristics: </p>
<span style="font-size:13.0pt;
 
font-family:Helvetica">The Air-cooled system was defined as a Piper.<span style="mso-spacerun: yes">&nbsp; </span>An Illustration of “Piper” is shown below.<o:p></o:p></span>
 
  
<!--[if gte vml 1]><v:shapetype
+
*Board size: 15 inches by 22 inches by 0.26 inches
id="_x0000_t75" coordsize="21600,21600" o:spt="75" o:preferrelative="t"
+
*PTH aspect ratio ≈ 20:1; plating time less than 20 minutes
path="m@4@5l@4@11@9@11@9@5xe" filled="f" stroked="f">
+
*45 total layers per CPU panel
<v:stroke joinstyle="miter"/>
+
*150 IC chip locations (fewer were used in final design)
<v:formulas>
+
*More than 30,000 board plated thru holes (pth) were used for interconnect
  <v:f eqn="if lineDrawn pixelLineWidth 0"/>
 
  <v:f eqn="sum @0 1 0"/>
 
  <v:f eqn="sum 0 0 @1"/>
 
  <v:f eqn="prod @2 1 2"/>
 
  <v:f eqn="prod @3 21600 pixelWidth"/>
 
  <v:f eqn="prod @3 21600 pixelHeight"/>
 
  <v:f eqn="sum @0 0 1"/>
 
  <v:f eqn="prod @6 1 2"/>
 
  <v:f eqn="prod @7 21600 pixelWidth"/>
 
  <v:f eqn="sum @8 21600 0"/>
 
  <v:f eqn="prod @7 21600 pixelHeight"/>
 
  <v:f eqn="sum @10 21600 0"/>
 
</v:formulas>
 
<v:path o:extrusionok="f" gradientshapeok="t" o:connecttype="rect"/>
 
<o:lock v:ext="edit" aspectratio="t"/>
 
</v:shapetype><v:shape id="_x0000_i1025" type="#_x0000_t75" style='width:399pt;
 
height:391pt'>
 
<v:imagedata src="file://localhost/Users/tonyvacca/Library/Caches/TemporaryItems/msoclip1/01/clip_image001.gif"
 
  o:althref="file://localhost/Users/tonyvacca/Library/Caches/TemporaryItems/msoclip1/01/clip_image002.pct"
 
  o:title=""/>
 
</v:shape><![endif]-->[[Image:]]<span style="font-size:13.0pt;font-family:
 
Helvetica">
 
  
<o:p></o:p></span>
+
<p>In 2009 this board development and manufacturing effort stands out as one of the major technology developments by ETA Systems. </p>
  
<span style="font-size:13.0pt;
+
== Packaging  ==
font-family:Helvetica">&nbsp;<o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>The key challenge for packaging the ETA Supercomputer processing unit was the cryogenic chamber for the processor. The Cryostat to contain the processor (two processor units) was a conventional (and quite heavy) circular design containing a vacuum chamber between the outside environment and the inner environment. Liquid Nitrogen entered at the bottom of the chamber and gaseous Nitrogen escaped near the top of the unit. The piping carrying the Nitrogen to and from the regeneration unit was also temperature protected with vacuum lines. Dan Sullivan and his design team led this admirable effort. (Unfortunately, Dan passed on a few years ago.) It was felt that a less heavy and equally efficient chamber (proposed by Carl Breske – a very innovative scientist) could be designed if time permitted, but the vacuum based design was selected as the conservative choice to accommodate schedule and also to familiarize the team with the challenges of Cryogenics. The compressor unit was a conventional Liquid Nitrogen system (very large and bulky) used for generation of Liquid Nitrogen for the commercial market. The system was not pretty. Marketing, led by Bobby Robertson (also now deceased), prohibited the engineers from showing this to prospective customers, fearful that it would scare them away. </p>
font-family:Helvetica">&nbsp;<o:p></o:p></span>
 
  
<span style="font-size:14.0pt;
+
<p>Thought was given to eliminating the closed regeneration system altogether and instead purchasing Liquid Nitrogen – readily available in tanks - and having the tanks periodically refilled, as is done in the IC and other industries using Liquid Nitrogen. This was discarded for the initial design since several customer sites did not easily accommodate external access to Liquid Nitrogen tanks. It was to be an option for future systems and for those customers that could easily accommodate it. </p>
<p>The final design was then a closed recycled Liquid Nitrogen system with the compressor located remotely, much like Freon compressors, which many Supercomputer customers were already accommodating. </p>
<p>The design challenge was at the surface (which looked much like a two slice toaster) where the processing boards were inserted. This seal had to accommodate the connecting transmission to the external, room temperature memory and I/O subsystems. A printed circuit board was designed to connect the processor to the outside world. Heaters were applied to the surface to prevent icing at the cryostat surface. Across a separation of only a few short inches, memory operated at 300 degrees Kelvin and the CPU at 77 degrees Kelvin. There were a few “frosty” events in this development cycle! </p>
<p>The third challenge was to provide reliable soldering of the circuitry to the board amidst the severe temperature difference that the solder joints would be subjected to (greater than 250 degrees) during the cool down and warm up cycles. Studies at the National Bureau of Standards provided input that the temperature cycle should be profiled in a precise sequence as the board was cooled and heated. In addition, care was taken not to remove the board before it had been heated to room temperature, since condensation would otherwise occur. The result was a 20-minute cycle to remove or insert the board, with a specifically prescribed sequence of temperature lowering and rising for both cycles. </p>
<p>At the time of the unfortunate termination of ETA Systems, a more refined, lower cost and lower weight design, as stated earlier, was on the drawing boards. Although the cryostat and associated cooling were costly, an analysis clearly showed that for the performance resulting from the design, the cost was less than any Bipolar IC system designed at the time. Once the connector was finalized and the process and assembly designed, the system operated flawlessly. </p>
<p>Checkout of the system on the manufacturing floor utilized the “Self-Test” capability exhaustively, so specific interconnect flaws were clearly understood prior to removing a CPU from the cryostat, thus reducing checkout time considerably as well. These designs were well done, significant and challenged laws of thermodynamics and physics to new limits. </p>
font-family:Helvetica">Summary <o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
== Air Cooled System  ==
font-family:Helvetica">The design of the ETA Systems Supercomputer hardware had
 
many unique features.&nbsp; The brief pages highlight some of them.&nbsp; <o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>As stated earlier in the document, an Air-cooled processor would operate considerably slower (2x slower) in normal or “room temperature” environments. ETA Systems, by sorting the devices for performance at incoming inspection, allowed a three times performance differential to be realized. Only the highest performance devices were reserved for the Cryogenic cooled system. The remaining parts were then re-sorted into two categories for room temperature; the differential would be a 4-nanosecond clock period between the two room temperature systems and 17 nanoseconds (24 to 7) for the total system product set. Sorting and using the entire distribution of Integrated Circuits provided a significant cost reduction for the entire product line. Bipolar devices, by contrast, had lower functional yield to begin with, coupled with additional loss of product due to performance yield. This was a definite cost reduction asset to the ETA Systems product line. </p>
font-family:Helvetica">It would be remiss not to briefly discuss the “team”
 
concept used to design the hardware.&nbsp; By having the CAD, Packaging,
 
memory, circuit and power expertise located in a close proximity and holding
 
concise project reviews at all levels at periodic and timely phases, all were
 
kept abreast of the progress and challenges of each other.&nbsp; This permitted
 
changes to be made to necessary designs to properly accommodate the challenges
 
and opportunities in a timely fashion.&nbsp; Hardware was demonstrated on or
 
near schedule despite the innovations required in each aspect of the
 
design.&nbsp; The team was truly a “team”.&nbsp; <o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
<p>To cool the CPU, air was forced onto the processor chips by using a plenum that was designed to cover each chip. Holes were designed in the plenum such that an equal operating temperature would result for each operating chip. Since the power consumption variation was significant for several part types, designing the appropriate number of holes above each chip location could provide custom cooling. The plenum could then be molded for mass production of the processing unit. Large volume cooling fans were designed for the system as well. Cost was the focus for the air-cooled systems since the price tag was below $1M. Recall that the air-cooled design was identical in parts at the CPU and storage level. A single development was achieved for a wide range of products with one design team. </p>
font-family:Helvetica">A missing link to the team was the logic design.&nbsp;
 
These folks were separate and actually on another floor of the ETA Systems
 
facility.&nbsp; It was strongly suggested and accepted for future designs, that
 
the logic team would be a part of this common organization. I had the
 
opportunity to lead one additional hardware development that included the logic
 
design team (at Cray Research, not ETA Systems) later.<span style="mso-spacerun: yes">&nbsp; </span>It was a smoother and more effective and thorough team.<span style="mso-spacerun: yes">&nbsp; </span>Like ETA Systems the communications were open and included both manufacturing and software participation (the later two were voluntary).<span style="mso-spacerun: yes">&nbsp; </span><o:p></o:p></span>
 
  
<span style="font-size:13.0pt;
+
== Storage  ==
font-family:Helvetica">Clearly, communications – effective communications at
 
all levels of the organization was key to this hardware design success.<o:p></o:p></span>
 
  
&nbsp;<o:p></o:p>
+
<p>Stacks using three-dimensional characteristics were designed under the leadership of Brent Doyle for the memory – both static (high performance) and dynamic (high density and lower performance) memories of the ETA Systems Supercomputer. These unique designs provided for highest density and optimum performance of the standard memory devices used. Ability to upgrade to future generations of memory (more storage capacity Integrated Circuits) was built into the design as well. The design worked well and stacking became commonplace in the computer industry for future designs – eventually eliminating the chip package entirely. </p>
<span style="font-size:12.0pt;font-family:&quot;Times New Roman";mso-ansi-language:
 
EN-US;mso-fareast-language:EN-US">
 
  
</span>  
+
<p>The Air-cooled system was defined as a Piper. An Illustration of “Piper” is shown below. </p>
<!--[if gte vml 1]><v:shape id="_x0000_s1026"
 
type="#_x0000_t75" style='position:absolute;left:0;text-align:left;
 
margin-left:0;margin-top:180.2pt;width:414pt;height:335.65pt;z-index:1'>
 
<v:imagedata src="file://localhost/Users/tonyvacca/Library/Caches/TemporaryItems/msoclip1/01/clip_image004.gif"
 
  o:althref="file://localhost/Users/tonyvacca/Library/Caches/TemporaryItems/msoclip1/01/clip_image005.pct"
 
  o:title=""/>
 
</v:shape><v:rect id="_x0000_s1027" style='position:absolute;left:0;
 
text-align:left;margin-left:65.7pt;margin-top:78.15pt;width:231.5pt;height:64.55pt;
 
z-index:2;mso-wrap-style:none' filled="f" fillcolor="#618ffd" stroked="f"
 
strokeweight="1pt">
 
<v:shadow color="#919191"/>
 
<v:textbox style='mso-fit-shape-to-text:t' inset="90487emu,3.5pt,90487emu,3.5pt"/>
 
</v:rect><![endif]--><span style="mso-ignore:vglayout">
 
  
 +
<p>[[Image:Piper processor.jpg|thumb|left]] </p>
  
{| cellpadding="0" cellspacing="0" align="left"
+
== Summary  ==
|-
 
| width="0" height="78" |
 
| width="66" |
 
| width="233" |
 
| width="115" |
 
|-
 
| height="67" |
 
|
 
| width="233" height="67" align="left" valign="top" style="vertical-align:top" | <span style="position:absolute;z-index:2">
 
 
 
{| cellpadding="0" cellspacing="0" width="100%"
 
|-
 
|
 
    <div v:shape="_x0000_s1027" style="padding:3.5pt 7.1249pt 3.5pt 7.1249pt" class="shape">
 
   
 
<span style="font-size:24.0pt;font-family:Helvetica;color:#00279F;text-shadow:
 
    auto">'''ETA 10 CPU<o:p></o:p>'''</span>
 
  
<span style="font-size:24.0pt;font-family:Helvetica;color:#00279F;text-shadow:
+
<p>The design of the ETA Systems Supercomputer hardware had many unique features. The brief pages highlight some of them. </p>
    auto">'''1988<span style="mso-spacerun: yes">&nbsp; </span>Product<o:p></o:p>'''</span>
 
</div>
 
|}
 
</span>&nbsp;
 
|-
 
| height="35" |
 
|-
 
| height="336" |
 
| colspan="3" align="left" valign="top" | [[Image:]]
 
|}
 
</span>&nbsp;<o:p></o:p>
 
<!--EndFragment--></span> ==
 
  
== <span class="Apple-style-span" style="font-family: Helvetica; font-size: 17px;">
+
<p>It would be remiss not to briefly discuss the “team” concept used to design the hardware. By having the CAD, Packaging, memory, circuit and power expertise located in a close proximity and holding concise project reviews at all levels at periodic and timely phases, all were kept abreast of the progress and challenges of each other. This permitted changes to be made to necessary designs to properly accommodate the challenges and opportunities in a timely fashion. Hardware was demonstrated on or near schedule despite the innovations required in each aspect of the design. The team was truly a “team”. </p>
</span> ==
 
  
== <span class="Apple-style-span" style="font-family: Helvetica; font-size: 17px;">
+
<p>A missing link to the team was the logic design. These folks were separate and actually on another floor of the ETA Systems facility. It was strongly suggested, and accepted for future designs, that the logic team would be a part of this common organization. I later had the opportunity to lead one additional hardware development that included the logic design team (at Cray Research, not ETA Systems). It was a smoother, more effective and more thorough team. As at ETA Systems, the communications were open and included both manufacturing and software participation (the latter two were voluntary). </p>
</span> ==
 
  
== <span style="font-size:13.0pt;font-family:Helvetica;mso-ansi-language:EN-US; mso-fareast-language:EN-US">
+
<p>Clearly, communications – effective communications at all levels of the organization was key to this hardware design success.</p>
</span>
 
  
<!--EndFragment--> ==
+
[[Category:Microprocessors|{{PAGENAME}}]]
 +
[[Category:Computing and electronics|{{PAGENAME}}]]

Revision as of 16:16, 22 July 2014

Contributed by: Tony Vacca

How it started

It was in the early 80's. Control Data (CDC) had just launched the CYBER - 205 with modest success and the team was now focused on the next generation machine, the 2XX as I recall. Speed, cost and meeting the schedule were all key objectives. Speed because Cray Research under the guidance of Seymour Cray was setting milestones for Supercomputers with the Cray 1 and then the Cray 2. Cost, since Supercomputers were extremely expensive. Schedules since the CYBER - 205 had established patience records as a machine that may never get out the door and this must not be repeated.

A conventional evolutionary approach for Integrated Circuit (IC) logic was initially selected. Motorola, with some prodding, agreed to launch an 8,000 gate equivalent ECL (emitter-coupled-logic - the circuitry of choice for high performance processing units) provided that Control Data do the actual circuit development. There were insufficient customers for Motorola to commit their resources to this lofty development. Motorola did, however, commit their advanced ECL processes to CDC and a joint team was formed with the two companies.

Logic designers at the CDC Advanced Design Laboratory were given preliminary design rules based on computer device models and estimates of gate-per-chip densities. There was a natural follow-up of grumbling by the logic design team, led by very experienced and innovative folks (Ray Kort, Maurice Hudson and Dave Hill, to name three), but circuit designers had learned to accept this, since logic designers always found the circuits too slow and the quantity of gates and pins (I/O ports) per die insufficient. There was a lot of cooperation too. Basic building blocks were defined by the logic designers: gate functionality, register functionality, etc. From this set of preliminary rules, function blocks were defined and the capacity per reasonably sized Printed Circuit (PC) board was determined. The initial design, using the CYBER - 205 based architecture, was launched.

In parallel with this effort, and in the same design group (circuit, packaging, PC board and the newly formed CAD group, which built tools for layout and design of chips and boards), the chief chip design engineer, Randy Bach, was assigned to develop an advanced CMOS chip for the Canadian Computer Development organization. At this time, the early 80's, CMOS was in its infancy, being used for memory devices, low performance peripherals and low performance microprocessors (5 to 10 MHz clock speeds). The design contained 5,000 gates plus appropriate input and output communication devices. Gate arrays for CMOS were also nearly non-existent, so Randy and his small team of two assistants developed a cell library and worked closely with the Canadian Development team to meet their objectives as well.

This effort was completely separate from the ECL based gate array to be used for the next generation Supercomputer. The product was developed for a low cost application.

It was customary for Neil Lincoln (chief architect), Dale Handy (manufacturing manager) and me to go off to lunch every 8 to 10 days to discuss status at either Arthur Treacher's Fish & Chips or Zantigo's (high class - NOT) fast food restaurants. As a side note, both of these fast food places disappeared during ETA Systems' brief duration. Zantigo's has returned (I think because they know it is safe now that the three of us cannot visit together any longer - Neil unfortunately passed on a few years ago).

At one of these meetings, Neil had "news" for me. Simply stated, the gate array in active co-development with Motorola had unacceptable goals. The chip had too few I/O pins, consumed too much power and had insufficient gates. In addition, he had completed a cost model which indicated an unacceptable cost figure for the CPU. He had also determined that the CPU (some 3 million gates) had to be assembled on a single board. "It was time for this goal to be reached". He had also reached the conclusion that a proper logic design required at least 15,000 gates per chip to meet these goals.

The logic designers had gotten to him I surmised. Schedules, Neil reminded us, could not be altered - and that was that. To soften the blow he bought lunch that day, three Cokes and three orders of fish and chips - Neil's was a large order.

The trip back to the lab was pretty quiet, fortunately short since our eating places were all very close to the lab.

That afternoon, I assembled the key folks - I might miss one or two but Randy Bach, Doug Carlson, Dave Resnick and John Ketzler were four that I recall now. Doug was a mechanical engineer that I assigned the Motorola project to because of his management skills - something he probably never forgave me for - John was the key circuit engineer on the Motorola project and Dave was and still is a very versatile and perceptive engineer.

Doug and I would inform Motorola of the decision not to continue. The team would package up what was accomplished and turn it over to Motorola to carry the ball forward if they wished.

As a side note, Motorola and Cray did continue the design. It was the circuit design used in the Cray C90, a very successful Cray Research Supercomputer.

The meeting turned to what were the next steps.

The key challenges that emerged were:

  • IC Technology that could meet the new lofty goals
  • The PC board technology required to meet a single board CPU
  • Packaging and interconnect technology required to support the two above requirements
  • Computer Aided Design (CAD) technology necessary to accurately design IC and PCB technologies
  • Suppliers for all - do they exist?
  • What additional internal resources were required to achieve objectives
  • System packaging beyond a single CPU. (Memory, peripherals, I/O, etc.)
  • Testing of complex IC technology and complex PCB technology

Results

Before getting to the details of how decisions were made and how the ETA Systems technology “kit” was selected and developed, here is a list of noteworthy accomplishments:

  • First Industry competitive CMOS CPU

Since 1995 – to the present (beginning 12 years after the technology selection by ETA Systems I might add) ALL HPC (High Performance Computers) are developed and manufactured using CMOS IC technology. Until as late as 2000, bipolar technology (higher power, more costly to manufacture and lower gate count per chip) dominated high performance computers throughout the world.

  • First Industry Single Board CPU

The chip density (gates per chip) allowed by advanced CMOS, the use of Computer Aided Design tools for optimum layout and simulation, the successful design of a 45-layer advanced Printed Circuit board (you read it right, 45 layers) and innovative chip attachment and cooling permitted a single processor containing nearly 3 million gates to be packaged on a single board.

  • First Industry system to be designed with self-test

CPU Processing units (≈3 million gates each) were validated for functionality and performance in less than 4 hours. Any interconnect errors were recorded, allowing chip-to-chip replacement to occur in minimal time. Other CPUs of the same period required weeks to months to check out and validate a processing unit. Incoming testing of the logic IC chip (function and performance) also used the same self-test innovations.

  • First Industry production Liquid Nitrogen CPU

The ETA Systems CPU was immersed in Liquid Nitrogen (77 degrees Kelvin), improving performance to more than two times that of the same CMOS technology operated at room temperature (300 degrees Kelvin).

  • First system at CDC to fully utilize Computer Design Software to design Chips, boards, validate Logic design and Auto Diagnostic test the system with Synergistic tools

Permitted checkout of a CPU to be completed in less than 4 hours. Manufacturing costs were greatly reduced. This technique was also used at the IC Supplier and greatly reduced any probe test hardware and software.

  • First Industry system to have multiple cost designs from a single design effort

Performance range of the ETA System products was greater than 24:1 (an 8-processor system operating at a 7-nanosecond clock period versus a single-processor system operating at 24 nanoseconds). Processors were manufactured, tested and validated on a single manufacturing line using identical components. (IC chips were performance sorted using auto self-test.) Product differences began at the system packaging level.
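
The 24:1 figure can be sanity-checked with a little arithmetic. The sketch below assumes, purely for illustration, that peak throughput scales with the number of processors divided by the clock period; the function name is hypothetical and the model ignores memory and I/O effects.

```python
def relative_throughput(processors: int, clock_period_ns: float) -> float:
    """Peak throughput in arbitrary units: processors per nanosecond of clock period.

    Simplifying assumption (not from the original text): throughput scales
    linearly with processor count and inversely with clock period.
    """
    if processors <= 0 or clock_period_ns <= 0:
        raise ValueError("processors and clock period must be positive")
    return processors / clock_period_ns


# Figures quoted in the text: 8 processors at 7 ns versus 1 processor at 24 ns.
top_system = relative_throughput(8, 7.0)
entry_system = relative_throughput(1, 24.0)
ratio = top_system / entry_system

print(f"Performance range: about {ratio:.1f}:1")
```

Under that assumption the range works out to 192/7, a little over 27:1, which is consistent with the "greater than 24:1" stated above.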

Boring into details

Any Technology kit must be driven by a customer need. In the case of Supercomputers the craving for increased computer performance at a lower cost (overall cost) was the deciding factor. In any Supercomputer company a combination of marketing requirements, architecture innovations and logic design demands dictate the initial objectives of the hardware circuit and packaging organization. I state “initial” since once the objectives are digested and key technologies are evaluated for the time frame addressed, compromises are the norm. In the case of ETA Systems technology selections in the early 1980’s, this was the strategy implemented.

The following paragraphs sequence the thought process and the technology selection strategy utilized.

Integrated Circuit selection

The objectives, listed in earlier paragraphs were first integrated into the architecture and logic design requirements. A market survey of key integrated circuit suppliers was conducted with emphasis on what was in development and planned for product introduction – not what was available at the time of the survey. A risk assessment was made. Primary focus was on the most dynamic technology, the IC Logic technology. All decisions as to volume requirements, pins, packaging, etc. resulted from what was determined by this survey and risk analysis. Merging the logic design objectives (gates, bandwidth and performance of key functions) was next.

An ECL (emitter-coupled logic) high performance bipolar gate array using Motorola advanced IC technology was selected. Since Motorola was not fully staffed to begin the actual product development (application) but did have the process development underway, a cooperative development agreement was struck between the two companies (Motorola and Control Data, since ETA Systems had not yet been formed). The design called for basic logic cells to be incorporated into a larger version of their existing gate array, advancing the process for increased performance and the chip size for increased gate capacity. The existing gate or function array utilized approximately 2,500 gates (it was used as the primary gate array for the very popular Cray Research Y-MP Supercomputer), and the planned gate array would contain in excess of 8,000 equivalent gates.

Logic cell libraries were agreed to (acceptable both to Motorola for the general market and to CDC for the logic designs). Pin counts (for power, ground and input/output logic communications) were established and power consumption estimates were made. Once these parameters were established, board size, power systems and thermal control were evaluated in a trade-off give-and-take. Features of Printed Circuit Boards (line widths, spacing, interconnect vias and number of layers) were compared to the board size capacities, laminating press capabilities, drill designs and PC board processing limits. IC packaging limits (minimum size of package, pin spacing, thermal removal, etc.) were evaluated in parallel with PC Board limits.

The chip design, the cell library and the packaging all began once the parameters (pins, power consumption and die size objectives) were agreed to. Printed circuit board experiments also began. Once feasibility and practical limits were established (the original goals could be met as to physical design and performance, based on IC modeling and extrapolation from previously established functional systems), a preliminary specification was presented to the architects and logic designers for review.

From initial design data, logic design based on the parameters provided established a physical size for the Central Processing Unit or CPU, the heart of the system. A multiple board processor was required. This placed additional constraints on packaging, since within a single processor all distances between circuits are crucial. Three-dimensional packaging concepts were considered. Three-dimensional packaging effectively meant a “sandwich” of multiple boards, with board-to-board interconnects distributed throughout the area (not exclusive to the periphery of the board) so as to minimize the distances between chips on different boards. In addition, power consumption estimates were made, and thermal removal paths and techniques were considered. A cost model was generated as well. All of these factors resulted in a preliminary estimate of the CPU volume. From the introduction portion of the document, you already know that this was rejected - more to follow for sure.

In parallel with these efforts, memory design was underway. Less freedom was available to memory since the basic semiconductor device could not be altered to accommodate specific users. There were a few packaging alternatives, very few, and device configurations (Word – Bit architecture, pin numbering, power considerations, etc.) were dictated by the industry. Since memory design has its own objectives for cost, reliability and performance, this effort could continue quite independently with one exception, the packaging of the total system must be synergistic and compatible. A crucial parameter of this is the interconnect mechanism between processors and memory.

A hardware system cost model was established – not only for current cost considerations but also estimates on volume costs based on learning-curve estimates as well for the life of the system.

The chief architect, after careful review, rejected the design; this was covered in the introduction. Three key reasons were cited. First, performance would be impacted by the 8,000-gate limit: worst-case logic paths could not reside in a single chip, and multiple-chip distances would increase the clock period. Second, power consumption per CPU, although lower on a performance-ratio basis than previous generations, was too high when the total system size (including the multiprocessor objectives) was considered. Third, system cost appeared prohibitive, always a subjective issue but nevertheless a key component of the design. Reliability concerns were also stated, since the pin count per CPU, although quite reduced from previous designs, was still of concern. The architecture was committed to four CPUs (max) per system, so the interconnect "bar" was raised.

Back to the drawing board

Bipolar technology refers to conventional NPN and PNP transistors operating in a non-saturating mode (collector-base). By not saturating the operating transistors (not allowing the base voltage to be higher than the collector voltage), the switching characteristics were improved and balanced (the off and on logic levels had identical delays). In addition, the non-saturating circuitry, titled ECL for emitter-coupled logic, provided the TRUE and COMPLEMENT outputs for each logic function (i.e., AND & NAND, OR & NOR, etc.). This gave logicians advantages in designing complex Boolean functions (ADD units, MULTIPLY units, DIVIDE units, etc.). Under the category of “no free ride”, ECL circuitry consumed more power than the more popular but much slower saturating logic circuitry (TTL, transistor-transistor logic). Other performance improvements for integrated versions of ECL logic circuitry included replacing conventional junction isolation between circuit devices on a single die with oxide isolation between circuits (lower capacitance per circuit, so less charging and discharging when logic levels switched).

CMOS (Complementary Metal-Oxide-Semiconductor) circuitry, especially at the time of ETA Systems, was a simpler and more efficient logic circuit. This form of logic also had a simpler process. Stacking P-channel and N-channel transistors in series between the voltage bus rails defines a single complementary gate. Functionality of the logic devices is much more forgiving of process variations, due to the larger voltage swing and the fact that only active transistors define the circuitry (no resistors, diodes, etc.). The physical size of a logic function is significantly smaller than its bipolar equivalent, resulting in more circuitry per equivalent die (chip) size. CMOS technology also consumed power ONLY when the circuit was switching (changing states), so power consumption was directly proportional to the operating frequency (P = CV²f). ECL circuitry, by contrast, consumed approximately the same power whether switching or in a quiescent state. (Later forms of CMOS, especially those designed in the early 2000s and beyond, had increased power consumption, primarily caused by increased bulk leakage currents in processes with lithographic features smaller than 90 nanometers.) Technology at the time of the development of the ETA Supercomputers had minimum features of 1,200 nanometers. (In 2009, by contrast, the production capability is 45 nanometers.)
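
As a rough illustration of the switching-power relation quoted above (power equals capacitance times voltage squared times frequency), here is a minimal sketch. The capacitance, voltage and frequency values are hypothetical and chosen only to show the scaling; they are not ETA Systems design figures.

```python
def dynamic_power_watts(capacitance_f: float, voltage_v: float, frequency_hz: float) -> float:
    """Switching power of a CMOS node, P = C * V^2 * f (leakage ignored)."""
    return capacitance_f * voltage_v ** 2 * frequency_hz


# Hypothetical illustration values, not taken from the original text.
C_LOAD = 1e-12  # 1 pF of switched capacitance

base = dynamic_power_watts(C_LOAD, 5.0, 50e6)        # reference point
double_f = dynamic_power_watts(C_LOAD, 5.0, 100e6)   # twice the frequency
double_v = dynamic_power_watts(C_LOAD, 10.0, 50e6)   # twice the voltage
idle = dynamic_power_watts(C_LOAD, 5.0, 0.0)         # not switching at all

print(f"2x frequency -> {double_f / base:.0f}x power")
print(f"2x voltage   -> {double_v / base:.0f}x power")
print(f"idle power   -> {idle} W")
```

The sketch shows the two properties the text relies on: power grows linearly with frequency and with the square of voltage, and an idle CMOS node (in this idealized model) draws nothing, unlike ECL.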

Advantages of CMOS were obvious: more circuits per given chip area, lower power consumption and higher functional yield. It is important to stress “functional yield”. The CMOS devices functioned over a much larger range of processing variations (> 50% vs. < 15% to 25% for ECL). Performance variations for a given process were approximately 2 to 3 times for CMOS and 20% to 30% for ECL. For this reason CMOS devices were sold at a much lower performance than any bipolar counterpart; i.e., if the product was specified to accommodate the entire functional lot (wafers processed at the same time), more IC devices yielded. There is one other key difference in defining performance between Bipolar and CMOS devices. For ECL (or any other bipolar device) the maximum operating frequency is defined, in part, by the base width, the physical distance between the emitter and collector of the transistor. This is determined by the diffusion or implant of the emitter; it is controlled in the vertical direction and limited by process control that is quite precise. The base is very thin, and the frequency is inversely proportional to the base width. For CMOS the gate length defines the critical performance parameter. Gate length is defined by mask optic limitations for any generation of processes. Bipolar devices in the 1980’s and well into the latter half of the 1990’s, therefore, had higher maximum operating frequencies than their CMOS counterparts. As capital equipment (primarily the optics used to generate masks, and etching capabilities) defined smaller and smaller geometries, CMOS technology improved dramatically in performance. This was a result of smaller gate lengths, but also each generation had smaller devices, resulting in lower capacitance loading and lower time constants to charge and discharge.

During the time of the ETA Systems Supercomputer development, CMOS technology had not seen the advantages that bipolar devices could realize, but the potential for future improvements was obvious, and projections clearly indicated that by the second half of the 1990’s (nearly 10 years after the first ETA Systems Supercomputer would be available), CMOS would overtake Bipolar in the last and most important parameter: performance. To restate this, the IC industry was transitioning to CMOS technology, and more funding at the device and equipment level was being expended to accommodate new markets focused on the potential of CMOS than was being expended for Bipolar devices.

Bipolar technology was stretched to a practical limit for the time frame in question.

The IC industry, therefore, had only one other technology candidate, CMOS, which in 1983 was used exclusively for lower cost and considerably lower performance applications, and for memory devices, where more bits per die could be fabricated at the expense of lower performance than the bipolar counterpart(s). The impressive characteristics of CMOS technology at this time were lower power consumption per function, smaller size per logic function and lower cost per die. The lower cost was due to two key factors: smaller physical size per function meant more logical functions per unit of area, and chip yield (chip functionality per wafer manufactured) was higher due to the reduced number of processing steps needed to generate CMOS devices. That was the good news. The concern was system performance. While bipolar technology had set the standard with clock periods of 10 ns for Supercomputer architectures such as the ETA System projection, CMOS was at least 5 times slower, and in most cases 10 to 20 times slower, for equivalent architectures. Based on this parameter alone, CMOS was not a candidate for Supercomputers in the 1989-1990 time frame (the time frame where the ETA Systems Supercomputer would be in high volume production).

The next steps for CDC (recall that at this time CDC still had a Supercomputer Division) were dramatic and at times emotional. First, the team had to discard the ECL design and terminate the effort with Motorola. This was very difficult, firstly because both companies depended on each other and secondly because all objectives of the ECL product were being met within the specifications established. CDC (the team which later became ETA Systems) provided Motorola with all of the design details to date. Considerable effort was made to ensure that the program was successful at Motorola.

A sidelight to this discussion: Motorola completed this product as an industry product. Cray Research Inc. (the key competition and leader of the Supercomputer market) engaged with Motorola to successfully complete this complex IC development for a product announced in the late 1980’s. The product (Cray C-90), under the leadership of Les Davis, Steve Nelson and other notable scientists (a key circuit designer was Mark Birrittella), became another very successful supercomputer product developed and manufactured by Cray Research Inc.

Next, a full effort evaluation of all technology candidates occurred. CMOS futures were explored in depth. GaAs technology was also evaluated. Alternative ECL (bipolar) candidates were also considered. CMOS was viewed as the technology of the future but the future was beyond the time frame necessary for product introduction.

Key events that led to the decision to use CMOS technology.

Moore’s law (formulated by the great innovator and co-founder of Intel, Gordon Moore) stated that IC technology (CMOS) would double in performance and density every 18 months to two years. The actual Moore’s law may have been stated somewhat differently, but this captured all the project cared about. To achieve this predicted growth, several things had to occur:

  • The die size would increase (more gates per manufactured chip).
  • Features on the chip (metal widths and spaces to interconnect devices, and actual device parameters) would be reduced every 18 months to 2 years. Reducing feature sizes has two positive results for the goals of ETA Systems: increased performance and more gates per die.
  • The technology would gain broad industry popularity – this would mean that capital equipment would keep pace with the “law”, applications would increase thus increasing volume, thus lowering cost and increasing performance and more applications and industries would drive CMOS technology – the Supercomputer industry could not drive such a large industry.
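
To make the doubling rule concrete, the sketch below estimates how long it would take gate density to grow from the 5,000-gate CMOS chip mentioned earlier to the 15,000 gates per chip the architects asked for. It assumes, as the simplified statement of the law above does, a clean doubling every 18 to 24 months; the function name is hypothetical and real process roadmaps were never this smooth.

```python
import math


def years_to_reach(start_gates: float, target_gates: float, doubling_years: float) -> float:
    """Years needed to grow from start_gates to target_gates at a fixed doubling period."""
    if start_gates <= 0 or target_gates <= 0 or doubling_years <= 0:
        raise ValueError("all arguments must be positive")
    doublings = math.log2(target_gates / start_gates)
    return doublings * doubling_years


# Gate counts taken from the text; doubling periods from the stated 18-24 month range.
fast = years_to_reach(5_000, 15_000, 1.5)
slow = years_to_reach(5_000, 15_000, 2.0)

print(f"5,000 -> 15,000 gates: between {fast:.1f} and {slow:.1f} years")
```

A threefold increase is about 1.6 doublings, so under these assumptions the target sits roughly two and a half to three years out, which is why the projection looked plausible for a late-1980s product.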

Key industry activities also emerged at this time:

  • CDC validated the operational performance gains from operating CMOS technology in a cryogenic environment. Several ring counter configurations built with the 5,000-gate chip discussed earlier were dipped in a Liquid Nitrogen thermos jug. We expected to witness the shattering of the silicon and the detachment of the solder joints attached to the oscilloscope, only to find the frequency of the ring oscillators double and the system operate for weeks until we turned off the experiment. Analysis applied to the silicon design validated the research done previously by others.
  • Key US Government agencies began a technology acceleration program based on CMOS technology – the Very High Speed Integrated Circuits (VHSIC) program under direction of the Army, Navy and Air Force certainly captured our attention.
  • Honeywell, one of the participants in the VHSIC program, held a technology luncheon IEEE symposium at which they presented an 11,000-gate CMOS development effort. Attendees from CDC (especially the key designer, Randy Bach) were impressed with the effort. The chip was certainly larger than any that had been developed to date, and the performance was accelerated beyond what the conventional IC industry predicted for the 1988 time frame (the introduction date set for the ETA Systems Supercomputer, then the next generation CDC Supercomputer). Honeywell was a recipient of one of the VHSIC contracts.
  • Logicians and architects back at CDC - led by Neil Lincoln (chief architect), Ray Kort, Maurice Hudson and Dave Hill and others - determined that an minimum gate density of 15,000 gates per die would allow them to achieve a key objective; having a worst case Register to Register clock path residing within a single chip. Now additional explanation is required here. There were technical reasons that the logicians wanted more beyond the knee jerk reaction that asking for 50% more than offered was a standard mode of operation for these guys. Each architecture configuration has a method of achieving its goals of applying computational instructions to problems. The number of gates that are connected in serial fashion between the input and output registers (and this is truly simplifying the problem) determine the clock period that is allowed. For the ETA Systems Supercomputer, therefore, it was determined that a functional unit clock period could reside within the boundary of the chip if the chip could provide 15,000 gates of logic to the designer.
  • Research into technology experiments uncovered significant performance features of CMOS technology. First of all, the technology was functional across a wide range of voltages and temperatures but performance was significantly altered. The higher the operating voltage (within semiconductor constraints, of course) the higher the performance resulted. Unfortunately the Power consumption, although significantly lower than any alternative technology, increased as the Square of the operating voltage. The lower the operating temperature of CMOS the higher performance as well. This factor was studied by others and carefully documented from 400 degrees Kelvin (100 degrees above room temperature) to 77 degrees Kelvin. (77 Degrees Kelvin is the boiling point temperature of liquid Nitrogen.)
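
The square-law relationship between operating voltage and power noted above can be illustrated with a small sketch. It assumes that only the supply voltage changes and that everything else is held constant; the function name and the voltages are hypothetical and chosen only for illustration, since the text gives no specific operating voltages.

```python
def relative_power(v_new: float, v_old: float) -> float:
    """Power ratio when only the supply voltage changes.

    Assumes power is proportional to the square of the operating
    voltage, as stated in the text; all other factors are held fixed.
    """
    return (v_new / v_old) ** 2


# Hypothetical voltages, not taken from the ETA Systems design.
ratio = relative_power(6.0, 5.0)
print(f"Raising the supply from 5.0 V to 6.0 V costs about {ratio:.2f}x the power")
```

A 20% increase in voltage therefore costs noticeably more than 20% in power, which is why the voltage lever for performance had to be used with care.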

Summary of what was learned with this evaluation

  • IC chips currently (four years before the need for an ETA Systems product) had a capacity of 11,000 gates.
  • These gates, when operated at liquid Nitrogen temperatures, would perform at least two times faster than at room temperature (not yet validated at CDC).
  • 15,000 useable gates were required per chip to meet logic designer chip boundary requirements.
  • If Moore’s law was applied to these parameters, within the time frame required, it was possible to achieve both gates per chip densities and performance goals (if the system operated in a liquid Nitrogen environment).
  • There were at least two IC Suppliers (those having contracts with the US government) that were pursuing CMOS as a performance and high gate/chip density technology (the other known corporation was TRW).
  • Computer Aided Design (CAD) tools were, during the period of the 80’s, in the infancy stage if one was to compare them to today’s capabilities. To design, place cells within the matrix of the gates provided on the IC Chip, and route the interconnections of these cells accurately to the logic or Boolean design required by the logicians and to clock period constraints was a challenge. This challenge applied to board layout designs as well. Control Data Corporation (CDC) recognized the challenges and established a small but efficient and dedicated organization to address them. The industry had established a metric that to use CAD tools for gate or cell arrays, an additional 20% to 30% gates were required. This meant that if the ETA Supercomputer required at least 15,000 useable gates to accomplish necessary designs based on its architecture, an 18,000 to ≈20,000-gate capacity was required. The technology organization set as its objective a design of 20,000 gates plus the circuitry necessary to self-test each gate or cell array. Compared to the gate array in development at Honeywell, this was nearly 2 times the capacity (11,000 total gates vs. 20,000 total gates plus circuitry for self test).
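
The gate-capacity arithmetic in the last item can be checked with a short sketch. It applies the 20% to 30% CAD overhead quoted above to the 15,000 useable gates; the function name is hypothetical, and integer arithmetic is used so that the results are exact.

```python
def required_capacity(usable_gates: int, overhead_percent: int) -> int:
    """Total gates needed so that `usable_gates` remain after CAD overhead.

    `overhead_percent` is the extra capacity the place-and-route tools
    were said to need (20 to 30 in the text). The result is rounded up.
    """
    return (usable_gates * (100 + overhead_percent) + 99) // 100


USABLE_GATES = 15_000  # figure from the text

for percent in (20, 30):
    total = required_capacity(USABLE_GATES, percent)
    print(f"{percent}% overhead -> {total:,} gates")
```

The two results bracket the 18,000 to roughly 20,000 gate range given above, which is consistent with the 20,000-gate objective the technology organization set.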

The task was to convince Honeywell to project the next generation size and layout rules and to accept an R&D effort that would allow CDC / ETA Systems to achieve its objectives. Honeywell, an innovative organization, took on the task after considerable discussion, with these key requirements:

  • ETA Systems (we were now ETA Systems by the time these discussions reached negotiations) would accept costs based on wafers processed, not functional chips. Honeywell would provide the processing data necessary to show that wafers were processed within process parameter specifications.
  • ETA Systems provide test equipment for wafer testing and test parameters for chip acceptance prior to packaging.
  • Both companies would share facilities and key resources and work as a single team, in as “open a Kimono” relationship as one could ever imagine during this dynamic period of complex process developments within the IC Industry. David Frankel was assigned the task as ETA Systems interface and energetically took on the challenging task.
  • Self-test circuitry was designed into the basic cell array periphery. The area consumed by this custom set of pseudo-random generated logic and registers was less than 15% of the total chip area. (David Resnick, resident do-it-all, reduced to practice concepts explored by ex CDC scientist Nick Van Brunt, who left the company a year before the formation of ETA Systems.) This was one of many extraordinary contributions David made to ETA Systems. In addition to providing self test capability to accept or reject the circuitry (both functionality and performance sorting), the circuitry included in each 20,000 gate array had the capability to test for interconnect between circuits on the final PC Board as well as circuit to I/O connections.

When the logic design team first heard of this area “waste” of test circuits that could be used for logic design, they lobbied for it to be removed in favor of more logic gates for function designs. Fortunately this request was not honored. IC validation at both the supplier in wafer form and at ETA Systems in packaged chip configuration coupled with the use of the same circuitry in manufacturing checkout to detect board opens and shorts between circuits assembled both in room temperature and cryogenic temperature environments proved to be well worth this “waste” of circuitry area. Small, relatively inexpensive testing systems were designed by ETA Systems and provided to the supplier. The operands for initialization of the pseudo-random logic were also supplied for each design (chip type).

Chip types (array design options) were carefully managed so as not to proliferate the chip types in the system. This was a new constraint placed on logic designers and was dealt with most professionally and responsibly by all participants once understood. The resultant chip total for the CPU (processing unit) was fewer than 150, while the number of chip types, including clock chips and all logic design chips, was fewer than 20 as best recalled.

During the development cycle of the ETA Systems Supercomputer, Honeywell moved the manufacturing capability from a local Minneapolis facility to a state-of-the-art manufacturing facility in Colorado Springs, CO. The transition was very transparent to ETA Systems (with the exception of the traveling budget, of course). To accomplish this, team members from both companies acted as one in all decisions addressing scheduling and timing of needs of various chips, testing, packaging, etc. The open book relationship was very beneficial to both companies. On one milestone occasion, when Honeywell successfully completed an initial order, Dave Frankel and I visited Honeywell, some 30 miles from the ETA Systems facility, and served cake and coffee to all designers and operators. It was below zero when this milestone was reached and no one cared.

One design decision incorporated into the chip was to allow next generation critical processing parameters to be added to the existing design (present chip layout). Although this would not take full advantage of a new process (not all parameters were considered), key performance enhancements could be and were added to the present design. A key feature was gate length; this was added transparently to the physical chip and offered appreciable performance enhancements to the design.

Chip design summary:

The decision to utilize CMOS technology for the ETA Systems Supercomputer in the 1985 – 1988 time frame (prematurely by all industry metrics) resulted in the following additional “technology kit” decisions:

  • Addition of chip self-test. This feature established functionality at wafer test and functionality and performance sorting at ETA Systems
  • Computer Layout tools that validated logic prior to chip release for fabrication
  • Requirement to operate the chip at 77 degrees Kelvin or in liquid Nitrogen
  • Packaging, interconnect & assembly decisions based on liquid Nitrogen operation challenges
  • Remote testing of the CPU because of liquid Nitrogen operation challenges
  • Logic design partitioning challenges to design within 15,000-gate per chip boundaries and a minimum of IC chip types

Printed Circuit Board Design Selection:

In the period of the 1980s, the time frame of the ETA Systems Supercomputer development, Printed circuit boards had maximum dimensions of approximately a square foot and the number of total layers was fewer than 20. (Layers provide power and ground stability, interconnect capability for the circuits attached to the board as well as inputs and outputs to and from the board.) If these total layers are allocated properly, approximately 50% are used for interconnect and the remainder for power and ground. Positioning of power and ground layers also serves to provide interconnect layers that have transmission line capabilities to insure signal integrity throughout the board. During this period, a state-of-the-art printed circuit board was approximately one square foot of active circuitry and, as stated earlier, 20 layers or fewer, usually restricted to a total thickness of 0.063 inches.

It was determined that a maximum of 150 chips would be required to design the ETA Systems Supercomputer CPU. Packaging of the IC and interconnecting the chip to a PC board with minimum spacing between chips (some spacing was required to allow interconnects to all of the necessary layers) resulted in a 1.2 x 1.2 inch “footprint”. Doing the simple math results in a pc board of a minimum of 220 sq. inches. The number of total layers required to interconnect the 150 chips and the necessary Input and Output at the board periphery was determined to be 45. Looking at design parameters of the board layers in more depth and insuring transmission line features to insure signal integrity defined the board thickness at slightly greater than 0.25 inches. This thickness was approximately three times greater than high-end printed circuit boards produced in this time frame. With a board having an area of greater than 1.5 times the size of what was able to be produced, a thickness of 300% of what was produced and a number of layers 2.5 times what was produced in this time frame, it was clear that the printed circuit board industry was not ready for the ETA Systems design!

The design had other limitations. A key factor when designing pc boards is to insure proper connecting of the layers, i.e., connecting the chip pins to the board and the proper layer of interconnect in the board and back to the proper receiving chip. These connections are made by drilling holes in the layers and plating the walls of the holes with copper for conduction. These are called plated thru holes or PTH. A key parameter to insure that plating occurs in these holes is the ratio of hole depth to hole diameter. The industry limit at this period (not much better today) was 6:1, i.e., the thickness of the board must be no more than 6 times the diameter of the hole. This ratio would dominate the size of the board: if it were used to design the board, the board area would increase by more than 9 times. Talk about piling on! Since it was deemed not feasible, issues like cost and time to fabricate the board were not even addressed.
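
The “simple math” for the board area and the effect of the plated thru hole ratio can be sketched as follows. The chip count, footprint, board thickness and 6:1 ratio are the figures quoted above; the function names are hypothetical. The sketch only computes the minimum chip footprint area and the smallest hole that a given ratio allows. It does not attempt to reproduce the “more than 9 times” area estimate, whose derivation the text does not give.

```python
def min_board_area_sq_in(chip_count: int, footprint_side_in: float) -> float:
    """Minimum board area if square chip footprints are tiled with no waste."""
    return chip_count * footprint_side_in ** 2


def min_hole_diameter_in(board_thickness_in: float, aspect_ratio: float) -> float:
    """Smallest platable hole diameter for a given depth-to-diameter ratio."""
    return board_thickness_in / aspect_ratio


area = min_board_area_sq_in(150, 1.2)
print(f"Chip footprints alone: {area:.0f} sq. inches")

# Conventional 6:1 limit on a board slightly thicker than 0.25 inches.
hole = min_hole_diameter_in(0.25, 6)
print(f"Smallest hole at 6:1 on a 0.25 inch board: {hole:.4f} inches")
```

The footprint total comes out just under the 220 sq. inches quoted, and the hole diameter forced by a 6:1 ratio on a quarter-inch board shows why that limit would have dominated the board size.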

Nestled into the design laboratory of Control Data Corporation was a small but very innovative printed circuit board prototype facility. The leader of this group, LeRoy Beckman, never said “no” to challenges. He just bit his pipe a little harder and tried not to snicker out loud. LeRoy kept his eyes and ears out for innovative alternatives to conventional board fabrication techniques and had displayed innovation (evolutionary in nature) in previous generations. Embedded termination resistors in layers was one invention he brought to CDC when resistor termination took up too much board area; finer features than the industry was producing was another, and higher plated through hole (PTH) ratios than the industry a third.

New technologies in the printed circuit board industry were few and far between. The industry was set in its ways of subtractive etching of circuit layers (removing unwanted copper from a pre-copper clad layer), conventional wet etch processes and relatively simple assembly, i.e., lamination of layers with pressure. One inventor, Mr. Peter P. Pellegrino, arrived on the scene to discuss innovative, revolutionary and proven pc board processing. At first the claims appeared to be too good to be true: a process relatively independent of board size, aspect ratios exceeding 20:1 for PTH, and an additive process that permitted finer lines to be fabricated on individual layers. The layers were also embedded into the laminate, offering the opportunity for higher yield with reduced features. An additional benefit of additive plating is a reduction in waste and water usage.

A special plating cell was also introduced that permitted uniform deep hole plating by forcing plating fluid into each of the thousands of PTH. The process, titled “Push-Pull™”, also accelerated the plating manufacturing cycle by over an order of magnitude, reducing cost.

A small plating cell was incorporated into the prototype facility at CDC and a controlled set of experiments conducted. Experiments were thorough and challenging since no one in the industry could approach the lofty objectives of the ETA Systems Supercomputer CPU board nor the lofty claims of the inventor. The results were simply outstanding. From the results and a commitment to fabricate a larger manufacturing line of plating insert cells, the 45 layer 15” x 24” CPU board became a realistic finalized goal of ETA Systems.

Anyone told of this goal openly scoffed at it as too risky and unrealistic. This included some in the company as well.

Later, when manufacturing of the systems was viable, a production capacity was developed for manufacturing. It is noted that hundreds of these boards were fabricated from 1987 through early 1989. The yield of final boards was nearly perfect: only one finished board was scrapped.

To this day (2009) few realize what a monumental accomplishment this was and still is. This is a tribute to LeRoy Beckman, Peter Pellegrino, the manufacturing facility at ETA Systems (now a banking building in St. Paul) and those who trusted that the lofty objectives could be realized.

To accommodate routing and designing for minimum distance between IC chips, CAD tools were developed and diagonally routed layers were introduced for the first time. Prior to this, only x–y layers were permitted with manual and/or auto tools (CAD). This enhancement permitted timing constraints to be realized between chips.

The final board had the following noteworthy characteristics:

  • Board size: 15 inches by 22 inches by 0.26 inches
  • PTH aspect ratio ≈ 20:1, with a plating time of less than 20 minutes
  • 45 total layers per CPU panel
  • 150 IC chip locations (fewer were used in final design)
  • More than 30,000 board plated thru holes (pth) were used for interconnect

In 2009 this board development and manufacturing stands out as one of the major technology developments by ETA Systems.

Packaging

The key challenge for packaging the ETA Supercomputer processing unit was the cryogenic chamber for the processor. The Cryostat that contained the processor (two processor units) was a conventional (and quite heavy) circular cryostat with a vacuum chamber between the outside environment and the inner environment. Input of liquid Nitrogen was at the bottom of the chamber and the gaseous Nitrogen escaped near the top of the unit. The piping carrying the Nitrogen to and from the regeneration unit was also temperature protected with vacuum lines. Dan Sullivan and his design team led this admirable effort. (Unfortunately, Dan passed on a few years ago.) It was felt that a less heavy and equally efficient chamber (proposed by Carl Breske, a very innovative scientist) could be designed if time permitted, but the selection of the vacuum based design was conservative to accommodate schedule and also to familiarize the team with the challenges of Cryogenics. The compressor unit was a conventional Liquid Nitrogen system (very large and bulky) used for generation of Liquid Nitrogen for the commercial market. The system was not pretty. Marketing, led by Bobby Robertson (also now deceased), prohibited the engineers from showing this to prospective customers, fearful that it would scare them away.

Thought was given to eliminating the need to regenerate Nitrogen in a closed system and instead purchasing Liquid Nitrogen, readily available in tanks, and having the tanks periodically refilled as is done in the IC and other industries using Liquid Nitrogen. This was discarded for the initial design since several customer sites did not easily accommodate external access to Liquid Nitrogen tanks. It was to be an option for future systems and for those customers that easily accommodated such an option. The final design was then a closed recycled Liquid Nitrogen system with the compressor located remotely, much like Freon compressors, which many Supercomputer customers were already accommodating.

The design challenge was at the surface (which looked much like a two slice toaster) where the processing boards were inserted. This seal had to accommodate the connecting transmission to the external, room temperature memory and I/O subsystems. A printed circuit board was designed to connect the processor to the outside world. Heaters were applied to the surface to prevent icing at the cryostat surface. Across a separation of only a few short inches, the memory operated at 300 degrees Kelvin and the CPU at 77 Degrees Kelvin. There were a few “frosty” events in this development cycle!

The third challenge was to provide reliable soldering of the circuitry to the board despite the severe temperature difference that the solder joints would be subjected to (greater than 250 degrees) during the cool down and warm up cycles. Studies at the National Bureau of Standards provided input that the temperature cycle should be profiled in a precise sequence as the board was cooled and heated. In addition, care was taken not to remove the board before it had been warmed to room temperature, since condensation would otherwise occur. The result was a 20-minute cycle to remove or insert the board, with a specifically prescribed sequence of temperature lowering and rising for both cycles.

At the time of the unfortunate termination of ETA Systems, a more refined, lower cost and lower weight design, as stated earlier, was on the drawing boards. Although the cryostat and associated cooling were costly, an analysis clearly showed that for the performance resulting from the design, the cost was less than that of any Bipolar IC system designed at the time. Once the connector was finalized and the process and assembly designed, the system operated flawlessly. Checkout of the system on the manufacturing floor used the “Self-Test” capability exhaustively, so specific interconnect flaws were clearly understood prior to removing a CPU from the cryostat, reducing checkout time considerably as well. These designs were well done and significant, and they challenged the laws of thermodynamics and physics to new limits.

Air Cooled System

As stated earlier in the document, an Air-cooled processor would operate considerably slower (2x slower) when operated in normal or “room temperature” environments. By sorting the devices for performance at incoming inspection, ETA Systems allowed a three times performance differential to be realized. Only the highest performance devices were reserved for the Cryogenic cooled system. The remaining parts were then re-sorted into two categories for room temperature; the differential would be a 4-nanosecond clock period between the two room temperature systems and 17 nanoseconds (24 to 7) for the total system product set. Sorting and using the entire distribution of Integrated Circuits provided a significant cost reduction for the entire product line. Bipolar devices, by contrast, had lower functional yield to begin with, coupled with additional loss of product due to performance yield. This was a definite cost reduction asset to the ETA System.
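
The clock-period figures above translate into a performance spread that can be sketched briefly. The 7 and 24 nanosecond endpoints are from the text; the 20 nanosecond value for the faster room temperature bin is an assumption inferred from the stated 4-nanosecond differential, and the names used here are hypothetical.

```python
# Clock periods in nanoseconds. 7 and 24 are quoted in the text;
# 20 is inferred from the 4 ns differential and is an assumption.
CLOCK_PERIODS_NS = {
    "cryogenic": 7.0,
    "room temperature, faster bin": 20.0,
    "room temperature, slower bin": 24.0,
}


def clock_mhz(period_ns: float) -> float:
    """Clock frequency in MHz for a given clock period in nanoseconds."""
    return 1000.0 / period_ns


def speedup_over(period_ns: float, baseline_ns: float) -> float:
    """How many times faster a clock period is than the baseline period."""
    return baseline_ns / period_ns


slowest = max(CLOCK_PERIODS_NS.values())
for name, period in CLOCK_PERIODS_NS.items():
    print(f"{name}: {clock_mhz(period):.1f} MHz, "
          f"{speedup_over(period, slowest):.2f}x the slowest bin")
```

On clock rate alone, the ratio between the fastest and slowest bins is a little over three, in line with the three times performance differential described.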

To cool the CPU, air was forced onto the processor chips by using a plenum that was designed to cover each chip. Holes were designed in the plenum such that an equal operating temperature would result for each operating chip. Since the power consumption variation was significant for several part types, designing the appropriate number of holes above each chip location could provide custom cooling. The plenum could then be molded for mass production of the processing unit. Large volume cooling fans were designed for the system as well. Cost was the focus for the air-cooled systems since the price tag was below $1M. Recall that the air-cooled design was identical in parts at the CPU and storage level. A single development was achieved for a wide range of products with one design team.

Storage

Stacks using three-dimensional characteristics were designed under the leadership of Brent Doyle for the memory – both static (high performance) and dynamic (high density and lower performance) memories of the ETA Systems Supercomputer. These unique designs provided for highest density and optimum performance of the standard memory devices used. Ability to upgrade to future generations of memory (more storage capacity Integrated Circuits) was built into the design as well. The design worked well and stacking became commonplace in the computer industry for future designs – eventually eliminating the chip package entirely.

The Air-cooled system was defined as a Piper. An Illustration of “Piper” is shown below.

Piper processor.jpg

Summary

The design of the ETA Systems Supercomputer hardware had many unique features. These brief pages highlight some of them.

It would be remiss not to briefly discuss the “team” concept used to design the hardware. By having the CAD, Packaging, memory, circuit and power expertise located in a close proximity and holding concise project reviews at all levels at periodic and timely phases, all were kept abreast of the progress and challenges of each other. This permitted changes to be made to necessary designs to properly accommodate the challenges and opportunities in a timely fashion. Hardware was demonstrated on or near schedule despite the innovations required in each aspect of the design. The team was truly a “team”.

A missing link to the team was the logic design. These folks were separate and actually on another floor of the ETA Systems facility. It was strongly suggested, and accepted for future designs, that the logic team would be a part of this common organization. I later had the opportunity to lead one additional hardware development that included the logic design team (at Cray Research, not ETA Systems). It was a smoother and more effective and thorough team. As at ETA Systems, the communications were open and included both manufacturing and software participation (the latter two were voluntary).

Clearly, effective communication at all levels of the organization was key to this hardware design success.