34 results on '"Simon M. Tam"'
Search Results
2. A 22 nm 15-Core Enterprise Xeon® Processor Family.
- Author
-
Stefan Rusu, Harry Muljono, David Ayers, Simon M. Tam, Wei Chen 0129, Aaron Martin, Shenggao Li, Sujal Vora, Raj Varada, and Eddie Wang
- Published
- 2015
3. Power reduction techniques for an 8-core Xeon® processor.
- Author
-
Stefan Rusu, Simon M. Tam, Harry Muljono, Jason Stinson, David Ayers, Jonathan Chang, Raj Varada, Matt Ratta, Sailesh Kottapalli, and Sujal Vora
- Published
- 2009
4. A Dual-Core Multi-Threaded Xeon Processor with 16MB L3 Cache.
- Author
-
Stefan Rusu, Simon M. Tam, Harry Muljono, David Ayers, and Jonathan Chang
- Published
- 2006
5. Itanium processor clock design.
- Author
-
Utpal Desai, Simon M. Tam, Robert Kim, Ji Zhang, and Stefan Rusu
- Published
- 2000
6. A 45 nm 8-Core Enterprise Xeon® Processor.
- Author
-
Stefan Rusu, Simon M. Tam, Harry Muljono, Jason Stinson, David Ayers, Jonathan Chang, Raj Varada, Matt Ratta, Sailesh Kottapalli, and Sujal Vora
- Published
- 2010
7. A 65-nm Dual-Core Multithreaded Xeon® Processor With 16-MB L3 Cache.
- Author
-
Stefan Rusu, Simon M. Tam, Harry Muljono, David Ayers, Jonathan Chang, Brian S. Cherkauer, Jason Stinson, John Benoit, Raj Varada, Justin Leung, Rahul Dilip Limaye, and Sujal Vora
- Published
- 2007
8. 5.4 Ivytown: A 22nm 15-core enterprise Xeon® processor family.
- Author
-
Stefan Rusu, Harry Muljono, David Ayers, Simon M. Tam, Wei Chen 0129, Aaron Martin, Shenggao Li, Sujal Vora, Raj Varada, and Eddie Wang
- Published
- 2014
9. A 45nm 8-core enterprise Xeon® processor.
- Author
-
Stefan Rusu, Simon M. Tam, Harry Muljono, Jason Stinson, David Ayers, Jonathan Chang, Raj Varada, Matt Ratta, and Sailesh Kottapalli
- Published
- 2009
10. Analog VLSI neural networks for impact signal processing.
- Author
-
Jeff Brauch, Simon M. Tam, Mark A. Holler, and Arthur L. Shmurun
- Published
- 1992
11. SkyLake-SP: A 14nm 28-Core Xeon® processor
- Author
-
Edward Wang, Tom Wang, Rizwan Qureshi, Harry Muljono, Hubert Hsieh, Sitaraman V. Iyer, Wei Chen, Min Huang, Kalapi Roy-Neogi, Nagmohan Satti, Sujal Vora, and Simon M. Tam
- Subjects
020203 distributed computing ,CMOS ,Xeon ,Computer science ,Server ,Scalability ,0202 electrical engineering, electronic engineering, information engineering ,Code (cryptography) ,020207 software engineering ,02 engineering and technology ,Parallel computing ,Cache ,PCI Express - Abstract
SkyLake-SP (Scalable Performance), code name SKX, is the next generation Xeon® server processor fabricated on the Intel® 14nm tri-gate CMOS technology with 11-metal layers [1,2]. The SKX processor family has three core-count configurations. Each SKX core is accompanied by 1MB of dedicated L2 (2nd level cache) and 1.375MB of non-exclusive L3 (3rd level cache). At its maximum configuration of 28 cores, the SKX processor supports 6 DDR4 channels (2666MT/s), 3×20-lanes UPI processor-to-processor links (10.4GT/s) and x48+4 PCIE links (8GT/s). SKX supports per-core power-performance optimization enabled by on-die integrated voltage regulators (FIVR) [3, 4]. A new 2-dimensional synchronous on-die MESH fabric interconnects all the on-die components. Fig. 2.1.1 shows the overall architecture of the SKX processor.
- Published
- 2018
12. A 22 nm 15-Core Enterprise Xeon® Processor Family
- Author
-
Edward Wang, Aaron K. Martin, Simon M. Tam, Shenggao Li, Wei Chen, Raj Varada, Sujal Vora, Harry Muljono, David J. Ayers, and Stefan Rusu
- Subjects
Memory buffer register ,Engineering ,Xeon ,business.industry ,CPU cache ,Interface (computing) ,Modular design ,Floorplan ,Embedded system ,Hardware_INTEGRATEDCIRCUITS ,Cache ,Electrical and Electronic Engineering ,business ,PCI Express - Abstract
This paper describes a 4.3B transistors, 15-cores, 30-threads enterprise Xeon® processor with a 37.5 MB shared L3 cache implemented in a 22 nm 9M Hi-K metal gate tri-gate process. A modular floorplan methodology enables easy chops to 10 and 6 cores. Multiple clock and voltage domains are used to reduce power consumption. The clock distribution uses a single PLL per column to save power and minimize deskew crossing points. Integrated PCIe Gen3 and Quick Path Interconnect® (QPI) ports operate at 8GT/s. The 4-channel memory interface supports both 1866 MT/s DDR3 and a new memory buffer interface running at 2667 MT/s on the same pins. The core, cache and I/O recovery techniques improve manufacturing yields and enable multiple product flavors from the same silicon die.
- Published
- 2015
13. Active Filter-Based Hybrid On-Chip DC–DC Converter for Point-of-Load Voltage Regulation
- Author
-
Simon M. Tam, Bruce C. McDermott, Eby G. Friedman, Selcuk Kose, and S. Pinzon
- Subjects
Engineering ,Low-dropout regulator ,business.industry ,Voltage divider ,Hardware_PERFORMANCEANDRELIABILITY ,Voltage regulator ,Hardware_GENERAL ,Hardware and Architecture ,Dropout voltage ,Boost converter ,Hardware_INTEGRATEDCIRCUITS ,Electronic engineering ,Voltage multiplier ,Voltage regulation ,Electrical and Electronic Engineering ,business ,Software ,Voltage converter - Abstract
An active filter-based on-chip DC-DC voltage converter for application to distributed on-chip power supplies in multivoltage systems is described in this paper. No inductor or output capacitor is required in the proposed converter. The area of the voltage converter is therefore significantly less than that of a conventional low-dropout (LDO) regulator. Hence, the proposed circuit is appropriate for point-of-load voltage regulation for noise sensitive portions of an integrated circuit. The performance of the circuit has been verified with Cadence Spectre simulations, and the circuit has been fabricated in a commercial 110 nm complementary metal oxide semiconductor (CMOS) technology. The voltage regulator occupies 0.015 mm² and delivers up to 80 mA of output current. The transient response with no output capacitor ranges from 72 to 192 ns. The parameter sensitivity of the active filter is also described. The advantages and disadvantages of the active filter-based, conventional switching, linear, and switched capacitor voltage converters are compared. The proposed circuit is an alternative to classical LDO voltage regulators, providing a means for distributing multiple local power supplies across an integrated circuit while maintaining high current efficiency and fast response time within a small area.
- Published
- 2013
14. Low-Cost Dynamic Compensation Scheme for Local Clocks of Next Generation High Performance Microprocessors
- Author
-
Simon M. Tam, Tak M. Mak, Martin Omaña, and Cecilia Metra
- Subjects
Scheme (programming language) ,Engineering ,business.industry ,Skew ,HIGH PERFORMANCE MICROPROCESSORS ,CLOCK COMPENSATION SCHEME ,Compensation (engineering) ,Application-specific integrated circuit ,Hardware and Architecture ,Duty cycle ,Power consumption ,Low-power electronics ,Electronic engineering ,Overhead (computing) ,Electrical and Electronic Engineering ,business ,computer ,Software ,computer.programming_language - Abstract
We propose a low cost scheme for the dynamic compensation in the field of undesired skew and duty cycle variations of local clocks of high performance microprocessors and high end ASICs. Compared to alternate approaches, our solution features lower power consumption, smaller compensation error, and a lower or comparable area overhead.
- Published
- 2011
15. A 45 nm 8-Core Enterprise Xeon® Processor
- Author
-
Sailesh Kottapalli, Sujal Vora, Stefan Rusu, Matt Ratta, Raj Varada, Harry Muljono, J. Stinson, J. Chang, Simon M. Tam, and David J. Ayers
- Subjects
Engineering ,Hardware_MEMORYSTRUCTURES ,Xeon ,CPU cache ,business.industry ,Hardware_PERFORMANCEANDRELIABILITY ,Voltage regulator ,law.invention ,Microprocessor ,CMOS ,law ,Embedded system ,Power semiconductor device ,Cache ,Electrical and Electronic Engineering ,business ,Sleep mode - Abstract
This paper describes a 2.3 Billion transistors, 8-core, 16-thread, 64-bit Xeon® EX processor with a 24 MB shared L3 cache implemented in a 45 nm nine-metal process. Multiple clock and voltage domains are used to reduce power consumption. Long channel devices and cache sleep mode are used to minimize leakage. Core and cache recovery improve manufacturing yields and enable multiple product flavors from the same silicon die. The disabled blocks are both clock and power gated to minimize their power consumption. Idle power is reduced by shutting off the unterminated I/O links and shedding phases in the voltage regulator to improve the power conversion efficiency.
- Published
- 2010
16. A 65-nm Dual-Core Multithreaded Xeon® Processor With 16-MB L3 Cache
- Author
-
B. Cherkauer, Rahul Limaye, J. Chang, Harry Muljono, J. Stinson, Sujal Vora, John Benoit, Simon M. Tam, Justin Leung, Raj Varada, David J. Ayers, and Stefan Rusu
- Subjects
Hardware_MEMORYSTRUCTURES ,Xeon ,CPU cache ,Computer science ,business.industry ,Pipeline burst cache ,Hardware_PERFORMANCEANDRELIABILITY ,Parallel computing ,Uncore ,Smart Cache ,Logic gate ,Embedded system ,Hardware_INTEGRATEDCIRCUITS ,Electrical and Electronic Engineering ,business ,Cache algorithms - Abstract
This paper describes a dual-core 64-b Xeon MP processor implemented in a 65-nm eight-metal process. The 435-mm² die has 1.328-B transistors. Each core has two threads and a unified 1-MB L2 cache. The 16-MB shared, 16-way set-associative L3 cache implements both sleep and shut-off leakage reduction modes. Long channel transistors are used to reduce subthreshold leakage in cores and uncore (all portions of the die that are outside the cores) control logic. Multiple voltage and clock domains are employed to reduce power
- Published
- 2007
17. Low-cost on-chip clock jitter measurement scheme
- Author
-
Daniele Rossi, Daniele Giaffreda, Simon M. Tam, Asifur Rahman, Martin Omaña, Tak M. Mak, and Cecilia Metra
- Subjects
Clock jitter ,high performance microprocessor ,jitter measurement ,Computer science ,Noise (signal processing) ,HIGH PERFORMANCE MICROPROCESSORS ,Ring oscillator ,Power (physics) ,law.invention ,Microprocessor ,clock jitter ,Hardware and Architecture ,law ,Electronic engineering ,Overhead (computing) ,System on a chip ,Node (circuits) ,Electrical and Electronic Engineering ,Software ,Jitter - Abstract
In this paper we present a low cost, on-chip clock jitter digital measurement scheme for high performance microprocessors. It enables in-situ jitter measurement during the test or debug phase. It provides very high measurement resolution and accuracy, despite the possible presence of power supply noise (representing a major source of clock jitter), at low area and power costs. The achieved resolution is scalable with technology node and can in principle be increased as much as desired, at low additional costs in terms of area overhead and power consumption. We show that, for the case of high performance microprocessors employing Ring Oscillators (ROs) to measure process parameter variations, our jitter measurement scheme can be implemented by re-using part of such ROs, thus allowing to measure clock jitter with very limited cost increase compared to process parameter variation measurement only, and with no impact on parameter variation measurement resolution.
- Published
- 2015
18. A 130-nm triple-Vt 9-MB third-level on-die cache for the 1.7-GHz Itanium® 2 processor
- Author
-
Stefan Rusu, Ming Huang, G. Leong, Simon M. Tam, M. Haque, Sarvesh H. Kulkarni, K. Desai, Jonathan Shoemaker, R. Goe, J. Chang, Siufu Chiu, M. Karim, and Kevin Truong
- Subjects
Microprocessor ,CPU cache ,business.industry ,law ,Computer science ,Hardware_INTEGRATEDCIRCUITS ,Itanium ,Cache ,Parallel computing ,Electrical and Electronic Engineering ,business ,Computer hardware ,law.invention - Abstract
The 18-way set-associative, single-ported 9 MB cache for the Itanium 2 processor uses 210 identical 48-kB sub-arrays with a 2.21-µm² cell in a 130-nm 6-metal technology. The processor runs at 1.7 GHz at 1.35 V and dissipates 130 W. The 432-mm² die contains 592 M transistors, the largest transistor count reported for a microprocessor. This paper reviews circuit design and implementation details for the L3 cache data and tag arrays. The staged mode ECC scheme avoids a latency increase in the L3 tag. A high Vt implant improves the read stability and reduces the sub-threshold leakage.
- Published
- 2005
19. Clock Generation and Distribution for the 130-nm Itanium® 2 Processor With 6-MB On-Die L3 Cache
- Author
-
U.N. Desai, Simon M. Tam, and Rahul Limaye
- Subjects
Synchronous circuit ,Computer science ,Clock signal ,Underclocking ,Clock rate ,Skew ,Static timing analysis ,Clock gating ,Digital clock manager ,Parallel computing ,Clock skew ,Clock domain crossing ,Hardware_ARITHMETICANDLOGICSTRUCTURES ,Electrical and Electronic Engineering ,CPU multiplier ,Asynchronous circuit - Abstract
The clock generation and distribution system for the 130-nm Itanium 2 processor operates at 1.5 GHz with a skew of 24 ps. The Itanium 2 processor features 6 MB of on-die L3 cache and has a die size of 374 mm². Fuse-based clock de-skew enables post-silicon clock optimization to gain higher frequency. This paper describes the clock generation, global clock distribution, local clocking, and the clock skew optimization feature.
- Published
- 2004
20. A 1.5-GHz 130-nm Itanium 2 processor with 6-MB on-die L3 cache
- Author
-
Simon M. Tam, Justin Leung, Stefan Rusu, B. Cherkauer, J. Stinson, and Harry Muljono
- Subjects
CPU cache ,Computer science ,Circuit design ,Design for testing ,Mixed-signal integrated circuit ,Hardware_PERFORMANCEANDRELIABILITY ,Integrated circuit design ,Parallel computing ,Circuit extraction ,Explicitly parallel instruction computing ,Hardware_INTEGRATEDCIRCUITS ,Itanium ,Cache ,Electrical and Electronic Engineering ,Physical design - Abstract
This 130-nm Itanium 2 processor implements the explicitly parallel instruction computing (EPIC) architecture and features an on-die 6-MB 24-way set-associative level-3 cache. The 374-mm² die contains 410 M transistors and is implemented in a dual-Vt process with six Cu interconnect layers and FSG dielectric. The processor runs at 1.5 GHz at 1.3 V and dissipates a maximum of 130 W. This paper reviews circuit design and package details, power delivery, the reliability, availability, and serviceability (RAS) features, design for test (DFT), and design for manufacturability (DFM) features, as well as an overview of the design and verification methodology. The fuse-based clock deskew circuit achieves 24-ps skew across the entire die, while the scan-based skew control further reduces it to 7 ps. The 128-bit front-side bus has a bandwidth of 6.4 GB/s and supports up to four processors on a single bus.
- Published
- 2003
21. Clock generation and distribution for the first IA-64 microprocessor
- Author
-
Ian A. Young, Stefan Rusu, R. Kim, U. Nagarji Desai, Ji Zhang, and Simon M. Tam
- Subjects
Synchronous circuit ,Clock signal ,Computer science ,business.industry ,Underclocking ,Clock rate ,Skew ,Static timing analysis ,Clock gating ,Integrated circuit design ,Digital clock manager ,Clock skew ,Clock synchronization ,Timing failure ,law.invention ,Microprocessor ,Logic synthesis ,Clock domain crossing ,law ,Embedded system ,Master clock ,Electrical and Electronic Engineering ,business ,CPU multiplier ,Asynchronous circuit - Abstract
The clock design for the first implementation of the IA-64 microprocessor is presented. A clock distribution with an active distributed deskewing technique is used to achieve a low skew of 28 ps. This technique is capable of compensating skews caused by within-die process variations that are becoming a significant factor of the clock design. The global, regional and local clock distributions are described. A multilevel skew budget and local clock timing methodology are used to enable a high-performance design by providing support for intentional clock skew injection and time borrowing. By providing a test access port interface to the deskew architecture and the incorporation of the on-die-clock-shrink, this design is equipped with two very powerful post-silicon timing debug tools that are critical to high-performance microprocessor design and enabled quick time-to-market.
- Published
- 2000
22. New Design For Testability Approach for Clock Fault Testing
- Author
-
Simon M. Tam, Martin Omaña, Tak M. Mak, and Cecilia Metra
- Subjects
business.industry ,Computer science ,Underclocking ,Design for testing ,MANUFACTURING TEST ,Clock gating ,Digital clock manager ,Clock skew ,CLOCK BUFFER ,Timing failure ,Fault detection and isolation ,Theoretical Computer Science ,law.invention ,Microprocessor ,Computational Theory and Mathematics ,Application-specific integrated circuit ,Hardware and Architecture ,law ,Embedded system ,HIGH PERFORMANCE MICROPROCESSOR ,CLOCK FAULTS ,business ,Software ,CPU multiplier - Abstract
We propose a new design for testability approach for testing clock faults of next generation high performance microprocessors. In fact, it has been shown that conventional manufacturing test is unable to guarantee their detection, although they could compromise the effectiveness of delay fault testing, as well as the microprocessor correct operation in the field. These conditions will of course worsen with technology scaling, due to the expected increase in fault likelihood, including clock faults. To deal with these problems we propose a design for testability approach that, by means of simple modifications to conventional clock buffers, allows clock fault detection through any conventional manufacturing test approach. This is achieved at the cost of very low increase in area and power consumption of clock buffers, and with no additional test cost or impact on the microprocessor performance and in-field operation. We then introduce a possible further modification to clock buffers that, at additional limited costs in terms of area and power consumption, allows their calibration after fabrication in order to compensate for parameter variations possibly occurring during manufacturing, thus minimizing the likelihood of either false test fails, or test misses. As an example, we show the application of our approach to the clock distribution network of the Pentium® 4 microprocessor (Other names and brands may be claimed as property of others). However, it can be applied to the clock distribution of any high performance ASIC, or microprocessor.
- Published
- 2012
23. Implementation and performance of an analog nonvolatile neural network
- Author
-
Mark A. Holler, Hernan A. Castro, and Simon M. Tam
- Subjects
Artificial neural network ,Computer science ,Integrated circuit ,Chip ,Surfaces, Coatings and Films ,law.invention ,Synaptic weight ,CMOS ,Hardware and Architecture ,law ,Signal Processing ,Operational amplifier ,Electronic engineering ,Gain stage ,Electronic circuit - Abstract
An integrated circuit implementation of a fully parallel analog artificial neural network is presented. We include details of the architecture, some of the important design considerations, a description of the circuits and finally actual performance data. The electrically trainable artificial neural network (ETANN) chip incorporates 64 analog neurons and 10,240 analog synapses and utilizes a 1-µm CMOS NVM process. The network calculates the dot product between a 64-element analog input vector and a 64 × 64 nonvolatile (EEPROM based) analog synaptic weight array. These calculations occur at a rate in excess of 1.3 billion interconnections per second. All elements of the computation are stored and calculated in the analog domain and strictly in parallel. A 2:1 input and neuron multiplex mode permits rates in excess of 2 billion interconnections per second and a single-chip effective network size of 64 inputs by 128 outputs. The ETANN incorporates differential signal techniques throughout for improved noise rejection. Current summing is employed for the sum of products calculations. The chip integrates approximately 400 op amps, including variable gain stages of from 20 to 54 dB. Inevitable component to component variations due to the use of minimum dimension elements are found not to be significant for operation in an adaptive environment.
- Published
- 1993
24. On-die Ring Oscillator Based Measurement Scheme for Process Parameter Variations and Clock Jitter
- Author
-
Martin Omaña, Simon M. Tam, Daniele Giaffreda, Cecilia Metra, Asifur Rahman, and Tak M. Mak
- Subjects
Control theory ,Computer science ,Clock domain crossing ,Phase (waves) ,Electronic engineering ,Digital clock manager ,Process variable ,Ring oscillator ,Clock skew ,Jitter ,Power (physics) - Abstract
We present a novel low cost scheme for the on-die measurement of either clock jitter, or process parameter variations. By re-using and properly modifying the Ring Oscillators (ROs) that are currently widely employed for process parameter variation measurement in high performance microprocessors, our proposed scheme can be easily set in either the process parameter variation measurement mode, or the clock jitter measurement mode, by acting on an external control signal. This way, during the test or debug phase, clock jitter can also be measured at negligible area and power costs with respect to process parameter variation measurement only. Our scheme is scalable in the provided clock jitter measurement resolution, while allowing the same process parameter variation measurement resolution as the currently employed RO based schemes. Moreover, due to its allowing both process parameter variation and clock jitter measurements, our scheme features accurate clock jitter measurement despite the possible presence of significant process parameter variations.
- Published
- 2010
25. Modern Clock Distribution Systems
- Author
-
Simon M. Tam
- Subjects
Variable (computer science) ,Traverse ,Computer science ,Clock rate ,Electronic engineering ,Master clock ,Digital clock manager ,Chip ,Clock synchronization ,Clock network - Abstract
Modern clock distribution design continues to face challenges in spite of significant advances in the last decade. We can distinguish three primary challenges. The first is the need to support higher clock frequencies based on the strong correlation between frequency and chip performance. Figure 2.1 shows the processor clock frequency trend, suggesting a continuous exponential increase in clock frequency with variable rates. Second, process technology scaling allows higher levels of integration and larger die sizes, leading to higher clock loading and larger distances the clock network needs to traverse. The final challenge is that technology scaling leads to an increase in on-die variations that may degrade clock performance if not properly addressed.
- Published
- 2009
26. Novel On-Chip Clock Jitter Measurement Scheme For High Performance Microprocessors
- Author
-
Martin Omaña, Asifur Rahman, Cecilia Metra, Simon M. Tam, and Tak M. Mak
- Subjects
Microprocessor ,Computer science ,law ,Underclocking ,Electronic engineering ,Clock gating ,System on a chip ,Digital clock manager ,Clock skew ,law.invention ,Jitter ,CPU multiplier - Abstract
In this paper we present an on-chip clock jitter digital measurement scheme for high performance microprocessors. The scheme enables in-situ jitter measurement of the clock distribution network during the test or the debug phase. It provides very high measurement resolution, despite the possible presence of power supply noise (constituting a major cause of clock jitter) affecting the scheme itself. The resolution is higher than a minimum-sized inverter input-output delay, and can in principle be further increased, at some additional costs in terms of area overhead and power consumption. In this paper, a resolution of 1.8% of the clock period is achieved with limited area and power costs.
- Published
- 2008
27. A 65nm 95W Dual-Core Multi-Threaded Xeon® Processor with L3 Cache
- Author
-
David J. Ayers, Simon M. Tam, Stefan Rusu, J. Chang, Sujal Vora, and B. Cherkauer
- Subjects
Smart Cache ,Snoopy cache ,Xeon ,business.industry ,Cache coloring ,CPU cache ,Computer science ,Pipeline burst cache ,Cache ,business ,MESIF protocol ,Computer hardware - Abstract
This paper describes a 95 W dual-core 64-bit Xeon® MP processor implemented in a 65 nm 8 metal layer process. Each processor core has a unified 1MB L2 cache and supports the Intel® Extended Memory 64 Technology and the Hyper-Threading Technology. The shared L3 cache has extensive RAS features including the Intel® Cache Safe Technology and Error Correction Codes (ECC). The processor is designed and optimized to operate at a 95W thermal design power envelope at the target product frequency. The front-side bus operates at 667 MT/s or 800 MT/s in a 3 load topology that is compatible with existing platforms.
- Published
- 2006
28. Clock Generation and Distribution of a Dual-Core Xeon Processor with 16MB L3 Cache
- Author
-
M. Adachi, Rahul Limaye, Sujal Vora, Simon M. Tam, Justin Leung, and S. Choy
- Subjects
CPU cache ,Computer science ,Underclocking ,Clock rate ,Matrix clock ,Pipeline burst cache ,Clock gating ,Parallel computing ,Digital clock manager ,Clock skew ,Clock synchronization ,Clock domain crossing ,ComputerSystemsOrganization_MISCELLANEOUS ,Hardware_ARITHMETICANDLOGICSTRUCTURES ,CPU multiplier - Abstract
The clock generation and hybrid clock distribution for a dual-core Xeon® processor with 16MB L3 cache are designed for
- Published
- 2006
29. Clock generation and distribution for the third generation Itanium® processor
- Author
-
Simon M. Tam, U. Desai, and Rahul Limaye
- Subjects
Distribution system ,Engineering ,CMOS ,Distribution (number theory) ,business.industry ,Skew ,Itanium ,Parallel computing ,Digital clock manager ,Hardware_ARITHMETICANDLOGICSTRUCTURES ,business ,Third generation ,PATH (variable) - Abstract
The clock generation and distribution system for the third generation Itanium® processor operates at 1.5 GHz with a skew of 24 ps. Clock optimization fuses enable post-silicon speed path balancing for higher performance.
- Published
- 2004
30. Learning on an analog VLSI neural network chip
- Author
-
Simon M. Tam, B. Gupta, H.A. Castro, and M. Holler
- Subjects
Very-large-scale integration ,Scheme (programming language) ,Theoretical computer science ,Artificial neural network ,Computer science ,business.industry ,Process (computing) ,business ,Chip ,computer ,Backpropagation ,Computer hardware ,computer.programming_language - Abstract
The issues associated with implementing the error backpropagation algorithm on a 64-neuron nonvolatile analog VLSI neural network chip (ETANN) are described. Imperfections in the analog ETANN chip were identified and found to impose constraints on the learning process. A chip-in-the-loop learning technique and an adaptive, reinforced, bake-train-bake scheme are reported. These techniques have shown potential in surmounting the difficulties connected with learning on an analog neural network chip. Experimental results are reported.
- Published
- 2002
31. Itanium processor clock design
- Author
-
Stefan Rusu, Robert Kim, Ji Zhang, Utpal Desai, and Simon M. Tam
- Subjects
business.industry ,Computer science ,Clock rate ,Performance tuning ,Skew ,Digital clock manager ,law.invention ,Microprocessor ,law ,Itanium ,IA-64 ,business ,Computer hardware ,CPU multiplier - Abstract
The Itanium processor is Intel's first 64-bit microprocessor [1] and features a highly parallel architecture fabricated using the 0.18 µm process. This higher integration of features requires a significant silicon real estate and high clock loading. These factors, coupled with more prominent on-die variations because of reduced device geometries, call for special techniques to manage the clock design. The Itanium processor employs very well balanced clock routing along with distributed deskew buffers (DSK) to achieve low skew. The Itanium™ processor also includes additional features to aid performance tuning and timing debug. This paper highlights the salient features of the Itanium processor clock design and presents clock characterization data from initial silicon.
- Published
- 2000
32. Width dependence of substrate and gate currents in MOSFET's
- Author
-
Simon M. Tam, Chenming Hu, P.K. Ko, and T.-C. Ong
- Subjects
Materials science ,Field (physics) ,Equivalent series resistance ,Dopant ,business.industry ,Electrical engineering ,Substrate (electronics) ,Electronic, Optical and Magnetic Materials ,MOSFET ,Optoelectronics ,LOCOS ,Electrical and Electronic Engineering ,business ,AND gate ,Voltage drop - Abstract
Width dependence of hot-electron currents in MOSFET's fabricated with LOCOS, non-LOCOS, and modified LOCOS processes is studied. The experimental results show that the substrate and gate currents are apparently enhanced in narrow width devices. The enhancement, however, is due to different voltage drops across the source-drain series resistance. The voltage drops are usually larger in wider devices. After correcting for the resistance effect, the substrate and gate currents scale with the device width. With this typical LOCOS process, the bird's beak and in-diffusion of field implant dopants do not cause excess hot-electron activities along the channel/field edges as has been suspected. Some other LOCOS process could, of course, produce a different result. Studies using wide test devices must consider the series resistance effect. With this precaution taken, models derived from wide-channel data will be applicable to narrow-channel devices, at least for some processes.
- Published
- 1985
33. Hot-electron-induced photon and photocarrier generation in Silicon MOSFET's
- Author
-
Chenming Hu and Simon M. Tam
- Subjects
Physics ,business.industry ,Bremsstrahlung ,Biasing ,Electron ,Electronic, Optical and Magnetic Materials ,Impact ionization ,MOSFET ,Optoelectronics ,Field-effect transistor ,Charge carrier ,Electrical and Electronic Engineering ,Atomic physics ,business ,NMOS logic - Abstract
The phenomenon of and the physical mechanisms for the generation of minority carriers in the substrate of NMOS and CMOS are studied. Secondary impact ionization is not responsible. The responsible mechanisms are hot-electron-induced photocarrier generation and, under extreme conditions, forward biasing of the source-substrate junction. The photon generation is believed to be due to the bremsstrahlung of the channel hot electrons. A theoretical model based on the lucky electron concept and the bremsstrahlung mechanism is proposed. The calculated characteristics of photon generation agree well with experimental results. About 2 × 10⁻⁵ photogenerated minority carriers are generated for every (primary) impact-ionization event in NMOSFET. Photocarrier-induced leakage current can be fitted with either an inverse square dependence on distance or an exponential dependence with an effective decay length of about 780 µm.
- Published
- 1984
34. Lucky-electron model of channel hot-electron injection in MOSFET's
- Author
-
P.K. Ko, Chenming Hu, and Simon M. Tam
- Subjects
Free electron model ,Channel length modulation ,Chemistry ,MOSFET ,Electronic engineering ,Electron temperature ,Field-effect transistor ,Short-channel effect ,Electron ,Electrical and Electronic Engineering ,Electronic, Optical and Magnetic Materials ,Hot-carrier injection ,Computational physics - Abstract
The lucky-electron concept is successfully applied to the modeling of channel hot-electron injection in n-channel MOSFET's, although the result can be interpreted in terms of electron temperature as well. This results in a relatively simple expression that can quantitatively predict channel hot-electron injection current in MOSFET's. The model is compared with measurements on a series of n-channel MOSFET's and good agreement is achieved. In the process, new values for many physical parameters such as hot-electron scattering mean-free-path, impact-ionization energy are determined. Of perhaps even greater practical significance is the quantitative correlation between the gate current and the substrate current that this model suggests.
- Published
- 1984