31 results on '"Cliff Sze"'
Search Results
2. Obstacle-Avoiding and Slew-Constrained Clock Tree Synthesis With Efficient Buffer Insertion
- Author
-
Qiang Zhou, Yici Cai, Feifei Niu, Chao Deng, Hailong Yao, and Cliff Sze
- Subjects
Very-large-scale integration, Reduction (complexity), Hardware and Architecture, Computer science, Skew, Topology (electrical circuits), Integrated circuit design, Parallel computing, Electrical and Electronic Engineering, Routing (electronic design automation), Algorithm, Software
- Abstract
As VLSI technology continues to scale down, buffered clock tree synthesis (CTS) has become increasingly critical to producing high-performance synchronous chip designs. This paper presents a novel obstacle-avoiding CTS approach that satisfies slew constraints and corrects signal polarity. We build a look-up table through NGSPICE simulation to obtain accurate buffer delay and slew, which guarantees that the final skew after NGSPICE simulation is as satisfactory as expected. Aiming at skew optimization under slew and obstacle constraints, our CTS approach features a clock tree construction stage with an obstacle-aware topology generation algorithm called OBB, balanced insertion of candidate buffer positions, and a fast heuristic buffer insertion algorithm. By taking an overall view of obstacles to explore the global optimization space, our CTS approach effectively overcomes the negative influence of obstacles on skew. Experimental results show the effectiveness of our CTS approach, which improves skew and latency by 69.0% and 72.0% on average, respectively. In addition, the accuracy of the look-up table is demonstrated by an average skew reduction of 87.3%. Moreover, our OBB heuristic algorithm obtains a 53.2% improvement in skew over the classic balanced bipartition algorithm.
- Published
- 2015
- Full Text
- View/download PDF
3. Challenges and Future Directions of 3D Physical Design
- Author
-
Jens Lienig, Johann Knechtel, and Cliff Sze
- Subjects
Computer science, Systems engineering, Physical design
- Published
- 2017
- Full Text
- View/download PDF
4. Fast and Highly Scalable Bayesian MDP on a GPU Platform
- Author
-
Frank Liu, Jiang Hu, Cliff Sze, He Zhou, and Sunil P. Khatri
- Subjects
0301 basic medicine, 050208 finance, Speedup, Computer science, business.industry, 05 social sciences, Bayesian probability, Duplex (telecommunications), Parallel computing, Solver, Machine learning, computer.software_genre, 03 medical and health sciences, 030104 developmental biology, 0502 economics and business, Computer data storage, Scalability, Markov decision process, Artificial intelligence, business, computer, Curse of dimensionality
- Abstract
By employing the Optimal Bayesian Robust (OBR) policy, Bayesian Markov Decision Process (BMDP) can be used to solve the Gene Regulatory Network (GRN) control problem. However, due to the "curse of dimensionality", the data storage limitation hinders the practical applicability of the BMDP. To overcome this impediment, we propose a novel Duplex Sparse Storage (DSS) scheme in this paper, and develop a BMDP solver with the DSS scheme on a heterogeneous GPU-based platform. The simulation results demonstrate that our approach achieves a 5x reduction in memory utilization with a 2.4% "decision difference" and an average speedup of 4.1x compared to the full matrix based storage scheme. Additionally, we present the tradeoff between the runtime and result accuracy for our DSS techniques versus the full matrix approach. We also compare our results with the well known Compressed Sparse Row (CSR) approach for reducing memory utilization, and discuss the benefits of DSS over CSR.
- Published
- 2017
- Full Text
- View/download PDF
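For readers unfamiliar with the Compressed Sparse Row (CSR) baseline named in the abstract above, the following is a minimal Python sketch of the generic CSR layout and a matrix-vector product over it. It is not the paper's DSS scheme or its GPU solver, whose details are not given here; the function names `to_csr` and `csr_matvec` are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def to_csr(dense: Sequence[Sequence[float]]) -> Tuple[List[float], List[int], List[int]]:
    """Convert a dense row-major matrix to CSR arrays (values, col_idx, row_ptr)."""
    values: List[float] = []
    col_idx: List[int] = []
    row_ptr: List[int] = [0]
    for row in dense:
        for j, x in enumerate(row):
            if x != 0:
                values.append(x)   # store only the nonzero entries
                col_idx.append(j)  # remember the column of each stored entry
        row_ptr.append(len(values))  # running count marks where each row ends
    return values, col_idx, row_ptr


def csr_matvec(values: Sequence[float], col_idx: Sequence[int],
               row_ptr: Sequence[int], vec: Sequence[float]) -> List[float]:
    """Multiply a CSR matrix by a dense vector."""
    result: List[float] = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        # Entries of row i live in positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * vec[col_idx[k]]
        result.append(acc)
    return result
```

For a mostly empty transition matrix, the three CSR arrays grow with the number of nonzero entries rather than with the square of the state count, which is the memory saving the abstract refers to when it compares storage schemes.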
5. Techniques for scalable and effective routability evaluation
- Author
-
Yaoguang Wei, Douglas Keller, Cliff Sze, Charles J. Alpert, Lakshmi Reddy, Sachin S. Sapatnekar, Zhuo Li, Natarajan Viswanathan, Gustavo E. Tellez, and Andrew D. Huber
- Subjects
Router, Computer science, Distributed computing, media_common.quotation_subject, Real-time computing, Fidelity, Computer Graphics and Computer-Aided Design, Computer Science Applications, Routing congestion, Scalability, Hardware_INTEGRATEDCIRCUITS, Electrical and Electronic Engineering, Physical design, Design cycle, Design closure, Smoothing, media_common
- Abstract
Routing congestion has become a critical layout challenge in nanoscale circuits since it is a critical factor in determining the routability of a design. An unroutable design is not useful even though it closes on all other design metrics. Fast design closure can only be achieved by accurately evaluating whether a design is routable or not early in the design cycle. Lately, it has become common to use a “light mode” version of a global router to quickly evaluate the routability of a given placement. This approach suffers from three weaknesses: (i) it does not adequately model local routing resources, which can cause incorrect routability predictions that are only detected late, during detailed routing; (ii) the congestion maps obtained by it tend to have isolated hotspots surrounded by noncongested spots, called “noisy hotspots”, which further affects the accuracy in routability evaluation; and (iii) the metrics used to represent congestion may yield numbers that do not provide sufficient intuition to the designer, and moreover, they may often fail to predict the routability accurately. This article presents solutions to these issues. First, we propose three approaches to model local routing resources. Second, we propose a smoothing technique to reduce the number of noisy hotspots and obtain a more accurate routability evaluation result. Finally, we develop a new metric which represents congestion maps with higher fidelity. We apply the proposed techniques to several industrial circuits and demonstrate that one can better predict and evaluate design routability and that congestion mitigation tools can perform much better to improve the design routability.
- Published
- 2014
- Full Text
- View/download PDF
6. GPU acceleration for Bayesian control of Markovian genetic regulatory networks
- Author
-
Cliff Sze, Mohammadmahdi R. Yousefi, Jiang Hu, Sunil P. Khatri, Frank Liu, and He Zhou
- Subjects
Speedup, Theoretical computer science, Computational complexity theory, Computer science, Bayesian probability, Brute-force search, Markov process, 020207 software engineering, 02 engineering and technology, 010501 environmental sciences, 01 natural sciences, System dynamics, symbols.namesake, Robustness (computer science), 0202 electrical engineering, electronic engineering, information engineering, symbols, Markov decision process, 0105 earth and related environmental sciences
- Abstract
A recently developed approach to precision medicine is the use of Markov Decision Processes (MDPs) on Gene Regulatory Networks (GRNs). Due to very limited information on the system dynamics of GRNs, the MDP must repeatedly conduct an exhaustive search for a non-stationary policy, and thus entails exponential computational complexity. This has hindered its practical application to date. With the goal of overcoming this obstacle, we investigate acceleration techniques using the Graphics Processing Unit (GPU) platform, which allows massive parallelism. Our GPU-based acceleration techniques are applied to two different MDP approaches: the optimal Bayesian robust (OBR) policy and the forward search sparse sampling (FSSS) method. Simulation results demonstrate that our techniques achieve a speedup of two orders of magnitude over sequential implementations. In addition, we present a study on the memory utilization and error trends of these techniques.
- Published
- 2016
- Full Text
- View/download PDF
7. Physical Synthesis with Clock-Network Optimization for Large Systems on Chips
- Author
-
Charles J. Alpert, Zhuo Li, David A. Papa, Gi-Joon Nam, Cliff Sze, Natarajan Viswanathan, and Igor L. Markov
- Subjects
Cloning (programming), Computer science, business.industry, Hardware_PERFORMANCEANDRELIABILITY, Timing closure, Clock network, Computer architecture, Hardware and Architecture, Logic gate, Embedded system, Hardware_INTEGRATEDCIRCUITS, System on a chip, Physical synthesis, Electrical and Electronic Engineering, Network synthesis filters, business, Software, Hardware_LOGICDESIGN, Degradation (telecommunications)
- Abstract
In traditional physical-synthesis methodologies, the placement of flip-flops and latches is problematic, especially for large systems on chips. A next-generation electronic-design-automation methodology improves timing closure through clock-network synthesis and placement of flip-flops and latches to avoid timing disruptions or immediately recover from them. When evaluated on large CPU designs, the methodology saw double-digit improvements in timing, wirelength, and area versus current technology.
- Published
- 2011
- Full Text
- View/download PDF
8. PACMAN
- Author
-
Haifeng Qian, Joseph N. Kozhaya, Cliff Sze, Zhuo Li, Charles J. Alpert, Phillip J. Restle, Joseph J. Palumbo, and Nancy Zhou
- Subjects
Computer Science::Hardware Architecture, Clock domain crossing, Computer science, Clock drift, Electronic engineering, Clock gating, Hardware_PERFORMANCEANDRELIABILITY, Digital clock manager, Hardware_ARITHMETICANDLOGICSTRUCTURES, Clock skew, Timing failure, CPU multiplier, Clock network
- Abstract
The clock grid is a mainstream clock network methodology for high-performance microprocessor and SOC designs. Clock skew, power usage, and robustness to PVT (process, voltage, temperature) variation are all important metrics for a high-quality clock grid design. The tree-driven-grid clock network is a typical clock grid network. It includes a clock source, a buffered tree, leaf buffers, a mesh clock grid, local clock buffers, and latches, as shown in Fig. 1. For such a network, one big challenge is how to connect the leaf-level buffers of the global tree to the grid with nonuniform loads under tight slew and skew constraints. The choice of tapping points that connect the leaf buffers to the clock grid is critical to the quality of the clock design. Good tapping points can minimize the clock skew and reduce power. In this paper, we propose a new algorithm that selects the tapping points so as to build the global tree as regular and symmetric as possible. In our experimental results, the proposed algorithm efficiently reduces global clock skew, rising slew, maximum overshoot, and power, and avoids local skew violations.
- Published
- 2014
- Full Text
- View/download PDF
9. Session details: Welcome and Monday keynote address
- Author
-
Cliff Sze
- Subjects
Computer science, Library science, Session (computer science)
- Published
- 2014
- Full Text
- View/download PDF
10. Session details: Commemoration for Dr. Bryan Preas
- Author
-
Cliff Sze
- Subjects
Computer science, Library science, Session (computer science)
- Published
- 2014
- Full Text
- View/download PDF
11. Routing congestion estimation with real design constraints
- Author
-
Zhuo Li, Cliff Sze, Charles J. Alpert, Yih-Lang Li, Yaoguang Wei, Natarajan Viswanathan, and Wen-Hao Liu
- Subjects
Dynamic Source Routing, Mathematical optimization, Engineering, Static routing, Equal-cost multi-path routing, business.industry, Policy-based routing, Real-time computing, Link-state routing protocol, Multipath routing, Hardware_INTEGRATEDCIRCUITS, Destination-Sequenced Distance Vector routing, business, Hierarchical routing
- Abstract
To address the routability issue, routing congestion estimators (RCEs) have become essential in industrial design flows. Recently, several RCEs [1-4] based on global routing engines have been developed, but they typically ignore the effects of routing on timing, so the identified routing paths may be overlong and thus impractical. To account for timing, our proposed global-routing-based RCE obeys layer directives and scenic constraints, which respectively limit the routing layers and the maximum routing wirelength of potentially timing-critical nets. To handle the scenic constraints, we propose a novel method based on a relaxation-legalization scheme. Also, because the work in [5] reveals that congestion ratio is a better indicator than overflow for evaluating routability, this work focuses on minimizing the congestion ratio rather than overflows. As will be shown, minimizing the congestion ratio is more complicated than minimizing overflows, so we develop a new rip-up and rerouting scheme to reduce congestion and approach a target congestion ratio. Moreover, to meet the demands of practical use, this work presents a control utility to trade off runtime and quality, which is an essential function of an industrial RCE tool. Experiments reveal that the proposed RCE is faster and more accurate than another industrial global-routing-based RCE.
- Published
- 2013
- Full Text
- View/download PDF
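The abstract above contrasts overflow with congestion ratio as routability indicators. The following Python sketch illustrates the difference under commonly used per-edge definitions, which are assumptions here and may differ from the paper's exact formulation: overflow is the routing demand in excess of capacity, and congestion ratio is demand divided by capacity. The function names and the `(demand, capacity)` edge representation are hypothetical.

```python
from typing import Iterable, Tuple


def edge_overflow(demand: float, capacity: float) -> float:
    """Assumed definition: tracks demanded beyond what the edge can carry."""
    return max(0.0, demand - capacity)


def congestion_ratio(demand: float, capacity: float) -> float:
    """Assumed definition: demand relative to capacity (1.0 means exactly full)."""
    if capacity <= 0:
        # A blocked edge is infinitely congested if anything is routed on it.
        return float("inf") if demand > 0 else 0.0
    return demand / capacity


def summarize(edges: Iterable[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (total overflow, worst congestion ratio) over (demand, capacity) pairs."""
    total = 0.0
    worst = 0.0
    for demand, capacity in edges:
        total += edge_overflow(demand, capacity)
        worst = max(worst, congestion_ratio(demand, capacity))
    return total, worst
```

Under these definitions, an edge at 8 of 10 tracks has zero overflow but a ratio of 0.8, so the ratio still distinguishes nearly full edges from empty ones while overflow does not.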
12. Session details: Expert designer/user session (EDS)
- Author
-
Cliff Sze
- Subjects
Multimedia, Computer science, Session (computer science), computer.software_genre, computer
- Published
- 2013
- Full Text
- View/download PDF
13. CATALYST: Planning Layer Directives for Effective Design Closure
- Author
-
Yaoguang Wei, Zhuo Li, Cliff Sze, Shiyan Hu, Charles J. Alpert, and Sachin S. Sapatnekar
- Published
- 2013
- Full Text
- View/download PDF
14. ICCAD-2012 CAD contest in design hierarchy aware routability-driven placement and benchmark suite
- Author
-
Charles J. Alpert, Cliff Sze, Yaoguang Wei, Zhuo Li, and Natarajan Viswanathan
- Subjects
Engineering, Hierarchy, business.industry, Suite, Distributed computing, CAD, Integrated circuit design, Embedded system, Hardware_INTEGRATEDCIRCUITS, Benchmark (computing), Routing (electronic design automation), Physical design, business, Placement
- Abstract
The impact of considering design hierarchy during physical synthesis remains a fairly under-researched area. This is especially true for large-scale circuit placement, in large part because realistic public designs with design hierarchy information have not been available. Additionally, modern designs are fairly complex, with numerous placement blockages, non-uniform wiring stacks, partial and/or complete routing blockages, etc. This significantly complicates both the placement and routing steps of physical synthesis. The aim of the ICCAD-2012 contest is to evaluate the impact of considering design hierarchy on the wire length and routability of placement. This is addressed by way of the following: (a) release industrial-strength place-and-route benchmarks that contain the design hierarchy information, (b) present an accurate congestion analysis framework to evaluate and compare the routability of various placement algorithms. We hope that a set of challenging benchmarks containing the design hierarchy information, along with a standardized evaluation framework, will further advance research in design hierarchy aware routability-driven placement.
- Published
- 2012
- Full Text
- View/download PDF
15. GLARE
- Author
-
Yaoguang Wei, Zhuo Li, Douglas Keller, Cliff Sze, Lakshmi Reddy, Gustavo E. Tellez, Sachin S. Sapatnekar, Charles J. Alpert, Andrew D. Huber, and Natarajan Viswanathan
- Subjects
Router, Engineering, business.industry, Node (networking), Real-time computing, Integrated circuit layout, Design for manufacturability, Mode (computer interface), Hardware_INTEGRATEDCIRCUITS, Point (geometry), Physical design, Routing (electronic design automation), business, Computer network
- Abstract
Industry routers are very complex and time consuming, and are becoming more so with the explosion in design rules and design for manufacturability requirements that multiply with each technology node. Global routing is just the first phase of a router and serves the dual purpose of (i) seeding the following phases of a router and (ii) evaluating whether the current design point is routable. Lately, it has become common to use a "light mode" version of the global router, similar to today's academic routers, to quickly evaluate the routability of a given placement. This use model suffers from two primary weaknesses: (i) it does not adequately model the local routing resources, while the model is important to remove opens and shorts and eliminate DRC violations, (ii) the metrics used to represent congestion are non-intuitive and often fail to pinpoint the key issues that need to be addressed. This paper presents solutions to both issues, and empirically demonstrates that incorporating the proposed solutions within a global routing based congestion analyzer yields a more accurate view of design routability.
- Published
- 2012
- Full Text
- View/download PDF
16. The DAC 2012 routability-driven placement contest and benchmark suite
- Author
-
Zhuo Li, Cliff Sze, Natarajan Viswanathan, Yaoguang Wei, and Charles J. Alpert
- Subjects
Engineering, business.industry, Suite, Distributed computing, Integrated circuit design, Set (abstract data type), Application-specific integrated circuit, Embedded system, Metric (mathematics), Hardware_INTEGRATEDCIRCUITS, Benchmark (computing), Physical design, Routing (electronic design automation), business
- Abstract
Existing routability-driven placers mostly employ rudimentary and often crude congestion models that fail to account for the complexities in modern designs, e.g., the impact of non-uniform wiring stacks, layer directives, partial and/or complete routing blockages, etc. In addition, they are hampered by congestion metrics that do not accurately score or represent design congestion. This is in large part due to the non-availability of public designs depicting industrial wiring stacks and other complexities affecting design routability. The aim of the DAC 2012 routability-driven placement contest is to address these issues, by way of the following: (a) release challenging benchmark designs that are derived from modern industrial ASICs, and contain information to perform both placement and routing, (b) present a new congestion metric, as well as an accurate congestion analysis framework to evaluate and compare the routability of various placement algorithms. We hope that a set of challenging benchmarks, along with a standard, publicly available evaluation framework will further advance research in routability-driven placement.
- Published
- 2012
- Full Text
- View/download PDF
17. Guiding a physical design closure system to produce easier-to-route designs with more predictable timing
- Author
-
Natarajan Viswanathan, Charles J. Alpert, Zhuo Li, Nancy Zhou, Cliff Sze, and Gi-Joon Nam
- Subjects
Engineering, Theoretical computer science, Signoff, business.industry, Distributed computing, Design flow, Steiner tree problem, symbols.namesake, Hardware_INTEGRATEDCIRCUITS, symbols, Netlist, Place and route, Routing (electronic design automation), Physical design, business, Design closure
- Abstract
Physical synthesis has emerged as one of the most important tools in design closure, which starts with the logic synthesis step and generates a new optimized netlist and its layout for the final signoff process. As stated in [1], "it is a wrapper around traditional place and route, whereby synthesis-based optimization are interwoven with placement and routing." A traditional physical synthesis tool generally focuses on design closure with a Steiner wire model. It optimizes timing, area, and power under the assumption that each net can be routed with an optimal Steiner tree. However, advanced design rules, more IP and hierarchical design styles for super-large billion-gate designs, serious buffering problems from interconnect scaling, and metal layer stacks make routing a much more challenging problem [2]. This paper discusses a series of techniques that may relieve this problem and guide the physical design closure system to produce designs that are not only easier to route but also have better timing quality. Open challenges are also overviewed at the end.
- Published
- 2012
- Full Text
- View/download PDF
18. WRIP
- Author
-
Charles J. Alpert, Xing Wei, Yu-Liang Wu, Cliff Sze, and Wai-Chung Tang
- Subjects
Standard cell, Logic synthesis, Computer engineering, Application-specific integrated circuit, Accurate estimation, Computer science, Real-time computing, Metric (mathematics), Hardware_INTEGRATEDCIRCUITS, Routing (electronic design automation), Placement, Hardware_LOGICDESIGN, Electronic circuit
- Abstract
This paper presents WRIP, a Wirelength-driven Rewiring-based Incremental Placement, which effectively reduces the wirelength of already optimized placements of large-scale industrial standard cell designs. WRIP uses a powerful logic synthesis technique called logic rewiring, which restructures local circuits while preserving logic functionality, and reduces wirelength under an accurate estimation of the half-perimeter wirelength (HPWL) metric. We integrated WRIP into an industrial EDA tool and tested it on several real designs with hundreds of thousands of movable objects. Tested on circuits that have been fully optimized by a state-of-the-art industrial placement tool, our experiments showed that on average WRIP reduces wirelength by 2.25% after placement and 2.45% after global routing, in the HPWL and Steiner WL models respectively. The runtime of WRIP is only about half an hour for the largest tested ASIC circuit. This is the first attempt to fully integrate powerful logic synthesis into industrial placement tools with real-life effectiveness and efficiency.
- Published
- 2012
- Full Text
- View/download PDF
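The half-perimeter wirelength (HPWL) metric mentioned in the abstract above is the standard bounding-box estimate of a net's wirelength: the width plus the height of the smallest rectangle enclosing all of the net's pins. A minimal Python sketch follows; the function names and the pin and net representations are illustrative assumptions, not taken from WRIP.

```python
from typing import Dict, Sequence, Tuple

Pin = Tuple[float, float]  # (x, y) location of one pin; an assumed representation


def hpwl(pins: Sequence[Pin]) -> float:
    """Half-perimeter of the bounding box enclosing all pins of one net."""
    if not pins:
        return 0.0
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_hpwl(nets: Dict[str, Sequence[Pin]]) -> float:
    """Sum of per-net HPWL over a placement, keyed by a hypothetical net name."""
    return sum(hpwl(pins) for pins in nets.values())
```

Because HPWL needs only the minimum and maximum coordinates per net, it can be re-evaluated cheaply after each incremental change, which is why placers commonly use it as a wirelength proxy before routing.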
19. Obstacle-avoiding and slew-constrained buffered clock tree synthesis for skew optimization
- Author
-
Jianlei Yang, Feifei Niu, Cliff Sze, Qiang Zhou, Hailong Yao, and Yici Cai
- Subjects
Very-large-scale integration, Mathematical optimization, Obstacle, Obstacle avoidance, Skew, Clock tree, Latency (engineering), Clock tree synthesis, Algorithm, Global optimization, Mathematics
- Abstract
Buffered clock tree synthesis (CTS) is increasingly critical as VLSI technology continually scales down. Much research has been done on this topic due to its key role in CTS, but current approaches either lack obstacle-avoiding functionality or lead to large clock latency and/or skew. This paper presents a new obstacle-avoiding CTS approach with separate clock tree construction and buffer insertion stages, based on an integral view that explores the global optimization space. Aiming at skew optimization under constraints of slew and obstacles, our CTS approach features a clock tree construction stage with an obstacle-aware topology generation algorithm called OBB, balanced insertion of candidate buffer positions, and a fast heuristic buffer insertion algorithm. Experimental results show the effectiveness of our CTS approach, which improves skew and latency over [6] by 46% and 63% on average, and reduces skew by 15.3% relative to [5]. Our OBB heuristic obtains a 36% improvement in skew over the classic balanced bipartition algorithm (BB) in [10].
- Published
- 2011
- Full Text
- View/download PDF
20. Grid-to-ports clock routing for high performance microprocessor designs
- Author
-
Haitong Tian, Cliff Sze, Wai-Chung Tang, and Evangeline F. Y. Young
- Subjects
business.industry, Computer science, Underclocking, Clock rate, Clock drift, Clock gating, Digital clock manager, Clock skew, Clock domain crossing, Embedded system, Hardware_INTEGRATEDCIRCUITS, business, Computer hardware, CPU multiplier
- Abstract
Clock distribution in VLSI designs is of crucial importance, and it is also a major source of power dissipation in a system. For today's high-performance microprocessors, clock signals are usually distributed by a global clock grid covering the whole chip, followed by post-grid routing that connects clock loads to the clock grid. An early study [2] shows that about 18.1% of the total clock capacitance dissipation was due to this post-grid clock routing (i.e., lower mesh wires plus clock twig wires). This post-grid clock routing problem is thus an important one, but few previous works have addressed it. In this paper, we address the problem of connecting clock ports to the clock grid through reserved tracks on multiple metal layers, under delay and slew constraints. Note that in practice a set of routing tracks is reserved for these grid-to-ports clock wires because of the conventional modular design style of high-performance microprocessors. We propose a new expansion algorithm based on the heap data structure to solve the problem effectively. Experimental results on industrial test cases show that our algorithm improves significantly over the latest work on this problem [1], reducing the capacitance by 24.6% and the wire length by 23.6%. We also validate our results using HSPICE simulation. Finally, our approach is very efficient: for larger test cases with about 2000 ports, the runtime is in seconds.
- Published
- 2011
- Full Text
- View/download PDF
21. The ISPD-2011 routability-driven placement contest and benchmark suite
- Author
-
Gi-Joon Nam, Zhuo Li, Cliff Sze, Natarajan Viswanathan, Jarrod A. Roy, and Charles J. Alpert
- Subjects
Computer science, Suite, CONTEST, computer.software_genre, Computer engineering, Application-specific integrated circuit, Metric (mathematics), Hardware_INTEGRATEDCIRCUITS, Benchmark (computing), Data mining, Routing (electronic design automation), Physical design, Placement, computer
- Abstract
The last few years have seen significant advances in the quality of placement algorithms. This is in part due to the availability of large, challenging testcases by way of the ISPD-2005 [17] and ISPD-2006 [16] placement contests. These contests primarily evaluated the placers based on the half-perimeter wire length metric. Although wire length is an important metric, it still does not address a fundamental requirement for placement algorithms, namely, the ability to produce routable placements. This paper describes the ISPD-2011 routability-driven placement contest, and a new benchmark suite that is being released in conjunction with the contest. All designs in the new benchmark suite are derived from industrial ASIC designs, and can be used to perform both placement and global routing. By way of the contest and the associated benchmark suite, we hope to provide a standard, publicly available framework to help advance research in the area of routability-driven placement.
- Published
- 2011
- Full Text
- View/download PDF
22. Quantifying academic placer performance on custom designs
- Author
-
Samuel I. Ward, Zhuo Li, Cliff Sze, Earl E. Swartzlander, David A. Papa, and Charles J. Alpert
- Subjects
Very-large-scale integration, Range (mathematics), Test case, Computer engineering, Design styles, Computer science, Datapath, Real-time computing, Hardware_INTEGRATEDCIRCUITS, Benchmark (computing), Common logic
- Abstract
There have been significant prior efforts to quantify the performance of academic placement algorithms, primarily by creating artificial test cases that attempt to mimic real designs, such as the PEKO benchmarks containing known optima [5]. The idea was to create benchmarks with a known optimal solution and then measure how far existing placers were from the known optimum. Since the benchmarks do not necessarily correspond to properties of real VLSI netlists, the conclusions were met with some skepticism. This work presents two custom-constructed datapath designs that perform common logic functions, with hand-designed layouts for each. The new generation of academic placers is then compared against them to see how the placers perform for these design styles. Experiments show that all academic placers produce wirelengths significantly greater than the manual solution; wirelengths range from 1.75 to 4.88 times greater. These testcases will be released publicly to stimulate research into automatically solving structured datapath placement problems.
- Published
- 2011
- Full Text
- View/download PDF
23. Guest Editorial: Special Section on Contemporary and Emerging Issues in Physical Design
- Author
-
Cliff Sze and Cheng-Kok Koh
- Subjects
Engineering, business.industry, Special section, Mechanical engineering, Engineering ethics, Electrical and Electronic Engineering, Physical design, business, Computer Graphics and Computer-Aided Design, Software
- Abstract
The eight papers in this special section highlight several studies on contemporary and emerging issues in physical design.
- Published
- 2014
- Full Text
- View/download PDF
24. Ultra-fast interconnect driven cell cloning for minimizing critical path delay
- Author
-
Weiping Shi, Shiyan Hu, David A. Papa, Charles J. Alpert, Zhuo Li, Cliff Sze, and Ying Zhou
- Subjects
Delay calculation, Interconnection, Mathematical optimization, Series (mathematics), Flow (mathematics), Computer science, Hardware_INTEGRATEDCIRCUITS, Node (circuits), Physical synthesis, Parallel computing, AND gate, Sizing
- Abstract
In a complete physical synthesis flow, optimization transforms that can improve the timing on critical paths already well optimized by a series of powerful transforms (timing-driven placement, buffering, and gate sizing) are invaluable. Finding such a transform is quite challenging, to say nothing of efficiency. This work explores innovative cloning (gate duplication) techniques to improve timing closure in a physical synthesis environment. With a buffer-aware interconnect timing model, new polynomial-time optimal algorithms are proposed for timing-driven cloning, including both finding optimal sink partitions (identifying the fan-outs) for the original and the duplicated gates, as well as physical locations for both gates. In particular, we present an O(m)-time optimal algorithm to minimize the worst slack if the original gate is movable, and an O(m log m) algorithm if the original gate is fixed, where m is the number of fan-outs. To the best of our knowledge, this work is the first to consider the timing-driven cloning problem under a buffer-aware interconnect delay model. For a hundred testcases in a 45nm technology node, we demonstrate significant timing improvement due to our cloning techniques as compared to other existing timing-optimization transforms. Extensions to other factors, such as wirelength, FOM, and placement obstacles, are further discussed.
- Published
- 2010
- Full Text
- View/download PDF
25. Session details: Clocking and the ISPD'09 clock synthesis contest
- Author
-
Cliff Sze
- Subjects
Multimedia, Computer science, Session (computer science), computer.software_genre, CONTEST, computer
- Published
- 2009
- Full Text
- View/download PDF
26. The ISPD global routing benchmark suite
- Author
-
Mehmet Yildiz, Cliff Sze, and Gi-Joon Nam
- Subjects
Computer science, Routing congestion, Suite, Hardware_INTEGRATEDCIRCUITS, Benchmark (computing), Macro porosity, Hardware_PERFORMANCEANDRELIABILITY, Parallel computing, Routing (electronic design automation), Physical design
- Abstract
This paper describes the ISPD global routing benchmark suite and related contests. A total of 16 global routing benchmarks are produced from the ISPD placement contest benchmark suite using a variety of publicly available academic placement tools. The representative characteristics of the ISPD global routing benchmark suite include multiple metal layers with a layer assignment requirement, wire and via width/space modeling, and macro porosity modeling. The benchmarks have from 200 thousand to 1.6 million routable nets. While primarily intended for global routing, they can certainly be extended to detailed routing or routing congestion estimation. In conjunction with the previous ISPD placement contest benchmark suite, the new global routing benchmarks present realistic and challenging physical design problems of modern complex IC designs.
- Published
- 2008
- Full Text
- View/download PDF
27. The nuts and bolts of physical synthesis
- Author
-
S.K. Karandikar, Charles J. Alpert, Haoxing Ren, Paul G. Villarrubia, Zhuo Li, Gi-Joon Nam, Cliff Sze, Stephen T. Quay, and Mehmet Yildiz
- Subjects
Engineering ,Nuts and bolts ,business.industry ,Process (engineering) ,Embedded system ,Distributed computing ,Component (UML) ,Netlist ,Timing closure ,business ,Chip ,Throughput (business) ,Design closure - Abstract
As technology scaling advances to the 45 and 32 nanometer nodes, more devices can fit onto a chip, which implies continued rapid design size growth. Naturally, it becomes increasingly challenging to achieve design closure on these enormous chips with tight performance and power constraints. Physical synthesis has emerged as a critical and powerful component of modern design methodologies to conquer such challenges. Starting from logic-level net list, physical synthesis creates a legally placed design while attempting to satisfy timing, power, and electrical constraints simultaneously. This paper briefly outlines the core components of physical synthesis timing closure and discusses some recent techniques that improve the solution quality and throughput of the physical synthesis process.
- Published
- 2007
- Full Text
- View/download PDF
28. Integrated Placement and Skew Optimization for Rotary Clocking
- Author
-
Cliff Sze, Jiang Hu, Ganesh Venkataraman, and Frank Liu
- Subjects
Very-large-scale integration ,Engineering ,Logic synthesis ,business.industry ,Design flow ,Electronic engineering ,Skew ,ComputerApplications_COMPUTERSINOTHERSYSTEMS ,Integrated circuit design ,Flow network ,Clock skew ,business ,Clock network - Abstract
The clock distribution network is a key component of any synchronous VLSI design. As technology moves into the nanometer era, innovative clocking techniques are required to solve the power dissipation and variability issues. Rotary clocking is a novel technique which employs unterminated rings formed by differential transmission lines to save power and reduce skew variability. Despite its appealing advantages, rotary clocking requires latch locations to match pre-designed clock skew on rotary clock rings. This requirement is a difficult chicken-and-egg problem which prevents its wide application. In this work, we propose an integrated placement and skew scheduling methodology to break this hurdle, making rotary clocking compatible with practical design flows. A network flow based latch assignment algorithm and a cost-driven skew optimization algorithm are developed. Experiments show that our method can generate chip placements which satisfy the unique requirements of rotary clocks, without sacrificing design quality. By enabling concurrent clock network and placement design, our method can also be applied in other clocking methodologies.
- Published
- 2006
- Full Text
- View/download PDF
29. Design methodology for the IBM POWER7 microprocessor
- Author
-
Stephen Douglas Posluszny, Frank J. Musante, Michael A. Kazda, Joshua Friedrich, Lakshmi Reddy, Vasant Rao, Douglass T. Lamb, Jeremy T. Hopkins, Gustavo E. Tellez, Cliff Sze, Markus J. Buehler, Uwe Brandt, Benjamin R. Russell, Thomas Edward Rosser, Zahi M. Kurzum, Jack DiLullo, Jens Noack, Ruchir Puri, Haifeng Qian, Peter J. Osler, Alice Lee, Joachim Keinert, Shyam Ramji, Haoxing Ren, and Mozammel Hossain
- Subjects
Dynamic random-access memory ,General Computer Science ,Computer science ,Integrated circuit design ,Timing closure ,computer.software_genre ,Modularity ,law.invention ,Microprocessor ,Computer architecture ,Basic telecommunications access method ,law ,Hardware_INTEGRATEDCIRCUITS ,Operating system ,IBM ,Design methods ,computer - Abstract
The IBM POWER7® microprocessor, which is the next-generation IBM POWER® processor, leverages IBM's 45-nm silicon-on-insulator (SOI) process with embedded dynamic random access memory to achieve industry-leading performance. To deliver this complex 567-mm² die, the IBM design team made significant innovations in chip design methodology. This paper describes the most critical methodology innovations specific to POWER7 design, which were in modularity, timing closure, and design efficiency.
- Published
- 2011
- Full Text
- View/download PDF
30. Routing congestion estimation with real design constraints.
- Author
-
Wen-Hao Liu, Yaoguang Wei, Cliff Sze, Charles J. Alpert, Zhuo Li, Yih-Lang Li, and Natarajan Viswanathan
- Published
- 2013
- Full Text
- View/download PDF
31. Guiding a Physical Design Closure System to Produce Easier-to-Route Designs with More Predictable Timing.
- Author
-
Zhuo Li, Charles J. Alpert, Gi-Joon Nam, Cliff Sze, Natarajan Viswanathan, and Nancy Y. Zhou
- Subjects
ROUTING (Computer network management) ,INTEGRATED circuits ,DATA structures ,ELECTRONIC data processing ,STEINER systems - Abstract
Physical synthesis has emerged as one of the most important tools in design closure, which starts with the logic synthesis step and generates a new optimized netlist and its layout for the final signoff process. As stated in [1], "it is a wrapper around traditional place and route, whereby synthesis-based optimization are interwoven with placement and routing." A traditional physical synthesis tool generally focuses on design closure with a Steiner wire model. It optimizes timing/area/power with the assumption that each net can be routed with an optimal Steiner tree. However, advanced design rules, more IP and hierarchical design styles for super-large billion-gate designs, serious buffering problems from interconnect scaling and metal layer stacks make routing a much more challenging problem [2]. This paper discusses a series of techniques that may relieve this problem, and guide the physical design closure system to produce not only easier-to-route designs, but also better timing quality. Open challenges are also overviewed at the end. [ABSTRACT FROM AUTHOR]
- Published
- 2012