244 results on '"Peh, Li-Shiuan"'
Search Results
202. Approaching the theoretical limits of a mesh NoC with a 16-node chip prototype in 45nm SOI.
- Author
-
Park, Sunghyun, Krishna, Tushar, Chen, Chia-Hsin, Daya, Bhavya, Chandrakasan, Anantha, and Peh, Li-Shiuan
- Abstract
In this paper, we present a case study of our chip prototype of a 16-node 4x4 mesh NoC fabricated in 45nm SOI CMOS that aims to simultaneously optimize energy-latency-throughput for unicasts, multicasts and broadcasts. We first define and analyze the theoretical limits of a mesh NoC in latency, throughput and energy, then describe how we approach these limits through a combination of microarchitecture and circuit techniques. Our 1.1V 1GHz NoC chip achieves 1-cycle router-and-link latency at each hop and energy-efficient router-level multicast support, delivering 892Gb/s (87.1% of the theoretical bandwidth limit) at 531.4mW for a mixed traffic of unicasts and broadcasts. Through this fabrication, we derive insights that help guide our research, and we believe, will also be useful to the NoC and multicore research community. [ABSTRACT FROM AUTHOR]
- Published
- 2012
- Full Text
- View/download PDF
203. A low-swing crossbar and link generator for low-power networks-on-chip.
- Author
-
Chen, Chia-Hsin Owen, Park, Sunghyun, Krishna, Tushar, and Peh, Li-Shiuan
- Published
- 2011
204. Enabling system-level modeling of variation-induced faults in networks-on-chips.
- Author
-
Aisopos, Konstantinos, Chen, Chia-Hsin Owen, and Peh, Li-Shiuan
- Published
- 2011
- Full Text
- View/download PDF
205. DRAIN.
- Author
-
DeOrio, Andrew, Aisopos, Konstantinos, Bertacco, Valeria, and Peh, Li-Shiuan
- Published
- 2011
- Full Text
- View/download PDF
206. PowerHerd
- Author
-
Shang, Li, Peh, Li-Shiuan, and Jha, Niraj K.
- Published
- 2003
- Full Text
- View/download PDF
207. Leakage power modeling and optimization in interconnection networks
- Author
-
Chen, Xuning and Peh, Li-Shiuan
- Published
- 2003
- Full Text
- View/download PDF
208. In-network coherence filtering.
- Author
-
Agarwal, Niket, Peh, Li-Shiuan, and Jha, Niraj K.
- Published
- 2009
- Full Text
- View/download PDF
209. ORION 2.0.
- Author
-
Kahng, Andrew B., Li, Bin, Peh, Li-Shiuan, and Samadi, Kambiz
- Published
- 2009
210. Design of low-power short-distance opto-electronic transceiver front-ends with scalable supply voltages and frequencies.
- Author
-
Chen, Xuning, Wei, Gu-Yeon, and Peh, Li-Shiuan
- Published
- 2008
- Full Text
- View/download PDF
211. Extending open core protocol to support system-level cache coherence.
- Author
-
Aisopos, Konstantinos, Chou, Chien-Chun, and Peh, Li-Shiuan
- Published
- 2008
- Full Text
- View/download PDF
212. Energy-efficient computing for wildlife tracking
- Author
-
Juang, Philo, Oki, Hidekazu, Wang, Yong, Martonosi, Margaret, Peh, Li Shiuan, and Rubenstein, Daniel
- Published
- 2002
- Full Text
- View/download PDF
213. In-Network Cache Coherence.
- Author
-
Eisley, Noel, Peh, Li-Shiuan, and Shang, Li
- Published
- 2006
- Full Text
- View/download PDF
214. High-level power analysis for multi-core chips.
- Author
-
Eisley, Noel, Soteriou, Vassos, and Peh, Li-Shiuan
- Published
- 2006
- Full Text
- View/download PDF
215. HybDTM.
- Author
-
Kumar, Amit, Shang, Li, Peh, Li-Shiuan, and Jha, Niraj K.
- Published
- 2006
- Full Text
- View/download PDF
216. Coordinated, distributed, formal energy management of chip multiprocessors.
- Author
-
Juang, Philo, Wu, Qiang, Peh, Li-Shiuan, Martonosi, Margaret, and Clark, Douglas W.
- Published
- 2005
- Full Text
- View/download PDF
217. A Technology-Aware and Energy-Oriented Topology Exploration for On-Chip Networks.
- Author
-
Wang, Hangsheng, Peh, Li-Shiuan, and Malik, Sharad
- Published
- 2005
- Full Text
- View/download PDF
218. Thermal Modeling, Characterization and Management of On-Chip Networks.
- Author
-
Shang, Li, Peh, Li-Shiuan, Kumar, Amit, and Jha, Niraj K.
- Published
- 2004
- Full Text
- View/download PDF
219. Orion.
- Author
-
Wang, Hang-Sheng, Zhu, Xinping, Peh, Li-Shiuan, and Malik, Sharad
- Published
- 2002
220. Energy-efficient computing for wildlife tracking.
- Author
-
Juang, Philo, Oki, Hidekazu, Wang, Yong, Martonosi, Margaret, Peh, Li Shiuan, and Rubenstein, Daniel
- Published
- 2002
- Full Text
- View/download PDF
221. Extending the Effective Throughput of NoCs With Distributed Shared-Buffer Routers.
- Author
-
Ramanujam, Rohit Sunkam, Soteriou, Vassos, Lin, Bill, and Peh, Li-Shiuan
- Subjects
NETWORK routers, SYSTEMS on a chip, DISTRIBUTED computing, BUFFER storage (Computer science), COMPUTER network architectures, INTEGRATED circuit interconnections, PERFORMANCE evaluation
- Abstract
Router microarchitecture plays a central role in the performance of networks-on-chip (NoCs). Buffers are needed in routers to house incoming flits that cannot be immediately forwarded due to contention. This buffering can be done at the inputs or the outputs of a router, corresponding to an input-buffered router (IBR) or an output-buffered router (OBR). OBRs are attractive because they can sustain higher throughputs and have lower queuing delays under high loads than IBRs. However, a direct implementation of an OBR requires a router speedup equal to the number of ports, making such a design prohibitive under aggressive clocking needs and limited power budgets of most NoC applications. In this paper, a new router design based on a distributed shared-buffer (DSB) architecture is proposed that aims to practically emulate an OBR. The proposed architecture introduces innovations to address the unique constraints of NoCs, including efficient pipelining and novel flow control. Practical DSB configurations are also presented with reduced power overheads while exhibiting negligible performance degradation. Compared to a state-of-the-art pipelined IBR, the proposed DSB router achieves up to 19% higher throughput on synthetic traffic and reduces packet latency on average by 61% when running SPLASH-2 benchmarks with high contention. On average, the saturation throughput of DSB routers is within 7% of the theoretically ideal saturation throughput under the synthetic workloads evaluated. [ABSTRACT FROM PUBLISHER]
- Published
- 2011
- Full Text
- View/download PDF
222. Simultaneous Dynamic Voltage Scaling of Processors and Communication Links in Real-Time Distributed Embedded Systems.
- Author
-
Luo, Jiong, Jha, Niraj K., and Peh, Li-Shiuan
- Subjects
POWER resources, EMBEDDED computer systems, ELECTRIC potential, DATA transmission systems, ALGORITHMS, INTEGRATED circuits
- Abstract
Dynamic voltage scaling has been widely acknowledged as a powerful technique for trading off power consumption and delay for processors. Recently, variable-frequency (and variable-voltage) parallel and serial links have also been proposed, which can save link power consumption by exploiting variations in the bandwidth requirement. This provides a new dimension for power optimization in a distributed embedded system connected by a voltage-scalable interconnection network. At the same time, it imposes new challenges for variable-voltage scheduling as well as flow control. First, the variable-voltage scheduling algorithm should be able to trade off the power consumption and delay jointly for both processors and links. Second, for the variable-frequency network, the scheduling algorithm should not only consider the real-time constraints, but should also be consistent with the underlying flow control techniques. In this paper, we address joint dynamic voltage scaling for variable-voltage processors and communication links in such systems. We propose a scheduling algorithm for real-time applications that captures both data flow and control flow information. It performs efficient routing of communication events through multihops, as well as efficient slack allocation among heterogeneous processors and communication links to maximize energy savings, while meeting all real-time constraints. Our experimental study shows that on an average, joint voltage scaling on processors and links can achieve 32% less power compared with voltage scaling on processors alone. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
223. PowerHerd: A Distributed Scheme for Dynamically Satisfying Peak-Power Constraints in Interconnection Networks.
- Author
-
Shang, Li, Peh, Li-Shiuan, and Jha, Niraj K.
- Subjects
INTEGRATED circuit interconnections, INTEGRATED circuits, ENERGY consumption, ONLINE algorithms, INTERNETWORKING devices, COMPUTER networks
- Abstract
As interconnection networks proliferate to a wide range of high-performance systems, power consumption is becoming a significant architectural issue. In interconnection networks, the peak-power consumption directly affects the solution for package cooling and power-delivery design. Off-line worst-case power analysis is typically used to estimate the network peak-power consumption and guarantee safe online operation, which not only increases system cost, but also constrains network performance. In this paper, we present an online mechanism, called PowerHerd, to efficiently manage network power resources at runtime, and guarantee that network peak-power constraints are not exceeded. PowerHerd is a distributed approach: within the interconnection network, each router dynamically maintains a local power budget, controls its local power dissipation, and exchanges spare power resources with its neighboring routers to optimize network performance. Experiments demonstrate that PowerHerd can effectively regulate network power consumption, meeting peak-power constraints with negligible network-performance penalty. Armed with PowerHerd, network designers can focus on system performance and power optimization for the average case, rather than the worst-case, thus making it possible to employ a more powerful interconnection network in the system. [ABSTRACT FROM AUTHOR]
- Published
- 2006
- Full Text
- View/download PDF
224. Locality-oblivious cache organization leveraging single-cycle multi-hop NoCs.
- Author
-
Kwon, Woo-Cheol, Krishna, Tushar, and Peh, Li-Shiuan
- Published
- 2014
- Full Text
- View/download PDF
225. Power-driven Design of Router Microarchitectures in On-chip Networks.
- Author
-
Wang, Hangsheng, Peh, Li-Shiuan, and Malik, Sharad
- Published
- 2003
226. Simultaneous Dynamic Voltage Scaling of Processors and Communication Links in Real-Time Distributed Embedded Systems.
- Author
-
Luo, Jiong, Peh, Li-Shiuan, and Jha, Niraj
- Published
- 2003
227. SimMobility Short-Term: An Integrated Microscopic Mobility Simulator
- Author
-
Neeraj Milind Deshmukh, Li-Shiuan Peh, Simon Oh, Moshe Ben-Akiva, Balakumar Marimuthu, Carlos Lima Azevedo, Tomer Toledo, Kakali Basak, Katarzyna Anna Marczuk, Harold Soh, Massachusetts Institute of Technology. Center for Transportation & Logistics, Massachusetts Institute of Technology. Department of Aeronautics and Astronautics, Massachusetts Institute of Technology. Department of Civil and Environmental Engineering, Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science, Marczuk, Katarzyna Anna, Toledo, Tomer, Peh, Li-Shiuan, and Ben-Akiva, Moshe E
- Subjects
050210 logistics & transportation, Computer science, Mechanical Engineering, 05 social sciences, 0211 other engineering and technologies, Microsimulation, Traffic simulation, 021107 urban & regional planning, 02 engineering and technology, Term (time), 0502 economics and business, Code (cryptography), Systems architecture, Realization (systems), Simulation, Civil and Structural Engineering
- Abstract
This paper presents the development of an integrated microscopic mobility simulator, SimMobility Short-Term (ST). The simulator is integrated because its models, inputs and outputs, simulated components, and code base are integrated within a multiscale agent- and activity-based simulation platform capable of simulating different spatiotemporal resolutions and accounting for different levels of travelers’ decision making. The simulator is microscopic because both the demand (agents and their trips) and the supply (trip realization and movements on the network) are microscopic (i.e., modeled individually). Finally, the simulator has mobility because it copes with the multimodal nature of urban networks and the need for the flexible simulation of innovative transportation services, such as on-demand and smart mobility solutions. This paper follows previous publications that describe SimMobility’s overall framework and models. SimMobility is an open-source, multiscale platform that considers land use, transportation, and mobility-sensitive behavioral models. SimMobility ST aims at simulating the high-resolution movement of agents (traffic, transit, pedestrians, and goods) and the operation of different mobility services and control and information systems. This paper presents the SimMobility ST modeling framework and system architecture and reports on its successful calibration for Singapore and its use in several scenarios of innovative mobility applications. The paper also shows how detailed performance measures from SimMobility ST can be integrated with a daily activity and mobility patterns simulator. Such integration is crucial to model accurately the effect of different technologies and service operations at the urban level, as the identity and preferences of simulated agents are maintained across temporal decision scales, ensuring the consistency and accuracy of simulated accessibility and performance measures of each scenario.
Singapore. National Research Foundation (CREATE program), Singapore-MIT Alliance. Center. Future Urban Mobility Interdisciplinary Research Group
- Published
- 2017
- Full Text
- View/download PDF
228. A case for leveraging 802.11p for direct phone-to-phone communications
- Author
-
Nadesh Ramanathan, Shipeng Xu, Mengda Mao, Chirn Chye Boon, Jason Hao Gao, Pilsoon Choi, Suhaib A. Fahmy, Li-Shiuan Peh, Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science, Choi, Pilsoon, Gao, Jason Hao, and Peh, Li-Shiuan
- Subjects
Engineering, business.industry, Phone, Embedded system, Baseband, IEEE 802.11p, Baseband processor, Android (operating system), business, Field-programmable gate array, Dedicated short-range communications, Power control
- Abstract
WiFi cannot effectively handle the demands of device-to-device communication between phones, due to insufficient range and poor reliability. We make the case for using IEEE 802.11p DSRC instead, which has been adopted for vehicle-to-vehicle communications, providing lower latency and longer range. We demonstrate a prototype motivated by a novel fabrication process that deposits both III-V and CMOS devices on the same die. In our system prototype, the designed RF front-end is interfaced with a baseband processor on an FPGA, connected to Android phones. It consumes 0.02uJ/bit across 100m assuming free space. Application-level power control dramatically reduces power consumption by 47-56%.
Singapore-MIT Alliance for Research and Technology, American Society for Engineering Education. National Defense Science and Engineering Graduate Fellowship
- Published
- 2014
- Full Text
- View/download PDF
229. SCORPIO: A 36-Core Research Chip Demonstrating Snoopy Coherence on a Scalable Mesh NoC with In-Network Ordering
- Author
-
Suvinay Subramanian, Sunghyun Park, Tushar Krishna, Li-Shiuan Peh, Jim Holt, Bhavya K. Daya, Chia-Hsin Owen Chen, Woo-Cheol Kwon, Anantha P. Chandrakasan, Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science, Peh, Li-Shiuan, Daya, Bhavya Kishor, Chen, Chia-Hsin, Subramanian, Suvinay, Kwon, Woo Cheol, Park, Sunghyun, Krishna, Tushar, Holt, Jim, and Chandrakasan, Anantha P.
- Subjects
Multi-core processor, Hardware_MEMORYSTRUCTURES, Computer science, business.industry, InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL, General Medicine, Coherence (statistics), Chip, Shared memory, Computer architecture, Embedded system, HyperTransport, Scalability, ComputerApplications_GENERAL, Overhead (computing), business, ComputingMilieux_MISCELLANEOUS
- Abstract
In the many-core era, scalable coherence and on-chip interconnects are crucial for shared memory processors. While snoopy coherence is common in small multicore systems, directory-based coherence is the de facto choice for scalability to many cores, as snoopy relies on ordered interconnects which do not scale. However, directory-based coherence does not scale beyond tens of cores due to excessive directory area overhead or inaccurate sharer tracking. Prior techniques supporting ordering on arbitrary unordered networks are impractical for full multicore chip designs. We present SCORPIO, an ordered mesh Network-on-Chip (NoC) architecture with a separate fixed-latency, bufferless network to achieve distributed global ordering. Message delivery is decoupled from the ordering, allowing messages to arrive in any order and at any time, and still be correctly ordered. The architecture is designed to plug-and-play with existing multicore IP and with practicality, timing, area, and power as top concerns. Full-system 36 and 64-core simulations on SPLASH-2 and PARSEC benchmarks show an average application run time reduction of 24.1% and 12.9%, in comparison to distributed directory and AMD HyperTransport coherence protocols, respectively. The SCORPIO architecture is incorporated in an 11 mm-by-13 mm chip prototype, fabricated in IBM 45nm SOI technology, comprising 36 Freescale e200 Power Architecture™ cores with private L1 and L2 caches interfacing with the NoC via ARM AMBA, along with two Cadence on-chip DDR2 controllers. The chip prototype achieves a post synthesis operating frequency of 1 GHz (833 MHz post-layout) with an estimated power of 28.8 W (768 mW per tile), while the network consumes only 10% of tile area and 19% of tile power.
United States. Defense Advanced Research Projects Agency (DARPA UHPC grant at MIT (Angstrom)), Center for Future Architectures Research, Microelectronics Advanced Research Corporation (MARCO), Semiconductor Research Corporation
- Published
- 2014
230. MobiStreams: A Reliable Distributed Stream Processing System for Mobile Devices
- Author
-
Huayong Wang, Li-Shiuan Peh, Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science, Wang, Huayong, and Peh, Li-Shiuan
- Subjects
Stream processing, business.industry, Computer science, Embedded system, Server, Cellular network, Mobile computing, Fault tolerance, business, Mobile device, Bottleneck, Computer network
- Abstract
Multi-core phones are now pervasive. Yet, existing applications rely predominantly on a client-server computing paradigm, using phones only as thin clients, sending sensed information via the cellular network to servers for processing. This makes the cellular network the bottleneck, limiting overall application performance. In this paper, we propose MobiStreams, a Distributed Stream Processing System (DSPS) that runs directly on smartphones. MobiStreams can offload computing from remote servers to local phones and thus alleviate the pressure on the cellular network. Implementing DSPS on smartphones faces significant challenges: 1) multiple phones can readily fail simultaneously, and 2) the phones' ad-hoc WiFi network has low bandwidth. MobiStreams tackles these challenges through two new techniques: 1) token-triggered checkpointing, and 2) broadcast-based checkpointing. Our evaluations driven by two real world applications deployed in the US and Singapore show that migrating from a server platform to a smartphone platform eliminates the cellular network bottleneck, leading to 0.78~42.6X throughput increase and 10%~94.8% latency decrease. Also, MobiStreams' fault tolerance scheme increases throughput by 230% and reduces latency by 40% vs. prior state-of-the-art fault-tolerant DSPSs.
- Published
- 2014
231. Locality-oblivious cache organization leveraging single-cycle multi-hop NoCs
- Author
-
Woo-Cheol Kwon, Li-Shiuan Peh, Tushar Krishna, Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science, Kwon, Woo Cheol, Peh, Li-Shiuan, and Krishna, Tushar
- Subjects
Computer science, Cache coloring, Distributed computing, Multiprocessing, Cache-oblivious algorithm, Cache pollution, Hop (networking), Cache invalidation, Cache algorithms, Single cycle, Snoopy cache, Hardware_MEMORYSTRUCTURES, business.industry, MESI protocol, Locality, General Medicine, Computer Graphics and Computer-Aided Design, Smart Cache, Network on a chip, Shared memory, Bus sniffing, Page cache, Cache, business, Software, Cache coherence, Computer network
- Abstract
Locality has always been a critical factor in on-chip data placement on CMPs as accessing further-away caches has in the past been more costly than accessing nearby ones. Substantial research on locality-aware designs has thus focused on keeping a copy of the data private. However, this complicates the problem of data tracking and search/invalidation; tracking the state of a line at all on-chip caches at a directory or performing full-chip broadcasts are both non-scalable and extremely expensive solutions. In this paper, we make the case for Locality-Oblivious Cache Organization (LOCO), a CMP cache organization that leverages the on-chip network to create virtual single-cycle paths between distant caches, thus redefining the notion of locality. LOCO is a clustered cache organization, supporting both homogeneous and heterogeneous cluster sizes, and provides near single-cycle accesses to data anywhere within the cluster, just like a private cache. Globally, LOCO dynamically creates a virtual mesh connecting all the clusters, and performs an efficient global data search and migration over this virtual mesh, without having to resort to full-chip broadcasts or perform expensive directory lookups. Trace-driven and full system simulations running SPLASH-2 and PARSEC benchmarks show that LOCO improves application run time by up to 44.5% over baseline private and shared cache.
Semiconductor Research Corporation, United States. Defense Advanced Research Projects Agency (Semiconductor Technology Advanced Research Network)
- Published
- 2014
232. Modeling reaction time within a traffic simulation model
- Author
-
Yan Xu, Harish Loganathan, Moshe Ben-Akiva, Kakali Basak, Seth N. Hetu, Zhemin Li, Carlos Lima Azevedo, Runmin Xu, Tomer Toledo, Li-Shiuan Peh, Massachusetts Institute of Technology. Department of Civil and Environmental Engineering, Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science, Xu, Runmin, Peh, Li-Shiuan, and Ben-Akiva, Moshe E.
- Subjects
Engineering, Event (computing), Mental chronometry, business.industry, Traffic conditions, Real-time computing, Process (computing), Traffic simulation, Pedestrian, Set (psychology), business, Simulation, Term (time)
- Abstract
Human reaction time has a substantial effect on modeling of human behavior at a microscopic level. Drivers and pedestrians do not react to an event instantaneously; rather, they take time to perceive the event, process the information, decide on a response and finally enact their decision. All these processes introduce delay. As human movement is simulated at increasingly fine-grained resolutions, it becomes critical to consider the delay due to reaction time if one is to achieve accurate results. Most existing simulators over-simplify the reaction time implementation to reduce computational overhead and memory requirements. In this paper, we detail the framework which we are developing within the SimMobility Short Term Simulator (a microscopic traffic simulator), which is capable of explicitly modeling reaction time for each person in a detailed, flexible manner. This framework will enable modelers to set realistic reaction time values, relying on the simulator to handle implementation and optimization considerations. Following this, we report our findings demonstrating the impact of reaction time on traffic dynamics within several simulation scenarios. The findings indicate that incorporating reaction time within microscopic simulations improves the traffic dynamics and produces more realistic traffic conditions.
Singapore-MIT Alliance for Research and Technology
- Published
- 2013
233. SWIFT: A Low-Power Network-On-Chip Implementing the Token Flow Control Router Architecture With Swing-Reduced Interconnects
- Author
-
Tushar Krishna, Li-Shiuan Peh, Patrick Chiang, Jacob Postman, Christopher Douglas Edmonds, Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science, Krishna, Tushar, and Peh, Li-Shiuan
- Subjects
Flow control (data), Engineering, Interconnection, business.industry, Integrated circuit design, Hardware_PERFORMANCEANDRELIABILITY, Chip, Network on a chip, Hardware and Architecture, Embedded system, Low-power electronics, Datapath, Hardware_INTEGRATEDCIRCUITS, Electrical and Electronic Engineering, Power network design, business, Software
- Abstract
A 64-bit, 8 × 8 mesh network-on-chip (NoC) is presented that uses both new architectural and circuit design techniques to improve on-chip network energy-efficiency, latency, and throughput. First, we propose token flow control, which enables bypassing of flit buffering in routers, thereby reducing buffer size and their power consumption. We also incorporate reduced-swing signaling in on-chip links and crossbars to minimize datapath interconnect energy. The 64-node NoC is experimentally validated with a 2 × 2 test chip in 90 nm, 1.2 V CMOS that incorporates traffic generators to emulate the traffic of the full network. Compared with a fully synthesized baseline 8 × 8 NoC architecture designed to meet the same peak throughput, the fabricated prototype reduces network latency by 20% under uniform random traffic, when both networks are run at their maximum operating frequencies. When operated at the same frequencies, the SWIFT NoC reduces network power by 38% and 25% at saturation and low loads, respectively.
- Published
- 2013
234. Low cost crowd counting using audio tones
- Author
-
Pravein Govindan Kannan, A.L. Ananda, Mun Choon Chan, Seshadri Padmanabha Venkatagiri, Li-Shiuan Peh, Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science, and Peh, Li-Shiuan
- Subjects
business.industry, Computer science, Embedded system, Real-time computing, Scalability, Android (operating system), business, Audio signal processing, computer.software_genre, Mobile device, computer, Crowd counting, Efficient energy use
- Abstract
With mobile devices becoming ubiquitous, collaborative applications have become increasingly pervasive. In these applications, there is a strong need to obtain a count of the number of mobile devices present in an area, as it closely approximates the size of the crowd. Ideally, a crowd counting solution should be easy to deploy, scalable, energy efficient, be minimally intrusive to the user and reasonably accurate. Existing solutions using data communication or RFID do not meet these criteria. In this paper, we propose a crowd counting solution based on audio tones, leveraging the microphones and speaker phones that are commonly available on most phones, tackling all the above criteria. We have implemented our solution on 25 Android phones and run several experiments at a bus stop, aboard a bus, within a cafeteria and a classroom. Experimental evaluations show that we are able to achieve up to 90% accuracy and consume 81% less energy than the WiFi interface in idle mode.
- Published
- 2012
235. Approaching the theoretical limits of a mesh NoC with a 16-node chip prototype in 45nm SOI
- Author
-
Sunghyun Park, Chia-Hsin Chen, Li-Shiuan Peh, Tushar Krishna, Anantha P. Chandrakasan, Bhavya K. Daya, Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science, Peh, Li-Shiuan, Park, Sunghyun, Krishna, Tushar, Chen, Chia-Hsin, Daya, Bhavya Kishor, and Chandrakasan, Anantha P.
- Subjects
Engineering, Multi-core processor, Multicast, business.industry, ComputerSystemsOrganization_COMPUTER-COMMUNICATIONNETWORKS, Bandwidth cap, Hardware_PERFORMANCEANDRELIABILITY, Chip, Microarchitecture, Network on a chip, CMOS, Embedded system, Hardware_INTEGRATEDCIRCUITS, Latency (engineering), business
- Abstract
In this paper, we present a case study of our chip prototype of a 16-node 4×4 mesh NoC fabricated in 45nm SOI CMOS that aims to simultaneously optimize energy-latency-throughput for unicasts, multicasts and broadcasts. We first define and analyze the theoretical limits of a mesh NoC in latency, throughput and energy, then describe how we approach these limits through a combination of microarchitecture and circuit techniques. Our 1.1V 1GHz NoC chip achieves 1-cycle router-and-link latency at each hop and energy-efficient router-level multicast support, delivering 892Gb/s (87.1% of the theoretical bandwidth limit) at 531.4mW for a mixed traffic of unicasts and broadcasts. Through this fabrication, we derive insights that help guide our research, and we believe, will also be useful to the NoC and multicore research community.
- Published
- 2012
236. DSENT - A Tool Connecting Emerging Photonics with Electronics for Opto-Electronic Networks-on-Chip Modeling
- Author
-
Anant Agarwal, Vladimir Stojanovic, Lan Wei, Chia-Hsin Owen Chen, Jason Miller, Li-Shiuan Peh, George Kurian, Chen Sun, Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science, Sun, Chen, Chen, Chia-Hsin, Kurian, George, Wei, Lan, Miller, Jason E., Agarwal, Anant, Peh, Li-Shiuan, and Stojanovic, Vladimir
- Subjects
Computer Science::Hardware Architecture, Network on a chip, Computer science, Design space exploration, business.industry, Photonic integrated circuit, Bandwidth (signal processing), Electronic engineering, Electronics, Photonics, Power network design, business, Electrical engineering technology
- Abstract
With the rise of many-core chips that require substantial bandwidth from the network on chip (NoC), integrated photonic links have been investigated as a promising alternative to traditional electrical interconnects. While numerous opto-electronic NoCs have been proposed, evaluations of photonic architectures have thus far had to use a number of simplifications, reflecting the need for a modeling tool that accurately captures the tradeoffs for the emerging technology and its impacts on the overall network. In this paper, we present DSENT, a NoC modeling tool for rapid design space exploration of electrical and opto-electrical networks. We explain our modeling framework and perform an energy-driven case study, focusing on electrical technology scaling, photonic parameters, and thermal tuning. Our results show the implications of different technology scenarios and, in particular, the need to reduce laser and thermal tuning power in a photonic network due to their non-data-dependent nature.
United States. Defense Advanced Research Projects Agency, National Science Foundation (U.S.), Focus Center Research Program, Microelectronics Advanced Research Corporation (MARCO). Interconnect Focus Center, Singapore-MIT Alliance for Research and Technology Center. Low Energy Electronic Systems, United States. National Security Agency. Trusted Access Program Office, Intel Corporation, APIC Corporation, MIT Center for Integrated Circuits and Systems
- Published
- 2012
- Full Text
- View/download PDF
237. SignalGuru
- Author
-
Margaret Martonosi, Li-Shiuan Peh, Emmanouil Koukoumidis, Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory, and Peh, Li-Shiuan
- Subjects
Service (systems architecture), Schedule, business.industry, Orientation (computer vision), Computer science, Mobile phone, Embedded system, Real-time computing, Window (computing), Filter (signal processing), business, Signal, Intersection (aeronautics)
- Abstract
While traffic signals are necessary to safely control competing flows of traffic, they inevitably enforce a stop-and-go movement pattern that increases fuel consumption, reduces traffic flow and causes traffic jams. These side effects can be alleviated by providing drivers and their onboard computational devices (e.g., vehicle computer, smartphone) with information about the schedule of the traffic signals ahead. Based on when the signal ahead will turn green, drivers can then adjust speed so as to avoid coming to a complete halt. Such information is called Green Light Optimal Speed Advisory (GLOSA). Alternatively, the onboard computational device may suggest an efficient detour that will save the driver from stops and long waits at red lights ahead. This paper introduces and evaluates SignalGuru, a novel software service that relies solely on a collection of mobile phones to detect and predict the traffic signal schedule, enabling GLOSA and other novel applications. SignalGuru leverages windshield-mounted phones to opportunistically detect current traffic signals with their cameras, collaboratively communicate and learn traffic signal schedule patterns, and predict their future schedule. Results from two deployments of SignalGuru, using iPhones in cars in Cambridge (MA, USA) and Singapore, show that traffic signal schedules can be predicted accurately. On average, SignalGuru comes within 0.66 s for pre-timed traffic signals and within 2.45 s for traffic-adaptive traffic signals. Feeding SignalGuru's predicted traffic schedule to our GLOSA application, our vehicle fuel consumption measurements show savings of 20.3%, on average.
National Science Foundation (U.S.). (Grant number CSR-EHS-0615175), Singapore-MIT Alliance for Research and Technology Center. Future Urban Mobility
- Published
- 2011
- Full Text
- View/download PDF
238. Enabling system-level modeling of variation-induced faults in networks-on-chips
- Author
-
Li-Shiuan Peh, Chia-Hsin Owen Chen, Konstantinos Aisopos, Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science, Peh, Li-Shiuan, Aisopos, Konstanti, and Chen, Chia-Hsin
- Subjects
Router, Engineering, System level modeling, business.industry, Reliability (computer networking), Hardware_PERFORMANCEANDRELIABILITY, Variation (game tree), Fault (power engineering), Reliability engineering, Data modeling, Process variation, Network on a chip, Embedded system, business
- Abstract
Process Variation (PV) is increasingly threatening the reliability of Networks-on-Chips. Thus, various resilient router designs have been recently proposed and evaluated. However, these evaluations assume random fault distributions, which result in 52%–81% inaccuracy. We propose an accurate circuit-level fault-modeling tool, which can be plugged into any system-level NoC simulator, quantify the system-level impact of PV-induced faults at runtime, pinpoint fault-prone router components that should be protected, and accurately evaluate alternative resilient multi-core designs.
GigaScale Systems Research Center, Focus Center Research Program. Focus Center for Circuit & System Solutions. Semiconductor Research Corporation. Interconnect Focus Center
- Published
- 2011
- Full Text
- View/download PDF
239. RegReS: Adaptively maintaining a target density of regional services in opportunistic vehicular networks
- Author
-
Margaret Martonosi, Li-Shiuan Peh, Emmanouil Koukoumidis, Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science, and Peh, Li-Shiuan
- Subjects
Service (business), Vehicular ad hoc network, Computer science, business.industry, Wireless ad hoc network, Node (networking), InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL, Testbed, Mobile ad hoc network, Middleware, ComputerApplications_GENERAL, Location-based service, business, GeneralLiterature_REFERENCE(e.g.,dictionaries,encyclopedias,glossaries), ComputingMilieux_MISCELLANEOUS, Computer network
- Abstract
Pervasive vehicle-mounted mobile devices are increasingly common, and can be viewed as a large-scale ad hoc network on which collaborative, location-based services can be directly supported. In order to support such services within a geographic region, a certain number of computational, storage and sensing mobile devices need to be carriers of the services. This paper introduces and evaluates Region-Resident Services (RegReS), a middleware that supports such regional services by maintaining, in a fully distributed fashion, a targeted density of service carriers. Carriers collaborate opportunistically to estimate the current service density in the region and coordinate the spawning of new service carriers when necessary. Unlike previous approaches that are static, RegReS adapts to dynamic conditions such as node speed, effectively maintaining the targeted density of service carriers in highly volatile vehicular networks. Results from the ORBIT testbed, using synthetic and real bus mobility traces, show that RegReS adapts to different system configurations, preserving the desired service density with less than 16% mean absolute error. We deployed an outdoor collaborative parking availability service atop RegReS and demonstrated RegReS’s ability to maintain the target service density with only 10% error.
- Published
- 2011
- Full Text
- View/download PDF
240. ORION 2.0: A Power-Area Simulator for Interconnection Networks
- Author
-
Bin Li, Li-Shiuan Peh, Andrew B. Kahng, Kambiz Samadi, Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science, and Peh, Li-Shiuan
- Subjects
Interconnection, Engineering, Multi-core processor, Design space exploration, business.industry, Integrated circuit design, Hardware_PERFORMANCEANDRELIABILITY, ComputerSystemsOrganization_PROCESSORARCHITECTURES, Power (physics), Network on a chip, Hardware and Architecture, Logic gate, Embedded system, Scalability, Hardware_INTEGRATEDCIRCUITS, Electrical and Electronic Engineering, business, Software, Simulation
- Abstract
As industry moves towards multicore chips, networks-on-chip (NoCs) are emerging as the scalable fabric for interconnecting the cores. With power now the first-order design constraint, early-stage estimation of NoC power has become crucially important. In this work, we present ORION 2.0, an enhanced NoC power and area simulator, which offers significant accuracy improvement relative to its predecessor, ORION 1.0.
GigaScale Systems Research Center
- Published
- 2011
241. Adaptive Spatiotemporal Node Selection in Dynamic Networks
- Author
-
Pradip Hari, Marcus Henry, Margaret Martonosi, Kevin Ko, Ulrich Kremer, Emmanouil Koukoumidis, John B. P. McCabe, Li-Shiuan Peh, Jonathan Banafato, Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science, and Peh, Li-Shiuan
- Subjects
Dynamic network analysis, Exploit, Java, Computer science, Wireless ad hoc network, Node (networking), Distributed computing, Process (computing), Location awareness, Compiler, computer.software_genre, computer, computer.programming_language
- Abstract
Dynamic networks - spontaneous, self-organizing groups of devices - are a promising new computing platform. Writing applications for such networks is a daunting task, however, due to their extreme variability and unpredictability, with many devices having significant resource limitations. Intelligent, automated distribution of work across network nodes is needed to get the most out of limited resource budgets. We propose a novel framework for distributing computations across a dynamic network, in which applications specify their spatiotemporal properties at a very high level. The underlying system makes node selection decisions to exploit these properties, producing high quality results within a fixed resource budget. A distributed computation is expressed as a semantically parallel loop over a geographic area and time period. Feedback from the application about the quality of node selection decisions is used to guide future decisions, even while the loop is still in progress. This simplifies the process of writing dynamic network applications by allowing programmers to focus on the goals of their applications, rather than on the topology and environment of the network. Our framework implementation consists of extensions to the Java language, a compiler for this extended language, and a run-time system that work together to provide a simple, powerful architecture for dynamic network programming. We evaluate our system using 11 Nokia N810 tablet PC devices and 14 Neo FreeRunner (Openmoko) smartphones, as well as a simulation environment that models the behavior of up to 500 devices. 
For three representative applications, we obtain significant improvements in the number of useful results obtained when compared with baseline node selection algorithms: up to 745% (measured), 117% (simulated) for an Amber Alert application; 38% (measured), 142% (simulated) for a Bird Tracking application; and 86% (measured), 209% (simulated) for a Crowd Estimation application.
National Science Foundation (U.S.) (CNS-EHS #0615175), National Science Foundation (U.S.) (#0614949)
- Published
- 2010
242. ORION 2.0: A Fast and Accurate NoC Power and Area Model for Early-Stage Design Space Exploration
- Author
-
A.B. Kahng, Bin Li, Li-Shiuan Peh, K. Samadi, Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science, and Peh, Li-Shiuan
- Subjects
ComputingMilieux_MISCELLANEOUS
- Abstract
As industry moves towards many-core chips, networks-on-chip (NoCs) are emerging as the scalable fabric for interconnecting the cores. With power now the first-order design constraint, early-stage estimation of NoC power has become crucially important. ORION [29] was amongst the first NoC power models released, and has since been fairly widely used for early-stage power estimation of NoCs. However, when validated against recent NoC prototypes – the Intel 80-core Teraflops chip and the Intel Scalable Communications Core (SCC) chip – we saw significant deviation that can lead to erroneous NoC design choices. This prompted our development of ORION 2.0, an extensive enhancement of the original ORION models which includes completely new subcomponent power models, area models, as well as improved and updated technology models. Validation against the two Intel chips confirms a substantial improvement in accuracy over the original ORION. A case study with these power models plugged within the COSI-OCC NoC design space exploration tool [23] confirms the need for, and value of, accurate early-stage NoC power estimation. To ensure the longevity of ORION 2.0, we will be releasing it wrapped within a semi-automated flow that automatically updates its models as new technology files become available.
- Published
- 2009
243. Detectability of active triangulation range finder: a solar irradiance approach.
- Author
-
Liu H, Gao J, Bui VP, Liu Z, Lee KE, Peh LS, and Png CE
- Abstract
Active triangulation range finders are widely used in a variety of applications such as robotics and assistive technologies. The power of the laser source should be carefully selected in order to satisfy detectability and still remain eye-safe. In this paper, we present a systematic approach to assess the detectability of an active triangulation range finder in an outdoor environment. For the first time, we accurately quantify the background noise of a laser system due to solar irradiance by coupling the Perez all-weather sky model and ray tracing techniques. The model is validated with measurements with a modeling error of less than 14.0%. Being highly generic and sufficiently flexible, the proposed model serves as a guide to define a laser system for any geographical location and microclimate.
- Published
- 2016
- Full Text
- View/download PDF
244. SARANA: language, compiler and run-time system support for spatially aware and resource-aware mobile computing.
- Author
-
Hari P, Ko K, Koukoumidis E, Kremer U, Martonosi M, Ottoni D, Peh LS, and Zhang P
- Subjects
- Systems Integration, Computer Communication Networks trends, Microcomputers trends, Programming Languages, Signal Processing, Computer-Assisted, Software, User-Computer Interface
- Abstract
Increasingly, spatial awareness plays a central role in many distributed and mobile computing applications. Spatially aware applications rely on information about the geographical position of compute devices and their supported services in order to support novel functionality. While many spatial application drivers already exist in mobile and distributed computing, very little systems research has explored how best to program these applications, to express their spatial and temporal constraints, and to allow efficient implementations on highly dynamic real-world platforms. This paper proposes the SARANA system architecture, which includes language and run-time system support for spatially aware and resource-aware applications. SARANA allows users to express spatial regions of interest, as well as trade-offs between quality of result (QoR), latency and cost. The goal is to produce applications that use resources efficiently and that can be run on diverse resource-constrained platforms ranging from laptops to personal digital assistants and to smart phones. SARANA's run-time system manages QoR and cost trade-offs dynamically by tracking resource availability and locations, brokering usage/pricing agreements and migrating programs to nodes accordingly. A resource cost model permeates the SARANA system layers, permitting users to express their resource needs and QoR expectations in units that make sense to them. Although we are still early in the system development, initial versions have been demonstrated on a nine-node system prototype.
- Published
- 2008
- Full Text
- View/download PDF