In-Memory and In-Storage Computing with Emerging Technologies

    The First International Workshop on

    In-Memory and In-Storage Computing with Emerging Technologies

    Haifa, Israel, September 11, 2016

    In conjunction with the 25th International Conference on Parallel Architectures and Compilation Techniques (PACT 2016)




    Most contemporary high-performance computer systems are based on the classical von Neumann architecture. It is widely recognised that such an architecture suffers from the CPU-memory bottleneck, also known as the von Neumann bottleneck. The problem affects both the performance and the power efficiency of multicore and manycore architectures. With the continuation of CPU scaling (driven by Moore's law and parallelization), the von Neumann bottleneck will become even more acute.
    Another major factor affecting today's high-performance computing is the slowdown in the scaling of traditional charge-based memories such as DRAM and NAND Flash. In response, many novel nano-devices and materials are under investigation as alternatives to charge-based memory, including memristors, RRAM, PCM, 3D XPoint, STT-MRAM and others. These technologies are CMOS-compatible and non-volatile, with zero standby power, nanosecond switching speeds, excellent scalability and high density. More importantly, they have a wide range of potential applications, including non-volatile memory, solid-state disks, digital computing, neuromorphic computing, and more.
    This workshop will discuss the use of emerging technologies as an enabler of a next generation of architectures that address the major shortcomings of today's conventional high-performance computing, such as latency, energy consumption, power efficiency and scalability. Different architectures and their implications for computing systems and software will be discussed. These architectures include:
    •    Memory-intensive architectures that use massive on-die memory physically stacked on logic die.
    •    Solid State Disks (SSD) with in-storage computing capabilities, allowing implementation of a wide class of computing algorithms internally in SSD.
    •    Computation-In-Memory architectures where both the computation and storage are integrated in the same physical location.
    •    New computing paradigms, such as quantum computing.


    8:30-9:15    Keynote: Uri Weiser – Technion
       Location, Location, Location – Where Accelerators Should Reside?
    9:20-9:40    Rotem Ben-Hur, Nishil Talati, Nimrod Wald, and Shahar Kvatinsky – Technion
       Memory Processing Unit for In-Memory Processing
    9:40-10:00    HoangAnh DuNguyen, Lei Xie, Mottaqiallah Taouil, Said Hamdioui, and Koen Bertels – TU Delft
       CIM Architecture Communication Schemes
    10:00-10:30    Break
    10:30-11:15    Keynote: Engin Ipek – University of Rochester
       Accelerator Design with Emerging Memory Technologies
    11:20-11:40    Dietmar Fey and Jonas Schmitt – FAU Erlangen-Nürnberg
       Evaluating Ternary Adders using a Hybrid Memristor/CMOS Approach
    11:40-12:00    Roman Kaplan, Leonid Yavits, Uri Weiser, and Ran Ginosar – Technion
       An In-Storage Implementation of Smith-Waterman in Resistive CAM
    12:00-12:20    Jintao Yu, Adib Haron, Razvan Nane, Said Hamdioui, and Koen Bertels – TU Delft, and Henk Corporaal – TU Eindhoven
       Hardware Reuse for Skeleton-based Implementation of Computation-In-Memory Architecture
    12:30-13:50    Lunch


    Engin Ipek – University of Rochester, Accelerator Design with Emerging Memory Technologies

    DRAM is facing severe scalability challenges due to precise charge placement and sensing hurdles in deep-submicron geometries. Resistive memories, such as phase-change memory (PCM), resistive RAM (RRAM), and spin-torque transfer magnetoresistive RAM (STT-MRAM), hold the potential to scale well beyond DRAM and are promising DRAM replacements. Although the near term application of these technologies will likely be in main memory and storage, their electrical properties also make it possible to design qualitatively new methods of accelerating important classes of workloads.

    In this talk, I will first examine high-performance associative compute engines that combine two powerful capabilities: associative search and processing in memory. Implementations of these engines using PCM and STT-MRAM will be described. I will then present a hardware accelerator for large-scale combinatorial optimization and deep learning based on a memristive Boltzmann machine. The accelerator exploits the electrical properties of RRAM to realize in situ, fine-grained parallel computation within the memory arrays, thereby eliminating the need for exchanging data between the memory cells and the computational units.
    Short Bio
    Engin Ipek is an Associate Professor of Electrical & Computer Engineering and of Computer Science at the University of Rochester. His research interests are in energy-efficient architectures, high-performance memory systems, and the application of emerging technologies to computer systems. Dr. Ipek received his B.S. (2003), M.S. (2007), and Ph.D. (2008) degrees from Cornell University, all in Electrical and Computer Engineering. Prior to joining the University of Rochester, he was a researcher in the computer architecture group at Microsoft Research (2007-2009). His work has been recognized by the 2014 IEEE Computer Society TCCA Young Computer Architect Award, an HPCA 2016 best paper award, two IEEE Micro Top Picks awards, an ASPLOS 2010 best paper award, an NSF CAREER award, and an invited Communications of the ACM research highlights article.


    Uri Weiser – Technion, Location, Location, Location – Where Accelerators Should Reside?

    The era of heterogeneous systems and Big Data computing is already here. Handling huge amounts of data poses new challenges in data processing and in the effective usage of memory, caches, heterogeneous structures (accelerators) and available bandwidth. In addition, the computing requirements of Big Data are unique; on many occasions the storage-access bandwidth per processing operation is substantial (i.e., a high storage-bytes-per-instruction ratio), which presents new challenges and opportunities for computer architects. Increasing performance requirements are driving the industry towards parallel execution of threads (or tasks) on energy-efficient (e.g. heterogeneous) computing engines.
    In this talk we'll present some of the research being performed at the Technion related to the effective usage of caches and heterogeneous systems in multi-tasking environments, and we'll offer a scent of a remedy for the situation. In addition, we will present the case of non-temporal memory accesses in Big Data environments and suggest some hopefully stimulating solutions.

    Short Bio
    Uri Weiser is a Professor Emeritus in the Electrical Engineering department at the Technion IIT and is on the advisory boards of numerous startups. Weiser's main research concentrates on computer system architecture, e.g. memory subsystems, Big Data architectures, machine learning for architecture, heterogeneous architectures and more.
    He received his bachelor's and master's degrees in EE from the Technion and his Ph.D. in CS from the University of Utah, Salt Lake City.
    Professor Weiser worked at Intel from 1988 to 2006. At Intel, Weiser initiated the definition of the Pentium® processor, drove the definition of Intel's MMX™ technology, invented the Trace Cache, co-managed the new Intel Microprocessor Design Center in Austin, Texas, and formed an advanced media applications research activity.
    Prior to his career at Intel, Professor Weiser worked for the Israeli Department of Defense as a research and system engineer, and later with the National Semiconductor Design Center in Israel, where he led the design of the NS32532 microprocessor.
    Weiser was appointed an Intel Fellow in 1996; in 2002 he became an IEEE Fellow, in 2005 an ACM Fellow, and in 2016 he was awarded the IEEE/ACM Eckert-Mauchly Award. Weiser has delivered more than a hundred keynotes and distinguished lectures.


    Dietmar Fey
    Evaluating Ternary Adders using a Hybrid Memristor/CMOS Approach

    Dietmar Fey studied Computer Science at Friedrich-Alexander-University Erlangen-Nürnberg (FAU) from 1981 to 1987. In 1992 he received a Ph.D. from FAU with a thesis on optical computing architectures. From 1994 to 1999 he worked as a scientific assistant at the University of Jena in Germany, where he also completed his Habilitation in Computer Engineering. From 1999 to 2000 he was a lecturer at the University of Siegen. In 2000 Dietmar Fey became Professor for Computer Engineering at the University of Jena, and in 2001 he received an appointment as Professor for Embedded Systems at the University of Freiburg. Since 2009 he has held the Chair for Computer Architecture at FAU, and since October 2015 he has been Head of the Department of Computer Science at FAU. His research interests are focused on parallel computing architectures, parallel embedded systems, HPC, and the use of new technologies, such as memristors, for new processor architectures.

    Jintao Yu
    Hardware Reuse for Skeleton-based Implementation of Computation-In-Memory Architecture
    & CIM Architecture Communication Schemes

    Jintao Yu received the B.S. degree from Tsinghua University, Beijing, China, in 2010, and the M.S. degree from the National Digital Switching System Engineering & Technological Research Center (NDSC), Zhengzhou, China, in 2013. He is currently pursuing the Ph.D. degree at Delft University of Technology, Delft, the Netherlands. His current research interests include high-level synthesis and memristor-based computation in memory.

    Roman Kaplan
    An In-Storage Implementation of Smith-Waterman in Resistive CAM

    Roman Kaplan received his B.Sc. and M.Sc. degrees from the Technion. He is currently a Ph.D. candidate at the Technion, working with Prof. Ran Ginosar and Dr. Leonid Yavits on in-storage processing.

    Rotem Ben Hur
    Memory Processing Unit for In-Memory Processing

    Rotem Ben Hur received the B.Sc. degree in electrical engineering from the Technion – Israel Institute of Technology in 2014. She is currently pursuing the M.Sc. degree in Electrical Engineering at the Technion.  Her main area of research interest is in-memory computing using emerging non-volatile memory technologies.


    • Albert Cohen, INRIA
    • Mattan Erez, UT Austin
    • Dietmar Fey, FAU
    • Said Hamdioui, TU Delft
    • Engin Ipek, University of Rochester
    • Onur Mutlu, CMU
    • Moin Qureshi, Georgia Tech
    • Ronny Ronen, Intel
    • Uri Weiser, Technion
    • Yuan Xie, UC Santa Barbara