ISCA-33 Tutorial Call for Participation:
Using the Pin Instrumentation Tool for Computer
Architecture Research
Saturday, June 17, 2006
Time | Talks
1:30 – 2:15 | “Introduction to Pin” by Chi-Keung Luk (CK)
2:15 – 2:45 | “Micro-architecture Studies Using Pin: Branch Predictors, Caches, & Simple Timing Models” by Aamer Jaleel
2:45 – 3:30 | “Techniques for Speeding Up Pin Based Simulations” by Harish Patil
3:30 – 4:00 | Break
4:00 – 4:30 | “Performance Optimization of Pin Tools” by Chi-Keung Luk (CK)
4:30 – 5:00 | “Fault Analysis Using Pin” by Srilatha Manne (Bobbie)
ABSTRACT
Tired of waiting for your
simulations to get done? Wish you could simulate
billions of instructions an hour? Wish you had ready-to-use infrastructure that
could help you understand the performance bottlenecks of emerging/complex
workloads (e.g. Oracle, Java)? If you
answered yes, then this is the venue for YOU.
In this tutorial, we will
show that the binary instrumentation tool, Pin, is a great candidate for
computer architecture studies such as performance modeling, trace generation,
and fault tolerance. We demonstrate the
use of Pin to create simple performance models without resorting to the
complexities of building detailed performance models. We show that Pin’s robust design makes it
easy to conduct performance studies of simple workloads (e.g. SPEC2000,
BioBench), multi-threaded workloads (e.g. SPLASH, SPECOMP, NAS, BioParallel) or
even complex server workloads like commercial database applications (e.g.
Oracle). Since instrumentation is
typically much faster than simulation, we show that Pin allows for running
workloads to completion without the long waiting time associated with detailed
performance models. This makes it practical to
study the overall program behavior of different
workloads.
Besides performance studies,
Pin also lets users conduct reliability studies. Since Pin provides access to
architecture-specific details, users can inject faults and study in great
detail how a fault propagates into different portions of the
instruction/data stream.
Pin is publicly available for
four architectures (IA32, EM64T, Itanium, XScale) and three operating systems
(Linux, Windows, MacOS).
AGENDA
This tutorial consists of
four presentations. The first
presentation provides an introduction to the Pin API and the basic concepts of
writing instrumentation tools in Pin. Detailed
instructions on writing instrumentation tools useful for architecture research
are also presented. The next three
presentations consist of research projects that use Pin.
Introduction To Pin, CK Luk, Intel
The primary goal of Pin is to
provide easy-to-use, portable, transparent, and efficient instrumentation.
Instrumentation tools, called Pin tools,
are written in C/C++ using Pin's rich API. Pin follows
the model of ATOM, allowing the tool writer to analyze an application at the
instruction level without the need for detailed knowledge of the underlying
instruction set. The API is designed to be architecture independent whenever
possible, making Pin tools source
compatible across different architectures.
Cache Performance of Emerging Workloads on CMPs, Aamer Jaleel, Intel
Chip-multiprocessors (CMPs) have
become the next attractive point in the design space of future high performance
microprocessors. There is a growing need for simulation methodologies to
determine the memory system requirements of emerging parallel workloads in a
reasonable amount of time. This session of the tutorial demonstrates the use of
Pin as an alternative to execution-driven and trace-driven simulation
methodologies. We present the implementation of a cache simulation Pin tool:
CMP$im. We show that CMP$im can be used
to understand the overall memory behavior of
a workload and to characterize its instruction profile, cache performance, and
data-sharing behavior at speeds of 4-10 MIPS.
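To make the idea of a cache simulation tool concrete, here is a minimal sketch of the kind of cache model such a tool might keep. It is not the CMP$im implementation: the class name `CacheModel`, its methods, and the LRU replacement policy are all assumptions chosen for illustration. In an actual Pin tool, a routine like `access` would presumably be called from the analysis code attached to each memory instruction; here it is plain C++ with no dependency on Pin.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical set-associative cache model with LRU replacement.
// Illustrative only; not taken from CMP$im.
class CacheModel {
public:
    CacheModel(std::size_t numSets, std::size_t ways, std::size_t lineBytes)
        : numSets_(numSets), ways_(ways), lineBytes_(lineBytes),
          lines_(numSets * ways) {
        assert(numSets > 0 && ways > 0 && lineBytes > 0);
    }

    // Simulate one memory reference; returns true on a hit.
    bool access(std::uint64_t addr) {
        ++tick_;
        const std::uint64_t lineAddr = addr / lineBytes_;
        const std::size_t set = static_cast<std::size_t>(lineAddr % numSets_);
        const std::uint64_t tag = lineAddr / numSets_;

        Line* base = &lines_[set * ways_];
        Line* victim = base;
        for (std::size_t w = 0; w < ways_; ++w) {
            Line& line = base[w];
            if (line.valid && line.tag == tag) {
                line.lastUse = tick_;  // refresh LRU timestamp
                ++hits_;
                return true;
            }
            // Prefer the first invalid way; otherwise the least recently used.
            if (!line.valid) {
                if (victim->valid) victim = &line;
            } else if (victim->valid && line.lastUse < victim->lastUse) {
                victim = &line;
            }
        }
        victim->valid = true;
        victim->tag = tag;
        victim->lastUse = tick_;
        ++misses_;
        return false;
    }

    std::uint64_t hits() const { return hits_; }
    std::uint64_t misses() const { return misses_; }

private:
    struct Line {
        bool valid = false;
        std::uint64_t tag = 0;
        std::uint64_t lastUse = 0;
    };

    std::size_t numSets_, ways_, lineBytes_;
    std::vector<Line> lines_;
    std::uint64_t tick_ = 0, hits_ = 0, misses_ = 0;
};
```

Feeding every load and store address of a workload through `access` and reading `hits()` and `misses()` at the end gives a whole-program miss rate, which is the sort of measurement the session describes.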
Techniques for Speeding up Pin-based Simulation, Harish Patil, Intel
This
session of the tutorial discusses standalone simulation as well as interfacing
Pin with existing simulators such as SimpleScalar. Since detailed
whole-program simulation can be very slow, techniques for speeding up Pin-based
simulation using SimPoint/PinPoints are discussed. A Pin-based
interface to SimpleScalar.x86 (under development) is also discussed as a
case study.
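As a rough illustration of why sampling speeds up simulation: in a SimPoint-style methodology, only a few representative regions of the program are simulated in detail, each with a weight reflecting how much of the execution it stands for, and a whole-program metric is estimated as the weighted average. The sketch below assumes that scheme; the `SimulationPoint` type and `estimateCpi` function are hypothetical names, not part of SimPoint or PinPoints.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record for one representative region of execution.
struct SimulationPoint {
    double weight;  // fraction of the whole run this region represents
    double cpi;     // cycles per instruction measured on this region only
};

// Estimate whole-program CPI as the weight-normalized average of the
// per-region CPIs.
double estimateCpi(const std::vector<SimulationPoint>& points) {
    double weightSum = 0.0;
    double weightedCpi = 0.0;
    for (const SimulationPoint& p : points) {
        assert(p.weight >= 0.0);
        weightSum += p.weight;
        weightedCpi += p.weight * p.cpi;
    }
    assert(weightSum > 0.0);
    return weightedCpi / weightSum;
}
```

For example, three regions with weights 0.5, 0.25, and 0.25 and CPIs of 1.0, 2.0, and 4.0 give an estimate of 2.0, while only those three regions need detailed simulation.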
Understanding Software Fault Resilience, Bobbie Manne, Intel
Recently,
there has been much research on transient fault analysis. Much of this
work is at the RTL or architectural simulation level. This produces the
most accurate picture of the effect of the fault. However, the analysis
cannot cover more than a small section of the application run, and cannot
explore the impact of the fault beyond a few million instructions. This leads
to some conservative assumptions. For instance, if the fault propagates
to the memory subsystem, then it is assumed to be relevant if the faulty data
is not overwritten before the end of the analysis window. In some cases,
any fault that impacts the control flow is assumed to cause program disruption.
This
session of the tutorial presents Pin tools created to analyze
fault propagation behavior beyond the range of existing simulation
infrastructures by following the impact of a fault over the complete run of
the program. The Pin tools perform two tasks: (1) inject the fault into
an input, output, or architectural register (depending on where in hardware
the user wants to model the fault), and (2) analyze the
impact of the fault on the application. The tutorial describes the
methodology and subtleties associated with this analysis.
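The two tasks can be sketched in a few lines, under stated assumptions. The fault model here is a single-bit flip in a 64-bit register value, which is an assumption and not necessarily what the tutorial's tools use. The function `program` is a hypothetical stand-in for "the rest of the application", and `classify` compares a fault-free (golden) run against a faulty one.

```cpp
#include <cassert>
#include <cstdint>

// Task 1: inject a fault by flipping one bit of a register value.
// (Single-bit flip is an assumed fault model.)
std::uint64_t flipBit(std::uint64_t value, unsigned bit) {
    assert(bit < 64);
    return value ^ (std::uint64_t{1} << bit);
}

enum class Outcome { Masked, Corrupted };

// Hypothetical stand-in for the rest of the program: only the low
// 8 bits of the register influence the output.
std::uint64_t program(std::uint64_t reg) {
    return (reg & 0xFF) * 3;
}

// Task 2: analyze the impact by comparing the faulty run with the
// fault-free run.
Outcome classify(std::uint64_t reg, unsigned bit) {
    const std::uint64_t golden = program(reg);
    const std::uint64_t faulty = program(flipBit(reg, bit));
    return golden == faulty ? Outcome::Masked : Outcome::Corrupted;
}
```

In this toy setup, a flip in bit 40 is masked because `program` never reads that bit, while a flip in bit 2 changes the output. Running the comparison over the complete program, and not just a short window, is what lets such an analysis avoid assuming that every surviving faulty value matters.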
If you have any questions, please
feel free to contact:
Robert Cohn: robert.s.cohn@intel.com
Aamer Jaleel: aamer.jaleel@intel.com