CMC Tutorial


Introduction

With the HMCSim-2.0 Alpha release, we have included a very extensible feature that permits users to craft their own custom memory cube, or CMC, extensions to HMCSim.  The extensions are built in such a manner that the user does not need to modify or learn the internal implementation details of HMCSim.  Rather, we provide a very simple template that builds a shared library that is loaded at runtime.  This enables users to mix and match CMC operations and truly utilize HMCSim as a research tool.

The CMC functionality works using a two-stage approach.  First, users construct a simple implementation of a single CMC operation that is compiled into a shared library object (*.so).  This object provides all the arithmetic functionality required to perform or mimic the CMC operation.  Second, we provide a simple function interface and associated backend handlers in the HMCSim core library (libhmcsim.a) that permit users to load any CMC shared library object and call its arithmetic function (using dynamic loading and function pointers).
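To make the function-pointer side of this concrete, here is a minimal sketch of how a handler table could dispatch to user-provided operations.  This is not the HMCSim implementation: every name below (cmc_exec_fn, sketch_register, sketch_dispatch, sketch_add_op) and the table size of 128 are assumptions made for illustration, and the dynamic loading step is left out so the example stays self-contained.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handler type: one arithmetic function per CMC operation. */
typedef int (*cmc_exec_fn)(const uint64_t *rqst_payload, uint64_t *rsp_payload);

/* Assumed size of the command-code space for this sketch. */
#define SKETCH_MAX_CMD 128

/* Hypothetical registry, indexed by command code; empty slots are NULL. */
static cmc_exec_fn sketch_table[SKETCH_MAX_CMD];

/* Register a handler; fails if the slot is out of range or already taken. */
int sketch_register(uint32_t cmd, cmc_exec_fn fn)
{
    if (cmd >= SKETCH_MAX_CMD || fn == NULL || sketch_table[cmd] != NULL)
        return -1;
    sketch_table[cmd] = fn;
    return 0;
}

/* Dispatch a request to whichever handler was registered for its command. */
int sketch_dispatch(uint32_t cmd, const uint64_t *rqst, uint64_t *rsp)
{
    if (cmd >= SKETCH_MAX_CMD || sketch_table[cmd] == NULL)
        return -1;
    return sketch_table[cmd](rqst, rsp);
}

/* Example handler standing in for the function a user library would export. */
int sketch_add_op(const uint64_t *rqst, uint64_t *rsp)
{
    rsp[0] = rqst[0] + rqst[1];
    return 0;
}
```

In this sketch, registering the same command code twice is rejected, which mirrors the requirement that each loaded CMC operation use a unique command.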

Before we get into the details of implementing a new CMC operation, there are a few items to note:

  • The CMC functionality requires HMCSim version 2.0 Alpha+ (It will not function in the packet encoding/decoding space of version 1.0)
  • The CMC functionality does not require the user to understand the inner workings of HMCSim; however, it does require the user to write a small bit of code (but we provide a handy template).
  • The CMC user implementation does NOT need to maintain the BSD-style license of HMCSim.  It can be distributed separately from HMCSim (or kept private).
  • We make use of all of the unused Gen2 command codes for CMC, so we can support up to 70 concurrent user-provided CMC operations.
  • The HMCSim interface for CMC operations only requires one additional function call.  The HMCSim function interfaces are otherwise unchanged.
  • You will need to know a bit about HMC packet styles, but you don’t need to know everything.

Step 1: Get the Code

Now that we have some of the basics out of the way, we can get into the details of implementing a new CMC operation.  The first thing you need to do is pull a current version of the HMCSim code.  You can do so by following the instructions (for version 2.0 or greater) in the HMCSim tutorial.  Once you check out the code, you should note that the top-level Makefiles will automatically build any CMC libraries found in the ~/cmc/ directory.  You should also note that we provide a convenient template for new CMC implementations in ~/cmc/template/.

Step 2: Prepare the Tree

Change directories to the ~/cmc/ directory and add a new directory for your new operation.  For the purpose of this tutorial, we’ll refer to this as MY_CMC.  Once you have done so, copy the contents of the template directory to your new MY_CMC directory.

$> cd cmc
$> mkdir MY_CMC
$> cd MY_CMC
$> cp -R ../template/* ./

Step 3: Prepare the Makefile

Now that we have the template code in place, we need to modify the provided Makefile to reflect the name of our new CMC operation.  Note that the name you use here will be reflected in the eventual library name (e.g., our library will be libMY_CMC.so).  Edit the Makefile and change the value of the LIBNAME directive to your desired CMC operation name.

 

Step 4: Prepare the Global Variables

Now that we have our build structure in place, we can begin editing the one required source file, cmc.c.  This file provides all the state, functionality and data for an individual CMC operation.  There are two blocks of information in this file that we need to edit:

  1. Global variables
  2. Arithmetic function

The remainder of the functions and data in this file can be left unchanged, as they are written to be universal across CMC implementations.  In this section, we focus on modifying the global variable state.  Each of the global variables in the source file can easily be identified by the leading two underscores, e.g., __variable.  These variables hold the data that drives the CMC operation implementation and informs the libhmcsim handlers about the operation.  This is where you need to have some knowledge of the HMC packet formats.  We’ll do our best to make it easy and guide you through it.

  • __op_name: The first global variable that we need to modify is the __op_name variable.  This variable is a character string that holds the name of the respective CMC operation.  It is used by the HMCSim tracing functions to uniquely identify the operation in the logs.  This is much better than simply printing “CMC.”  Note that the maximum number of characters is 256, and we highly suggest you avoid using special characters and spaces (this makes parsing the logs easier).
  • __rqst: The second global variable assigns the enumerated CMC operation value for the respective implementation.  This is a command enum similar to “WR128” or “RD16”.  The tricky part is that these values must be unique across all the CMC libraries loaded into an application; they cannot be overloaded or reused within a single simulation.  The list of permissible values is given in the hmc_sim_types.h header file under the hmc_rqst_t enum table.  All the permissible CMC operations are denoted CMC**, where ** is a two-digit integer.  There are 70 unique values, so just pick one.
  • __cmd: The third global variable designates the integer command code that corresponds to the __rqst enum.  For example, if you chose CMC05, then your cmd code is “5”.
  • __rqst_len: The fourth global variable designates the length, in flits, of the request packet.  A “flit” is a 128-bit block of data in the packet.  All request packets are a minimum of 1 flit (64-bits of header data and 64-bits of tail data).  If your packet also includes additional data (such as data to write into memory or perform an atomic operation), you may increase the request length up to 17 flits.  A 17 flit request packet is equivalent to a 256 byte WRITE operation (WR256).
  • __rsp_len: The fifth global variable designates the length, in flits, of the response packet.  As mentioned above, a flit is a 128-bit block of data in the packet.  Remember, response packets are optional.  If the CMC operation is posted, then the response packet length is 0.  Otherwise, you may specify response packets of up to 17 flits.  A 17 flit response packet is equivalent to the response packet for a 256 byte READ operation (RD256).
  • __rsp_cmd: The sixth global variable designates the official response command code for the respective operation.  These commands correspond to the hmc_response_t enumerated values specified in the hmc_sim_types.h header file.  Note that this field is essentially ignored if your __rsp_len is zero.  You can also specify custom CMC response command codes by specifying RSP_CMC.  If you use the RSP_CMC command enum, then you must also specify the __rsp_cmd_code integer.
  • __rsp_cmd_code [OPTIONAL]: The seventh global variable is optional.  It specifies the CMC response command code for the respective operation.  Standard response command codes are < 64.  CMC command codes are 64 through 127.  These are utilized in the command field of the response packet to identify unique CMC response commands.
  • __row_ops: Contains the number of row operations for the respective CMC operation.  If this field is unknown, the CMC infrastructure will assume a value of 1.
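As a rough illustration of how these variables fit together, the sketch below fills them in for a hypothetical operation and checks the limits described above.  The variable types, the stand-in enums (the real ones live in hmc_sim_types.h), the example values, and the helper functions flits_for_payload and cmc_globals_valid are all assumptions; the template's cmc.c remains the authority on the actual declarations.

```c
#include <stdint.h>
#include <string.h>

/* Stand-ins for the enums in hmc_sim_types.h; the real header defines more. */
typedef enum { CMC05 } hmc_rqst_t;
typedef enum { RSP_CMC } hmc_response_t;

/* Illustrative values for a hypothetical operation; the types are assumed. */
static const char    *__op_name      = "MY_CMC";
static hmc_rqst_t     __rqst         = CMC05;
static uint32_t       __cmd          = 5;       /* CMC05 -> command code 5 */
static uint32_t       __rqst_len     = 2;       /* header/tail + 1 data flit */
static uint32_t       __rsp_len      = 2;       /* 0 would mean "posted" */
static hmc_response_t __rsp_cmd      = RSP_CMC;
static uint8_t        __rsp_cmd_code = 64;      /* custom codes: 64..127 */
static uint32_t       __row_ops      = 1;

#define FLIT_BYTES 16u  /* one flit = 128 bits */
#define MAX_FLITS  17u  /* largest request or response packet */

/* Flits for a packet carrying `bytes` of data: 1 header/tail flit + data. */
uint32_t flits_for_payload(uint32_t bytes)
{
    return 1u + (bytes + FLIT_BYTES - 1u) / FLIT_BYTES;
}

/* Returns 1 if the globals respect the limits described above, else 0. */
int cmc_globals_valid(void)
{
    /* These three are not range-checked in this sketch. */
    (void)__rqst; (void)__cmd; (void)__row_ops;

    if (strlen(__op_name) > 256)
        return 0;
    if (__rqst_len < 1 || __rqst_len > MAX_FLITS)
        return 0;
    if (__rsp_len > MAX_FLITS)
        return 0;
    /* A custom response command needs a code in the CMC range. */
    if (__rsp_len > 0 && __rsp_cmd == RSP_CMC &&
        (__rsp_cmd_code < 64 || __rsp_cmd_code > 127))
        return 0;
    return 1;
}
```

The flit arithmetic matches the limits quoted above: 256 bytes of data occupy 16 flits, plus one flit of header and tail, for 17 flits in total.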

Step 5: Prepare the Implementation Function

In this stage, the user implements the one function that carries out the operation associated with the CMC command: hmcsim_execute_cmc.  Note that the function arguments are such that the relevant portions of the request packet are already decoded for you.  Also note that the raw request and response payloads are provided via pointers.  It is up to the user to read the necessary blocks from the request payload and write the necessary blocks into the response payload.
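The general shape of such a function might look like the sketch below, which adds two 64-bit operands from the request payload and writes the sum into the response payload.  The function name sketch_execute_cmc, its shortened parameter list, and the payload layout (data words only, no header or tail) are assumptions for illustration; the actual hmcsim_execute_cmc signature is defined in the template's cmc.c.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for hmcsim_execute_cmc.  The real function's
 * parameter list comes from the template and may carry more decoded fields.
 *
 * addr         : decoded request address (unused by this simple operation)
 * length       : request length in flits
 * rqst_payload : raw request data, assumed here to be 64-bit data words
 * rsp_payload  : raw response data, filled in by this function
 *
 * Returns 0 on success and -1 on invalid arguments.
 */
int sketch_execute_cmc(uint64_t addr, uint32_t length,
                       const uint64_t *rqst_payload, uint64_t *rsp_payload)
{
    (void)addr;

    /* Two operands need one data flit, so at least a 2-flit request. */
    if (rqst_payload == NULL || rsp_payload == NULL || length < 2)
        return -1;

    /* Read the operands from the request and write the result back. */
    rsp_payload[0] = rqst_payload[0] + rqst_payload[1];
    return 0;
}
```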

 

Step 6: Compile It

Now that all the implementation code is complete, you can compile the library by simply typing ‘make’.  Note that you can also do this from the ~/cmc/ directory and build all the CMC libraries.  Alternatively, you can build the hmcsim core library and all the CMC libraries from the root directory using ‘make’.

 

Step 7: Modify the App/Driver

Now that we have the library built, we need to add the one function required to load and initialize the CMC operation in the application/driver.  Immediately following the HMCSim initialization steps, we need to load the CMC operation using the hmcsim_load_cmc() function.  Note that this function returns zero if successful, and nonzero otherwise.  If a nonzero return code is encountered, then the library failed to load.

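/* load the CMC library immediately after HMCSim initialization */
if( hmcsim_load_cmc( &hmc, "/path/to/libMY_CMC.so" ) != 0 ){
  /* nonzero return code: the library failed to load */
  printf( "error: failed to load libMY_CMC.so\n" );
}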