HMC-Sim Simple API


Introduction

Following several user requests, we have developed a simplified API alongside the original HMC-Sim API.  The goal of this effort is to let users rapidly develop tests that utilize HMC-Sim without a large software development effort.  We also sought to ensure that the simplified API can be used alongside the existing API seamlessly and without fear of poisoning the simulation environment.  In many cases, the simplified API actually makes use of the underlying, full API.

This tutorial presents the simplified interface present in the current HMC-Sim 3.0 development branch.  The API is written in C using six new functions.  These functions can be utilized to initialize the simulation environment, perform basic I/O (read and write) operations and perform complex I/O operations (atomic memory operations and CMC operations).

The simplified interface is constructed to mimic what would otherwise be an HMC memory controller.  The I/O functions present in the simplified API return tokens.  These tokens are utilized to query the status of a request just as in the actual hardware implementation.  Tokens are stored within the HMC-Sim infrastructure in a token cache.  As requests migrate through the infrastructure, the token cache is updated to reflect the state of the request.  When requests have finalized with an active response, their respective tokens are updated to reflect their completion state.  Posted requests (requests with no response) are marked as completed in the token cache immediately after their request packets have been found to be valid.  Incoming requests are injected into the link queues in a round-robin fashion in order to maintain a reasonable degree of balance in incoming and outgoing memory bandwidth.
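The token life cycle described above can be illustrated with a small, self-contained toy model.  This is only a sketch of the idea, not HMC-Sim's internal implementation: the names (toy_cache_t, toy_issue, toy_complete, toy_stat), the link count and the cache capacity are all hypothetical and chosen for illustration.

```c
#include <assert.h>

#define TOY_NUM_LINKS  4  /* assumed link count for this sketch */
#define TOY_MAX_TOKENS 8  /* assumed token cache capacity for this sketch */

/* Possible states of one token cache entry in this toy model. */
typedef enum { TOY_FREE = 0, TOY_PENDING, TOY_DONE } toy_state_t;

typedef struct {
  toy_state_t state[TOY_MAX_TOKENS]; /* per-token request state */
  int         link[TOY_MAX_TOKENS];  /* link each request was injected on */
  int         next_link;             /* round-robin cursor */
} toy_cache_t;

/* Reset the cache: all tokens free, cursor at link 0. */
void toy_cache_init(toy_cache_t *c) {
  for (int i = 0; i < TOY_MAX_TOKENS; i++) {
    c->state[i] = TOY_FREE;
    c->link[i] = -1;
  }
  c->next_link = 0;
}

/* Issue a request.  Returns a token (>= 0) or -1 when the cache is full.
 * Posted requests have no response, so they are marked done right away.
 * The target link advances in round-robin order on every issue. */
int toy_issue(toy_cache_t *c, int posted) {
  for (int t = 0; t < TOY_MAX_TOKENS; t++) {
    if (c->state[t] == TOY_FREE) {
      c->state[t] = posted ? TOY_DONE : TOY_PENDING;
      c->link[t] = c->next_link;
      c->next_link = (c->next_link + 1) % TOY_NUM_LINKS;
      return t;
    }
  }
  return -1;
}

/* Record that the response for a pending token has arrived. */
void toy_complete(toy_cache_t *c, int token) {
  assert(token >= 0 && token < TOY_MAX_TOKENS);
  if (c->state[token] == TOY_PENDING) {
    c->state[token] = TOY_DONE;
  }
}

/* Query a token: 1 = complete (entry is released), 0 = pending, -1 = invalid. */
int toy_stat(toy_cache_t *c, int token) {
  if (token < 0 || token >= TOY_MAX_TOKENS || c->state[token] == TOY_FREE) {
    return -1;
  }
  if (c->state[token] == TOY_DONE) {
    c->state[token] = TOY_FREE;
    return 1;
  }
  return 0;
}
```

In this sketch a posted request reports completion on its first status query, while a non-posted request stays pending until toy_complete is called for it.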

Users or user applications can construct simulations that honor strong ordering or (traditional) weak ordering when using the simplified API.  Strong ordering can be honored by maintaining a single outstanding memory request.  Once multiple requests are injected into the HMC-Sim infrastructure, no guarantees are made on the order in which memory requests are fielded.  Weak ordering is the default.
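The difference between the two ordering models can be seen in a toy device where each request completes after its own latency.  The sketch below is a stand-in written for illustration only; ord_dev_t, ord_issue, ord_clock, run_weak and run_strong are hypothetical names and do not exist in HMC-Sim.

```c
#include <assert.h>

#define ORD_MAX_REQ 8  /* assumed capacity of the toy device */

/* Toy device: each request finishes after a caller-chosen latency. */
typedef struct {
  int remaining[ORD_MAX_REQ]; /* cycles left for each request */
  int active[ORD_MAX_REQ];    /* 1 while the request is in flight */
  int order[ORD_MAX_REQ];     /* tokens in the order they completed */
  int n_issued;
  int n_done;
} ord_dev_t;

void ord_init(ord_dev_t *d) {
  for (int i = 0; i < ORD_MAX_REQ; i++) {
    d->remaining[i] = 0;
    d->active[i] = 0;
    d->order[i] = -1;
  }
  d->n_issued = 0;
  d->n_done = 0;
}

/* Inject a request; returns its token, or -1 if it cannot be accepted. */
int ord_issue(ord_dev_t *d, int latency) {
  if (d->n_issued >= ORD_MAX_REQ || latency < 1) {
    return -1;
  }
  int t = d->n_issued++;
  d->remaining[t] = latency;
  d->active[t] = 1;
  return t;
}

/* Advance one cycle; requests whose latency expires are completed. */
void ord_clock(ord_dev_t *d) {
  for (int t = 0; t < d->n_issued; t++) {
    if (d->active[t] && --d->remaining[t] == 0) {
      d->active[t] = 0;
      d->order[d->n_done++] = t;
    }
  }
}

/* Weak ordering: inject everything, then clock until all have completed. */
void run_weak(ord_dev_t *d, const int *lat, int n) {
  assert(n >= 0 && n <= ORD_MAX_REQ);
  for (int i = 0; i < n; i++) {
    ord_issue(d, lat[i]);
  }
  while (d->n_done < d->n_issued) {
    ord_clock(d);
  }
}

/* Strong ordering: keep a single request outstanding at any time. */
void run_strong(ord_dev_t *d, const int *lat, int n) {
  assert(n >= 0 && n <= ORD_MAX_REQ);
  for (int i = 0; i < n; i++) {
    int t = ord_issue(d, lat[i]);
    if (t < 0) {
      return;
    }
    while (d->active[t]) {
      ord_clock(d);
    }
  }
}
```

With latencies of 3, 1 and 2 cycles, run_weak records the completion order 1, 2, 0, while run_strong records 0, 1, 2 at the cost of more total cycles.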

API Interface

The six new simplified API functions are declared as follows.

extern int hmcsim_simple_init( struct hmcsim_t *hmc, int size );
extern int hmcsim_simple_read( struct hmcsim_t *hmc, uint64_t addr, int size );
extern int hmcsim_simple_write( struct hmcsim_t *hmc, uint64_t addr, int size, uint8_t *data );
extern int hmcsim_simple_stat( struct hmcsim_t *hmc, int token, uint8_t *data );
extern int hmcsim_simple_cmc( struct hmcsim_t *hmc, uint64_t addr, uint8_t *data, hmc_rqst_t op );
extern int hmcsim_simple_amo( struct hmcsim_t *hmc, uint64_t addr, uint8_t *data, hmc_rqst_t op );

Example

Note: All of the forthcoming code can be found in ~/gc64-hmcsim/test/simple_api

Initialization

Just as in the original API, the simplified API requires that the user/driver initialize the HMC-Sim environment.  However, the simplified API provides a single function that initializes the basic simulation environment, the maximum request block size (defaults to 256-byte requests) and the link infrastructure.  The default environment connects all the links from the target device to the host.  In this manner, the simplified API represents a single HMC device connected to a single host device.  The simplified API also requires that you specify the size of the target device to initialize.  Acceptable values are “4” or “8” for 4GB and 8GB devices, respectively.  An example of initializing the environment is as follows:

4GB device:

struct hmcsim_t hmc;

int ret = hmcsim_simple_init( &hmc, 4 );
if( ret != 0 ){
  return -1;
}
// do stuff
hmcsim_free( &hmc );

8GB device:

struct hmcsim_t hmc;

int ret = hmcsim_simple_init( &hmc, 8 );
if( ret != 0 ){
  return -1;
}
// do stuff
hmcsim_free( &hmc );

Notice that we also require the user/driver to call the ‘hmcsim_free()’ function to safely complete the simulation.  This is identical to the traditional HMC-Sim API.
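A driver may want to check the device size, and later the target addresses, before handing them to the simulator.  The following is a minimal sketch of such a pre-check.  The helper names are hypothetical, and the byte calculation assumes that "4GB" and "8GB" mean binary gigabytes (2^30 bytes each); HMC-Sim performs its own validation independently of this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Device capacities accepted by the simplified init call, per the text above. */
bool is_valid_capacity(int size_gb) {
  return size_gb == 4 || size_gb == 8;
}

/* Capacity in bytes, assuming binary gigabytes (1 GB = 2^30 bytes).
 * Returns 0 for an unsupported capacity. */
uint64_t capacity_bytes(int size_gb) {
  if (!is_valid_capacity(size_gb)) {
    return 0;
  }
  return (uint64_t)size_gb << 30;
}

/* True when the byte range [addr, addr + len) lies inside the device. */
bool request_in_range(uint64_t addr, uint64_t len, int size_gb) {
  uint64_t cap = capacity_bytes(size_gb);
  return len > 0 && len <= cap && addr <= cap - len;
}
```

Writing the range check as addr <= cap - len avoids the unsigned overflow that addr + len could produce for addresses near the top of the 64-bit range.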

Basic I/O

Once the HMC-Sim environment has been initialized, we can begin sending memory requests to the target device.  Unlike the original HMC-Sim API, the simplified API provides two functions to send basic read and write requests.  Each of the hmcsim_simple_read and hmcsim_simple_write functions permits the user/driver to specify the size of the target memory request (in bytes).  The memory request size must be one of the permitted request sizes in the HMC packet specification.  Upon success, the basic I/O functions return the token ID of the successful request injected into the simulated device.  This token is subsequently used to monitor for completion status using the hmcsim_simple_stat function.
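Since the request size must match one of the packet sizes in the specification, a driver can reject bad sizes before issuing a request.  The sketch below assumes the Gen2 read/write sizes of 16 through 128 bytes in 16-byte steps, plus 256 bytes; is_valid_request_size is a hypothetical helper, not part of HMC-Sim, and the size list should be confirmed against the specification revision in use.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed permitted sizes: 16, 32, ..., 128 bytes, and 256 bytes. */
bool is_valid_request_size(int size) {
  if (size == 256) {
    return true;
  }
  return size >= 16 && size <= 128 && (size % 16) == 0;
}
```

A driver could call this check before hmcsim_simple_read or hmcsim_simple_write and treat a false result the same way as a negative token.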

An example of performing a basic read and write request is as follows.  Pay special attention to the arguments to hmcsim_simple_stat for the read and write requests, respectively.  Note that read requests require a valid “data” argument in order to successfully retrieve the data for the respective read operation.

ret = hmcsim_simple_init( &hmc, 4 );
if( ret != 0 ){
  return -1;
}

int token = -1;
uint64_t addr = 0x00ull;  // target address for the requests
token = hmcsim_simple_read( &hmc, addr, 16 ); // 16-byte read
if( token < 0 ){
  hmcsim_free( &hmc );
  return -1;
}

uint8_t data[16];
bool rsp = false;
while( !rsp ){
  hmcsim_clock(&hmc);  // clock the sim

  // check for status
  ret = hmcsim_simple_stat( &hmc, token, &(data[0]) );
  if( ret == 1 ){
    // response was found
    rsp = true;
  }else if( ret == -1 ){
    // an error was returned
    hmcsim_free( &hmc );
    return -1;
  }
}

token = -1;
token = hmcsim_simple_write( &hmc, addr, 16, &(data[0]) ); // 16-byte write
if( token < 0 ){
  hmcsim_free( &hmc );
  return -1;
}

// check for status
rsp = false;
while( !rsp ){
  hmcsim_clock(&hmc);  // clock the sim

  // check for status
  ret = hmcsim_simple_stat( &hmc, token, NULL );
  if( ret == 1 ){
    // response was found
    rsp = true;
  }else if( ret == -1 ){
    // an error was returned
    hmcsim_free( &hmc );
    return -1;
  }
}

hmcsim_free( &hmc );

Notice that for each of the requests, we still need to make a call to hmcsim_clock in order to make forward progress in the simulation.  Also note that the tokens returned by the read and write functions must be handled correctly in order to successfully track the status of the requests.
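The clock-and-poll loop is repeated for every request, and as written it never exits if a token fails to complete.  One way to factor it out is a bounded wait helper.  The sketch below is hypothetical: it takes the clock and status operations as callbacks so that it stands alone, and it includes a fake simulator (fake_sim_t) purely to demonstrate the control flow.  The return codes 1, 0 and -1 follow the status convention used in the examples above; the -2 timeout code is an addition of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Callback types standing in for the clock and status calls. */
typedef void (*clock_fn)(void *sim);
typedef int  (*stat_fn)(void *sim, int token, unsigned char *data);

/* Clock the simulation until the token completes.
 * Returns the number of cycles waited on success, -1 if the status
 * call reports an error, or -2 if max_cycles elapse first. */
long wait_for_token(void *sim, int token, unsigned char *data,
                    clock_fn clk, stat_fn stat, long max_cycles) {
  for (long cycle = 1; cycle <= max_cycles; cycle++) {
    clk(sim);                            /* make forward progress */
    int ret = stat(sim, token, data);    /* check for status */
    if (ret == 1) {
      return cycle;                      /* response was found */
    }
    if (ret == -1) {
      return -1;                         /* an error was returned */
    }
  }
  return -2;                             /* gave up waiting */
}

/* Fake simulator for demonstration: token 0 completes at cycle ready_at. */
typedef struct {
  int cycles;
  int ready_at;
} fake_sim_t;

void fake_clock(void *sim) {
  ((fake_sim_t *)sim)->cycles++;
}

int fake_stat(void *sim, int token, unsigned char *data) {
  fake_sim_t *f = (fake_sim_t *)sim;
  if (token != 0) {
    return -1;                           /* unknown token */
  }
  if (f->cycles < f->ready_at) {
    return 0;                            /* still pending */
  }
  if (data != NULL) {
    data[0] = 0xAB;                      /* pretend read payload */
  }
  return 1;
}
```

When used with HMC-Sim, the two callbacks would be thin adapter functions around hmcsim_clock and hmcsim_simple_stat, and the caller would still be responsible for calling hmcsim_free on any failure.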

Atomic Memory Ops

In addition to the basic read and write operations, users can also dispatch atomic memory operations (AMOs).  The atomic memory operations are handled just as any other read or write I/O request.  As a result, they can exist in the token cache just as any other memory operation.  However, the AMO interface requires the user to define an additional parameter that represents the target operation to perform.  These hmc_rqst_t command enums can be found in the hmc_sim_types.h header file.  The simplified API supports all variants of the HMC Gen 2 specification atomic memory operations.

An example of performing an AMO using the simplified API is as follows:

ret = hmcsim_simple_init( &hmc, 4 );
if( ret != 0 ){
  return -1;
}

int token = -1;
uint64_t addr = 0x00ull;
uint8_t data[16] = {0};  // AMO operand payload, zeroed rather than left uninitialized

// perform a TWOADD8 AMO
token = hmcsim_simple_amo( &hmc, addr, &(data[0]), TWOADD8 );
if( token < 0 ){
  hmcsim_free( &hmc );
  return -1;
}

// check for status
bool rsp = false;
while( !rsp ){
  hmcsim_clock(&hmc);  // clock the sim

  // check for status
  ret = hmcsim_simple_stat( &hmc, token, NULL );
  if( ret == 1 ){
    // response was found
    rsp = true;
  }else if( ret == -1 ){
    // an error was returned
    hmcsim_free( &hmc );
    return -1;
  }
}

hmcsim_free( &hmc );

CMC Ops

In addition to the aforementioned atomic memory operations, the simplified API also supports dispatching custom memory cube (CMC) operations.  Much like the AMO simplified API function, the CMC function requires that the user specify the target CMC operation using the hmc_rqst_t designator.  Make sure to load the target CMC shared library prior to attempting to send a CMC request!

ret = hmcsim_simple_init( &hmc, 4 );
if( ret != 0 ){
  return -1;
}

// load the CMC library
ret = hmcsim_load_cmc( &hmc, "../../../cmc/amo_popcount/libamopopcount.so");
if( ret != 0 ){
  hmcsim_free( &hmc );
  return -1;
}
int token = -1;
uint64_t addr = 0x00ull;
uint8_t data[16] = {0};  // request payload; also receives the response data

// perform a CMC op
token = hmcsim_simple_cmc( &hmc, addr, &(data[0]), CMC05 );
if( token < 0 ){
  hmcsim_free( &hmc );
  return -1;
}

// check for status
bool rsp = false;
while( !rsp ){
  hmcsim_clock(&hmc);  // clock the sim

  // check for status
  ret = hmcsim_simple_stat( &hmc, token, &(data[0]) );
  if( ret == 1 ){
    // response was found
    rsp = true;
  }else if( ret == -1 ){
    // an error was returned
    hmcsim_free( &hmc );
    return -1;
  }
}

hmcsim_free( &hmc );

Interoperability

As mentioned above, the simplified API is entirely interoperable with the existing HMC-Sim API.  This permits us to use all the existing functionality (tracing, visualization, etc.) that is present in the traditional interface.  An example of utilizing tracing alongside the simplified API is as follows.

ret = hmcsim_simple_init( &hmc, 4 );
if( ret != 0 ){
  return -1;
}

FILE *ofile = fopen( "trace.out", "w" );
if( ofile == NULL ){
  hmcsim_free( &hmc );
  return -1;
}

hmcsim_trace_handle( &hmc, ofile );
hmcsim_trace_level( &hmc, (HMC_TRACE_BANK|
 HMC_TRACE_QUEUE|
 HMC_TRACE_CMD|
 HMC_TRACE_STALL|
 HMC_TRACE_LATENCY|
 HMC_TRACE_POWER) );
hmcsim_trace_header( &hmc );

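int token = -1;
uint64_t addr = 0x00ull;  // target address for the requests
token = hmcsim_simple_read( &hmc, addr, 16 ); // 16-byte read
if( token < 0 ){
  hmcsim_free( &hmc );
  fclose( ofile );  // close the trace file on the error path
  return -1;
}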

uint8_t data[16];
bool rsp = false;
while( !rsp ){
  hmcsim_clock(&hmc);  // clock the sim

  // check for status
  ret = hmcsim_simple_stat( &hmc, token, &(data[0]) );
  if( ret == 1 ){
    // response was found
    rsp = true;
  }else if( ret == -1 ){
    // an error was returned
    hmcsim_free( &hmc );
    fclose( ofile );  // close the trace file on the error path
    return -1;
  }
}

token = -1;
token = hmcsim_simple_write( &hmc, addr, 16, &(data[0]) ); // 16-byte write
if( token < 0 ){
  hmcsim_free( &hmc );
  fclose( ofile );  // close the trace file on the error path
  return -1;
}

// check for status
rsp = false;
while( !rsp ){
  hmcsim_clock(&hmc);  // clock the sim

  // check for status
  ret = hmcsim_simple_stat( &hmc, token, NULL );
  if( ret == 1 ){
    // response was found
    rsp = true;
  }else if( ret == -1 ){
    // an error was returned
    hmcsim_free( &hmc );
    fclose( ofile );  // close the trace file on the error path
    return -1;
  }
}

hmcsim_free( &hmc );
fclose(ofile);