.. _awnn_memory:

Memory management
=================

There are four types of memory used, as shown in the figure:

.. image:: /../imgs/net-memory.png
   :scale: 50%

* layer learnable params (data and diff)
* layer output (data and diff)
* layer input (data and diff)
* layer cache (data only)

They are all attached to the model_t:

::

  struct A {                              // simplified view of model_t
    SomeType Data;
    struct list_head list_all_params[1];  // list of all learnable params
    struct list_head list_layer_out[1];   // list of the output of each layer
    struct list_head list_layer_in[1];    // list of the input of each layer
    struct list_head list_layer_cache[1]; // list of layer caches
  };

.. note::
   Use with caution: don't pass A by value, since all the list data structures
   rely on the address of the embedded list_head *head* to determine the end
   of the list.

The struct list_head is a list structure derived from the Linux kernel's
list.h. To use it, we first call *init_list_head(head)* on the list head; each
node is also initialized with *init_list_head*, and can then be added to the
list with *list_add* or *list_add_tail*.

Allocation
----------

Currently the net knows the maximum batch size, so the sizes of all the memory
types above can be inferred.

* In the mlp_init function, all the *output* and *learnable params* tensors are
  directly allocated and added to the net using *net_attach_param*.
* *input* is special: since the input of an upper layer is the output of the
  layer below it, we still use the *list* structure above to track the
  data/diff of each input, but these are just shadow copies of the lower
  layer's output.
* *cache*: currently, each layer can have different types of caches. For
  simplicity, lcache_t is just a placeholder to track the caches used by
  *each layer*; inside each lcache_t, tensors can be pushed in and popped out.

Access
------

After *net_attach_param*, each param can be accessed by:

::

  tensor_t w = net_get_param(model->list_all_params, "fc1.weight")->data;
  tensor_t dw = net_get_param(model->list_all_params, "fc1.weight")->diff;

Destroy
-------

The only problem currently is destroying the input. The shadow copy of the
output is simply a copy of the tensor_t, which means a free on one side needs
to be explicitly synced to the other side. For now, I just free the output
tensors; I will revisit this in the future.
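
The sketch below illustrates the list mechanics described above: attaching a
learnable param to the model's list and looking it up by name. This is only a
minimal, self-contained sketch; the tensor_t/param_t layouts and the exact
signatures of *net_attach_param* and *net_get_param* are assumptions for
illustration, not the project's real definitions.

::

  #include <stddef.h>
  #include <stdio.h>
  #include <string.h>

  /* minimal kernel-style doubly linked list (stand-in for the project's list.h) */
  struct list_head { struct list_head *prev, *next; };

  static void init_list_head(struct list_head *head) {
    head->prev = head;
    head->next = head;
  }

  static void list_add_tail(struct list_head *node, struct list_head *head) {
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
  }

  #define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

  typedef struct { float *vals; int size; } tensor_t;   /* placeholder tensor */

  /* hypothetical param node: name plus data/diff tensors */
  typedef struct {
    char name[32];
    tensor_t data;                 /* forward values */
    tensor_t diff;                 /* gradients */
    struct list_head list[1];      /* hook into model->list_all_params */
  } param_t;

  /* attach a param to the model's list of learnable params */
  static void net_attach_param(struct list_head *all_params, param_t *p) {
    init_list_head(p->list);
    list_add_tail(p->list, all_params);
  }

  /* look a param up by name */
  static param_t *net_get_param(struct list_head *all_params, const char *name) {
    struct list_head *pos;
    for (pos = all_params->next; pos != all_params; pos = pos->next) {
      param_t *p = container_of(pos, param_t, list);
      if (strcmp(p->name, name) == 0)
        return p;
    }
    return NULL;
  }

  int main(void) {
    struct list_head all_params[1];
    param_t fc1_weight = { .name = "fc1.weight" };

    init_list_head(all_params);            /* the head must be initialized first */
    net_attach_param(all_params, &fc1_weight);

    param_t *p = net_get_param(all_params, "fc1.weight");
    if (p) {
      tensor_t w  = p->data;   /* mirrors the access pattern shown above */
      tensor_t dw = p->diff;
      (void)w; (void)dw;
      printf("found param: %s\n", p->name);
    }
    return 0;
  }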
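
The per-layer cache can similarly be pictured as a small stack of tensors.
Again, this is only a sketch of the push/pop semantics mentioned in the
Allocation section; the real lcache_t layout and helper names are assumptions.

::

  #include <assert.h>

  typedef struct { float *vals; int size; } tensor_t;   /* placeholder tensor */

  #define MAX_CACHED_TENSORS 8   /* assumed fixed capacity, for illustration */

  /* hypothetical layout: a fixed-size stack of cached tensors per layer */
  typedef struct {
    tensor_t entries[MAX_CACHED_TENSORS];
    int count;
  } lcache_t;

  /* forward pass pushes the tensors the backward pass will need */
  static void lcache_push(lcache_t *cache, tensor_t t) {
    assert(cache->count < MAX_CACHED_TENSORS);
    cache->entries[cache->count++] = t;
  }

  /* backward pass pops them back in reverse order */
  static tensor_t lcache_pop(lcache_t *cache) {
    assert(cache->count > 0);
    return cache->entries[--cache->count];
  }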
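
Finally, to make the ownership issue from the Destroy section concrete: a
shadow copy is a plain struct copy, so both the output and the input struct
point at the same underlying buffer, and a free on one side must be reflected
on the other by hand. The tensor_t layout below is an assumption; only the
sharing behaviour is the point.

::

  #include <stdlib.h>

  typedef struct { float *vals; int size; } tensor_t;   /* placeholder tensor */

  int main(void) {
    /* the lower layer owns the buffer */
    tensor_t out = { malloc(128 * sizeof(float)), 128 };

    /* the upper layer's input is a shadow copy: same buffer, new struct */
    tensor_t in = out;

    /* freeing both sides would be a double free; only the owner frees */
    free(out.vals);
    in.vals = NULL;    /* the shadow copy has to be synced by hand */
    return 0;
  }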