CIBC:Discussion:MemoryEfficiency

From NCRR Biomedical Software Development, Engineering, and Dissemination Wiki


Memory Efficiency

Motivation

When SCIRun techniques are applied to real-scale modelling problems, one quickly runs out of addressable memory. To use SCIRun as a platform in the current collaboration projects, memory usage needs to be organized more efficiently. The focus of this discussion should be how to optimize CPU and memory usage in a dataflow environment.


Port Caching

Currently the output of every port is cached at the end of a module's execution. The advantage is that when I want to re-execute part of a network, all intermediate results are still there. The disadvantage is that this requires a lot of memory, which we do not have for large-scale models. SCIRun currently offers a way to switch off all port caching, but then part of a network can no longer be re-executed on its own and all interactivity is lost.

[1] A better option would be to allow caching to disk and to selectively disable caching for individual outputs, so that every port can be set to "Cache/Cache on disk/No cache".

[2] This seems to solve some problems, but memory and address space still fragment, and on multiprocessor machines many modules execute in parallel, so a lot of data still remains in memory. It is a real improvement only for long pipelines in which no memory fragmentation occurs.
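
A minimal sketch of the per-port setting from [1], in C++. Everything here is hypothetical: `CachePolicy`, `CachedPort` and the spill file path are illustration names, not part of SCIRun, and the port payload is simplified to a `std::vector<double>`.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-port setting mirroring "Cache/Cache on disk/No cache".
enum class CachePolicy { Memory, Disk, None };

// Hypothetical output port; this is not the SCIRun port class.
class CachedPort {
public:
  CachedPort(CachePolicy policy, std::string spill_path)
      : policy_(policy), spill_path_(std::move(spill_path)) {}

  // Called when the owning module finishes executing.
  void store(std::vector<double> data) {
    memory_.clear();
    cached_ = false;
    if (policy_ == CachePolicy::Memory) {
      memory_ = std::move(data);
      cached_ = true;
    } else if (policy_ == CachePolicy::Disk) {
      std::ofstream out(spill_path_.c_str(), std::ios::binary);
      out.write(reinterpret_cast<const char*>(data.data()),
                static_cast<std::streamsize>(data.size() * sizeof(double)));
      out.flush();
      count_ = data.size();
      cached_ = out.good();
    }
    // CachePolicy::None: the result is dropped and must be recomputed.
  }

  bool cached() const { return cached_; }

  // Bytes of payload this port currently keeps in memory.
  std::size_t resident_bytes() const { return memory_.size() * sizeof(double); }

  // Returns a copy of the cached result. Precondition: cached().
  std::vector<double> fetch() const {
    assert(cached_);
    if (policy_ == CachePolicy::Memory) return memory_;
    std::vector<double> data(count_);
    std::ifstream in(spill_path_.c_str(), std::ios::binary);
    in.read(reinterpret_cast<char*>(data.data()),
            static_cast<std::streamsize>(count_ * sizeof(double)));
    return data;
  }

private:
  CachePolicy policy_;
  std::string spill_path_;
  std::vector<double> memory_;
  std::size_t count_ = 0;
  bool cached_ = false;
};
```

With this shape a port set to `CachePolicy::Disk` keeps re-execution possible while holding no payload in memory. As [2] notes, it does nothing about fragmentation.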


Data blocks


[1] One concept that might deal with the fragmentation issue is to build a wrapper around every block of memory, with the ability to store these blocks to disk and reload them when they are needed again. In more detail: it would be great if the scheduler could go into the datastream objects, mark them as currently unused, selectively store large data blocks to disk, and reload them when a module execution needs them. That way, when memory consumption is high, the scheduler can freeze execution of modules, dump everything to disk and reload only the data that is needed, de facto defragmenting memory.
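
One way the wrapper idea could look, as a rough sketch. `DataBlock`, `pin`, `unpin` and `evict` are made-up names, the payload is again a plain `std::vector<double>`, and error handling on reload is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical wrapper around one large block of data.
class DataBlock {
public:
  DataBlock(std::vector<double> values, std::string spill_path)
      : values_(std::move(values)), count_(values_.size()),
        spill_path_(std::move(spill_path)) {}

  // A module calls pin() before reading; the block is reloaded if needed.
  void pin() {
    if (!resident_) reload();
    ++pins_;
  }

  void unpin() {
    assert(pins_ > 0);
    --pins_;
  }

  // The scheduler calls evict() on blocks that no module is using.
  // Returns false if the block is pinned, already on disk, or the write failed.
  bool evict() {
    if (pins_ > 0 || !resident_) return false;
    std::ofstream out(spill_path_.c_str(), std::ios::binary);
    out.write(reinterpret_cast<const char*>(values_.data()),
              static_cast<std::streamsize>(count_ * sizeof(double)));
    out.flush();
    if (!out.good()) return false;
    std::vector<double>().swap(values_);  // give the memory back, not just the size
    resident_ = false;
    return true;
  }

  bool resident() const { return resident_; }

  // Precondition: the caller holds a pin.
  const std::vector<double>& values() const {
    assert(pins_ > 0);
    return values_;
  }

private:
  void reload() {
    std::vector<double> fresh(count_);  // a new allocation, wherever there is room now
    std::ifstream in(spill_path_.c_str(), std::ios::binary);
    in.read(reinterpret_cast<char*>(fresh.data()),
            static_cast<std::streamsize>(count_ * sizeof(double)));
    values_.swap(fresh);
    resident_ = true;
  }

  std::vector<double> values_;
  std::size_t count_;
  std::string spill_path_;
  unsigned pins_ = 0;
  bool resident_ = true;
};
```

The pin count is what lets the scheduler tell "currently not used" from "in use". Because a reload allocates a fresh buffer, evicting everything and reloading only the needed blocks has the defragmenting effect described above.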


Parallel Issues


[1] The problem sometimes seems to be that SCIRun executes too many modules at once. This uses a lot of temporary memory and causes a lot of memory swapping, reducing overall performance. It would be great to have an option to limit the number of modules executed at the same time, preferably adjusted to the number of available processors. Being able to steer this would be great.
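
A limit like this could be sketched as a counting gate that each module thread passes through before it executes. The names `ExecutionLimiter`, `ExecutionSlot` and `default_module_limit` are invented for this example, and tying the default to the hardware thread count is an assumption, not something SCIRun does today.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical limiter: at most `max_running` modules hold a slot at once.
class ExecutionLimiter {
public:
  explicit ExecutionLimiter(unsigned max_running)
      : limit_(max_running == 0 ? 1 : max_running), free_slots_(limit_) {}

  // Blocks the calling module thread until a slot is free.
  void acquire() {
    std::unique_lock<std::mutex> lock(mutex_);
    slot_freed_.wait(lock, [this] { return free_slots_ > 0; });
    --free_slots_;
  }

  // Non-blocking variant: returns false when all slots are taken.
  bool try_acquire() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (free_slots_ == 0) return false;
    --free_slots_;
    return true;
  }

  void release() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      assert(free_slots_ < limit_);
      ++free_slots_;
    }
    slot_freed_.notify_one();
  }

  unsigned limit() const { return limit_; }

private:
  const unsigned limit_;
  unsigned free_slots_;
  std::mutex mutex_;
  std::condition_variable slot_freed_;
};

// RAII helper: holds one slot for the duration of a module execution.
class ExecutionSlot {
public:
  explicit ExecutionSlot(ExecutionLimiter& limiter) : limiter_(limiter) {
    limiter_.acquire();
  }
  ~ExecutionSlot() { limiter_.release(); }
  ExecutionSlot(const ExecutionSlot&) = delete;
  ExecutionSlot& operator=(const ExecutionSlot&) = delete;

private:
  ExecutionLimiter& limiter_;
};

// Assumed default: one running module per hardware thread.
inline unsigned default_module_limit() {
  unsigned n = std::thread::hardware_concurrency();  // may be 0 if unknown
  return n == 0 ? 1 : n;
}
```

A module thread would create an `ExecutionSlot` at the top of its execute step. Making the limit a user setting would give the steering asked for above.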


Memory manager


[1] How does the current allocator work? (Could someone fill me in on the specifics? -- Jeroen)

[2] It seems to me that we should grab a pretty large piece of memory to put our dataflow objects in.
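
To make [2] concrete, here is a rough sketch of one possible shape: a single large buffer that is reserved up front and handed out piece by piece (a bump or "arena" allocator). `Arena` is a hypothetical name, and this says nothing about how the current SCIRun allocator works, which is still the open question in [1].

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical arena: one big reservation, carved up sequentially.
class Arena {
public:
  explicit Arena(std::size_t capacity)
      : buffer_(new unsigned char[capacity]), capacity_(capacity), used_(0) {}

  // Returns nullptr when the arena is exhausted.
  // `align` must be a power of two.
  void* allocate(std::size_t bytes,
                 std::size_t align = alignof(std::max_align_t)) {
    assert(align != 0 && (align & (align - 1)) == 0);
    const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(buffer_.get());
    const std::uintptr_t mask = static_cast<std::uintptr_t>(align) - 1;
    const std::uintptr_t aligned = (base + used_ + mask) & ~mask;
    const std::size_t offset = static_cast<std::size_t>(aligned - base);
    if (offset > capacity_ || bytes > capacity_ - offset) return nullptr;
    used_ = offset + bytes;
    return buffer_.get() + offset;
  }

  // Frees everything at once; individual blocks are never freed.
  void reset() { used_ = 0; }

  std::size_t used() const { return used_; }
  std::size_t capacity() const { return capacity_; }

private:
  std::unique_ptr<unsigned char[]> buffer_;
  std::size_t capacity_;
  std::size_t used_;
};
```

The trade-off is that an arena cannot free single objects, so it suits data with a shared lifetime, for example everything produced during one network execution.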


Third-party libraries


[1] Currently we allow third-party libraries to allocate memory, and we wrap the resulting objects and use them. The disadvantage is that we have no control over where that memory is allocated, so we might want to rethink what we do with third-party libraries.

[2] Teem is the main library that allocates memory for objects we reuse, and these objects can be big. How does that affect our memory management? Is Teem currently able to allocate a huge chunk of data, or even a small amount, in the middle of the address space?

[3] In SCIRun we have a lot of STL objects that do not make use of 'scinew'. Do we want them to use our memory manager so that we can better separate small and large memory blocks, or is this done automatically? Note that some Field classes use large vector<vector> arrays, which could potentially fill up all the space reserved for small memory blocks. Do we have a good strategy for this?
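
If the answer to [3] is that STL containers should go through our memory manager, the standard hook is a custom allocator type passed as the container's second template argument. The sketch below assumes two stand-in functions, `manager_allocate` and `manager_free`, in place of the real manager (they just call `malloc` and count bytes). `ManagedAllocator`, `Row` and `Table` are also made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Stand-ins for the memory manager (assumption: the real one would decide
// here whether a request belongs in the small-block or large-block area).
std::size_t g_live_bytes = 0;

void* manager_allocate(std::size_t bytes) {
  void* p = std::malloc(bytes ? bytes : 1);
  if (!p) throw std::bad_alloc();
  g_live_bytes += bytes;
  return p;
}

void manager_free(void* p, std::size_t bytes) {
  assert(g_live_bytes >= bytes);
  g_live_bytes -= bytes;
  std::free(p);
}

// Minimal C++11 allocator that routes container memory to the manager.
template <class T>
struct ManagedAllocator {
  typedef T value_type;

  ManagedAllocator() noexcept {}
  template <class U>
  ManagedAllocator(const ManagedAllocator<U>&) noexcept {}

  T* allocate(std::size_t n) {
    return static_cast<T*>(manager_allocate(n * sizeof(T)));
  }
  void deallocate(T* p, std::size_t n) noexcept {
    manager_free(p, n * sizeof(T));
  }
};

// All instances are interchangeable because they share one manager.
template <class T, class U>
bool operator==(const ManagedAllocator<T>&, const ManagedAllocator<U>&) {
  return true;
}
template <class T, class U>
bool operator!=(const ManagedAllocator<T>&, const ManagedAllocator<U>&) {
  return false;
}

// A vector<vector> whose rows and row table both go through the manager.
typedef std::vector<double, ManagedAllocator<double> > Row;
typedef std::vector<Row, ManagedAllocator<Row> > Table;

std::size_t live_bytes() { return g_live_bytes; }
```

With something along these lines, a `Table` of a few large rows shows up in the manager's accounting as a handful of large requests plus one small one, which is the information needed to keep big rows out of the small-block space.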


Memory and CPU resource management


[1] It would be nice if modules could give the scheduler a prediction of their memory and CPU use. The scheduler could then use this to optimize network performance, using as much free memory as is available to boost speed while staying within memory limits.
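
As an illustration only, the scheduler side of this could be a simple admission check over the modules that are ready to run. `ResourceEstimate` and `admit` are hypothetical, and the greedy first-fit rule is just one possible policy.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical prediction a module hands to the scheduler.
struct ResourceEstimate {
  std::string module;        // module name, for reporting
  std::size_t memory_bytes;  // predicted peak memory use
  unsigned threads;          // predicted number of busy threads
};

// Greedy first-fit admission: walk the ready list in order and start every
// module whose prediction still fits into the remaining memory and threads.
// Modules that do not fit stay queued for a later scheduling round.
std::vector<std::string> admit(const std::vector<ResourceEstimate>& ready,
                               std::size_t free_memory,
                               unsigned free_threads) {
  std::vector<std::string> started;
  for (std::size_t i = 0; i < ready.size(); ++i) {
    const ResourceEstimate& e = ready[i];
    if (e.memory_bytes <= free_memory && e.threads <= free_threads) {
      free_memory -= e.memory_bytes;
      free_threads -= e.threads;
      started.push_back(e.module);
    }
  }
  return started;
}
```

For example, with 1000 bytes and 4 threads free, a ready list of ModuleA (600 bytes, 1 thread), ModuleB (500 bytes, 1 thread) and ModuleC (300 bytes, 2 threads) starts ModuleA and ModuleC and leaves ModuleB queued. Predictions will be wrong at times, so a real scheduler would still need a fallback such as spilling data to disk.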


Alternative strategies

[1] Maybe we should have the option to write everything to disk and, for each module, read only the necessary streams into memory, swapping things to disk manually.