Written new contents of "Mapping memory" documentation chapter.

2018-03-01 12:19:37 +01:00 · 2018-03-01 12:19:37 +01:00 · a3e019d476
commit a3e019d476
parent f44305429c
7 changed files with 223 additions and 181 deletions
--- a/src/vk_mem_alloc.h
+++ b/src/vk_mem_alloc.h
@@ -155,8 +155,9 @@ It is valid, although not very useful.
 \section choosing_memory_type_usage Usage

 The easiest way to specify memory requirements is to fill member
-VmaAllocationCreateInfo::usage using one of the values of enum `VmaMemoryUsage`.
+VmaAllocationCreateInfo::usage using one of the values of enum #VmaMemoryUsage.
 It defines high level, common usage types.
+For more details, see description of this enum.

 For example, if you want to create a uniform buffer that will be filled using
 transfer only once or infrequently and used for rendering every frame, you can
@@ -237,26 +238,62 @@ that pool. For further details, see \ref custom_memory_pools.

 \page memory_mapping Memory mapping

-\section persistently_mapped_memory Persistently mapped memory
+To "map memory" in Vulkan means to obtain a CPU pointer to `VkDeviceMemory`,
+to be able to read from it or write to it in CPU code.
+Mapping is possible only for memory allocated from a memory type that has
+the `VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT` flag.
+Functions `vkMapMemory()`, `vkUnmapMemory()` are designed for this purpose.
+You can use them directly with memory allocated by this library,
+but it is not recommended because of the following issue:
+Mapping the same `VkDeviceMemory` block multiple times is illegal - only one mapping at a time is allowed.
+This includes mapping disjoint regions. Mapping is not reference-counted internally by Vulkan.
+Because of this, Vulkan Memory Allocator provides the following facilities:

-If you need to map memory on host, it may happen that two allocations are
-assigned to the same `VkDeviceMemory` block, so if you map them both at the same
-time, it will cause error because mapping single memory block multiple times is
-illegal in Vulkan.
+\section memory_mapping_mapping_functions Mapping functions

-TODO update this...
+The library provides the following functions for mapping a specific `VmaAllocation`: vmaMapMemory(), vmaUnmapMemory().
+They are safer and more convenient to use than the standard Vulkan functions.
+You can map an allocation multiple times simultaneously - mapping is reference-counted internally.
+You can also map different allocations simultaneously regardless of whether they use the same `VkDeviceMemory` block.
+The way it's implemented is that the library always maps the entire memory block, not just the region of the allocation.
+For further details, see the description of the vmaMapMemory() function.
+Example:

-It is safer, more convenient and more efficient to use special feature designed
-for that: persistently mapped memory. Allocations made with
-`VMA_ALLOCATION_CREATE_MAPPED_BIT` flag set in
-VmaAllocationCreateInfo::flags are returned from device memory
-blocks that stay mapped all the time, so you can just access CPU pointer to it.
-VmaAllocationInfo::pMappedData pointer is already offseted to the beginning of
-particular allocation. Example:
+\code
+// Having these objects initialized:
+
+struct ConstantBuffer
+{
+    ...
+};
+ConstantBuffer constantBufferData;
+
+VmaAllocator allocator;
+VkBuffer constantBuffer;
+VmaAllocation constantBufferAllocation;
+
+// You can map and fill your buffer using the following code:
+
+void* mappedData;
+vmaMapMemory(allocator, constantBufferAllocation, &mappedData);
+memcpy(mappedData, &constantBufferData, sizeof(constantBufferData));
+vmaUnmapMemory(allocator, constantBufferAllocation);
+\endcode
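
The reference counting described in the added text can be illustrated with a small mock. This is only a sketch under stated assumptions, not the library's actual implementation: `MockBlock`, `MockAllocation`, `mapAllocation()` and `unmapAllocation()` are hypothetical names, and a plain byte vector stands in for a `VkDeviceMemory` block so that the example compiles without Vulkan headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one VkDeviceMemory block; not a library type.
struct MockBlock {
    std::vector<std::uint8_t> storage; // pretend device memory
    int mapCount = 0;                  // number of outstanding map requests
    void* mappedBase = nullptr;        // valid only while mapCount > 0

    explicit MockBlock(std::size_t size) : storage(size) {}

    // Maps the whole block on the first request; later requests only count.
    void* map() {
        if (mapCount++ == 0) {
            // A real allocator would call vkMapMemory() here, exactly once.
            mappedBase = storage.data();
        }
        return mappedBase;
    }

    // Unmaps the whole block when the last request is released.
    void unmap() {
        assert(mapCount > 0);
        if (--mapCount == 0) {
            // A real allocator would call vkUnmapMemory() here.
            mappedBase = nullptr;
        }
    }
};

// Hypothetical sub-allocation: a byte range inside a block.
struct MockAllocation {
    MockBlock* block;
    std::size_t offset;
    std::size_t size;
};

// Returns a pointer already offset to the beginning of the allocation.
void* mapAllocation(MockAllocation& a) {
    return static_cast<std::uint8_t*>(a.block->map()) + a.offset;
}

void unmapAllocation(MockAllocation& a) {
    a.block->unmap();
}
```

With this scheme two allocations that share a block can be mapped at the same time, because the block itself is mapped only once and stays mapped until the last user releases it.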
+
+\section memory_mapping_persistently_mapped_memory Persistently mapped memory
+
+Keeping your memory persistently mapped is generally OK in Vulkan.
+You don't need to unmap it before using its data on the GPU.
+The library provides a special feature designed for that:
+Allocations made with the `VMA_ALLOCATION_CREATE_MAPPED_BIT` flag set in
+VmaAllocationCreateInfo::flags stay mapped all the time,
+so you can access the CPU pointer to them at any time
+without needing to call any "map" or "unmap" function.
+Example:

 \code
 VkBufferCreateInfo bufCreateInfo = { VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO };
-bufCreateInfo.size = 1024;
+bufCreateInfo.size = sizeof(ConstantBuffer);
 bufCreateInfo.usage = VK_BUFFER_USAGE_TRANSFER_SRC_BIT;

 VmaAllocationCreateInfo allocCreateInfo = {};
@@ -268,17 +305,36 @@ VmaAllocation alloc;
 VmaAllocationInfo allocInfo;
 vmaCreateBuffer(allocator, &bufCreateInfo, &allocCreateInfo, &buf, &alloc, &allocInfo);

-// Buffer is immediately mapped. You can access its memory.
-memcpy(allocInfo.pMappedData, myData, 1024);
+// Buffer is already mapped. You can access its memory.
+memcpy(allocInfo.pMappedData, &constantBufferData, sizeof(constantBufferData));
 \endcode

-Memory in Vulkan doesn't need to be unmapped before using it e.g. for transfers,
-but if you are not sure whether it's `HOST_COHERENT` (here is surely is because
-it's created with `VMA_MEMORY_USAGE_CPU_ONLY`), you should check it. If it's
-not, you should call `vkInvalidateMappedMemoryRanges()` before reading and
-`vkFlushMappedMemoryRanges()` after writing to mapped memory on CPU. Example:
+There are some exceptions though, when you should consider mapping memory only for a short period of time:
+
+- When the operating system is Windows 7 or 8.x (Windows 10 is not affected because it uses WDDM2),
+  the device is a discrete AMD GPU,
+  and the memory type is the special 256 MiB pool of `DEVICE_LOCAL + HOST_VISIBLE` memory
+  (selected when you use `VMA_MEMORY_USAGE_CPU_TO_GPU`),
+  then whenever a memory block allocated from this memory type stays mapped
+  for the time of any call to `vkQueueSubmit()` or `vkQueuePresentKHR()`, this
+  block is migrated by WDDM to system RAM, which degrades performance. It doesn't
+  matter if that particular memory block is actually used by the command buffer
+  being submitted.
+- Keeping many large memory blocks mapped may impact performance or stability of some debugging tools.
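
One way to keep a mapping short-lived in cases like those listed above is a scope guard that unmaps on destruction. The following is a minimal sketch, not part of the library: `ScopedMapping` is a hypothetical helper, and the map and unmap steps are passed in as callables so the example compiles without Vulkan headers. In real code they would presumably wrap vmaMapMemory() and vmaUnmapMemory() for a given allocation.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical RAII guard: maps in the constructor, unmaps in the destructor.
class ScopedMapping {
public:
    ScopedMapping(std::function<void*()> map, std::function<void()> unmap)
        : unmap_(std::move(unmap)), data_(map()) {}

    ~ScopedMapping() { unmap_(); }

    // Non-copyable, so the unmap step runs exactly once per guard.
    ScopedMapping(const ScopedMapping&) = delete;
    ScopedMapping& operator=(const ScopedMapping&) = delete;

    // Pointer returned by the map step; valid until the guard is destroyed.
    void* data() const { return data_; }

private:
    std::function<void()> unmap_;
    void* data_;
};
```

Creating the guard in a narrow block right around the `memcpy()` ensures the memory is unmapped again before the next submit or present call, even if the code returns early.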
+
+\section memory_mapping_cache_control Cache control
+  
+Memory in Vulkan doesn't need to be unmapped before using it on the GPU,
+but unless a memory type has the `VK_MEMORY_PROPERTY_HOST_COHERENT_BIT` flag set,
+you need to manually invalidate the cache before reading from the mapped pointer
+using function `vkInvalidateMappedMemoryRanges()`
+and flush the cache after writing to the mapped pointer
+using function `vkFlushMappedMemoryRanges()`.
+Example:

 \code
+memcpy(allocInfo.pMappedData, &constantBufferData, sizeof(constantBufferData));
+
 VkMemoryPropertyFlags memFlags;
 vmaGetMemoryTypeProperties(allocator, allocInfo.memoryType, &memFlags);
 if((memFlags & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT) == 0)
@@ -291,27 +347,12 @@ if((memFlags & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT) == 0)
 }
 \endcode

-\section amd_perf_note Note on performance
+Please note that memory allocated with `VMA_MEMORY_USAGE_CPU_ONLY` is guaranteed to be host coherent.

-There is a situation that you should be careful about. It happens only if all of
-following conditions are met:
+Also, Windows drivers from all 3 PC GPU vendors (AMD, Intel, NVIDIA)
+currently provide `VK_MEMORY_PROPERTY_HOST_COHERENT_BIT` flag on all memory types that are
+`VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT`, so on this platform you may not need to bother.
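
The flag logic from the "Memory mapping" and "Cache control" text can be condensed into two predicates. This is a sketch under stated assumptions: the constants are local copies of the Vulkan bit values so the example compiles without `vulkan.h`, and the helper names `isMappable()` and `needsManualCacheControl()` are hypothetical, not library functions.

```cpp
#include <cassert>
#include <cstdint>

// Local copies of Vulkan memory property bits (assumed to match vulkan.h).
constexpr std::uint32_t kHostVisibleBit  = 0x00000002; // VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
constexpr std::uint32_t kHostCoherentBit = 0x00000004; // VK_MEMORY_PROPERTY_HOST_COHERENT_BIT

// Memory can be mapped only if its memory type is HOST_VISIBLE.
constexpr bool isMappable(std::uint32_t memFlags) {
    return (memFlags & kHostVisibleBit) != 0;
}

// Manual invalidate (before CPU reads) and flush (after CPU writes) are
// needed only for mappable memory that is not HOST_COHERENT.
constexpr bool needsManualCacheControl(std::uint32_t memFlags) {
    return isMappable(memFlags) && (memFlags & kHostCoherentBit) == 0;
}
```

In real code the `memFlags` value would come from vmaGetMemoryTypeProperties(), as in the example in this section.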

-# You use AMD GPU.
-# You use the memory type that is both `DEVICE_LOCAL` and `HOST_VISIBLE`
-   (used when you specify `VMA_MEMORY_USAGE_CPU_TO_GPU`).
-# Operating system is Windows 7 or 8.x (Windows 10 is not affected because it
-   uses WDDM2).
-
-Then whenever a `VkDeviceMemory` block allocated from this memory type is mapped
-for the time of any call to `vkQueueSubmit()` or `vkQueuePresentKHR()`, this
-block is migrated by WDDM to system RAM, which degrades performance. It doesn't
-matter if that particular memory block is actually used by the command buffer
-being submitted.
-
-To avoid this problem, either make sure to unmap all allocations made from this
-memory type before your Submit and Present, or use `VMA_MEMORY_USAGE_GPU_ONLY`
-and transfer from a staging buffer in `VMA_MEMORY_USAGE_CPU_ONLY`, which can
-safely stay mapped all the time.

 \page custom_memory_pools Custom memory pools

@@ -824,7 +865,7 @@ typedef struct VmaVulkanFunctions {
 /// Description of a Allocator to be created.
 typedef struct VmaAllocatorCreateInfo
 {
-    /// Flags for created allocator. Use VmaAllocatorCreateFlagBits enum.
+    /// Flags for created allocator. Use #VmaAllocatorCreateFlagBits enum.
    VmaAllocatorCreateFlags flags;
    /// Vulkan physical device.
    /** It must be valid throughout whole lifetime of created allocator. */
@@ -1001,7 +1042,7 @@ typedef enum VmaMemoryUsage
    /** Memory will be used on device only, so fast access from the device is preferred.
    It usually means device-local GPU (video) memory.
    No need to be mappable on host.
-    It is roughly equivalent of D3D12_HEAP_TYPE_DEFAULT.
+    It is roughly equivalent of `D3D12_HEAP_TYPE_DEFAULT`.

    Usage:
    
@@ -1017,10 +1058,10 @@ typedef enum VmaMemoryUsage
    VMA_MEMORY_USAGE_GPU_ONLY = 1,
    /** Memory will be mappable on host.
    It usually means CPU (system) memory.
-    Resources created in this pool are still accessible to the device, but access to them can be slower.
+    Resources created in this pool may still be accessible to the device, but access to them can be slower.
    Guarantees to be `HOST_VISIBLE` and `HOST_COHERENT`.
    CPU read may be uncached.
-    It is roughly equivalent of D3D12_HEAP_TYPE_UPLOAD.
+    It is roughly equivalent of `D3D12_HEAP_TYPE_UPLOAD`.

    Usage: Staging copy of resources used as transfer source.
    */
@@ -1033,12 +1074,12 @@ typedef enum VmaMemoryUsage
    */
    VMA_MEMORY_USAGE_CPU_TO_GPU = 3,
    /** Memory mappable on host (guarantees to be `HOST_VISIBLE`) and cached.
-    It is roughly equivalent of D3D12_HEAP_TYPE_READBACK.
+    It is roughly equivalent of `D3D12_HEAP_TYPE_READBACK`.

    Usage:

    - Resources written by device, read by host - results of some computations, e.g. screen capture, average scene luminance for HDR tone mapping.
-    - Any resources read on host, e.g. CPU-side copy of vertex buffer used as source of transfer, but also used for collision detection.
+    - Any resources read or accessed randomly on host, e.g. CPU-side copy of vertex buffer used as source of transfer, but also used for collision detection.
    */
    VMA_MEMORY_USAGE_GPU_TO_CPU = 4,
    VMA_MEMORY_USAGE_MAX_ENUM = 0x7FFFFFFF
@@ -1207,7 +1248,7 @@ typedef struct VmaPoolCreateInfo {
    /** \brief Vulkan memory type index to allocate this pool from.
    */
    uint32_t memoryTypeIndex;
-    /** \brief Use combination of `VmaPoolCreateFlagBits`.
+    /** \brief Use combination of #VmaPoolCreateFlagBits.
    */
    VmaPoolCreateFlags flags;
    /** \brief Size of a single `VkDeviceMemory` block to be allocated as part of this pool, in bytes.