The Passage
Programming Title

December 22, 2011
Michael Schoell

One of the most recent upgrades to DarkForge has been a massive overhaul of the scene graph management system. Setting up objects for rendering was always clunky for the user: it took a long list of steps, made in the hope that you did not accidentally mess up another object. Before I begin, I'll define a few terms.

  • Render Object - Every time the user wants to render another object, even if it is visually the same as an existing one, they create a render object. Put simply, it stores the unique position of the object, along with all the other objects that dictate what is being rendered and how.
  • Texture - As you can imagine, this object is the instance of a texture in the scene. Each texture that is loaded up is an instance.
  • Render Context - This object holds the shaders and various other video card related data to render the render object.
  • Mesh - Whatever mesh you want to render. It holds bounding volume, the collision mesh, and the rendered mesh data. Later it will contain more advanced data to facilitate level of detail.
  • Scene - The scene is a collection of all the objects to be rendered from a specific camera view. One main scene is used for the world, for all the visual objects that the player will see. Most of the time, if you want a different view, change the camera, not the scene. Other scenes would be used for holding the meshes that receive shadows; there the scene is rendered from a different perspective, and some objects may be left out of shadowing for performance.

Now, before this overhaul, creating a mesh required the user to create a texture for that mesh and then set the render context in the mesh. If you wanted a different context for an object using the same mesh, you needed a new instance of that mesh. The complexity of where to set objects, and when you could and could not, made the system clunky. It was easy to switch more objects than you intended to a new render context or texture.

One reason for this was that the objects holding the textures, render contexts, meshes, and render objects were themselves the objects that formed the scene graph. They linked themselves to their parents and could only have one parent at a time. If you changed a parent, all of its children received that change. This would also have caused problems with multiple scenes and thus multiple scene graphs.

My systems always have one major criterion: speed. The rendering pipeline thrives on efficiency. I have found that even with the multithreading, which has allowed me to triple the number of objects being rendered at similar frames per second, DarkForge is still CPU bound. On a six-core processor, I found that running three instances of the same project (each rendering 10,000 objects and 100 lights) did not reduce the frames per second of any of them. Since each instance keeps to its own cores, none of them slows the others down on the CPU; the single GPU, however, receives commands from all three. That amounts to 30,000 objects and 300 lights at 80 FPS.

This has made me conclude that the GPU is still idle too often. There is little I can do to speed this process up: DirectX suffers from its own performance issues, and the rest would lie in hard-to-find code optimizations. The device thread, dedicated solely to processing GPU commands, should already be well designed for cache hits.

None of this is that important, however; it is beyond the scope of current priorities. I only bring it up because I do not want to slow the pipeline down by adding a system that uses up a lot of cycles. With these issues in mind, I restructured the code.

Render objects are now told the render context, texture, and mesh to use for rendering. Changing the data for one has no effect on another. These objects also no longer contain scene graph information and are not parented in any way. The scene graph comes in when the object is added to a scene. Using the relationship between child and parent, I create a hash and look up the object in a hash table. If the scene finds an existing instance, it simply uses it as the parent. If not, it creates one and adds it to the hash table.
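The lookup described above might be sketched as follows. This is only an illustration, not DarkForge's actual code: the names Scene, SceneNode, NodeKey, and findOrCreate are hypothetical, and the ordering of the graph (context, then texture, then mesh) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the engine's resource types.
struct RenderContext {};
struct Texture {};
struct Mesh {};

// A render object only records what it uses; it holds no parent links.
struct RenderObject {
    const RenderContext* context;
    const Texture* texture;
    const Mesh* mesh;
};

// One node of the scene's internal graph (assumed order: context, texture, mesh).
struct SceneNode {
    const SceneNode* parent;
    const void* resource;
    std::vector<const RenderObject*> objects; // filled on mesh-level nodes only
};

// Hash key built from the parent/child relationship.
struct NodeKey {
    const SceneNode* parent;
    const void* resource;
    bool operator==(const NodeKey& o) const {
        return parent == o.parent && resource == o.resource;
    }
};

struct NodeKeyHash {
    std::size_t operator()(const NodeKey& k) const {
        std::size_t h1 = std::hash<const void*>()(k.parent);
        std::size_t h2 = std::hash<const void*>()(k.resource);
        return h1 ^ (h2 + 0x9e3779b9u + (h1 << 6) + (h1 >> 2));
    }
};

class Scene {
public:
    // Attaches the object and returns the mesh-level node it landed under.
    // The caller must keep the render object alive while it is in the scene.
    SceneNode* add(const RenderObject& obj) {
        assert(obj.context && obj.texture && obj.mesh);
        SceneNode* ctx = findOrCreate(nullptr, obj.context);
        SceneNode* tex = findOrCreate(ctx, obj.texture);
        SceneNode* mesh = findOrCreate(tex, obj.mesh);
        mesh->objects.push_back(&obj);
        return mesh;
    }

    std::size_t nodeCount() const { return nodes_.size(); }

private:
    SceneNode* findOrCreate(const SceneNode* parent, const void* resource) {
        NodeKey key{parent, resource};
        auto it = nodes_.find(key);
        if (it != nodes_.end()) return it->second.get(); // reuse the existing parent
        auto node = std::make_unique<SceneNode>();
        node->parent = parent;
        node->resource = resource;
        SceneNode* raw = node.get();
        nodes_.emplace(key, std::move(node));
        return raw;
    }

    std::unordered_map<NodeKey, std::unique_ptr<SceneNode>, NodeKeyHash> nodes_;
};
```

Two render objects that use the same context, texture, and mesh end up under the same node, while an object with a different texture gets its own branch below the shared context node.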

With this system, any time data changes in the render object (such as which texture to use), the scene just re-hashes the object to find the new parent. Objects share a parent only when that parent is the correct one for all of them. Speed-wise, the only new overhead comes when objects are added, modified, or removed. That should not happen every frame, and even if some objects do change that often, it should not be a major performance hit.
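The re-hash step could look something like the sketch below. It is a simplified, flat version under stated assumptions: objects are grouped by a single key made of context, texture, and mesh rather than by a full graph, and the names StateKey, setTexture, and parentCount are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <vector>

// Simplified render object; opaque pointers stand in for the real resources.
struct RenderObject {
    const void* context;
    const void* texture;
    const void* mesh;
};

struct StateKey {
    const void* context;
    const void* texture;
    const void* mesh;
    bool operator==(const StateKey& o) const {
        return context == o.context && texture == o.texture && mesh == o.mesh;
    }
};

struct StateKeyHash {
    std::size_t operator()(const StateKey& k) const {
        std::hash<const void*> h;
        std::size_t seed = h(k.context);
        seed ^= h(k.texture) + 0x9e3779b9u + (seed << 6) + (seed >> 2);
        seed ^= h(k.mesh) + 0x9e3779b9u + (seed << 6) + (seed >> 2);
        return seed;
    }
};

class Scene {
public:
    void add(RenderObject* obj) { parents_[keyOf(*obj)].push_back(obj); }

    void remove(RenderObject* obj) {
        auto it = parents_.find(keyOf(*obj));
        assert(it != parents_.end()); // the object must have been added before
        std::vector<RenderObject*>& list = it->second;
        list.erase(std::remove(list.begin(), list.end(), obj), list.end());
        if (list.empty()) parents_.erase(it); // drop parents nobody uses
    }

    // Changes the texture and re-hashes the object under its new parent.
    void setTexture(RenderObject* obj, const void* texture) {
        remove(obj);            // uses the old key, so it must run first
        obj->texture = texture;
        add(obj);               // hashes again with the new key
    }

    std::size_t parentCount() const { return parents_.size(); }

private:
    static StateKey keyOf(const RenderObject& o) {
        return StateKey{o.context, o.texture, o.mesh};
    }

    std::unordered_map<StateKey, std::vector<RenderObject*>, StateKeyHash> parents_;
};
```

The cost is paid only in add, remove, and setTexture; nothing here runs per frame for objects that do not change.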

DarkForge advances to match my needs. If this system proves poor later, it will be remade. When needs change, the system will change to serve both new and old needs effectively. It must perform well under a variety of conditions with little user hassle.

Here is a small list of future tasks I am working towards with DarkForge:

  • Shadow Mapping
  • Normal Mapping
  • Distant Light Culling
  • Depth of Field
  • Level of Detail

These are some of the major focuses, though I have some personal game projects that do help direct features for DarkForge, such as the above scene graph changes.


November 22, 2011
Michael Schoell

There has not been a lot of visual progress made with DarkForge recently. Much of the work has been tied up in the core architecture. Since DarkForge is meant to support both DirectX 9 and DirectX 11, this work has mostly involved supporting the two. The graphics APIs differ in what you have access to and how you access it.

This was the case with sending constants to the shaders. In DirectX 9 you simply make one call per constant. DirectX 11 instead needs an entire structure, an ID3D11Buffer, set up for the shader and filled with the constants. Having fewer individual commands set up and passed to the video card is likely a good thing, but it adds some hassle to the code setup. Since I did not want to add another class to separate how DirectX 9 and DirectX 11 handle things, I simply opted for a void pointer.

The void pointer is sized like a tiny memory pool: in DirectX 9 it holds a list of constants, and in DirectX 11 it holds the structure of constants, with room in the last four bytes for a pointer to the ID3D11Buffer that the data is passed to. This keeps the code from requiring another set of virtual function tables or function pointers and has zero impact on the end user.
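A layout like this could be sketched as below. It is an illustration under assumptions, not the engine's code: FakeGpuBuffer stands in for ID3D11Buffer, the helper names are hypothetical, and the trailing slot is sized with sizeof on a pointer rather than a fixed four bytes so the sketch also works in a 64-bit build.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for ID3D11Buffer; the real type comes from d3d11.h.
struct FakeGpuBuffer { int id; };

// Allocates zeroed storage for the constants plus a trailing pointer slot.
// The caller releases the block with std::free.
void* createConstantBlock(std::size_t constantBytes) {
    return std::calloc(1, constantBytes + sizeof(FakeGpuBuffer*));
}

// Copies float constants into the block at the given byte offset.
void writeConstants(void* block, std::size_t offset,
                    const float* values, std::size_t count) {
    assert(block && values);
    std::memcpy(static_cast<unsigned char*>(block) + offset,
                values, count * sizeof(float));
}

// Stores the buffer pointer in the slot that follows the constants.
void setBuffer(void* block, std::size_t constantBytes, FakeGpuBuffer* buffer) {
    assert(block);
    std::memcpy(static_cast<unsigned char*>(block) + constantBytes,
                &buffer, sizeof(buffer));
}

// Reads the buffer pointer back out of the trailing slot.
FakeGpuBuffer* getBuffer(const void* block, std::size_t constantBytes) {
    assert(block);
    FakeGpuBuffer* buffer = nullptr;
    std::memcpy(&buffer,
                static_cast<const unsigned char*>(block) + constantBytes,
                sizeof(buffer));
    return buffer;
}
```

Using memcpy for the pointer slot avoids alignment problems when the constant data does not end on a pointer-sized boundary.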

The user interface has also been improved over the past several weeks, from how objects are managed to display options such as anchor points, how objects anchor to their parents, and improved text options. Although it does not support Lua commands as my previous version did, it is far more optimized and robust and should prove easy to expand in the future.

Below are some screenshots highlighting the anchor points. Setting one up takes no more than a line of code. The window is the tiled chessboard; the sub-object whose anchor point we are modifying is the grass texture.

  • CenterCenter - The grass image is centered in the chessboard window.
  • UpperLeftUpperLeft - Default setup; the grass image is at 0, 0 in its parent window, as expected.
  • CenterUpperLeft - The grass image's anchor is in the center, making it appear to begin at -32, -32.
  • LowerRightLowerRight - The grass image's position is still 0, 0. Now it appears in the lower right-hand corner due to a change in its anchors.

Beyond the basic setup to render an empty screen, which amounts to a dozen or so lines of code, it does not take much to get the above to show. The user must create a few objects, all of which could be defined in XML files if desired, reducing the required C++ code to a single line. All objects default to the upper left corner of their parent with an upper left anchor point.

dfTexture *pTexture, *pTexture2;

//Load the textures.
pTexture = pSceneManager->loadTexture("Assets/Texture.png");
pTexture2 = pSceneManager->loadTexture("Assets/Texture2.png");

//Create the window.
dfWindow *pWindow = pUIManager->createWindow(0, pTexture, 0, 0, 256, 256); //Parameters are: parent pointer (root windows are NULL), the texture, position X, position Y, width, height

//Anchor it to its parent
pWindow->setAnchorToParent(CENTER_ANCHOR); //Centers the window on the screen.
pWindow->setAnchor(CENTER_ANCHOR); //Positions the window by its center point rather than its upper left corner.

dfImage *pImage = pUIManager->createImage(pWindow, 0, 0, 64, 64, pTexture2); //Parameters are: parent pointer (the previously made window will house us), position X, position Y, width, height, the texture

pImage->setAnchor(LOWER_RIGHT_ANCHOR); //Bases our position off of the lower right corner. If we did not have the next line, this line would extend the image off the screen from -64, -64 to 0, 0.

pImage->setAnchorToParent(LOWER_RIGHT_ANCHOR); //Moves the image to the lower right corner of pWindow.
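
The anchor behavior described with the screenshots can be worked through with a small sketch. This is an assumption about how the two anchors combine, inferred from the positions quoted above, and not the engine's implementation; the resolve and anchorOffset helpers and the UPPER_LEFT_ANCHOR name are hypothetical.

```cpp
#include <cassert>

// CENTER_ANCHOR and LOWER_RIGHT_ANCHOR appear in the sample above;
// UPPER_LEFT_ANCHOR is an assumed name for the default.
enum Anchor { UPPER_LEFT_ANCHOR, CENTER_ANCHOR, LOWER_RIGHT_ANCHOR };

struct Point { int x; int y; };

// Offset of an anchor point inside a rectangle of the given size.
Point anchorOffset(Anchor anchor, int width, int height) {
    switch (anchor) {
        case CENTER_ANCHOR:      return Point{width / 2, height / 2};
        case LOWER_RIGHT_ANCHOR: return Point{width, height};
        case UPPER_LEFT_ANCHOR:  break;
    }
    return Point{0, 0};
}

// Upper left corner of a child, relative to its parent's upper left corner.
// The child's own anchor is placed at `position`, measured from the chosen
// anchor point on the parent.
Point resolve(Point position, Anchor anchor, int width, int height,
              Anchor anchorToParent, int parentWidth, int parentHeight) {
    assert(width >= 0 && height >= 0 && parentWidth >= 0 && parentHeight >= 0);
    Point onParent = anchorOffset(anchorToParent, parentWidth, parentHeight);
    Point own = anchorOffset(anchor, width, height);
    return Point{onParent.x + position.x - own.x,
                 onParent.y + position.y - own.y};
}
```

With a 64 by 64 image inside a 256 by 256 window, a center anchor against the parent's upper left corner gives -32, -32, and a lower right anchor against the parent's lower right corner gives 192, 192, so the image sits flush in that corner.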


September 09, 2011
Michael Schoell


DirectX 9 vs DirectX 11

The above screenshot shows a basic sample of DarkForge using DirectX 9 and DirectX 11. Both use a multithreaded approach: DirectX 9 with my custom command list and DirectX 11 with render lists. DirectX 11 support is still in its infancy, and this is about all it can do right now. Ultimately I hope to be able to switch between the two without any code differences, though obviously certain features such as Shader Model 5.0 and Compute Shaders will be specific to DirectX 11.
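
The general idea of a command list can be sketched as below: worker threads record commands, and a single thread replays them against the device. This is a generic illustration and not DarkForge's command list; FakeDevice, CommandList, and renderFrame are hypothetical names, and the real engine presumably stores something more compact than std::function.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the graphics device; it only logs draw requests.
struct FakeDevice {
    std::vector<int> drawn;
    void draw(int objectId) { drawn.push_back(objectId); }
};

// A list of deferred device commands.
class CommandList {
public:
    void record(std::function<void(FakeDevice&)> command) {
        commands_.push_back(std::move(command));
    }
    void execute(FakeDevice& device) const {
        for (const auto& command : commands_) command(device);
    }

private:
    std::vector<std::function<void(FakeDevice&)>> commands_;
};

// Records draws on worker threads, then replays every list on the calling
// thread, which plays the role of the device thread.
std::vector<int> renderFrame(int objectCount, int workerCount) {
    assert(objectCount >= 0 && workerCount > 0);
    std::vector<CommandList> lists(static_cast<std::size_t>(workerCount));
    std::vector<std::thread> workers;
    for (int w = 0; w < workerCount; ++w) {
        workers.emplace_back([&lists, w, objectCount, workerCount] {
            // Each worker owns exactly one list, so recording needs no locks.
            for (int id = w; id < objectCount; id += workerCount) {
                lists[static_cast<std::size_t>(w)].record(
                    [id](FakeDevice& device) { device.draw(id); });
            }
        });
    }
    for (std::thread& worker : workers) worker.join();

    FakeDevice device; // only this thread ever touches the device
    for (const CommandList& list : lists) list.execute(device);
    return device.drawn;
}
```

Because the device is touched from one thread only, the recording side can scale across cores without any synchronization on the device itself.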

Recent feature additions and changes to DarkForge have revealed issues and temporarily broken full-screen mode. Fixing full-screen is the least of my worries, as it will fix itself once the long-term changes to data management are finished. The revealed issues, however, are proving troublesome, as they involve multithreading. They almost never show up, so testing my theories is a slow process.

Soon, I hope to have some DirectX 11 rendering screenshots showing a comparable scene in DirectX 9 and the speed differences. One of the first projects to show off the difference in capabilities will be my Mandelbrot program. DirectX 11 supports doubles in shaders, which will allow deeper zooming in that project.
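
Why doubles allow deeper zooming can be shown on the CPU with a small check: once the spacing between neighbouring pixels drops below the precision of the number type, adjacent pixels map to the same coordinate and the image stops gaining detail. The canResolve helper is a hypothetical name, and the sample coordinates used with it are arbitrary.

```cpp
#include <cassert>

// True when two neighbouring pixel coordinates are still distinct values
// after being stored in type T.
template <typename T>
bool canResolve(double center, double pixelSpacing) {
    assert(pixelSpacing > 0.0);
    T a = static_cast<T>(center);
    T b = static_cast<T>(center + pixelSpacing);
    return a != b;
}
```

Around a coordinate such as -0.75, a pixel spacing of one billionth is below what a float can distinguish, while a double still separates the two values comfortably.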


August 1, 2011
Michael Schoell

Point Lights

This image shows the result of adding point lights to my traditional deferred shading. As in previous images, there are ten thousand individual draw calls here. Spheres of two different sizes are drawn without using scaling, and they have 13 vertical and horizontal slices each. Three different textures are applied to both sizes of sphere. Thanks to the scene map, there is only a small handful of state changes, and thus command calls, which is crucial for speed.
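
The effect of grouping draws by state can be illustrated with a small counting sketch. It assumes a scene shaped like the one described (two meshes, three textures, one render context) and uses hypothetical names throughout; it is not how DarkForge orders its draws internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Integer ids stand in for the real context, texture, and mesh objects.
struct DrawCall { int context; int texture; int mesh; };

// Counts how often a state has to be set because it differs from the
// previous draw. The first draw sets all three states.
std::size_t countStateChanges(const std::vector<DrawCall>& draws) {
    std::size_t changes = 0;
    const DrawCall* previous = nullptr;
    for (const DrawCall& draw : draws) {
        if (!previous || previous->context != draw.context) ++changes;
        if (!previous || previous->texture != draw.texture) ++changes;
        if (!previous || previous->mesh != draw.mesh) ++changes;
        previous = &draw;
    }
    return changes;
}

// Orders draws so that calls sharing the same state sit next to each other.
void sortByState(std::vector<DrawCall>& draws) {
    std::sort(draws.begin(), draws.end(),
              [](const DrawCall& a, const DrawCall& b) {
                  return std::tie(a.context, a.texture, a.mesh) <
                         std::tie(b.context, b.texture, b.mesh);
              });
}

// Builds an interleaved scene: one context, three textures, two meshes.
std::vector<DrawCall> makeScene(std::size_t objectCount) {
    std::vector<DrawCall> draws;
    draws.reserve(objectCount);
    for (std::size_t i = 0; i < objectCount; ++i) {
        draws.push_back({0, static_cast<int>(i % 3), static_cast<int>(i % 2)});
    }
    return draws;
}
```

For ten thousand draws built this way, submitting them in creation order costs 20,001 state changes in this model, while the sorted order costs 10, even though the number of draw calls is unchanged.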

That's how the system has worked; nothing new so far. What is new are the point lights, one thousand of them. Frames per second are still above thirty, and while there is almost no game logic happening, it does show there is room to keep an object-dense scene while running a game. Thus far I am happy with the work and even happier knowing there is room for improvement.

Some immediate work will go into making DarkForge more usable for a game. Just because it can pump a lot of polygons into the scene does not mean it's worth anything if it cannot make Pong (in 3D, of course). Collision handling and various line-segment-to-triangle checks are first on the list, and I will see where things go from there.


May 26, 2011
Michael Schoell

Work on the DarkForge engine has progressed slowly and is mostly in a stage of consolidation. Parts are being revised for ease of use, flexibility, and robustness. Everything from how you create meshes to how they are managed has been under revision. A new trie data structure was implemented to provide fast lookups by name for the various objects managed by the scene. This works for most objects, though a solution for rapidly iterating through certain lists for the rendering pipeline is still in the works.
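
A name-keyed trie can be sketched in a few lines. This is a generic version for illustration, with a hypothetical NameTrie class and a std::map of children per node; the engine's own structure may differ in both layout and API.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Maps object names to non-owning pointers, one character per trie level.
template <typename T>
class NameTrie {
public:
    void insert(const std::string& name, T* object) {
        assert(object != nullptr);
        Node* node = &root_;
        for (char ch : name) {
            std::unique_ptr<Node>& child = node->children[ch];
            if (!child) child = std::make_unique<Node>();
            node = child.get();
        }
        node->object = object; // a repeated name overwrites the old entry
    }

    // Returns the object stored under exactly this name, or nullptr.
    T* find(const std::string& name) const {
        const Node* node = &root_;
        for (char ch : name) {
            auto it = node->children.find(ch);
            if (it == node->children.end()) return nullptr;
            node = it->second.get();
        }
        return node->object;
    }

private:
    struct Node {
        std::map<char, std::unique_ptr<Node>> children;
        T* object = nullptr;
    };

    Node root_;
};
```

Lookup cost depends on the length of the name rather than on how many objects are stored, which suits lookups by name but, as noted above, does not by itself give a fast way to iterate over every object.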

More tests of the multithreading component of the renderer were done on various computers, and the results were startling. Where the renderer excelled on my quad-core machine, on a dual-core machine it seems to run at a fourth of the speed when rendering nothing more than a blank screen. I expect the threads are causing too much overhead when they are not doing much, since the renderer still far exceeds the number of objects that can easily be rendered with a single-threaded pipeline. However, I will examine this further to ensure it is not being held back by poorly written code.

Once this consolidation process is complete, work will commence on deferred point lights.


Site Development and Design by <CS>

Graphic Design by Nathan Schoell