Chandrajit Bajaj

TRLIB


Description

TRLIB (Texture-based Rendering Library) is a texture-based parallel volume rendering library that provides fast volume rendering of very large datasets through a client-server interface (a minimal sketch of the underlying root/worker pattern follows the feature list below).

  • Runs in parallel using MPI
  • Uses graphics hardware for fast rendering
  • Load balanced for a given dataset
  • CORBA or Wireless interface for coupling with external clients
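
As a rough illustration of this design (and not part of the TRLIB sources), the sketch below shows one frame of a root/worker arrangement in C++ and MPI: the root process broadcasts the current view, each sub-rendering server fills in a partial RGBA image, and the root gathers the partial images for compositing and delivery to the client. The image size and the commented-out helper name renderAssignedBricks are assumptions made purely for illustration.

    // Hypothetical sketch (not TRLIB code) of one frame in the root / sub-rendering-server
    // pattern: the root broadcasts the view, workers render partial images, the root gathers them.
    #include <mpi.h>
    #include <vector>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank = 0, size = 1;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        const int W = 256, H = 256;   // sub-image size (assumption for illustration)
        float view[16] = {};          // 4x4 viewing matrix, filled in by the root

        // The root (rank 0) decides the current view; every rendering server receives it.
        MPI_Bcast(view, 16, MPI_FLOAT, 0, MPI_COMM_WORLD);

        // Each sub-rendering server fills an RGBA sub-image for its share of the volume.
        // The actual texture-based rendering is stubbed out; the root contributes an empty image.
        std::vector<float> subImage(W * H * 4, 0.0f);
        if (rank != 0) {
            // renderAssignedBricks(view, subImage);   // hypothetical helper
        }

        // The root gathers all sub-images; it would then composite them in visibility
        // order and hand the final frame to the client.
        std::vector<float> gathered;
        if (rank == 0) gathered.resize(size * W * H * 4);
        MPI_Gather(subImage.data(), W * H * 4, MPI_FLOAT,
                   gathered.data(), W * H * 4, MPI_FLOAT, 0, MPI_COMM_WORLD);

        MPI_Finalize();
        return 0;
    }

In TRLIB itself the client does not talk to MPI directly; it reaches the root node through the CORBA (or wireless) interface listed above.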

References

C. Bajaj, S. Park and A. G. Thane
Parallel Multi-PC Volume Rendering System
CS & TICAM Technical Report, University of Texas at Austin, 2002.

Sanghun Park, Sangmin Park and C. Bajaj
Hardware Accelerated Multipipe Parallel Rendering of Large Data Streams
CS & TICAM Technical Report, University of Texas at Austin, 2001.

C. Bajaj, I. Ihm and S. Park
Compression-Based 3D Texture Mapping for Real-Time Rendering
Graphical Models, 2000, 62(6), pp. 391-410.

C. Bajaj, I. Ihm, S. Park and D. Song
Compression-Based Ray Casting of Very Large Volume Data in Distributed Environments
Proc. of HPC-Asia 2000, May 2000, pp. 720-725.


Download
  • A binary of the TRLIB client is available through our molecular visualization and processing tool VolRover.
  • The TRLIB source package is delivered on demand, upon explicit request.
    Please contact Dr. Chandrajit Bajaj (bajaj@cs.utexas.edu).


Software Usage
  • Installation:
    • Requirements: Linux, or Windows 2000 or XP; an NVIDIA GeForce3 or later graphics card; a C++ compiler
    • Building: On Linux, a Makefile is provided. On Windows, each library and the main program can be compiled once the proper include paths are set up.

  • User Documentation:

    • *** Requirements to run the texture-based volume rendering program: ***
        1. MPI must be installed on the server cluster so that the root and the sub-rendering servers can communicate.
        2. CORBA must be installed on the server, the client, and the display server.
          Set the environment variable for the CORBA library, e.g.:
          export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/ooc/lib
        3. Extract the files and compile the server:
          tar xvf server.tar
          cd server
          make
        4. The server needs a machine list file, e.g. Machines.eye, with one fully qualified host name per line:
          Machine1 full address, e.g. eye4.ices.utexas.edu
          Machine2 full address
          .
          .
          .

      < How to run the "volserver" program >
        The "volserver" program is in the following directory of the cluster:
        ./server
        1. Log in to all of the machines that will be used and make X Windows available.
        2. Set the environment variable for the CORBA library:
          export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/ooc/lib

        3. Run "mpirun" using the following format:
        mpirun -machinefile Machinenames -np 5 volserver -display localhost:0.0 -OAport 20000

        -- Options --
          -machinefile Machinenames
          The host names listed in the "Machinenames" file are used as rendering sub-nodes

          -np 5
          Start 5 processes

          -display localhost:0.0
          Use the local display (this machine) for the root node

        -- Summary --
          export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/ooc/lib
          mpirun -machinefile Machinenames -np 4 volserver -display localhost:0.0 -OAport 20000

      *** Requirements to run the display server program: ***
        1. CORBA must be installed on the server, the client, and the display server.
          Set the environment variable for the CORBA library, e.g.:
          export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/ooc/lib
        2. Extract the files and compile the display program:
          tar xvf display.tar
          cd display
          make
        3. The display server also needs a machine list file for the cluster, e.g. Machinenames:
          Machine1 full address, e.g. compute0-0
          Machine2 full address, e.g. compute0-1
          .
          .
          .

      < How to run the "display" program >
        Note: this program can be run on any server (platform).

        ----- Linux Red Hat CLUSTER -----
          The "display" program is in the following directory of the eye cluster:
            /home/junenim/volserver/display
          1. Log on to each machine on which you want to display
          2. Set the environment variable for the CORBA library, e.g.:
            export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/ooc/lib
          3. displaytile -display localhost:0.0 0
        ------------------------------------------

        ----- SGI Shared Memory Architecture -----
          The "display" program is in the following directory:
          ./display
          1. Log on to the shared-memory machine
          2. Set the environment variable for the CORBA library, e.g.:
            export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/ccv-pub/xyz/OB-4.0.3/ob/lib
          3. Execute the run10Display script
        -----------------------------------------------------

        -- run10Display --
          displaytile -display localhost:0.1 0 &
          displaytile -display localhost:0.2 1 &
          displaytile -display localhost:0.3 2 &
          displaytile -display localhost:0.4 3 &
          displaytile -display localhost:0.5 4 &
            localhost:0.1 - indicates the pipe number
              0 - indicates the tile number to be displayed
          .
          .
          .

        ----- 130-node Cluster -----
          Every machine is behind one front-end machine (Prism)
          1. Log on to Prism
          2. Set the environment variable for the CORBA library, e.g.:
            export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/xyzhang/ooc/lib
          3. mpirun -machinefile Machinenames -np 11 displaytile -display localhost:0.0
        ----------------------------------

        displaytile -display localhost:0.0 0

        -- Options --
          1. displaytile
            the display executable
          2. -display localhost:0.0
            Use the local display as a display node
          3. 0
            The tile number; it can be between 0 and 9

        -- Options --
          -machinefile Machinenames
          The host names listed in the "Machinenames" file are used as display nodes

          -np 11
          Start 11 processes

          -display localhost:0.0
          Use the local display for the root node


Further Details
  • Parallel Algorithm and Multipipe of Onyx2
    • In a parallel algorithm, it is important to determine how to effectively subdivide the given task into a number of small jobs. Image-space and object-space subdivision are the most popular methods in parallel volume rendering. With image-space subdivision, various optimization techniques proposed for enhancing rendering speed, such as early ray termination and hierarchical data structures, can be used; however, the data access pattern during parallel rendering is very irregular, which results in frequent texture-memory swapping, so this method is better suited to parallel rendering of volume data smaller than the texture memory. Object-space subdivision, in contrast, is more amenable to parallelization and can exploit data coherence very easily, but it is difficult to apply the early ray termination technique.
    • Our algorithm is based on the object-space subdivision method in order to minimize texture swapping. The master pipe P0 controls the slave pipes P1, P2, P3, P4, P5 and composites the sub-images; the slave pipes render their assigned bricks and write sub-images to shared memory under the control of the master pipe. A separate polygonization thread keeps generating the sequences of polygons perpendicular to the viewing direction until the program finishes.
    • As soon as the current view is set, a rendering order for the bricks is determined. Each pipe creates sub-images for its assigned bricks, and the master pipe then composites the sub-images according to that order (see the sketch below). Our algorithm needs per-frame synchronization between the master and slave pipes: because the master pipe has to wait until all slave pipes have written their sub-images to shared memory, the actual frame rate is bounded by the slowest pipe. We address this problem with a proportional brick assignment. The same multipipe rendering scheme can be applied, with little modification, not only to static but also to time-varying data. Rendering time-varying data naturally requires much more texture swapping; to reduce the cost of texture loading, the algorithm checks whether the current brick to be rendered is already resident in texture memory. The rendering speed and playback direction can be controlled for each timestep of time-varying data. Since our texture-based parallel algorithm renders sampling planes intersected with the bricks, the volume data is itself treated as a geometric object, so it is easy to combine volume data with ordinary geometric objects in one image.
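
    • The following is a minimal, illustrative C++ sketch of the steps above; it is not taken from the TRLIB sources, and the types and function names (Brick, sortBackToFront, compositeFrame, assignBricks) are assumptions. It sorts bricks back to front along the viewing direction, composites premultiplied-alpha sub-images with the "over" operator in that order, and assigns bricks to pipes in proportion to each pipe's measured speed.

        // Illustrative sketch (not TRLIB code) of back-to-front brick ordering,
        // "over" compositing of the sub-images, and proportional brick assignment.
        #include <algorithm>
        #include <cstddef>
        #include <vector>

        struct Vec3 { float x, y, z; };

        struct Brick {
            Vec3 center;                   // brick center in world space
            std::vector<float> subImage;   // RGBA sub-image, premultiplied alpha, same size as the frame
        };

        static float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

        // Visibility order: bricks farther along the viewing direction are drawn first.
        void sortBackToFront(std::vector<Brick>& bricks, const Vec3& eye, const Vec3& viewDir) {
            std::sort(bricks.begin(), bricks.end(), [&](const Brick& a, const Brick& b) {
                Vec3 da = { a.center.x - eye.x, a.center.y - eye.y, a.center.z - eye.z };
                Vec3 db = { b.center.x - eye.x, b.center.y - eye.y, b.center.z - eye.z };
                return dot(da, viewDir) > dot(db, viewDir);   // farthest first
            });
        }

        // Master-pipe compositing: each nearer sub-image goes "over" the accumulated frame.
        void compositeFrame(const std::vector<Brick>& sortedBricks, std::vector<float>& frame) {
            std::fill(frame.begin(), frame.end(), 0.0f);
            for (const Brick& b : sortedBricks)
                for (std::size_t i = 0; i + 3 < frame.size(); i += 4) {
                    float alpha = b.subImage[i + 3];
                    for (int c = 0; c < 4; ++c)
                        frame[i + c] = b.subImage[i + c] + (1.0f - alpha) * frame[i + c];
                }
        }

        // Proportional brick assignment: pipes that rendered the previous frame faster get
        // more bricks, so the per-frame synchronization is not dominated by the slowest pipe.
        std::vector<int> assignBricks(int numBricks, const std::vector<float>& pipeSpeed) {
            float total = 0.0f;
            for (float s : pipeSpeed) total += s;
            std::vector<int> owner(numBricks);
            for (int i = 0; i < numBricks; ++i) {
                float target = (i + 0.5f) / numBricks * total;   // brick i's position on the speed scale
                float acc = 0.0f;
                int p = 0;
                while (p + 1 < (int)pipeSpeed.size() && acc + pipeSpeed[p] < target)
                    acc += pipeSpeed[p++];
                owner[i] = p;
            }
            return owner;
        }

      In the multipipe system described above, the sub-images live in shared memory rather than being copied, and the master pipe performs this compositing once all slave pipes have finished the frame.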

  • Experimental Results
    Image captions (SGI Onyx2 results):
      • Visible Human female MRI data, shaded head with skin, data resolution 256x512x512
      • Visible Human male CT data, shaded head with skin, data resolution 512x512x256
      • Visible Human male CT data, shaded head with bone, data resolution 512x512x256
      • Visible Human male CT data, shaded whole body with skin, data resolution 512x512x1294
      • Visible Human male CT data, shaded whole body with muscle and bone, data resolution 512x512x1294
      • Visible Human male CT data, shaded whole body with bone, data resolution 512x512x1294

    Image captions (Barbados data):
      • Barbados data, Barbados only, data resolution 512x256x2048
      • Barbados data, Barbados with wells, data resolution 512x256x2048
      • Barbados data, Barbados with wells, data resolution 512x256x2048
      • Barbados data, Barbados, data resolution 512x256x2048

    The above 10 images were generated on an SGI Onyx2 system with 24 R12000 processors, 25 gigabytes of main memory, and six InfiniteReality2 graphics pipes, each equipped with multiple RM9 raster managers with 64 megabytes of texture memory.

    The following 12 images were generated on a PC with an NVIDIA GeForce3 graphics card, using hardware-accelerated rendering techniques. Using the GeForce3 hardware requires OpenGL extensions such as GL_NV_texture_shader2, GL_NV_register_combiners, GL_EXT_texture3D, GL_EXT_paletted_texture, and GL_ARB_multitexture. Since the GeForce3 supports shading in 3D texture space, the image quality when showing two materials at the same time is better than with frame-buffer, per-pixel based shading.
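
    To make the texture-based approach concrete, here is a minimal, self-contained C++/OpenGL sketch of the basic technique such a renderer builds on: the volume is uploaded as a 3D texture and drawn as a stack of axis-aligned, alpha-blended slices composited back to front. It is not TRLIB code; it uses only core OpenGL 1.2 3D texturing plus GLUT for the window, the procedural test volume and slice count are assumptions, and the vendor-specific shading extensions listed above are not used.

      // Hypothetical sketch (not TRLIB code): render a volume as a stack of
      // alpha-blended, axis-aligned slices through an OpenGL 3D texture.
      // glTexImage3D is core since OpenGL 1.2; older drivers expose the same
      // call as glTexImage3DEXT (GL_EXT_texture3D).
      #define GL_GLEXT_PROTOTYPES 1
      #include <GL/glut.h>
      #include <cmath>
      #include <vector>

      static const int N = 64;   // resolution of the procedural test volume (assumption)
      static GLuint volTex = 0;

      // Build a fuzzy-sphere test volume as RGBA voxels (stands in for real CT/MRI data).
      static std::vector<GLubyte> makeVolume() {
          std::vector<GLubyte> v(N * N * N * 4);
          for (int z = 0; z < N; ++z)
              for (int y = 0; y < N; ++y)
                  for (int x = 0; x < N; ++x) {
                      float dx = (x - N / 2.0f) / (N / 2.0f);
                      float dy = (y - N / 2.0f) / (N / 2.0f);
                      float dz = (z - N / 2.0f) / (N / 2.0f);
                      float d = std::sqrt(dx * dx + dy * dy + dz * dz);
                      GLubyte a = (GLubyte)(d < 0.9f ? (1.0f - d) * 255.0f : 0.0f);
                      GLubyte* p = &v[4 * (x + N * (y + N * z))];
                      p[0] = 255; p[1] = 200; p[2] = 160; p[3] = a / 4;
                  }
          return v;
      }

      static void init() {
          std::vector<GLubyte> vol = makeVolume();
          glGenTextures(1, &volTex);
          glBindTexture(GL_TEXTURE_3D, volTex);
          glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
          glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
          glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_WRAP_S, GL_CLAMP);
          glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_WRAP_T, GL_CLAMP);
          glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_WRAP_R, GL_CLAMP);
          glTexImage3D(GL_TEXTURE_3D, 0, GL_RGBA, N, N, N, 0, GL_RGBA, GL_UNSIGNED_BYTE, vol.data());

          // Simple orthographic camera looking down -z: larger z is nearer to the viewer.
          glMatrixMode(GL_PROJECTION);
          glLoadIdentity();
          glOrtho(-1, 1, -1, 1, -1, 1);
          glMatrixMode(GL_MODELVIEW);
          glLoadIdentity();
          glClearColor(0, 0, 0, 1);
      }

      static void display() {
          glClear(GL_COLOR_BUFFER_BIT);
          glEnable(GL_TEXTURE_3D);
          glBindTexture(GL_TEXTURE_3D, volTex);
          glEnable(GL_BLEND);
          glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);   // "over" compositing, back to front

          // Draw axis-aligned slices from back (z = -0.5) to front (z = +0.5).
          const int slices = 256;
          glBegin(GL_QUADS);
          for (int i = 0; i < slices; ++i) {
              float t = (i + 0.5f) / slices;   // texture r coordinate, back to front
              float z = t - 0.5f;              // slice position in object space
              glTexCoord3f(0, 0, t); glVertex3f(-0.5f, -0.5f, z);
              glTexCoord3f(1, 0, t); glVertex3f( 0.5f, -0.5f, z);
              glTexCoord3f(1, 1, t); glVertex3f( 0.5f,  0.5f, z);
              glTexCoord3f(0, 1, t); glVertex3f(-0.5f,  0.5f, z);
          }
          glEnd();

          glDisable(GL_BLEND);
          glDisable(GL_TEXTURE_3D);
          glutSwapBuffers();
      }

      int main(int argc, char** argv) {
          glutInit(&argc, argv);
          glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGBA);
          glutInitWindowSize(512, 512);
          glutCreateWindow("3D-texture slicing sketch");
          init();
          glutDisplayFunc(display);
          glutMainLoop();
          return 0;
      }

    TRLIB itself renders polygons perpendicular to the viewing direction (see the algorithm description above) and adds classification and shading through the GeForce3 extensions listed above.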



    Image captions (GeForce3 results):
      • Visible Human female CT data, data resolution 256x256x128
      • Visible Human male CT data, data resolution 256x256x128
      • Visible Human male CT data, data resolution 256x256x128
      • Visible Human female CT data, skin only, data resolution 256x256x128
      • Visible Human male CT data, skin only, data resolution 256x256x128
      • Visible Human male CT data, skin only, data resolution 256x256x128
      • Visible Human female CT data, bone only, data resolution 256x256x128
      • Visible Human male CT data, bone only, data resolution 256x256x128
      • Visible Human male CT data, bone only, data resolution 256x256x128
      • Visible Human female CT data, skin and bone, data resolution 256x256x128
      • Visible Human male CT data, skin and bone, data resolution 256x256x128
      • Visible Human male CT data, skin and bone, data resolution 256x256x128