FAQ
===

Why another software rasterizer?
--------------------------------

Good question, given there are already three (swrast, softpipe,
llvmpipe) in the Mesa tree.  There are two important reasons:

 * Architecture - given our focus on scientific visualization, our
   workloads differ substantially from the typical game: we have heavy
   vertex load and relatively simple shaders.  In addition, the core
   counts of the machines we run on are much higher.  These parameters
   led to design decisions quite different from llvmpipe's.

 * Historical - Intel had developed a high-performance software
   graphics stack for internal purposes.  Later we adapted this
   graphics stack for use in visualization and decided to move forward
   with Mesa to provide a high-quality API layer while at the same
   time benefiting from the excellent performance the software
   rasterizer gives us.

What's the architecture?
------------------------

SWR is a tile-based immediate-mode renderer with a sort-free threading
model that is arranged as a ring of queues.  Each entry in the ring
represents a draw context that contains all of the draw state and work
queues.  An API thread sets up each draw context, and worker threads
execute both the frontend (vertex/geometry processing) and backend
(fragment) work as required.  The ring allows backend threads to pull
work in order.  Large draws are split into chunks so that vertex
processing can happen in parallel, while the backend work pickup
preserves draw ordering.
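
As a rough illustration of the scheduling idea described above, the
following Python sketch splits each draw into chunks, hands the chunks
to a pool of workers for frontend processing, and has the backend pick
results up in draw order.  It is not SWR code: ``DrawContext``,
``CHUNK_SIZE``, ``frontend`` and ``render`` are hypothetical names, a
thread pool stands in for the ring of queues, and the "vertex
processing" is a trivial multiply.  It models ordering only, not real
parallel speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field

CHUNK_SIZE = 4  # hypothetical number of vertices per frontend chunk


@dataclass
class DrawContext:
    """Hypothetical stand-in for one ring entry: draw state plus its work."""
    draw_id: int
    vertices: list
    state: dict = field(default_factory=dict)


def split_into_chunks(vertices, size=CHUNK_SIZE):
    """Split a large draw so several workers can process it at once."""
    return [vertices[i:i + size] for i in range(0, len(vertices), size)]


def frontend(chunk, scale):
    """Placeholder vertex processing: scale every vertex value."""
    return [v * scale for v in chunk]


def render(draws, workers=4):
    """Submit all frontend chunks to the pool, then consume in draw order."""
    output = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = []
        for dc in draws:
            scale = dc.state.get("scale", 1)
            futures = [pool.submit(frontend, chunk, scale)
                       for chunk in split_into_chunks(dc.vertices)]
            pending.append((dc.draw_id, futures))
        # Backend pickup: results are taken in submission order, so draw
        # ordering is preserved no matter which chunk finished first.
        for draw_id, futures in pending:
            for future in futures:
                output.extend((draw_id, v) for v in future.result())
    return output


draws = [DrawContext(0, [1, 2, 3, 4, 5], {"scale": 2}),
         DrawContext(1, [10, 20], {"scale": 3})]
result = render(draws)
```

Here the five-vertex draw is split into two chunks, yet every result
from draw 0 appears in ``result`` before any result from draw 1.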

Our pipeline uses just-in-time compiled code for the fetch shader
(which does vertex attribute gathering and AOS to SOA conversions), the
vertex and fragment shaders, streamout, and fragment blending.  SWR
core also supports geometry and compute shaders, but we haven't exposed
them through our driver yet.  The fetch shader, streamout, and blend
are built internally to SWR core using LLVM directly, while for the
vertex and pixel shaders we reuse bits of llvmpipe from
``gallium/auxiliary/gallivm`` to build the kernels, which we wrap
differently than llvmpipe's ``auxiliary/draw`` code.
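
To make the AOS to SOA conversion mentioned above concrete, here is a
minimal Python sketch of the data layout change only.  The real fetch
shader is JIT-compiled with LLVM; the function name ``aos_to_soa``, the
dictionary layout, and the zero padding of the last batch are all
assumptions made for illustration.

```python
SIMD_WIDTH = 8  # a 256-bit AVX register holds eight 32-bit floats


def aos_to_soa(vertices, width=SIMD_WIDTH):
    """Transpose (x, y, z, w) tuples into batches of per-component lanes.

    Each batch covers `width` vertices.  The last batch is padded with
    zero vertices (an assumption) so every lane list has `width` entries.
    """
    batches = []
    for start in range(0, len(vertices), width):
        group = list(vertices[start:start + width])
        group += [(0.0, 0.0, 0.0, 0.0)] * (width - len(group))
        # zip(*group) turns a list of vertices into one tuple per component.
        xs, ys, zs, ws = (list(lane) for lane in zip(*group))
        batches.append({"x": xs, "y": ys, "z": zs, "w": ws})
    return batches


# Array of structures: one tuple per vertex.
aos = [(1.0, 2.0, 3.0, 1.0), (4.0, 5.0, 6.0, 1.0), (7.0, 8.0, 9.0, 1.0)]
# Structure of arrays: one list per component, four lanes wide here.
soa = aos_to_soa(aos, width=4)
```

With the SOA layout, a single SIMD instruction can operate on the same
component of several vertices at once, which suits vertex-heavy
workloads.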

What's the performance?
-----------------------

For the types of high-geometry workloads we're interested in, we are
significantly faster than llvmpipe.  This is to be expected, as
llvmpipe only threads the fragment processing and not the geometry
frontend.  The performance advantage over llvmpipe scales roughly
linearly with the number of cores available.

While our current performance is quite good, we know there is more
potential in this architecture.  When we switched from a prototype
OpenGL driver to Mesa we regressed performance severely: some of it
due to interface issues that need tuning, some due to differences in
shader code generation, and some due to conformance and feature
additions to core SWR.  We are looking to recover most of this
performance.

What's the conformance?
-----------------------

The major applications we are targeting are all based on the
Visualization Toolkit (VTK), and as such our development efforts have
been focused on making sure these work as well as possible.  Our
current code passes 99% of VTK's rendering tests with their new
"OpenGL2" (really OpenGL 3.2) backend.

piglit testing shows a much lower pass rate, roughly 80% at the time
of writing.  Core SWR undergoes rigorous unit testing, so we are quite
confident in the rasterizer and understand the areas where it
currently has issues (for example, line rendering is done with
triangles, so it doesn't match the strict line rendering rules).  The
majority of the piglit failures are errors in our driver layer
interfacing Mesa and SWR.  Fixing these issues is one of our major
future development goals.

Why are you open sourcing this?
-------------------------------

 * Our customers prefer open source, and allowing them to simply
   download the Mesa source and enable our driver makes life much
   easier for them.

 * The internal gallium APIs are not stable, so we'd like our driver
   to be visible to anyone changing them.

 * It's easier to work with the Mesa community when the source we're
   working with can be used as reference.

What are your development plans?
--------------------------------

 * Performance - see the performance section earlier for details.

 * Conformance - see the conformance section earlier for details.

 * Features - core SWR has a lot of functionality we have yet to
   expose through our driver, such as MSAA, geometry shaders, compute
   shaders, and tessellation.

 * AVX512 support

What is the licensing of the code?
----------------------------------

 * All code is under the normal Mesa MIT license.

Will this work on AMD?
----------------------

 * If using an AMD processor with AVX or AVX2, it should work, though
   we don't have that hardware around to test.  Patches, if needed,
   would be welcome.

Will this work on ARM, MIPS, POWER, <other non-x86 architecture>?
-------------------------------------------------------------------------

 * Not without a lot of work.  We make extensive use of AVX and AVX2
   intrinsics in our code and in the in-tree JIT creation.  This
   codebase is not intended to support non-x86 architectures.

What hardware do I need?
------------------------

 * Any x86 processor with at least AVX (introduced in the Intel
   Sandy Bridge and AMD Bulldozer microarchitectures in 2011) will
   work.

 * You don't need a fire-breathing Xeon machine to work on SWR - we do
   day-to-day development with laptops and desktop CPUs.

Does one build work on both AVX and AVX2?
-----------------------------------------

Yes. The build system creates two shared libraries, ``libswrAVX.so`` and
``libswrAVX2.so``, and ``swr_create_screen()`` loads the appropriate one at
runtime.
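
The selection logic can be pictured with the short Python sketch below.
The actual choice is made in C++ inside ``swr_create_screen()``, and
this FAQ does not describe how it detects CPU features, so everything
here is an assumption for illustration: the helper names
``parse_cpu_flags`` and ``select_swr_library`` are hypothetical, and
the flag parsing assumes the ``flags`` line format of
``/proc/cpuinfo`` on x86 Linux.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the feature flags from Linux /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def select_swr_library(flags):
    """Pick the library built for the best instruction set available."""
    if "avx2" in flags:
        return "libswrAVX2.so"
    if "avx" in flags:
        return "libswrAVX.so"
    # SWR requires at least AVX, so there is nothing to fall back to.
    raise RuntimeError("SWR needs a CPU with at least AVX")


# Made-up sample text; on a Linux machine you could instead pass the
# contents of /proc/cpuinfo.
sample = "processor\t: 0\nflags\t\t: fpu sse2 avx avx2\n"
chosen = select_swr_library(parse_cpu_flags(sample))
```

On a CPU that reports only ``avx``, the same function would return
``libswrAVX.so``.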