FAQ
===

Why another software rasterizer?
--------------------------------

Good question, given there are already three (swrast, softpipe,
llvmpipe) in the Mesa tree. There are two important reasons:

* Architecture - given our focus on scientific visualization, our
  workloads are very different from those of a typical game; we have
  heavy vertex load and relatively simple shaders. In addition, the
  core counts of the machines we run on are much higher. These
  parameters led to design decisions very different from llvmpipe's.

* Historical - Intel had developed a high performance software
  graphics stack for internal purposes. Later we adapted this
  graphics stack for use in visualization and decided to move forward
  with Mesa to provide a high quality API layer while at the same
  time benefiting from the excellent performance the software
  rasterizer gives us.

What's the architecture?
------------------------

SWR is a tile based immediate mode renderer with a sort-free threading
model which is arranged as a ring of queues. Each entry in the ring
represents a draw context that contains all of the draw state and work
queues. An API thread sets up each draw context and worker threads
will execute both the frontend (vertex/geometry processing) and
backend (fragment) work as required. The ring allows for backend
threads to pull work in order. Large draws are split into chunks to
allow vertex processing to happen in parallel, with the backend work
pickup preserving draw ordering.

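As a rough illustration of the ring described above, the following C++
sketch models a fixed-size ring of draw contexts in which frontend chunks
may finish in any order while draws are retired strictly in submission
order. It is single-threaded and omits all synchronization. Every name in
it (``DrawContext``, ``DrawRing``, ``enqueue``, ``complete_chunk``,
``retire_ready``) is hypothetical and not taken from the SWR sources, and
it assumes for simplicity that a draw becomes eligible for retirement only
once all of its frontend chunks are done, which the real core need not do.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified draw context. The real one holds the full
// draw state and work queues; here we only track frontend progress.
struct DrawContext {
    int draw_id = -1;
    int chunks_total = 0;  // frontend chunks the draw was split into
    int chunks_done = 0;   // frontend chunks finished so far
};

class DrawRing {
public:
    explicit DrawRing(std::size_t capacity) : slots_(capacity) {}

    // API-thread side: claim the next ring entry for a new draw.
    // Returns false when every entry is still in flight.
    bool enqueue(int draw_id, int chunks_total) {
        if (head_ - tail_ == slots_.size()) return false;
        DrawContext& dc = slots_[head_ % slots_.size()];
        dc.draw_id = draw_id;
        dc.chunks_total = chunks_total;
        dc.chunks_done = 0;
        ++head_;
        return true;
    }

    // Worker side: record one finished frontend chunk of an in-flight
    // draw. Chunks of different draws may complete in any order.
    bool complete_chunk(int draw_id) {
        for (std::size_t i = tail_; i != head_; ++i) {
            DrawContext& dc = slots_[i % slots_.size()];
            if (dc.draw_id == draw_id && dc.chunks_done < dc.chunks_total) {
                ++dc.chunks_done;
                return true;
            }
        }
        return false;
    }

    // In-order pickup: retire draws from the tail only, stopping at the
    // first draw whose frontend work is incomplete.
    std::vector<int> retire_ready() {
        std::vector<int> retired;
        while (tail_ != head_) {
            const DrawContext& dc = slots_[tail_ % slots_.size()];
            if (dc.chunks_done < dc.chunks_total) break;
            retired.push_back(dc.draw_id);
            ++tail_;
        }
        return retired;
    }

private:
    std::vector<DrawContext> slots_;
    std::size_t head_ = 0;  // next entry the API thread fills
    std::size_t tail_ = 0;  // oldest draw not yet retired
};
```

In this sketch a later draw that finishes early simply waits in the ring
until every earlier draw has been retired, which is one simple way to keep
draw ordering without sorting.
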
Our pipeline uses just-in-time compiled code for the fetch shader that
does vertex attribute gathering and array-of-structures (AOS) to
structure-of-arrays (SOA) conversions, the vertex and fragment
shaders, streamout, and fragment blending. SWR core also supports
geometry and compute shaders, but we haven't exposed them through our
driver yet. The fetch shader, streamout, and blend are built
internally to SWR core using LLVM directly, while for the vertex and
pixel shaders we reuse bits of llvmpipe from
``gallium/auxiliary/gallivm`` to build the kernels, which we wrap
differently than llvmpipe's ``auxiliary/draw`` code.

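To make the AOS to SOA step concrete, here is a minimal sketch in plain
scalar C++ of gathering vertices through an index buffer and transposing
them into one SIMD-wide batch. It is only an illustration: the fetch shader
described above is JIT-compiled with LLVM, whereas this is ordinary
compiled code, and the ``Vertex`` layout, the batch width of 8 (the number
of single-precision lanes in an AVX register), and the names
``VertexBatchSoA`` and ``fetch_batch`` are all assumptions made for the
example.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical AOS layout: one struct per vertex.
struct Vertex {
    float x, y, z, w;
};

// Assumed batch width: an AVX register holds 8 single-precision floats.
constexpr std::size_t kSimdWidth = 8;

// SOA form of one batch: each component is contiguous, so a single
// vector load could fetch that component for 8 vertices at once.
struct VertexBatchSoA {
    std::array<float, kSimdWidth> x{}, y{}, z{}, w{};
};

// Gather up to kSimdWidth vertices starting at indices[first] and
// transpose them. Lanes past the end of the index buffer stay zero.
VertexBatchSoA fetch_batch(const std::vector<Vertex>& vertices,
                           const std::vector<unsigned>& indices,
                           std::size_t first) {
    VertexBatchSoA out;
    for (std::size_t lane = 0;
         lane < kSimdWidth && first + lane < indices.size(); ++lane) {
        const Vertex& v = vertices[indices[first + lane]];
        out.x[lane] = v.x;
        out.y[lane] = v.y;
        out.z[lane] = v.z;
        out.w[lane] = v.w;
    }
    return out;
}
```

With the data laid out this way, a vertex shader can process the same
component of eight vertices per instruction instead of one vertex at a
time.
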
What's the performance?
-----------------------

For the types of high-geometry workloads we're interested in, we are
significantly faster than llvmpipe. This is to be expected, as
llvmpipe only threads the fragment processing and not the geometry
frontend. The performance advantage over llvmpipe scales roughly
linearly with the number of cores available.

While our current performance is quite good, we know there is more
potential in this architecture. When we switched from a prototype
OpenGL driver to Mesa we regressed performance severely, some due to
interface issues that need tuning, some due to differences in shader
code generation, and some due to conformance and feature additions to
the core SWR. We are looking to recover most of this performance.

What's the conformance?
-----------------------

The major applications we are targeting are all based on the
Visualization Toolkit (VTK), and as such our development efforts have
been focused on making sure these work as well as possible. Our
current code passes 99% of VTK's rendering tests with their new
"OpenGL2" (really OpenGL 3.2) backend.

Piglit testing shows a much lower pass rate, roughly 80% at the time
of writing. Core SWR undergoes rigorous unit testing and we are quite
confident in the rasterizer, and understand the areas where it
currently has issues (example: line rendering is done with triangles,
so it doesn't match the strict line rendering rules). The majority of
the piglit failures are errors in our driver layer interfacing Mesa
and SWR. Fixing these issues is one of our major future development
goals.

Why are you open sourcing this?
-------------------------------

* Our customers prefer open source, and allowing them to simply
  download the Mesa source and enable our driver makes life much
  easier for them.

* The internal gallium APIs are not stable, so we'd like our driver
  to be visible for changes.

* It's easier to work with the Mesa community when the source we're
  working with can be used as reference.

What are your development plans?
--------------------------------

* Performance - see the performance section above for details.

* Conformance - see the conformance section above for details.

* Features - core SWR has a lot of functionality we have yet to
  expose through our driver, such as MSAA, geometry shaders, compute
  shaders, and tessellation.

* AVX512 support

What is the licensing of the code?
----------------------------------

* All code is under the normal Mesa MIT license.

Will this work on AMD?
----------------------

* If using an AMD processor with AVX or AVX2, it should work, though
  we don't have that hardware around to test. Patches would be
  welcome if needed.

Will this work on ARM, MIPS, POWER, <other non-x86 architecture>?
-------------------------------------------------------------------------

* Not without a lot of work. We make extensive use of AVX and AVX2
  intrinsics in our code and the in-tree JIT creation. It is not the
  intention for this codebase to support non-x86 architectures.

What hardware do I need?
------------------------

* Any x86 processor with at least AVX (introduced in the Intel
  Sandy Bridge and AMD Bulldozer microarchitectures in 2011) will
  work.

* You don't need a fire-breathing Xeon machine to work on SWR - we do
  day-to-day development with laptops and desktop CPUs.

Does one build work on both AVX and AVX2?
-----------------------------------------

Yes. The build system creates two shared libraries, ``libswrAVX.so`` and
``libswrAVX2.so``, and ``swr_create_screen()`` loads the appropriate one at
runtime.

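The runtime choice can be pictured with the following sketch. It is not
the code in ``swr_create_screen()``; the helper names
``pick_swr_library`` and ``pick_swr_library_for_host`` are hypothetical,
and the CPU query assumes a GCC- or Clang-compatible compiler on x86,
where the ``__builtin_cpu_supports`` builtin is available. Only the two
library names come from the answer above.

```cpp
#include <string>

// Hypothetical helper: map CPU feature flags to the library a loader
// would pick. An empty string means the CPU lacks even AVX.
std::string pick_swr_library(bool has_avx, bool has_avx2) {
    if (has_avx2) return "libswrAVX2.so";  // prefer the wider feature set
    if (has_avx) return "libswrAVX.so";
    return "";
}

// Hypothetical helper: query the host CPU, then defer to the pure
// function above. Falls back to "" where the builtin is unavailable.
std::string pick_swr_library_for_host() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return pick_swr_library(__builtin_cpu_supports("avx") != 0,
                            __builtin_cpu_supports("avx2") != 0);
#else
    return "";
#endif
}
```

Keeping the decision in a pure function makes it easy to test on any
machine; a real loader would then pass the returned name to something
like ``dlopen``.
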