<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
	<channel>
		<title>Nathan Gauer</title>
		<description>Personal blog where I share some work and some thoughts.</description>
		<link>https://www.studiopixl.com</link>
		<atom:link href="https://www.studiopixl.com/feeds/rss" rel="self" type="application/rss+xml" />
		<lastBuildDate>Sat, 16 Nov 2024 00:00:00 +0000</lastBuildDate>
		
			<item>
				<title>Vulkan-ize Virglrenderer - experiment</title>
				<description>&lt;p&gt;&lt;a href=&quot;https://virgil3d.github.io/&quot;&gt;Virglrenderer&lt;/a&gt; provides OpenGL acceleration to a guest running on QEMU.&lt;/p&gt;

&lt;p&gt;My current GSoC project is to add support for the Vulkan API.&lt;/p&gt;

&lt;p&gt;Vulkan is drastically different from OpenGL, so adding it is not straightforward.
My current idea is to add an alternative path for Vulkan.
Currently, two different states are kept, one for OpenGL, and one for Vulkan.
Commands will either go to the OpenGL or Vulkan front-end.&lt;/p&gt;

&lt;p&gt;For now, only compute shaders are supported.
The work is divided into two parts: a Vulkan ICD in Mesa, and a new front-end for Virgl and
vtest.&lt;/p&gt;

&lt;p&gt;If you have any feedback, do not hesitate to share it!&lt;/p&gt;

&lt;p&gt;This experiment can be tested using this &lt;a href=&quot;https://github.com/Keenuts/vulkan-virgl&quot;&gt;repository&lt;/a&gt;.
If you are using an Intel driver, you might be able to use the provided Dockerfile.&lt;/p&gt;

&lt;p&gt;Each part is also available independently:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/Keenuts/mesa&quot;&gt;MESA&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/Keenuts/virglrenderer&quot;&gt;VirglRenderer&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/Keenuts/vulkan-compute&quot;&gt;Vulkan compute sample&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
				<link>https://www.studiopixl.com/2018-07-24/vulkan-ize-virgl-experiment</link>
				<guid isPermaLink="true">https://www.studiopixl.com/2018-07-24/vulkan-ize-virgl-experiment</guid>
				<pubDate>Tue, 24 Jul 2018 00:00:00 +0000</pubDate>
				
					<category>libvirt</category>
				
					<category>gsoc</category>
				
					<category>graphics</category>
				
					<category>lse</category>
				
			</item>
		
			<item>
				<title>GSoC 2017 - 3D acceleration using VirtIOGPU</title>
				<description>&lt;p&gt;GSoC 2017 started several months ago. Among all the available projects, one caught my attention:&lt;/p&gt;

&lt;center&gt; Add OpenGL support on a Windows guest using VirGL &lt;/center&gt;

&lt;hr /&gt;

&lt;p&gt;In a VM, we have two methods to access real hardware: passthrough, and virtualization extensions (Intel VT-x, AMD-V, etc.).
When it comes to GPUs, the possibilities drop to one: passthrough.
Intel has a virtualization extension (GVT), but we want to support every device.
Thus, we need to fall back to a software-based method.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Emulation? Since we want 3D acceleration, better forget it.&lt;/li&gt;
  &lt;li&gt;API forwarding? This means we need the same OpenGL API on both guest and host, so no again.&lt;/li&gt;
  &lt;li&gt;Paravirtualization? Yes!&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a couple of years now, VirtIO devices have been a good standard on QEMU.
Then, Dave Airlie started to work on &lt;a href=&quot;https://virgil3d.github.io/&quot;&gt;VirGL&lt;/a&gt; and a VirtIO-gpu.
Together, they provide a decent virtual GPU which relies on the host graphics stack.&lt;/p&gt;

&lt;p&gt;This article will present VirtIO devices, and what kind of operations a guest can do using VirGL.&lt;/p&gt;

&lt;p&gt;I also invite you to read a &lt;a href=&quot;/2017-05-13/linux-graphic-stack-an-overview&quot;&gt;previous article I wrote about Linux’s graphics stack&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;virtio-devices&quot;&gt;VirtIO devices&lt;/h2&gt;

&lt;p&gt;Since we will use a VirtIO-based device, let’s see how it works.
First, these devices behave as regular PCI devices. We have a config space, some dedicated memory, and interrupts.
Second, and very importantly, VirtIO devices communicate through ring buffers used as FIFO queues.
The device is entirely emulated in QEMU, and can perform DMA transfers by sharing pages between the guest and the host.&lt;/p&gt;

&lt;h3 id=&quot;communication-queues&quot;&gt;Communication queues&lt;/h3&gt;

&lt;p&gt;On our v-gpu, we have two queues: one dedicated to the hardware cursor, and another for everything else.
Sending a command through a queue goes like this:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;guest: allocate pages in guest memory&lt;/li&gt;
  &lt;li&gt;guest: write a header and pointers to our physical pages (guest POV) into the ring buffer&lt;/li&gt;
  &lt;li&gt;guest: send an interrupt&lt;/li&gt;
  &lt;li&gt;VMExit&lt;/li&gt;
  &lt;li&gt;host: QEMU reads our header and pointers, and translates the addresses to its local virtual address range&lt;/li&gt;
  &lt;li&gt;host: read the command, execute it&lt;/li&gt;
  &lt;li&gt;host: write the answer back to the ring buffer&lt;/li&gt;
  &lt;li&gt;host: send an interrupt&lt;/li&gt;
  &lt;li&gt;guest: handle the interrupt, read the ring buffer and handle the answer&lt;/li&gt;
&lt;/ul&gt;
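
&lt;p&gt;The guest and host halves of this exchange can be pictured with a toy FIFO ring. This is only a minimal sketch: the ring structure, its field names and RING_SIZE are made up for illustration and are not the real virtqueue layout, and the interrupts and the VMExit are left out. The guest pushes the guest-physical address and length of a buffer, and the host pops entries in the same order.&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#include &lt;stdint.h&gt;

#define RING_SIZE 8u /* hypothetical size, must be a power of two */

/* One queued buffer: where it lives (guest POV) and how long it is. */
struct ring_entry {
    uint64_t guest_pa;
    uint32_t len;
};

/* Illustrative ring, NOT the real virtqueue layout. */
struct ring {
    struct ring_entry entries[RING_SIZE];
    uint32_t head; /* next slot the guest writes */
    uint32_t tail; /* next slot the host reads */
};

/* Guest side: queue a buffer. Returns 1 on success, 0 if the ring is full. */
static int ring_push(struct ring *r, uint64_t guest_pa, uint32_t len)
{
    if (r-&gt;head - r-&gt;tail == RING_SIZE)
        return 0;
    r-&gt;entries[r-&gt;head % RING_SIZE] = (struct ring_entry){ guest_pa, len };
    r-&gt;head++;
    return 1;
}

/* Host side: take the oldest buffer. Returns 1 on success, 0 if empty. */
static int ring_pop(struct ring *r, struct ring_entry *out)
{
    if (r-&gt;head == r-&gt;tail)
        return 0;
    *out = r-&gt;entries[r-&gt;tail % RING_SIZE];
    r-&gt;tail++;
    return 1;
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In the real device, the host would translate guest_pa to a local virtual address before reading the command stored there.&lt;/p&gt;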

&lt;p&gt;&lt;img src=&quot;/assets/posts/2017-virtio-communication.webp&quot; alt=&quot;virtio_device_communication&quot; /&gt;&lt;/p&gt;

&lt;h3 id=&quot;virgl&quot;&gt;VirGL&lt;/h3&gt;

&lt;p&gt;VirGL can be summed up as a simple state-machine, keeping track of resources, and translating command buffers to a sequence of OpenGL calls.
It exposes two kinds of commands: let’s say 2D and 3D.&lt;/p&gt;

&lt;p&gt;2D commands are mainly focused on resource management. We can allocate memory on the host by creating a 2D resource, then set up a DMA transfer by linking this resource’s memory areas to the guest’s physical pages.
To ease resource management between applications on the guest, VirGL also adds a simple context feature. Resource creation is global, but to use a resource, you must attach it to the context.&lt;/p&gt;

&lt;p&gt;Then, 3D commands. These are close to what we can find in an API like Vulkan. We can set up a viewport and scissor state, create a VBO, and draw it.
Shaders are also supported, but we first need to translate them to TGSI, an assembly-like representation. Once on the host, they will be re-translated to GLSL and sent to OpenGL.&lt;/p&gt;

&lt;p&gt;You can find a part of the spec in this &lt;a href=&quot;https://github.com/Keenuts/virtio-gpu-documentation&quot;&gt;repository&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;opengl-on-windows&quot;&gt;OpenGL on Windows&lt;/h2&gt;

&lt;p&gt;The Windows graphics stack can be decomposed as follows:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/posts/2017-windows-stack.webp&quot; alt=&quot;windows graphic stack&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The interesting parts are:&lt;/p&gt;

&lt;p&gt;OpenGL ICD (Installable client driver):&lt;/p&gt;
&lt;blockquote&gt;
  &lt;p&gt;This is our OpenGL implementation -&amp;gt; the state machine, which can speak to our kernel driver.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;GDI.dll:&lt;/p&gt;
&lt;blockquote&gt;
  &lt;p&gt;A simple syscall wrapper for us.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;D3D Subsystem:&lt;/p&gt;
&lt;blockquote&gt;
  &lt;p&gt;First part of the kernel graphics stack. It exposes a 3D and a 2D API. Since we are not a licensed developer, let’s try to avoid this.
From the documentation, we have some functions to bypass it: &lt;a href=&quot;https://msdn.microsoft.com/en-us/library/windows/hardware/ff559653(v=vs.85).aspx&quot;&gt;DxgkDdiEscape&lt;/a&gt; is one.
This function takes a buffer and a size, and passes them through this subsystem, directly to the underlying driver.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;DOD (Display Only Driver):&lt;/p&gt;
&lt;blockquote&gt;
  &lt;p&gt;Our kernel driver. This part will have to communicate with both the kernel/ICD and the VirtIO-gpu.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;opengl-state-tracker&quot;&gt;OpenGL State-Tracker&lt;/h2&gt;

&lt;p&gt;OpenGL relies on a state machine that we have to implement. Let’s start by drawing on the frame-buffer.&lt;/p&gt;

&lt;p&gt;When we start a new application, we want to isolate it from the rest, so we begin by creating a VirGL context.
Then we create a 2D resource (800x600 RGBA seems fine), and attach it to our VGL-context.&lt;/p&gt;

&lt;p&gt;We might want to draw something now. We have two options, either use the 3D command INLINE_WRITE, or DMA.
Using INLINE_WRITE means sending all our pixels through a VirtIO queue, so let’s use DMA!&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;We start by allocating our memory pages on the guest.&lt;/li&gt;
  &lt;li&gt;Then, we send their physical addresses (guest POV) to VirGL.&lt;/li&gt;
  &lt;li&gt;VirGL will translate the physical addresses to local virtual addresses, and link these pages to our resource.&lt;/li&gt;
  &lt;li&gt;Back to the guest, we can write our pixels to the frame-buffer.&lt;/li&gt;
  &lt;li&gt;To notify the V-gpu, we use the TRANSFER_TO_HOST_2D command, which tells QEMU to sync resources.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now, let’s draw some pixels on this frame-buffer.
We will need to:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;create an OpenGL context&lt;/li&gt;
  &lt;li&gt;set up our viewport and scissor settings (i.e. screen limits)&lt;/li&gt;
  &lt;li&gt;create a VBO&lt;/li&gt;
  &lt;li&gt;link the VBO to a vertex/normals/color buffer&lt;/li&gt;
  &lt;li&gt;create vertex and fragment shaders&lt;/li&gt;
  &lt;li&gt;set up a rasterizer&lt;/li&gt;
  &lt;li&gt;set up the frame-buffer to use the one we created earlier&lt;/li&gt;
  &lt;li&gt;create a constant buffer&lt;/li&gt;
  &lt;li&gt;send the draw call&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A 3D command is a sequence of UINT32 values. The first one is used as a header, followed by N arguments.
A command buffer can contain several commands packed together in one big UINT32 array.&lt;/p&gt;
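
&lt;p&gt;As a rough sketch, packing such a buffer could look like the code below. The header layout (command id in bits 0-7, object type in bits 8-15, argument count in bits 16-31) is an assumption made for this example, so check the VirGL reference before relying on it. The helper names cmd_header and cmd_push are hypothetical.&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#include &lt;stddef.h&gt;
#include &lt;stdint.h&gt;

/* Assumed header layout (not stated in this article):
 * bits 0-7 command id, bits 8-15 object type, bits 16-31 argument count. */
static uint32_t cmd_header(uint32_t cmd, uint32_t obj, uint32_t argc)
{
    return (cmd &amp; 0xffu) | ((obj &amp; 0xffu) &lt;&lt; 8) | ((argc &amp; 0xffffu) &lt;&lt; 16);
}

/* Append one command (header + argc arguments) to buf, which already
 * holds `used` words out of `cap`. Returns the new number of used words,
 * or 0 if the command does not fit. */
static size_t cmd_push(uint32_t *buf, size_t cap, size_t used,
                       uint32_t cmd, uint32_t obj,
                       const uint32_t *args, uint32_t argc)
{
    if (argc &gt; 0xffffu || used &gt; cap || cap - used &lt; (size_t)argc + 1)
        return 0;
    buf[used++] = cmd_header(cmd, obj, argc);
    for (uint32_t i = 0; i &lt; argc; i++)
        buf[used++] = args[i];
    return used;
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Calling cmd_push repeatedly on the same array, feeding each returned count back in as the next used value, stacks several commands one after another.&lt;/p&gt;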

&lt;p&gt;Earlier, we created resources in VGL-contexts. Now we will need 3D objects.
These are created by sending 3D commands, and are not shared between VGL-contexts.
Once created, we have to bind them to the current OpenGL context.&lt;/p&gt;

&lt;p&gt;Now, if everything goes well, we should be able to display something like this:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/posts/2017-opengl-virtio.webp&quot; alt=&quot;opengl in windows with qemu&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Once more, explaining all the commands would be uninteresting, but there is a spec for that!&lt;/p&gt;

&lt;p&gt;If you are still interested, here are a couple of links:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://gist.github.com/Keenuts/199184f9a6d7a68d9a62cf0011147c0b&quot;&gt;GIST to present the project&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://gitlab.com/spice/virtio-gpu-wddm/virtio-gpu-wddm-dod&quot;&gt;DOD Driver&lt;/a&gt;: The kernel driver needed on the Windows guest to communicate with the VirtIO-gpu&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/Keenuts/virtio-gpu-win-icd&quot;&gt;ICD Driver&lt;/a&gt;: opengl32.dll, the userland driver including a basic state-tracker&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/Keenuts/virtio-gpu-documentation&quot;&gt;VirGL Reference&lt;/a&gt; : partial reference of VirGL 2D and 3D commands&lt;/li&gt;
&lt;/ul&gt;
</description>
				<link>https://www.studiopixl.com/2017-08-27/3d-acceleration-using-virtio</link>
				<guid isPermaLink="true">https://www.studiopixl.com/2017-08-27/3d-acceleration-using-virtio</guid>
				<pubDate>Sun, 27 Aug 2017 00:00:00 +0000</pubDate>
				
					<category>libvirt</category>
				
					<category>gsoc</category>
				
					<category>lse</category>
				
			</item>
		
			<item>
				<title>GSoC 2017 - API Forwarding</title>
				<description>&lt;p&gt;Writing an ICD is a problem in itself. Add to this the Windows kernel interfaces, VirtIO queue management, resource transfers between host and guest, and BOOM, you are lost.
This brings us to our first step: something inefficient but simpler, API forwarding.&lt;/p&gt;

&lt;h2 id=&quot;tasks&quot;&gt;Tasks&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Hook OpenGL calls&lt;/li&gt;
  &lt;li&gt;Serialize function calls&lt;/li&gt;
  &lt;li&gt;Send them to the miniport-driver, then the host&lt;/li&gt;
  &lt;li&gt;De-serialize calls and execute them on the host.&lt;/li&gt;
  &lt;li&gt;Send some data back to the guest&lt;/li&gt;
&lt;/ul&gt;
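
&lt;p&gt;To give an idea of the serialize and de-serialize steps, here is a minimal sketch for a single call such as glClearColor. The wire format (a 32-bit function id followed by the raw arguments), the FN_CLEAR_COLOR id and both helper names are hypothetical, chosen only for this example. It also assumes guest and host share the same endianness and float representation.&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#include &lt;stddef.h&gt;
#include &lt;stdint.h&gt;
#include &lt;string.h&gt;

/* Hypothetical function id, not taken from any real protocol. */
enum { FN_CLEAR_COLOR = 1 };

/* Hypothetical wire format: function id, then the four float arguments. */
struct clear_color_call {
    uint32_t fn;
    float rgba[4];
};

/* Guest side: write the call into out. Returns the number of bytes
 * written, or 0 if the buffer is too small. */
static size_t serialize_clear_color(uint8_t *out, size_t cap,
                                    float r, float g, float b, float a)
{
    struct clear_color_call c = { FN_CLEAR_COLOR, { r, g, b, a } };
    if (cap &lt; sizeof c)
        return 0;
    memcpy(out, &amp;c, sizeof c);
    return sizeof c;
}

/* Host side: read the call back. Returns 1 on success, 0 on a short
 * buffer or an unexpected function id. */
static int deserialize_clear_color(const uint8_t *in, size_t len, float rgba[4])
{
    struct clear_color_call c;
    if (len &lt; sizeof c)
        return 0;
    memcpy(&amp;c, in, sizeof c);
    if (c.fn != FN_CLEAR_COLOR)
        return 0;
    memcpy(rgba, c.rgba, sizeof c.rgba);
    return 1;
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;On the host, the decoded values would then be passed to the real OpenGL function.&lt;/p&gt;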

&lt;h2 id=&quot;realization&quot;&gt;Realization&lt;/h2&gt;

&lt;h3 id=&quot;icd&quot;&gt;ICD&lt;/h3&gt;
&lt;p&gt;The ICD part (Userland) is pretty straightforward. Make your own opengl32.dll, serialize the calls. Now find a sweet function in gdi32.dll to throw your mumbo-jumbo on the kernel side.
Fortunately, we have this:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;NTSTATUS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;APIENTRY&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;DxgkDdiEscape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;_In_&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HANDLE&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;hAdapter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;_In_&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DXGKARG_ESCAPE&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pEscape&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;A beautiful function available in both DOD and full display drivers. It takes a pointer to a userland buffer, and sends it to our display driver.
Wait… userland buffer, no check, kernel part? Mmmmm… &lt;a href=&quot;https://googleprojectzero.blogspot.fr/2017/02/attacking-windows-nvidia-driver.html&quot;&gt;What could go wrong?&lt;/a&gt;&lt;/p&gt;

&lt;h3 id=&quot;kernel-part&quot;&gt;Kernel part&lt;/h3&gt;

&lt;p&gt;To initialize a display driver, you must call a function: &lt;a href=&quot;https://msdn.microsoft.com/en-us/library/windows/hardware/ff560824(v=vs.85).aspx&quot;&gt;DxgkInitialize&lt;/a&gt;.
This function takes a big structure containing function pointers to your driver.
For a display only driver, you will have a reduced set of functions to implement. And for a full featured driver, well…&lt;/p&gt;

&lt;p&gt;Anyway, now the game is to run the driver, and see where we crash. Sadly, we cannot just add some functions to the working DOD code base and hope it runs. Windows wants something more, and the game is to find out what. Yay!
Since we have a working DOD driver, let’s find out how we could trick it.&lt;/p&gt;

&lt;h3 id=&quot;icd--kernel-communication&quot;&gt;ICD &amp;lt;=&amp;gt; Kernel communication&lt;/h3&gt;

&lt;p&gt;We can register two types of driver: a DOD driver using &lt;strong&gt;DxgkInitializeDisplayOnlyDriver&lt;/strong&gt;, and a full display driver using &lt;strong&gt;DxgkInitialize&lt;/strong&gt;.
Windows will then know which kind of features each driver can support (fine tuning is done using query callbacks).
Both drivers can implement &lt;strong&gt;DxgkDdiEscape&lt;/strong&gt;. Great, we will fool Windows and use this DOD as a fully featured 3D driver! &lt;strong&gt;WRONG!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Setting up the ICD part and sending everything through our escape function? Check. But the return values seem off.
After investigating, and trying every function that takes a userland buffer, I came to a conclusion: the OpenGL ICD part cannot communicate with a DOD driver. Windows knows we are display only, and falls back to its own driver for our ICD calls.&lt;/p&gt;

&lt;p&gt;So now, what’s the plan?
Let’s put this problem aside, and try to focus on the real part: create proper commands for the host.&lt;/p&gt;

</description>
				<link>https://www.studiopixl.com/2017-05-13/gsoc-log2-api-forwarding</link>
				<guid isPermaLink="true">https://www.studiopixl.com/2017-05-13/gsoc-log2-api-forwarding</guid>
				<pubDate>Sat, 13 May 2017 00:00:00 +0000</pubDate>
				
					<category>libvirt</category>
				
					<category>gsoc</category>
				
					<category>lse</category>
				
			</item>
		
			<item>
				<title>GSoC 2017 - Project presentation</title>
				<description>&lt;p&gt;On my arrival at the lab, I started a &lt;em&gt;little&lt;/em&gt; project: working on a display only driver for Windows. A good way to start learning what is hidden under the hood of an OpenGL application.
&lt;a href=&quot;https://developers.google.com/open-source/gsoc/&quot;&gt;Google Summer of Code 2017&lt;/a&gt; arrived, and the subjects were published. Among these, QEMU’s ‘Windows Virgl driver’.
Great! Let’s apply!&lt;/p&gt;

&lt;p&gt;Applications closed in early April. I took a look at the existing DOD driver &lt;a href=&quot;https://github.com/vrozenfe/virtio-gpu-win&quot;&gt;(unofficial repo)&lt;/a&gt;
and also decided to learn a bit more about &lt;a href=&quot;/2017-05-05/first-steps-with-vulkan&quot;&gt;Vulkan&lt;/a&gt;.
The results came in, and I was selected. Excellent!&lt;/p&gt;

&lt;h2 id=&quot;mission&quot;&gt;Mission&lt;/h2&gt;

&lt;p&gt;The idea is to bring 3D acceleration to Windows guests running on QEMU, using VirtIO devices and &lt;a href=&quot;https://virgil3d.github.io&quot;&gt;Virgl3d&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;context&quot;&gt;Context&lt;/h2&gt;

&lt;p&gt;&lt;img src=&quot;/assets/posts/2017-windows-stack.webp&quot; alt=&quot;windows stack&quot; /&gt;&lt;/p&gt;

&lt;p&gt;On this stack, we can work on three parts: opengl32.dll, ICD and Miniport driver.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;OpenGL32.dll is just a dynamic library used to communicate with our runtime driver.&lt;/li&gt;
  &lt;li&gt;ICD: this is the OpenGL implementation. This part is the equivalent of Mesa on Linux.&lt;/li&gt;
  &lt;li&gt;Miniport-driver: this is the kernel driver. Hardware specific, we are going to do our hypercalls here.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;problems&quot;&gt;Problems&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Windows is not open source. We have some basic ideas about how the D3DKrnl subsystem behaves, but nothing is certain.&lt;/li&gt;
  &lt;li&gt;Developing a complete OpenGL state tracker is a lot of work.&lt;/li&gt;
  &lt;li&gt;Virgl3D takes some calls and shader bytecode, re-translates the bytecode to GLSL, and calls OpenGL again. This means we will do the same work twice: once on the host, once on the guest.&lt;/li&gt;
&lt;/ul&gt;
</description>
				<link>https://www.studiopixl.com/2017-05-11/gsoc-log1-project-presentation</link>
				<guid isPermaLink="true">https://www.studiopixl.com/2017-05-11/gsoc-log1-project-presentation</guid>
				<pubDate>Thu, 11 May 2017 00:00:00 +0000</pubDate>
				
					<category>libvirt</category>
				
					<category>gsoc</category>
				
			</item>
		
	</channel>
</rss>
