include — All-Purpose Mat's Blog

PythonPlusPlus: Bridging Worlds with Polyglot Code

Sun, 28 Jan 2024 22:14:49 +0000

Picture this: you find yourself immersed in a new job, knee-deep in a C++ codebase, yet your heart yearns for the simplicity and elegance of Python syntax. What do you do? You don't just conform – you innovate, and boldly submit this to code review:

#include "pythonstart.h"

def greet(name):
    print("hello, " + name + "!")
    return

def greet2(name):
    print("how are you, " + name + "?")
    return

def bye():
    print("ok bye!")
    return

#include "pythonmid.h"

username = "Mat"

print("Hello from \"Python\"!")

greet(username)
greet2(username)
print("getting ready to say bye...")
bye()

#include "pythonend.h"

The first code review comes in, and it seems your contribution may be in jeopardy:

That's just Python code! It won't work with the rest of our C++ codebase!

Before they can reject your code, you sharply interject:

Hey! Did you even test my code?

With skepticism in the air, one of your brave teammates steps up and runs the code through a C++ compiler. To everyone's amazement, the result is identical to that of running it with Python! The code not only speaks Python, but fluently converses in C++:

$ g++ Python.cpp && ./a.out
Hello from "Python"!
hello, Mat!
how are you, Mat?
getting ready to say bye...
ok bye!

The commit is eventually merged, and your unconventional approach not only saves your job but earns you a place in the annals of the team's most memorable code submissions. You are also banned from touching that codebase again.

How it works: unraveling the enigma

This kind of program is called a “polyglot,” which literally means “written in multiple languages.” The entire idea of writing one of these relies on finding intersections between the syntax of the two (or more!) languages. In Python, a # signifies a comment, while in C++ (and C), it denotes a preprocessor directive. These lines are the key to the program. We'll see how the preprocessor works (and how we can abuse this!) in a little bit.

You'll notice the only preprocessor directives in our code are #include statements. Unlike other languages, C and C++ opt for a simple but effective solution to calling external library functions: copy-paste. Seriously.

When I write #include <iostream> at the top of a C++ file, what actually happens is that the entire file called “iostream” (installed system-wide as part of the C++ standard library) gets pasted in by the preprocessor, residing now where that #include statement once was. You don't technically have to use the #include directive to get a C++ program calling library functions: you can get the same behavior by just copying the file's contents manually at the top of your code (but that's a terrible idea!).

For example, here are two C++ header files:

preamble.h:

int main()
{
    int retVal = 0;

postlude.h:

}

And our beautifully readable code:

#include "preamble.h"

for(int i = 0; i < 4; i++)
{
    retVal += 1;
}
return retVal;

#include "postlude.h"

Let's run our code through the standalone C/C++ Preprocessor cpp:

$ cpp code.cpp
# 1 "code.cpp"
# 1 "preamble.h" 1
int main()
{
    int retVal = 0;
# 2 "code.cpp" 2

for(int i = 0; i < 4; i++)
{
    retVal += 1;
}
return retVal;

# 1 "postlude.h" 1
}
# 10 "code.cpp" 2

You can see the preprocessor outputs a bunch of lines starting with #. These are a kind of comment meant for us puny humans to understand exactly what the preprocessor did. The first number indicates the line number, and the string in quotes is the filename. The optional number at the end of the line represents a flag, where 1 means it's the start of an include, and 2 means we are returning to a file after an include is done. You can find the full docs here. You can see we start at line 1 in code.cpp, which then includes preamble.h. The contents of preamble.h follow, and afterwards we return to code.cpp. So on and so forth, finally copy-pasting together an amalgamated program that consists of a simple main function that returns 4.

The preprocessor is a very powerful tool, and as long as the final text that is passed to the compiler is valid, anything goes!

Let's break down the polyglot program from the start of the post:

Functions

In Python, functions are defined as follows:

def greet(name):
    print("hello, " + name + "!")

Somehow, we need to translate this into working C++ just through the preprocessor. Because Python allows declaring functions anywhere, but C++ does not, we can use a function pointer instead. C++ has a neat trick here called a lambda, which allows us to define unnamed functions inline, and it's perfect to have our pointer point to.

Armed with this knowledge, we can use #define to create a macro that will turn def into auto (a special C++ keyword that deduces the type of a variable based on what's assigned to it), and another macro that turns greet(name) into our lambda definition:

#define def auto 
#define greet(arg) greet = [](std::string arg) {

Applying this to our Python function from above gets us some of the way there:

auto greet = [](std::string name) {:
    print("hello, " + name + "!")

We still have to handle that pesky : that Python requires at the end of function declarations. Now, where does C++ have a :... aha! The revered ternary operator, that everybody totally loves! Its syntax is as follows: condition ? truthy : falsy. We don't care about the logic here, we just want that sweet : character, so we can add the most cursed ternary expression I've ever written to the end of the greet macro:

#define greet(arg) greet = [](std::string arg) { false?false

Running the preprocessor through our function, we get the following:

auto greet = [](std::string name) { false?false:
    print("hello, " + name + "!")

That's some good progress! There are three main issues left:

– the ternary operator is left hanging there. We need a “falsy” value for this thrilling and definitely-very-useful expression to compile.
– there's no print function in C++ (this project was conceived before std::print was added in C++23).
– we need to close that dangling curly bracket and add a semicolon at the end of our lambda, somehow.

Implementing the print function can be done with a simple function-style macro that just plops the argument into std::cout. This only works for simple prints, but I'm not going for anything more here :)

Additionally, we can knock the unfinished ternary issue out by adding a stray false; at the beginning of the print macro's expansion. Usually this does nothing, as the value just gets discarded, but in the case that a print occurs right after a function definition, it will complete the ternary operator. Hooray!

#define print(a) false;std::cout << (a) << std::endl;

Now for closing the function... there are no keywords left we can use here. I haven't found a way to make this work consistently without polluting the print macro with closing brackets that would cause it to break if used more than once or outside of a function. Thankfully, Python has a return keyword we can add without changing the behavior of the function:

def greet(name):
    print("hello, " + name + "!")
    return

Then on the C++ side, we can redefine it to close our lambda!

#define return return; };

Finally, our simple function now preprocesses to this valid albeit cursed C++ code:

auto greet = [](std::string name) { false?false:
    false;std::cout << ("hello, " + name + "!") << std::endl;
    return; };

“int main”

Here's the next bit we have to tackle, after the function definitions:

username = "Mat"

print("Hello from \"Python\"!")

greet(username)
greet2(username)
print("getting ready to say bye...")
bye()

This code was given in one of my university courses to showcase the basics of Python. In it, we create a variable, call a couple functions, and call it a day.

Python allows writing code willy-nilly outside of any function, but in C++ this is not exactly the case, especially if we need to call library functions. Our print statements must reside inside the main function. We can have our initial header (the one with all the function macros) also start the main function by adding a lone int main() { at the end of it. We also need a header at the end with the sole purpose of closing that opening bracket:

pythonstart.h:

#include <iostream>
#include <string>

#define def auto
#define greet(arg) greet = [](std::string arg) { false?false
// (greet2 and bye get analogous macros)
#define print(a) false;std::cout << (a) << std::endl;
#define return return; };

// start the main function (will be closed by pythonend.h)
int main() {

pythonend.h (thrilling):

}

The code

Looking at the first lines of the actual code, a lot of stuff is missing for it to work in C++:

username = "Mat"
print("Hello from \"Python\"!")

The first obvious issue is that C++ requires types, while Python does not. We will need a pythonmid.h header to plop a std::string in there and so tell username its type:

pythonmid.h:

#define username std::string username

Then, oh no! Our print macro-function inserts a stray false right after my string literal, causing a compile error! We must redefine print to remove the false prefix, but keep the lone semicolon as it can serve to punctuate the username declaration:

#undef print
#define print(a) ;std::cout << (a) << std::endl;

Finally, the function calls:

greet(username)
greet2(username)
print("getting ready to say bye...")
bye()

In short, every function must be redefined to expand into a call rather than a declaration, like so:

#undef greet
#define greet(name) greet(name);

That's it!

And there we go! Here's the full “Python” file from the start of this post, put through the C++ preprocessor:

// -snip- the entire contents of the <iostream> and <string> C++ headers
# 2 "pythonstart.h" 2
# 15 "pythonstart.h"

# 15 "pythonstart.h"
int main() {
# 4 "Python.cpp" 2


auto greet = [](std::string name) { false?false:
    false;std::cout << ("hello, " + name + "!") << std::endl;
    return; };

auto greet2 = [](std::string name) { false?false:
    false;std::cout << ("how are you, " + name + "?") << std::endl;
    return; };

auto bye = []() { false?false:
    false;std::cout << ("ok bye!") << std::endl;
    return; };

# 1 "pythonmid.h" 1
# 25 "pythonmid.h"
std::string
# 20 "Python.cpp" 2

username = "Mat"

;std::cout << ("Hello from \"Python\"!") << std::endl;

greet(username);
greet2(username);
;std::cout << ("getting ready to say bye...") << std::endl;
bye();

# 1 "pythonend.h" 1
}
# 31 "Python.cpp" 2

You can find the full sources on my Gitea.

I hope this was a fun introduction to polyglot programming! It's usually filled with crazy hacks like these, and thus can be very fun whilst being immensely impractical, but believe me: it has its uses!

While researching for a different project in 2020, I came across this perfect example: Cosmopolitan is a project that allows C programs to build to an “actually portable executable”: a file that runs simultaneously on Linux, macOS, Windows, FreeBSD, OpenBSD, NetBSD, and can also directly boot from the BIOS. I recommend Justine's blog post for a fascinating read!

Thanks for reading! Feel free to contact me if you have any suggestions or comments. Find me on Mastodon and Matrix.

You can follow the blog through: – ActivityPub by inputting @mat@blog.allpurposem.at – RSS/Atom: Copy this link into your reader: https://blog.allpurposem.at

My website: https://allpurposem.at

The vector::reserve fallacy

Fri, 27 Oct 2023 21:53:41 +0000

While reading through some code I wrote for a raytracing assignment, I noticed a peculiar function that had never caused any issues, but really looked like it should. After asking a bunch of people, I present this blog post to you!

Ah, C++ standard containers. So delightfully intuitive to work with. The most versatile has to be std::vector, whose job is to wrap a dynamic “C-style” array and manage its capacity for us as we grow and shrink the vector's size. We can simply call push_back on the vector to add as many elements as we want, and the vector will grow its capacity when needed to fit our new elements.

If you understand how a std::vector works, feel free to skip to the code.

But is it that simple?

Resizing the vector's internal array is not cheap! It incurs allocating a whole new (bigger) block of memory, copying all the elements to it, and finally freeing the old block (note that this copy may be a move, see here). Because we add elements one by one, this would trigger a lot of resizes, as the vector keeps having to guess how many elements we plan to add and reallocating a bigger and bigger internal array every time we push_back past its capacity! So, a conforming std::vector implementation will usually try to get ahead of us and secretly allocate a bigger block when it sees we start pushing to it, and then it can just keep track of the size of the vector (how many elements we've pushed to it) separately from its capacity (how many elements it can grow to before it needs to resize the internal array again).

std::vector kindly exposes this internal functionality to us through some functions. For example, the capacity() function returns the current capacity of the vector's internal array. If we know the size it will grow up to ahead of time, we can use the reserve(size_type capacity) function to have it pre-allocate this capacity for us. This avoids reallocating a lot when doing a bunch of push_backs, which can let us gain a precious bit of performance (see the example here for some actual numbers).

The code

Now that we understand std::vector::reserve, let's take a look at some C++:

std::vector<int> myVec{}; // create a vector of size 0
myVec.reserve(1); // reserve a capacity of 1
myVec[0] = 42; // write 42 to the first element of our empty(!!) vector
std::cout << myVec[0];

When run, the above prints 42. I hope I'm not the only one who's surprised this works! I'm overwriting the value of the first element in a vector... which has no elements. This is an out of bounds write, and should definitely not work. Not only that, but on my machine I can replace index 0 with up to index 15187 and it still works fine! Index 15188 segfaults, though, so at least that's sane behavior (so long as I get far enough away from the start of the vector...). So what the peck is going on??

The peck (it's going on)

Okay, okay, I'll say the thing. We've found what in C++ is called “undefined behavior” (UB). This is a magical realm where anything could happen. Your computer might replace every window title with your username, or your program might send an order to all pizza restaurants in a 5km radius. If you're lucky, your program will just crash. More likely though, your code will do exactly what you intended it to do, and either subtly break something later on, or never signal anything on your machine... and break on someone else's.

Why is this undefined behavior, you ask? We told our vector to reserve a capacity of 1, so 0 is a perfectly valid index in its internal array. However, the C++ standard never promises that there is an element living at that index! It only defines operator[] for indices below size(), and merely asks reserve() to “ensure a capacity” up to which no reallocations need to happen.

NOTE: after lots of research (and asking the smart folks of the #include <C++> community), I've been unable to find an implementation where this does break. That doesn't mean it's okay to rely on this behavior! It's still UB!

Why it works for us

Despite this being undefined behavior, it works consistently in my program. Why is this? When we run the line myVec[0] = 42, the std::vector::operator[] function is called with an argument of 0, to return a reference to the location in memory at index 0 for this vector. Let's look at the source code for this function in GCC's libstdc++ (which I used for my testing, though the same issue applies on clang and MSVC):

/**
 *  @brief  Subscript access to the data contained in the %vector.
 *  @param __n The index of the element for which data should be
 *  accessed.
 *  @return  Read/write reference to data.
 *
 *  This operator allows for easy, array-style, data access.
 *  Note that data access with this operator is unchecked and
 *  out_of_range lookups are not defined. (For checked lookups
 *  see at().)
 */
_GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
reference
operator[](size_type __n) _GLIBCXX_NOEXCEPT
{
    __glibcxx_requires_subscript(__n);
    return *(this->_M_impl._M_start + __n);
}

Looking past all the macros (the subscript thing expands to an empty line by default, we'll look into it later), this simply takes the pointer to the start of the internal array (_M_impl._M_start), adds our argument __n, and returns the result as a reference. As long as _M_start points to some valid allocated address, we should be fine accessing it within the bounds of the array (note, of course, that this is only true for this implementation of libstdc++! Other implementations may do different things; we're in UB-land here). This explains why our index outside of the vector's size worked: we're indexing the internal array, not the vector! As long as we call reserve on the vector first, and our index is within that reserved array's size, the data should be perfectly okay being written to and read from an out-of-bounds-but-within-capacity index of a vector (on this specific version of GCC's libstdc++). If we remove the myVec.reserve(1) line, the program does crash as expected, since _M_impl._M_start was never pointed at an allocation and thus refers to invalid memory.

Array out of bounds

The reason why accessing an index higher than the array's size works is covered here, but a tl;dr is that you are indeed overwriting memory you shouldn't be, and by chance nothing bad is happening. If we run it through the valgrind memory error detector, it indeed detects our error for any index outside the array. Here's the log for a write at index 1, after a call to reserve(1):

Invalid write of size 4
   at 0x1091FC: main (ub.cpp:8)
 Address 0x4e21084 is 0 bytes after a block of size 4 alloc'd
   at 0x4841F11: operator new(unsigned long) (vg_replace_malloc.c:434)
   by 0x109825: std::__new_allocator<int>::allocate(unsigned long, void const*) (new_allocator.h:147)
   by 0x109604: allocate (alloc_traits.h:482)
   by 0x109604: std::_Vector_base<int, std::allocator<int> >::_M_allocate(unsigned long) (stl_vector.h:378)
   by 0x1093FF: std::vector<int, std::allocator<int> >::reserve(unsigned long) (vector.tcc:79)
   by 0x1091EA: main (ub.cpp:6)

Let's dissect this output:

1. The first line indicates that we wrote 4 bytes somewhere that's “invalid.” That's the size of a 32-bit int, which is the type we're writing into index 1.
2. The big call stack tells us where the array that we're accessing out of bounds was allocated. The penultimate line points us to that std::vector::reserve call we made, which creates a “block of size 4” (the vector's internal array, with the capacity for a single 4-byte int).

This indicates that we are indeed accessing the internal array out of bounds, and that it is a memory error that will cause UB even on this implementation of std::vector. So that answers that!

Speed at the cost of safety

Although on my GCC install, using this as actual storage “works” “fine,” it has... issues. When we try to do a range-based loop, it will never get the elements we wrote out of bounds. If the vector gets copied, it will only bring over the data within its size, and leave behind everything else. These kinds of issues would be super hard to diagnose had I not spotted the UB here!

Shouldn't std::vector::operator[] warn us that we're accessing an element outside of the vector's size? Let's check the C++ standard on vector functions.

Only at() performs range checking. If the index is out of range, at() throws an out_of_range exception. All other functions do not check.

- The C++ Standard Library: A Tutorial and Reference by Nicolai M. Josuttis (2012), pages 274-275

Well, darn. I can understand why, though. When writing code in C++, we expect to have the lowest possible performance overhead, yet still get to use all these nice abstractions. Performing bounds checks, even if cheap, can really add up if we have to do it for every vector access. Changing the out-of-bounds access to use at() does indeed print a (relatively) helpful crash message (shown here for index 1):

terminate called after throwing an instance of 'std::out_of_range'
  what():  vector::_M_range_check: __n (which is 1) >= this->size() (which is 0)

As I was writing this, an excellent relevant post by @saagar@saagarjha.com graced my Mastodon timeline:

Download the video. Original source.

That's not all, though! Remember that curious __glibcxx_requires_subscript(__n); macro in the GCC implementation of operator[], which I said we'd look at later? Now is before's later, so let's take a look at the definition:

#ifndef _GLIBCXX_ASSERTIONS
  # define __glibcxx_requires_subscript(_N)
#else
  # define __glibcxx_requires_subscript(_N)	\
  __glibcxx_assert(_N < this->size())
#endif

So it does do something! You just have to have _GLIBCXX_ASSERTIONS defined. Indeed, if we define that macro with the -D_GLIBCXX_ASSERTIONS compiler flag, we get this wonderful totally-readable error when the code tries to index out of bounds:

/usr/include/c++/13.2.1/bits/stl_vector.h:1125: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = int; _Alloc = std::allocator<int>; reference = int&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.

Okay, it's no “you're accessing this vector out of bounds, please stop,” but it certainly is better than dealing with the potential mess of undefined behavior that awaits otherwise. I guess I'll be adding this flag to all my debug builds from now on!

If you're curious, this is my original code where I found the issue.

Adventures cross-compiling a Windows game engine

Thu, 21 Sep 2023 20:20:48 +0000

As part of my game development major at DAE, I have to work on several projects which were not made with support for my platform of choice (Linux). Thankfully, most of these have been simple frameworks wrapping around SDL and OpenGL, so my job was limited to rewriting the build system from Visual Studio's .sln project file to a cross-platform CMake project (and fixing some bugs along the way). Not too bad. I'd miss the beginning of the first class, but was up and going shortly after. Among these were the first two semesters of Programming. Here's a list of school engines I have ported so far:

Programming 1 “SDL Framework”: https://git.allpurposem.at/mat/SDL-Framework
Programming 2 “GameDevEngine2”: https://git.allpurposem.at/mat/GameDevEngine2
Graphics Programming “RayTracer”: https://git.allpurposem.at/mat/GraphicsProg1
Gameplay Programming “_FRAMEWORK”: https://git.allpurposem.at/mat/GameplayProg

The versatility of having a cross-platform project allowed me to add tons of niceties for some of these. The one I'm most happy with is the “GameDevEngine2” framework from Programming 2, to which I added web support and ended up using it for my and 2FoamBoards's entry in the 2023 GMTK game jam.

Programming 3

I'd been having it easy. A couple nonstandard Microsoft Visual C++ (MSVC) bits of syntax here, a couple win32 API calls (functions that are specific to Windows) there... I wasn't expecting what arrived in my downloads folder today. I applied my usual CMake boilerplate, with SDL support, hit run to see the perhaps 50-100 errors... and instead was greeted with a simple but effective singular error.

apm@apg ~/S/Prog3 (main)> clang++ source/GameWinMain.cpp 
In file included from source/GameWinMain.cpp:9:
source/GameWinMain.h:12:10: fatal error: 'windows.h' file not found
#include <windows.h>
         ^~~~~~~~~~~
1 error generated.

Oh, no

There's no SDL. There's no OpenGL. No GLFW, Qt, or GTK. It's all bare Windows API calls. I think I was in some form of state of disbelief, as I spent the next 30 minutes slowly creating #defines and typedefs to patch in all the types. Maybe, just maybe, I could patch around the types and it would magically open a window and I could get started with my classwork. No such thing happened.

Options

So: what are my options? Is this salvageable, without having to boot the dreaded virtual machine? Let's see... I could:

continue patching around the 3-4k lines of win32 API calls like I was ineffectively doing before
rewrite the engine from scratch to support SDL
build the native .sln file by somehow running MSVC on WINE (a Windows compatibility layer for Linux)
cross-compile from Linux to Windows and run the .exe file with WINE

Obviously the first two options would be preferable, as they don't come with a hard dependency on the unfamiliar world of WINE. However, they sadly also take the most time. I have not yet discarded the second option (the author of the engine gave me the green light to rewrite it for native Linux, and even use it in exams (that's a first!!)), but as I have to follow the class from the start, I think I'll be going with WINE.

`aur/msvc-wine-git`

Of course, I'm not the first person to want to build a .sln project from Linux. This appears to be a solved problem, with the polished-looking msvc-wine toolchain available as a native package for my distro. So I went ahead and installed it:

apm@apg ~/S/Prog3 (main)> gimme msvc-wine-git
[sudo] password for apm: 
:: Resolving dependencies...
:: Calculating conflicts...
:: Calculating inner conflicts...

Aur (1) msvc-wine-git-17.7.r4-2

:: Proceed to review? [Y/n]:

It diligently fetched MSVC, the Windows 11 SDK, and all the necessary components from Microsoft's servers, which gave me time to read the documentation. I happened upon the CMake instructions (CMake being how I've managed all my school-related projects so far), but they didn't stick in my brain. I don't intend to criticize the writing, but something about it being all the way at the bottom in a FAQ, with no code blocks or example commands, or having a class going on around me while I was doing this, prevented me from understanding how I'm supposed to use it. The only time I've ever used a separate toolchain was Emscripten; it provides a nice little emcmake wrapper for CMake which takes care of a lot of the details for you. I gave it a few tries, but seeing I was getting nowhere, and every second was lost class time, I decided to move on to my last option.

LLVM

I knew a little about LLVM before this, from having used clangd as my language server for C++ projects. As I understand it, it's a group of compilers designed in such a way that the “frontends” (which read the text code and output an intermediate language) and “backends” (which read the intermediate language and output the final binary) are swappable and interchangeable. This means you can use the same backend to compile both C++ and Rust code, while still getting equally well-optimized machine code out the other side. I enlisted the help of @JohnyTheCarrot@toot.community, who I knew has worked with clang before. He told me about the concept of an “LLVM triple”, which is a setting for LLVM compilers that tells them what sort of machine you want them to output code for. Crucially, you can specify a triple for a completely different system than your own, and it should still work. I tried the following command:

clang++ -target x86_64-w64-mingw32 source/*.cpp -o game

This currently outputs 227 linker errors. There were also many syntax-related compiler errors, which I've since fixed, but it does get us past the dreaded #include <windows.h>! All of the linker errors take the following form:

/usr/bin/x86_64-w64-mingw32-ld: /tmp/GameEngine-ac27d8.o:GameEngine.cpp:(.text+0xc95f): undefined reference to `__imp_DeleteObject'

Fun with the linker

Each of these is related to a call of a Windows-related function. It looks like we're missing the libraries! Adding the -mwindows flag tells Clang it's compiling & linking a GUI Windows app, instead of a command line one. This causes linking against a lot of win32 GUI-related functions, reducing the linker errors to a mere 9. There's two kinds:

__imp_AlphaBlend and __imp_TransparentBlt: According to the code, these are used for transparency. I have yet to use this engine, but from the names I'm guessing they allow for drawing semi-opaque images on top of each other and blend the colors together. According to Microsoft's documentation, these are located in Msimg32.dll.
__imp_mciSendStringA: These are functions from the defunct Multimedia Control Interface (that's the mci at the start of the name!), which this engine uses to play audio. Microsoft helpfully kept the legacy documentation online, informing me that these belong to Winmm.dll.

At first, I assumed I'd have to get these from a copy of Windows. However, I remembered WINE has a lot of open source reimplementations of these DLLs (Windows's version of .so shared libraries), and sure enough locate msimg32.dll (note the lowercase: I wasted some time with this because Linux is case sensitive, while Windows is not!) pointed me straight to a DLL I could yoink. I added it to the list of files to compile, and the msimg32-related linker errors were gone. Hooray!

...or so I thought. I excitedly copied in winmm.dll and tried to compile...

clang-16: error: unable to execute command: Segmentation fault (core dumped)
clang-16: error: linker command failed due to signal (use -v to see invocation)

Excuse me?? The linker is segfaulting?? To be honest, I have no idea whether this is an actual bug in LLVM's linker, but it sure did stump me for a while. I thought maybe my copy of winmm.dll was corrupt, or WINE did something weird with it. I went as far as downloading Microsoft's version of the DLL, but was met with the same sad message. What could I be possibly doing wrong?

Oh. I'm not supposed to be copying the DLLs into here, am I? The last time I used a linker without going through CMake, I was passing libraries to it with -l. But it can't be that easy for this... can it? It'd have to go to my default WINE prefix to fetch them, which sounds plain weird. Libraries come from system paths, not user-specific folders. Well, might be worth a try anyways...

apm@apg ~/S/P/build (main)> clang++ -mwindows -target x86_64-w64-mingw32 ../source/*.cpp -o game -lmsimg32 -lwinmm
In file included from ../source/GameWinMain.cpp:10:
../source/GameEngine.h:19:9: warning: '_WIN32_WINNT' macro redefined [-Wmacro-redefined]
#define _WIN32_WINNT 0x0A00                             // Windows 10
        ^
/usr/x86_64-w64-mingw32/include/_mingw.h:239:9: note: previous definition is here
#define _WIN32_WINNT 0xa00
        ^
1 warning generated.
Warning: corrupt .drectve at end of def file
Warning: corrupt .drectve at end of def file
Warning: corrupt .drectve at end of def file
apm@apg ~/S/P/build (main)> ls
game.exe*

wait. That built?? HUH???? There's no way it—

apm@apg ~/S/P/build (main)> ./game.exe
-snip-
0130:err:module:import_dll Library libgcc_s_seh-1.dll (which is needed by L"Z:\\home\\apm\\School\\Prog3\\build\\game.exe") not found
0130:err:module:import_dll Library libstdc++-6.dll (which is needed by L"Z:\\home\\apm\\School\\Prog3\\build\\game.exe") not found
0130:err:module:LdrInitializeThunk Importing dlls for L"Z:\\home\\apm\\School\\Prog3\\build\\game.exe" failed, status c0000135

Right. Not so fast, heh. Still, this is great news! I don't know how or why this works, but we're linking to the DLLs somehow somewhere. WINE can't find some mingw32 libraries which were pulled in by -mwindows, but we can easily point it to them with export WINEPATH="/usr/x86_64-w64-mingw32/bin"

And that's it! Here's the engine in all its glory, with audio support and all! It's beautiful...

Right, there's nothing built on it yet. It's just a blank canvas. But hey, it doesn't crash!

What's next?

Having this run through WINE does come with a few limitations:

All WINE apps take a long while to launch, though you can vastly improve this by running wineserver --persistent beforehand.
Usually, I attach gdb (the GNU debugger) to my code from my IDE, neovim. However, with this program running under WINE, I don't know how I would do that. Debugging remains an unsolved mystery (EDIT: see Addendum, I figured it out!).
WINE is slowly merging Wayland support, but at the moment it runs under X11, meaning I'm sacrificing some performance and convenience.
Finally, of course, this will never have Linux support. I don't like that.

Long-term, depending on the course workload and how complex the engine functions end up being, I think I will rewrite it in SDL. This will have the added bonus of enabling, like with my other engine ports, web support (see my Programming 2 end project here and a game jam game made in the same engine here). However, I think this will take longer than is reasonable to spend while procrastinating on other classes, so I'm leaving it here. I wrote down my process while it was still fresh in my mind, so I hope this was an interesting read! As always, any and all constructive feedback is welcome; send it my way at @mat@mastodon.gamedev.place.

I am considering writing up my general porting process in a separate blog post, so perhaps expect that next!

Addendum

After doing some additional research, and asking around in the very helpful WineHQ IRC room, I found a way to get debugging working! The first step is adding the -g flag to the clang++ invocation, which tells clang we want it to generate debug information (namely source maps, so the debugger can show which line of code we're at). Then I simply have to run winedbg --gdb game.exe, and I am presented with a (nearly) full-featured gdb prompt!

I'm unsure how to hook this up to neovim (maybe I can look into the Debug Adapter Protocol for this?), but for now just having a gdb environment is awesome enough. On to more adventures!
