...inside matrices (gotcha!).
Created on 11/22/2024, updated on 12/07/2024
Today I had a course of informatic programming (it's just C course actually). There was a strange exercise we had to do: from a matrix, create a new one telling for each cell how many neighbors which share the same value they have.
Easy at a quick glance, but harder when thinking about edge cases.
First, I took a paper, because it's always easier for me to visualize by writing and drawing.
There are four edge cases concerning the analyzed cell :
-
It is on the first line.
-
It is on the first column.
-
It is on the last line.
-
It is on the last column.
Four more edge cases happen when you think about corners.
So I used the rough way : checking each edge case, sum the boolean results, then remove one, since we don't want the checked value to be counted as a neighbor of itself.
I define this algorithm inside a function declared as this : size_t countNeighbors(int **matrix, size_t m, size_t n, size_t i, size_t j)
.
m
corresponds to rows and
n
to columns. I could have created a struct, but I already had done this
here.
size_t countNeighbors(int **matrix, size_t m, size_t n, size_t i, size_t j)
{
const int a = matrix[i][j];
const size_t up = (i != 0 && matrix[i-1][j] == a);
const size_t down = (i != m-1 && matrix[i+1][j] == a);
const size_t left = (j != 0 && matrix[i][j-1] == a);
const size_t right = (j != n-1 && matrix[i][j+1] == a);
const size_t up_left = (i != 0 && j != 0 && matrix[i-1][j-1] == a);
const size_t up_right = (i != 0 && j != n-1 && matrix[i-1][j+1] == a);
const size_t down_left = (i != m-1 && j != 0 && matrix[i+1][j-1] == a);
const size_t down_right = (i != m-1 && j != n-1 && matrix[i+1][j+1] == a);
return up + down + left + right + up_left + up_right + down_left + down_right;
}
It's definitely ugly, but it works, and is somewhat readable.
To do so I created a MatrixInt
class, since I didn't want to bother with templates to simplify the algorithm. Later we will see the templated version, making sure the Matrix only accepts numeric types.
Instead an int **
, I'll be using a std::vector
to not worry about memory handling or construction.
class MatrixInt {
std::vector<std::vector<int>> m_elems;
size_t m_rows, m_columns;
}
I'll be using std::optional<int>
to safely get elements from the matrix, which in my opinion will make the code more readable than try-catching exceptions. This also means that all the following pieces of code will work starting from C++17, and some of them starting from C++20.
class MatrixInt {
MatrixInt(size_t m, size_t n) : m_elems(m, std::vector<int>(n)), m_rows{m}, m_columns{n} {}
// Hiding operator methods...
std::optional<int> at(int i, int j) const
{
if (0 <= i && i < m_rows && 0 <= j && j < m_columns) {
return m_elems[i][j];
}
return std::nullopt;
}
size_t countNeighborsAt(int i, int j) const
{
size_t result = 0;
const int a = m_elems[i][j];
// I didn't know this syntax before coding this program.
for (int k : {-1, 0, 1}) {
for (int l : {-1, 0, 1}) {
result += (this->at(i+k, j+l) == a);
}
}
return result - 1;
}
MatrixInt neighbors() const
{
MatrixInt result(m_rows, m_columns);
for (int i = 0; i < m_rows; ++i) {
for (int j = 0; j < m_columns; ++j) {
result.m_elems[i][j] = this->countNeighborsAt(i, j);
}
}
return result;
}
};
You can now use the double for
loop going from -1 to 1. The only concession here is that I have to use int
s instead of size_t
, because the at
method must accept negative parameters.
Why do you put your {
in a separate line only when defining a function or a method?
I follow the coding style of
suckless (at least most of it). It emphasizes the fact that in standard C and C++, you cannot nest functions.
But hey, wouldn't it be fun if we were using a more modern version of C++ (like C++20) and making the last method more pythonic?
#include <ranges>
class MatrixInt {
// ...
MatrixInt neighbors() const
{
#define ifor(instructions) for (size_t instructions)
#define in :
auto range = std::views::iota;
MatrixInt result(m_rows, m_columns);
ifor (i in range(0ull, m_rows)) {
ifor (j in range(0ull, m_columns)) {
result.m_elems[i][j] = this->countNeighborsAt(i, j);
}
}
return result;
}
}
Yeah, definitely. Please don't do this in production code.
Can't we write range(0, m_rows)
?
If you do so, you'll get an unclear error about substitution failure.
matrix.cpp:58:16: error: no matching function for call to object of type '__iota::__fn'
58 | ifor (i in range(0, m_rows)) {
| ^~~~~
matrix.cpp:53:44: note: expanded from macro 'ifor'
53 | #define ifor(instructions) for (size_t instructions)
| ^~~~~~~~~~~~
// shortened for the sake of readability
1 error generated.
This becomes clearer with this piece of code:
#include <iostream>
using std::cout;
void log(int i)
{
cout << "Got an int!\n";
}
void log(size_t i)
{
cout << "Got a size_t!\n";
}
int main()
{
log(0);
return 0;
}
The output is Got an int!
, which means that 0 will be automatically interpreted as an int when it can.
Let's make our Matrix generic. Basically, you add template<typename T>
and <T>
wherever needed.
template<typename T>
class Matrix
{
std::vector<std::vector<T>> m_elems;
size_t m_rows, m_columns;
public:
Matrix(size_t m, size_t n)
: m_elems(m, std::vector<T>(n)), m_rows{m}, m_columns{n} {}
// Hiding operator methods...
std::optional<T> at(int i, int j) const
{
if (0 <= i && i < m_rows && 0 <= j && j < m_columns) {
return m_elems[i][j];
}
return std::nullopt;
}
size_t countNeighborsAt(int i, int j) const
{
size_t result = 0;
const T a = m_elems[i][j];
for (int k : {-1, 0, 1}) {
for (int l : {-1, 0, 1}) {
result += (this->at(i+k, j+l) == a);
}
}
return result - 1;
}
// This is the tricky part
Matrix<size_t> neighbors() const
{
Matrix<size_t> result(m_rows, m_columns);
for (size_t i = 0; i < m_rows; ++i) {
for (size_t j = 0; j < m_columns; ++j) {
// Oddly, you cannot access result.m_elems, so operators are important here
result[i][j] = this->countNeighborsAt(i, j);
}
}
return result;
}
}
We have to ensure that T
is a number type. Actually, in the current program, you'd get an error (or worse, only a warning) if you create a Matrix<std::string>
or Matrix<char>
. This is because I'm using C varargs to construct matrices, and the second parameter of va_arg
will be interpreted as an int
if it can. So no char
s.
Here is the monster:
#include <cstdarg>
template<typename T>
Matrix<T> createMatrix(size_t m, size_t n, ...)
{
va_list args;
va_start(args, n);
Matrix<T> result(m, n);
for (size_t i = 0; i < m; ++i) {
for (size_t j = 0; j < n; ++j) {
result[i][j] = va_arg(args, T);
}
}
va_end(args);
return result;
}
At the beginning I made this function in C. It isn't type-safe, and doesn't check the number of passed parameters.
Let's fix that by using std::initializer_list
and its size
method. It doesn't provide index operator though, so we'll use an iterator instead.
#include <cassert>
#include <initializer_list>
template<typename T>
Matrix<T> createMatrix(size_t m, size_t n, std::initializer_list<T> list)
{
assert(list.size() == m * n);
Matrix<T> result(m, n);
auto it = list.begin();
for (size_t i = 0; i < m; ++i) {
for (size_t j = 0; j < n; ++j) {
result[i][j] = *it++;
}
}
return result;
}
The easiest way is to add the type-check inside the constructor.
#include <type_traits>
template<typename T>
class Matrix {
// ...
Matrix(size_t m, size_t n)
: m_elems(m, std::vector<T>(n)), m_rows{m}, m_columns{n}
{
static_assert(std::is_arithmetic<T>::value, "Type parameter must be numeric.");
}
}
This works, but you can bypass it by declaring an uninitialized pointer, such as std::unique_ptr<Matrix<string>>
. The code will compile, which may lead to subtle bugs. This is the reason why most of the time you should use static_assert
inside functions and not methods/constructors. Also, keep in mind that functions using this technique cannot be overloaded for types not respecting the assertion.
We still want the type-check to happen at compile-time, but we also want it when declaring the type, even before calling the constructor.
In C++20, you can use concepts, which allows a readable and self-explanatory code.
#include <concepts>
template<typename T>
concept arithmetic = std::integral<T> or std::floating_point<T>;
template<arithmetic T>
class Matrix {
// ...
}
The error message will also be very clear.
matrix.cpp:112:19: error: constraints not satisfied for class template 'Matrix' [with T = std::string]
112 | std::unique_ptr<Matrix<std::string>> ptr;
| ^~~~~~~~~~~~~~~~~~~
...
1 error generated.
Note that template<arithmetic T>
is equivalent to template<typename T> requires arithmetic<T>
.
This technique satisfies our needs, but what if for some reasons, you cannot use C++20?
Enters std::enable_if
. It will be less readable, but will do its job.
#include <type_traits>
template<
typename T,
typename = typename std::enable_if<std::is_arithmetic<T>::value, T>::type
>
class Matrix {
// ...
Matrix(size_t m, size_t n)
: m_elems(m, std::vector<T>(n)), m_rows{m}, m_columns{n}
{
static_assert(std::is_arithmetic<T>::value, "Type parameter must be numeric.");
}
}
The shown error will be less clear, but will point to the correct line containing it.
matrix.cpp:11:38: error: failed requirement 'std::is_arithmetic<std::string>::value'; 'enable_if' cannot be used to disable this declaration
11 | typename = typename std::enable_if<std::is_arithmetic<T>::value, T>::type
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
matrix.cpp:111:19: note: in instantiation of default argument for 'Matrix<std::string>' required here
111 | std::unique_ptr<Matrix<std::string>> ptr;
| ^~~~~~~~~~~~~~~~~~~
1 error generated.
I wanted to try out Zig and see how it fares against C and C++. Being a Zig beginner doesn't help, since I dealt with many errors, and the error messages aren't always clear. To construct our Matrix type, we'll create a function that returns a type.
DISCLAIMER: this code uses the version 0.14.0-dev.2245+4fc295dc0
. It may not work on later versions. Remember that below 1.0, Zig will evolve quickly, especially its standard library.
const std = @import("std");
fn Matrix(comptime T: type) type {
// Type-check is really straightforward
const info = @typeInfo(T);
if (info != .int and info != .float) {
@compileError("Type parameter is not numerical.");
}
return struct {
const Self = @This();
elems: std.ArrayList(std.ArrayList(T)),
rows: usize,
columns: usize,
// Define methods here...
};
}
Getting type information is done at compile-time, so the following if condition also occurs at compile-time. Note that in Zig, functions starting with an @
are builtins defined by the compiler.
Should I use the *const Self
as parameter type when I'm not modifying the struct?
No, you can simply use the
Self
type, because when structs are passed as parameters, Zig will pass it by reference
if it is faster to do so rather than copying it.
Tests are very easy to create in Zig.
test "equality" {
const allocator = std.testing.allocator;
const values = &[_]u8{
1, 1,
1, 1,
1, 1,
};
const matrix = try Matrix(u8).init(allocator, 3, 2, values);
defer matrix.deinit();
std.debug.assert(matrix.equalsSlice(values));
const neighbors = try matrix.neighbors(allocator);
defer neighbors.deinit();
const expected = &[_]usize{
3, 3,
5, 5,
3, 3,
};
std.debug.assert(neighbors.equalsSlice(expected));
}
Then you call zig build test
.
Coding the program is Zig was tough not because of the algorithm (I love the fact that optional type is native and easy to use), but, as a pure beginner, because of my numerous syntax errors. Error messages aren't always clear, which can be annoying. The most annoying feature when coming from C or C++ is int casting. There is not implicit type promotion. The most annoying is to learn the appropriate builtin functions to convert int types, but
this article summarizes them very well. All in all I think it's a good thing, as it creates a more predictable code.
The C code is straightforward but unsafe and not totally readable. Meanwhile, the C++ code is more complex, with the language offering multiple techniques, so it took more time to write it, and it always depends on your compiler version. The Zig code is safe, easily testable, and offers a way more easier way to type-check comptime parameters. However, since the syntax differs a bit from C and C++, you have to take time to learn the syntax and features (
Ziglings is a good start).
You can find the source code in the three languages
here.