Card Network

I recently came across Andrew Kensler's fascinating Business Card Raytracer. His code, in just 1337 bytes, generates a full raytraced image complete with depth of field, reflections, a sky gradient, and a textured floor. In a similar vein, I set out to create a full neural network that could fit on the back of a business card. The code below is the result: it creates a 3-layer fully-connected neural network with leaky ReLU activations and trains it to generate a small image of my name.

#include <bits/stdc++.h>

typedef float f;typedef std::vector<f> v;
f K=0.01,R=0.01;struct L{int w,h,i;v W,x,
z;L(int i,int o){W.resize(i*o);for(f&a:W)
a=(rand()/(1.1-(1<<31))-0.5)/sqrt(i+o+0.)
;w=i;h=o;}v F(v X){x=X;v a(h);i=h*w;while
(i--)a[i/w]+=W[i]*x[i%w];z=a;for(f&i:a)i=
K>i?K:i;return a;}v B(v Y){v X(w);i=w*h;
while(i--){f Z=Y[i/w]*(z[i/w]>0?1:K);X[i%
w]+=Z*W[i];W[i]-=R*Z*x[i%w];}return X;}};
int main(){L n[]{L(40,50),L(50,50),L(50,1
)};int l,z,k,i=500337;while(i--){z=242-i%
243;k=40;v x(k),y;while(k--)x[k]=sin((k%2
?z/27/3.:z%27/5.)+6.14*k/20);y=x;for(l=0;
l<3;l++)y=n[l].F(y);v Y{2*(y[0]-("\x1e\0\
\0\b\x01\0@H\374\377B\x12@\x18\x12G\302\
\x10@\x12\372\377\221\x10\0\0\200\0\0\0\
\x02"[z/8]>>z%8&1))};while(l--)Y=n[l].B(Y
);putchar(" .,-*oO##"[(int)(y[0]*8)%9]);
if(i%27<1)puts(i?i%243?"":"\033[9A":"\r\
   --- Oisin Carroll ---\n\
 gh:Oisincar  web:imois.in");}}

The demo on the original page replays recorded output (I hate to admit it's not actually running the code in the browser :P), but you'll see the same or similar output if you compile and run the code yourself.

> ./card_network

The code is only 878 characters, and (as mentioned previously) creates and trains a small neural network. I'm sure it's possible to make it shorter still, but I was happy enough to get this far! I think making it any shorter would make it impossible to reason about. As it stands, layers can be added or removed easily, and the forward/back propagation is pretty neat! (I'm not biased, I swear.)

Here's a small breakdown of how it works.

#include <bits/stdc++.h>

typedef float f;typedef std::vector<f> v;
f K=0.01,R=0.01;

Includes and typedefs. The two floats are constants: K is the 'leak' of leaky ReLU (i.e. the \(K\) in \(f(x) = \max(Kx, x)\)), and R is the learning rate.
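For reference, here is a readable sketch of the textbook leaky ReLU and its derivative. The function names `leaky_relu` and `leaky_relu_grad` are made up for this example and do not appear on the card.

```cpp
#include <algorithm>
#include <cassert>

// Textbook leaky ReLU with leak factor k (0 < k < 1): f(x) = max(k*x, x).
// Hypothetical helper names; the card inlines this logic instead.
float leaky_relu(float x, float k = 0.01f) {
    assert(k > 0 && k < 1);
    return std::max(k * x, x);
}

// Derivative of leaky ReLU: 1 for positive inputs, k otherwise.
float leaky_relu_grad(float x, float k = 0.01f) {
    return x > 0 ? 1.0f : k;
}
```

With the default leak of 0.01, positive inputs pass through unchanged and negative inputs are scaled down by a factor of 100.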

		struct L{int w,h,i;v W,x,
z;L(int i,int o){W.resize(i*o);for(f&a:W)
a=(rand()/(1.1-(1<<31))-0.5)/sqrt(i+o+0.)
;w=i;h=o;}

Start of the layer struct L. The constructor takes the input and output sizes, and initializes the weights matrix with an approximation of Xavier initialization: each weight is drawn uniformly from roughly [-0.5, 0.5) and scaled by \(1/\sqrt{i+o}\). Ideally, the random variable would be normally distributed, but a uniform distribution seems to work decently too!
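For comparison, a minimal sketch of the usual uniform variant of Xavier (Glorot) initialization, which draws from \(U(-a, a)\) with \(a = \sqrt{6/(i+o)}\). The card's range is roughly five times narrower than this. The function name `xavier_uniform` and the `seed` parameter are assumptions for the example, and it uses `<random>` rather than the card's `rand()`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Xavier/Glorot uniform initialization (hypothetical helper, not from the card):
// weights are drawn from U(-limit, limit), limit = sqrt(6 / (fan_in + fan_out)).
std::vector<float> xavier_uniform(int fan_in, int fan_out, unsigned seed = 0) {
    assert(fan_in > 0 && fan_out > 0);
    std::mt19937 gen(seed);  // fixed seed so runs are repeatable
    const float limit = std::sqrt(6.0f / (fan_in + fan_out));
    std::uniform_real_distribution<float> dist(-limit, limit);
    std::vector<float> w(static_cast<std::size_t>(fan_in) * fan_out);
    for (float& x : w) x = dist(gen);
    return w;
}
```

Calling `xavier_uniform(40, 50)` would give the 2000 weights for a layer shaped like the card's first one.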

	  v F(v X){x=X;v a(h);i=h*w;while
(i--)a[i/w]+=W[i]*x[i%w];z=a;for(f&i:a)i=
K>i?K:i;return a;}v B(v Y){v X(w);i=w*h;
while(i--){f Z=Y[i/w]*(z[i/w]>0?1:K);X[i%
w]+=Z*W[i];W[i]-=R*Z*x[i%w];}return X;}};

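Two functions for the layer struct. F performs forward propagation, taking the input vector and returning the output; the input x and the pre-activation values z are cached for later. B performs backward propagation, taking the derivative with respect to the output, updating the weights matrix, and returning the derivative with respect to the input. Both are built around leaky ReLU, with one shortcut: as written, F clamps each value from below at K (i.e. \(\max(K, x)\)) rather than scaling negatives by K, while B uses the usual leaky ReLU derivative (1 where z is positive, K otherwise).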
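Unrolled into readable form, the layer might look like the sketch below. The names (`Layer`, `forward`, `backward`, `leak`, `rate`) are invented for this example. Two deliberate differences from the card: weights start at zero so the example is deterministic, and the forward activation is the textbook \(\max(Kx, x)\).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Readable sketch of one fully-connected layer (hypothetical names).
struct Layer {
    int in, out;
    float leak, rate;
    std::vector<float> W;     // row-major weights: W[o * in + i]
    std::vector<float> x, z;  // cached input and pre-activation values

    Layer(int in_, int out_, float leak_ = 0.01f, float rate_ = 0.01f)
        : in(in_), out(out_), leak(leak_), rate(rate_),
          W(static_cast<std::size_t>(in_) * out_, 0.0f) {}

    std::vector<float> forward(const std::vector<float>& input) {
        assert(static_cast<int>(input.size()) == in);
        x = input;
        z.assign(out, 0.0f);
        for (int o = 0; o < out; ++o)
            for (int i = 0; i < in; ++i)
                z[o] += W[o * in + i] * x[i];
        std::vector<float> a(out);
        for (int o = 0; o < out; ++o)
            a[o] = z[o] > 0 ? z[o] : leak * z[o];  // textbook leaky ReLU
        return a;
    }

    // Takes dE/d(output), applies one SGD step, returns dE/d(input).
    std::vector<float> backward(const std::vector<float>& d_out) {
        assert(static_cast<int>(d_out.size()) == out);
        std::vector<float> d_in(in, 0.0f);
        for (int o = 0; o < out; ++o) {
            const float dz = d_out[o] * (z[o] > 0 ? 1.0f : leak);
            for (int i = 0; i < in; ++i) {
                d_in[i] += dz * W[o * in + i];      // uses the weight before its update
                W[o * in + i] -= rate * dz * x[i];  // gradient descent step
            }
        }
        return d_in;
    }
};
```

As on the card, each weight contributes to the input gradient before it is updated, so one pass over the matrix handles both jobs.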

int main(){L n[]{L(40,50),L(50,50),L(50,1
)};int l,z,k,i=500337;while(i--){

Start of main: create a 3-layer network. The network takes 40 input values, has two hidden layers of 50 nodes each, and returns a single value at the end. The input is based on the x,y coordinate of the pixel currently being predicted, and the output is a single value for whether this pixel should be a space or '#'. At the end we open the training loop, which runs for 500337 iterations (i counts down from 500336 to 0). That is a multiple of the output size, 243 = 27×9 pixels, so the network makes 2059 full passes over the image.

				 z=242-i%
243;k=40;v x(k),y;while(k--)x[k]=sin((k%2
?z/27/3.:z%27/5.)+6.14*k/20);

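Generate the input vector x. The pixel index z splits into a column (z%27) and a row (z/27). Each element of x is the sine of one of the two coordinates plus a phase offset that grows with k: even elements use the column scaled by 1/5, odd elements use the row scaled by 1/3.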
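The same encoding, written out as a standalone function. This is a sketch that mirrors the card's arithmetic; the name `encode_pixel` and the `row`/`col` variables are made up for readability.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Encode pixel index z (0..242) of the 27x9 image as 40 sine features.
// Hypothetical helper; the card computes this inline in main.
std::vector<float> encode_pixel(int z) {
    assert(z >= 0 && z < 243);
    const int col = z % 27;  // x coordinate
    const int row = z / 27;  // y coordinate
    std::vector<float> x(40);
    for (int k = 0; k < 40; ++k) {
        // Odd features look at the row, even features at the column.
        const double coord = (k % 2) ? row / 3.0 : col / 5.0;
        x[k] = static_cast<float>(std::sin(coord + 6.14 * k / 20));
    }
    return x;
}
```

Giving every feature a different phase means neighbouring pixels get distinct but smoothly varying inputs, which a small fully-connected network can fit more easily than two raw coordinates.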

			     y=x;for(l=0;
l<3;l++)y=n[l].F(y);

Forward propagation: feed x through the three layers in order, leaving the network output in y.

		    v Y{2*(y[0]-("\x1e\0\
\0\b\x01\0@H\374\377B\x12@\x18\x12G\302\
\x10@\x12\372\377\221\x10\0\0\200\0\0\0\
\x02"[z/8]>>z%8&1))};

From the output of the network, y, calculate \(\frac{dE}{dy}\). We use mean squared error, so the derivative is just \(2(y-\hat{y})\), where \(\hat{y}\) is the target value. The targets are packed one bit per pixel into a string: byte z/8, bit z%8. Written with raw non-ASCII bytes instead of escape sequences, the string is a lot shorter.
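A sketch of the target lookup and gradient as separate functions. The string is the card's bitmap copied verbatim (31 bytes, split across two adjacent literals); the names `kBitmap`, `target_bit` and `mse_grad` are hypothetical.

```cpp
#include <cassert>

// Target bitmap from the card: one bit per pixel, 27x9 = 243 bits in 31 bytes.
// Adjacent literals are joined by the compiler after escapes are processed.
static const char kBitmap[] =
    "\x1e\0\0\b\x01\0@H\374\377B\x12@\x18\x12G\302"
    "\x10@\x12\372\377\221\x10\0\0\200\0\0\0\x02";

// Target value (0 or 1) for pixel z: bit z%8 of byte z/8.
int target_bit(int z) {
    assert(z >= 0 && z < 243);
    return (kBitmap[z / 8] >> (z % 8)) & 1;
}

// dE/dy for the squared error E = (y - t)^2, where t is the target bit.
float mse_grad(float y, int z) {
    return 2.0f * (y - target_bit(z));
}
```

Looping z from 0 to 242 and printing '#' or a space for each `target_bit(z)`, with a newline every 27 pixels, would show the image the network is trained towards.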

		     while(l--)Y=n[l].B(Y
);

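Back propagation: l is 3 after the forward loop, so this walks the layers in reverse order, passing the gradient back and updating the weights as it goes.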

  putchar(" .,-*oO##"[(int)(y[0]*8)%9]);
if(i%27<1)puts(i?i%243?"":"\033[9A":"\r\
   --- Oisin Carroll ---\n\
 gh:Oisincar  web:imois.in");}}

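Output: choose a character based on how high or low the predicted value is. Then print special characters if we're at the end of a line (i%27==0) or we've done a full epoch (i%243==0): a line end prints a newline, and an epoch end prints the escape sequence "\033[9A", which moves the cursor back up 9 lines so the next epoch overwrites the previous image. Finally, when we're finished, i==0 and we print the end text.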
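The character lookup on its own, as a small sketch. `shade` and `ramp` are made-up names, and the non-negativity check is an addition for the example: on the card the output is always at least K, so the index never goes negative.

```cpp
#include <cassert>

// Map a prediction to one of 9 characters, from blank to '#'.
// Hypothetical helper mirroring the card's putchar expression.
char shade(float y) {
    static const char ramp[] = " .,-*oO##";
    assert(y >= 0);  // the card's output is clamped at K = 0.01
    return ramp[static_cast<int>(y * 8) % 9];
}
```

Predictions near 0 print a space and predictions near 1 print '#'. Because of the `% 9`, a prediction of 1.125 or more wraps around to the start of the ramp, which is why early epochs can look noisy.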

Thanks for reading! I'd love to hear suggestions on how to make it shorter, or any other ideas.
