Neural Network in C#

Often we forget that we stand on the shoulders of giants. Machine Learning, Deep Learning and AI have gained so much traction that many frameworks are now available. Today it is really easy to pick a framework of our choosing (ML.NET, for example) and start a project fully focused on the problem we are trying to solve. However, sometimes it is good to stop and consider what we are using and how it actually works.
Since I am a software developer who switched to machine learning, back in the day I decided to build a neural network from scratch using object-oriented programming and C#. I wanted to split neural networks into their building blocks and learn about them with tools I already knew. That way I was learning one new thing, not two. Since then I have often used this solution to explain deep learning concepts to .NET developers. So, in this article, we go through that solution.
What is a neural network?
Inspired by nature, neural networks are the usual representation we make of the brain: neurons connected to other neurons, forming a network. A simple piece of information passes through many of them before becoming an actual action, like “move the hand to pick up this pencil”.
The operation of a complete neural network is straightforward: we enter variables as inputs (for example an image, if the neural network is supposed to tell what is in an image), and after some calculations an output is returned (in the same example, giving an image of a cat should return the word “cat”).
Now, you should know that the neurons of an artificial neural network are usually arranged in columns (layers), so that a neuron in column n can only be connected to neurons in columns n-1 and n+1. A few types of networks use a different architecture, but we will focus on the simplest one for now.
So, we can represent an artificial neural network like this:

What does a neuron do?
The operations done by each neuron are pretty simple:

First, it adds up the values of all the neurons from the previous column that it is connected to. In Figure 2, there are 3 inputs (x1, x2, x3) coming to the neuron, so 3 neurons of the previous column are connected to our neuron.
Before being added, each value is multiplied by another variable called the “weight” (w1, w2, w3), which determines the strength of the connection between the two neurons. Each connection has its own weight, and in the network built here those are the only values modified during the learning process.
Moreover, a bias value may be added to the total. It does not come from a specific neuron and, in this simple description, is chosen before the learning phase, but it can be useful for the network.
After all those summations, the neuron finally applies a function called the “activation function” to the obtained value.

The activation function usually serves to turn the total calculated before into a number between 0 and 1 (done, for example, by the sigmoid function shown in Figure 3). Other functions exist and may have different limits (the rectifier used later in this article, for instance, has no upper limit), but most keep the same aim of limiting the value.
That’s all a neuron does! It takes the values from all connected neurons, multiplies each by its respective weight, adds them up, and applies an activation function. Then the neuron is ready to send its new value to other neurons.
After every neuron of a column has done this, the neural network moves on to the next column. In the end, the last values obtained should be usable to determine the desired output.
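To make the steps above concrete, here is a minimal sketch of a single neuron, written as C# top-level statements (C# 9 or later). The names Sigmoid and NeuronOutput and the sample weights are my own assumptions for illustration; they are not part of the implementation built later in this article.

```csharp
using System;
using System.Linq;

// Sigmoid squashes any real number into the open interval (0, 1).
double Sigmoid(double total) => 1.0 / (1.0 + Math.Exp(-total));

// One neuron: multiply each input by its weight, add everything up,
// add the bias, then apply the activation function.
double NeuronOutput(double[] inputs, double[] weights, double bias) =>
    Sigmoid(inputs.Zip(weights, (value, weight) => value * weight).Sum() + bias);

// Three inputs (x1, x2, x3) with hypothetical weights (w1, w2, w3).
double[] sampleInputs = { 1.0, 0.5, -1.0 };
double[] sampleWeights = { 0.4, 0.6, 0.2 };

// Weighted sum is 0.4 + 0.3 - 0.2 = 0.5, so the output is sigmoid(0.5), roughly 0.62.
Console.WriteLine(NeuronOutput(sampleInputs, sampleWeights, 0.0));
```

Whatever the inputs are, the sigmoid keeps the result strictly between 0 and 1, which is exactly the "limiting" role described above.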
Now that we understand what a neuron does, we could possibly create any network we want. However, there are other operations to implement to make a neural network learn.
How does a neural network learn?
Yep, creating variables and making them interact with each other is great, but that is not enough to make the neural network learn by itself. We need to prepare a lot of data to give to our network. That data includes the inputs and the outputs expected from the neural network.
Let’s take a look at how the learning process works:
First of all, remember that when an input is given to the neural network, it returns an output. On the first try, it can’t get the right output on its own (except by luck), and that is why, during the learning phase, every input comes with its label, which states what output the neural network should have guessed. If the guess is right, the current parameters are kept and the next input is given. However, if the obtained output doesn’t match the label, the weights are changed. Those are the only variables that can be changed during the learning phase. The process may be imagined as a set of knobs that are turned to different positions every time an input isn’t guessed correctly.
To determine which weights to modify, a particular process called “backpropagation” is used. We won’t linger too much on that, since a single-layer Perceptron doesn’t need this exact process (the implementation later in this article does use it), but it consists of going back through the neural network and inspecting every connection to check how the output would respond to a change in its weight.
Finally, there is one last parameter to know in order to control the way the neural network learns: the “learning rate”. The name says it all: this value determines how fast the neural network learns or, more specifically, whether it modifies a weight little by little or in bigger steps. 1 is a workable value for a simple Perceptron; larger networks typically need much smaller values, such as 0.01 to 0.1.
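To get a feel for what the learning rate does, here is a small sketch using C# top-level statements. It is not a neural network: it just minimizes a toy error function E(w) = (w - 3)^2 that I made up for illustration, with a hypothetical helper called Descend. The same "step size" idea applies to the weights of a network.

```csharp
using System;

// Minimize the toy error E(w) = (w - 3)^2 starting from w = 0.
// Its gradient is 2 * (w - 3); each step moves the weight against the gradient.
double Descend(double learningRate, int steps)
{
    double weight = 0.0;
    for (int step = 0; step < steps; step++)
    {
        weight -= learningRate * 2 * (weight - 3);
    }
    return weight;
}

// A small rate creeps steadily toward the minimum at w = 3.
Console.WriteLine($"rate 0.1 -> {Descend(0.1, 50):F4}");
// A rate of 1.0 overshoots by the full distance each time and bounces between 0 and 6.
Console.WriteLine($"rate 1.0 -> {Descend(1.0, 50):F4}");
```

On this particular function, small steps converge while big steps keep jumping over the minimum. How large a step is too large depends on the problem, so treat any fixed value as a starting point to tune.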
Perceptron
Okay, we know the basics, so let’s look at the neural network we will create. The one explained here is called a Perceptron and is one of the earliest neural networks ever created. It consists of 2 neurons in the input column and 1 neuron in the output column. This configuration makes it possible to create a simple classifier that distinguishes 2 groups. To better understand the possibilities and the limitations, let’s see a quick example (which is of little interest other than for understanding):
Let’s say you want your neural network to return outputs according to the rules of the “inclusive or”. As a reminder:

- if A is true and B is true, then A or B is true.
- if A is true and B is false, then A or B is true.
- if A is false and B is true, then A or B is true.
- if A is false and B is false, then A or B is false.
If you replace each “true” by 1 and each “false” by 0 and plot the 4 possibilities as points on a plane, you realize that the two final groups, “false” and “true”, can be separated by a single line. This is what a Perceptron can do.
On the other hand, in the case of the “exclusive or” (where “true or true”, the point (1,1), is false), a single line cannot separate the two groups, and a Perceptron isn’t able to deal with this problem.
So, the Perceptron is indeed not a very powerful neural network, but it is simple to create and may still be useful as a classifier.
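The difference between the two cases can be checked with a short sketch, written as C# top-level statements. It uses the classic perceptron learning rule with a step activation. The helper names Predict and TrainAndCheck are hypothetical, and, as an assumption that goes beyond the description above, the bias is trained together with the weights.

```csharp
using System;

// Step activation: output 1 when the weighted sum plus bias exceeds 0.
int Predict(double[] weights, double bias, int a, int b) =>
    weights[0] * a + weights[1] * b + bias > 0 ? 1 : 0;

// Perceptron rule: nudge each weight by learningRate * (label - prediction) * input.
// Returns true when every sample is classified correctly after training.
bool TrainAndCheck(int[][] samples, int[] labels, double learningRate, int epochs)
{
    var weights = new double[2];
    double bias = 0;
    for (int epoch = 0; epoch < epochs; epoch++)
    {
        for (int i = 0; i < samples.Length; i++)
        {
            int error = labels[i] - Predict(weights, bias, samples[i][0], samples[i][1]);
            weights[0] += learningRate * error * samples[i][0];
            weights[1] += learningRate * error * samples[i][1];
            bias += learningRate * error;
        }
    }
    for (int i = 0; i < samples.Length; i++)
    {
        if (Predict(weights, bias, samples[i][0], samples[i][1]) != labels[i]) return false;
    }
    return true;
}

int[][] points = { new[] { 0, 0 }, new[] { 0, 1 }, new[] { 1, 0 }, new[] { 1, 1 } };
bool orLearned = TrainAndCheck(points, new[] { 0, 1, 1, 1 }, 1.0, 100);
bool xorLearned = TrainAndCheck(points, new[] { 0, 1, 1, 0 }, 1.0, 100);

// prints "OR learned: True, XOR learned: False"
Console.WriteLine($"OR learned: {orLearned}, XOR learned: {xorLearned}");
```

The inclusive or settles after a handful of passes. The exclusive or never does, no matter how many epochs are allowed, because no single line separates its two groups.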
Implementation
So, as you can see from the previous chapter there are a few important entities that we need to pay attention to and that we can abstract. They are neurons, connections, layers, and functions. In this solution, a separate class will implement each of these entities. Then, by putting it all together and adding a backpropagation algorithm on top of it, we will have our implementation of this simple neural network.
Input Functions
As mentioned before, crucial parts of the neuron are the input function and activation function. Let’s examine the input function. First I created an interface for this function so it can be easily changed in the neuron implementation later on:
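// Namespaces used by the code in this article.
using System;
using System.Collections.Generic;
using System.Linq;

public interface IInputFunction
{
    double CalculateInput(List<ISynapse> inputs);
}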
This interface has only one method, CalculateInput, which receives a list of connections described by the ISynapse interface. We will cover this abstraction later; so far all we need to know is that this interface represents connections among neurons. The CalculateInput method needs to return some sort of value based on the data contained in the list of connections. Then I wrote the concrete implementation of the input function, the weighted sum function:
public class WeightedSumFunction : IInputFunction
{
    public double CalculateInput(List<ISynapse> inputs)
    {
        return inputs.Select(x => x.Weight * x.GetOutput()).Sum();
    }
}
This function sums weighted values on all connections that are passed in the list.
Activation Functions
Taking the same approach as in input function implementation, the interface for activation functions is implemented first:
public interface IActivationFunction
{
    double CalculateOutput(double input);
}
After that, concrete implementation can be done. The CalculateOutput method should return the output value of the neuron based on the input value that it got from the input function. Here is how the step function looks:
public class StepActivationFunction : IActivationFunction
{
    private readonly double _threshold;

    public StepActivationFunction(double threshold)
    {
        _threshold = threshold;
    }

    public double CalculateOutput(double input)
    {
        // 1 when the input exceeds the threshold, otherwise 0.
        return input > _threshold ? 1.0 : 0.0;
    }
}
Pretty straightforward, isn’t it? A threshold value is defined during the construction of the object, and then the CalculateOutput returns 1 if the input value exceeds the threshold value, otherwise, it returns 0.
Other functions are easy as well. Here is the Sigmoid activation function implementation:
public class SigmoidActivationFunction : IActivationFunction
{
    private readonly double _coefficient;

    public SigmoidActivationFunction(double coefficient)
    {
        _coefficient = coefficient;
    }

    public double CalculateOutput(double input)
    {
        // The coefficient controls how steep the curve is around 0.
        return 1 / (1 + Math.Exp(-input * _coefficient));
    }
}
And here is the Rectifier activation function implementation:
public class RectifiedActivationFuncion : IActivationFunction
{
    public double CalculateOutput(double input)
    {
        return Math.Max(0, input);
    }
}
So far so good – we have implementations for input and activation functions, and we can proceed to implement the trickier parts of the network – neurons and connections.
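To compare the three activation functions side by side, here is a quick standalone sketch using C# top-level statements. The local functions Step, Sigmoid and Rectifier are simplified stand-ins that I wrote for illustration; they mirror the formulas of the classes above but are not those classes.

```csharp
using System;

// Simplified stand-ins for the three activation classes above.
double Step(double input, double threshold) => input > threshold ? 1.0 : 0.0;
double Sigmoid(double input, double coefficient) => 1.0 / (1.0 + Math.Exp(-input * coefficient));
double Rectifier(double input) => Math.Max(0.0, input);

// Print each function's output for a negative, a zero and a positive input.
foreach (double sample in new[] { -2.0, 0.0, 2.0 })
{
    Console.WriteLine(
        $"{sample,5}: step={Step(sample, 0.5)} sigmoid={Sigmoid(sample, 1.0):F3} rectifier={Rectifier(sample)}");
}
```

The step function only ever returns 0 or 1, the sigmoid returns something strictly between 0 and 1, and the rectifier passes positive values through unchanged while clamping negative ones to 0.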
Neuron
The workflow that a neuron should follow goes like this: receive input values from one or more weighted input connections; collect those values and pass them to the activation function, which calculates the output value of the neuron; send that value to the outputs of the neuron. Based on that workflow, this abstraction of the neuron is created:
public interface INeuron
{
    Guid Id { get; }
    double PreviousPartialDerivate { get; set; }
    List<ISynapse> Inputs { get; set; }
    List<ISynapse> Outputs { get; set; }
    void AddInputNeuron(INeuron inputNeuron);
    void AddOutputNeuron(INeuron outputNeuron);
    double CalculateOutput();
    void AddInputSynapse(double inputValue);
    void PushValueOnInput(double inputValue);
}
Before we explain each property and method, let’s see the concrete implementation of a neuron, since that will make the way it works far clearer:
public class Neuron : INeuron
{
    private IActivationFunction _activationFunction;
    private IInputFunction _inputFunction;

    /// <summary>
    /// Input connections of the neuron.
    /// </summary>
    public List<ISynapse> Inputs { get; set; }

    /// <summary>
    /// Output connections of the neuron.
    /// </summary>
    public List<ISynapse> Outputs { get; set; }

    public Guid Id { get; private set; }

    /// <summary>
    /// Partial derivative calculated in the previous iteration of the training process.
    /// </summary>
    public double PreviousPartialDerivate { get; set; }

    public Neuron(IActivationFunction activationFunction, IInputFunction inputFunction)
    {
        Id = Guid.NewGuid();
        Inputs = new List<ISynapse>();
        Outputs = new List<ISynapse>();
        _activationFunction = activationFunction;
        _inputFunction = inputFunction;
    }

    /// <summary>
    /// Connect two neurons.
    /// This neuron is the output neuron of the connection.
    /// </summary>
    /// <param name="inputNeuron">Neuron that will be input neuron of the newly created connection.</param>
    public void AddInputNeuron(INeuron inputNeuron)
    {
        var synapse = new Synapse(inputNeuron, this);
        Inputs.Add(synapse);
        inputNeuron.Outputs.Add(synapse);
    }

    /// <summary>
    /// Connect two neurons.
    /// This neuron is the input neuron of the connection.
    /// </summary>
    /// <param name="outputNeuron">Neuron that will be output neuron of the newly created connection.</param>
    public void AddOutputNeuron(INeuron outputNeuron)
    {
        var synapse = new Synapse(this, outputNeuron);
        Outputs.Add(synapse);
        outputNeuron.Inputs.Add(synapse);
    }

    /// <summary>
    /// Calculate output value of the neuron.
    /// </summary>
    /// <returns>
    /// Output of the neuron.
    /// </returns>
    public double CalculateOutput()
    {
        return _activationFunction.CalculateOutput(_inputFunction.CalculateInput(this.Inputs));
    }

    /// <summary>
    /// Input Layer neurons just receive input values.
    /// For this they need to have connections.
    /// This function adds this kind of connection to the neuron.
    /// </summary>
    /// <param name="inputValue">
    /// Initial value that will be "pushed" as an input to connection.
    /// </param>
    public void AddInputSynapse(double inputValue)
    {
        var inputSynapse = new InputSynapse(this, inputValue);
        Inputs.Add(inputSynapse);
    }

    /// <summary>
    /// Sets new value on the input connections.
    /// </summary>
    /// <param name="inputValue">
    /// New value that will be "pushed" as an input to connection.
    /// </param>
    public void PushValueOnInput(double inputValue)
    {
        ((InputSynapse)Inputs.First()).Output = inputValue;
    }
}
Each neuron has its unique identifier – Id. This property is used in the backpropagation algorithm later. Another property that is added for backpropagation purposes is the PreviousPartialDerivate, but this will be examined in detail further on. A neuron has two lists, one for input connections – Inputs, and another one for output connections – Outputs. Also, it has two fields, one for each of the functions described in previous chapters. They are initialized through the constructor. This way, neurons with different input and activation functions can be created.
This class has some interesting methods, too. AddInputNeuron and AddOutputNeuron are used to create a connection among neurons. The first one adds an input connection to some neuron and the second one adds an output connection to some neuron. AddInputSynapse adds an InputSynapse to the neuron, which is a special type of connection used just for the input layer of the network, i.e. only for feeding input into the system as a whole. This will be covered in more detail in the next chapter.
Last but not least, the CalculateOutput method is used to activate a chain reaction of output calculation. What will happen when this function is called? Well, this will call the input function, which will request values from all input connections. In turn, these connections will request output values from input neurons of these connections, i.e. output values of neurons from the previous layer. This process will be done until the input layer is reached and input values are propagated through the system.
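The chain reaction can be sketched in isolation with plain delegates, written here as C# top-level statements. This is a simplified model, not the classes above: the helpers Input and Neuron and the evaluations counter are hypothetical names introduced only to show the pull-based evaluation.

```csharp
using System;
using System.Linq;

int evaluations = 0;

// An input simply returns its stored value when asked.
Func<double> Input(double value) => () => value;

// A neuron computes its output on demand by pulling the output of every
// upstream source, much like CalculateOutput pulls through its connections.
Func<double> Neuron(params (Func<double> source, double weight)[] inputs) => () =>
{
    evaluations++;
    double sum = inputs.Sum(connection => connection.weight * connection.source());
    return 1.0 / (1.0 + Math.Exp(-sum));
};

var hidden1 = Neuron((Input(1.0), 0.5), (Input(0.0), 0.5));
var hidden2 = Neuron((Input(1.0), -0.5), (Input(0.0), 0.5));
var output = Neuron((hidden1, 1.0), (hidden2, 1.0));

// Asking only the output neuron triggers both hidden neurons as well.
double result = output();
Console.WriteLine($"output={result:F4}, neuron evaluations={evaluations}");
```

One call on the output neuron evaluates three neurons in total. Nothing is cached, so every further call walks the whole network again, which is worth keeping in mind when the output is requested repeatedly during training.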
Connections
Connections are abstracted through the ISynapse interface:
public interface ISynapse
{
    double Weight { get; set; }
    double PreviousWeight { get; set; }
    double GetOutput();
    bool IsFromNeuron(Guid fromNeuronId);
    void UpdateWeight(double learningRate, double delta);
}
Every connection has its weight represented through the property of the same name. Additional property PreviousWeight is added and it is used during the backpropagation of the error through the system. An update of the current weight and storing of the previous one is done in the helper function UpdateWeight.
There is another helper function – IsFromNeuron, which detects if a certain neuron is an input neuron to the connection. Of course, there is a method that gets an output value of the connection – GetOutput. Here is the implementation of the connection:
public class Synapse : ISynapse
{
    // Shared generator: creating a new Random per synapse can yield identical
    // weights on runtimes that seed it from the system clock.
    private static readonly Random _random = new Random();

    internal INeuron _fromNeuron;
    internal INeuron _toNeuron;

    /// <summary>
    /// Weight of the connection.
    /// </summary>
    public double Weight { get; set; }

    /// <summary>
    /// Weight that the connection had in the previous iteration.
    /// Used in training process.
    /// </summary>
    public double PreviousWeight { get; set; }

    public Synapse(INeuron fromNeuron, INeuron toNeuron, double weight)
    {
        _fromNeuron = fromNeuron;
        _toNeuron = toNeuron;
        Weight = weight;
        PreviousWeight = 0;
    }

    public Synapse(INeuron fromNeuron, INeuron toNeuron)
        : this(fromNeuron, toNeuron, _random.NextDouble())
    {
    }

    /// <summary>
    /// Get output value of the connection.
    /// </summary>
    /// <returns>
    /// Output value of the connection.
    /// </returns>
    public double GetOutput()
    {
        return _fromNeuron.CalculateOutput();
    }

    /// <summary>
    /// Checks whether the neuron with the given Id is the input neuron of this connection.
    /// </summary>
    /// <param name="fromNeuronId">Neuron Id.</param>
    /// <returns>
    /// True - if the neuron is the input of the connection.
    /// False - if the neuron is not the input of the connection.
    /// </returns>
    public bool IsFromNeuron(Guid fromNeuronId)
    {
        return _fromNeuron.Id.Equals(fromNeuronId);
    }

    /// <summary>
    /// Update weight.
    /// </summary>
    /// <param name="learningRate">Chosen learning rate.</param>
    /// <param name="delta">Partial derivative of the error with respect to this weight, as calculated by the network.</param>
    public void UpdateWeight(double learningRate, double delta)
    {
        PreviousWeight = Weight;
        // Gradient descent steps against the gradient. The network passes
        // delta = -input * (expected - output) * output * (1 - output),
        // so adding it would increase the error instead of reducing it.
        Weight -= learningRate * delta;
    }
}
Notice the fields _fromNeuron and _toNeuron, which define neurons that this synapse connects. Apart from this implementation of the connection, there is another one that I’ve mentioned in the previous chapter about neurons. It is InputSynapse and it is used as an input to the system. The weight of these connections is always 1 and it is not updated during the training process. Here is the implementation of it:
public class InputSynapse : ISynapse
{
    internal INeuron _toNeuron;
    public double Weight { get; set; }
    public double Output { get; set; }
    public double PreviousWeight { get; set; }

    public InputSynapse(INeuron toNeuron)
    {
        _toNeuron = toNeuron;
        Weight = 1;
    }

    public InputSynapse(INeuron toNeuron, double output)
    {
        _toNeuron = toNeuron;
        Output = output;
        Weight = 1;
        PreviousWeight = 1;
    }

    // An input connection just hands over the value that was pushed into it.
    public double GetOutput()
    {
        return Output;
    }

    // There is no neuron on the input side of this connection.
    public bool IsFromNeuron(Guid fromNeuronId)
    {
        return false;
    }

    public void UpdateWeight(double learningRate, double delta)
    {
        throw new InvalidOperationException("It is not allowed to call this method on Input Connection");
    }
}
Layer
From here, the implementation of the neural layer is quite easy:
public class NeuralLayer
{
    public List<INeuron> Neurons;

    public NeuralLayer()
    {
        Neurons = new List<INeuron>();
    }

    /// <summary>
    /// Connecting two layers.
    /// </summary>
    public void ConnectLayers(NeuralLayer inputLayer)
    {
        var combos = Neurons.SelectMany(neuron => inputLayer.Neurons, (neuron, input) => new { neuron, input });
        combos.ToList().ForEach(x => x.neuron.AddInputNeuron(x.input));
    }
}
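The SelectMany call in ConnectLayers can look a bit cryptic, so here is a standalone sketch of what it produces, written as C# top-level statements. Strings stand in for neurons; the names "h1", "i1" and so on are made up for illustration.

```csharp
using System;
using System.Linq;

string[] currentLayer = { "h1", "h2" };
string[] previousLayer = { "i1", "i2", "i3" };

// Same SelectMany shape as in ConnectLayers: every neuron of the current
// layer is paired with every neuron of the previous layer.
var pairs = currentLayer
    .SelectMany(neuron => previousLayer, (neuron, input) => $"{input}->{neuron}")
    .ToList();

// prints "i1->h1, i2->h1, i3->h1, i1->h2, i2->h2, i3->h2"
Console.WriteLine(string.Join(", ", pairs));
```

Two neurons connected to a layer of three give six connections, so the layers end up fully connected.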
Simple Artificial Neural Network
Now, let’s put all that together and add backpropagation to it. Take a look at the implementation of the Network itself:
public class SimpleNeuralNetwork
{
    private NeuralLayerFactory _layerFactory;
    internal List<NeuralLayer> _layers;
    internal double _learningRate;
    internal double[][] _expectedResult;

    /// <summary>
    /// Constructor of the Neural Network.
    /// Note:
    /// Initially, an input layer with the defined number of inputs is created.
    /// </summary>
    /// <param name="numberOfInputNeurons">
    /// Number of neurons in input layer.
    /// </param>
    public SimpleNeuralNetwork(int numberOfInputNeurons)
    {
        _layers = new List<NeuralLayer>();
        _layerFactory = new NeuralLayerFactory();
        // Create input layer that will collect inputs.
        CreateInputLayer(numberOfInputNeurons);
        _learningRate = 2.95;
    }

    /// <summary>
    /// Add layer to the neural network.
    /// Layer will automatically be added as the output layer to the last layer in the neural network.
    /// </summary>
    public void AddLayer(NeuralLayer newLayer)
    {
        if (_layers.Any())
        {
            var lastLayer = _layers.Last();
            newLayer.ConnectLayers(lastLayer);
        }
        _layers.Add(newLayer);
    }

    /// <summary>
    /// Push input values to the neural network.
    /// </summary>
    public void PushInputValues(double[] inputs)
    {
        var inputNeurons = _layers.First().Neurons;
        for (int i = 0; i < inputNeurons.Count; i++)
        {
            inputNeurons[i].PushValueOnInput(inputs[i]);
        }
    }

    /// <summary>
    /// Set expected values for the outputs.
    /// </summary>
    public void PushExpectedValues(double[][] expectedOutputs)
    {
        _expectedResult = expectedOutputs;
    }

    /// <summary>
    /// Calculate output of the neural network.
    /// </summary>
    /// <returns>Output value of every neuron in the last layer.</returns>
    public List<double> GetOutput()
    {
        return _layers.Last().Neurons.Select(neuron => neuron.CalculateOutput()).ToList();
    }

    /// <summary>
    /// Train neural network.
    /// </summary>
    /// <param name="inputs">Input values.</param>
    /// <param name="numberOfEpochs">Number of epochs.</param>
    public void Train(double[][] inputs, int numberOfEpochs)
    {
        double totalError = 0;
        for (int i = 0; i < numberOfEpochs; i++)
        {
            for (int j = 0; j < inputs.Length; j++)
            {
                PushInputValues(inputs[j]);
                // Get outputs.
                var outputs = GetOutput();
                // Calculate error by summing errors on all output neurons.
                // It is kept for inspection only and does not drive the updates.
                totalError = CalculateTotalError(outputs, j);
                HandleOutputLayer(j);
                HandleHiddenLayers();
            }
        }
    }

    /// <summary>
    /// Helper function that creates input layer of the neural network.
    /// </summary>
    private void CreateInputLayer(int numberOfInputNeurons)
    {
        var inputLayer = _layerFactory.CreateNeuralLayer(numberOfInputNeurons, new RectifiedActivationFuncion(), new WeightedSumFunction());
        inputLayer.Neurons.ForEach(x => x.AddInputSynapse(0));
        this.AddLayer(inputLayer);
    }

    /// <summary>
    /// Helper function that calculates total error of the neural network.
    /// </summary>
    private double CalculateTotalError(List<double> outputs, int row)
    {
        double totalError = 0;
        // Index loop instead of IndexOf, which returns the first match and
        // picks the wrong expected value when two outputs are equal.
        for (int i = 0; i < outputs.Count; i++)
        {
            totalError += Math.Pow(outputs[i] - _expectedResult[row][i], 2);
        }
        return totalError;
    }

    /// <summary>
    /// Helper function that runs backpropagation algorithm on the output layer of the network.
    /// </summary>
    /// <param name="row">
    /// Input/Expected output row.
    /// </param>
    private void HandleOutputLayer(int row)
    {
        var outputNeurons = _layers.Last().Neurons;
        for (int n = 0; n < outputNeurons.Count; n++)
        {
            var neuron = outputNeurons[n];
            // Evaluate once per neuron, before any of its weights change.
            var output = neuron.CalculateOutput();
            var expectedOutput = _expectedResult[row][n];
            var nodeDelta = (expectedOutput - output) * output * (1 - output);
            neuron.Inputs.ForEach(connection =>
            {
                var netInput = connection.GetOutput();
                var delta = -1 * netInput * nodeDelta;
                connection.UpdateWeight(_learningRate, delta);
            });
            neuron.PreviousPartialDerivate = nodeDelta;
        }
    }

    /// <summary>
    /// Helper function that runs backpropagation algorithm on the hidden layers of the network.
    /// </summary>
    private void HandleHiddenLayers()
    {
        for (int k = _layers.Count - 2; k > 0; k--)
        {
            _layers[k].Neurons.ForEach(neuron =>
            {
                // Evaluate once per neuron, before any of its weights change.
                var output = neuron.CalculateOutput();
                // Sum what flows back from the next layer, using the weights
                // those connections had before they were updated.
                double sumPartial = 0;
                _layers[k + 1].Neurons.ForEach(outputNeuron =>
                {
                    outputNeuron.Inputs
                        .Where(i => i.IsFromNeuron(neuron.Id))
                        .ToList()
                        .ForEach(outConnection =>
                        {
                            sumPartial += outConnection.PreviousWeight * outputNeuron.PreviousPartialDerivate;
                        });
                });
                var nodeDelta = sumPartial * output * (1 - output);
                neuron.Inputs.ForEach(connection =>
                {
                    var netInput = connection.GetOutput();
                    var delta = -1 * netInput * nodeDelta;
                    connection.UpdateWeight(_learningRate, delta);
                });
                // Store the value so that earlier hidden layers can use it in turn.
                neuron.PreviousPartialDerivate = nodeDelta;
            });
        }
    }
}
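This class contains a list of neural layers and a layer factory, a class that is used to create new layers. NeuralLayerFactory is not listed in this article; judging by how it is used, its CreateNeuralLayer method takes the number of neurons, an activation function and an input function, and returns a NeuralLayer filled with such neurons. During the construction of the object, the initial input layer is added to the network. Other layers are added through the function AddLayer, which adds a passed layer on top of the current layer list. The GetOutput method will activate the output layer of the network, thus initiating a chain reaction through the network.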
Also, this class has a few helper methods such as PushExpectedValues, which is used to set desired values for the training set that will be passed during training, as well as PushInputValues, which is used to set certain input to the network.
The most important method of this class is the Train method. It receives the training set and the number of epochs. In each epoch, it runs the whole training set through the network. For every row, the output is compared with the desired output and the functions HandleOutputLayer and HandleHiddenLayers are called. These functions implement the backpropagation algorithm. Note that both hard-code the sigmoid derivative, output * (1 - output), so the weight updates are exact only for neurons that use a sigmoid activation with a coefficient of 1.
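The formula used for the output layer can be sanity-checked numerically. The sketch below, written as C# top-level statements, takes one sigmoid neuron with a single weighted input and compares the analytic gradient with a finite-difference estimate. The helper names and sample values are hypothetical, and I assume the error is defined as half of the squared difference, which is the definition that yields the (expected - output) * output * (1 - output) term.

```csharp
using System;

double Sigmoid(double total) => 1.0 / (1.0 + Math.Exp(-total));

// Error of a single sigmoid neuron with one weighted input:
// E = 0.5 * (target - output)^2 (assumed definition, see above).
double Error(double weight, double input, double target)
{
    double output = Sigmoid(weight * input);
    return 0.5 * Math.Pow(target - output, 2);
}

// Analytic gradient: dE/dw = -(target - output) * output * (1 - output) * input.
double Gradient(double weight, double input, double target)
{
    double output = Sigmoid(weight * input);
    return -(target - output) * output * (1 - output) * input;
}

double startWeight = 0.3, sampleInput = 1.5, sampleTarget = 1.0, rate = 0.5;

double analytic = Gradient(startWeight, sampleInput, sampleTarget);
// Central finite difference as an independent estimate of the same slope.
double numeric = (Error(startWeight + 1e-6, sampleInput, sampleTarget)
                - Error(startWeight - 1e-6, sampleInput, sampleTarget)) / 2e-6;

// Gradient descent moves the weight against the gradient.
double nextWeight = startWeight - rate * analytic;

Console.WriteLine($"analytic={analytic:F6}, numeric={numeric:F6}");
Console.WriteLine($"error before={Error(startWeight, sampleInput, sampleTarget):F6}, after={Error(nextWeight, sampleInput, sampleTarget):F6}");
```

The two gradient values agree, and stepping against the gradient lowers the error. A check like this is a cheap way to catch a wrong sign or a wrong derivative in hand-written backpropagation code.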
Workflow
A typical workflow can be seen in one of the tests, Train_RuningTraining_NetworkIsTrained. It goes something like this:
var network = new SimpleNeuralNetwork(3);
var layerFactory = new NeuralLayerFactory();
network.AddLayer(layerFactory.CreateNeuralLayer(3, new RectifiedActivationFuncion(), new WeightedSumFunction()));
network.AddLayer(layerFactory.CreateNeuralLayer(1, new SigmoidActivationFunction(0.7), new WeightedSumFunction()));
network.PushExpectedValues(
    new double[][] {
        new double[] { 0 },
        new double[] { 1 },
        new double[] { 1 },
        new double[] { 0 },
        new double[] { 1 },
        new double[] { 0 },
        new double[] { 0 },
    });
network.Train(
    new double[][] {
        new double[] { 150, 2, 0 },
        new double[] { 1002, 56, 1 },
        new double[] { 1060, 59, 1 },
        new double[] { 200, 3, 0 },
        new double[] { 300, 3, 1 },
        new double[] { 120, 1, 0 },
        new double[] { 80, 1, 0 },
    }, 10000);
network.PushInputValues(new double[] { 1054, 54, 1 });
var outputs = network.GetOutput();
Firstly, a neural network object is created. In the constructor, it is defined that there will be three neurons in the input layer. After that, two layers are added using the function AddLayer and layer factory. For each layer, the number of neurons and functions for each neuron is defined. After this part is completed, the expected outputs are defined and the Train function with the input training set and the number of epochs is called.
Conclusion
This implementation of the neural network is far from optimal. You will notice plenty of nested loops, which certainly perform badly. Also, in order to simplify this solution, some components of a neural network, such as momentum and bias, were not introduced in this first iteration of the implementation. Nevertheless, the goal was not to implement a high-performance network, but to analyze and display the important elements and abstractions that each Artificial Neural Network has.
Thanks for reading!