The most basic of neural networks.

Aditya Kumawat
3 min read · Jun 27, 2021


In this article, we are going to look at the building block of modern Neural Networks: the TLU (Threshold Logic Unit), or Perceptron. First proposed in 1943 by Warren McCulloch and Walter Pitts, it was later set aside because a single unit cannot implement complex functions.

But modern neural networks, or MLPs (Multilayer Perceptrons), are very similar to it: they simply stack TLUs in multiple layers. Doing that breaks a complex problem down into a set of smaller problems that a single TLU can solve.

Let’s see how a TLU works:

We have a dataset with two features, x1 and x2, and we can build an AND gate out of them. The AND truth table looks like this:

x1  x2  y
0   0   0
0   1   0
1   0   0
1   1   1

Now, what can you do to produce an output y that matches the table above?

We can use the power of multiplication, the basic operation that all computers rely on today: weight each input and add the results.

Before going further, let me introduce some terminology:

  • X: Our feature matrix containing x1 and x2.
  • n, m: The number of rows and the number of columns of X.

To make a prediction from a machine learning hypothesis, you need weights. More precisely, we will multiply the weights by our features to get the desired output.

The letter W is going to denote our vector of weights. But before going further, let’s introduce another term: the bias. The bias is arguably the most important part of this model.

We will add a bias column (all values equal to one) of length n to the front of the feature matrix X. This changes the shape of X from (n, m) to (n, m+1).
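As a concrete sketch (using NumPy, with the four AND-gate input rows standing in for the feature matrix), prepending the column of ones looks like this:

```python
import numpy as np

# Feature matrix X with n = 4 rows and m = 2 columns (x1, x2)
X = np.array([[0, 0],
              [0, 1],
              [1, 0],
              [1, 1]])

# Prepend a column of ones (the bias term) to the front of X
ones = np.ones((X.shape[0], 1))
X_b = np.hstack([ones, X])   # shape goes from (4, 2) to (4, 3)
print(X_b.shape)             # (4, 3)
```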

In the end we want our hypothesis to be in the form:

  • Predictions = sigmoid(X * W).

Note: The sigmoid function, sigmoid(z) = 1 / (1 + e^-z), squashes the whole number line into the interval (0, 1): large positive values map to outputs close to 1, and large negative values map to outputs close to 0.
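Here is a minimal sketch of the sigmoid function and the hypothesis, assuming the X_b matrix (with the bias column already prepended) from the snippet above; the helper names sigmoid and predict are just illustrative:

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def predict(X_b, W):
    # Hypothesis: Predictions = sigmoid(X * W), one value per row of X_b
    return sigmoid(X_b @ W)
```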

Now take some time to figure out what values our W vector should contain.

If you figured that out, great! If you couldn’t, let me explain.

Recall that an AND gate only outputs 1 if both of its inputs are active (equal to 1). From this logic, given the values of x1 and x2, we want to set the values of W.

The output equation will look like this:

sigmoid(W0 + W1 * x1 + W2 * x2)

To produce the desired output, our W vector should look like this: [-40, 30, 30]. With these weights, the argument of the sigmoid is positive if and only if the sum of the second and third terms (30 * x1 + 30 * x2) is larger than 40, which happens only when both inputs are 1. This gives us exactly the output of an AND gate.

Take the values of x1 ([0, 0, 1, 1]) and x2 ([0, 1, 0, 1]) and put the first row into the equation:

x1 = 0, x2 = 0
sigmoid(-40 + 30*0 + 30*0) = sigmoid(-40)
Applying the sigmoid function to -40 gives a value very close to 0, which is the output we want: an AND gate outputs 1 only when both x1 and x2 are 1.

x1 = 1, x2 = 1
sigmoid(-40 + 30*1 + 30*1) = sigmoid(20)
Applying the sigmoid function to 20 gives a value almost equal to 1, again exactly what the AND gate should output.
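Putting it all together, here is a small sketch that runs all four input combinations through the equation at once, using the weight vector W = [-40, 30, 30] from above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Bias column of ones followed by x1 and x2
X_b = np.array([[1, 0, 0],
                [1, 0, 1],
                [1, 1, 0],
                [1, 1, 1]])

W = np.array([-40, 30, 30])            # AND gate weights

print(np.round(sigmoid(X_b @ W), 3))   # [0. 0. 0. 1.]
```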

Conclusion: In the same way, you can build other logic gates. For example, you get an OR gate by making the magnitude of the bias weight W0 smaller than each of the input weights, so that a single active input is already enough to make the sum positive (see the sketch below).
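For illustration, the same check with an assumed choice of OR-gate weights, [-20, 30, 30]:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X_b = np.array([[1, 0, 0],
                [1, 0, 1],
                [1, 1, 0],
                [1, 1, 1]])               # bias, x1, x2

W_or = np.array([-20, 30, 30])            # one active input already makes the sum positive

print(np.round(sigmoid(X_b @ W_or), 3))   # [0. 1. 1. 1.]
```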

This is how you can implement the most basic logic gates using a TLU. But how did we get from this basic computational unit to where we are now? Simple: by stacking multiple layers of TLUs. Read more about MLPs (Multilayer Perceptrons) here.
