Learning objectives
- Describe the structural parallel between a biological neuron and an artificial neuron.
- Compute the output of an artificial neuron given inputs, weights, bias, and an activation function.
- Implement a single neuron in NumPy using both the step function and the sigmoid activation.
Concept and real-world motivation
The human brain contains roughly 86 billion neurons. Each neuron receives electrical signals through branching extensions called dendrites, integrates those signals in the cell body (soma), and fires an output signal down its axon if the total input exceeds a threshold. This is an elegant natural computer: many inputs, weighted by synaptic strength, summed, then thresholded.
Warren McCulloch and Walter Pitts formalized this idea mathematically in 1943: the artificial neuron takes a vector of inputs \(x = [x_1, x_2, \ldots, x_n]\), multiplies each by a learned weight \(w_i\), adds a bias \(b\), and passes the result through an activation function \(f\). The output is:
\[z = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n + b = \mathbf{w} \cdot \mathbf{x} + b\] \[\text{output} = f(z)\]
The weights encode how much each input matters. The bias shifts the threshold. The activation function determines whether the neuron “fires.” In deep RL, this computation runs millions of times per second: in DQN, the entire neural network IS the Q-function: \(Q(s, a; \theta) = \text{neural_net}(s)[a]\). Every weight \(\theta\) is updated during training via backpropagation to better predict future rewards.
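This computation is just a dot product, a bias add, and an activation. A minimal NumPy sketch (the input, weight, and bias values here are arbitrary illustrative choices, not the exercise's):

```python
import numpy as np

def step(z):
    """Hard threshold: the neuron fires (1) only when z exceeds 0."""
    return 1 if z > 0 else 0

def sigmoid(z):
    """Smooth, differentiable squashing of z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b, activation):
    """Single artificial neuron: f(w . x + b)."""
    z = np.dot(w, x) + b          # pre-activation: weighted sum plus bias
    return activation(z)

x = np.array([1.0, 2.0])          # arbitrary example inputs
w = np.array([0.5, -0.25])        # arbitrary example weights
b = 0.1                           # bias shifts the firing threshold
print(neuron(x, w, b, step))      # z = 0.1 > 0, so the step neuron fires: 1
print(neuron(x, w, b, sigmoid))
```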
Exercise: Implement a single artificial neuron in NumPy. Given inputs \(x = [0.5, 0.3, 0.8]\), weights \(w = [0.2, -0.5, 0.4]\), and bias \(b = 0.1\), compute the pre-activation \(z = \mathbf{w} \cdot \mathbf{x} + b\), then apply (a) a step function and (b) the sigmoid function.
Professor’s hints
- The dot product \(\mathbf{w} \cdot \mathbf{x}\) is `np.dot(w, x)`: element-wise multiply, then sum.
- The step function is simply `int(z > 0)` or `1 if z > 0 else 0`.
- `np.exp(-z)` computes \(e^{-z}\), so the sigmoid is `1 / (1 + np.exp(-z))`.
- Check your answer: \(z = 0.2(0.5) + (-0.5)(0.3) + 0.4(0.8) + 0.1 = 0.10 - 0.15 + 0.32 + 0.10 = 0.37\).
Common pitfalls
- Using a Python loop instead of `np.dot`: `sum(w[i]*x[i] for i in range(3))` is correct but slow and not idiomatic NumPy. Use `np.dot(w, x)`.
- Forgetting the bias: \(z = \mathbf{w} \cdot \mathbf{x}\) without adding \(b\) is a common omission. The bias is the neuron's threshold offset.
- Confusing sigmoid output with a probability: Sigmoid output is in (0,1) and can be interpreted as a probability, but only when the network is trained with cross-entropy loss for a binary classification task.
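The first two pitfalls are easy to check numerically, using the exercise's values from the hints:

```python
import numpy as np

x = np.array([0.5, 0.3, 0.8])
w = np.array([0.2, -0.5, 0.4])
b = 0.1

# Pitfall 1: the loop gives the same value as np.dot, just less idiomatically.
loop_sum = sum(w[i] * x[i] for i in range(3))
assert np.isclose(loop_sum, np.dot(w, x))

# Pitfall 2: dropping the bias changes z (and can flip the step output
# whenever w . x and w . x + b straddle zero).
z_with_bias = np.dot(w, x) + b     # 0.27 + 0.10 = 0.37
z_without_bias = np.dot(w, x)      # 0.27
print(z_with_bias, z_without_bias)
```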
Worked solution
The pre-activation \(z = 0.2(0.5) + (-0.5)(0.3) + 0.4(0.8) + 0.1 = 0.10 - 0.15 + 0.32 + 0.10 = 0.37\).
Since \(z = 0.37 > 0\), the step function outputs 1. The sigmoid maps 0.37 to \(\frac{1}{1+e^{-0.37}} \approx 0.591\), which is slightly above 0.5.
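The same arithmetic in NumPy, as a sketch of one possible solution:

```python
import numpy as np

x = np.array([0.5, 0.3, 0.8])   # inputs
w = np.array([0.2, -0.5, 0.4])  # weights
b = 0.1                         # bias

z = np.dot(w, x) + b                        # pre-activation, ~0.37
step_out = 1 if z > 0 else 0                # step activation: fires, since z > 0
sigmoid_out = 1.0 / (1.0 + np.exp(-z))     # sigmoid activation, ~0.591
print(round(z, 2), step_out, round(sigmoid_out, 3))
```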
Extra practice
- Warm-up: Change the weights and observe how the output changes. Try `w = [1.0, 1.0, 1.0]`: what is z? Try `w = [-1.0, -1.0, -1.0]`: what happens to the step output?
- Coding: A neuron with 5 inputs. Initialize weights with `np.random.seed(7); w = np.random.randn(5)` and bias `b = np.random.randn()`. Generate input `x = np.array([1.0, 0.0, -0.5, 0.8, 0.2])`. Compute z and both activations.
- Challenge: The step function has a gradient of 0 everywhere except at z = 0. Why is this a problem for training with backpropagation? Explain why sigmoid is preferred over step for learning.
- Variant: Implement the ReLU activation: \(f(z) = \max(0, z)\). Apply it to the same neuron above. For which values of z does ReLU give the same output as the step function?
- Debug: The neuron below uses element-wise addition instead of a dot product to combine weights and inputs. Find and fix the bug.
- Conceptual: In the biological analogy, what does the bias \(b\) represent? How does increasing the bias affect how easily the neuron “fires”? What would a very large negative bias mean for the neuron?
- Recall: Write the equation for an artificial neuron’s output from memory, defining each symbol. Then write the sigmoid formula from memory. Check your answer against the page.
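For the Debug item above: the buggy code itself did not survive on this page, so the snippet below is a reconstruction consistent with the description (element-wise addition where a dot product belongs), using the main exercise's values. The fix is noted in the comments:

```python
import numpy as np

x = np.array([0.5, 0.3, 0.8])
w = np.array([0.2, -0.5, 0.4])
b = 0.1

# Bug (reconstructed): element-wise addition merges weights and inputs,
# which ignores how strongly each input should count:
# z = np.sum(w + x) + b
# Fix: multiply each input by its weight, then sum -- the dot product:
z = np.dot(w, x) + b
print(z)  # ~0.37
```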