Dear Attendees,
With the core series of five lectures now finished, it is time to dive into our exciting guest lectures!
The first guest lecture will be given by @Pietro Vertechi, and is titled "Neural network layers as parametric spans".
The abstract for Pietro's lecture is as follows:
Properties such as composability and automatic differentiation made artificial neural networks a pervasive tool in applications. Tackling more challenging problems caused neural networks to progressively become more complex and thus difficult to define from a mathematical perspective. In this talk, we will discuss a general definition of linear layer arising from a categorical framework based on the notions of integration theory and parametric spans. This definition generalizes and encompasses classical layers (e.g., dense, convolutional), while guaranteeing existence and computability of the layer's derivatives for backpropagation.
Pietro's guest lecture will take place in the usual slot, next week (Monday 14 November, starting 4PM UK Time). The lecture will be given on Zoom and live-streamed on YouTube, just as before (the Zoom link should be the same as in previous weeks, but we will confirm the details in advance of the lectures).
This guest lecture will help explain key parts of "Neural network layers as parametric spans" (Bergomi and Vertechi, SYCO 9).
Lastly, on behalf of the entire organising team of Cats4AI :cat: , I'd like to thank you all for actively engaging with the course so far! :blush:
I'm sure I can speak for all five of us when I say that this was such a daunting but extremely valuable experience: for several of us it was the first time presenting these concepts to such a diverse audience, but seeing you all engaging with the content (whether it be on Zulip, Zoom, or otherwise) made it all the more worthwhile! :boom:
We will be sure to send out a feedback form in the future, to get a better feel for what could have been done better (for future years? :) )
The public link is https://uva-live.zoom.us/j/83816139841 (same as before)
The talk will be live-streamed to https://youtu.be/83a-MwlDy6s
Thoughts during the lecture: it looks like there's a correspondence between Propositions 1 and 2 in Pietro's talk and Definition 3.5.2.16 in the Categorical Systems Theory book.
And in fact, Pietro does say that this can be interpreted as a general lens.
Fantastic talk @Pietro Vertechi (I watched over YouTube as I was in the office :) )
Yes! It was very interesting @Pietro Vertechi, I especially enjoyed your animated illustrations - made everything instantly intuitive :)
The idea of permuting the legs of a span to compute the backward pass reminds me of how you'd implement this differentiation in terms of einsum. Turns out the derivative can be implemented by permuting the indices.
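For instance, for the dense layer, a minimal NumPy sketch of what I mean (the shapes and variable names here are just illustrative): the forward pass and both gradients are the same contraction, with the roles of the indices permuted.
import numpy as np
# Dense layer forward pass: y[j] = sum_i x[i] * w[i, j]
x = np.random.randn(4)        # input
w = np.random.randn(4, 3)     # weights
y = np.einsum('i,ij->j', x, w)
# Backward pass: the same einsum with the index roles permuted.
dy = np.random.randn(3)              # upstream gradient dL/dy
dx = np.einsum('j,ij->i', dy, w)     # dL/dx[i] = sum_j dy[j] * w[i, j]
dw = np.einsum('i,j->ij', x, dy)     # dL/dw[i, j] = x[i] * dy[j]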
@Bruno Gavranovic, I'm a bit late to the party, but I realized that this can be a very helpful comparison (thinking of discrete parametric spans as a generalized Einstein summation). Many layers can be implemented that way (well, I'm not sure the second one below is accepted in einsum, but it's a useful notation):
y[j] += x[i] * w[i, j] # dense
y[j, k] += x[i, k-l] * w[i, j, l] # convolutional
The general discrete parametric span is of the form
y[t(p)] += x[s(p)] * w[π(p)]
where $p$ lives in some generalized index space $E$.
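To make that concrete, here is a minimal Python sketch of the general formula for a finite index space (the names E, s, t and proj, standing in for $\pi$, are just illustrative, instantiated for the dense layer above): the backward pass is the same loop over E with the legs permuted, as Bruno suggested.
import numpy as np
# A discrete parametric span instantiated for a dense layer with 4 inputs and 3 outputs:
# E is the set of index pairs (i, j); s, t and proj are the three legs of the span.
E = [(i, j) for i in range(4) for j in range(3)]
s = lambda p: p[0]      # source leg: which input entry to read
t = lambda p: p[1]      # target leg: which output entry to write
proj = lambda p: p      # parameter leg: which weight entry to use
x = np.random.randn(4)
w = {p: np.random.randn() for p in E}   # weights indexed by the parameter leg
# Forward pass: y[t(p)] += x[s(p)] * w[proj(p)]
y = np.zeros(3)
for p in E:
    y[t(p)] += x[s(p)] * w[proj(p)]
# Backward pass: permute the legs and run the same loop over E.
dy = np.random.randn(3)
dx = np.zeros(4)
dw = {p: 0.0 for p in E}
for p in E:
    dx[s(p)] += dy[t(p)] * w[proj(p)]
    dw[proj(p)] += x[s(p)] * dy[t(p)]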