Dear Attendees,
With the core series of five lectures now finished, it is time to dive into our exciting guest lectures!
The first guest lecture will be given by @Pietro Vertechi, and is titled "Neural network layers as parametric spans".
The abstract for Pietro's lecture is as follows:
Properties such as composability and automatic differentiation made artificial neural networks a pervasive tool in applications. Tackling more challenging problems caused neural networks to progressively become more complex and thus difficult to define from a mathematical perspective. In this talk, we will discuss a general definition of linear layer arising from a categorical framework based on the notions of integration theory and parametric spans. This definition generalizes and encompasses classical layers (e.g., dense, convolutional), while guaranteeing existence and computability of the layer's derivatives for backpropagation.
Pietro's guest lecture will take place in the usual slot, next week (Monday 14 November, starting 4PM UK Time). The lecture will be given on Zoom and live-streamed on YouTube, just as before (the Zoom link should be the same as in previous weeks, but we will confirm the details in advance of the lectures).
This guest lecture will help explain key parts of "Neural network layers as parametric spans" (Bergomi and Vertechi, SYCO 9).
Lastly, on behalf of the entire organising team of Cats4AI :cat: , I'd like to thank you all for actively engaging with the course so far! :blush:
I'm sure I can speak for all five of us when I say that this was such a daunting but extremely valuable experience: for several of us it was the first time presenting these concepts to such a diverse audience, but seeing you all engaging with the content (whether it be on Zulip, Zoom, or otherwise) made it all the more worthwhile! :boom:
We will be sure to send out a feedback form in the future, to get a better feel for what could have been done better (for future years? :) )
The public link is https://uva-live.zoom.us/j/83816139841 (same as before)
The talk will be live-streamed to https://youtu.be/83a-MwlDy6s
Thoughts during the lecture: it looks like there's a correspondence between Propositions 1 and 2 in Pietro's talk and Definition 3.5.2.16 in the Categorical Systems Theory book.
And in fact, Pietro does say that this can be interpreted as a general lens.
Fantastic talk @Pietro Vertechi (I watched over YouTube as I was in the office :) )
Yes! It was very interesting @Pietro Vertechi, I especially enjoyed your animated illustrations - made everything instantly intuitive :)
The idea of permuting the legs of a span to compute the backward pass reminds me of how you'd implement this differentiation in terms of einsum. Turns out the derivative can be implemented by permuting the indices.
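For instance, for the dense layer, a minimal NumPy sketch of what I mean (the shapes and variable names here are just illustrative): the forward pass and both gradients are the same contraction, with the roles of the indices permuted.
import numpy as np
# Dense layer forward pass: y[j] = sum_i x[i] * w[i, j]
x = np.random.randn(4)        # input
w = np.random.randn(4, 3)     # weights
y = np.einsum('i,ij->j', x, w)
# Backward pass: the same einsum with the index roles permuted.
dy = np.random.randn(3)              # upstream gradient dL/dy
dx = np.einsum('j,ij->i', dy, w)     # dL/dx[i] = sum_j dy[j] * w[i, j]
dw = np.einsum('i,j->ij', x, dy)     # dL/dw[i, j] = x[i] * dy[j]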
@Bruno Gavranovic, I'm a bit late to the party, but I realized that this can be a very helpful comparison (thinking of discrete parametric spans as a generalized Einstein summation). Many layers can be implemented that way (well, I'm not sure the second one below is accepted in einsum, but it's a useful notation):
y[j] += x[i] * w[i, j] # dense
y[j, k] += x[i, k-l] * w[i, j, l] # convolutional
The general discrete parametric span is of the form
y[t(p)] += x[s(p)] * w[π(p)]
where $p$ lives in some generalized index space $E$.
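To make that concrete, here is a minimal Python sketch of the general formula for a finite index space (the names E, s, t and proj, standing in for $\pi$, are just illustrative, instantiated for the dense layer above): the backward pass is the same loop over E with the legs permuted, as Bruno suggested.
import numpy as np
# A discrete parametric span instantiated for a dense layer with 4 inputs and 3 outputs:
# E is the set of index pairs (i, j); s, t and proj are the three legs of the span.
E = [(i, j) for i in range(4) for j in range(3)]
s = lambda p: p[0]      # source leg: which input entry to read
t = lambda p: p[1]      # target leg: which output entry to write
proj = lambda p: p      # parameter leg: which weight entry to use
x = np.random.randn(4)
w = {p: np.random.randn() for p in E}   # weights indexed by the parameter leg
# Forward pass: y[t(p)] += x[s(p)] * w[proj(p)]
y = np.zeros(3)
for p in E:
    y[t(p)] += x[s(p)] * w[proj(p)]
# Backward pass: permute the legs and run the same loop over E.
dy = np.random.randn(3)
dx = np.zeros(4)
dw = {p: 0.0 for p in E}
for p in E:
    dx[s(p)] += dy[t(p)] * w[proj(p)]
    dw[proj(p)] += x[s(p)] * dy[t(p)]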