You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.
Hello all! This is the thread of discussion for the talk of Paul Wilson and Fabio Zanasi, "Reverse Derivative Ascent: A Categorical Approach to Learning Boolean Circuits".
Date and time: Thursday July 9, 12:35 UTC.
Zoom meeting: https://mit.zoom.us/j/7055345747
YouTube live stream: https://www.youtube.com/watch?v=aXpjE82yy-8&list=PLCOXjXDLt3pZDHGYOIqtg1m1lLOURjl1Q
This talk will start in 5 minutes!
Hi Paul, I enjoyed your talk! I'm wondering if you're going to be in a jitsi room today?
Hi! Cool work! I was just confused about why you chose "0" and "1" from the MNIST set. I understand you need to flatten these images to binary, but isn't it the same for all digits? As shapes, 0 and 1 are far simpler than the others and subject to less human ambiguity, so it would be cool to see performance on more tricky digits :)
Thanks for the interesting talk Paul! I've been thinking about related ideas and I'd love to have a chat. I'll add some questions here in the meantime:
Bruno Gavranovic said:
- Is there a corresponding notion of nonlinearities here, that is present in standard real-valued neural networks? To clarify: composition of two linear maps in usual smooth spaces is closed under composition. The addition of non-linearities is what allows us to get more complex operations. I'm wondering does this issue manifest here in in any way.
Any reverse derivative category has an inherent notion of "linear map" (which one can prove are closed under composition) so since they've shown that Boolean circuits form a reverse derivative category, one gets for free some notion of "linear" Boolean circuits. I don't know what they look like, though!
Just thinking out loud here...following up on the above, any reverse derivative category has 3 things one can derive from it:
Geoff Cruttwell said:
Any reverse derivative category has an inherent notion of "linear map" (which one can prove are closed under composition) so since they've shown that Boolean circuits form a reverse derivative category, one gets for free some notion of "linear" Boolean circuits. I don't know what they look like, though!
Perhaps I should have clarified my question a bit more. I mean that the notion of non-linearities is crucial to learning complex functions, since they're defined in terms of these simpler linear + nonlinear constituents. What I am not clear here is whether this is also the case for boolean circuits. What are the examples of maps that are linear? What are the examples of nonlinear maps?
I guess you're saying that you're not aware of the linear ones, while for me the whole space of is a bit strange to think about in these terms :grinning:
Bruno Gavranovic said:
Geoff Cruttwell said:
Any reverse derivative category has an inherent notion of "linear map" (which one can prove are closed under composition) so since they've shown that Boolean circuits form a reverse derivative category, one gets for free some notion of "linear" Boolean circuits. I don't know what they look like, though!
Perhaps I should have clarified my question a bit more. I mean that the notion of non-linearities is crucial to learning complex functions, since they're defined in terms of these simpler linear + nonlinear constituents. What I am not clear here is whether this is also the case for boolean circuits. What are the examples of maps that are linear? What are the examples of nonlinear maps?
Sorry, yes, I should have been a bit clearer that my thought above didn't actually give an answer to your question of which boolean circuits are linear or non-linear :wink: ; the general theory just provides an abstract definition of linearity which needs to be worked out in this particular case.
Whatever they work out to be, though, from the general theory they will have the same properties as ordinary linear functions, eg., they will be very simple with respect to both the forward and reverse derivative, and hence should be simple with respect to reverse derivative ascent.
I guess you're saying that you're not aware of the linear ones, while for me the whole space of is a bit strange to think about in these terms :grinning:
Oh definitely, for me too! But that's part of what's neat about what they've done: by putting things in this abstract framework, one doesn't have to guess as to what the right notion of linearity in this setting should be; the abstraction provides a definition for you.
Ah, yes, I see what you mean. Certainly there's an option to sit down and work out what these functions are but I thought I could skip that step by asking here :grinning:
Here's the video!
https://www.youtube.com/watch?v=gdinMKXW86g&list=PLCOXjXDLt3pYot9VNdLlZqGajHyZUywdI
Bruno Gavranovic said:
Ah, yes, I see what you mean. Certainly there's an option to sit down and work out what these functions are but I thought I could skip that step by asking here :grinning:
So I think Geoff answered this from the theoretical perspective- graphically for these are just the maps without white generators (Geoff please correct me if I'm wrong!)
It sounds like you're more interested in whether the final 'learned function' (i.e., the map ) can be "doing something interesting", right?
If so, then yes: the circuit example on the second slide is actually a special case of a model which can learn any boolean map , by simply remembering the set of all input/output pairs. IOW: You can construct a "lookup table" circuit where the parameters are the "data" of the lookup table, and the inputs are the "address" of each value in the lookup table. Hopefully that answers your question and makes sense!
This super cool, I'm happy to learn about Boolean differential calculus and great to see that it gives a CRDC!
Simple question since I'm new to Boolean differential calculus: what is the formula for the forward derivative? (which of course can be derived from the reverse derivative but I'm having a bit of trouble with the definition of the derivative in Boolean differential calculus) So say for an arbitrary function what is its derivative ?
And as a follow up question, does this work for any finite field? Or only ?
Another simple question: could you explain how the chain rule works in terms of Boolean functions?
JS Pacaud Lemay said:
Another simple question: could you explain how the chain rule works in terms of Boolean functions?
Ah I think I've worked it out. It really depends on the fact that can only be either the zero element or unit element of the field (which seems like an obvious statement!) and also that if then . Which is probably super obvious for those who work with Boolean functions! So I guess this probably doesn't easily extend to other finite fields.
JS Pacaud Lemay said:
for an arbitrary function what is its derivative ?
I believe I've worked this out to be: , hopefully this is correct!