Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.


Stream: event: ACT20

Topic: July 9: Paul Wilson et al.'s talk


view this post on Zulip Paolo Perrone (Jul 01 2020 at 20:47):

Hello all! This is the thread of discussion for the talk of Paul Wilson and Fabio Zanasi, "Reverse Derivative Ascent: A Categorical Approach to Learning Boolean Circuits".
Date and time: Thursday July 9, 12:35 UTC.
Zoom meeting: https://mit.zoom.us/j/7055345747
YouTube live stream: https://www.youtube.com/watch?v=aXpjE82yy-8&list=PLCOXjXDLt3pZDHGYOIqtg1m1lLOURjl1Q

view this post on Zulip Paolo Perrone (Jul 09 2020 at 12:30):

This talk will start in 5 minutes!

view this post on Zulip David Jaz (Jul 09 2020 at 13:02):

Hi Paul, I enjoyed your talk! I'm wondering if you're going to be in a jitsi room today?

view this post on Zulip Filip Buric (Jul 09 2020 at 13:04):

Hi! Cool work! I was just confused about why you chose "0" and "1" from the MNIST set. I understand you need to flatten these images to binary, but isn't it the same for all digits? As shapes, 0 and 1 are far simpler than the others and subject to less human ambiguity, so it would be cool to see performance on more tricky digits :)

view this post on Zulip Bruno Gavranović (Jul 09 2020 at 13:05):

Thanks for the interesting talk Paul! I've been thinking about related ideas and I'd love to have a chat. I'll add some questions here in the meantime:

  1. Why is this ascent, as opposed to descent? I'm not sure I understood it from the talk Edit: After giving it a bit more thought, I'm realizing we're in Z2\mathbb{Z}_2 and the set of operations we can do were is pretty limited. As you said, addition makes a lot of sense!
  2. Is there a corresponding notion of nonlinearities here, that is present in standard real-valued neural networks? To clarify: composition of two linear maps in usual smooth spaces is closed under composition. The addition of non-linearities is what allows us to get more complex operations. I'm wondering does this issue manifest here in Z2\mathbb{Z}_2 in any way.

view this post on Zulip Geoff Cruttwell (Jul 09 2020 at 13:33):

Bruno Gavranovic said:

  1. Is there a corresponding notion of nonlinearities here, that is present in standard real-valued neural networks? To clarify: composition of two linear maps in usual smooth spaces is closed under composition. The addition of non-linearities is what allows us to get more complex operations. I'm wondering does this issue manifest here in Z2\mathbb{Z}_2 in any way.

Any reverse derivative category has an inherent notion of "linear map" (which one can prove are closed under composition) so since they've shown that Boolean circuits form a reverse derivative category, one gets for free some notion of "linear" Boolean circuits. I don't know what they look like, though!

view this post on Zulip Geoff Cruttwell (Jul 09 2020 at 13:54):

Just thinking out loud here...following up on the above, any reverse derivative category has 3 things one can derive from it:

view this post on Zulip Bruno Gavranović (Jul 09 2020 at 14:04):

Geoff Cruttwell said:

Any reverse derivative category has an inherent notion of "linear map" (which one can prove are closed under composition) so since they've shown that Boolean circuits form a reverse derivative category, one gets for free some notion of "linear" Boolean circuits. I don't know what they look like, though!

Perhaps I should have clarified my question a bit more. I mean that the notion of non-linearities is crucial to learning complex functions, since they're defined in terms of these simpler linear + nonlinear constituents. What I am not clear here is whether this is also the case for boolean circuits. What are the examples of maps that are linear? What are the examples of nonlinear maps?
I guess you're saying that you're not aware of the linear ones, while for me the whole space of Z2\mathbb{Z}_2 is a bit strange to think about in these terms :grinning:

view this post on Zulip Geoff Cruttwell (Jul 09 2020 at 15:06):

Bruno Gavranovic said:

Geoff Cruttwell said:

Any reverse derivative category has an inherent notion of "linear map" (which one can prove are closed under composition) so since they've shown that Boolean circuits form a reverse derivative category, one gets for free some notion of "linear" Boolean circuits. I don't know what they look like, though!

Perhaps I should have clarified my question a bit more. I mean that the notion of non-linearities is crucial to learning complex functions, since they're defined in terms of these simpler linear + nonlinear constituents. What I am not clear here is whether this is also the case for boolean circuits. What are the examples of maps that are linear? What are the examples of nonlinear maps?

Sorry, yes, I should have been a bit clearer that my thought above didn't actually give an answer to your question of which boolean circuits are linear or non-linear :wink: ; the general theory just provides an abstract definition of linearity which needs to be worked out in this particular case.

Whatever they work out to be, though, from the general theory they will have the same properties as ordinary linear functions, eg., they will be very simple with respect to both the forward and reverse derivative, and hence should be simple with respect to reverse derivative ascent.

I guess you're saying that you're not aware of the linear ones, while for me the whole space of Z2\mathbb{Z}_2 is a bit strange to think about in these terms :grinning:

Oh definitely, for me too! But that's part of what's neat about what they've done: by putting things in this abstract framework, one doesn't have to guess as to what the right notion of linearity in this setting should be; the abstraction provides a definition for you.

view this post on Zulip Bruno Gavranović (Jul 09 2020 at 17:04):

Ah, yes, I see what you mean. Certainly there's an option to sit down and work out what these functions are but I thought I could skip that step by asking here :grinning:

view this post on Zulip Paolo Perrone (Jul 09 2020 at 17:33):

Here's the video!
https://www.youtube.com/watch?v=gdinMKXW86g&list=PLCOXjXDLt3pYot9VNdLlZqGajHyZUywdI

view this post on Zulip Paul Wilson (Jul 09 2020 at 20:09):

Bruno Gavranovic said:

Ah, yes, I see what you mean. Certainly there's an option to sit down and work out what these functions are but I thought I could skip that step by asking here :grinning:

So I think Geoff answered this from the theoretical perspective- graphically for PolyZ2\mathrm{Poly}_{\mathbb{Z}_2} these are just the maps without white generators (Geoff please correct me if I'm wrong!)

It sounds like you're more interested in whether the final 'learned function' (i.e., the map f(θ,):ZaZbf(\theta, -) : \mathbb{Z}^a \to \mathbb{Z}^b ) can be "doing something interesting", right?
If so, then yes: the circuit example on the second slide is actually a special case of a model which can learn any boolean map ZaZb \mathbb{Z}^a \to \mathbb{Z}^b , by simply remembering the set of all input/output pairs. IOW: You can construct a "lookup table" circuit where the parameters are the "data" of the lookup table, and the inputs are the "address" of each value in the lookup table. Hopefully that answers your question and makes sense!

view this post on Zulip JS PL (he/him) (Jul 09 2020 at 20:14):

This super cool, I'm happy to learn about Boolean differential calculus and great to see that it gives a CRDC!
Simple question since I'm new to Boolean differential calculus: what is the formula for the forward derivative? (which of course can be derived from the reverse derivative but I'm having a bit of trouble with the definition of the derivative in Boolean differential calculus) So say for an arbitrary function f:Z2nZ2f: \mathbb{Z}^n_2 \to \mathbb{Z}_2 what is its derivative D[f]:Z2n×Z2nZ2D[f]: \mathbb{Z}^n_2 \times \mathbb{Z}^n_2 \to \mathbb{Z}_2?

view this post on Zulip JS PL (he/him) (Jul 09 2020 at 20:17):

And as a follow up question, does this work for any finite field? Or only Z2\mathbb{Z}_2?

view this post on Zulip JS PL (he/him) (Jul 10 2020 at 12:28):

Another simple question: could you explain how the chain rule works in terms of Boolean functions?

view this post on Zulip JS PL (he/him) (Jul 10 2020 at 13:05):

JS Pacaud Lemay said:

Another simple question: could you explain how the chain rule works in terms of Boolean functions?

Ah I think I've worked it out. It really depends on the fact that f:Z2nZf: \mathbb{Z}^n_2 \to \mathbb{Z} can only be either the zero element or unit element of the field (which seems like an obvious statement!) and also that if f(x)f(y)f(x) \neq f(y) then f(x)+1=f(y)f(x) + 1 = f(y). Which is probably super obvious for those who work with Boolean functions! So I guess this probably doesn't easily extend to other finite fields.

view this post on Zulip JS PL (he/him) (Jul 10 2020 at 14:26):

JS Pacaud Lemay said:

for an arbitrary function f:Z2nZ2f: \mathbb{Z}^n_2 \to \mathbb{Z}_2 what is its derivative D[f]:Z2n×Z2nZ2D[f]: \mathbb{Z}^n_2 \times \mathbb{Z}^n_2 \to \mathbb{Z}_2?

I believe I've worked this out to be: D[f](x,y)=inf(x)yi+f(x+ei)yiDf = \sum \limits^n_i f(x)y_i + f(x + e_i)y_i, hopefully this is correct!