You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.
In the seventh talk of the ACT@UCR seminar, Tai-Danae Bradley will tell us about applications of categorical quantum mechanics to formal concept analysis.
She will give her talk on Wednesday May 13th at 5 pm UTC, which is 10 am in California, or 1 pm on the east coast of the United States, or 6 pm in England. It will be held online via Zoom, here:
https://ucr.zoom.us/j/607160601
Afterwards we will discuss her talk here!
You can see her slides here.
Abstract. In this talk, I’ll show how any probability distribution on a product of finite sets gives rise to a pair of linear maps called density operators, whose eigenvectors capture “concepts” inherent in the original probability distribution. In some cases, the eigenvectors coincide with a simple construction from lattice theory known as a formal concept. In general, the operators recover marginal probabilities on their diagonals, and the information stored in their eigenvectors is akin to conditional probability. This is useful in an applied setting, where the eigenvectors and eigenvalues can be glued together to reconstruct joint probabilities. This naturally leads to a tensor network model of the original distribution. I’ll explain these ideas from the ground up, starting with an introduction to formal concepts. Time permitting, I’ll also share how the same ideas lead to a simple framework for modeling hierarchy in natural language. As an aside, it’s known that formal concepts arise as an enriched version of a generalization of the Isbell completion of a category. Oftentimes, the construction is motivated by drawing an analogy with elementary linear algebra. I like to think of this talk as an application of the linear algebraic side of that analogy.
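A minimal numpy sketch of the construction described in the abstract, using a made-up 2x3 joint distribution (the names p, psi, rho_X, rho_Y are mine, not from the talk): take entrywise square roots of the joint distribution, form the two reduced density operators, check that their diagonals recover the marginals, and look at their eigenvectors.

```python
import numpy as np

p = np.array([[0.25, 0.25, 0.0],
              [0.0,  0.25, 0.25]])   # hypothetical joint distribution on a 2x3 product
assert np.isclose(p.sum(), 1.0)

psi = np.sqrt(p)                      # entrywise square roots of the distribution
rho_Y = psi @ psi.T                   # 2x2 density operator on one factor
rho_X = psi.T @ psi                   # 3x3 density operator on the other factor

print(np.diag(rho_Y), p.sum(axis=1))  # diagonal of rho_Y recovers one marginal
print(np.diag(rho_X), p.sum(axis=0))  # diagonal of rho_X recovers the other marginal

evals, evecs = np.linalg.eigh(rho_X)  # eigenvectors: the candidate "concepts"
print(evals)
print(evecs)
```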
To get ready for Tai-Danae's talk you can look at her thesis:
• Tai-Danae Bradley, At the Interface of Algebra and Statistics.
and/or watch this nontechnical video:
https://www.youtube.com/watch?v=wiadG3ywJIs
(I've updated the event on the calendar)
I'm looking forward to it! I watched a video of her giving a chalkboard talk at the MIT Category Seminar, and it was a very pleasant experience (except for the fact that the camera wasn't following her at all times). From the mathematics, to the format, to the pacing, to the very voice tonality and handwriting -- I enjoyed all of it.
Hi everyone! Looking forward to giving the talk soon. I want to share a heads up now—the work I’ll be describing involves ideas from lots of different places: linear algebra, basic probability, quantum physics, category theory, machine learning.... I could spend hours telling you about each topic alone! But we only have one hour, so I’ll have to omit lots of details and connections and move rather quickly in many places. (The content itself is really very simple—there’s just a lot of it.) I’m afraid this won’t do the math justice, so please, please forgive me if the talk is rough at times :) I’m going to try my best! I hope it will at least whet your appetite for the ideas. See you soon!
@Todd Trimble Thanks so much!
Starting soon...
post-talk message for tai-danae: for the record, i would be happy to hear your hour-long lecture on machine learning, ive been meaning to learn a bit about it :>
Dr Bradley, that was one of the most inspiring talks I've ever seen! Thank you so much. I'll be reading your thesis carefully.
I am wondering if one can use the same concepts, but in the rank-3+ case, without losing information, e.g. there exist some matrix products that one can reconstruct the exact distribution from.
Some of the references:
Simon's post on the nucleus of a profunctor: https://golem.ph.utexas.edu/category/2013/08/the_nucleus_of_a_profunctor_so.html
Ganter and Wille, Formal Concept Analysis: https://www.springer.com/gp/book/9783540627715
Davey and Priestley, Introduction to Lattices and Order: https://www.cambridge.org/core/books/introduction-to-lattices-and-order/946458CB6638AF86D85BA00F5787F4F4
All this work of Tai-Danae and others is building pressure on us to make the analogy between linear maps and profunctors more precise - to make them both examples of a common concept, for example.
there was also a thesis mentioned on fuzzy/formal concept/enriched category theory ... do you have a link?
Is the logic that you get from this entailment relation related to the quantum logic of Birkhoff and von Neumann which famously has no cut rule?
It's weird that there isn't a simple way to make C into an enriching category and look at vectors in C^X as functors.
Simon also has a great follow-up blog post on viewing truth values as an enriching category in the context of profunctor nuclei/formal concepts: https://golem.ph.utexas.edu/category/2014/02/galois_correspondences_and_enr.html
@Alexander Kurz Yes, here's a nice PhD thesis of Jonathan Elliot, former student of Simon. An investigation into "fuzzy" formal concepts with some nice background on enriched CT: http://etheses.whiterose.ac.uk/18342/
John Terilla said:
It's weird that there isn't a simple way to make C into an enriching category and look at vectors in C^X as functors.
You can think of C^X as functors from the discrete category X to the discrete category C, but this doesn't seem to buy one much: it's the usual embedding of Set into Cat.
@Tai-Danae Bradley Have you thought about running this algorithm on a natural language corpus?
Cole Comfort said:
Is the logic that you get from this entailment relation related to the quantum logic of Birkhoff and von Neumann which famously has no cut rule?
The latter is the logic where you have a vector space (e.g. Hilbert space) and you take the poset of subspaces (which becomes an orthomodular lattice for a finite-dimensional Hilbert space).
Are there any ideas from formal concept analysis that lend ideas to try out in the linear algebraic case? Like the algorithms for drawing a concept lattice, or leveraging some "symmetry" between objects and attributes?
@Manas Yes, although this particular algorithm wouldn't work very well on a natural language dataset. But that's the goal! There are people (cc @John Terilla) working on this.
Is there some characterization of the case over relations when eigenvector and formal concept fail to coincide?
@Alek Sherstyuk, I've sort of been thinking in the other direction: I'd like a way (e.g. a definition, for starters) for the linear algebra to pick out concepts in a way that's at least motivated by the formal concepts ideas. But I think you might be asking about using LA to compute formal concepts, or related ideas. Maybe it's possible! I'm not familiar with work in this direction.
@Manas "this algorithm" is which algorithm? .... I came late to the talk, maybe that is why I am missing sth ...
@Gershom Yes. I'm pulling from memory here, but I think you can show that formal concepts/eigenvectors will coincide if your relation can be written as a disjoint union of complete bipartite subgraphs. In that case, each component corresponds to a formal concept/eigenvector pair. Otherwise, it's easy to cook up an example where they don't agree. (E.g. add an edge between "purple" and "fruit" in the example graph from my talk.)
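A small numpy sanity check of this claim, using my own reconstruction of the example graph from the talk (fruit related to orange and grape, veggie related to purple): for a disjoint union of complete bipartite components, the indicator vectors of the components are eigenvectors of M^T M, and adding the edge between purple and fruit breaks this.

```python
import numpy as np

def is_eigenvector(A, v):
    # check whether A v is a scalar multiple of v (Rayleigh-quotient test)
    w = A @ v
    lam = (v @ w) / (v @ v)
    return np.allclose(w, lam * v)

# column order X = {orange, grape, purple}, row order Y = {fruit, veggie}
R = np.array([[1, 1, 0],     # fruit  ~ orange, grape
              [0, 0, 1]])    # veggie ~ purple
M = np.sqrt(R / R.sum())     # square roots of a uniform distribution on the edges

print(is_eigenvector(M.T @ M, np.array([1., 1., 0.])))   # {orange, grape}: True
print(is_eigenvector(M.T @ M, np.array([0., 0., 1.])))   # {purple}: True

R2 = R.copy(); R2[0, 2] = 1                               # add the purple-fruit edge
M2 = np.sqrt(R2 / R2.sum())
print(is_eigenvector(M2.T @ M2, np.array([0., 0., 1.])))  # no longer an eigenvector: False
```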
@Eric M Downes Thanks! Towards the end of the talk, I was considering probability distributions on an N-fold product of sets (not just N=2). I didn't say this explicitly, though. (Not sure if this answers the question.)
Sorry for the long question. This question wasn't appropriate for the seminar. I feel understanding this can help me gain a bit of insight, but I wouldn't be surprised if it's too tedious to answer so no worries.
Can someone help me count correctly the parameters/degrees of freedom that I can choose both for the general distribution and also for the reduced density matrices of the quantized version and see if they coincide?
For simplicity let |X| = 4 and |Y| = 3, and let p(x,y) be the probability distribution. Let ρ_X be the reduced density matrix which represents the quantum distribution marginalized over Y, so it gives the probabilities of X on the diagonal. Therefore this is a positive matrix of dimension 4. Similarly for ρ_Y, being a positive matrix of dimension 3. Note that I don't start from the formula that builds the density matrices out of the joint distribution, because then we don't have any degrees of freedom for the matrices.
My question is, given two positive matrices ρ_X and ρ_Y, which I can choose however I want, what constraints are there that limit my degrees of freedom, so that I have as many degrees of freedom as when choosing an arbitrary joint distribution p(x,y)? Or will I have more?
CLASSICAL DISTRIBUTION
A joint distribution on the 4 x 3 = 12 outcomes has 12 - 1 = 11 free parameters, since the probabilities must sum to 1.
QUANTUM CASE
A real symmetric positive ρ_X of dimension 4 with unit trace has 10 - 1 = 9 free parameters, and ρ_Y of dimension 3 has 6 - 1 = 5.
Currently the total is 14 for the quantum case. How would one impose that the spectrum of the two is the same? How many parameters -in terms of freedom to choose matrix elements- does that "eat up"? Also, I've only considered real numbers as matrix elements; if I consider complex entries then my number of parameters goes up, but somehow they must still combine into a single rank-1 density matrix on the composite tensor space, so maybe those extra degrees of freedom for the phases don't matter? I'm a bit confused.
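A quick arithmetic check of the counts above, under the question's own assumptions (real matrix entries, dimensions 4 and 3, unit trace):

```python
dim_X, dim_Y = 4, 3

classical = dim_X * dim_Y - 1          # 12 outcomes, probabilities sum to 1 -> 11

def sym_params(n):
    # independent entries of a real symmetric n x n matrix, minus the unit-trace constraint
    return n * (n + 1) // 2 - 1

quantum = sym_params(dim_X) + sym_params(dim_Y)   # 9 + 5

print(classical, quantum)              # 11 14
```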
Thank you Tai-Danae! As someone learning the subject I always find your explanations and illustrations very helpful. Excited to get your book on topology https://www.math3ma.com/blog/topology-book !! :nerd:
Thanks, that makes a lot of sense. I'm trying to think about what a "categorical" property would be that captures such things more abstractly, but coming up a bit blank. I'm not even sure I know how to say "complete relation" in standard categorical terms in Rel. There are plenty of non-complete bipartite graphs which compose with their transpose (both ways) to get a complete graph. Hrm, perhaps a complete graph is one such that f = f . f' ?
or rather a sum of them is, but then i guess that's what the point was to begin with.
@Gershom It could be nice to have a general categorical construction that subsumes FCs and eigenvectors as special cases. If there were such a construction, then it may be fine to know that FCs and eigenvectors don't coincide in some cases since they'd still be examples of a broader notion. This brings to mind what John said above:
All this work of Tai-Danae and others is building pressure on us to make the analogy between linear maps and profunctors more precise - to make them both examples of a common concept, for example.
(Sorry, I haven't figured out how quotes work...)
@Aaron Goodwin Thanks very much! (Enjoy the book!)
Quotes work by starting a line with > or, for long quotes, starting a line with ```quote, then a carriage return, then having lines of quoted stuff, then ending with a line containing just ```.
(It's almost impossible to explain this without Zulip thinking I want to quote some text!)
@Emanuel-Cristian Boghiu might (or might not) be happy to know that TeX works here if you put all math stuff in double dollar signs.
i think you need a newline after the first triple backtick
Ah, got it, thanks!
ok, so in the formal concept case, when we have a line that connects purple and fruit, then the formal concepts become the full set {g,o,p} <-> {f} and {p} <-> {f,v} right? or am i misunderstanding?
@John Baez Thanks! I hope it's more readable now.
Gershom, yes, that's correct.
Tai-Danae: ok, so what does the eigenvector approach yield in this case then?
@Tai-Danae Bradley Sadly, I missed your talk. I checked out your slides and was reminded of some other folks who got quite a lot of mileage by taking square roots of probabilities:
https://arxiv.org/pdf/1102.1994.pdf
https://arxiv.org/pdf/1404.6255.pdf
@Gershom Let M be the 2x3 matrix whose first row is [1/2, 1/2, 1/2] and whose second row is [0, 0, 1/2]. Notice that the 3x3 matrix M^T M doesn’t have [0,0,1] as an eigenvector, which we would hope it would if it were to recover the “purple” part of the formal concept {p, {f,v}}. Likewise, [1,0] is not an eigenvector of the 2x2 M M^T, which would correspond to “fruit” in the other formal concept, {{o, g, p}, {f}}.
Note: I’m viewing the set X={o, g, p} as a basis for the vector space C^X by sending colors to standard basis vectors: o=[1,0,0] and g=[0,1,0], etc. Similarly, f = [1,0] and v = [0,1]. Also, M is obtained from the joint distribution p on XxY, defined by p(x,y) = 1/4 if an edge joins x and y and 0 otherwise, by taking entrywise square roots, so each edge contributes an entry sqrt(1/4) = 1/2.
Here’s another interesting example. Start with the same 3-edge bipartite graph from the talk, but add an extra edge between g and v. There are three formal concepts: {{o,g}, {f}} and {{g,p}, {v}} and {{g}, {f,v}}. Now define a matrix M similarly as above: first row [1/2, 1/2, 0] and second row [0, 1/2, 1/2]. Then M^T M has only two eigenvectors with nonzero eigenvalue, and one of them is [1,0,-1]! I think the zero and the negative are really intriguing. What does negative purple mean? And is the zero a bit like quantum interference?
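A short numpy check of the two examples above (my own sketch; rows indexed by {fruit, veggie} and columns by {orange, grape, purple}, as in the messages):

```python
import numpy as np

# purple-fruit edge added: neither "purple" nor "fruit" is an eigenvector
M = np.array([[0.5, 0.5, 0.5],
              [0.0, 0.0, 0.5]])
print(M.T @ M @ np.array([0., 0., 1.]))   # not proportional to [0, 0, 1]
print(M @ M.T @ np.array([1., 0.]))       # not proportional to [1, 0]

# grape-veggie edge added instead: [1, 0, -1] shows up as an eigenvector
M2 = np.array([[0.5, 0.5, 0.0],
               [0.0, 0.5, 0.5]])
vals, vecs = np.linalg.eigh(M2.T @ M2)
print(vals)                                   # approximately 0, 1/4, 3/4
print(vecs[:, 1] / np.abs(vecs[:, 1]).max())  # proportional to [1, 0, -1]
```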
Thanks again, Tai-Danae, for a great talk! Later today I'll put the video on YouTube and announce that fact here.
so here's something kinda fun which the beginning of the talk reminded me of—
the standard embedding of Set into Rel doesn't preserve products, right? even tho you still have projections π₁ : A × B → A, π₂ : A × B → B, and you can still factor cones with functional relations as legs thru them, at least?
well, the reason it's no longer a product is exactly because of taking marginals destroying info :D
once you have relations more general than functions, you can have correlation that gets destroyed by taking the pair of the composition with π₁ and the composition with π₂
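A toy illustration of this point (my own example, not from the thread): two different relations from the one-point set into A x B can have the same composites with both projections, so the projections do not determine the pairing and A x B with these projections is not a categorical product in Rel.

```python
from itertools import product

A = B = {0, 1}

corr   = {(a, b) for (a, b) in product(A, B) if a == b}   # "correlated": {(0,0), (1,1)}
uncorr = set(product(A, B))                               # all four pairs

def marginal(R, i):
    # compose a subset of A x B (a relation out of the one-point set) with the i-th projection
    return {pair[i] for pair in R}

print(marginal(corr, 0) == marginal(uncorr, 0))   # True
print(marginal(corr, 1) == marginal(uncorr, 1))   # True
print(corr == uncorr)                             # False: the marginals forgot the correlation
```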
Thanks, John!
sarahzrf said:
the standard embedding of Set into Rel doesn't preserve products, right? even tho you still have projections π₁ : A × B → A, π₂ : A × B → B, and you can still factor cones with functional relations as legs thru them, at least?
The Cartesian product applied to the functions of sets in the span is taken to the Kronecker-style tensor product. Thus the diagonal map is no longer natural: the trace cannot be copied because it entangles both systems. This is like a toy version of the no-cloning theorem in Rel.
i don't really follow your description :sweat_smile:
Ok, so I tried to think this (the p -> f case) through by mapping relations into boolean matrices with the usual operations of the boolean semiring (i.e. "and" and "or" as * and +), since that way multiplication of matrices corresponds to the usual relational composition, right? In that case, I think the composite of the relation with its transpose (and the converse composite) are both such that all elements are 1? So your eigenvectors would only be the all-1 vectors themselves. This leads me to the following speculation -- working over the booleans, taking the eigenvectors amounts to giving the formal concepts of some sort of "bipartite completion" of some relation R.
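A tiny numpy check of the boolean-matrix reading above (my own sketch; the relation is the purple-fruit example, with fruit related to all three colors and veggie related only to purple):

```python
import numpy as np

R = np.array([[1, 1, 1],                  # fruit  ~ orange, grape, purple
              [0, 0, 1]], dtype=bool)     # veggie ~ purple

def bcompose(A, B):
    # boolean matrix product: OR over k of (A[i,k] AND B[k,j])
    return (A[:, :, None] & B[None, :, :]).any(axis=1)

print(bcompose(R.T, R).astype(int))   # 3x3 all-ones matrix
print(bcompose(R, R.T).astype(int))   # 2x2 all-ones matrix
```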
i worked out the diagram tho & i see why it's not natural
sarahzrf said:
i don't really follow your description :sweat_smile:
If the monoidal structure is compact closed, then it can't be the categorical product.
I agree btw that the -1 is really interesting -- it makes me wonder if it reflects something when we can consider relations as "directed" graphs where some arrows go one way and some the other
/me imagines some string diagrams
@Cole Comfort what exactly goes wrong?
this is not fully worked out, but i think the bipartite completion of some relation f should (using ' for transpose) be the fixed point of some sequence of compositions of: f . f' . f . f' . f in which case this whole gadget feels a bit coalgebraic?
@Tai-Danae Bradley's talk is now on YouTube here:
so maybe what the eigenvector gadget calculates is the fixed points of the "1 step partial completion", almost just by construction...
sarahzrf said:
Cole Comfort what exactly goes wrong?
Explicitly in FRel, the (co)graph of the diagonal function X -> X x X and the (co)graph of the unique map X -> 1 induce a commutative Frobenius algebra, and thus a self-dual compact closed structure. Notice that the diagonal map on a product X x Y in FRel should be decomposed into two diagonal maps, one on X and one on Y, so the diagonal isn't natural. If you draw the string diagrams for trying to copy the unit for the compact closed structure, then the proof comes down to the connectivity of the string diagrams. I think that the book Picturing Quantum Processes has a nice graphical proof under the name of the no-cloning theorem.
Gershom, yes orienting the arrows could be useful! What is a 'bipartite completion'?
Well a "bipartite completion" is this thing i just came up with :-) I was thinking about how one might take a relation that doesn't factor as a disjoint union of complete bipartite graphs into the "best" one that does, and that should somehow be like "connected components". So the idea is if you take a relation, and compose it with its transpose, then the relation again, then elements that are "one hop" apart become related. But elements that are "two hops" apart are still unrelated. So you do it again, and eventually you hit a fixpoint and that tells you that if you start at some point and go back and forth between the two sets as many times as you like, everything in that circuit now is in a single complete bipartite subgraph.
I'd think there's a universal property that goes with this, but I'm not sure what yet. It might be something in terms of the partial order on relations given by the partial order on subsets...
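A rough sketch of this "bipartite completion" as an iteration over boolean matrices (the function names and the choice to iterate R -> R . R' . R are my own reading of the description above): keep composing with the transpose until a fixed point; the result is a disjoint union of complete bipartite graphs.

```python
import numpy as np

def bcompose(A, B):
    # boolean matrix product: OR over k of (A[i,k] AND B[k,j])
    return (A[:, :, None] & B[None, :, :]).any(axis=1)

def bipartite_completion(R):
    R = R.astype(bool)
    while True:
        R_next = bcompose(bcompose(R, R.T), R)   # one "hop" back and forth
        if np.array_equal(R_next, R):
            return R
        R = R_next

# a "path" relation that is not a disjoint union of complete bipartite graphs...
R = np.array([[1, 1, 0],
              [0, 1, 1]])
print(bipartite_completion(R).astype(int))   # ...completes to the all-ones relation
```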
FWIW, in case anyone doesn't know, the structure induced on Rel by the cartesian product of sets has been studied abstractly as a cartesian bicategory.
Prof is listed both as an example and a non-example, the only apparent difference being the inclusion vs omission of "(small)" in front of "categories"
is that the thing making or breaking?
Alexander Kurz said:
Manas "this algorithm" is which algorithm? .... I came late to the talk, maybe that is why I am missing sth ...
The one used to model the even parity set (by reconstructing a joint probability from some samples). I understand that the density operator we try to approximate is generative for a specific length of sequence - was wondering how this could generalise to a natural language dataset
Cole Comfort said:
sarahzrf said:
Cole Comfort what exactly goes wrong?
Explicitly in FRel, the (co)graph of the diagonal function X -> X x X and the (co)graph of the unique map X -> 1 induce a commutative Frobenius algebra, and thus a self-dual compact closed structure. Notice that the diagonal map on a product X x Y in FRel should be decomposed into two diagonal maps, one on X and one on Y, so the diagonal isn't natural.
What's interesting in Rel is that every morphism R is still a lax comonoid homomorphism, so that Δ∘R ≤ (R x R)∘Δ and !∘R ≤ ! (where Δ and ! are the copying and deleting maps respectively). The behaviour of the Cartesian product in Rel-like categories is axiomatised in one of my favourite papers: Cartesian Bicategories I, by Carboni and Walters (if you read it, I recommend drawing string diagrams on the side to understand what's going on). @Jules Hedges and @Daniel Cicala had a nice blog post about it on the n-Category Café.
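A concrete check of the lax comonoid homomorphism condition, written with sets rather than matrices (my own sketch; the relation R below is an arbitrary fixed example):

```python
from itertools import product

X = range(2)
Y = range(2)
R = {(0, 0), (0, 1), (1, 1)}        # a relation X -> Y that is not a partial function

# Delta ∘ R : relate x to (y, y) whenever x R y
R_then_copy = {(x, (y, y)) for (x, y) in R}
# (R x R) ∘ Delta : relate x to (y1, y2) whenever x R y1 and x R y2
copy_then_RR = {(x, (y1, y2)) for x, y1, y2 in product(X, Y, Y)
                if (x, y1) in R and (x, y2) in R}

print(R_then_copy <= copy_then_RR)   # True: the lax inequality holds
print(copy_then_RR <= R_then_copy)   # False: copying is not natural for this R
```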
Oops, just read that @Mike Shulman already pointed out the link to Cartesian bicategories above.
Here is something related that I have been thinking about: the coproduct in Rel is the same as the coproduct in Set. To what extent does it hold for other colimit constructions? I.e. I have a guess that the pushout in Rel of two relations that are injective functions should be the same as the pushout in Set. Does this hold, and does it hold for other sorts of functions as well?
(note that the transpose of functions is typically a non-function, so the pullback we think should not exist does not. also, this gives a good argument why the categorical product must differ from set -- because of self-duality, it must coincide with the coproduct).
The functor from Set to Rel is a left adjoint, so it preserves all colimits.
sarahzrf said:
Prof is listed both as an example and a non-example, the only apparent difference being the inclusion vs omission of "(small)" in front of "categories"
If I remember, the definition originally given in Cartesian Bicategories 1 requires it to be locally posetal and leaves working out the "real" definition for future work. Pretty sure the full definition has been written down now, but I don't remember where
Jules Hedges said:
sarahzrf said:
Prof is listed both as an example and a non-example, the only apparent difference being the inclusion vs omission of "(small)" in front of "categories"
If I remember, the definition originally given in Cartesian Bicategories 1 requires it to be locally posetal and leaves working out the "real" definition for future work. Pretty sure the full definition has been written down now, but I don't remember where
In the suggestively-named Cartesian Bicategories II.
Oscar Cunningham said:
The functor from Set to Rel is a left adjoint, so it preserves all colimits.
indeed, it's the left adjoint of the Kleisli adjunction associated to the covariant powerset monad! (up to equivalence)
one might even say that it's a comonadic functor if that word weren't already taken :thinking:
This would all be easier to think about if people would stop calling this category and start calling it .
sarahzrf said:
one might even say that it's a comonadic functor if that word weren't already taken :thinking:
"Opmonadic".
pfft i considered posting that
is that an actual term people use??
sarahzrf said:
Prof is listed both as an example and a non-example, the only apparent difference being the inclusion vs omission of "(small)" in front of "categories"
Haha, that must be a mistake. I bet someone thought it was and someone else thought it wasn't, and whoever made the second edit didn't realize that they were contradicting the first. I think it is (except in the case of noncartesian enrichment mentioned explicitly as a non-example); smallness shouldn't matter.
sarahzrf said:
is that an actual term people use??
I've never heard it used. But it's consistent with other 2-categorical terminology: the Kleisli object of a monad in a 2-category K is its Eilenberg-Moore object when regarded as a monad in K^op (reverse the 1-cells but not the 2-cells), while the EM-object of a comonad in K is its EM-object when regarded as a monad in K^co (reverse the 2-cells but not the 1-cells).
Mike Shulman said:
sarahzrf said:
Prof is listed both as an example and a non-example, the only apparent difference being the inclusion vs omission of "(small)" in front of "categories"
Haha, that must be a mistake. I bet someone thought it was and someone else thought it wasn't, and whoever made the second edit didn't realize that they were contradicting the first. I think it is (except in the case of noncartesian enrichment mentioned explicitly as a non-example); smallness shouldn't matter.
Maybe it is because Prof is not locally posetal, which is sometimes a requirement.
hmm, but it's still the right adjoint of the kleisli adjunction which would be "opmonadic" right
@Tai-Danae Bradley again thinking of the formal concept question, we can consider the multiplication of a matrix by its transpose as giving the transition matrix of a Markov process, perhaps. (some renormalization may be necessary?). in such a case, rather than looking directly at the eigenvectors of this, we then might want to look at properties of the stationary distribution of such a matrix, or perhaps the nonunique stationary distributions? (note in particular that the disjoint bipartite condition is related to whether the matrix resulting from multiplication by a transpose is strongly connected or not).
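One way to make the Markov reading concrete (my own sketch; row-normalizing M^T M is just one choice of renormalization): for the 3-edge example the resulting chain is block diagonal, and power iteration from different starting points finds one stationary distribution per complete bipartite component.

```python
import numpy as np

R = np.array([[1, 1, 0],
              [0, 0, 1]])                 # fruit ~ orange, grape; veggie ~ purple
M = np.sqrt(R / R.sum())

A = M.T @ M
P = A / A.sum(axis=1, keepdims=True)      # row-stochastic transition matrix on the colors
print(P)                                  # block diagonal: one block per bipartite component

# power iteration from two different starting distributions gives two
# different stationary distributions, one per component
for start in (np.array([1., 0., 0.]), np.array([0., 0., 1.])):
    pi = start
    for _ in range(50):
        pi = pi @ P
    print(pi)                             # [0.5, 0.5, 0] and [0, 0, 1]
```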
sarahzrf said:
hmm, but it's still the right adjoint of the kleisli adjunction which would be "opmonadic" right
Well, it's the right adjoint in the EM-adjunction that's "monadic", and right and left adjoints get interchanged by passing to K^op, so I think it would be the left adjoint in the Kleisli adjunction that's "opmonadic".
oh huh i thought adjunctions got flipped when you go to ^co
or is it both
i should probably try actually working with bicategories a bit sometime... <.<
@Gershom Thanks for explaining the bipartite completion and the new Markov idea. I especially like that thought of looking at fixed points under iterations. Something for me to think about!
@sarahzrf It's both. An adjunction f ⊣ g in K has a unit 1 ⇒ gf and a counit fg ⇒ 1. If you flip the direction of 2-cells, then the unit and counit become respectively a counit and unit for an adjunction g ⊣ f in K^co. But if you flip the direction of 1-cells, then the unit and counit stay a unit and counit, now displaying an adjunction g ⊣ f in K^op. (Hence, the direction of adjunctions is preserved in the double-dual K^coop.)
Just watched the talk today. It's such an interesting talk! I'll need to take some time to think about this.