Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.


Stream: deprecated: our papers

Topic: fibrational linguistics


Fabrizio Genovese (Jan 05 2022 at 12:46):

We just did this: https://arxiv.org/abs/2201.01136v1
A handwavy explanation is here: https://twitter.com/fabgenovese/status/1478696634628423680?s=20
It's basically "using some basic facts about discrete opfibrations to talk about language". There's more coming. Let me know if you have questions!

STORY TIME! Ok, so this work, in collaboration with @ququ7 and @puca_caterina just came out. I consider one of the best things I did so far, if not _the_ best, and it's just the tip of the iceberg! Let me explain very briefly what we tried to do. 1/n https://twitter.com/arXiv_math_CT/status/1478670528474923008

- Fabrizio Romano Genovese (@fabgenovese)

John Baez (Jan 05 2022 at 16:08):

Nice! Small typo in abstract: "indented" should be "intended".

fosco (Jan 05 2022 at 18:25):

Good catch, John. Thanks (if you find other misprints, let us know)

Apart from this, let me add that we are really interested in getting some constructive (or destructive) feedback on this proposal, and we have a few other ideas that we will put forward soon (hopefully!).
Fabrizio already did a wonderful job in explaining the gist of this first paper, to the point I actually understood something about our paper in the actual twitter thread.

I would like to add something about what stands behind this project and what exactly motivated me/us to join forces. For as long as I can remember I have had a crush on linguistics, but I didn't know why until I met Lambek's "The mathematics of sentence structure". I was very young at the time, and I couldn't really appreciate the technicalities or how seminal that work has been for syntax.

I find the overall idea in Lambek's work akin to the famous Aristotelian rebuttal: those who despise philosophy, arguing that it is useless, have to do it through philosophical debate. Similarly, whoever tries to deny the importance of category theory in everyday life has to do it using a category: the language they are speaking. Resistance to adopting categorical thinking is, hence, utterly futile.

Another life-changing meeting was with real linguistics books on real comparative linguistics. I do not claim (and I will never claim) I have any sort of command of them, but I had at least a glimpse at how the grammar of proto-indoeuropean was reconstructed along the last century, and I firmly believe there is something mathematical going on under the surface of the various backtracking techniques that early sanskritists applied to reconstruct the grammar and the morphology of PIE (not necessarily category-theoretic; I have ideas I could talk about for hours, that I'm cooking together with others, and that although very naive might gain some momentum in the hands of a true linguist... shoutout to @Paolo Brasolin and to banyan).

What I have never understood is that, despite the tremendous influence that Lambek's work has had in the last 60+ years, everyone seemed to care only about a categorical foundation for languages as syntactical objects. No one seemed to care about the simple banality of life that real languages are subject to a dynamic: they change over time and under use. Category theory has a lot to say about this, but you have to move from proof theory, syntactic presentations of categories, etc. to the realm of topos theory, fibrations, sheaf theory (...some consonant shifts in the passage PIE -> Sanskrit really smell like local-to-global phenomena, and this smells like obstructions to extending cohomology classes, but I will stop here to avoid ridiculing myself).

So, all in all, in my opinion, the Holy Grail of mathematical linguistics is a categorical foundation not only for the syntax, and not only for (part of) the semantics of a language L, but a foundation for the process of interaction and collective construction of a shared deductive system, in which "speakers evaluate terms as computers running in cluster", with the goal to extend the expressiveness of L. But what is expressiveness, categorically? How is the morphological complexity of a language linked to some invariant of the category that L presents?

How can I make sense of the fact that language acquisition is based on a "pareidolic" phenomenon? Given a non-grammatical sentence that is "close" to a wff in a language L, the brain interpolates mistakes and fills holes in order to attribute meaning to the string. Communication is effective even for a not perfectly proficient speaker, at the cost of some ambiguity: "hungry now cat" probably means "the cat is hungry now", or "I'm hungry, hence now I'll eat this cat"; this ambiguity, however, is highly contextual (enter a taxi in Berlin and shout "Flughafen!"; now enter a second taxi and shout "Apfel!": the outcome will likely be different).

So. This is to state a second banality of life: that the process of language modification is collective, and modifies syntax in order to attain a goal. In first approximation, this goal is: gain a better understanding of the semantics, or rather of the perceptive bundle that results from experiencing the world (be it physically perceived or imagined), and thus knits together very tightly said syntax and semantics. Can this be explained categorically, given that it is a subtly game-theoretic phenomenon (so, shoutout to @Jules Hedges here)?

Really, I could go on for days talking about this.

Fabrizio Genovese (Jan 05 2022 at 18:48):

To add to what @fosco said, this is why we focused on vocabulary acquisition in this first paper. We need to describe very basic concepts (what is an explanation? What does it mean to "understand it?" etc) to be able to scale up and model the dynamics of language evolution.

Matteo Capucci (he/him) (Jan 05 2022 at 20:08):

fosco said:

I had at least a glimpse at how the grammar of proto-indoeuropean was reconstructed along the last century, and I firmly believe there is something mathematical going on under the surface of the various backtracking techniques that early sanskritists applied to reconstruct the grammar and the morphology of PIE (not necessarily category-theoretic; I have ideas I could talk about for hours, that I'm cooking together with others, and that although very naive might gain some momentum in the hands of a true linguist... shoutout to Paolo Brasolin and to banyan).

I'd love to hear about this!

fosco said:

So. This is to state a second banality of life: that the process of language modification is collective, and modifies syntax in order to attain a goal. In first approximation, this goal is: gain a better understanding of the semantics, or rather of the perceptive bundle that results from experiencing the world (be it physically perceived or imagined), and thus knits together very tightly said syntax and semantics. Can this be explained categorically, given that it is a subtly game-theoretic phenomenon (so, shoutout to Jules Hedges here)?

First a random question to make you cringe: is this semiotics? Walking from one pub to another some weeks ago, I formulated the conjecture that semiotics is that part of the Curry-Howard-Lambek correspondence that sends a category (meaning) to its internal language (syntax).

Secondly, I've also been wondering about how the internal world/language/model of an agent (or better still, its evolution) could be categorically embodied, for cybercats purposes. I quickly converged on a categorical model of the cybernetic loop which is still quite rough but quite appealing. I think it'd be straightforward to apply to fibrational linguistics, since it's based on alternating cycles of Kan extensions and Kan lifts.

Tim Hosgood (Jan 05 2022 at 21:54):

i'm really looking forward to reading this paper! i'm secretly hoping that it might agree with some thoughts that i've had about language in terms of fibrations, but will be happy to read it even if it shows me that i was wrong!

Tim Hosgood (Jan 05 2022 at 21:55):

I once wrote

One final thing that came to me yesterday is that, given this fibration point of view, translation seems more like the change of trivialisation of a bundle: translation from Japanese to Russian, say, is given by $\pi^{-1}_{\mathsf{Ru}}\circ\pi_{\mathsf{Ja}}$, and all our fibres (i.e. all our languages) are glued together along entries living over the same point, so things really kind of do look like a bundle of some sort.

and would love to be able to make this formal one day
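
One naive way to read the formula in the quoted passage, as a sketch only: treat each language as a set of words lying over a shared set of concepts, with each projection sending a word to its concept. The lexicons and concept labels below are invented for illustration. Since a projection is rarely invertible, the inverse is read here as "take the fibre", so translation returns a set of candidates rather than a single word.

```python
# Hypothetical toy lexicons: each projection sends a word to the concept it lies over.
pi_ja = {"neko": "CAT", "inu": "DOG", "ringo": "APPLE"}
pi_ru = {"kot": "CAT", "koshka": "CAT", "sobaka": "DOG"}


def fibre(pi, concept):
    """All words of one language lying over a given concept (the preimage under pi)."""
    return {word for word, c in pi.items() if c == concept}


def translate(word, pi_src, pi_tgt):
    """Go down to the concept with pi_src, then back up into the target fibre."""
    return fibre(pi_tgt, pi_src[word])


print(translate("neko", pi_ja, pi_ru))   # the two Russian words over CAT
print(translate("ringo", pi_ja, pi_ru))  # empty: no Russian entry over APPLE in this toy data
```

The set-valued result is a design choice of this sketch: it makes visible where a formal version would have to say what "inverse" means when fibres have more than one element, or none.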

Rich Hilliard (Jan 06 2022 at 00:36):

Looking forward to reading the whole paper and discussing further.
Meantime, a couple of minor typos:

In ref [BBR87], 2nd author should be Bob BERWICK (not Berrywick).
G.E. Barton, R.C. Berwick, and E.S. Ristad, Computational complexity and natural language.

In Definition 3.1 "every morphism f : pE -> C" missing an '#'?

I love the talking fish!

Fabrizio Genovese (Jan 06 2022 at 00:59):

Matteo Capucci (he/him) said:

fosco said:

I had at least a glimpse at how the grammar of proto-indoeuropean was reconstructed along the last century, and I firmly believe there is something mathematical going on under the surface of the various backtracking techniques that early sanskritists applied to reconstruct the grammar and the morphology of PIE (not necessarily category-theoretic; I have ideas I could talk about for hours, that I'm cooking together with others, and that although very naive might gain some momentum in the hands of a true linguist... shoutout to Paolo Brasolin and to banyan).

I'd love to hear about this!

fosco said:

So. This is to state a second banality of life: that the process of language modification is collective, and modifies syntax in order to attain a goal. In first approximation, this goal is: gain a better understanding of the semantics, or rather of the perceptive bundle that results from experiencing the world (be it physically perceived or imagined), and thus knits together very tightly said syntax and semantics. Can this be explained categorically, given that it is a subtly game-theoretic phenomenon (so, shoutout to Jules Hedges here)?

First a random question to make you cringe: is this semiotics? Walking from one pub to another some weeks ago, I formulated the conjecture that semiotics is that part of the Curry-Howard-Lambek correspondence that sends a category (meaning) to its internal language (syntax).

Secondly, I've also been wondering about how the internal world/language/model of an agent (or better still, its evolution) could be categorically embodied, for cybercats purposes. I quickly converged on a categorical model of the cybernetic loop which is still quite rough but quite appealing. I think it'd be straightforward to apply to fibrational linguistics, since it's based on alternating cycles of Kan extensions and Kan lifts.

We were thinking about iterated loops of adjunctions, which I guess is similar enough... :stuck_out_tongue:

Fabrizio Genovese (Jan 06 2022 at 01:00):

Tim Hosgood said:

I once wrote

One final thing that came to me yesterday is that, given this fibration point of view, translation seems more like the change of trivialisation of a bundle: translation from Japanese to Russian, say, is given by $\pi^{-1}_{\mathsf{Ru}}\circ\pi_{\mathsf{Ja}}$, and all our fibres (i.e. all our languages) are glued together along entries living over the same point, so things really kind of do look like a bundle of some sort.

and would love to be able to make this formal one day

I have a bit of difficulty parsing this. What is $\pi$ here? The fibration functor?

Fabrizio Genovese (Jan 06 2022 at 01:01):

Rich Hilliard said:

Looking forward to reading the whole paper and discussing further.
Meantime, a couple of minor typos:

In ref [BBR87], 2nd author should be Bob BERWICK (not Berrywick).
G.E. Barton, R.C. Berwick, and E.S. Ristad, Computational complexity and natural language.

In Definition 3.1 "every morphism f : pE -> C" missing an '#'?

I love the talking fish!

Thanks! We changed notation 213328 times so yes, that's a missing # :frown:

Robin Piedeleu (Jan 06 2022 at 10:41):

Looks like really cool work!

I skimmed the paper briefly because I wanted to understand in what sense language forms a category, according to your framework. I should probably read further, but I also thought I'd ask a naive question here if you don't mind: what are the objects and morphisms of the language category supposed to represent in general? Is it like in Lambek's approach, where objects are concatenations of basic grammatical types and morphisms are derivations? Since your setting aims for greater generality, what do you mean by compositionality here? What is the interpretation of composition in the language category?

Fabrizio Genovese (Jan 06 2022 at 10:59):

I think Lambek's "pure" pregroup approach is too narrow to be used in our work. After all, this is also true for DisCoCat, if you remember that paper by Preller where she proves that functors from pregroups to vector spaces can send every type to a vector space of dimension at most 1.

Fabrizio Genovese (Jan 06 2022 at 11:00):

We opted for the solution Preller and Lambek concocted. In the pregroup-related examples we actually used free compact closed categories generated by a dictionary. So our pregroup objects are pairs of the form $(\text{word},\texttt{type})$.

Fabrizio Genovese (Jan 06 2022 at 11:01):

Then composition is interpreted as usual in Lambek's approach. The fibers over every object represent the possible meanings that word/sentence can have in the speaker's head. But our framework is strictly more general than this in that we do not make strict hypotheses on the language category $\mathcal{L}$.
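
The standard correspondence between discrete opfibrations over a category and set-valued functors on it (via the category of elements) gives one concrete way to picture "fibers as possible meanings". What follows is a minimal sketch under loud assumptions: the two-object language category, the meanings, and all names are invented, only generating morphisms are handled (no identities or composites), and none of it is taken from the paper.

```python
# Hypothetical toy language category: two objects and one generating morphism.
morphisms = {"add_adjective": ("cat", "hungry cat")}  # name -> (source, target)

# A set-valued functor on generators: a set of meanings per object,
# and a function between meaning sets per morphism. All names are invented.
meanings = {
    "cat": {"pet", "jazz_fan"},
    "hungry cat": {"pet_wanting_food", "jazz_fan_wanting_food"},
}
transport = {
    "add_adjective": {
        "pet": "pet_wanting_food",
        "jazz_fan": "jazz_fan_wanting_food",
    },
}


def category_of_elements(meanings, morphisms, transport):
    """Objects are pairs (x, a) with a a meaning of x; arrows lift each generator."""
    objs = [(x, a) for x in meanings for a in sorted(meanings[x])]
    arrows = [
        ((src, a), name, (tgt, transport[name][a]))
        for name, (src, tgt) in morphisms.items()
        for a in sorted(meanings[src])
    ]
    return objs, arrows


def fibre(objs, x):
    """The fibre of the projection (x, a) -> x over the object x."""
    return [e for e in objs if e[0] == x]


def has_unique_lifts(objs, arrows, morphisms):
    """Discrete opfibration condition, checked on generators only:
    each f: p(e) -> y must have exactly one lift with source e."""
    for e in objs:
        for name, (src, _tgt) in morphisms.items():
            if src == e[0]:
                lifts = [a for a in arrows if a[0] == e and a[1] == name]
                if len(lifts) != 1:
                    return False
    return True


objs, arrows = category_of_elements(meanings, morphisms, transport)
print(fibre(objs, "cat"))                         # the meanings lying over "cat"
print(has_unique_lifts(objs, arrows, morphisms))  # unique lifting holds by construction
```

Building the total category from the functor makes unique lifting automatic; the check is only there to show what the condition asks for.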

Fabrizio Genovese (Jan 06 2022 at 11:03):

I think results like Theorem 3.3 are great because they say that "it doesn't matter how meanings are organized in your head and how language is organized categorically, you can always reshuffle the meanings in your head to match the language structure"

Fabrizio Genovese (Jan 06 2022 at 11:07):

In this respect, the discussion after Def. 3.9 is insightful. We postulate that language tends not to be compositional for native speakers. That is, you have committed so many sentences and pre-formed bits of sentences to memory that the amount of compositionality you need to use to express concepts in your language is minimal and effortless. So you directly use the functor $\mathcal{D}^p \xrightarrow{p} \mathcal{L}$.
On the contrary, suppose you are learning a new language. Now you don't have direct access to $\mathcal{D}^p \xrightarrow{p} \mathcal{L}$. Instead, you have to go through $\mathcal{D}^p \xrightarrow{s} \mathcal{E}^p \xrightarrow{p^\sharp} \mathcal{L}$. Categorically, these are the same via Thm 3.3. But computationally, $s$ may be very hard to compute.

Fabrizio Genovese (Jan 06 2022 at 11:09):

For us, this is a conceptually formal way to capture the idea that when learning a new language, you really have to use a lot of compositionality and "arrange your thoughts in a way that conforms to language structure". This may also give a semi-decent explanation of why learning languages close to the ones you already know is easier than learning languages from different linguistic families. The farther away the language is, the more you need to compute $s$ for everything you want to say. :smile:

Fabrizio Genovese (Jan 06 2022 at 11:17):

In any case I'm here on Twitter Spaces now if you want to ask questions! https://twitter.com/i/spaces/1gqxvlDVbPOGB

Robin Piedeleu (Jan 06 2022 at 12:50):

Thanks a lot, Fabrizio!

My question was even more basic, I think: what's a morphism $f: X\to Y$ in the linguistic category $\mathcal{L}$? If I understand your answer correctly, in the general setting, objects of $\mathcal{L}$ are phrases/words/morphemes/whatever linguistic unit you care about, and morphisms are ways of putting them together, or something along these lines? Sorry if this is obvious, but I'm just trying to understand what's assumed about language to turn it into a category and what the categorical structure means in linguistic terms.

Robin Piedeleu (Jan 06 2022 at 12:51):

Fabrizio Genovese said:

In any case I'm here on Twitter Spaces now if you want to ask questions! https://twitter.com/i/spaces/1gqxvlDVbPOGB

Sorry, it seems I've missed that!

Fabrizio Genovese (Jan 06 2022 at 14:39):

Robin Piedeleu said:

Fabrizio Genovese said:

In any case I'm here on Twitter Spaces now if you want to ask questions! https://twitter.com/i/spaces/1gqxvlDVbPOGB

Sorry, it seems I've missed that!

There's another one going on now! :grinning:

Fabrizio Genovese (Jan 06 2022 at 14:40):

Robin Piedeleu said:

Thanks a lot, Fabrizio!

My question was even more basic, I think: what's a morphism $f: X\to Y$ in the linguistic category $\mathcal{L}$? If I understand your answer correctly, in the general setting, objects of $\mathcal{L}$ are phrases/words/morphemes/whatever linguistic unit you care about, and morphisms are ways of putting them together, or something along these lines? Sorry if this is obvious, but I'm just trying to understand what's assumed about language to turn it into a category and what the categorical structure means in linguistic terms.

Yes, we had this idea in mind, but I think things work out also if words are morphisms somehow. I have to think about it a bit :smile: