Category Theory
Zulip Server
Archive

You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive, refer to the same person.


Stream: theory: applied category theory

Topic: QNLP


view this post on Zulip Alexis Toumi (May 07 2020 at 17:23):

For reference in case you missed Bob's online QNLP seminar, here's a copy of the chat with some useful question-answers and references:

From Emily Riehl to Everyone: (06:02 pm)

I have to say I’m very impressed with the theatrics.

From Tim Sears to Everyone: (06:10 pm)

This is the coolest seminar ever.

From Me to Everyone: (06:11 pm)

the shady lighting adds so much to the scenery

From Samuel Tenka to Everyone: (06:12 pm)

question: are these states/effects normalized?

From Me to Everyone: (06:12 pm)

They’re not normalised!

From Samuel Tenka to Everyone: (06:13 pm)

thanks! So we keep track of a bit of extra information in our diagrams. good to know :-)

From Me to Everyone: (06:13 pm)

Scalars would be represented as closed boxes with no input or output wires
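
As a minimal numpy sketch of that remark (illustrative only, not the DisCoPy API; the vectors are made-up examples): a state is a box with only an output wire, an effect a box with only an input wire, and plugging them together leaves a closed box with no wires, i.e. a scalar.

    import numpy as np

    state = np.array([3.0, 4.0])    # unnormalised ket: a box with one output wire
    effect = np.array([1.0, 0.0])   # unnormalised bra: a box with one input wire
    scalar = effect @ state         # plugging them together leaves no wires: just a number
    print(scalar)                   # 3.0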

From Samuel Tenka to Everyone: (06:13 pm)

gotcha. makes sense. so right now these are equivalent to penrose string diagrams?

From Me to Everyone: (06:14 pm)

Yes, they are the same Penrose string diagrams!
(N.B. You need to assume finite dimensional Hilbert spaces if you want a compact-closed category.)

From Samuel Tenka to Everyone: (06:17 pm)

otherwise the cup (special state) won't be well defined?

From Priyaa to Everyone: (06:18 pm)

yes, one would get an infinite sum otherwise

From Me to Everyone: (06:18 pm)

Exactly! But in quantum computing you only consider a finite number of qubits anyway.
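
A hedged numpy sketch of why finite dimension matters here (illustrative, not the DisCoPy implementation): the cup is the unnormalised state sum_i |i>|i>, a finite sum only when the space is finite-dimensional, and together with the matching cap it satisfies the snake equation.

    import numpy as np

    d = 2                      # any finite dimension works; the sum below has d terms
    cup = np.eye(d)            # components of the state sum_i |i>|i>
    cap = np.eye(d)            # the matching effect
    # Snake equation: bending a wire through a cup and a cap is the identity wire.
    snake = np.einsum('ij,jk->ik', cup, cap)
    assert np.allclose(snake, np.eye(d))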

From David I Spivak to Everyone: (06:21 pm)

So is the cat alive or dead right now?

From Samuel Tenka to Everyone: (06:21 pm)

XD

From Tim Sears to Everyone: (06:25 pm)

Is this the best paper for further reading? https://arxiv.org/pdf/1904.03478.pdf

From Brendan to Everyone: (06:27 pm)

Bob listed these as references for this talk in the abstract:

From Brendan to Everyone: (06:27 pm)

[1] B. Coecke, G. De Felice, K. Meichanetzidis & A. Toumi (2020) Quantum Natural Language Processing: We did it! https://medium.com/cambridge-quantum-computing/quantum-natural-language-processing-748d6f27b31d.

[2] B. Coecke (2016) From quantum foundations via natural language meaning to a theory of everything. arXiv:1602.07618.

From Me to Everyone: (06:28 pm)

Yes, that paper gives some strong intuition for the QNLP experiments we did. For the more technical / categorical details we just submitted this tool paper: https://arxiv.org/abs/2005.02975

From Brendan to Everyone: (06:28 pm)

He also mentioned at the beginning of the talk there’s a new paper on the arXiv today, I think, but I can’t find it

From theprotoncat to Everyone: (06:28 pm)

The arXiv one is the tools paper Alexis mentioned, I assume

From Brendan to Everyone: (06:29 pm)

Yes it is; thanks Alexis

From joscha to Everyone: (06:29 pm)

Are the outputs/inputs ‘labelled’ to know which ones are allowed to match?

From joscha to Everyone: (06:30 pm)

(noun going out, noun accepted, ..)

From Me to Everyone: (06:30 pm)

Yes! The blog post [1] describes the experiments for a general audience, and the paper [2] gives a nice compositional story from physics to NLP
@joscha the basic types are the labels

From joscha to Everyone: (06:32 pm)

thanks

From Konstantinos Meichanetzidis to Everyone: (06:33 pm)

In general, each type-wire carries a vector space of a type-dependent dimension

From theprotoncat to Everyone: (06:34 pm)

I assume in the quantum case they’re all Hilbert spaces of different dimensions?

From Konstantinos Meichanetzidis to Everyone: (06:34 pm)

so after you join the wires according to how the types compose, you forget the labels and are left with just some vector spaces
yes, that’s right: in the quantum case, if you have qubits, they are Hilbert spaces of dimension some power of two
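
To make the "join the wires, then forget the labels" step concrete, here is a hedged numpy sketch; the dimensions and the random tensors are illustrative assumptions, not the experiment's data.

    import numpy as np

    noun_dim, sentence_dim = 2, 2               # e.g. one qubit per basic type
    alice = np.random.rand(noun_dim)            # state on a noun wire
    bob = np.random.rand(noun_dim)
    loves = np.random.rand(noun_dim, sentence_dim, noun_dim)  # pregroup type n^r . s . n^l
    # Cups join the noun wires as the grammar dictates; the type labels are then forgotten.
    sentence = np.einsum('i,isj,j->s', alice, loves, bob)
    print(sentence)                             # a vector in the sentence space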

From quaang to Everyone: (06:36 pm)

To deal with different dimensions in diagrams, you could use a framework which I call qufinite ZX-calculus.

From Konstantinos Meichanetzidis to Everyone: (06:37 pm)

yes!

From Me to Everyone: (06:53 pm)

Here’s the repo if you want to play with it: https://github.com/oxford-quantum-group/discopy

From Samuel Tenka to Everyone: (06:54 pm)

You should share that on the Zulip, too!

From Me to Everyone: (06:54 pm)

True!

From Samuel Tenka to Everyone: (06:57 pm)

Question: what computation did the IBM computer actually do? (e.g. did it predict the next word in a sentence? or, toward learning grammar, did it figure out the arity of various words?)

From Lee J. O'Riordan to Everyone: (06:59 pm)

What kind of circuit depths do you hit for these on both machines?

From Me to Everyone: (07:00 pm)

It solved a basic question-answering task, i.e. given “Does Alice love Bob?” it said “Yes!”

From Giovanni de Felice to Everyone: (07:00 pm)

Check out the circuits here:

From Giovanni de Felice to Everyone: (07:00 pm)

https://github.com/oxford-quantum-group/discopy/blob/master/notebooks/qnlp-experiment.ipynb

From Me to Everyone: (07:01 pm)

The circuits had CNOT depth two
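
For a feel of what CNOT depth two means, here is a hedged Qiskit sketch of a generic parametrised circuit (not the ansatz actually used in the experiment): single-qubit rotations carry the word parameters and only two layers of CNOTs entangle the wires.

    from qiskit import QuantumCircuit

    qc = QuantumCircuit(3)
    qc.ry(0.3, 0)      # made-up word parameters
    qc.ry(1.2, 1)
    qc.ry(0.7, 2)
    qc.cx(0, 1)        # first CNOT layer
    qc.cx(1, 2)        # second CNOT layer (shares qubit 1), so CNOT depth is two
    print(qc)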

From Yutaka Shikano to Everyone: (07:01 pm)

Question: How many qubits can we use at most? Can we compile for machines with more than 5 qubits, such as 16, 20, or 53 qubits?

From Lee J. O'Riordan to Everyone: (07:01 pm)

Cheers Alexis.

From Giovanni de Felice to Everyone: (07:01 pm)

Depth is around 8 to 10 before the t|ket> compiler

From Samuel Tenka to Everyone: (07:01 pm)

Neat! Near the beginning, Bob mentioned that this framework was able to “understand grammar”. Does this mean that you didn’t have to hardcode the circuit topology? E.g., how did you figure out that Alice is a noun, etc.? (Classical NLP approaches can do this by scanning over a large corpus.)

From Me to Everyone: (07:01 pm)

Depth 2 for “Alice loves Bob”, depth 3 for “Alice loves Bob who is rich”

From Stephen Mell to Everyone: (07:01 pm)

Do these need to be done on a real quantum computer, or can they be done in simulation (and if so, can they be done at larger scale)?

From Tim Sears to Everyone: (07:02 pm)

Same question as Stephen

From Me to Everyone: (07:02 pm)

They were done by simulation beforehand. For now it was only a proof of concept, but it can and will be scaled up.
Basically we used 1 qubit to represent a noun; in general we can use an n-qubit noun space

From Tim Sears to Everyone: (07:03 pm)

Would be nice to compare to word2vec or some latter day NN’s like BERT.

From Me to Everyone: (07:04 pm)

@Samuel the idea is that we took the grammatical structure of the sentence (which can be computed efficiently) and mapped it automatically (with a monoidal functor, actually) to the architecture of the circuits
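
A toy sketch of what such a functor does (the qubit counts, angles and ansatz below are illustrative assumptions, not the paper's): on objects, each basic pregroup type goes to a register of qubits; on words, to a parametrised state whose angles are the trainable parameters.

    import numpy as np

    qubits_per_type = {'n': 1, 's': 1}   # toy object part: one qubit per basic type

    def ry(theta):
        """Single-qubit Y-rotation, the kind of parametrised gate the functor inserts."""
        c, s = np.cos(theta / 2), np.sin(theta / 2)
        return np.array([[c, -s], [s, c]])

    def noun_state(theta):
        """Word part on a noun: prepare |0> on one qubit and rotate by a trainable angle."""
        return ry(theta) @ np.array([1.0, 0.0])

    alice, bob = noun_state(0.3), noun_state(1.2)   # made-up angles; these get trained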

From Konstantinos Meichanetzidis to Everyone: (07:05 pm)

Indeed. We are also aware of word2ket! But the priority is scaling up, as Alexis said. Also yes, we should compare with other models, but we should compare fairly:
i.e. compare models that have the same order of magnitude of parameters

From Yutaka Shikano to Everyone: (07:05 pm)

Similar question on qubit size or real-device usage:
How many qubits do we need to go beyond the proof-of-concept stage?

From Tim Sears to Everyone: (07:06 pm)

You have a richer, but more compact and interpretable model in the ML sense.

From Giovanni de Felice to Everyone: (07:06 pm)

@Yutaka Using around 50 qubits would match the Google supremacy experiment

From Tim Sears to Everyone: (07:07 pm)

Can you do a large scale version of this on a classical computer?

From theprotoncat to Everyone: (07:07 pm)

I have a slight side-quest question: could ZX-calculus be used to translate efficiently between digital and analog QC? Particularly when we are stuck with some specific native gates on some device, say an Ising Hamiltonian is the only entangling gate. It would involve trotterization I guess, but is there any intuition for looking into it?

From Giovanni de Felice to Everyone: (07:07 pm)

The dimension of the space grows exponentially in the number of qubits, so it would be very expensive after 30 qubits or so
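
To put a rough number on that, a back-of-the-envelope sketch of the memory needed just to store a full state vector classically, at 16 bytes per complex128 amplitude:

    # 2**n amplitudes, 16 bytes each (complex128)
    for n in (20, 30, 40):
        print(n, "qubits:", 2**n * 16 / 2**30, "GiB")
    # prints: 20 qubits: 0.015625 GiB, 30 qubits: 16.0 GiB, 40 qubits: 16384.0 GiB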

From Tim Sears to Everyone: (07:08 pm)

thx

From Samuel Tenka to Everyone: (07:08 pm)

@Alexis can one compute the parse tree of a sentence without knowing the parts-of-speech of words? Most modern approaches to NLP can figure out grammatical rules just from a corpus of un-annotated strings. In other words, can one learn the types of words without manually feeding annotations to the quantum computer?

From Konstantinos Meichanetzidis to Everyone: (07:09 pm)

Also note: expensive classical simulation is not only about qubit number; it's also about how entangling the circuit is.

From Samuel Tenka to Everyone: (07:09 pm)

(if not, the work is still really cool! I'm just confused by Bob's advertisement that the quantum computer was able to understand grammar because it seems the grammar was manually fed into it)

From Tim Sears to Everyone: (07:09 pm)

Might be worth it since the modeling approach is interesting on its own

From Me to Everyone: (07:10 pm)

@Samuel, that is indeed one of the big open questions we need to solve. There is this paper by Smolensky et al. where they propose a quantum algorithm for learning the grammatical structure: https://arxiv.org/abs/1902.05162

From Samuel Tenka to Everyone: (07:10 pm)

Awesome! Thanks!

From Giovanni de Felice to Everyone: (07:10 pm)

@Samuel we didn’t implement a grammar induction algorithm for pregroups yet, but it should be possible and it would be interesting if a quantum computer could do it

view this post on Zulip Alexis Toumi (May 07 2020 at 17:24):

More copy and paste:

From Konstantinos Meichanetzidis to Everyone: (07:11 pm)

Also: a classical parser would do the job: it would annotate the text with types (noun, verb, etc.) and then the circuit is created by reducing the types to the sentence type.
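
As a minimal sketch of that reduction step (a greedy cancellation that handles simple examples like the one below; full pregroup parsing is more general): adjacent pairs x · x^r and x^l · x cancel, and a sentence is grammatical when the word types reduce to the single sentence type s.

    def reduces_to_sentence(word_types):
        """Greedy pregroup reduction: cancel x . x^r and x^l . x, then check what's left."""
        stack = []
        for t in [t for word in word_types for t in word]:
            if stack and (t == stack[-1] + '_r' or stack[-1] == t + '_l'):
                stack.pop()          # the pair cancels: a cup in the diagram
            else:
                stack.append(t)
        return stack == ['s']

    # "Alice loves Bob": noun, transitive verb of type n^r . s . n^l, noun.
    print(reduces_to_sentence([['n'], ['n_r', 's', 'n_l'], ['n']]))   # True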

From Tim Sears to Everyone: (07:11 pm)

Maybe there is a translation to the nearest neural network. Then you could have a better explanation for an NN that “works well in practice”

From Matteo Capucci to Everyone: (07:12 pm)

How universal is this approach wrt languages? Is it mainly targeted at English and similar languages or does it work in more generality?

From Konstantinos Meichanetzidis to Everyone: (07:12 pm)

For now, the assignment of types and the state preparation (or “computing the meaning”, as Bob calls it) can be seen as independent stages

From Konstantinos Meichanetzidis to Everyone: (07:12 pm)

@Matteo pregroup grammar is supposed to capture most natural language structure

From Me to Everyone: (07:13 pm)

@Matteo pregroups are equivalent to Chomsky’s context-free grammars, so you can do all of those; you can do mildly context-sensitive grammars with a small variation on pregroups

From Konstantinos Meichanetzidis to Everyone: (07:13 pm)

They are as powerful as context free grammars, and they are pretty much universally applicable

From Lee Mondshein to Everyone: (07:13 pm)

Re “meaning in use”: can you comment on meaning as dynamic significance (or influence) in some dynamic context of shared discourse + belief + action?

From Brian Pinsky to Everyone: (07:13 pm)

You're eliding things like tense in your verb denotations. How do you want to deal with that?

From Konstantinos Meichanetzidis to Everyone: (07:14 pm)

@Tim Sears: good point about nearest NN
@Brian Pinsky: the point is that more pregroup types and rules on them capture morphology like that

From Matteo Capucci to Everyone: (07:16 pm)

(thanks!)

From Martha Lewis to Everyone: (07:16 pm)

This paper by Sadrzadeh et al

From Martha Lewis to Everyone: (07:17 pm)

looks at dynamic aspects of language: http://www.eecs.qmul.ac.uk/~mpurver/papers/sadrzadeh-et-al18semdial.pdf

From Me to Everyone: (07:18 pm)

@Brian on the syntactic side, you can deal with tense by adding more specific types instead of just sentence and noun. On the semantic side, it’s not completely clear how to deal with tense yet.

From Martha Lewis to Everyone: (07:19 pm)

re: tense, Lambek does include aspects like person, sentence type in his original work on pregroups
Lambek 1999: https://link.springer.com/chapter/10.1007/3-540-48975-4_1

From Konstantinos Meichanetzidis to Everyone: (07:22 pm)

@theprotoncat In principle, yes, as ZX is complete. In practice, it's a different topic and it reduces to how one maps from digital circuits to continuous-variable QC

From Brian Pinsky to Everyone: (07:22 pm)

@Alexis tense seems pretty well solved in classical linguistic frameworks. I'm not sure why translating it into a pregroup grammar should be hard, but we should have this conversation on Zulip

From Me to Everyone: (07:23 pm)

yes, let’s all go to Zulip !

From theprotoncat to Everyone: (07:23 pm)

@Konstantinos Thanks, yeah, I figured this is a far more complicated question than can be discussed right now. The intuition is ‘it’s possible but probably not efficient’.

view this post on Zulip Morgan Rogers (he/him) (May 07 2020 at 17:32):

Alexis Toumi said:

From Emily Riehl to Everyone: (06:02 pm)

I have to say I’m very impressed with the theatrics.

From Tim Sears to Everyone: (06:10 pm)

This is the coolest seminar ever.


You succeeded in making me feel bad for having missed it :sweat_smile:

view this post on Zulip Alexis Toumi (May 07 2020 at 17:34):

The talk was recorded; it should appear on YouTube soon :)

view this post on Zulip Alexis Toumi (May 07 2020 at 17:38):

Oh wow, actually it's already on YouTube: https://www.youtube.com/watch?v=YV6zcQCiRjo

view this post on Zulip Antonin Delpeuch (May 11 2020 at 12:18):

The correct link seems to be https://www.youtube.com/watch?v=mL-hWbwVphk now :)

view this post on Zulip (=_=) (May 15 2020 at 04:51):

This may be of interest to the QNLP folks. There’s going to be an interactive podcast on May 18, 3pm Pacific Time, by Sam Charrington of This Week in ML and AI, who’s going to be exploring, with Emily Bender of the University of Washington, the question of whether linguistics has been missing from NLP research. Seems like something that QNLP can weigh in on. Link with registration details below:

https://twimlai.com/live-viewing-party-linguistics-missing-nlp-research/