You're reading the public-facing archive of the Category Theory Zulip server.
To join the server you need an invite. Anybody can get an invite by contacting Matteo Capucci at name dot surname at gmail dot com.
For all things related to this archive refer to the same person.
Sorry to cram that all in the title. I am very interested in getting into what I can learn about these topics through the lens of monoidal categories and/or toposes. We have many finite state machines at work, but my colleagues are not familiar with Behavior Trees per se. Both FSMs and BT's seem amenable to applied category theory, maybe even BT's moreso.
I am slowly reading the relevant parts of Seven Sketches on toposes, and the relation of logic, behavior and syntax seems very relevant. It's just all so much to take in. I was hoping someone here might have tidbits to share to speed me along my learning path here.
https://autcat.github.io/ was a good branching point during my undergraduate thesis - but obviously more on the FSM side
A more specific question could elicit lots of replies here. Just asking for "tidbits" probably won't get folks excited, but there are people here who know lots about some of these topics.
It's really a case of not knowing enough to have a specific question. This is about as specific as I can get:
Really the first exposure i had to FSM (are they different from finite state automata?) was in algebraic coding theory, because you can use a FSM to encode a stream of bits, and it turns out that has a polynomial representation, and algebraic properties, IIRC. Now the FSMs I work with are in software, and I wonder how I can understand and improve them.
"the first exposure i had to FSM (are they different from finite state automata?)"
That's the kind of question mathematicians love most: a precise question that has a yes-or-no answer! There's no waffling around with these questions. And if you pose them correctly, you get precious bits of information.
I looked up finite state machine and in the first line it said
A finite-state machine (FSM) or finite-state automaton (FSA, plural: automata), finite automaton, or simply a state machine, is a mathematical model of computation. It is an abstract machine that can be in exactly one of a finite number of states at any given time. The FSM can change from one state to another in response to some inputs; the change from one state to another is called a transition.[1] An FSM is defined by a list of its states, its initial state, and the inputs that trigger each transition.
So apparently the answer to your question is yes.
Eilenberg, one of the founding fathers of category theory, wrote two books on category theory and finite state machines. Apparently all his friends in algebraic topology thought he was wasting his time. You can get them on LibGen. They're fairly interesting. I think there are some newer papers that discuss some results in these books, but I can't remember what they are.
What's a 'behavior tree'?
@John Baez So, Automata, Language and Machines volumes A and B? It's a very trippy journey even through the table of contents!
John Baez said:
What's a 'behavior tree'?
When I first heard about them I couldn't tell what would make them different from FSMs, except for maybe the layout. The idea is to lay out a pattern of behavior that's executed according to certain primitives. The nodes have access to a pool of state, and the tree keeps being executed from root to some leaf where an action happens, potentially many times a second.
Learning about the primitives helps with seeing how one might work. The execution of a node results in either "success, running, or failure" A running node will eventually become "success/failure." There are nodes that act like conditionals, and nodes that execute a sequence of children, waiting for one to succeed, or maybe waiting for one to fail ,nodes that enable looping, and so on. And it sure sounds like you could potentially dynamically graft subtrees into the tree depending on conditionals.
I'm not good at explaining it because so far I've only read a couple websites and chatted with AI a bit to learn what to search for, but it does seem different in character from a finite state machine.
I hear they use them for NPC player behavior in the gaming industry...
@John Baez This looks like a current and relevant extension to what Eilenberg was working on: Mathematical Foundations of Automata Theory
Ryan Schwiebert said:
John Baez So, Automata, Language and Machines volumes A and B?
Yes, those are the ones.
John Baez This looks like a current and relevant extension to what Eilenberg was working on: Mathematical Foundations of Automata Theory
This looks like a nice modern treatment!
(It should be "Automaton Theory", but that's a very minor complaint. Saying "Automata Theory" is like saying "Categories Theory".)
@John Baez Oo, i just ran across this survey Behavior Trees in Robotics and AI. Section 2 appears to be a very helpful and careful comparison of BTs to FSMs, decision trees, and a few other notions I did not know about.
Do you think behavior trees are widely used in robotics? @Joe Moeller - maybe Aaron Ames will know.
There are a million notions in computer science, and in some sense only the most elegant and the most widely used are worth learning for a nonexpert such as me.
I guess what I can say at the moment is:
Based on the pages I've found, and conversations with roboticists at work, they all know about BTs, but it is not yet something they've done. Perhaps it's just catching on. Maybe the AI people are ahead of the roboticists at the moment.
Two of the types of "composite" nodes are called "sequence" and "selector". The first one works like a universal quantifier (succeeding iff its children do) and the second like an existential quantifier (succeeding iff any of its children do.) This tutorial I read today has a separate name for "decorator nodes" that only have one children, and they describe "invert", "always succeed" and "always fail" as cases of those nodes. The final type of node is "leaf" which can do anything basically. It might trigger the agent to do an action, or else do some sort of check against state, and then report success or failure up the chain.
At this point it looks kind of like the non-leaf nodes are largely control structures, and the leaves are where you can access and change state. (Possibly it's interesting that's where the side-effects are.)
Maybe it's just because I"m reading these at the same time, but it sounds like these trees have their own internal logic, possibly like a topos? I'll have to keep my eye out to see if other such features appear.
Here's some context that might be helpful.
Behavior Trees were popularized by the following GDC talk on the AI in the video game Halo 2: https://www.youtube.com/watch?v=m9W-hpxuApk
Here's a transcription of the talk: https://www.gamedeveloper.com/programming/gdc-2005-proceeding-handling-complexity-in-the-i-halo-2-i-ai
Behavior trees were seen as a way to increase modularity over Finite state machines.
In the talk there were suggestive tree diagrams:
My understanding is this talk was reverse engineered into other tools in game development, like the behavior tree editor in the unreal engine:
Unreal example of behavior tree
The tree is somewhat deceptive, because implicit in the diagram is some means of sampling the world and player state to update the tree.
After this popularization, I think the definitions you see on wikipedia were a consolidated. Whatever the history, these definitions were then carried over into fields like robotics.
The name "behavior tree" is still a game developer term of art, it doesn't necessarily match the wikipedia definition. For example, in the original Halo 2 talk, "impluses" were "added" (really they were removed from the definition) to make small changes on the tree when unusual play states happened.
With that context, I agree monoidal categories may be usefully applied here, BTs feel like a sort of multi-category to me. My sense is much of applied category theory amounts to making diagrams like the above rigorous. You've already identified a notion of "And" and "Or" in behavior trees, if you want it to be a topos you'd also need a sort of "object classifier" or "boolean" concept.
Finite state machines, by contrast, have widely recognized careful definitions, or close analogs in category theory with careful definitions. And they are applied much more broadly than just game AI. The intro text on category theory "Conceptual Mathematics" spends quite a bit of space on discrete dynamical systems, graphs, etc.
Basic concepts in category theory remind me of finite state machines. For example, if you imagine a category with a single object , and say three distinct generating arrows from the object to itself, . That gives you a simple syntax, with an example of a morphism in that category. Then finding a functor from this category into , would give the previous "program" an interpretation as a composition of endofunctions.
These functions can be thought of as the arrows which transition the states of the associated set. Of course the entire state space is the object of this category, whereas finite state machine diagrams have arrows between states. You can define such a category with the states as objects by forming the category of elements of the specified functor interpreting into Set.
Individual automata aren't toposes by themselves, but all the possible interpretations collectively form a topos, by inheriting the topos structure from , just like how functions into a vector space, themselves form a vector space.
I find these bundles of connections interesting to think about. I don't know much about how category theory might apply to AI specifically, but if you're interested I bet there's somebody here with some thoughts on how that works.
@Alex Kreitzberg The other thing that struck me as topos like (based on my limited understanding so far, from Seven Sketches) is that the tree evolves over time. Nodes convert to Success/Failure while going through Running. The description in seven sketches is "things happening over time over a site."
Do these boolean-like evaluations of nodes convert into something globally boolean-like for a potential subobject classifier? (Sorry my knowledge of toposes is still so sketchy. i cannot even bring the details of subobject classifiers to mind, and I recall not quite getting the glueing concepts for sheafs.)
Essentially by definition, Toposes are certain well behaved categories of presheaves (The presheaves and category are so well behaved they're called "sheaves" and a "site" respectively). What you're noticing is can look "temporal".
So has a boolean set. Functions into that boolean set are the predicates. When you bake in an implicit time dependence by introducing a domain category, what happens to predicates? That would be how object classifiers are introduced via Sheafs. That's my understanding of how sheafs build toposes (but I confess, I'm an enthusiast so I don't have clever ideas here for your example).
In my explanations, I've been leaning on the more general definition of elementary topos as a crutch. It simply requires our category:
has finite limits, ("And")
is cartesian closed, and ("Implies")
has a subobject classifier. ("Notion of truth")
So you've already mentioned "sequencing" as a notion of "and" - for fun lets see if that satisfies point 1 (maybe this isn't what you were trying to do). Requirement 2, that it be cartesian closed, would mean our diagramatic language would need some notion of anonymous function defined nicey in terms of sequencing, since we're calling it "and". This would be analogous to currying in a functional language.
You need these "higher order functions" to define predicates into the "classifying space" the analog of booleans.
I don't have good ideas for how to do this, because I haven't thought about behavior trees for very long. I should say, I have a friend who worked on Halo who is familiar with the "in the trenches" definition, so you've inspired me to talk with them a bit about the definition. So maybe I'll have something more clever to add later.
Just a drive-by comment that you have the implication backwards: all categories of presheaves are toposes, but most toposes are not full categories of presheaves. Most toposes are certain particularly nice reflective subcategories of presheaf categories, namely those for which the reflector (called "sheafification" in this context) preserves finite limits.
I don't think what Alex said was actually wrong if he was talking about Grothendieck topoi here:
Essentially by definition, Toposes are certain well behaved categories of presheaves (The presheaves and category are so well behaved they're called "sheaves" and a "site" respectively).
I think some people say "topos" and mean "Grothendieck topos", so maybe Alex is working within this tradition. But if I were trying to explain toposes to someone (Ryan) I would have said something like "The nicest toposes, called Grothendieck toposes, are categories of specially nice presheaves called sheaves".
Yes, I was responding as if Alex meant to say "certain well-behaved categories of the form ", which isn't quite right, whereas "certain well-behaved categories whose objects are objects of categories of the form " does work, maybe that's what he meant.
Alex Kreitzberg said:
- has finite limits, ("And")
Finite limits give you the ability to say "and", plus the ability to say "the biggest subobject where f=g"; the latter is impossible with conjunctions only, no?
I'm happy people are calling me out XD
I tried to reverse engineer how the terminology is used on the nlab, but I wouldn't be surprised if my usage, or understanding, are confused.
The sense I got from the nlab is you might have any of
And all of these would be called "A Topos" and it was up to the reader to establish from context whether it was a grothendieck topos or an elementary topos. (I found this odd, because it felt like calling a square a rectangle and expecting the reader to know we'll use square properties)
Even with that, badly self imposed, rule I appreciate folks pointing out there are still mistakes.
Additionally, the definition used in seven sketches (page 242) defined a topos to be a category of sheaves. So I was trying to walk backwards from that definition.
The point is (which was subtly emphasized by fosco's comment), when talking about toposes - the subobject classifier, a generalization of "boolean" used for finding subsets, is the star of the show:
After sleeping on the problem, it occured to me they actually did define an operation that looks very much like finding sub trees for Halo 2:
bitmasking trees in behavior trees
For example, If you're in a vehicle you only have access to some of your "behaviors". Given this "subsetting" via "bit masks", maybe something like intersections and unions becomes meaningful.
A category of presheaves is a category of sheaves (ie Grothendieck topos), which is an elementary topos, which is category with a subobject classifier. But all these implications are irreversible. Most category theorists saying "topos" mean "elementary topos", but for historical reasons and the fact that Grothendieck toposes are pretty great, it's far from unheard of to use "topos" to mean "Grothendieck topos."
One thing that hasn't been clear to me so far in the discussion of behavior tree is the scope of "blackboard" which is shared. It seems like if all nodes have access to global state, it might cause problems with composition. If, say, two subtrees had actions competing over the same piece of shared state, it sounds like it could lead to deadlock or some runaway cycle.
Maybe some scoping better than "everything global" would help. Maybe, for example, a node knows all its children's scopes. Or, potentially, all sibling nodes share a scope, slightly smaller than the parent node.
Another question I have somewhat related to this is about whether or not it's expected you can dynamically modify the tree. If you can add siblings to a sequence node, for example, that seems to get us back into the same state-sharing danger I was worried about.
Perhaps programmers have thought it through already. One person I talked to suggested that dynamically adding to trees is "not the way to go." It would certainly simplify things if the tree were locked in before execution. We would still have the option to make some subtrees function conditionally, which is kind of a proxy for dynamically inserting nodes.
Anyhow, thanks for all your insights into toposes. I never imagined I'd be at the stage where learning such things seemed within reach...
When being scared of mutability, be careful to avoid throwing the baby out with the bath water.
If I have a variable that allows "mutating" a 2 into a 3, that doesn't mean "2 + 1" is bad.
Similarly, whether or not you decide it's a good idea to allow mutability for a tree, I very much think it's a good idea to understand what "adding a sibling to a sequence" means.
These behavior trees might be the objects of some category. In which case figuring out the arrows between them, perhaps adding one tree to another tree, is key to understanding them.
Well, naively one would try composition that grafts the root of tree B so that it becomes the leaf of tree A. Of course, questions immediately arise: where does the root of B go? it's not like A has empty-socket leaf nodes for this purpose.
I have to get a little more used to the syntax of these gizmos. I feel like the main distinction is between leaf nodes and non-leaf nodes. The leaf nodes are the workhorses, which cause side-effects and do conditional checks. We can think of them as "action" nodes. The non-leaf nodes are "control nodes" which enable different stuff like loops, existential and universal quantification. A well formed tree is formed of some number of leaves, all of which are mutually connected with control nodes.
It's almost like the morphisms want to be trees of control nodes with "empty slots" where leaves have to be, and then it can be "applied to" a collection of that many Action nodes, and the result is a single Action node.
I feel like I'm close to something, but it doesn't look like anything familiar to me. Does it look like it's headed anywhere to you guys?
Never mind. I think I want to make it so morphisms operate on only a single thing.
A “composite node” is a finite tree whose leaves are composite nodes, and nonleaves are control nodes. Define a morphism to be a finite tree, whose no leaves are control nodes, and one leaf is a distinguished “socket” node, and all other leaves are composite nodes. Action nodes can be considered trivial composite nodes.
Seems like this allows us to say composite nodes are the objects, and the trees with one “socket leaf” are morphisms?
Does it look like it's headed anywhere to you guys?
People often visualize operads as having tree-shaped operations (see the section "understanding the axioms"). In the "grafting" product one identifies the root of one such tree with a leaf of another. A huge amount is known about operads, and I wrote a paper with Nina Otter about the operad of phylogenetic trees, making phylogenetic trees in biology into operations in a very natural operad. So if you're thinking about trees and attaching the root of one tree to a leaf of another, it might be good to think about operads.
The most important thing is to think about what you want the morphisms, or operad operations or whatever they need to be, to represent, rather than only thinking about what structure is convenient to access. (You have to think about that too of course.)
John Baez said:
In the "grafting" product one identifies the root of one such tree with a leaf of another.
This grafting makes sense to me as a structural operation for abstract trees, but when I was considering it for this case, I hesitated because I had been thinking of nodes as already having some purpose. If you have a leaf node with one purpose, and a tree that accomplishes some other purpose, then identifying the tree with that leaf node didn't seem to make sense. Maybe that would go away if "purpose" was filled in later. And yet, it seems like the nodes I'm talking about have different purposes baked-in... I can't quite visualize how to separate it and set it aside before composition.
James Deikun said:
The most important thing is to think about what you want the morphisms, or operad operations or whatever they need to be, to represent, rather than only thinking about what structure is convenient to access. (You have to think about that too of course.)
I'm not completely sure I have an adequate answer for what I "want to represent." The best answer I have now is, "these trees represent potential behavior in code, and I'd like to have a model of how one can combine these trees to form larger behavior trees."
This grafting makes sense to me as a structural operation for abstract trees, but when I was considering it for this case, I hesitated because I had been thinking of nodes as already having some purpose.
You might want to look at my work on phylogenetic trees, which analyzes a common way of drawing operations in operads where grafting gets a different feel, which is very relevant to your worry. In the material I linked to, I said:
There are various slightly different concepts of ‘rooted tree’ and ‘rooted planar tree’. For example, a graph theorist might draw a rooted planar tree like this:
rooted_planar_tree_graph_theorist_style.png
while an operad theorist might draw it like this:
rooted_planar_tree_operad_theorist_style.png
In the operad theorist's style, when we graft one tree A onto another tree B, we connect the "loose edge" coming downwards out of the root of A to a "loose edge" coming out of a leaf of B. We don't identify two vertices.
Our actual paper goes into this in extreme detail.
Gonna give the tree paper above a go and get back when I'm ready to discuss.
Maybe I'm worrying too much about "blocking ports" when composing. I was somewhat worried you could "block off all the ports" with actions (necessarily terminal leaves.) But maybe that's just an illusion. Sequence and Selector nodes are not really ever blocked off, it seems. Any "action" (which before seemed like a terminal node that would 'plug up' one of the open ports of a tree) could just be wrapped in a Sequence before being plugged in, and now when you attach it to any leaf-position, you don't close off that branch of the tree.
I was trying to figure out there's an analogy to be made here with circuit diagrams with boolean logic, and behavior trees with ternary logic.
Thing is, as I remarked earlier, Sequence and Selector nodes seemed like good candidates for conjunction and disjunction-like operations, but the "left to right evaluation" really makes things different. In particular, they aren't symmetric: for example Selector(Running, True) = Running and Selector(True, Running)=True.
Hopefully that isn't a huge setback... perhaps this sort of noncommutative ternary conjunction/disjunction might be studied already? I know "noncommutative logic" is a thing but I don't know anything about it...
Properly speaking, maybe those nodes further in the sequence that haven't run should be a fourth truth value, even. In the basic system you'll never encounter [Running, True], as the second node is never evaluated before the first Running is completed. Maybe I'll have a look at that...
maybe more to the point, the “lazy evaluation” makes ”almost half the truth table” irrelevant as they won’t occur.
If my idea about this being a topos is solid, then maybe the meaning of “those possibilities never happens in practice” amount to some restriction on sections of a sheaf?
AI was able to point me to a few likely helpful resources which may specifically cover using some form of ternary logic. If you are interested:
Gugliermo, Simona, David Cáceres Domínguez, Marco Iannotta, Todor Stoyanov, and Erik Schaffernicht. “Evaluating Behavior Trees.” Robotics and Autonomous Systems 178 (August 2024): 104714. https://doi.org/10.1016/j.robot.2024.104714.
Martens, Chris, Eric Butler, and Joseph C. Osborn. “A Resourceful Reframing of Behavior Trees.” arXiv:1803.09099. Preprint, arXiv, March 24, 2018. https://doi.org/10.48550/arXiv.1803.09099.
Marzinotto, Alejandro, Michele Colledanchise, Christian Smith, and Petter Ogren. “Towards a Unified Behavior Trees Framework for Robot Control.” 2014 IEEE International Conference on Robotics and Automation (ICRA), IEEE, May 2014, 5420–27. https://doi.org/10.1109/ICRA.2014.6907656.
Souza, Thibaud de. Implementing Behavior Trees Using Three-Valued Logic. n.d.