W.T. Gowers

Udruženi sadržaj Gowers's Weblog
Mathematics related discussions
Ažurirano: prije 4 dana 16 sati

A recent experience with ChatGPT 5.5 Pro

Pet, 2026-05-08 17:40

We are all having to keep revising upwards our assessments of the mathematical capabilities of large language models. I have just made a fairly large revision as a result of ChatGPT 5.5 Pro, to which I am fortunate to have been given access, producing a piece of PhD-level research in an hour or so, with no serious mathematical input from me.

The background is that, as has been widely reported, LLMs are now capable of solving research-level problems, and have managed to solve several of the Erdős problems listed on Thomas Bloom’s wonderful website. Initially it was possible to laugh this off: many of the “solutions” consisted in the LLM noticing that the problem had an answer sitting there in the literature already, or could be very easily deduced from known results. But little by little the laughter has become quieter. The message I am getting from what other mathematicians more involved in this enterprise have been saying is that LLMs have got to the point where if a problem has an easy argument that for one reason or another human mathematicians have missed (that reason sometimes, but not always, being that the problem has not received all that much attention), then there is a good chance that the LLMs will spot it. Conversely, for problems where one’s initial reaction is to be impressed that an LLM has come up with a clever argument, it often turns out on closer inspection that there are precedents for those arguments, so it is still just about possible to comfort oneself that LLMs are merely putting together existing knowledge rather than having truly original ideas. How much of a comfort that is I will not discuss here, other than to note that quite a lot of perfectly good human mathematics consists in putting together existing knowledge and proof techniques.

I decided to try something a little bit different. At least in combinatorics, there are quite a lot of papers that investigate some relatively new combinatorial parameter that leads naturally to several questions. Because of the sheer number of questions one can ask, the authors of such papers will not necessarily have the time to spend a week or two thinking about each one, so there is a decent probability that at least some of them will not be all that hard. This makes such papers very valuable as sources of problems for mathematicians who are doing research for the first time and who will be hugely encouraged by solving a problem that was officially open. Or rather, it used to make them valuable in that way, but it looks as though the bar has just been raised. It is no longer enough that somebody asks a problem: it needs to be hard enough for an LLM not to be able to solve it.

In any case, a little over a week ago I decided to see how ChatGPT 5.5 Pro would fare with a selection of problems asked by Mel Nathanson in a paper entitled Diversity, Equity and Inclusion for Problems in Additive Number Theory. Nathanson has a remarkable record of being interested in problems and theorems that have later become extremely fashionable, which has led him to write a series of extremely well timed and therefore highly influential textbooks. In this paper, he argues for the interest of several other problems, some of which I will now briefly describe.

If is a set of integers, then its sumset is defined to be . For a positive integer , the –fold sumset, denoted , is defined to be . Nathanson is interested in the possible sizes of given the size of . To that end one can define a set to be the set of all such that there exists a set with and .

An obvious first question to ask is simply “What is ?” When , the answer is the set of all integers between and . It is an easy exercise to show that if , then , so this result is saying that all sizes in between can be realized. However, it is not true in general that can take every size between its minimum and maximum possibilities, and we do not currently have a complete description of .

Another natural question one can ask, and this is where ChatGPT came in, is how large a diameter you need if you want a set with and having prescribed sizes. (Of course, the size of must belong to .) Nathanson showed that for every there is a subset of with and , and asked whether the bound could be improved. ChatGPT 5.5 Pro thought for 17 minutes and 5 seconds before providing a construction that yielded a quadratic upper bound, which is clearly best possible. It wrote up its argument in a slightly rambling LLM-ish style, so I asked if it could write the argument up as a LaTeX file in the style of a typical mathematical preprint. After two minutes and 23 seconds it gave me that, after which I spent some time convincing myself that the argument was correct.

The basic idea behind both Nathanson’s argument and ChatGPT’s was that in order to obtain a set of a given size with a sumset of a given size, it is useful to build it out of a Sidon set, which means a set with sumset of maximal size (that is not quite the usual definition but it is the simplest to use in this discussion), and an arithmetic progression. Also, for a bit of fine tuning one can take an additional point near the arithmetic progression. Then if one plays around with the various parameters, one finds that one can obtain sets of all the sizes one wants. Nathanson doesn’t express his argument this way (it is Theorem 5 of this paper), instead giving an inductive argument, but I think, without having checked too carefully, that if one unravels his argument, one finds that effectively that is what he ends up with, and the Sidon set in question consists of powers of 2. ChatGPT obtained its improvement by simply using a more efficient Sidon set — it is well known that one can find Sidon sets of quadratic diameter. (One might ask why Nathanson didn’t do that in the first place: I think it is because the obvious idea of using a more efficient Sidon set becomes obvious only after one has redescribed his inductive construction. Is that what ChatGPT did? It is very hard to say.)

Next, I asked ChatGPT to see whether it could do the same for a closely related question, where instead of looking at the size of the sumset, one looks at the size of the restricted sumset, which is defined to be . Unsurprisingly, it was able to do that with no trouble at all. I got it to write both results up in a single note, to avoid a certain amount of duplication. If you are curious, you can see the note here.

I then asked what it could do for general . I was much less optimistic that it would manage to do anything interesting, because the proof for makes fundamental use of the fact (due to Erdős and Szemerédi) that we know exactly which sizes we need to create. If we don’t know what the set is, then it seems that we are forced to start with a hypothetical set with and and build out of it a set of small diameter with the same property. As it happens, I still don’t know how to get round that difficulty (I’m mentioning that just to demonstrate that my mathematical input was zero, and I didn’t even do anything clever with the prompts), but Nathanson mentioned in his paper a remarkable paper of Isaac Rajagopal, a student at MIT, who must have got round the difficulty somehow, because he had managed to prove an exponential dependence of on for each fixed .

I’ll leave the previous paragraph there, but Isaac has subsequently explained to me that that isn’t really the difficulty. His argument gives a complete description of when is sufficiently large, and if one wants to prove a polynomial dependence for fixed , then assuming that is sufficiently large is clearly permitted. The real difficulty is that constructing the sets with given sumset sizes was significantly more complicated, and necessarily so because the degree of the polynomial grows with , and one therefore needs more and more parameters to define the sets.

In any case, the task faced by ChatGPT was not to solve the problem from scratch, but to see whether it was possible to tighten up Isaac Rajagopal’s argument. Here’s what happened.

  1. After 16 minutes and 41 seconds, it came back with an argument that claimed to have improved the upper bound from exponential in to exponential in for any .
  2. I asked it to write that in preprint form too, which took it a further 47 minutes and 39 seconds.
  3. That preprint would have been hard for me to read, as that would have meant carefully reading Rajagopal’s paper first, but I sent it to Nathanson, who forwarded it to Rajagopal, who said he thought it looked correct.
  4. Both ChatGPT and Rajagopal speculated a little on what might need to be done to push things further and get a polynomial bound, so I got greedy and asked ChatGPT to give that a go.
  5. After 13 minutes and 33 seconds it told me it felt optimistic about the existence of such an argument but there were a couple of technical statements that needed checking.
  6. I asked it to check them.
  7. After 9 minutes and 12 seconds it got back to me with the check having been done, so I asked for this too to be written in preprint form.
  8. After 31 minutes and 40 seconds the “preprint” was ready. Here it is.
  9. Isaac Rajagopal looked at it and declared it to be almost certainly correct. It was clear that he meant this not just at a line-by-line level but at the level of ideas.

Isaac made some very interesting remarks about the nature of what the additional ideas were that ChatGPT contributed. Since, as I have already said, my mathematical input was zero, I invited him to write a guest section to this post. Just before we get to that, I want to raise a question (that will undoubtedly have been raised by others as well), which is simple: what should we do with this kind of content? Had the result been produced by a human mathematician, it would definitely have been publishable, so I think it would be wrong to describe it as AI slop. On the other hand, it seems pointless even to think about putting it in a journal, since it can be made freely available, and nobody needs “credit” for it (except that Isaac deserves plenty of credit for creating the framework on which ChatGPT could build). I understand that arXiv has a policy against accepting AI-written content, which makes good sense to me. So maybe there should be a different repository where AI-produced results can live. But various decisions would need to be made about how it was organized. I myself think that one would probably want to have some kind of moderation process, so that results would be included only if a human mathematician was prepared to certify that they were correct — or, better still, that they had been formalized by a proof assistant — and perhaps also that they answered a question that had been asked in a human-written paper. On the other hand, I wouldn’t want a moderation process that created vast amounts of work (unless the work was itself done by AI, but there are obvious dangers in going down that route). Anyway, until these questions are answered, this result is available from the link above, and perhaps, now that LLMs are so good at literature search, that will be enough to make it findable by anyone who wants to know whether Nathanson’s problem has been solved.

Isaac’s evaluation of what ChatGPT achieved

With just a few prompts, ChatGPT was able to improve the upper bound on (which I will define very soon) from exponential in to polynomial in . While its first improvement of the bound, from exponential in to exponential in , was a routine modification of my work, the improvement to polynomial in is quite impressive. To do this, ChatGPT came up with an idea which is original and clever. It is the sort of idea I would be very proud to come up with after a week or two of pondering, and it took ChatGPT less than an hour to find and prove, using similar methods to those in my own proof. My goal is to explain that idea, in a manner that will be digestible to my friends who are computer science majors as well as my math major friends.

The problem of bounding is closely related to a problem I worked on at the Duluth REU (Research Experience for Undergrads) program, of determining . In particular, is the set of possible -fold sumset sizes , where can be chosen to be any set of integers. is the minimal such that we can achieve all of the values of using -element sets . I spent last summer explicitly characterizing the set for large , by constructing sets such that achieves all sizes which I could not rule out as impossible. So, can be upper-bounded by optimizing my constructions.

I constructed these sets by combining smaller component sets which are simpler to analyze. Some of these components are the geometric series

for various values of and . Unfortunately, the elements of and are exponentially large in terms of . So, I asked ChatGPT (through Tim) whether there exist sets of elements which have similar sumset sizes to these geometric series, but contain only numbers of polynomial size in : I had no idea if this was possible, or how to begin constructing such sets. ChatGPT came back with an answer, constructing sets and which behave like “half a geometric series squeezed into a polynomial interval,” which is counterintuitive. Before I discuss the construction of and , I will explain the important properties of the sumset sizes of and which they recreate.

For , a set is called a set if the only solutions to

with in are the “trivial” solutions, by which I mean that one side of the equation is a reordering of the other side. If is a set of size , then elements of correspond exactly to choices of elements of , with repetition allowed. Using “stars and bars,” one can see that and this is the maximum possible value of among sets of size . So, another definition is that is a set if . Sidon sets, which Tim discussed, are exactly sets.

To make things more concrete, let us assume that in (1). Then, is a set, but it is not a set because of the relations

for any choice of in . In particular, , as these relations are the only ones preventing from being a set. lacks the relations in (2) because is not in . So, is a set, but it is not a set because of the relations

for any choices of in . This gives relations, and one can check that . To summarize, we have seen that

(a) is a set.

(b) is a linear function of .

(c) is a set.

(d) is a quadratic function of .

    ChatGPT was able to find sets and of elements which satisfy (a)-(d), but whose elements all have polynomial size in . The construction of and uses -dissociated sets, which are sets where the only solutions to

    with and in are the “trivial” solutions, i.e. and one side of the equation is a reordering of the other side. For , it is possible to construct an -dissociated set , where is approximately , and in particular polynomial in . Constructions of such a using finite fields date back to Singer (1938) and Bose–Chowla (1963) and are described in Appendix 1. Define

    and

    In hindsight, I have good intuition for the construction of and . All of the relations in (2) and (3) are formed by combining one or two relations of the form . There are approximately relations of the form in and , and approximately such relations in and . There are few other low-order relations in and , and similarly in and because is -dissociated. So, and manage to contain half as many -relations as their geometric series counterparts, while also containing few low-order relations.

    We now see why (a)-(d) hold with and replaced by and , respectively. For concreteness, we assume that and , so contains no nontrivial relations as in (4) with . Then, is a set, but it is not a set because of the relations

    for any choice of in . If we let , we can check that is linear in . In particular, (a) and (b) hold with replaced by , and the linear function replaced by . We can also see that is a set, but it is not a set because of the relations

    for any in . If we let , we can check that is quadratic in . In a similar manner, (c) and (d) hold with replaced by , and the quadratic function replaced by .

    Even though I can motivate it in retrospect, ChatGPT’s idea to use -dissociated sets to control relations of order at most feels quite ingenious. As far as I can tell, this idea is completely original.

    ChatGPT’s proof that its construction produces the desired values of is very similar to my proof that the sets which I construct achieve all possible values of , after replacing and by and , respectively. Properties (a)-(d) capture many of the important properties of and (or and ) which are used in this proof. The final constructions involve combining the sets and (or and in my paper) for each value of between and with another set which is the union of an arithmetic progression and a point. Intuitively, and (or and ) have large sumsets, while arithmetic progressions have small sumsets, so it is plausible that one could get sets which achieve all the medium-sized sumsets by combining them. However, the proof of this is quite involved, and it occupies Section 4 of my paper and the entirety of the ChatGPT preprint. In Appendix 2, I work out the details of the ChatGPT construction to show that for sufficiently large,

    For comparison, it is easy to see that is at least on the order of , and it is unknown what the real value is. In Appendix 3, I give details of the correspondence between my paper and the ChatGPT preprint, which will be helpful for those who want to read either.

    Finally, I want to express my deep gratitude to Tim for allowing me to contribute to this blog. I am still stunned by the coincidence that the problem he chose to put into ChatGPT 5.5 Pro led him to my paper on the arXiv.

    Tim on what this means for mathematical research

    I would judge the level of the result that ChatGPT found in under two hours to be that of a perfectly reasonable chapter in a combinatorics PhD. It wouldn’t be considered an amazing result, since it leant very heavily on Isaac’s ideas, but it was definitely a non-trivial extension of those ideas, and for a PhD student to find that extension it would be necessary to invest quite a bit of time digesting Isaac’s paper, looking for places where it might not be optimal, familiarizing oneself with various algebraic techniques that he used, and so on.

    It seems to me that training beginning PhD students to do research, which has always been hard (unless one is lucky enough, as I have often been, to have a student who just seems to get it and therefore doesn’t need in any sense to be trained), has just got harder, since one obvious way to help somebody get started is to give them a problem that looks as though it might be a relatively gentle one. If LLMs are at the point where they can solve “gentle problems”, then that is no longer an option. The lower bound for contributing to mathematics will now be to prove something that LLMs can’t prove, rather than simply to prove something that nobody has proved up to now and that at least somebody finds interesting.

    I would qualify that statement in two ways though. First, there is the obvious point that a beginning PhD student has the option of using LLMs. So the task is potentially easier than proving something that LLMs can’t prove: it is proving something in collaboration with LLMs that LLMs cannot manage on their own. I have done quite a lot of such collaboration recently and found that LLMs have made useful contributions without (yet) having game-changing ideas.

    A second point is that I don’t know how much of what I have said generalizes to other areas of mathematics. Combinatorics tends to be quite focused on problems: you start with a question and you reason back from the question or if you reason forwards you do so very much with the question in mind. In other areas there can be much more of an emphasis on forwards reasoning: you start with a circle of ideas and see where it leads. To do it successfully, you need to have some way of discriminating between interesting observations and uninteresting ones, and it isn’t obvious to me what LLMs would be like at that.

    Of course, everything I am saying concerns LLMs as they are right now. But they are developing so fast that it seems almost certain that my comments will go out of date in a matter of months. It is also almost certain that these developments will have a profoundly disruptive effect on how we go about mathematical research, and especially on how we introduce newcomers to it. Somebody starting a PhD next academic year will be finishing it in 2029 at the earliest, and my guess is that by then what it means to undertake research in mathematics will have changed out of all recognition.

    I sometimes get emails from people who are interested in doing mathematical research but are not sure whether that makes sense any more as an aspiration. I have a view on that question, but it may very well change in response to further developments. That view is that there is still a great deal of value in struggling with a mathematics problem, but that the era where you could enjoy the thrill of having your name forever associated with a particular theorem or definition may well be close to its end. So if your aim in doing mathematics is to achieve some kind of immortality, so to speak, then you should understand that that won’t necessarily be possible for much longer — not just for you, but for anybody. Here’s a thought experiment: suppose that a mathematician solved a major problem by having a long exchange with an LLM in which the mathematician played a useful guiding role but the LLM did all the technical work and had the main ideas. Would we regard that as a major achievement of the mathematician? I don’t think we would.

    So what is the point of struggling with a difficult mathematics problem? One answer is that it can be very satisfying to solve a problem even if the answer is already known, but I don’t think that is a sufficient reason to spend several years of your life on this peculiar activity. A better answer is that by solving hard problems you get an insight into the problem-solving process itself, at least in your area of expertise, in a way that you simply don’t if all you do is read other people’s solutions. One consequence of this is that people who have themselves solved difficult problems are likely to be significantly better at using solving problems with the help of AI, just as very good coders are better at vibe coding than not such good coders, or people who have a solid grasp of how to do basic arithmetic are likely to be more skilled at using calculators (and especially at noticing when an answer feels off). Mathematics is a highly transferable skill, and that applies to research-level mathematics as well. By doing research in mathematics, you may not get the same rewards as your equivalents a generation ago, but there is a good chance that you will be equipping yourself very well for the world we are about to experience.

    Appendix 1 (Isaac)

    We will construct an -dissociated set , where is approximately . This construction is a very minor modification of Bose–Chowla (1963)’s construction of a set, which I learned about from this paper. For whatever reason, the GPT preprint (Lemma 3.1) uses a different, less efficient construction using moment curves.

    Let be a prime, let , let be the finite field with elements and fix a generator of , so that is equal to . Define a set of elements

    Then, each element corresponds to a unique value of , by taking . Now an additive relation of the form in (4) with can be reframed by taking powers of as

    As is a degree- extension of and is a generator of as an -extension, this means that does not satisfy any nonzero polynomials in of degree . So, both sides of (6) are identical as polynomials in and thus the additive relation in (4) is trivial. So, is -dissociated, and of course one can prune a few elements to reduce to size .

    Appendix 2 (Isaac)

    Fix constants such that (in my paper I arbitrarily chose ). Let the two sets in (5) be called and . Let denote the set of integers satisfying . Similarly to my paper, the constructions of such that achieves the desired sizes will combine sets of the following four types:

    • with choices of and .
    • for each value of , with choices of .
    • for each value of , with choices of .
    • A set of the correct size so that .

    One reason that this construction needs to be complicated is that we need to create at least many sets. To do this, we vary parameters and in the domain and parameters and in the domain . We can choose to be slightly bigger than , and then the above construction gives us different sets where can be made arbitrarily small. So, if we were to remove any of the above parameters from the construction, and not change the others, this construction would no longer create many sets. In comparison, Nathanson’s construction when only needs to create sets. He does this by combining a Sidon set, an arithmetic progression, and one extra value, and varying the size of the arithmetic progression and the extra value in ranges of size .

    We want to combine sets , which are given by , for the values of , for the values of , and a set. By Appendix 1, for all , there exists a -dissociated set of diameter . By the constructions of and , we can take each , where . Let have basis vectors . To combine , we can define as

    Similarly to my Lemma 4.9, this construction ensures that the generating function product holds, which is the identity that both my paper and the GPT preprint use (see either paper for a definition of these generating functions). By (the standard) Lemma 2.3 of the GPT preprint, is Freiman-isomorphic of order to a subset of . Therefore, for sufficiently large (the whole construction relies on this for the same reasons as in my paper),

    Appendix 3 (Isaac)

    In Section 4.2 of my paper, I use a different, simpler construction to construct sets achieving the values in which have , for some small . These sets are subsets of , meaning that all elements have polynomial size in . This is observed in Section 5 of the GPT preprint.

    Section 4.3 of my paper carries out the construction which combines many components including and . This corresponds to Sections 2, 3, 4, and 6 of the GPT preprint. This section has a lot of moving parts; I give an outline in Section 4.3.1.

    In Section 4.3.2, I describe how the different components will be combined, using a construction which I call the disjoint union, and introduce generating functions as a bookkeeping tool to keep track of the sumset sizes of a set . This corresponds to Section 2 and Section 4 of the GPT preprint.

    In Section 4.3.3, I compute the generating function of each of the component sets, including (Lemma 4.15) and (Lemma 4.17). This corresponds to Section 3 and Section 6.1 of the GPT preprint. In particular, is computed in Lemma 3.3 and is computed in Lemma 3.4. Once these generating functions have been computed, the remainder of the proof is almost identical in my paper and in the GPT preprint.

    In Section 4.3.4, I put all the pieces together to show that as we range over the sets which I have constructed, the values of will assume all of the elements of . The key idea is to show that the set of all values of forms an interval, and contains numbers both smaller than and equal to .

    Kategorije: Matematički blogovi

    Group and semigroup puzzles and a possible Polymath project

    Pet, 2026-03-20 22:34

    An Artin-Tits group is a group with a finite set of generators in which every relation is of the form or for some positive integer , where and are two of the generators. In particular, the commutation relation is allowed (it is the case of the first type of relation) and so is the braid relation (it is the case of the second type of relation). This means that Artin-Tits groups include free groups, free Abelian groups, and braid groups: for example, the braid group on strands has a presentation with generators , where represents a twist of the th and st strands, and relations if and .

    A few weeks ago, I asked ChatGPT for a simple example of a word problem for groups or semigroups that was not known to be decidable and also not known to be undecidable. It turns out that the word problem for many Artin-Tits groups comes into that category: the simplest example where the status is not known is the group with four generators and where and commute and all other pairs of generators satisfy the braid relation .

    My interest in this was initially that I was looking for a toy model from which one could learn something about how mathematicians judge theorems to be interesting. I started with a remarkable semigroup discovered by G. S. Tseytin that has five generators, seven very simple relations, and an undecidable word problem. From that I created a puzzle game that you might be interested to play. (NB the puzzles were not showing up in the public version, but that should be sorted out now.) Each puzzle is equivalent to an instance of the word problem for Tseytin’s semigroup, but the interface makes it much more convenient to change words using the relations than it would be to do it with pen and paper.

    I was hoping that because every mathematical problem can in principle be encoded as a puzzle in this game, one might be able to build a sort of “alien mathematics”, where a theorem was an equality between words, and a definition was a decision to introduce a new (and redundant) generator g, together with a relation of the form g = w, where w is a word in the current alphabet. Theorems would be particularly interesting if they were equalities between words that could be established only by a chain of equalities that went through much longer words, and definitions would be useful if the new generators satisfied particularly concise relations (which would allow one to build “theories” within the system). I still hope to find a word problem that will allow such a project to take off, but in the end, after a lot of playing around with the game linked to above, I have decided that Tseytin’s semigroup is not suitable. The reason is that it is designed so that arbitrary instances of the word problem for groups can be encoded as word problems in this semigroup, and once one gets used to the game, one starts to see how that encoding can work. Furthermore, one seems to be driven towards the encoding — I don’t get the impression that there’s a whole other region of this semigroup to explore that has nothing to do with the kinds of words that come up in the encoding. And if that impression is correct, then one might as well start with the word problem for some group in unencoded form, or alternatively look for another semigroup. Nevertheless, I find the Tseytin game quite enjoyable: I won’t say more about it here but have written a fairly comprehensive tutorial that you can open up and read if you follow the link above.

    This is perhaps the moment to say that the words “I created a puzzle game” are slightly misleading. For one thing, I discussed the idea of gamifying Tseytin’s semigroup about three years ago with Mirek Olšák, a former member of my automatic theorem proving group, and he created a basic prototype in Python. But the main point is that I do not have the programming skills to create a game that can be played in a web browser — I vibecoded it using ChatGPT.

    After that experience, I thought that maybe I would have better luck if instead of looking for a group or semigroup with undecidable word problem, which might well have been explicitly designed with some encoding in mind, I looked for a word problem for which the decidability status was unknown. That way, it wouldn’t have been designed to be undecidable, but might nevertheless just happen to be undecidable and provide a nice playground of the kind I was (and still am) after. And that is what led me to the Artin-Tits groups.

    However, those don’t seem to be suitable either, because it is conjectured that they all have decidable word problems. I have created a game for the Artin-Tits group mentioned above, which you can also play if you want, but I have found it very hard to create interesting puzzles. (NB again there was a problem with the puzzles showing up, with only a tiny handful being there, but that should now be sorted out.) That is, I found it difficult to find words that are equivalent to the identity but that are not easily shown to be equivalent to the identity. One nice example comes from the fact that the subgroup generated by and is isomorphic to the braid group with four strands. The late Patrick Dehornoy found a very nice example of a braid with four strands that is equal to the identity but not in a completely trivial way. A picture of it can be found on page 4 of this paper.

    This is where a potential Polymath project comes in. An initial goal would be to determine the decidability status of this one small Artin-Tits group: if we managed that, then we could consider the more general problem. And the way I envisage approaching this initial goal is an iterative process that runs as follows.

    1. Devise an algorithm for solving word problems in the group.
    2. If the current algorithm is , then search for a puzzle that fails to solve.
    3. If the search is successful, then devise an algorithm that solves all the puzzles that solves, and also solves . Make the current algorithm and return to step 2. Devising can be done by playing the game with puzzle several times until one has the feeling that one has solved it in a systematic way.
    4. If the search is unsuccessful and the current algorithm is , then attempt to prove inductively that always succeeds: that is, to prove that if solves and is obtained from by applying a single relation in the group, then solves .
    5. (Maybe) If the inductive step doesn’t seem to work, then try to use that failure to devise a puzzle that defeats and return to step 3.

    I have done a couple of iterations already, with the result that I now have an algorithm — let’s call it — that solves (basically instantaneously) every puzzle I throw at it, including the one derived from Dehornoy’s braid mentioned above. So a subgoal of the initial goal is to find a puzzle that has a solution but that doesn’t solve. If we can’t, then maybe it is worth trying to prove that solves the word problem for this group. One comment about the algorithm is that it never increases the length of a word, though it often does moves that preserve the length. I was told by ChatGPT that it is an unsolved problem whether every braid that is equal to the identity can be reduced to the identity without ever increasing the number of crossings. Of course that isn’t a guarantee that it really is unsolved, but if it is, then that’s another interesting problem. It also means that I would be extremely interested if someone found an example of a word in the Artin-Tits group that can be reduced to the identity but only if one starts by lengthening it.

    I’m hoping that this will be an enjoyable project for people who like both mathematics and programming. On the maths side, there is an unsolved problem to think about, and on the programming side there are lots of possibilities: for example, one could write programs that explore the space of words equal to the identity, trying to do so in such a way that there is a reasonable chance of reaching a word where it isn’t obvious how to reverse all the steps and get back to the identity. The game comes with a “sandbox” that has a few tools for generating words, but at the moment it is fairly primitive, and I would welcome suggestions for how to improve it.

    It seems to me that as Polymath projects go, one can’t really lose: either we find an algorithm for the word problem, which establishes an unknown (at least according to ChatGPT) case of a decidability problem, or we find a suite of harder and harder puzzles, creating a more and more challenging and entertaining game and obtaining a deeper understanding of the Artin-Tits group in the process.

    This Artin-Tits game has only a rather rudimentary and not very good tutorial created by ChatGPT. It’s easy to play once you’ve got the hang of it, and in the end I think the easiest way to get the hang of it is to watch someone else play it for a couple of minutes. I have therefore created a video tutorial, which you can find here. The video itself lasts about 25 minutes, but the tutorial part is under 10 minutes: what makes the video longer is an explanation of various features for making it easier to create puzzles, and also an explanation of how the algorithm works, which again is much easier to explain if I can demonstrate it on screen than if I have to write down some text. (I do have some text, since that is what I used as a prompt for ChatGPT to implement the algorithm, but it took a few iterations to get it working properly, so now I’m not sure what the quality of the code will be like, or even whether it is doing exactly what I want, though it appears to be.) Although I have embedded the video into this post, if you actually want to see what is going on you will probably need to watch it on YouTube using the full screen. I recommend not watching the explanation of the algorithm until you have played the game a few times first. Also, I should warn you that the games in the Advanced category are not all soluble, but games 1, 5 and 6 definitely are. Another thing I forgot to say on the video is that if you want you can “rotate” a word by clicking on one of its letters and dragging it to the right or left. If the word is equal to the identity, then its rotations will be as well, so this is a valid move. There is also a button for disabling this rotation facility if you want the puzzle to be slightly more challenging.

    A quick note about the availability of the two games. They are hosted on the Netlify platform. I am on their free plan, which gives me a certain number of “credits” each month. I’m not sure how quickly these will run out, largely because I have no idea how many people will actually think it interesting to play the games. If they do run out, then the games will cease to be available until the credits are renewed, which for me happens on the 10th of each month. If this has happened and you are keen to play one of the games, another option is to download the html files and open them in your browser. Here is a link to the Artin-Tits game and here is one to the Tseytin game. If you are feeling particularly public-spirited, especially if you think you will play quite a lot, then you might consider doing that anyway, so that the Netlify credits run down more slowly. If the running out of the credits is quick enough for all this to be a real issue, then I may move to a paid plan.

    I’ll finish with a quick tip for playing the Artin-Tits game, which I mentioned on the video but perhaps didn’t stress enough. Many of the moves consist in selecting three consecutive letters and using a relation of the form to change them. Easy examples of this are replacement rules such as . But what if inverses are involved? I’ll represent inverses of generators with upper-case letters, so for example represents , which in the game would be a white followed by a white followed by a black . The word turns out to equal . To remember this, a simple rule is that two letters of the same colour can be bracketed together and “pushed past” the third letter, which retains its colour but changes its value. Here, for example, we write as and then swap them over, changing into in the process, getting , or without the brackets. In group theoretic terms, this is of course saying that and are conjugates, and that is what conjugates one to the other. But when playing the game it is convenient to remember it by thinking that when you see a subword such as , you can push the (and in particular the ) to the left, getting .

    Kategorije: Matematički blogovi