Hey everyone. First of all, I would like to apologize if this is not the right place for these kinds of requests. I am new to the forums and General Discussion seemed like the safest place for me to post.
With that preface out of the way, I would like to suggest using modern AI technology for Boss fights—specifically deep reinforcement learning. In this post I will be tackling three main points: What is deep reinforcement learning and how is it useful? How can it be used to create interesting Boss AI? And what are some of the problems that might appear and how to deal with them?
Deep reinforcement learning is a mix of two ideas: neural networks and reinforcement learning. A neural network is a loose abstraction of how the brain works: it is composed of neurons (or nodes) connected much like the ones in our brains. We feed the network an input (for a Boss's AI this might be the players' actions, the Boss's HP, its environment, etc.), and some of the neurons activate. Depending on which neurons activate, we end up with a different output (for a Boss's AI, the outputs are the actions the Boss will take: which direction should it move? Should it use an ability? Which ability should it use?).
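To make the input-to-output idea concrete, here is a tiny Python sketch of a network with one hidden layer. All inputs, weights, and action names are made up for illustration; in a real system the weights would be learned during training.

```python
# Tiny feed-forward network: hypothetical boss inputs -> action scores.

def relu(x):
    return max(0.0, x)

# Hypothetical inputs: distance to nearest player, boss HP fraction, players alive.
inputs = [0.8, 0.35, 3.0]

# One hidden layer of two neurons (weights chosen arbitrarily for the example).
hidden_weights = [
    [0.5, -1.2, 0.3],
    [1.0, 0.4, -0.5],
]
# One output row per possible action.
output_weights = [
    [0.7, -0.2],   # move_toward
    [-0.3, 0.9],   # use_ability
    [0.1, 0.4],    # retreat
]
actions = ["move_toward", "use_ability", "retreat"]

# Forward pass: which neurons activate decides which action scores highest.
hidden = [relu(sum(w * x for w, x in zip(row, inputs))) for row in hidden_weights]
scores = [sum(w * h for w, h in zip(row, hidden)) for row in output_weights]
chosen = actions[scores.index(max(scores))]
print(chosen)  # -> move_toward
```

With these particular inputs the second hidden neuron never activates (its weighted sum is negative), so the choice is driven entirely by the first one.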
Now that we have a basic understanding of neural networks, let's move on to reinforcement learning. It is simply a training method: when the AI does something good, we give it a reward; when it does something bad, we give it a punishment. The AI is thereby incentivized to do more of the things that earn it rewards and to avoid the things that earn it punishments.
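The reward-and-punishment loop can be sketched without any neural network at all. In this toy example (the action names and reward values are invented), an agent tries two actions, keeps a running average of the reward each one earns, and ends up preferring the better-rewarded one:

```python
import random

random.seed(0)

# Two hypothetical actions with made-up average rewards.
true_reward = {"attack_tank": 1.0, "attack_healer": 3.0}

estimates = {a: 0.0 for a in true_reward}  # what the agent believes each earns
counts = {a: 0 for a in true_reward}

for step in range(200):
    # Mostly pick the action that has earned the most reward so far,
    # but try a random action 10% of the time to keep exploring.
    if random.random() < 0.1:
        action = random.choice(list(true_reward))
    else:
        action = max(estimates, key=estimates.get)
    reward = true_reward[action] + random.gauss(0, 0.5)  # noisy reward signal
    counts[action] += 1
    # Running average of observed rewards: the "reinforcement".
    estimates[action] += (reward - estimates[action]) / counts[action]

best = max(estimates, key=estimates.get)
print(best)
```

After a couple hundred trials the agent settles on the higher-reward action, which is the whole incentive mechanism in miniature.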
Deep reinforcement learning mixes these two ideas: we take a neural network and train it using reinforcement learning. After each outcome we tweak the network a little bit: when the AI does something good, we nudge the network so it is more likely to repeat that behavior; when it does something bad, we nudge it in the opposite direction. This is obviously a very big oversimplification, but you get the general gist. If you want to learn more, you can check out this article, which explains it in a beginner-friendly manner:
https://wiki.pathmind.com/deep-reinforcement-learning
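To show what "tweaking the network" means in practice, here is a minimal policy-gradient (REINFORCE-style) sketch with a single-weight policy. The scenario and rewards are made up; real systems update millions of weights the same way, nudging the network so that rewarded actions become more probable:

```python
import math
import random

random.seed(1)

def prob_cast(theta):
    # Probability the "network" chooses to cast an ability (sigmoid of one weight).
    return 1.0 / (1.0 + math.exp(-theta))

theta = 0.0   # single network weight; starts with no preference either way
lr = 0.5      # learning rate: how big each tweak is

for episode in range(500):
    p = prob_cast(theta)
    cast = random.random() < p
    # Made-up rewards: casting is the "good" action in this toy scenario.
    reward = 1.0 if cast else -1.0
    # REINFORCE update: move the weight so rewarded actions become more
    # probable and punished actions less probable.
    grad_log = (1 - p) if cast else -p   # d/dtheta of log pi(action)
    theta += lr * reward * grad_log

print(round(prob_cast(theta), 2))
```

After a few hundred episodes the policy casts the rewarded action almost every time; the "training" is nothing more than these small repeated nudges.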
So, why do we care about deep reinforcement learning? It is the approach OpenAI used to build OpenAI Five, the system that beat Dota 2 pro players. OpenAI Five showed not only high-level mechanics (even when its APM and effective APM were limited to human levels), but also the ability to bait and fake out its opponents. It showcased strategies effective enough that pro players started mimicking them.
Now for the second point: how can this algorithm be used to create interesting Boss AI? One pet peeve I have always had with MMO AI in general is how formulaic it is. The boss starts with a couple of attack patterns and skills that it uses, and once it reaches a certain threshold it changes those patterns. You end up with bosses that have very little variety in their moves and that can be beaten easily as long as you have the right equipment and have learned the right strategy over the course of a few runs (or, more likely, googled the clearing guide for the boss). I believe that a boss that challenges you not only strategically but also tactically and mechanically can be very interesting. It would require the players raiding the boss to show skill, adaptability, and coordination, rather than just going through the motions because they have already memorized all of the boss's patterns and phases.
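For contrast, the formulaic threshold-based design described above boils down to something like this (the ability names and HP thresholds are hypothetical):

```python
# Sketch of a scripted, threshold-based boss: the only input that matters
# is the boss's own HP fraction, so the fight is fully memorizable.
def pick_pattern(hp_fraction):
    if hp_fraction > 0.7:
        return ["cleave", "slam"]          # phase 1 rotation
    elif hp_fraction > 0.3:
        return ["cleave", "slam", "adds"]  # phase 2 adds one mechanic
    else:
        return ["enrage_combo"]            # final burn phase

print(pick_pattern(0.5))  # -> ['cleave', 'slam', 'adds']
```

A learned policy would instead condition on much more of the game state than a single HP threshold, which is where the variety and adaptability come from.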
Now, for the problems that might appear and how to solve them:
The raid boss might be so challenging that it is impossible for players to win: I have seen this argument thrown around in these types of threads multiple times, and I believe there are many ways to solve the issue. Let's start with the easiest one: nerf the boss's stats. Rather than having a boss with millions of hit points and multiple one-shot mechanics, we can tone that down to a more manageable level. That way even a more dynamic AI can still be beaten by players who are skilled enough.
The other option can be used alongside the first. To understand it, let's first go back to how we will train our Boss AI in the first place. It is similar to how OpenAI trained their Dota 2 AI: we create a boss model (model/agent is just another name for the AI, don't be confused by it) and some player models, and have them fight against each other. Maybe in the first round the boss wins, in the second round the players win, and so on. As they continuously fight each other, they keep learning and getting better. What we can do is keep a copy of each of these models as they learn, so that we can access them later. Once we reach the point in training (if we ever reach it) where the boss is just stomping the player models without them being able to do a thing, we can go back in time and use an earlier copy that is less skilled (and can be killed by the player agents). We can check how this boss model fights, and if we like it, we can playtest it. The playtesters can then tell us whether the boss was fun to fight, and whether it was too hard or too easy. If it was too hard, we go back and look for even less skilled copies and test those; if it was too easy, we do the opposite.
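Here is a toy sketch of that checkpointing idea, with each "model" reduced to a single skill number so the bookkeeping is easy to see. All the numbers are invented; a real setup would save full network weights at each checkpoint:

```python
import copy
import random

random.seed(2)

# Stand-ins for trained models: each "model" is just a skill number here.
boss_skill = 1.0
player_skill = 1.0
checkpoints = []   # saved copies of the boss at each stage of training

for round_no in range(10):
    # Whoever is more skilled is more likely to win the round.
    boss_wins = random.random() < boss_skill / (boss_skill + player_skill)
    # Both sides learn from every fight; the winner's lesson sticks more.
    boss_skill += 0.3 if boss_wins else 0.1
    player_skill += 0.1 if boss_wins else 0.3
    checkpoints.append((round_no, copy.copy(boss_skill)))

# If the latest boss stomps the players, walk back to the most recent
# earlier copy the player models could still beat, and playtest that one.
beatable = [c for c in checkpoints if c[1] <= player_skill]
candidate = beatable[-1] if beatable else checkpoints[0]
print(candidate)
```

The design team's "go back in time" step is just picking a different entry out of `checkpoints` and loading it for a playtest.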
In my opinion, this is just a natural balancing problem, and the solution follows the same old steps: an iterative process that takes multiple factors into consideration until the design and AI teams reach as close to a perfect encounter as they can.
How to make sure that the Boss is controllable and doesn't do stupid stuff: As with all probabilistic algorithms, you cannot enforce policies 100% of the time, but there are tools to make violations as rare as possible. Going back to our definition of reinforcement learning, we can enforce our policies by shaping the reward function. Say we want to program an aggro radius for the boss: a green zone where the boss can go wherever it pleases, a yellow zone that it should only enter if it's fighting players and wants to kill them, and a red zone where it should never set foot. We code that as: green zone, reward += 0; yellow zone, reward -= 100; red zone, reward -= 10000. This way, the boss can stay in the green zone as much as it likes, only enters the yellow zone if it thinks it can kill players (say killing a player gives it reward += 500, which outweighs the yellow-zone penalty), and never enters the red zone (because the reward for killing a player is far less than the red-zone penalty).
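Those zone penalties translate directly into a reward function. This sketch uses the exact values from the example above; how the current zone and kill events are detected is up to the game code:

```python
# Reward shaping for the aggro zones: penalties keep the boss in bounds,
# while the kill reward lets it chase into the yellow zone when it pays off.
KILL_REWARD = 500
ZONE_PENALTY = {"green": 0, "yellow": -100, "red": -10000}

def step_reward(zone, killed_player):
    reward = ZONE_PENALTY[zone]
    if killed_player:
        reward += KILL_REWARD
    return reward

# Chasing a kill into the yellow zone nets +400, so the boss will do it...
print(step_reward("yellow", True))   # -> 400
# ...but no kill can ever pay for the red zone, so it stays out.
print(step_reward("red", True))      # -> -9500
```

During training, the agent receives this number every step, so behavior that strays into the red zone is punished far more than any kill could ever repay.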
Well, to wrap it up, I just want to say that I'm really hyped for this game and can't wait to play it. It's been a while since I was this hyped for an upcoming game, and since I trusted a game company to deliver on its vision, take feedback from its players, and actually implement that feedback. I'm sorry for the long rant, but this idea has been in my head for quite a while and I really wanted to share it with the community and the team to hear their opinions.