
AI Alignment


Controlling artificial intelligence so it behaves in our best interest

At least since the arrival of ChatGPT, many people have become fearful that we are losing control over technology and can no longer anticipate the consequences it may have. AI Alignment deals with this problem and with technical approaches to solving it.

Two positions can be identified in the AI discourse: first, "We'll worry about that later, when the time comes", and second, "This is a problem for nerds who have no ethical values anyway". Both positions are misguided, as the problem has existed for a long time and there are certainly ways of setting boundaries for AI. What is lacking, rather, is consensus on what those boundaries should be.

AI Alignment [1] is concerned with aligning AI to desired goals. The first challenge is to agree on these goals in the first place. The next difficulty is that it is not (yet?) possible to give these goals directly and explicitly to an AI system. For example, Amazon developed a system several years ago to help select suitable applicants for open positions ([2], [3]). Resumes of accepted and rejected applicants were used to train an AI system. Although the resumes contained no explicit information about gender, the system systematically preferred male applicants. We will discuss how this came about in more detail later. But first, this raises several questions: Is this desirable, or at least acceptable? And if not, how do you align the AI system so that it behaves as you want it to? In other words, how do you successfully engage in AI Alignment?

For some people, AI Alignment is an issue that will only become important in the future, when machines are so intelligent and powerful that they might conclude the world would be better off without humans [4]. A nuclear war provoked by supervillains is mentioned as another way AI could prove fatal. Whether these fears could ever become realistic remains speculation.

The requirements being discussed as part of the EU's emerging AI regulation are more realistic. Depending on the risk an AI system actually poses, different rules apply. This is shown in Figure 1, which is based on a presentation for the EU [5]. Four levels are distinguished, ranging from "no risk" to "unacceptable risk". A system posing no significant risk merely comes with the recommendation of a "Code of Conduct", while a social credit system, as applied in China [6], is simply not allowed. However, this scheme only comes into effect where no more specific law applies.
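Read as a lookup table, the scheme in Figure 1 might be sketched as in Listing 1. The two middle tiers and their obligations paraphrase common summaries of the EU proposal; they are illustrative, not the legal text.

Listing 1: The four risk tiers and their regulatory consequences (illustrative)

    # Illustrative lookup of the four-tier risk scheme; the middle
    # tiers paraphrase common summaries of the EU proposal, not the
    # legal text.
    RISK_TIERS = {
        "no / minimal risk": "voluntary Code of Conduct recommended",
        "limited risk": "transparency obligations, e.g. disclosing that an AI is involved",
        "high risk": "binding requirements such as risk management and human oversight",
        "unacceptable risk": "prohibited, e.g. social credit systems",
    }

    for tier, consequence in RISK_TIERS.items():
        print(f"{tier}: {consequence}")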

Fig. 1: Regulation based on outgoing risk, adapted from [5]

Alignment in Machine Learning Systems

A machine learning system is trained using sample data, which it learns to mimic. In the best and most desirable case, the system generalizes beyond this sample data and recognizes an abstract pattern behind it. If this succeeds, the system can also react meaningfully to data that it has never seen before. Only then can we speak of learning, or even of a kind of understanding that goes beyond memorization.
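This loop of training and then testing for generalization fits in a few lines. Listing 2 is a minimal sketch, assuming scikit-learn and a synthetic dataset; both are illustrative choices, not the systems discussed in this article.

Listing 2: Supervised learning on sample data, evaluated on unseen data

    # Minimal supervised-learning sketch: fit a classifier on labeled
    # examples, then check whether it generalizes to unseen data.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # Synthetic sample data: feature vectors X with binary labels y.
    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

    # Hold out data the model never sees during training.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0
    )

    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)  # mimic the sample data

    # A high score on held-out data suggests an abstract pattern was
    # learned, not just the training examples memorized.
    print("accuracy on unseen data:", model.score(X_test, y_test))

The held-out test set stands in for "data that it has never seen before": only a good score there justifies speaking of generalization rather than memorization.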

This also happened in the example of Amazon's applicant selection, as shown in a simplified form in Figure 2.

Fig. 2: How to learn from examples, also known as supervised learning

Here is another example. We use images of dogs and cats as sample data for a system, training it to distinguish between them. In the best case, after training, the system also recognizes cats that are not contained in the training data set. It has learned an abstract pattern of cats, though one that is still grounded in the given training data.

Therefore, this system can only reproduce what already exists. It is descriptive or representative, but hardly normative. In the Amazon example, it replicates past decisions, and those decisions evidently meant that men simply had a better chance of being accepted. So...
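This replication can be made concrete with a small experiment. Listing 3 is a hedged sketch with entirely synthetic data: the model never sees a gender column, yet it reconstructs the historical preference through an invented proxy feature that merely correlates with gender.

Listing 3: How a biased past leaks into a model through a proxy feature

    # Sketch: a model trained on biased historical decisions learns to
    # prefer one group via a proxy feature, without any gender column.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 5000

    # Hidden attribute, never shown to the model: 1 = male, 0 = female.
    gender = rng.integers(0, 2, size=n)

    # Invented proxy that correlates with gender, e.g. a keyword that
    # appears far more often on men's resumes.
    proxy = (rng.random(n) < np.where(gender == 1, 0.8, 0.1)).astype(float)

    # A genuinely job-relevant skill score, independent of gender.
    skill = rng.normal(size=n)

    # Historical hiring decisions were biased: being male added a bonus.
    hired = (skill + 1.5 * gender + rng.normal(scale=0.5, size=n) > 1.0).astype(int)

    # Train only on the visible features: skill and the proxy.
    X = np.column_stack([skill, proxy])
    model = LogisticRegression(max_iter=1000).fit(X, hired)

    # The model reproduces the bias via the proxy feature.
    print("mean hiring score, men:  ", model.predict_proba(X[gender == 1])[:, 1].mean().round(2))
    print("mean hiring score, women:", model.predict_proba(X[gender == 0])[:, 1].mean().round(2))

Dropping the explicit attribute is therefore not enough: as long as the training labels encode a biased past and some visible feature correlates with the protected attribute, the model will rediscover that correlation.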

